Offline Signature Identification Using Deep Learning and Euclidean Distance

A handwritten signature is a human characteristic possessed from birth that can be used for identity recognition. High-accuracy signature recognition is needed to identify the correct owner of a signature. This study presents signature identification using a combination of Deep Learning and Euclidean Distance. Three different signature datasets are used: SigComp2009, SigComp2011, and a private dataset. First, signature images are preprocessed using binary image conversion, Region of Interest extraction, and thinning. Features of the preprocessed images are then extracted using DenseNet201, and the signatures are identified using Euclidean Distance. Several testing scenarios are applied to measure the robustness of the proposed method, such as using various pretrained Deep Learning models, dataset augmentation, and modified dataset split ratios. The best accuracy achieved is 99.44%, with a high precision rate.


Introduction
A signature is a well-known human biometric identifier recognized as a tool for identifying a person [1]. A signature is handwriting or a hand stroke with a unique writing style, such as a line of stroke resembling the name of the signature's owner, or a symbol used as proof of an individual's identity. Signatures gained recognition as a biometric feature after UNCITRAL established early digital signature legislation in the 1990s.
Signature recognition can be classified into two main groups: online and offline. Online signatures are recorded using touch-screen panels such as smartphones or tablets; features such as pressure points and the path taken while creating the signature are then extracted from the recording. Offline signatures only require scanning the signature image and extracting the needed features from the scanned image [2].
Offline signature identification is considered more difficult than online signature identification, since offline signatures lack the dynamic features present in online signatures [1]. Offline signatures depend only on the captured signature shape available in the signature image, while online signatures can use additional features such as pressure points and the velocity of the drawn signature [3].
Signatures are used as identification when making transactions in online markets or e-commerce. Signatures are also used as attendance marks in many workplaces, which is why research on signature identification has recently received a lot of attention. Various methods are used to identify a signature. For example, study [3] used binary image conversion and image morphology, consisting of erosion and dilation, as image preprocessing, with a Convolutional Neural Network for both training and identification. That study reported an average accuracy of 92% as its final result using the SigComp2011 dataset.
A Convolutional Neural Network was also used for feature extraction and identification in study [4], which applied a median filter, signature line extraction, and centering as image preprocessing. The highest result achieved in that study was 73% accuracy in predicting grayscale signatures using a 7:3 split ratio of training and testing data.
Study [5] used a random forest classifier to identify handwritten signatures, with binary image conversion as image preprocessing. It also implemented various classification methods using the SigComp2009 dataset. The highest accuracy obtained was 60%. The problem with this approach is that the proposed method is too flexible and has a high chance of false results.
Study [6] used a combination of methods for signature recognition: Principal Component Analysis (PCA) as feature extraction and Euclidean Distance as the classification method, with gray-level thresholding for image preprocessing. Study [6] achieved 95% accuracy, using a private dataset consisting of only two writer classes; the dataset is too small and needs more writer classes. Six years later, study [7] continued the previous study [6] with different methods and datasets. It used ten writer classes as its dataset and grayscaling as image preprocessing. The preprocessed images were resized to 100x100 px and 50x50 px, and features were extracted using the Gray Level Co-occurrence Matrix (GLCM). The extracted features were then used for identification with Euclidean Distance. Study [7] obtained 67.5% accuracy as its highest result by splitting the dataset into a 3:2 ratio of training and testing data. That study still needs further improvement, both in the feature extraction process and in the amount of data used.
The proposed study combines methods from these previous studies, starting with image preprocessing consisting of binary image conversion, Region of Interest extraction, and image thinning. One of the signature datasets used is SigComp2011, which was also used in study [3], while feature extraction is done using pretrained Deep Learning and classification uses Euclidean Distance, similar to studies [6] and [7]. The result of this study is a better-performing signature identification system built from these combined methods and evaluated in several testing scenarios.

Research Methods
This study focuses on improving system performance by combining several methods mentioned or tested in previous studies. These include DenseNet201 as the feature extraction method and Euclidean Distance as the identification method, with various image preprocessing steps to further increase the system's final performance. The general process of the proposed method is shown in Figure 1.

Figure 1. General Process
Signature identification is done in two separate processes: building a training signature feature database and the actual identification process. Both training and testing signatures go through the same image preprocessing and feature extraction. Feature extraction is done using DenseNet201: the input image is resized to 100x120 pixels, and the extracted feature vector has 17280 elements. The extracted training signature features are saved as a feature database and compared against test signature features during identification. The final result is the predicted signature class, i.e., the signature's owner.

Datasets
This study uses three different datasets that were also used in previous research: the ICDAR 2009 Signature Verification Competition (SigComp2009) [8], the ICDAR 2011 Signature Verification Competition (SigComp2011) [9], and a private dataset. Details of the datasets are shown in Table 1. Each dataset is divided into a training set and a testing set. The private dataset is divided into ten training and five testing signature images per class. SigComp2009 has 4 training and eight testing signature images per class, while SigComp2011 has 15 training and nine testing signature images. Different proportions are applied to find out the impact of the dataset size on system performance.
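A per-class split like the one described above might be implemented as follows. This is a sketch only: the sample representation (writer label plus image identifier) and the fixed per-class training count are assumptions, not the authors' code.

```python
import random
from collections import defaultdict

def split_per_class(samples, n_train, seed=0):
    """Split (writer_id, image) pairs into per-class train/test sets.

    `samples` holds one entry per signature image; `n_train` is the
    fixed number of training images kept per writer class (e.g. 10
    train / 5 test per writer for the private dataset above).
    """
    by_writer = defaultdict(list)
    for writer, image in samples:
        by_writer[writer].append(image)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = [], []
    for writer, images in by_writer.items():
        rng.shuffle(images)
        train += [(writer, img) for img in images[:n_train]]
        test += [(writer, img) for img in images[n_train:]]
    return train, test
```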

Image Preprocessing
Both training and testing signature images go through several image preprocessing steps. Image preprocessing is needed because the original signature images are affected by different capture conditions, such as lighting variations and noise from the scanning device [10]. The image preprocessing steps are shown in Figure 3.

Figure 3. Preprocessing Steps
The result of image preprocessing is shown in Figure 4. The original signature image (a) is converted into a grayscale image (b), then further converted into a binary image (c). The Region of Interest (ROI) method is applied to the binary signature image to reduce the background area (d). ROI removes unused background from the image, allowing the system to process it better and faster [11].

Figure 4. Image Preprocessing Process
Image preprocessing then continues with thinning (e). Thinning is a morphological image operation that removes foreground pixels from the binary image. Thinning can also be defined as reducing the image to some extent while preserving the points needed for image processing [12].
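The preprocessing chain up to ROI cropping can be sketched with plain NumPy. The thresholding rule here is an assumption (the paper does not specify one), and the thinning step is left out; in practice it would be handled by a library skeletonization routine such as `skimage.morphology.skeletonize`.

```python
import numpy as np

def preprocess(rgb):
    """Grayscale -> binary -> ROI crop, following the steps above.

    `rgb` is an HxWx3 uint8 array. The mean-based threshold is a
    placeholder for whatever binarization the authors used.
    """
    gray = rgb @ np.array([0.299, 0.587, 0.114])    # luminance weights
    binary = (gray < gray.mean()).astype(np.uint8)  # ink pixels -> 1
    ys, xs = np.nonzero(binary)                     # ink coordinates
    # Crop to the tight bounding box around the signature strokes.
    roi = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return roi
```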

Feature Extraction
Features of the preprocessed image are extracted using pretrained Deep Learning, a series of neural networks trained to classify objects. Pretrained Deep Learning is also called Transfer Learning and saves time, since researchers do not need to train the models from scratch as with a traditional Convolutional Neural Network (CNN) [13]. A CNN starts from untrained weights and biases, which makes the identification process take longer [14]. There are various pretrained Deep Learning architectures, such as Inception [15], Xception [16], VGG [17], ResNet [18], MobileNet [19], and DenseNet [20]. The preferred pretrained model must be loaded first so it can extract the features within the images. The feature extraction steps are shown in Figure 5.

Figure 5. Feature Extraction Steps
The output of feature extraction differs depending on the initial input. Training signature images have their features extracted and saved in a single folder as a CSV file, which is used in the identification process. Testing signature images have their features extracted and compared directly with the saved training signature features.
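The feature-database step might look like the sketch below. The CSV layout (one row per image: label followed by feature values) is an assumption, and `extract` merely stands in for the DenseNet201 forward pass (e.g. a Keras model with its classification head removed); here it is any callable mapping an image to a 1-D feature vector.

```python
import numpy as np

def build_feature_database(images, labels, extract, path="features.csv"):
    """Extract one feature vector per training image and save each,
    prefixed with its writer label, as a row of a CSV file."""
    rows = [np.concatenate(([label], extract(img)))
            for img, label in zip(images, labels)]
    np.savetxt(path, np.array(rows), delimiter=",")

def load_feature_database(path="features.csv"):
    """Load the saved rows back as (labels, feature matrix)."""
    data = np.atleast_2d(np.loadtxt(path, delimiter=","))
    return data[:, 0].astype(int), data[:, 1:]
```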

Signature Identification
The testing signature image features are compared with the saved training signature features. Euclidean Distance is used to calculate the similarity between the two. The identification process is shown in Figure 6.

Figure 6. Identification Process
Euclidean Distance is a method for calculating the distance between two points: it is the length of the straight line drawn between them [7]. The Euclidean Distance equation used in this study is shown below:

d(x, y) = sqrt( Σ_i (x_i − y_i)² )
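The identification step described here amounts to nearest-neighbour matching: compute the Euclidean distance from the test feature vector to every stored training feature and return the label of the closest one. A minimal sketch (in the actual system the vectors are 17280-element DenseNet201 features):

```python
import numpy as np

def identify(test_feat, db_feats, db_labels):
    """Return the label of the training feature closest to
    `test_feat` under Euclidean distance (1-nearest neighbour)."""
    dists = np.sqrt(((db_feats - test_feat) ** 2).sum(axis=1))
    return db_labels[int(np.argmin(dists))]
```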

Result and Discussion
This study conducts several tests under different scenarios to measure robustness against real-world problems. The proposed method is tested using augmented training images, different split ratios on the datasets, and a comparison of the three datasets mentioned in section 2.1.
The first test is conducted using the SigComp2011 [9] dataset to evaluate each pretrained Deep Learning model. Eleven pretrained models are used in this test, as shown in Table 2. In terms of accuracy, the best pretrained model for this study is DenseNet201, with 99.43% accuracy. MobileNet and VGG are not far behind, with 98.86% and 96.59% accuracy, respectively. DenseNet201 provides the best result because the DenseNet architecture feeds each layer with additional inputs from all preceding layers, making the network compact and thin. This is beneficial since the signature datasets used are not high-resolution images.
The next test adds several distance-based measurements to compare against Euclidean Distance, keeping DenseNet201 as the feature extraction method. The distance methods used are Manhattan Distance, Minkowski Distance, and Cosine Distance.

Table 3. Distance Measurement Result
The test results show that only Manhattan Distance produces a different result. This is because Manhattan Distance is optimized for integer calculation, while the extracted features contain floating-point numbers.
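For reference, the four distance measures compared here can be written in a few lines. The Minkowski order p is an assumption, since the paper does not state the value used; note that p=2 reduces to Euclidean and p=1 to Manhattan.

```python
import numpy as np

def euclidean(x, y):
    return np.sqrt(((x - y) ** 2).sum())

def manhattan(x, y):
    return np.abs(x - y).sum()

def minkowski(x, y, p=3):  # order p is an assumed example value
    return (np.abs(x - y) ** p).sum() ** (1.0 / p)

def cosine(x, y):
    # Cosine *distance*: 0 for parallel vectors, up to 2 for opposite.
    return 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
```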
The third test uses dataset augmentation consisting of brightness and rotation modifications, chosen because they are the most relevant to real-world signature problems. Brightness modification uses five values in the range of 0.5 to 0.9 of the original image brightness, while rotation modification uses ten values in the range of −10 to 10 degrees. The original dataset used in this test is SigComp2011 [9].
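A sketch of this augmentation step is shown below. The exact step sizes within the stated ranges are assumptions, since the paper gives only the ranges and the number of values (5 brightness factors, 10 rotation angles).

```python
import numpy as np
from scipy import ndimage

def augment(gray, factors=(0.5, 0.6, 0.7, 0.8, 0.9),
            angles=range(-10, 11, 2)):
    """Generate brightness- and rotation-augmented copies of a
    grayscale uint8 signature image.

    Brightness: scale pixel values by factors in [0.5, 0.9].
    Rotation: rotate by angles in [-10, 10] degrees (0 skipped),
    padding revealed corners with white (cval=255).
    """
    out = [np.clip(gray * f, 0, 255).astype(np.uint8) for f in factors]
    out += [ndimage.rotate(gray, a, reshape=False, cval=255)
            for a in angles if a != 0]
    return out
```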

Table 4. Augmented Image Result
The test result on augmented training signature images is underwhelming, since it is lower than the normal test result, not to mention the time consumed to augment the images and extract their features. The highest accuracy was achieved with brightness augmentation, which gives a 99.43% accuracy value, the same as the highest accuracy achieved on the normal dataset.
The fourth test modifies the split data ratio. The split ratio is applied to all signature images of a dataset to divide them into training and testing data. The ratio ranges from 0.1 to 0.9, with a 0.1 increase at each iteration. The dataset used in this test is the private dataset, consisting of 400 images in 50 writer classes, with DenseNet201 as the pretrained model. This test is carried out to find out the effect of different amounts of training and testing data on system performance.

Table 6. Datasets Detail

Dataset       Writer Classes   Total Training Images   Total Testing Images
SigComp2009   78               4                       8
SigComp2011   20               15                      4
Private       50               10                      5

Table 6 shows the details of the datasets used in this part of the study.

Figure 9 represents the SigComp2011 ROC graph. The Equal Error Rate is obtained at threshold 90 with a value of 0.0057, and the Genuine Acceptance Rate acquired is 99%. SigComp2011 has better results than SigComp2009 since SigComp2011 has fewer writer classes.

Table 7 represents the result of the signature identification test using the Receiver Operating Characteristic (ROC) approach. The SigComp2011 dataset has a 99% Genuine Acceptance Rate (GAR), while both the SigComp2009 and private datasets have a GAR of 91%. This result shows that the number of classes and the number of training and testing signature images significantly impact identification accuracy.
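The Equal Error Rate and Genuine Acceptance Rate reported here can be estimated by sweeping a distance threshold over genuine and impostor comparison scores. This is a generic sketch of that procedure, not the authors' exact evaluation code:

```python
import numpy as np

def eer_and_gar(genuine, impostor, thresholds):
    """Estimate the Equal Error Rate and Genuine Acceptance Rate.

    `genuine` holds distances between test signatures and their true
    writer's features; `impostor` holds distances to other writers.
    A comparison is accepted when its distance is below the threshold.
    The EER is taken where FAR and FRR are closest.
    """
    genuine, impostor = np.asarray(genuine), np.asarray(impostor)
    best = None
    for t in thresholds:
        frr = (genuine >= t).mean()   # false rejection rate
        far = (impostor < t).mean()   # false acceptance rate
        if best is None or abs(far - frr) < best:
            best, eer, gar = abs(far - frr), (far + frr) / 2, 1 - frr
    return eer, gar
```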

Conclusion
This study proposed offline signature identification using a combination of pretrained Deep Learning and Euclidean Distance. Pretrained Deep Learning is used for feature extraction, while Euclidean Distance is used as the identification method. Various pretrained models, such as DenseNet, Inception, ResNet, VGG, Xception, and MobileNet, are evaluated to find the best result. Several testing scenarios are also conducted to measure the robustness of the proposed method under various conditions. The highest accuracy was achieved using DenseNet201 as the feature extraction method, which gives a 99.43% accuracy value. This pretrained model was also applied to the other datasets, SigComp2009 and the private dataset; the results on both are 91.00%.