Journal "Information Technologies and Computing Systems" - N. S. Skoryukina, E. A. Shalnova, V. V. Arlazarov "Method for Detecting False Responses of Localization and Identification Algorithms Using Global Features"

The paper presents a method for detecting false responses of localization and identification algorithms. The method relies on matching image characteristics that local features cannot describe stably and completely. We propose selecting image zones that contain such characteristics, describing them, and using the descriptions to assess the validity of the algorithm's response. We demonstrate the method on ID documents, where suitable zones include national coats of arms and flags, background fill patterns, and text unique to the document type under consideration. The method is evaluated on the MIDV-500 and MIDV-LAIT datasets: the first is used to show that the rejector does not reject correct system responses, the second to show that it rejects incorrect ones. We test several methods of zone description. The experimental results show that the number of false type selections decreases with every description method tested, and that the local CNN descriptor performs best. Increasing the number of classes with marked zones is shown to improve the filtering of false responses: the improvement ranges from 13% when one document type has marked zones to a fourfold improvement when 10 types do.
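The rejector idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, both thresholds, the "fraction of matched zones" decision rule, the choice to accept when a type has no marked zones, and all names (`accept_response`, `template_zones`, `observed_zones`) are assumptions made for illustration only. Descriptors are plain lists of numbers standing in for whatever zone description is actually used.

```python
import math
from typing import Dict, Sequence

Descriptor = Sequence[float]


def euclidean(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def accept_response(
    template_zones: Dict[str, Descriptor],
    observed_zones: Dict[str, Descriptor],
    max_distance: float = 0.5,          # hypothetical per-zone threshold
    min_matched_fraction: float = 0.5,  # hypothetical decision rule
) -> bool:
    """Return True if the localization/identification response is kept.

    template_zones: reference descriptors of the zones marked for the
        document type that the system selected.
    observed_zones: descriptors computed from the same zones of the
        input image after it was localized by the system.
    """
    if not template_zones:
        # Assumption: a type without marked zones cannot be checked,
        # so its responses are never rejected.
        return True
    matched = 0
    for name, reference in template_zones.items():
        observed = observed_zones.get(name)
        if observed is not None and euclidean(reference, observed) <= max_distance:
            matched += 1
    return matched / len(template_zones) >= min_matched_fraction


# Toy data: two marked zones for one hypothetical document type.
template = {"coat_of_arms": [0.1, 0.9, 0.3], "flag": [0.7, 0.2, 0.5]}
similar = {"coat_of_arms": [0.12, 0.88, 0.31], "flag": [0.69, 0.22, 0.5]}
different = {"coat_of_arms": [0.9, 0.1, 0.8], "flag": [0.1, 0.9, 0.0]}

print(accept_response(template, similar))    # zones agree with the template
print(accept_response(template, different))  # zones disagree with the template
```

In this sketch a response is kept when enough zones resemble the template and rejected otherwise; types with no marked zones pass through unchecked, which is one possible reading of why marking zones for more types filters more false responses.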

EDN CGRAFY
