Deep Learning models for Video based Facial Recognition Systems: A Survey

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2018 by IJCTT Journal
Volume-60 Number-3
Year of Publication : 2018
Authors : K.Sunitha
DOI :  10.14445/22312803/IJCTT-V60P122

MLA

K.Sunitha "Deep Learning models for Video based Facial Recognition Systems: A Survey". International Journal of Computer Trends and Technology (IJCTT) V60(3):144-150 June 2018. ISSN:2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract
Deep learning has recently achieved very promising results in a wide range of areas such as computer vision, speech recognition and natural language processing. It aims to learn hierarchical representations of data by using deep architecture models. Face recognition (FR) systems for video surveillance (VS) applications attempt to accurately detect the presence of target individuals over a distributed network of cameras. Specifically, in still-to-video FR applications, a single high-quality reference still image captured with a still camera under controlled conditions is used to generate a facial model, which is later matched against lower-quality faces captured with video cameras under uncontrolled conditions. Current video-based FR systems perform well in controlled scenarios, but their performance is unsatisfactory in uncontrolled scenarios, mainly because of the differences between the source (enrollment) and target (operational) domains. Most efforts in this area have been directed toward the design of robust video-based FR systems for unconstrained surveillance environments. This survey studies deep learning architectures proposed in the literature that are based on the triplet-loss function (e.g., cross-correlation matching CNN, trunk-branch ensemble CNN and HaarNet) and on supervised autoencoders (e.g., canonical face representation CNN).
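
As a rough illustration of the triplet-loss idea mentioned in the abstract, the sketch below computes the loss for one triplet of face embeddings. It assumes the commonly used squared-Euclidean form with a margin, max(0, d(a, p) - d(a, n) + margin); the surveyed architectures may use different variants. The function names, the margin value of 0.2 and the toy 2-D embeddings are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # assumed value, not taken from the survey
) -> float:
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin` (in squared distance).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (made-up values, for illustration only).
still = [1.0, 0.0]         # reference still image of the target individual
same_person = [0.9, 0.1]   # video face of the same individual
other_person = [0.0, 1.0]  # video face of a different individual

# Well-separated triplet: the constraint is already satisfied.
easy = triplet_loss(still, same_person, other_person)

# Swapped roles: the "positive" is far away, so the loss is large.
hard = triplet_loss(still, other_person, same_person)

print(easy, hard)
```

In a still-to-video setting, the anchor would typically be the embedding of the enrollment still, while the positive and negative would come from video frames; that pairing is an assumption of this sketch rather than a description of any specific surveyed method.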

Keywords
Deep Learning, Face Recognition, Video Surveillance, CNN.