Abstract
We present a method for recognizing gestures made by Chinese traffic police, based on the fusion of static and dynamic descriptors, for use in driver assistance systems and intelligent vehicles. Recognition relies on combining the extracted static and dynamic features. First, point cloud data of the human upper body are obtained from each frame of the input video and used to estimate the static descriptor with a 2.5D gesture model. Then, the dynamic descriptor is estimated by computing the motion history image of the input RGB video sequence. Finally, the two descriptors are fused, and the mean structural similarity index is used to recognize the gesture. A comparative study and qualitative evaluation against other gesture recognition methods show that the proposed method obtains better recognition results on a number of video sequences.
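To make the dynamic descriptor and the matching step more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes grayscale frames stored as NumPy arrays, builds a motion history image by simple frame differencing, and compares descriptors with a mean structural similarity index computed over non-overlapping blocks (a simplification of the sliding Gaussian window normally used). The function names, the threshold, the window size, and the nearest-template decision rule are all hypothetical choices made for illustration; the static 2.5D descriptor and the fusion step are not shown because the abstract does not specify them.

```python
import numpy as np


def motion_history_image(frames, diff_threshold=30.0):
    """Return a motion history image scaled to [0, 1].

    Assumption: `frames` is a sequence of at least two equally sized
    2-D grayscale arrays. Pixels that moved most recently are brightest.
    """
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    tau = float(len(frames) - 1)  # duration of the history, in frames
    mhi = np.zeros_like(frames[0])
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr - prev) > diff_threshold
        # Moving pixels are reset to tau; all others decay by one step.
        mhi = np.where(moving, tau, np.maximum(mhi - 1.0, 0.0))
    return mhi / tau


def mean_ssim(a, b, window=8, data_range=1.0):
    """Mean SSIM over non-overlapping window x window blocks (simplified)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape or min(a.shape) < window:
        raise ValueError("images must share a shape of at least one window")
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    scores = []
    for r in range(0, a.shape[0] - window + 1, window):
        for c in range(0, a.shape[1] - window + 1, window):
            x = a[r:r + window, c:c + window]
            y = b[r:r + window, c:c + window]
            mx, my = x.mean(), y.mean()
            cov = ((x - mx) * (y - my)).mean()
            num = (2 * mx * my + c1) * (2 * cov + c2)
            den = (mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2)
            scores.append(num / den)
    return float(np.mean(scores))


def recognize(descriptor, templates):
    """Hypothetical decision rule: pick the template label with highest mean SSIM."""
    return max(templates, key=lambda label: mean_ssim(descriptor, templates[label]))


def demo_frames(n=4, size=16):
    """Synthetic clip: a bright block sliding to the right (illustrative data only)."""
    frames = []
    for i in range(n):
        frame = np.zeros((size, size))
        frame[4:8, 2 * i:2 * i + 4] = 255.0
        frames.append(frame)
    return frames
```

In this sketch, `recognize(motion_history_image(clip), templates)` would return the label of the stored template that is most similar to the clip's motion history image. A fused descriptor could be passed in the same way once a fusion rule is chosen.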

Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (Nos. 61502537, 91220301), the China Postdoctoral Science Foundation (No. 2014M552154), the Hunan Planned Projects for Key Scientific Research Funds (No. 2015WK3006), and the Postdoctoral Science Foundation of Central South University (No. 126648).
Cite this article
Guo, F., Tang, J. & Wang, X. Gesture recognition of traffic police based on static and dynamic descriptor fusion. Multimed Tools Appl 76, 8915–8936 (2017). https://doi.org/10.1007/s11042-016-3497-9
DOI: https://doi.org/10.1007/s11042-016-3497-9