Abstract
This paper presents a new method for human action recognition that fuses depth images and skeletal maps. Each depth image is represented by 2D and 3D auto-correlation of gradients features, which capture spatial and orientational auto-correlation. Mutual information is used to measure the similarity between frames of the skeleton sequence, and key frames are then extracted from it. Skeleton features computed from these key frames serve as complementary features to compensate for the loss of temporal information in the depth images. Each feature set is fed into its own extreme learning machine classifier, and a different weight is assigned to each set; using different classifier weights gives the method more flexibility across features. The final class label is determined from the fused classifier outputs. Experiments on the MSR_Action3D depth action data set show that the accuracy of the proposed method is 1.5% higher than that of state-of-the-art action recognition methods.
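To make the two ingredients of the pipeline concrete, the following is a minimal Python sketch of (a) mutual-information-based key-frame selection from a skeleton sequence and (b) weighted decision fusion of two classifier score vectors. It is an illustration under stated assumptions, not the authors' implementation: the histogram bin count, the fixed weights (0.6/0.4), and all function names are hypothetical choices.

import numpy as np

def mutual_information(x, y, bins=16):
    # Histogram-based mutual information between two flattened frames
    # (bin count is an illustrative assumption, not from the paper).
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def select_key_frames(skeleton_seq, num_keys=8):
    # Score each frame by its MI with the previous frame; keep the frames
    # with the lowest MI (largest pose change), plus the first frame.
    mi = np.array([mutual_information(skeleton_seq[i - 1], skeleton_seq[i])
                   for i in range(1, len(skeleton_seq))])
    idx = np.argsort(mi)[:num_keys - 1] + 1
    return sorted([0, *idx.tolist()])

def weighted_fusion(scores_depth, scores_skel, w_depth=0.6, w_skel=0.4):
    # Fuse per-class scores of the depth-feature and skeleton-feature
    # classifiers with fixed weights and return the winning class index.
    fused = w_depth * np.asarray(scores_depth) + w_skel * np.asarray(scores_skel)
    return int(np.argmax(fused))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.standard_normal((30, 20, 3))   # 30 frames, 20 joints, 3D coordinates
    print("key frames:", select_key_frames(seq, num_keys=8))
    print("label:", weighted_fusion(rng.random(20), rng.random(20)))

In the paper the two score vectors would come from the extreme learning machine classifiers trained on the depth and skeleton features; here they are random placeholders so the sketch runs standalone.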







