Abstract
This paper proposes a new local descriptor for action recognition in depth images based on second-order directional Local Derivative Patterns (LDPs). LDP relies on variations in local derivative direction to capture the local patterns contained in an image region. The proposed descriptor combines different directional LDPs computed from three depth maps, obtained by representing depth sequences in three orthogonal views, and jointly encodes shape and motion cues. Moreover, we suggest using the Sparse Coding-based Fisher Vector (SCFVC) to encode local descriptors into a global representation of depth sequences. SCFVC has proven effective for object recognition but has received little attention for action recognition. We perform action recognition using an Extreme Learning Machine (ELM). Experimental results on three public benchmark datasets show the effectiveness of the proposed approach.
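To make the idea of a second-order directional LDP concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the commonly used formulation in which a first-order derivative is taken along one of four directions (0°, 45°, 90°, 135°) and each of the eight neighbours contributes a bit that is set when its derivative does not share the sign of the centre pixel's derivative. The names `first_derivative`, `ldp2`, `DIRECTION_OFFSETS` and `NEIGHBOURS`, the neighbour and bit ordering, and the edge-replicating border handling are all illustrative assumptions.

```python
import numpy as np

# Offset (row, col) of the pixel subtracted from the centre for each
# derivative direction. Image rows grow downward, so "up" is row - 1.
# (Assumed convention, chosen for illustration.)
DIRECTION_OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

# 8-neighbourhood visited clockwise from the top-left pixel (assumed ordering).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def first_derivative(image, angle):
    """First-order derivative: centre pixel minus its neighbour along `angle`."""
    img = image.astype(np.int64)  # signed type so differences can be negative
    dr, dc = DIRECTION_OFFSETS[angle]
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")  # replicate borders (assumption)
    shifted = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
    return img - shifted


def ldp2(image, angle):
    """Second-order LDP code (one byte per pixel) for one direction."""
    deriv = first_derivative(image, angle)
    h, w = deriv.shape
    padded = np.pad(deriv, 1, mode="edge")
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        neighbour = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
        # Bit is 1 when the two derivatives do not share a strict sign.
        changed = (deriv * neighbour) <= 0
        code |= np.left_shift(changed.astype(np.uint8), 7 - bit)
    return code


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth_map = rng.integers(0, 4096, size=(8, 8))  # synthetic stand-in data
    codes = {angle: ldp2(depth_map, angle) for angle in DIRECTION_OFFSETS}
    print({angle: c.shape for angle, c in codes.items()})
```

In a descriptor pipeline, the per-direction code maps would typically be summarised over an image region, for example with `np.bincount(code.ravel(), minlength=256)`, and the resulting histograms concatenated. How the paper pools and combines them across the three views is not specified in this abstract.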
Cite this article
Nguyen, X.S., Nguyen, T.P., Charpillet, F. et al. Local derivative pattern for action recognition in depth images. Multimed Tools Appl 77, 8531–8549 (2018). https://doi.org/10.1007/s11042-017-4749-z
DOI: https://doi.org/10.1007/s11042-017-4749-z