Abstract
This paper presents a new video representation that exploits the geometric relationships among trajectories for human action recognition. These relationships are obtained by applying Delaunay triangulation to the trajectories in each video frame. A graph encoding method called bag of graphs (BOG) is then proposed to handle the geometric relationships between trajectories. BOG uses local graph descriptors to learn a more discriminative graph-based codebook and represents each video as a histogram of visual graphs. The codebook is composed of the centers of graph clusters, which are defined by a graph classification technique based on the Hungarian distance. Experiments on two human action recognition datasets, Hollywood2 and UCF50, show the effectiveness of the proposed approach.
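To make the pipeline concrete, the following is a minimal Python sketch of its three stages on toy data. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the local graph descriptor, so the sketch assumes a hypothetical one: a star graph per trajectory point, described by the offsets to its Delaunay neighbours. The function names (`delaunay_edges`, `local_graphs`, `assignment_distance`, `bag_of_graphs`) and the `unmatched_penalty` parameter are likewise assumptions. The triangulation and the minimum-cost matching are computed by brute force, which is only practical for very small inputs; a Hungarian-algorithm solver would return the same optimal matching cost. The codebook is hand-picked here rather than learned by clustering.

```python
from itertools import combinations, permutations
from math import dist


def delaunay_edges(points):
    """Brute-force Delaunay triangulation of 2-D points; returns index pairs (i, j), i < j."""
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        (ax, ay), (bx, by), (cx, cy) = points[i], points[j], points[k]
        d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
        if d == 0:  # collinear triple: no circumcircle
            continue
        a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
        center = ((a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d,
                  (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d)
        radius = dist(center, points[i])
        others = (p for m, p in enumerate(points) if m not in (i, j, k))
        # Empty-circumcircle test: keep the triangle only if no other point lies inside.
        if all(dist(center, p) >= radius - 1e-9 for p in others):
            edges.update({(i, j), (i, k), (j, k)})
    return edges


def local_graphs(points, edges):
    """Hypothetical descriptor: one star graph per point, as offsets to its neighbours."""
    graphs = []
    for i, (x, y) in enumerate(points):
        nbrs = sorted(b if a == i else a for a, b in edges if i in (a, b))
        graphs.append([(points[n][0] - x, points[n][1] - y) for n in nbrs])
    return graphs


def assignment_distance(g1, g2, unmatched_penalty=1.0):
    """Minimum-cost one-to-one matching between the node descriptors of two graphs.

    Brute force over all matchings; unmatched nodes of the larger graph each
    cost `unmatched_penalty` (an assumed value).
    """
    small, large = sorted((g1, g2), key=len)
    best = min(
        sum(dist(u, large[idx]) for u, idx in zip(small, perm))
        for perm in permutations(range(len(large)), len(small))
    )
    return best + unmatched_penalty * (len(large) - len(small))


def bag_of_graphs(graphs, codebook):
    """Normalized histogram counting, for each codeword, the graphs nearest to it."""
    hist = [0.0] * len(codebook)
    for g in graphs:
        nearest = min(range(len(codebook)),
                      key=lambda c: assignment_distance(g, codebook[c]))
        hist[nearest] += 1
    total = sum(hist) or 1.0
    return [h / total for h in hist]


# Toy frame: four trajectory points (assumed coordinates).
points = [(0, 0), (2, 0), (1, 2), (1, -3)]
edges = delaunay_edges(points)
graphs = local_graphs(points, edges)
codebook = [graphs[0], graphs[2]]  # hand-picked stand-in for learned cluster centers
hist = bag_of_graphs(graphs, codebook)
print(sorted(edges))
print(hist)
```

In a full system the histogram would be accumulated over all frames of a video and passed to a classifier. The sketch stops at the per-frame histogram because the abstract gives no further detail.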
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sekma, M., Mejdoub, M., Ben Amar, C. (2015). Bag of Graphs with Geometric Relationships Among Trajectories for Better Human Action Recognition. In: Murino, V., Puppo, E. (eds) Image Analysis and Processing — ICIAP 2015. ICIAP 2015. Lecture Notes in Computer Science, vol 9279. Springer, Cham. https://doi.org/10.1007/978-3-319-23231-7_8
DOI: https://doi.org/10.1007/978-3-319-23231-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23230-0
Online ISBN: 978-3-319-23231-7
eBook Packages: Computer Science, Computer Science (R0)