Research Article

Human Action Recognition through the First-Person Point of view, Case Study Two Basic Task

by Mohammad Almasi, Hamed Fathi, Sayed Adel Ghaeinian, Samaneh Samiee
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 177 - Issue 24
Published: Dec 2019
Authors: Mohammad Almasi, Hamed Fathi, Sayed Adel Ghaeinian, Samaneh Samiee
DOI: 10.5120/ijca2019919703

Mohammad Almasi, Hamed Fathi, Sayed Adel Ghaeinian, Samaneh Samiee . Human Action Recognition through the First-Person Point of view, Case Study Two Basic Task. International Journal of Computer Applications. 177, 24 (Dec 2019), 19-23. DOI=10.5120/ijca2019919703

@article{ 10.5120/ijca2019919703,
  author    = { Mohammad Almasi and Hamed Fathi and Sayed Adel Ghaeinian and Samaneh Samiee },
  title     = { Human Action Recognition through the First-Person Point of view, Case Study Two Basic Task },
  journal   = { International Journal of Computer Applications },
  year      = { 2019 },
  volume    = { 177 },
  number    = { 24 },
  pages     = { 19-23 },
  doi       = { 10.5120/ijca2019919703 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}
                        %0 Journal Article
                        %D 2019
                        %A Mohammad Almasi
                        %A Hamed Fathi
                        %A Sayed Adel Ghaeinian
                        %A Samaneh Samiee
%T Human Action Recognition through the First-Person Point of view, Case Study Two Basic Task
                        %J International Journal of Computer Applications
                        %V 177
                        %N 24
                        %P 19-23
                        %R 10.5120/ijca2019919703
                        %I Foundation of Computer Science (FCS), NY, USA
Abstract

In this study, a human motion dataset is built from indoor and outdoor actions recorded with a head-mounted camera and an Xsens motion-tracking system. The key point in structuring the dataset is that it is used to train a sequential Deep Neural Network and to order the sequence of frames within each performed task (washing, eating, etc.). Finally, a 3D model of the person is estimated at every frame, using a structure comparable to that of the first network. The dataset comprises more than 120,000 frames recorded from 7 different people, each performing different tasks in diverse indoor and outdoor scenarios. The frames of every video sequence were 3D-synchronized and segmented into 23 body parts.
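The keywords point to LSTM-based modeling of the ordered frame sequences. As a minimal illustrative sketch (not the paper's actual network; the feature size, hidden size, and weights below are made-up placeholders), a single numpy LSTM cell run across a toy sequence of per-frame feature vectors looks like:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the input, forget,
    output, and candidate gates along the first axis."""
    z = W @ x + U @ h_prev + b          # pre-activations for all four gates
    n = h_prev.size
    i = 1.0 / (1.0 + np.exp(-z[:n]))      # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))   # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*n:3*n])) # output gate
    g = np.tanh(z[3*n:])                  # candidate cell state
    c = f * c_prev + i * g                # new cell state
    h = o * np.tanh(c)                    # new hidden state
    return h, c

rng = np.random.default_rng(0)
feat_dim, hidden = 8, 4                 # hypothetical per-frame feature / state sizes
W = rng.standard_normal((4 * hidden, feat_dim)) * 0.1
U = rng.standard_normal((4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
frames = rng.standard_normal((10, feat_dim))  # a toy 10-frame sequence
for x in frames:                        # run the cell over the frame sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)                          # final hidden state summarizes the clip: (4,)
```

The final hidden state `h` would then feed a classifier over the task labels (washing, eating, etc.); real systems use a trained framework implementation rather than hand-rolled weights.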

Index Terms
Computer Science
Information Sciences
Keywords

Machine learning, deep learning, computer vision, LSTM, recurrent neural network, ResNet, motion recognition.
