TY - JOUR
T1 - A Graph-Based Approach to Recognizing Complex Human Object Interactions in Sequential Data
AU - Ghadi, Yazeed Yasin
AU - Waheed, Manahil
AU - Gochoo, Munkhjargal
AU - Alsuhibany, Suliman A.
AU - Chelloug, Samia Allaoua
AU - Jalal, Ahmad
AU - Park, Jeongmin
N1 - Funding Information:
Acknowledgments: In addition; the authors would like to thank the support of the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University. This work has also been supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R239), Princess Nourah bint Abdulrahman.University, Riyadh, Saudi Arabia.
Funding Information:
Funding: This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). It was also supported by Emirates Center for Mobility Research (ECMR) Grant #12R012.
Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2022/5/1
Y1 - 2022/5/1
N2 - The critical task of recognizing human–object interactions (HOI) finds its application in the domains of surveillance, security, healthcare, assisted living, rehabilitation, sports, and online learning. This has led to the development of various HOI recognition systems in the recent past. Thus, the purpose of this study is to develop a novel graph-based solution for this purpose. In par-ticular, the proposed system takes sequential data as input and recognizes the HOI interaction being performed in it. That is, first of all, the system pre-processes the input data by adjusting the contrast and smoothing the incoming image frames. Then, it locates the human and object through image segmentation. Based on this, 12 key body parts are identified from the extracted human silhouette through a graph-based image skeletonization technique called image foresting transform (IFT). Then, three types of features are extracted: full-body feature, point-based features, and scene fea-tures. The next step involves optimizing the different features using isometric mapping (ISOMAP). Lastly, the optimized feature vector is fed to a graph convolution network (GCN) which performs the HOI classification. The performance of the proposed system was validated using three bench-mark datasets, namely, Olympic Sports, MSR Daily Activity 3D, and D3D-HOI. The results showed that this model outperforms the existing state-of-the-art models by achieving a mean accuracy of 94.1% with the Olympic Sports, 93.2% with the MSR Daily Activity 3D, and 89.6% with the D3D-HOI datasets.
AB - The critical task of recognizing human–object interactions (HOI) finds its application in the domains of surveillance, security, healthcare, assisted living, rehabilitation, sports, and online learning. This has led to the development of various HOI recognition systems in the recent past. Thus, the purpose of this study is to develop a novel graph-based solution for this purpose. In par-ticular, the proposed system takes sequential data as input and recognizes the HOI interaction being performed in it. That is, first of all, the system pre-processes the input data by adjusting the contrast and smoothing the incoming image frames. Then, it locates the human and object through image segmentation. Based on this, 12 key body parts are identified from the extracted human silhouette through a graph-based image skeletonization technique called image foresting transform (IFT). Then, three types of features are extracted: full-body feature, point-based features, and scene fea-tures. The next step involves optimizing the different features using isometric mapping (ISOMAP). Lastly, the optimized feature vector is fed to a graph convolution network (GCN) which performs the HOI classification. The performance of the proposed system was validated using three bench-mark datasets, namely, Olympic Sports, MSR Daily Activity 3D, and D3D-HOI. The results showed that this model outperforms the existing state-of-the-art models by achieving a mean accuracy of 94.1% with the Olympic Sports, 93.2% with the MSR Daily Activity 3D, and 89.6% with the D3D-HOI datasets.
KW - dense trajectories
KW - graph convolution network
KW - human–object interaction
KW - image foresting transform
KW - image skeletonization
UR - http://www.scopus.com/inward/record.url?scp=85130896900&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130896900&partnerID=8YFLogxK
U2 - 10.3390/app12105196
DO - 10.3390/app12105196
M3 - Article
AN - SCOPUS:85130896900
SN - 2076-3417
VL - 12
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 10
M1 - 5196
ER -