TY - JOUR
T1 - Maximum entropy scaled super pixels segmentation for multi-object detection and scene recognition via deep belief network
AU - Rafique, Adnan Ahmed
AU - Gochoo, Munkhjargal
AU - Jalal, Ahmad
AU - Kim, Kibum
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023/4
Y1 - 2023/4
N2 - Recent advances in visionary technologies impacted multi-object recognition and scene understanding. Such scene-understanding tasks are a demanding part of several technologies such as augmented reality based scene integration, robotic navigation, autonomous driving and tourist guide applications. Incorporating visual information in contextually unified segments, super-pixel-based approaches significantly mitigate the clutter, which is normal in pixel wise frameworks during scene understanding. Super-pixels allow customized shapes and variable size patches of connected components to be obtained. Furthermore, the computational time for these segmentation approaches can significantly decreased due to the reduced number of super-pixel target clusters. Hence, the super pixel-based approaches are more commonly used in robotics, computer vision and other intelligent systems. In this paper, we propose a Maximum Entropy scaled Super-Pixels (MEsSP) Segmentation method that encapsulates super-pixel segmentation based on an Entropy Model and utilizes local energy terms to label the pixels. Initially, after acquisition and pre-processing, image is segmented by two different methods: Fuzzy C-Means (FCM) and MEsSP. Then, to extract the features from these segmented objects, the dynamic geometrical features, fast Fourier transform (FFT), blob extraction, Maximally Stable Extremal Regions (MSER) and KAZE features are extracted using the bag of features approach. Then, to categorize the objects, multiple kernel learning is applied. Finally, a deep belief network (DBN) assigns the relevant labels to the scenes based on the categorized objects, intersection over union scores and dice similarity coefficient. The experimental results regarding multiple objects recognition accuracy, precision, recall and F1 scores over PASCAL VOC, Caltech 101 and UIUC Sports datasets show a remarkable performance. In addition, the evaluation of proposed scene recognition method over these benchmark datasets outperforms the state of the art (SOTA) methods.
AB - Recent advances in visionary technologies impacted multi-object recognition and scene understanding. Such scene-understanding tasks are a demanding part of several technologies such as augmented reality based scene integration, robotic navigation, autonomous driving and tourist guide applications. Incorporating visual information in contextually unified segments, super-pixel-based approaches significantly mitigate the clutter, which is normal in pixel wise frameworks during scene understanding. Super-pixels allow customized shapes and variable size patches of connected components to be obtained. Furthermore, the computational time for these segmentation approaches can significantly decreased due to the reduced number of super-pixel target clusters. Hence, the super pixel-based approaches are more commonly used in robotics, computer vision and other intelligent systems. In this paper, we propose a Maximum Entropy scaled Super-Pixels (MEsSP) Segmentation method that encapsulates super-pixel segmentation based on an Entropy Model and utilizes local energy terms to label the pixels. Initially, after acquisition and pre-processing, image is segmented by two different methods: Fuzzy C-Means (FCM) and MEsSP. Then, to extract the features from these segmented objects, the dynamic geometrical features, fast Fourier transform (FFT), blob extraction, Maximally Stable Extremal Regions (MSER) and KAZE features are extracted using the bag of features approach. Then, to categorize the objects, multiple kernel learning is applied. Finally, a deep belief network (DBN) assigns the relevant labels to the scenes based on the categorized objects, intersection over union scores and dice similarity coefficient. The experimental results regarding multiple objects recognition accuracy, precision, recall and F1 scores over PASCAL VOC, Caltech 101 and UIUC Sports datasets show a remarkable performance. In addition, the evaluation of proposed scene recognition method over these benchmark datasets outperforms the state of the art (SOTA) methods.
KW - Bag of features
KW - Deep belief network
KW - Entropy-scaled segmentation
KW - Super-pixels
UR - http://www.scopus.com/inward/record.url?scp=85138405968&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85138405968&partnerID=8YFLogxK
U2 - 10.1007/s11042-022-13717-y
DO - 10.1007/s11042-022-13717-y
M3 - Article
AN - SCOPUS:85138405968
SN - 1380-7501
VL - 82
SP - 13401
EP - 13430
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 9
ER -