TY - JOUR
T1 - SwinSegFormer
T2 - Advancing Aerial Image Semantic Segmentation for Flood Detection
AU - Shaheen, Muhammad Tariq
AU - Iqbal, Hafsa
AU - Khurshid, Numan
AU - Sadia, Haleema
AU - Saeed, Nasir
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2025
Y1 - 2025
AB - Semantic segmentation of aerial images is essential for unmanned aerial vehicle (UAV) applications in disaster management, particularly for identifying flood-affected areas. Traditional techniques struggle to capture global semantic information because of their limited receptive fields and high computational requirements. To address these issues, we propose a novel transformer-based model named SwinSegFormer, which features a hierarchical encoder that efficiently generates multi-scale high-resolution features, along with a lightweight decoder that reduces computational overhead. The proposed model is trained on the FloodNet dataset and performs well on challenging classes such as vehicles, pools, and flooded and non-flooded roads, which are crucial for effective disaster management. Additionally, we developed a post-processing module that categorizes areas as flooded or non-flooded. The model achieves a validation mIoU of 75.1%, mDice of 85.4%, and mACC of 87.1%, representing a 10-12% improvement over state-of-the-art vision transformer-based methods. The effectiveness of the model is further evaluated on real-world unlabeled flood imagery, highlighting its potential for supporting first aid activities during floods.
KW - Flood detection
KW - SegFormer
KW - semantic segmentation
KW - Swin transformer
KW - unmanned aerial vehicles (UAVs)
KW - vision transformers
UR - http://www.scopus.com/inward/record.url?scp=105003955098&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105003955098&partnerID=8YFLogxK
U2 - 10.1109/OJCS.2025.3565185
DO - 10.1109/OJCS.2025.3565185
M3 - Article
AN - SCOPUS:105003955098
SN - 2644-1268
VL - 6
SP - 645
EP - 657
JO - IEEE Open Journal of the Computer Society
JF - IEEE Open Journal of the Computer Society
ER -