TY - GEN
T1 - Knowledge-Infused Learning for Fine-Grained Plant Disease Recognition
AU - Ahmad, Jamil
AU - Gueaieb, Wail
AU - El Saddik, Abdulmotaleb
AU - De Masi, Giulia
AU - Karray, Fakhri
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Domain knowledge exists in various forms, including text, ontologies, graphs, images, audio, and videos. In plant disease detection, most existing work relies solely on images with disease labels, neglecting the textual descriptions of visual disease symptoms that human experts use for diagnosis. These text descriptions, together with sample images, help experts identify visual symptoms. We propose a novel method that leverages both text descriptions and image data by modeling domain-specific knowledge about visual symptoms in leaf images as separate feature channels. Each channel corresponds to specific features whose presence or absence in the image influences model predictions. We introduce a channel attention-guided fusion module that weights each channel based on the input and the corresponding output. The combined feature channels are transformed into a standardized 3-channel input format, which can then be fed to any pre-trained convolutional neural network (CNN) for feature extraction and subsequent classification. Furthermore, the intermediate activations of the channel attention layer, combined with the weights from the fusion layer, make model predictions explainable. Experimental results on three publicly available datasets of apple and cucumber leaf diseases show improvements of up to 5% across various state-of-the-art CNN architectures, demonstrating the efficacy of incorporating textual disease descriptions with the proposed approach.
KW - channel attention
KW - disease detection
KW - explainability
KW - feature extraction
KW - knowledge-infusion
UR - http://www.scopus.com/inward/record.url?scp=85216882759&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85216882759&partnerID=8YFLogxK
DO - 10.1109/ICIP51287.2024.10648166
M3 - Conference contribution
AN - SCOPUS:85216882759
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 395
EP - 401
BT - 2024 IEEE International Conference on Image Processing, ICIP 2024 - Proceedings
PB - IEEE Computer Society
T2 - 31st IEEE International Conference on Image Processing, ICIP 2024
Y2 - 27 October 2024 through 30 October 2024
ER -