A Generic Framework for Enhancing Autonomous Driving Accuracy through Multimodal Data Fusion

Henry Alexander Ignatious, Hesham El-Sayed, Manzoor Ahmed Khan, Parag Kulkarni

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


Higher-level autonomous driving requires that critical maneuvers be executed reliably under all conditions. Most of the accidents in recent years caused by autonomous vehicles (AVs) launched by leading automobile manufacturers were due to inadequate decision-making, which resulted from poor perception of environmental information. AVs today use a variety of sensors to collect environmental information. Because of technical faults and adverse natural conditions, the information acquired by the sensors may be incomplete or unclear, so the AVs may misinterpret it, leading to inadequate decision-making and potentially to fatal accidents. To overcome this drawback, effective preprocessing of raw sensory data is mandatory. Preprocessing the sensory data involves two vital tasks: data cleaning and data fusion. Since the raw sensory data are complex and exhibit multimodal characteristics, more emphasis is given to data preprocessing. Because many innovative models have already been proposed for data cleaning, this study focused on data fusion. In particular, it proposed a generic data fusion engine that classifies different formats of sensory data and fuses them accordingly to improve accuracy, within a generic framework for fusing text, image, and audio data. In the first stage of this research, an innovative hybrid model was proposed to fuse multispectral image and video data, along with simple and efficient models to extract salient image features. The proposed hybrid image fusion model did not yield satisfactory outcomes when combining 3D point cloud data, and its performance declined on large datasets. To address this issue, the study was extended by introducing an advanced generative adversarial network (GAN) to transform the hybrid image fusion model into a machine learning model capable of handling substantial datasets. Additionally, customized kernel functions were proposed to fuse 3D point cloud data effectively. The performance of the proposed models was assessed using standard metrics and datasets and compared with that of existing popular models. The results revealed that the proposed image fusion model outperformed the other models.
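
The idea of an engine that first classifies sensory data by format and then fuses each format with its own routine can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`classify`, `fuse`, `fuse_text`, `fuse_audio`, `fuse_image`), the type-based modality check, and the simple averaging and concatenation rules are all assumptions chosen only to show the classify-then-fuse control flow.

```python
from collections import defaultdict
from typing import Any, Dict, List


def classify(sample: Any) -> str:
    """Hypothetical modality check based only on Python type and shape."""
    if isinstance(sample, str):
        return "text"
    if isinstance(sample, list) and sample and isinstance(sample[0], list):
        return "image"  # assumed: 2D grid of pixel intensities
    if isinstance(sample, list):
        return "audio"  # assumed: 1D sequence of amplitudes
    raise ValueError(f"unsupported sample type: {type(sample).__name__}")


def fuse_text(items: List[str]) -> str:
    # Placeholder rule: join the text fragments in arrival order.
    return " ".join(items)


def fuse_audio(items: List[List[float]]) -> List[float]:
    # Placeholder rule: element-wise mean of equal-length signals.
    return [sum(vals) / len(vals) for vals in zip(*items)]


def fuse_image(items: List[List[List[float]]]) -> List[List[float]]:
    # Placeholder rule: pixel-wise mean of equally sized images.
    return [fuse_audio(list(rows)) for rows in zip(*items)]


FUSERS = {"text": fuse_text, "audio": fuse_audio, "image": fuse_image}


def fuse(samples: List[Any]) -> Dict[str, Any]:
    """Group samples by modality, then apply the matching fusion routine."""
    groups: Dict[str, List[Any]] = defaultdict(list)
    for sample in samples:
        groups[classify(sample)].append(sample)
    return {modality: FUSERS[modality](items) for modality, items in groups.items()}


fused = fuse([
    "car ahead",
    [[0, 2], [4, 6]],
    [[2, 4], [6, 8]],
    [1.0, 3.0],
    [3.0, 5.0],
    "wet road",
])
print(fused)
```

In a real system each placeholder rule would be replaced by a modality-specific model; the dispatch table is the only part this sketch is meant to convey.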

Original language: English
Article number: 10749
Journal: Applied Sciences (Switzerland)
Issue number: 19
Publication status: Published - Oct 2023


  • autonomous vehicles (AVs)
  • data fusion
  • data-preprocessing
  • machine learning (ML)
  • situation awareness

ASJC Scopus subject areas

  • General Materials Science
  • Instrumentation
  • General Engineering
  • Process Chemistry and Technology
  • Computer Science Applications
  • Fluid Flow and Transfer Processes

