ANTi-Vax: a novel Twitter dataset for COVID-19 vaccine misinformation detection

K. Hayawi, S. Shahriar, M. A. Serhani, I. Taleb, S. S. Mathew

Research output: Contribution to journal › Article › peer-review

97 Citations (Scopus)


Objectives: The COVID-19 (SARS-CoV-2) pandemic has infected hundreds of millions of people and caused millions of deaths around the globe. The introduction of COVID-19 vaccines provided a glimmer of hope and a pathway to recovery. However, misinformation spread on social media and other platforms has contributed to a rise in vaccine hesitancy, which can reduce vaccine uptake in the population. The goal of this research is to introduce a novel machine learning–based COVID-19 vaccine misinformation detection framework.

Study design: We collected and annotated COVID-19 vaccine tweets and trained machine learning algorithms to classify vaccine misinformation.

Methods: More than 15,000 tweets were annotated as either misinformation or general vaccine tweets using reliable sources, and the annotations were validated by medical experts. The classification models explored were XGBoost, LSTM, and the BERT transformer model.

Results: The best classification performance was obtained using BERT, which achieved an F1-score of 0.98 on the test set. The precision and recall scores were 0.97 and 0.98, respectively.

Conclusion: Machine learning–based models are effective in detecting misinformation regarding COVID-19 vaccines on social media platforms.
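For readers less familiar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1-score are computed for a binary classifier. It is not the authors' evaluation code: the function name `precision_recall_f1`, the convention that label 1 means "misinformation", and the toy labels are all assumptions made for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class.

    Illustrative helper (hypothetical name); label 1 is assumed to
    mean "misinformation" and 0 "general vaccine tweet".
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: misinformation correctly flagged.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: general tweets wrongly flagged.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: misinformation that was missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels, invented for this example only (not data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Applying the same harmonic-mean formula to the rounded precision (0.97) and recall (0.98) gives roughly 0.975, which is consistent with the reported F1-score of 0.98 once rounding of the inputs is taken into account.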

Original language: English
Pages (from-to): 23-30
Number of pages: 8
Journal: Public Health
Publication status: Published - Feb 2022


Keywords

  • COVID-19
  • Deep learning
  • Misinformation detection
  • Natural language processing
  • Text classification
  • Vaccines

ASJC Scopus subject areas

  • Public Health, Environmental and Occupational Health

