TY - GEN
T1 - Real Time Detection of Social Bots on Twitter Using Machine Learning and Apache Kafka
AU - Alothali, Eiman
AU - Alashwal, Hany
AU - Salih, Motamen
AU - Hayawi, Kadhim
N1 - Funding Information:
This work was supported by Zayed University RIF grant R20132.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Social media networks, like Facebook and Twitter, are increasingly becoming an important part of most people's lives. Twitter provides a useful platform for sharing content, ideas, and opinions, and for promoting products and election campaigns. Due to its increased popularity, it has become vulnerable to malicious attacks caused by social bots. Social bots are automated accounts created for different purposes. They are involved in spreading rumors and false information, cyberbullying, spamming, and manipulating the ecosystem of social networks. Most social bot detection methods rely on offline data for both training and testing. In this paper, we use Apache Kafka, a big data analytics tool, to stream data from the Twitter API in real time. We use profile information (metadata) as features. A machine learning technique is applied to predict whether the incoming data comes from a human or a bot. In addition, the paper presents technical details of how to configure these different tools.
AB - Social media networks, like Facebook and Twitter, are increasingly becoming an important part of most people's lives. Twitter provides a useful platform for sharing content, ideas, and opinions, and for promoting products and election campaigns. Due to its increased popularity, it has become vulnerable to malicious attacks caused by social bots. Social bots are automated accounts created for different purposes. They are involved in spreading rumors and false information, cyberbullying, spamming, and manipulating the ecosystem of social networks. Most social bot detection methods rely on offline data for both training and testing. In this paper, we use Apache Kafka, a big data analytics tool, to stream data from the Twitter API in real time. We use profile information (metadata) as features. A machine learning technique is applied to predict whether the incoming data comes from a human or a bot. In addition, the paper presents technical details of how to configure these different tools.
KW - Apache Kafka
KW - Social bots
KW - Streaming Machine learning
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85123220595&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123220595&partnerID=8YFLogxK
U2 - 10.1109/CSNet52717.2021.9614282
DO - 10.1109/CSNet52717.2021.9614282
M3 - Conference contribution
AN - SCOPUS:85123220595
T3 - 2021 5th Cyber Security in Networking Conference, CSNet 2021
SP - 98
EP - 102
BT - 2021 5th Cyber Security in Networking Conference, CSNet 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th Cyber Security in Networking Conference, CSNet 2021
Y2 - 12 October 2021 through 14 October 2021
ER -