TY - JOUR
T1 - Dynamic Data Sample Selection and Scheduling in Edge Federated Learning
AU - Serhani, Mohamed Adel
AU - Abreha, Haftay Gebreslasie
AU - Tariq, Asadullah
AU - Hayajneh, Mohammad
AU - Xu, Yang
AU - Hayawi, Kadhim
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2023
Y1 - 2023
AB - Federated Learning (FL) is a state-of-the-art paradigm used in Edge Computing (EC). It enables distributed training on cross-device data, achieving efficient performance while preserving data privacy. In the era of Big Data, the Internet of Things (IoT), and data streaming, challenges such as monitoring and management remain unresolved. Edge IoT devices produce and stream huge volumes of data samples, and using all of these samples during local updates can incur significant processing, computation, and storage costs. Many research initiatives have improved FL algorithms for homogeneous networks. However, in a typical distributed learning scenario, data is generated independently by each device, and this heterogeneous data has different distribution characteristics. As a result, the data stream used by each device for local learning, often characterized as Big Data, is unbalanced and not independent and identically distributed (non-IID). Such data heterogeneity can degrade the performance of FL and reduce resource utilization. In this paper, we present DSS-Edge-FL, a Dynamic Sample Selection optimization algorithm that aims to optimize resource usage and address data heterogeneity. Extensive experimental results demonstrate that the proposed approach outperforms conventional training methods, with lower convergence time and improved resource efficiency.
KW - big data
KW - data streaming
KW - dynamic resource allocation
KW - edge computing
KW - federated learning
KW - intelligent edge
UR - http://www.scopus.com/inward/record.url?scp=85171535878&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85171535878&partnerID=8YFLogxK
U2 - 10.1109/OJCOMS.2023.3313257
DO - 10.1109/OJCOMS.2023.3313257
M3 - Article
AN - SCOPUS:85171535878
SN - 2644-125X
VL - 4
SP - 2133
EP - 2149
JO - IEEE Open Journal of the Communications Society
JF - IEEE Open Journal of the Communications Society
ER -