TY - GEN
T1 - Toward optimal streaming feature selection
AU - Al Nuaimi, Noura
AU - Masud, Mohammad M.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - Real-time data sources have recently driven an explosion of big data that challenges traditional data mining techniques. Analyzing data as it arrives would allow better decisions to be made in real time. Big data usually contains much irrelevant and redundant data, so removing it is essential. Streaming feature selection for big data has generally been viewed as a solution for selecting informative features that lead to accurate learning models. In this paper, we introduce an efficient algorithm that selects features from a feature stream by online feature grouping. The technique is useful for big data analytics because of its efficiency and scalability. The main contribution of this work is a streaming feature grouping and selection algorithm that addresses the extremely high dimensionality of big data. The algorithm groups similar features to reduce redundancy and handles the stream of features in an online fashion. Experimental results demonstrate that the proposed algorithm shows superior performance in terms of prediction accuracy and running time.
AB - Real-time data sources have recently driven an explosion of big data that challenges traditional data mining techniques. Analyzing data as it arrives would allow better decisions to be made in real time. Big data usually contains much irrelevant and redundant data, so removing it is essential. Streaming feature selection for big data has generally been viewed as a solution for selecting informative features that lead to accurate learning models. In this paper, we introduce an efficient algorithm that selects features from a feature stream by online feature grouping. The technique is useful for big data analytics because of its efficiency and scalability. The main contribution of this work is a streaming feature grouping and selection algorithm that addresses the extremely high dimensionality of big data. The algorithm groups similar features to reduce redundancy and handles the stream of features in an online fashion. Experimental results demonstrate that the proposed algorithm shows superior performance in terms of prediction accuracy and running time.
KW - Big data
KW - Feature grouping
KW - Stream of features
KW - Streaming feature
KW - Streaming feature grouping
KW - Streaming feature selection
UR - http://www.scopus.com/inward/record.url?scp=85046267383&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046267383&partnerID=8YFLogxK
U2 - 10.1109/DSAA.2017.81
DO - 10.1109/DSAA.2017.81
M3 - Conference contribution
AN - SCOPUS:85046267383
T3 - Proceedings - 2017 International Conference on Data Science and Advanced Analytics, DSAA 2017
SP - 775
EP - 782
BT - Proceedings - 2017 International Conference on Data Science and Advanced Analytics, DSAA 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Conference on Data Science and Advanced Analytics, DSAA 2017
Y2 - 19 October 2017 through 21 October 2017
ER -