TY - GEN
T1 - Multistream regression with asynchronous concept drift detection
AU - Dong, Bo
AU - Li, Yifan
AU - Gao, Yang
AU - Haque, Ahsanul
AU - Khan, Latifur
AU - Masud, Mohammad M.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/1
Y1 - 2017/7/1
N2 - A recently introduced problem setting, referred to as multistream, involves two independent non-stationary data-generating processes. One, called the source stream, continuously generates data instances with true output. The other, called the target stream, generates data instances that lack true output. In past studies, prediction problems over data streams have been addressed under scenarios such as covariate shift or concept drift by examining one assumption while holding the others fixed. For example, it is assumed that the data distributions of training and testing data are similar, and that true output values of the stream instances become available soon. In practice, however, these assumptions are not always valid. The multistream regression problem is to predict the output of the target stream using data instances and their true output from the source stream. In this paper, we propose an approach to multistream regression that incorporates concept drift detection into covariate shift adaptation. Empirical evaluation on synthetic and real-world datasets demonstrates the effectiveness of the proposed technique against state-of-the-art approaches. Experimental results indicate that our method significantly improves prediction performance compared to existing benchmarks.
KW - concept drift
KW - covariate shift
KW - multistream
KW - regression
UR - http://www.scopus.com/inward/record.url?scp=85047806960&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047806960&partnerID=8YFLogxK
U2 - 10.1109/BigData.2017.8257975
DO - 10.1109/BigData.2017.8257975
M3 - Conference contribution
AN - SCOPUS:85047806960
T3 - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
SP - 596
EP - 605
BT - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
A2 - Nie, Jian-Yun
A2 - Obradovic, Zoran
A2 - Suzumura, Toyotaro
A2 - Ghosh, Rumi
A2 - Nambiar, Raghunath
A2 - Wang, Chonggang
A2 - Zang, Hui
A2 - Baeza-Yates, Ricardo
A2 - Hu, Xiaohua
A2 - Kepner, Jeremy
A2 - Cuzzocrea, Alfredo
A2 - Tang, Jian
A2 - Toyoda, Masashi
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th IEEE International Conference on Big Data, Big Data 2017
Y2 - 11 December 2017 through 14 December 2017
ER -