TY - GEN
T1 - DiVE
T2 - 27th ACM International Conference on Information and Knowledge Management, CIKM 2018
AU - Mafrur, Rischan
AU - Sharaf, Mohamed A.
AU - Khan, Hina A.
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/10/17
Y1 - 2018/10/17
N2 - To support effective data exploration, there has been a growing interest in developing solutions that can automatically recommend data visualizations that reveal important data-driven insights. In such solutions, a large number of possible data visualization views are generated and ranked according to some metric of importance, then the top-k most important views are recommended. However, one drawback of that approach is that it often recommends similar views, leaving the data analyst with a limited number of insights. To address that limitation, in this work we posit that employing diversification techniques in the process of view recommendation eliminates that redundancy and provides a concise coverage of the possible insights to be discovered. To that end, we propose a hybrid objective utility function, which captures both the importance and the diversity of the insights revealed by the recommended views. While, in principle, traditional diversification methods provide plausible solutions under our proposed utility function, they suffer from a significantly high query processing cost. In particular, directly applying such methods leads to a process-first-diversify-next approach, in which all possible data visualizations are generated first by executing a large number of aggregate queries. To address that challenge, we propose the DiVE scheme, which efficiently selects the top-k recommended views based on our hybrid utility function. DiVE leverages the properties of both the importance and diversity metrics to prune a large number of query executions without compromising the quality of recommendations. Our experimental evaluation on real datasets shows the performance gains provided by DiVE.
AB - To support effective data exploration, there has been a growing interest in developing solutions that can automatically recommend data visualizations that reveal important data-driven insights. In such solutions, a large number of possible data visualization views are generated and ranked according to some metric of importance, then the top-k most important views are recommended. However, one drawback of that approach is that it often recommends similar views, leaving the data analyst with a limited number of insights. To address that limitation, in this work we posit that employing diversification techniques in the process of view recommendation eliminates that redundancy and provides a concise coverage of the possible insights to be discovered. To that end, we propose a hybrid objective utility function, which captures both the importance and the diversity of the insights revealed by the recommended views. While, in principle, traditional diversification methods provide plausible solutions under our proposed utility function, they suffer from a significantly high query processing cost. In particular, directly applying such methods leads to a process-first-diversify-next approach, in which all possible data visualizations are generated first by executing a large number of aggregate queries. To address that challenge, we propose the DiVE scheme, which efficiently selects the top-k recommended views based on our hybrid utility function. DiVE leverages the properties of both the importance and diversity metrics to prune a large number of query executions without compromising the quality of recommendations. Our experimental evaluation on real datasets shows the performance gains provided by DiVE.
UR - http://www.scopus.com/inward/record.url?scp=85058046265&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058046265&partnerID=8YFLogxK
U2 - 10.1145/3269206.3271744
DO - 10.1145/3269206.3271744
M3 - Conference contribution
AN - SCOPUS:85058046265
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1123
EP - 1132
BT - CIKM 2018 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management
A2 - Paton, Norman
A2 - Candan, Selcuk
A2 - Wang, Haixun
A2 - Allan, James
A2 - Agrawal, Rakesh
A2 - Labrinidis, Alexandros
A2 - Cuzzocrea, Alfredo
A2 - Zaki, Mohammed
A2 - Srivastava, Divesh
A2 - Broder, Andrei
A2 - Schuster, Assaf
PB - Association for Computing Machinery
Y2 - 22 October 2018 through 26 October 2018
ER -