TY - GEN
T1 - DivIDE
T2 - 26th International Conference on Scientific and Statistical Database Management, SSDBM 2014
AU - Khan, Hina A.
AU - Sharaf, Mohamed A.
AU - Albarrak, Abdullah
PY - 2014
Y1 - 2014
AB - Today, Interactive Data Exploration (IDE) has become a main constituent of many discovery-oriented applications, in which users repeatedly submit exploratory queries to identify interesting subspaces in large data sets. Returning relevant yet diverse results to such queries provides users with quick insights into a rather large data space. Meanwhile, search results diversification adds cost to an already computationally expensive exploration process. To address this challenge, in this paper, we propose a novel diversification scheme called DivIDE, which targets the problem of efficiently diversifying the results of queries posed during data exploration sessions. In particular, our scheme exploits the properties of data diversification functions while leveraging the natural overlap occurring between the results of different queries so as to provide significant reductions in processing costs. Our extensive experimental evaluation on both synthetic and real data sets shows the significant benefits provided by our scheme as compared to existing methods.
UR - http://www.scopus.com/inward/record.url?scp=84904434711&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84904434711&partnerID=8YFLogxK
U2 - 10.1145/2618243.2618253
DO - 10.1145/2618243.2618253
M3 - Conference contribution
AN - SCOPUS:84904434711
SN - 9781450327220
T3 - ACM International Conference Proceeding Series
BT - SSDBM 2014 - Proceedings of the 26th International Conference on Scientific and Statistical Database Management
PB - Association for Computing Machinery
Y2 - 30 June 2014 through 2 July 2014
ER -