TY - GEN
T1 - Adaptive query scheduling in key-value data stores
AU - Xu, Chen
AU - Sharaf, Mohamed
AU - Zhou, Minqi
AU - Zhou, Aoying
AU - Zhou, Xiaofang
PY - 2013
Y1 - 2013
N2 - Large-scale distributed systems such as Dynamo at Amazon, PNUTS at Yahoo!, and Cassandra at Facebook are rapidly becoming the data management platform of choice for most web applications. These key-value data stores rely on data partitioning and replication to achieve higher levels of availability and scalability. Such design choices typically exhibit a trade-off in which data freshness is sacrificed in favor of reduced access latencies. Hence, it is essential to optimize resource allocation in order to minimize: 1) query tardiness, i.e., maximize Quality of Service (QoS), and 2) data staleness, i.e., maximize Quality of Data (QoD). This trade-off between QoS and QoD is further manifested at the local level (i.e., replica level) and is primarily shaped by the resource allocation strategies deployed for managing the processing of foreground user queries and background system updates. To this end, we propose the AFIT scheduling strategy, which allows for selective data refreshing and integrates the benefits of SJF-based scheduling with an EDF-like policy. Our experiments demonstrate the effectiveness of our method, which not only strikes a fine trade-off between QoS and QoD but also automatically adapts to workload settings.
AB - Large-scale distributed systems such as Dynamo at Amazon, PNUTS at Yahoo!, and Cassandra at Facebook are rapidly becoming the data management platform of choice for most web applications. These key-value data stores rely on data partitioning and replication to achieve higher levels of availability and scalability. Such design choices typically exhibit a trade-off in which data freshness is sacrificed in favor of reduced access latencies. Hence, it is essential to optimize resource allocation in order to minimize: 1) query tardiness, i.e., maximize Quality of Service (QoS), and 2) data staleness, i.e., maximize Quality of Data (QoD). This trade-off between QoS and QoD is further manifested at the local level (i.e., replica level) and is primarily shaped by the resource allocation strategies deployed for managing the processing of foreground user queries and background system updates. To this end, we propose the AFIT scheduling strategy, which allows for selective data refreshing and integrates the benefits of SJF-based scheduling with an EDF-like policy. Our experiments demonstrate the effectiveness of our method, which not only strikes a fine trade-off between QoS and QoD but also automatically adapts to workload settings.
UR - http://www.scopus.com/inward/record.url?scp=84892843386&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84892843386&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-37487-6_9
DO - 10.1007/978-3-642-37487-6_9
M3 - Conference contribution
AN - SCOPUS:84892843386
SN - 9783642374869
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 86
EP - 100
BT - Database Systems for Advanced Applications - 18th International Conference, DASFAA 2013, Proceedings
T2 - 18th International Conference on Database Systems for Advanced Applications, DASFAA 2013
Y2 - 22 April 2013 through 25 April 2013
ER -