Slow Replica and Shared Protection: Energy-Efficient and Reliable Task Assignment in Cloud Data Centers

Yuqi Fan, Chen Wang, Weili Wu, Taieb Znati, Dingzhu Du

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

With the explosive growth in the scale of cloud computing infrastructures, reliability and energy efficiency have become important concerns given the complexity of cloud data centers. Efficient task assignment, which dispatches tasks to appropriate data center servers, is critical to achieving both. Most research on task assignment focuses on only one of these two objectives, although they intrinsically conflict with each other. In this paper, we address task assignment in data centers with the objective of minimizing energy consumption while tolerating task execution failures. We propose a reliability-aware and energy-efficient task replica assignment algorithm that runs task replicas at a low speed and lets multiple task replicas share the same server resources. Each task in a job processed by the cloud computing platform has two instances: a main task and a task replica (shadow). Each main task runs on an individual server, and its task replica is assigned to a different server. The main tasks run at full server speed, while the task replicas run at a lower speed. The task replicas can be mapped onto dedicated backup servers or assigned to servers on which main tasks are running. Multiple task replicas can share the same server resources to reduce the number of servers required. Simulation results demonstrate that the proposed algorithm effectively reduces energy consumption while achieving a good balance between the number of servers used and job completion time.
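The placement constraints described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simple first-fit heuristic written under stated assumptions. The names `Server`, `assign_tasks`, and `replica_speed` are hypothetical, server capacity is normalized to 1.0, a main task is assumed to occupy its whole server, and replicas are packed only onto dedicated backup servers (the abstract also allows placing them on servers that run main tasks, which this sketch omits).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Server:
    sid: int
    main_task: Optional[str] = None                     # at most one main task per server
    replicas: List[str] = field(default_factory=list)   # slow replicas sharing this server
    load: float = 0.0                                   # fraction of normalized capacity in use


def assign_tasks(
    tasks: List[str], replica_speed: float = 0.25
) -> Tuple[List[Server], Dict[str, Tuple[int, int]]]:
    """First-fit sketch: one server per main task, replicas share backup servers.

    replica_speed is the (assumed) fraction of full server speed that one
    replica consumes; task names are assumed to be unique.
    """
    if not 0.0 < replica_speed <= 1.0:
        raise ValueError("replica_speed must be in (0, 1]")

    # Each main task runs at full speed on its own server.
    mains = [Server(sid=i, main_task=t, load=1.0) for i, t in enumerate(tasks)]

    backups: List[Server] = []
    placement: Dict[str, Tuple[int, int]] = {}
    for main_server in mains:
        # Reuse the first backup server with enough spare capacity.
        target = next(
            (b for b in backups if b.load + replica_speed <= 1.0 + 1e-9), None
        )
        if target is None:
            target = Server(sid=len(mains) + len(backups))
            backups.append(target)
        target.replicas.append(main_server.main_task)
        target.load += replica_speed
        # The replica always lands on a different server than its main task.
        placement[main_server.main_task] = (main_server.sid, target.sid)

    return mains + backups, placement
```

Under these assumptions, calling `assign_tasks([f"t{i}" for i in range(8)])` places 8 main tasks on 8 servers and packs their 8 replicas onto 2 shared backup servers, so 10 servers are used where full-speed, unshared replication would need 16.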

Original language: English
Article number: 8759087
Pages (from-to): 931-943
Number of pages: 13
Journal: IEEE Transactions on Reliability
Volume: 70
Issue number: 3
Publication status: Published - Sep 2021
Externally published: Yes

Keywords

  • Energy efficient
  • reliability
  • shadow
  • task assignment
  • task replication

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Electrical and Electronic Engineering
