Reinforcement Learning Framework for Delay Sensitive Energy Harvesting Wireless Sensor Networks

Hanan Al-Tous, Imad Barhumi

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)


Multi-hop energy harvesting wireless sensor networks (EH-WSNs) are a key enabler for future communication systems such as the internet-of-things. Optimal power management and route selection are important for the operation and successful deployment of EH-WSNs. The complexity of characterizing the optimal policies increases significantly with the number of nodes in the network. In this paper, an optimal control policy is devised based on minimum-delay transmission in a multi-hop EH-WSN using reinforcement learning (RL). The WSN consists of M EH sensor nodes aiming to transmit their data to a sink node with minimum delay. Each sensor node is equipped with a battery of limited capacity to store the harvested energy and a data buffer of limited size to store both its own sensed data and data relayed from neighboring nodes. Centralized and distributed RL algorithms are considered for EH-WSNs. In the centralized RL algorithm, the control action is taken at a central unit using the state information of all sensor nodes. In the distributed RL algorithm, the control action is taken locally at each sensor node using its own state information and the state information of neighboring nodes. The proposed RL algorithms are based on the state-action-reward-state-action (SARSA) algorithm. Simulation results demonstrate the merits of the proposed algorithms.
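For readers unfamiliar with SARSA, the following is a minimal tabular sketch of the generic on-policy update, not the authors' implementation. The abstract does not specify the state and action encodings, the reward, or the hyperparameters, so the state labels, the action names (`transmit`, `idle`), the negative-delay reward, and the values of `alpha`, `gamma`, and `epsilon` are all illustrative assumptions. The keywords indicate that the paper uses action-value-function approximation; a lookup table is substituted here only to keep the example short.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """Generic SARSA step: move Q(s, a) toward reward + gamma * Q(s', a')."""
    td_target = reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Hypothetical single-node example. A state could encode battery and buffer
# levels; plain labels stand in for that here (assumption).
q = defaultdict(float)            # unseen (state, action) pairs start at 0.0
rng = random.Random(0)
actions = ["transmit", "idle"]    # hypothetical action set

s, a = "s0", "transmit"
reward = -1.0                     # assumed reward: negative of one slot of delay
s_next = "s1"
a_next = epsilon_greedy(q, s_next, actions, epsilon=0.1, rng=rng)
sarsa_update(q, s, a, reward, s_next, a_next)
```

The update uses the action actually selected in the next state (`a_next`), which is what makes SARSA on-policy. In the centralized variant described above, the state would presumably aggregate the information of all nodes, while in the distributed variant each node would keep its own estimate over its local and neighboring state.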

Original language: English
Article number: 9292079
Pages (from-to): 7103-7113
Number of pages: 11
Journal: IEEE Sensors Journal
Issue number: 5
Publication status: Published - Mar 1 2021


Keywords

  • Wireless sensor network
  • action-value-function approximation
  • energy harvesting
  • reinforcement learning

ASJC Scopus subject areas

  • Instrumentation
  • Electrical and Electronic Engineering

