Emotion and memory model for social robots: a reinforcement learning based behaviour selection

Muneeb Imtiaz Ahmad, Yuan Gao, Fady Alnajjar, Suleman Shahid, Omar Mubin

Research output: Contribution to journal, article, peer-reviewed

Abstract

In this paper, we propose a reinforcement learning (RL) mechanism for social robots to select an action based on users’ learning performance and social engagement. We applied this behavior selection mechanism to extend the emotion and memory model, which allows a robot to create a memory account of the user’s emotional events and adapt its behavior based on the developed memory. We evaluated the model in a vocabulary-learning task at a school, during a children’s game involving robot interaction, to see whether the model maintains engagement and improves vocabulary learning across four interaction sessions. Overall, we observed positive findings for both child vocabulary learning and sustained social engagement during all sessions. Compared to the trends of a previous study, we observed a higher level of social engagement across sessions, measured as the duration of the user’s gaze toward the robot. For vocabulary retention, we saw similar trends in general, but also high vocabulary retention in some sessions. The findings indicate the benefits of applying RL techniques that have a reward system based on multi-modal user signals or cues.
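
The abstract does not spell out the learning algorithm, the action set, or the reward formula, so the following is only a minimal illustrative sketch of the general idea: choosing a robot behaviour from a reward that blends social engagement with learning performance. Everything named here is an assumption made for illustration and is not taken from the paper: the action names in `ACTIONS`, the weighting `w_engagement`, the use of gaze ratio as the engagement signal, and the epsilon-greedy rule itself.

```python
import random

# Hypothetical robot behaviours; the paper's actual action set is not given here.
ACTIONS = ["encourage", "give_hint", "recall_shared_memory"]


def reward(gaze_ratio, answer_correct, w_engagement=0.5):
    """Blend engagement and learning into one scalar reward (assumed form).

    gaze_ratio: fraction of the turn the child looked at the robot, in [0, 1].
    answer_correct: whether the vocabulary item was answered correctly.
    """
    return w_engagement * gaze_ratio + (1.0 - w_engagement) * float(answer_correct)


class BehaviourSelector:
    """Epsilon-greedy selector that tracks a running mean reward per action."""

    def __init__(self, actions, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in self.actions}
        self.count = {a: 0 for a in self.actions}

    def select(self):
        # Explore with probability epsilon, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[a])

    def update(self, action, r):
        # Incremental mean: new_mean = old_mean + (r - old_mean) / n
        self.count[action] += 1
        self.value[action] += (r - self.value[action]) / self.count[action]
```

In use, one interaction turn would call `select()`, perform the behaviour, observe the child's gaze and answer, and then call `update(action, reward(gaze_ratio, answer_correct))`. A memory-based model such as the one the abstract describes would presumably also condition the choice on stored user state, which this sketch leaves out.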

Original language: English
Pages (from-to): 3210-3236
Number of pages: 27
Journal: Behaviour and Information Technology
Volume: 41
Issue number: 15
Publication status: Published - 2022

Keywords

  • Reinforcement learning
  • children engagement
  • educational robots
  • personalisation
  • repeated child robot interaction
  • social robots

ASJC Scopus subject areas

  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Social Sciences (all)
  • Human-Computer Interaction
