Imagine deploying a fully automated conversational system with sensing and language understanding to older adults to improve their social skills. We show that with eight sessions over several weeks, socially isolated older adults can retain the skills and generalize them to conversations with other people.

The world is experiencing major demographic change as the global population becomes increasingly older. Currently, 617 million people worldwide (8.5%) are 65 and over. By 2050, this number is expected to reach 1.6 billion, more than double what it is now. The oldest demographic group (people 85 and older) is projected to grow to 447 million by 2050, more than tripling since 2015.

At the current pace, population aging is poised to impose a significant strain on economies, health systems, and social structures worldwide. Our ongoing exploration called “Aging and Engaging” hopes to bring about a transformation in how we age while retaining our basic human interactions.

The “Aging and Engaging” interface allows users to practice conversations with a virtual assistant and receive feedback on eye contact, speaking volume, smiling, and valence of speech content. Feedback is generated automatically by analyzing the temporal properties of the conversation using a hidden Markov model. The interface was designed with the assistance of an expert advisory panel that works with geriatric patients, as well as a focus group of 12 older adults.
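To illustrate how a hidden Markov model can turn a temporal pattern into positive or negative feedback, here is a minimal sketch. It assumes one small discrete HMM per class and picks the class whose model assigns the observed sequence the higher likelihood, computed with the scaled forward algorithm. The two-state models, their probabilities, and the binarized smile observations (0 = low, 1 = high) are hypothetical values chosen for illustration; they are not the trained models or features of the released system.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) for a discrete HMM via the scaled forward algorithm."""
    n_states = len(start)
    # Initialization: probability of starting in each state and emitting obs[0].
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: propagate through transitions, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


# Hypothetical two-state models over a binary symbol (0 = low smile, 1 = high smile).
POSITIVE_HMM = {
    "start": [0.5, 0.5],
    "trans": [[0.8, 0.2], [0.2, 0.8]],
    "emit": [[0.3, 0.7], [0.1, 0.9]],
}
NEGATIVE_HMM = {
    "start": [0.5, 0.5],
    "trans": [[0.8, 0.2], [0.2, 0.8]],
    "emit": [[0.9, 0.1], [0.7, 0.3]],
}


def classify(obs):
    """Label a sequence by whichever hypothetical model explains it better."""
    ll_pos = forward_log_likelihood(obs, **POSITIVE_HMM)
    ll_neg = forward_log_likelihood(obs, **NEGATIVE_HMM)
    return "positive" if ll_pos >= ll_neg else "negative"
```

For example, `classify([1, 1, 1, 0, 1, 1])` compares the two likelihoods for a mostly high-smile sequence. In practice the model parameters would be learned from labeled conversations rather than set by hand.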

We divide each conversation into four phases, allowing users to receive and reflect on feedback often. In each phase, users interact with the assistant (see Figure 1) for about four minutes. During the conversation, our system captures audio and video and uploads both to our server in real time for immediate analysis. From the uploaded video and audio files, facial and prosodic features, including smile intensity, volume, and eye gaze direction, are extracted. A hidden Markov model-based classifier then classifies the patterns of nonverbal features into two categories: positive and negative. After each conversation phase, our system gives feedback based on the classified temporal patterns.

Our system also performs automated speech recognition, and the resulting transcript is later used for sentiment analysis (the ratio of positive to negative words, for instance). After each phase, the transcript is uploaded to the server, where our system looks for negative and positive words in the transcript from a pre-populated list. The list of words was generated with direct input from two clinical psychologists who provide regular therapy to elderly patients.

After each phase, our system gives overall positive or negative feedback on each of the three nonverbal cues and the one verbal cue, depending on the prediction. The four types of feedback come one after another, with both text and voice options to improve accessibility. At the end of the conversation, our system generates a summary of all the feedback provided during each phase, identifying users’ strengths as well as weaknesses. The feedback also includes suggestions for improvement.
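The word-list lookup described above can be sketched in a few lines. This is an assumption-laden illustration, not the released implementation: the two word sets are placeholders (the actual list was curated with clinical psychologists), and the rule that a tie counts as positive is a hypothetical choice.

```python
import re

# Placeholder word lists for illustration only.
POSITIVE_WORDS = {"happy", "enjoy", "love", "wonderful", "glad"}
NEGATIVE_WORDS = {"sad", "lonely", "hate", "terrible", "tired"}


def count_sentiment_words(transcript):
    """Count how many transcript tokens appear in each word list."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    positive = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negative = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return positive, negative


def content_feedback(transcript):
    """Return 'positive' or 'negative' feedback for the speech content cue."""
    positive, negative = count_sentiment_words(transcript)
    # Assumption: ties (including no matched words) are treated as positive.
    return "positive" if positive >= negative else "negative"
```

For the sentence "I love my garden, but I feel lonely and tired." the sketch finds one positive word and two negative words, so the content cue would be labeled negative.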

An example of the feedback interface for each conversation phase is shown below. Users can receive either positive or negative feedback for each of the four conversational skill cues. For example, a user can receive positive feedback on eye contact and speaking volume, and negative feedback on smiling and content.
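As a data-level illustration of this feedback (separate from the interface itself), the per-phase labels can be held in a small dictionary and rolled up into an end-of-conversation summary. The structure, the function name `summarize`, and the rule that a cue is a strength when it is positive in a strict majority of phases are all assumptions made for this sketch.

```python
CUES = ("eye contact", "speaking volume", "smiling", "content")

# The example from the text: positive on eye contact and speaking volume,
# negative on smiling and content.
example_phase = {
    "eye contact": "positive",
    "speaking volume": "positive",
    "smiling": "negative",
    "content": "negative",
}


def summarize(phases):
    """Split the four cues into strengths and weaknesses across all phases."""
    total = len(phases)
    strengths, weaknesses = [], []
    for cue in CUES:
        positives = sum(1 for phase in phases if phase[cue] == "positive")
        # Assumption: a cue is a strength if positive in more than half the phases.
        if positives * 2 > total:
            strengths.append(cue)
        else:
            weaknesses.append(cue)
    return {"strengths": strengths, "weaknesses": weaknesses}
```

Calling `summarize([example_phase] * 4)` lists eye contact and speaking volume as strengths and smiling and content as weaknesses.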

In a longitudinal experiment with 20 older adults (aged 60 or above) who completed eight brief sessions across 4-6 weeks, participants showed statistically and clinically significant improvement in eye contact and facial expressivity.

Related papers:

Z. Razavi, L. Schubert, K. V. Orden, M. R. Ali, B. Kane, E. Hoque, Discourse Behavior of Older Adults Interacting With a Dialogue Agent Competent in Multiple Topics, ACM Transactions on Interactive Intelligent Systems (TiiS), Vol. 12, No. 2, June 2022.

M. R. Ali, E. Hoque, P. Duberstein, L. Schubert, S. Z. Razavi, B. Kane, C. Silva, J. S. Daks, M. Huang, K. V. Orden, Aging and Engaging: A Pilot Randomized Controlled Trial of an Online Conversational Skills Coach for Older Adults, The American Journal of Geriatric Psychiatry (AJGP), December 2020.

M. R. Ali, K. V. Orden, K. Parkhurst, V. Nguyen, S. Liu, P. Duberstein, M. E. Hoque, Aging and Engaging: A Social Conversational Skills Training Program for Older Adults, ACM Intelligent User Interfaces (IUI), 2018.

Code

The code is now released:
https://github.com/mali7/lissa

Contributions appreciated.