Case study: Using the Wizard of Oz usability test method for artificial intelligence design

A Virtual Health Coach user experience project to design an SMS chatbot that motivates people with low health literacy to exercise
Ever heard of a Wizard of Oz study as a way to test user experience? Neither had I until I worked within a team to design the conversations of a chatbot dedicated to improving users’ exercise habits.
As an exercise fanatic and a health technology believer, I was excited to enroll in Digital Health Design and Evaluation for Diverse Populations at UC Berkeley in Spring 2020. While completing my Master of Public Health in Epidemiology and Biostatistics, I wanted to explore my other passion, user experience research. For this project, my course team worked with an existing group that had already built an initial chatbot prototype, with the aim of improving its ability to encourage people to exercise.

One of our main research goals was to determine how conversations could be made more natural and encouraging, because the chatbot was meant to support an older population with low literacy, including people who spoke Spanish as their first language. The design goal was to combine the benefits of a personal health coach with the flexibility of artificial intelligence to meet the specific needs of this population.
Discover Phase
My teammates and I conducted background research and evaluated what had already been done in this field so that we could build on the strengths of similar technology and address its gaps. We focused our research on companies that encourage users to exercise and form habits. The messages that cue individuals into action were developed from behavior frameworks such as COM-B (the Capability, Opportunity, Motivation, Behavior system).
Define Phase
We iterated on the existing prototype and developed multiple personas as a way to introduce a variety of conversational flows to meet the heterogeneous needs of the target users. We found that the top design issues to tackle were the following:
1. Language Barrier
As we were designing for both Spanish speakers and English speakers, it was important to adjust the way sentences were phrased to ensure that the tone of encouragement was acceptable in both languages.
2. Environment
By synthesizing the transcripts of a previous chatbot research project we had access to, we discovered that we needed to design conversations that encourage physical activity even when a user lives in a dangerous environment or neighborhood.
3. Technology Literacy
Many people regularly use their phone's default text messaging to communicate, so the designs need to work well within default text messaging.
Design Phase
As a form of usability test, we conducted a Wizard of Oz study: we set up a Google phone number from which I sent text messages to each test participant, who believed they were interacting with an autonomous system.

From our research, we decided to create four different pathways along which the chatbot could hold full conversations with the participant:
- Social Support
- Health Improvement
- Self-Motivated (for people who already exercise)
- Less Motivated (for people who do not normally exercise)
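
To make the idea of pathways concrete, here is a minimal sketch of how a chatbot might route a user into one of the four conversations. The intake fields (`exercises_regularly`, `main_reason`) and the routing rules are hypothetical and only for illustration; in our study a human typed every message, and this is not how the prototype was actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Pathway(Enum):
    """The four conversation pathways named above."""
    SOCIAL_SUPPORT = "Social Support"
    HEALTH_IMPROVEMENT = "Health Improvement"
    SELF_MOTIVATED = "Self-Motivated"
    LESS_MOTIVATED = "Less Motivated"


@dataclass
class Intake:
    # Hypothetical intake answers, assumed for this sketch only.
    exercises_regularly: bool
    main_reason: str  # e.g. "company", "health", or anything else


def choose_pathway(intake: Intake) -> Pathway:
    """Pick a pathway from the (assumed) intake answers."""
    reason = intake.main_reason.strip().lower()
    if reason == "company":
        return Pathway.SOCIAL_SUPPORT
    if reason == "health":
        return Pathway.HEALTH_IMPROVEMENT
    # No stated reason: fall back to current exercise habits.
    if intake.exercises_regularly:
        return Pathway.SELF_MOTIVATED
    return Pathway.LESS_MOTIVATED
```

For example, `choose_pathway(Intake(exercises_regularly=False, main_reason="health"))` selects the Health Improvement pathway, while a user who gives no particular reason and does not normally exercise lands in the Less Motivated pathway.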

Deliver Phase
By affinity mapping the common themes that we found in the Wizard of Oz study, we identified the following key design priorities:
- people like to exercise with social company so the chatbot should have that option built into conversations
- conversational language is helpful, especially for people who prefer using default text messaging
- simple cues, gifs, or videos can be embedded to teach people how to do exercises
With the Virtual Health Coach, we wanted the chatbot to have natural conversation flows with users while also encouraging healthy outcomes and sustainability. From our iteration process, we observed both similarities and differences in the feedback from our English- and Spanish-speaking test users. Overall, we focused on improving the language people would use to interact with the chatbot, iterated on the pathways to help people achieve their goals, and gained insights into how we can personalize the chatbot for the end user. We are optimistic that the chatbot will be beneficial to a broad group of users for whom many digital health tools are typically not designed.
I worked with a diverse group to complete this project. The UC Berkeley (UCB) Virtual Health Coach stakeholders included members from the Social Welfare Department, Computer Science, and the School of Information, while the course team consisted of MPH students and a senior undergraduate student studying Political Economy and Design.
Quick Minute Read
- Through my Digital Health Design and Evaluation in Diverse Populations course, I worked in a team to design conversations between the chatbot and the user to improve exercise habits for people with low health literacy.
- After conducting background research on existing products and empathizing with potential users, we decided to focus on designing for language barriers, the users’ physical environment for exercising, and technological literacy.
- We conducted a Wizard of Oz test to evaluate the usability of the chatbot conversations, for which we prepared the script and recruited participants. On one end, I typed the messages that the chatbot would say to the user; on the other end, the test participant replied.
- Synthesizing learnings from testing, we created four different pathways of conversations for social support, health improvement, self-motivated users, and people who were less motivated.
- Outcome: We created natural conversation flows so that the chatbot could encourage exercise habits for users in ways that would meet the users’ language preferences and their exercise interests.