Conversational Interface | Hyo Jin (Gina) Do

Introduction

Natural language interfaces are increasingly deployed to help non-expert users navigate technical systems. I studied these natural language interfaces within the context of end-user programming tools, which allow users to create rules connecting business applications and data using naturalistic commands.

For example, a user might type, “When there is a new incident on ServiceNow, send me a message on Slack and an email to my Gmail account,” and automate the workflow between ServiceNow, Slack, and Gmail.
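To make the idea of a trigger-action rule concrete, the sentence above could be represented roughly as one trigger and a list of actions. This is a minimal sketch for illustration only: the `Step` and `Rule` names and their fields are hypothetical and are not taken from the studied system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Step:
    """One side of a rule (hypothetical structure, for illustration)."""
    app: str    # application name, e.g. "Slack"
    event: str  # what happens in the app, or what the app should do


@dataclass
class Rule:
    """A trigger-action rule: one trigger followed by one or more actions."""
    trigger: Step
    actions: List[Step] = field(default_factory=list)

    def apps(self) -> List[str]:
        """Return every application the rule connects, trigger first."""
        return [self.trigger.app] + [action.app for action in self.actions]


# The example sentence, written out as a rule.
rule = Rule(
    trigger=Step("ServiceNow", "new incident"),
    actions=[Step("Slack", "send message"), Step("Gmail", "send email")],
)
print(rule.apps())  # ['ServiceNow', 'Slack', 'Gmail']
```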

Problem

Goal-oriented natural language systems often struggle with the abstraction matching problem: users have difficulty formulating utterances at a level of abstraction the system can process (e.g., choosing the right vocabulary or sentence structure). Repeated failure to provide “matching” input can lead to persistent system errors and cause users to abandon the system.

Team

A team of user researchers, a UI designer, and software developers collaborated. My role was to conduct end-to-end user research, including planning experimental protocols, recruiting participants, designing surveys, running statistical analysis, and writing up results.

Designs

Drawing on Clark and Brennan’s communication grounding principle, I designed a conversational grounding interface where the agent and user collaborate to build the command. Instead of requiring a complete sentence upfront, the agent asks the user to compose provisional input; the user and agent then take turns presenting, referencing, and revising their inputs collaboratively until the user reaches their goal.
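The turn-taking described above can be sketched as a small loop in which the agent presents its reading of each provisional input and the user revises until the reading is usable. This is a toy illustration under stated assumptions, not the interface built for the study: `KNOWN_APPS`, `interpret`, and `grounding_dialogue` are hypothetical names, and the keyword-matching "interpreter" stands in for whatever language understanding a real agent would use.

```python
KNOWN_APPS = ("ServiceNow", "Slack", "Gmail")  # hypothetical vocabulary


def interpret(utterance):
    """Toy interpreter: list the known apps mentioned, in order of appearance."""
    lowered = utterance.lower()
    found = [
        (lowered.find(app.lower()), app)
        for app in KNOWN_APPS
        if app.lower() in lowered
    ]
    return [app for _, app in sorted(found)]


def grounding_dialogue(user_turns, min_apps=2):
    """Alternate agent presentation and user revision.

    Stops as soon as the agent's reading contains a trigger app and at
    least one action app. Returns (apps, turns_used, transcript); apps is
    None if the user's turns run out first.
    """
    transcript = []
    for turn_number, utterance in enumerate(user_turns, start=1):
        apps = interpret(utterance)
        if len(apps) >= min_apps:
            transcript.append(
                f"agent: I understood {apps[0]} -> {', '.join(apps[1:])}. Correct?"
            )
            return apps, turn_number, transcript
        transcript.append(
            f"agent: I only recognized {apps or 'no apps'}; please revise."
        )
    return None, len(user_turns), transcript


apps, turns, transcript = grounding_dialogue([
    "Tell me about incidents",
    "When ServiceNow has a new incident, message me on Slack",
])
print(apps, turns)
```

Counting turns this way mirrors the kind of communication cost that a study of such an interface might track.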

Control Interface without grounding interaction

I also designed variations of this grounding interface that allowed users to select from multiple options or use fill-in-the-blank style structured templates.

Grounding interface with multiple options (Multiple grounding)

Grounding interface with structured input fields (Structured grounding)

User Study

We conducted a between-subjects experiment with 80 crowdworkers from Amazon Mechanical Turk. Each participant used one of four interfaces: grounding, multiple grounding, structured grounding, or control.

Participants were given two tasks, each asking them to write a sentence describing a trigger-action program that automates a workflow. They responded to survey questions about their cognitive load, acceptance and perceptions of the system, and experiences with the system. We also measured task performance and communication costs. We ran statistical analyses (linear regression, linear mixed-effects regression) to compare the interfaces.

Findings

The proposed grounding interfaces significantly reduced cognitive load and improved task performance. Furthermore, providing input structures enhanced the benefits of grounding by reducing conversational turns, improving task performance, and increasing technology acceptance, allowing users to feel in control without feeling constrained.

Deliverables

Product Impact

This research influenced the design of IBM App Connect, a natural language workflow automation system.

Publication

The work was presented at the ACM conference on Computer-Supported Cooperative Work (CSCW) (Do et al., 2024).
