I worked with the HCDE directed research group, led by Ph.D. student Kevin Feng and Professor David McDonald, to design the future of human-agent interaction. Agents may usher in the next wave of AI and UX innovation after today's chatbots, and we set out to explore what the next human-agent interaction paradigms might look like.
Our objective was to understand the ins and outs of human-agent interaction. We investigated its design challenges, existing solutions, and interaction patterns to ultimately design new interaction paradigms tailored to AI agents.
I was the sole designer on this project, working closely with the research leads and alongside my design peers in the group.
The project ran on a fast 10-week timeline: the first five weeks on literature review, one week on concept development, and the final four weeks on iterative design work.
During the research phase, we examined current methodologies for designing AI agents, including Microsoft's HAI (Human-AI Interaction) guidelines.
While the guidelines provide foundational principles for agent design, they may not fully address the challenges posed by modern and future LLM agents, whose possible outputs are virtually infinite.
On closer examination, guidelines G9 ("Support efficient correction") and G11 ("Make clear why the system did what it did") revealed a critical gap. The next wave of LLM agents can execute dozens of steps autonomously while the user walks away, a mode of operation these guidelines were never designed for.
After auditing modern agents, analyzing the literature, and discussing findings with the group, I identified three interconnected friction patterns. Two stem directly from guidelines G9 and G11. The third emerged from observing tools like ChatGPT Deep Research, which lock users into a single execution path with no way to branch or backtrack without losing progress. My role was to design a new interaction paradigm that addresses these gaps.
Timeline Control is a version-control interface designed to give users scalable oversight in the "black box" era of autonomous agents. Built on top of ChatGPT agent, it introduces scrubbing, branching, and merging, strengthening the bridge between research and taking action on the web.
Users click between branches in the current-view panel. The agent's visual browser and plan history update instantly, showing each branch's progress and chapters without losing context.
Clicking the connection between two nodes in the plan history reveals a tooltip with the agent's rationale. Users no longer see just input and output; they see the reasoning that connects them.
The expand button transforms the plan history into a comprehensive view. Users can inspect any chapter in depth, revealing the full rationale behind what the agent did, why it did it, and what it considered along the way.
Dragging one branch onto another triggers a merge preview. The expanded plan history shows which chapters will be replaced, struck through in red. Locked chapters indicate hard constraints; clicking the lock explains why that chapter must be removed for the merge to work.
Chapters without a lock can be toggled: clicking the minus symbol offers two options, add or remove. Adding a removed chapter back turns it green but triggers a trade-off, turning the conflicting chapter red. A final confirmation pop-up summarizes the changes and any costs before users commit. One click, and the branches become a single, unified plan.
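For readers who think in code, here is a minimal sketch of how the merge preview's rules could be modeled. The names (`Chapter`, `toggleChapter`) and the TypeScript shapes are my own illustration under stated assumptions, not the production implementation.

```ts
// Hypothetical model of the merge preview; names and shapes are illustrative.
type ChapterState = "kept" | "removed" | "added";

interface Chapter {
  id: string;
  title: string;
  locked: boolean;      // hard constraint: cannot be toggled
  lockReason?: string;  // surfaced when the user clicks the lock icon
  state: ChapterState;
}

// Toggling a chapter back in succeeds only when it is unlocked, and the
// conflicting chapter flips out, so the preview always shows the trade-off.
function toggleChapter(
  preview: Chapter[],
  addId: string,
  conflictId: string
): Chapter[] {
  const target = preview.find((ch) => ch.id === addId);
  if (!target || target.locked) return preview; // locked chapters stay put
  return preview.map((ch) => {
    if (ch.id === addId) return { ...ch, state: "added" as const };
    if (ch.id === conflictId && !ch.locked)
      return { ...ch, state: "removed" as const };
    return ch;
  });
}
```

The key design choice this captures is that a toggle is never a free action: adding one chapter always surfaces its cost on the other branch before the final confirmation.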
With the three friction patterns defined, I brainstormed potential solutions and plotted them by how many patterns they addressed and how novel their approach was. From that mapping, Timeline Control for Agents emerged as the clear choice for covering the entire problem space.
Since direct competitors for agent timeline control don't exist yet, I looked at analogous products. After testing a handful of them and watching walkthrough videos, I identified functional design patterns that addressed key problems.
Users prompt the agent, and it decomposes the task into chapters. During execution, they monitor progress through a live view and a plan-history panel. If they step away, they can scrub back through the timeline to catch up. Clicking the edges between steps reveals the agent's reasoning. From any past node, users can branch into an alternative direction without losing the original path, then merge the best results back into a single, unified plan.
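Conceptually, this whole flow can be thought of as a tree of plan nodes. Here is a minimal sketch assuming each node keeps a parent pointer; `PlanNode`, `branchFrom`, and `pathTo` are hypothetical names for illustration, not ChatGPT agent internals.

```ts
// A minimal sketch of the timeline as a tree of plan nodes.
interface PlanNode {
  id: string;
  chapter: string;         // the step the agent executed
  parentId: string | null; // null marks the root of the timeline
  done: boolean;
}

// Branching from any past node appends a new child; the original
// path is never mutated, so no progress is lost by exploring.
function branchFrom(
  nodes: PlanNode[],
  fromId: string,
  chapter: string
): PlanNode[] {
  return [
    ...nodes,
    { id: crypto.randomUUID(), chapter, parentId: fromId, done: false },
  ];
}

// Scrubbing is just walking parent pointers from a node back to the
// root, yielding the exact sequence of chapters leading to that point.
function pathTo(nodes: PlanNode[], nodeId: string): PlanNode[] {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const path: PlanNode[] = [];
  let cur = byId.get(nodeId);
  while (cur) {
    path.unshift(cur);
    cur = cur.parentId === null ? undefined : byId.get(cur.parentId);
  }
  return path;
}
```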
After sketching and brainstorming ideas, I built rough wireframes on top of ChatGPT's agent interface to keep the concept grounded. I reviewed each screen with my team and checked it against the mapped logic, expanding on the designs that worked and cutting the ones that didn't align with the project scope.

I conducted usability tests and reviewed the mid-fi prototype with my research leads. Three notable issues emerged: the plan history showed steps but not the agent's reasoning behind them, the merge interface was visually overwhelming, and branch nodes gave no indication of completion state. Each pointed to a gap in clarity and feedback that needed to be addressed before moving to high fidelity.

With three usability gaps identified, I moved into high-fidelity refinement. Each iteration focused on a single issue, ensuring changes were intentional and traceable. The before-and-after comparisons below show how I resolved each problem.

Edges between plan chapters lacked progressive disclosure: users could see the connection but had no way to access the agent's underlying decision logic.

I introduced a two-tier disclosure pattern on edges: click between two nodes to reveal the agent's rationale, then expand for full context. This reduces cognitive load while keeping the rationale accessible on demand.
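As a sketch of the pattern itself (my own illustration, not the shipped component): each edge cycles through disclosure tiers, so detail is pulled on demand rather than pushed.

```ts
// Hypothetical state model for two-tier disclosure on a plan-history edge.
type DisclosureTier = "collapsed" | "tooltip" | "expanded";

interface EdgeDisclosure {
  tier: DisclosureTier;
  summary: string;        // one-line rationale shown in the tooltip tier
  fullContext: string[];  // alternatives considered, shown when expanded
}

// Each interaction advances exactly one tier, so users never get the
// full rationale dumped on them before they ask for it.
function advance(edge: EdgeDisclosure): EdgeDisclosure {
  const next: Record<DisclosureTier, DisclosureTier> = {
    collapsed: "tooltip", // first click: short rationale
    tooltip: "expanded",  // expand button: full context
    expanded: "collapsed",
  };
  return { ...edge, tier: next[edge.tier] };
}
```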
The merge interface was described as visually overwhelming, creating high cognitive load and a poor signal-to-noise ratio. Users had difficulty understanding the role of each interface element.
I established a clear visual hierarchy through color coding and distinctive icons. This improves scannability and reduces time-to-comprehension during conflict resolution.

In the previous design, branch nodes gave no indication of whether an objective was complete; every node looked identical, violating the heuristic of visibility of system status.

Now users can see a node's status at a glance: hollow nodes indicate pending objectives, while filled nodes mark completed ones.
Looking back, the key takeaway from this project is the value of research. At first, I wanted to jump straight into designs, but I'm glad we stuck with the first five weeks of literature review: learning how agents work, where existing guidelines fall short, and how to scope down the problem space gave me a solid foundation before I touched design. Designing for a problem space with no competitors to reference was a challenge, but that's precisely why the research mattered.
Thanks to our excellent research leads, Kevin Feng and Professor McDonald, and to my design peers Gloria, Amber, and Jane. Without their constructive feedback, advice during iterations, and deep knowledge of AI agents, I would not have gotten as far as I did. I'm truly grateful to the DRG team for pushing my design limits!