Designing Trustworthy CUIs for Reliable Information Retrieval
A design thinking study exploring how conversational interfaces can be designed to support reliable, AI-driven information retrieval in quality management systems, with a focus on transparency and user trust.
- M.Sc. Thesis
- Design Thinking
- User Research
- Usability Testing
- Prototyping
- Figma
- Role: UX Researcher & Designer
- Timeline: Autumn 2025
- External Partner: AM System
Problem
Quality management systems store critical compliance documentation, but finding the right information can be slow and dependent on prior system knowledge. Folder-based navigation and free-text search often assume users already know what they're looking for.
A conversational interface offers an alternative. But in a compliance-driven domain, the design question isn't whether it works. It's whether users can trust what it surfaces.
Process

Empathize: Interviews
Semi-structured interviews were conducted with 3 end users and 2 AM System employees to understand current workflows, pain points, and attitudes toward AI-assisted search.
Define: Thematic Analysis & User Need Statements
A thematic analysis of the interview data resulted in 6 themes and 7 user need statements. Key themes included "transparency enables trust," "reliability through metadata," and "efficiency as a motivator."
Ideate: HMW Questions & Brainstorming
The user need statements were reframed into How Might We questions to create actionable design challenges. The ideation was conducted individually due to project constraints, which focused the output but limited the breadth of ideas explored. Solutions centered on source visibility, metadata display, role-based filtering, and multiple document formats.
Prototype: LoFi → HiFi in Figma
Two prototype iterations were created in Figma. LoFi testing revealed issues with navigation, visual density, and response length, which were addressed in the HiFi iteration. AI responses were simulated through Figma rather than a live implementation, which kept the focus on interface design and trust independent of AI variability.
Test: Two rounds of moderated usability testing
The LoFi prototype was tested with 4 AM System employees, and the HiFi prototype with 4 end users using a think-aloud protocol together with the SUS and S-TIAS questionnaires.
Prototype






Results
The HiFi prototype received a SUS score of 90 (an 'A' grade indicating excellent perceived usability) and a trust score of 5.50 out of 7 on the S-TIAS scale. All four participants completed tasks without difficulty and described natural-language search as intuitive and familiar.
But the numbers only tell part of the story. The more interesting finding was how users interacted with the generated answers: not as a final authority, but as an entry point.
SUS Score (System Usability Scale): 90, Grade A, excellent usability
Trust Score (S-TIAS): 5.50 out of 7, a high level of perceived trust in the prototype
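For readers unfamiliar with how a SUS figure is derived, the sketch below applies the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a 0 to 100 score. The function name and the sample ratings are hypothetical and are not the study's data; the reported score of 90 is presumably the average across the four participants.

```python
from statistics import mean


def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten ratings (1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher rating is better.
            total += rating - 1
        else:
            # Even items are negatively worded: lower rating is better.
            total += 5 - rating
    return total * 2.5


# Hypothetical ratings for two participants (illustration only).
participants = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 5, 1, 4, 2, 5, 1, 5, 2],
]
study_score = mean(sus_score(p) for p in participants)
print(study_score)
```

Scoring each participant separately and then averaging keeps individual outliers visible, which matters with a sample as small as four.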
Users treated the generated answer as a starting point, not the conclusion.
Every participant opened the source card at least once after receiving a response, often immediately, before finishing reading the answer itself. They weren't looking for the AI to be right. They were looking for a faster way to reach the document they could verify themselves.
As one participant put it: "Without the source card, I would not have felt like I could trust the response."
Trust was not built through the idea of an AI-driven system. It was built through transparency.
What made users trust the system was the ability to verify information by accessing the source document and validating the metadata: creator, approver, version, and date. The conversational interaction itself was familiar and efficient, but it wasn't what created trust. It was interface design, specifically the source card, document metadata, and one-click access to the original document, that shaped perceived trust.