AI-Driven Product Design

AI Insights: turning plain-English questions into dashboards people trust

How a working knowledge of the natural-language-to-SQL pipeline, paired with rigorous user research, unlocked a 35% lift in weekly active use inside Renaissance Member Central.

My Role

Lead UX / Product Designer

Timeline

4 Months (2026)

Core Tech Stack

Azure OpenAI, NL-to-SQL, Schema-Grounded Prompting, Angular / RAIS

Key Outcome

+35% Weekly Active Use

Executive Summary & Impact

The Challenge

Member Central holds the data insurance agencies live on: premium, loss ratios, retention, carrier mix, renewals. Reaching it meant requesting a report, waiting on an analyst, or living inside fixed dashboards. The team wanted an AI layer that answered questions directly, but an LLM that can be wrong is a trust problem before it is a feature.

The Strategy

I bridged the engine and the user's mental model. The model only translates English into a governed SQL query, so I built the experience around that fact: an invisible AI, a single source of truth shared with the rest of the product, and explicit, deterministic response states that turn the model's limits into guidance instead of dead ends.

A saved Agency Admin dashboard in Member Central with premium, loss ratio, renewals, growth, and carrier-mix widgets. Every widget was produced from a plain-English question or the library.

+35%

Weekly active use of the Insights section after launch


2×

Perceived speed in testing once generation showed its progress


92%

Positive feedback on trusting the numbers it returned


Deep User Research & Mental Models

I ran a discovery phase with internal Growth Advisors and Member Success Partners, then validated against the field with 15 interviews across external agency producers, admins, and service staff. The finding was consistent: users were not rejecting AI for lack of capability. They rejected it for lack of trust and predictability.

The Black Box Paradox

People felt anxious clicking Generate without knowing what data the model would touch, or whether the answer was the real numbers or an "AI version" of them.

The Cost of a Wrong Number

In insurance, one wrong-looking figure discredits the whole tool. Distrusting an answer costs more time than the old manual report ever did.

They Think in Questions, Not Charts

Nobody asked for a grouped bar chart. They asked "how am I doing versus last year" and expected the tool to choose the right view.

Saving Is the Real Job

The first answer is nice. The recurring value is pinning the answers that matter into a dashboard they open every Monday.

What they ask

"Premium versus last year"

"Which carriers am I heaviest with"

"What renews this month"

Where trust breaks

"Is this the real number or the AI's guess?"

"It picked a grouping I didn't ask for"

"I can't tell where this came from"

What they keep

"The 3 charts I check weekly"

"Something I can send my team"

"A starting point for my role"

Friction point: confidence collapses at the moment of generating, when the system acts without showing its reasoning or its source.

Affinity synthesis. Trust did not break at the result. It broke at the click, before anyone saw a chart.

Technical Constraints & AI Orchestration Know-How

A great AI designer does not just sketch perfect states. They design for the edge cases of the engine. I mapped the pipeline with engineering first: a question goes to a NestJS backend that wraps it in a schema manifest of nine approved tables, Azure OpenAI returns a single SQL SELECT, and a validator rejects anything that is not a clean read scoped to the user's own agency. Then the interface mechanics were drawn to match.
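The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the names SchemaManifest, buildPrompt, and validateSql, the agency_id column, and the :agencyId parameter are all assumptions made for the example. The regex checks are deliberately naive and fail closed, so some legitimate queries would be rejected. A real validator would more likely parse the SQL.

```typescript
// Hypothetical sketch: every name below is an assumption, not the real Member Central code.
interface SchemaManifest {
  // Approved table name -> columns the model is allowed to reference.
  tables: Record<string, string[]>;
}

type ValidationResult =
  | { ok: true; sql: string }
  | { ok: false; reason: string };

// Wrap the user's question in the schema manifest before it is sent to the model.
function buildPrompt(question: string, manifest: SchemaManifest): string {
  const schemaLines = Object.keys(manifest.tables)
    .map((table) => table + "(" + manifest.tables[table].join(", ") + ")")
    .join("\n");
  return [
    "Translate the question into exactly one SQL SELECT statement.",
    "Use only these tables and columns:",
    schemaLines,
    "Always filter with agency_id = :agencyId.",
    "Question: " + question,
  ].join("\n");
}

// Reject anything that is not a single read scoped to the caller's agency.
function validateSql(sql: string, manifest: SchemaManifest): ValidationResult {
  // Allow one optional trailing semicolon, then refuse any other.
  const trimmed = sql.trim().replace(/;\s*$/, "");
  if (trimmed.indexOf(";") >= 0) {
    return { ok: false, reason: "multiple statements" };
  }
  if (!/^select\b/i.test(trimmed)) {
    return { ok: false, reason: "not a SELECT" };
  }
  if (/\b(insert|update|delete|drop|alter|create|truncate|grant|exec)\b/i.test(trimmed)) {
    return { ok: false, reason: "write or DDL keyword" };
  }
  // Every identifier that follows FROM or JOIN must be an approved table.
  const tablePattern = /\b(?:from|join)\s+([a-z_][a-z0-9_]*)/gi;
  let match: RegExpExecArray | null;
  let tableCount = 0;
  while ((match = tablePattern.exec(trimmed)) !== null) {
    const table = match[1].toLowerCase();
    tableCount += 1;
    if (!Object.prototype.hasOwnProperty.call(manifest.tables, table)) {
      return { ok: false, reason: "table not approved: " + table };
    }
  }
  if (tableCount === 0) {
    return { ok: false, reason: "no table referenced" };
  }
  // Naive scope check: the agency filter must appear as a bound parameter.
  if (!/\bagency_id\s*=\s*:agencyId\b/.test(trimmed)) {
    return { ok: false, reason: "missing agency scope" };
  }
  return { ok: true, sql: trimmed };
}
```

In this sketch the agency value is never written into the SQL by the model. The backend would bind :agencyId from the signed-in user's session, which keeps the scoping decision out of the prompt.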

The Engineering Constraint

Slow, and sometimes wrong

Generation could take several seconds while the model wrote SQL, the validator checked it, and the query ran. Worse, the question could be ambiguous, the data unavailable, or not a data question at all.

The UX Remediation

A response state for every outcome

Instead of a spinner and a generic error, I designed a five-state response system. The wait shows what is being built, and each failure mode becomes a focused, recoverable next step in the product's own voice.

Schema-Grounded Prompting, SQL Safety Validation, Response State Machine, One Source of Truth, Role-Scoped Queries

The real AI Insights generate experience in Member Central: a plain-English prompt becomes a governed, agency-scoped query.

Show written premium by month for personal auto last year

Generate

Try: Premium growth this year / Premium by line of business

Success

✓

Here's your chart

Written premium by month, personal auto, last year. Added to your dashboard.

Needs clarification

Which grouping would you like?

"By region" could mean billing state or agency territory.

Billing state / Agency territory

Partial match

What would you like to measure?

I can chart this by policy count or by written premium.

Policy count / Written premium

Unsupported data

That data isn't available yet

Claims adjuster notes aren't part of AI Insights. Here's what you can ask instead.

Not a data request

I can help build a data widget

Try asking about premium, retention, carriers, or renewals.

The five response states. Ambiguity, missing data, and off-topic prompts become guidance, not dead ends.
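One way to keep the five states deterministic is to model them as a discriminated union, so the interface cannot render an outcome it has no copy for. The sketch below is an assumption about how that might look, not the shipped implementation: InsightResponse, ResponseView, and renderResponse are hypothetical names, and the copy is taken from the mockups above.

```typescript
// Hypothetical sketch of the five response states as a discriminated union.
type InsightResponse =
  | { kind: "success"; summary: string }
  | { kind: "needs_clarification"; question: string; detail: string; options: string[] }
  | { kind: "partial_match"; question: string; detail: string; options: string[] }
  | { kind: "unsupported_data"; missing: string }
  | { kind: "not_a_data_request" };

interface ResponseView {
  heading: string;
  body: string;
  actions: string[]; // Button labels offered as the next step.
}

// Map each state to its copy. The `never` check makes the compiler
// flag any state that is added later without matching copy.
function renderResponse(response: InsightResponse): ResponseView {
  switch (response.kind) {
    case "success":
      return {
        heading: "Here's your chart",
        body: response.summary + " Added to your dashboard.",
        actions: [],
      };
    case "needs_clarification":
    case "partial_match":
      return {
        heading: response.question,
        body: response.detail,
        actions: response.options,
      };
    case "unsupported_data":
      return {
        heading: "That data isn't available yet",
        body: response.missing + " aren't part of AI Insights. Here's what you can ask instead.",
        actions: [],
      };
    case "not_a_data_request":
      return {
        heading: "I can help build a data widget",
        body: "Try asking about premium, retention, carriers, or renewals.",
        actions: [],
      };
    default: {
      const unreachable: never = response;
      return unreachable;
    }
  }
}
```

Clarification and partial match share a rendering path here because both resolve with a single choice. They stay separate kinds so that analytics and copy can diverge later.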

Iterative Usability Testing & Validation

I ran two phases of moderated testing with 10 external users on interactive prototypes to validate whether the response framework actually reduced distrust.

Phase 1

Silent guess vs. one question

Early on the system made a best guess on ambiguous prompts. Watching people quietly distrust a grouping they never asked for, I made the one-question clarification the default. Perceived competence jumped, even though the engine was unchanged.
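The shift from a silent guess to one clarifying question can be illustrated with a small decision function. The case study does not describe how the real system detects ambiguity, so the glossary lookup below is purely an assumption, and GROUPING_TERMS and decideGrouping are hypothetical names. The point is the rule: when a term maps to more than one candidate, ask instead of choosing.

```typescript
// Hypothetical glossary: user-facing term -> candidate groupings.
const GROUPING_TERMS: Record<string, string[]> = {
  region: ["Billing state", "Agency territory"],
  carrier: ["Carrier"],
  month: ["Month"],
};

type GroupingDecision =
  | { kind: "resolved"; grouping: string }
  | { kind: "ask"; term: string; options: string[] }
  | { kind: "none" };

// Decide how to handle the "by <term>" part of a prompt.
function decideGrouping(prompt: string): GroupingDecision {
  const match = /\bby\s+([a-z]+)/i.exec(prompt);
  if (match === null) {
    return { kind: "none" };
  }
  const term = match[1].toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(GROUPING_TERMS, term)) {
    return { kind: "none" };
  }
  const candidates = GROUPING_TERMS[term];
  if (candidates.length === 1) {
    // Exactly one meaning: safe to proceed without interrupting the user.
    return { kind: "resolved", grouping: candidates[0] };
  }
  // More than one meaning: ask one question rather than guess.
  return { kind: "ask", term: term, options: candidates };
}
```

With this rule, "Premium by region" yields an ask decision with two options, while "Premium by carrier" resolves immediately. That keeps the extra question limited to prompts that need it.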

Phase 2

Confidence and verification

Users wanted proof the number was real. I added plain captions that name the data and tie each widget visibly to the same source of truth as the rest of Member Central, so the answer reads as Member Central data, not "AI data."

Before

Premium by region

Grouped by billing state

⚠ Did you mean territory?

After

Which grouping would you like?

"By region" is ambiguous.

Billing state / Agency territory

✓ Confirmed by the user

A/B variant: a silent best guess (low confidence) versus one clarifying question (high confidence).

The Outcome: Adoption & Trust

The feature became sticky because it treats the user as the person in control, not a passive observer waiting on a black box. People ask in their own words, get an answer that matches the rest of their data, and keep the ones that matter.

For the first time I'm not second-guessing the tool. It asks what I mean when it's not sure, and the numbers match what I already see in Member Central, so I just trust it.

Agency Admin, external user testing

Takeaways for hiring managers

1. AI is a design challenge, not just an engineering one. Trust is won or lost in how ambiguity, latency, and sourcing are surfaced.

2. Know the engine. Designing the five response states required understanding exactly how and where the NL-to-SQL pipeline fails.

3. Invisible AI is the goal. The best outcome was that nobody using it was thinking about the model at all.
