Yours, Critically

Most designers critique alone. This AI design critic pushes back so your users don't have to.

From Prompt to Product

Overview:

Yours, Critically is an AI-powered design critique tool that helps designers identify blind spots in their decisions. Instead of providing generic feedback, the system challenges assumptions, highlights risks, and surfaces trade-offs to encourage deeper thinking.

This AI agent is designed to feel less like a chatbot and more like a structured critique partner. It guides users through a focused interaction: describe a decision, receive a critique, and iteratively challenge the output up to a defined limit.

The goal is to shift designers from seeking validation to engaging in critical reasoning.

The Problem:

Design decisions get made in isolation every day. A junior designer submits a flow without anyone questioning the assumptions baked into it. A solo designer at a startup ships a feature without a single senior voice pushing back. The feedback comes later, from users, from metrics, from stakeholders in a review meeting.

By then, it's expensive to fix.

The problem isn't skill. It's access. Not every designer has a senior critic in the room. Not every team has the bandwidth for rigorous design review. And most AI tools don't help — they generate, suggest, and summarise. None of them challenge.

Yours, Critically was built to fill that gap.

The product is simple on the surface: describe a design decision, receive a structured critique. But the thinking behind it is deliberately opinionated. Every element (the tone system, the output structure, the follow-up depth ladder, the confidence scoring) was designed to make the critique feel trustworthy, not just fast.

Product Decisions:

The Name

Most AI tools name themselves after capability: "DesignReview," "FeedbackAI," "CritiqueBot." Functional, forgettable.

"Yours, Critically" reframes feedback from something external to something reflective.

Instead of: “The system is critiquing you”

It feels like: “You are critiquing your own thinking”

The comma is intentional. It creates a pause, like a considered thought before a difficult truth.

The Tone

Most AI tools default to supportive, overly agreeable language.

The tone here is an explicit setting: Constructive, Direct, or Critical. Each shifts the register of the output without changing the structure. The analysis stays the same; the delivery changes.

Interaction Model

Most AI tools default to a chat interface because it is familiar. Yours, Critically was designed against that instinct. It is structured as a document, not a conversation.

Six sections: Critique, Risks, Trade-offs, Suggestions, Assumptions, Confidence.

This makes the output scannable, actionable, and credible.
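The six-section document can be thought of as a fixed schema rather than free-form chat output. A minimal sketch of that contract, with field names mirroring the section labels above (the completeness check is an illustrative assumption):

```python
from dataclasses import dataclass

@dataclass
class CritiqueDocument:
    """One critique, always in the same six sections."""
    critique: str
    risks: str
    trade_offs: str
    suggestions: str
    assumptions: str
    confidence: str  # "High" | "Medium" | "Low"

    def is_complete(self) -> bool:
        # Every section must be non-empty before the document is shown.
        return all(getattr(self, f) for f in
                   ("critique", "risks", "trade_offs",
                    "suggestions", "assumptions", "confidence"))
```

Treating the output as a typed document, not a message stream, is what makes it scannable and comparable across critiques.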

The chat layer exists below the document, separated by a "Go deeper" divider.

Users can ask up to five follow-up questions. After that, the system prompts them to start a fresh perspective.

This constraint is intentional: it prevents the tool from drifting into an unstructured chatbot and preserves the quality of the critique.
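The cap is simple enough to express directly. A sketch of the turn-limit behaviour, assuming the system responds with a restart nudge once the cap is hit (the message wording is illustrative):

```python
MAX_FOLLOW_UPS = 5  # the hard cap described above

class FollowUpSession:
    """Tracks follow-up turns and cuts the thread off after the limit."""

    def __init__(self) -> None:
        self.turns = 0

    def next_turn(self) -> str:
        if self.turns >= MAX_FOLLOW_UPS:
            # Past the cap the system refuses and nudges a restart.
            return "limit_reached: start a fresh perspective"
        self.turns += 1
        return f"turn {self.turns} of {MAX_FOLLOW_UPS}"
```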

Confidence scoring

Most AI feedback tools give you output with no indication of how reliable it is. Yours, Critically surfaces a confidence score with every critique: High, Medium, or Low, with a one-sentence explanation of what drove that score.

The score is tied to three factors: input clarity, pattern familiarity, and assumption level.

A vague input produces a lower confidence score and tells the user why. This directly improves input quality over time, and makes the critique feel earned, not generated.
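One way the three factors could combine into a label, as a sketch: the source only names the factors, so the equal weighting, the inversion of assumption level, and the thresholds below are all assumptions.

```python
def confidence_score(input_clarity: float, pattern_familiarity: float,
                     assumption_level: float) -> tuple[str, str]:
    """Map the three factors (each 0..1) to a High/Medium/Low label.

    Weighting and thresholds are illustrative, not the product's real values.
    """
    # A high assumption level should lower confidence, so invert it.
    raw = (input_clarity + pattern_familiarity + (1 - assumption_level)) / 3
    if raw >= 0.75:
        label = "High"
    elif raw >= 0.45:
        label = "Medium"
    else:
        label = "Low"
    reason = (f"clarity={input_clarity:.1f}, "
              f"familiarity={pattern_familiarity:.1f}, "
              f"assumptions={assumption_level:.1f}")
    return label, reason
```

A clear brief about a familiar pattern scores High; a vague brief full of unstated assumptions scores Low, and the reason string tells the user which factor to fix.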

The signature

Every output ends with "Yours, critically." It is a brand moment, not a UI element.

The Process:

  1. Product Thinking

The first phase was entirely about definition. Not wireframes, not colours: decisions.

Who is this for? A solo product designer at a startup, 1-5 years in, working without a senior feedback loop. What is the single most painful moment? Staring at a design decision you've made alone, unsure if it holds up, with no one to challenge it.

  2. AI Prompt Design

A three-step reasoning chain was designed explicitly into the prompt:

Interpret: identify the goal, the flow type, and the likely user context.

Analyse: detect hidden assumptions, identify UX patterns, consider failure scenarios.

Generate: produce a structured critique in a fixed format, grounded in steps 1 and 2.

This separation of reasoning from output prevents the most common AI failure: generic feedback that could apply to any design.
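The chain has to live in the system prompt itself. A sketch of how the three steps might be written into it; the wording is illustrative and the real prompt is not reproduced here:

```python
# Illustrative system-prompt fragment encoding the three-step chain.
REASONING_CHAIN_PROMPT = """\
You are a design critique partner. For every decision the user describes:

1. INTERPRET - identify the goal, the flow type, and the likely user context.
2. ANALYSE - detect hidden assumptions, identify UX patterns, and consider
   failure scenarios.
3. GENERATE - produce a structured critique in the fixed six-section format,
   grounded only in steps 1 and 2.

Never skip a step, and never produce feedback that could apply to any design.
"""
```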

The follow-up behaviour was also designed explicitly, not left to chance. Each of the five follow-up turns deepens the critique on a defined ladder:

Clarifying → Pressing → Stress-testing → Reframing → Concluding.
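The ladder maps cleanly to the five permitted turns. A minimal lookup, with the mode labels taken straight from the ladder above:

```python
# One critique mode per follow-up turn, in fixed order.
DEPTH_LADDER = {
    1: "Clarifying",
    2: "Pressing",
    3: "Stress-testing",
    4: "Reframing",
    5: "Concluding",
}

def mode_for_turn(turn: int) -> str:
    if turn not in DEPTH_LADDER:
        raise ValueError("follow-ups are capped at five turns")
    return DEPTH_LADDER[turn]
```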

  3. Design

The design system was built around a monochromatic blue palette: #062456 as the primary, #4A6188 as secondary, #ADB9CB for muted elements, #E3E7EE for surfaces. Oswald for headings, Inter for body.
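Pinning these values down before any build work pays off later in the process. One minimal way to capture the palette and type choices above is a flat token map (the token names are illustrative):

```python
# The design system's core values as flat tokens; names are hypothetical.
DESIGN_TOKENS = {
    "color.primary": "#062456",
    "color.secondary": "#4A6188",
    "color.muted": "#ADB9CB",
    "color.surface": "#E3E7EE",
    "font.heading": "Oswald",
    "font.body": "Inter",
}
```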

The visual direction was editorial, not app-like. More document than dashboard. More letter than chatbot. Every spacing decision, typography choice, and colour application was made to reinforce that feeling.

Wireframes came before high fidelity. High-fidelity screens were designed in both Figma and Claude Design, then used as the reference for the build in Replit.

  4. Build

The functional build was done in Replit and Lovable using vibe coding: natural-language prompts as the primary build tool, with the high-fidelity design as the reference.

The Claude API was connected using a structured system prompt that enforces the three-step reasoning chain, the output format, the tone behaviour, and the depth ladder for follow-ups.
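The shape of a single critique call can be sketched as follows: one system prompt carrying all the rules, plus the user's decision as the first message. The model name and token budget here are placeholders, not the real config; with the official Anthropic SDK this dict would be passed as keyword arguments to `client.messages.create(...)`.

```python
def build_critique_request(system_prompt: str, decision: str) -> dict:
    """Assemble one critique request (illustrative model name and budget)."""
    return {
        "model": "claude-sonnet-placeholder",
        "max_tokens": 1500,
        # The system prompt enforces the reasoning chain, output format,
        # tone behaviour, and depth ladder in one place.
        "system": system_prompt,
        "messages": [
            {"role": "user", "content": decision},
        ],
    }
```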

The build process reinforced something important: a precise design system produces better vibe coding results. When the colours, spacing, and typography are fully defined before prompting, the AI builds more accurately and requires fewer correction cycles.

The Result:

Yours, Critically intentionally has no onboarding, no dashboard, no account creation. You arrive, you describe a decision, you receive a critique.

Replit

What I Learned:

On product thinking

The most valuable weeks of this project were the ones before anything was built. Defining the user, the pain moment, the tone system, and the AI behaviour before opening a single tool meant that every subsequent decision had a clear reason behind it.

On designing AI behaviour

Prompting is a design skill. The system prompt for Yours, Critically went through as many iterations as the visual design, maybe more. Every word in it is a decision about how the product behaves.

The biggest shift in my thinking: AI output quality is a design problem, not a technical one. The model is capable. The question is whether you have given it the right structure, constraints, and reasoning framework to produce something worth using.

On vibe coding

My design background turned out to be a direct advantage, not just adjacent to the work. Thinking in components, describing interaction states, writing precise briefs: that is exactly what effective prompting requires.

The clearer and more specific the design system before building, the more accurate the vibe coded output. Ambiguous prompts produce ambiguous results. A fully specified design system produces builds that require fewer correction cycles.

On constraint as a feature

The five follow-up limit was the most counterintuitive decision in the product. Every instinct said: give users more. More turns, more depth, more flexibility. The right instinct was: give users enough. Five turns is sufficient to reach a meaningful conclusion on any design decision. Beyond that, quality degrades and the product loses its identity.
