The biggest mistake engineers make in AI system design interviews

Before you choose a vector database or retrieval strategy, there are four product questions every AI system design interview requires.

When an interviewer asks you to design an AI-powered feature, the pull to demonstrate technical knowledge is immediate.

The problem is that reaching for architectural decisions, such as which vector database or retrieval strategy to use, before you understand the system is one of the fastest ways to lose the thread of the interview.

The trap of showing what you know

AI system design is a relatively new interview format, and candidates know it's a hot topic. So when a prompt involves an LLM, there's a strong instinct to signal fluency quickly. What database should we use to store context documents? Should we use RAG? What's our prompt strategy?

These are real questions that will matter eventually. The issue is timing. Making architectural decisions before the system is understood means you might be solving the wrong problem, and experienced interviewers notice.

The reframe that helps is remembering that designing a system that involves an LLM is still just system design.

The LLM is a component with specific properties. It generates natural language, it can make some decisions, and it can hallucinate. Your job is to figure out where it plugs in and what it needs to function well. That work starts with product questions, not tooling choices.

The first question to ask: interactive or agentic?

Before touching architecture, establish the interaction model. There are two modes:

Interactive: The user makes a request and waits for a response. The system is linear. Latency matters. State is simpler because the conversation itself provides context.

Agentic: The user delegates a task and comes back later. The system works through it independently, making its own decisions about next steps across multiple LLM calls. State management becomes a central design challenge.
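The difference between the two modes can be sketched in a few lines. This is a minimal illustration, not a production design: `LLM` is a hypothetical stand-in for a model call, and `TaskState`, the `DONE` sentinel, and the `max_steps` cap are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a model call; a real system would call an LLM API here.
LLM = Callable[[str], str]


def handle_interactive(llm: LLM, history: list[str], user_message: str) -> str:
    """Interactive: one request, one response. The conversation is the state."""
    prompt = "\n".join(history + [user_message])
    reply = llm(prompt)
    history.extend([user_message, reply])
    return reply


@dataclass
class TaskState:
    """State the agentic loop has to carry between LLM calls (assumed shape)."""
    goal: str
    steps_done: list[str] = field(default_factory=list)
    finished: bool = False


def run_agentic(llm: LLM, state: TaskState, max_steps: int = 5) -> TaskState:
    """Agentic: the system picks its own next step until it says DONE or hits the cap."""
    while not state.finished and len(state.steps_done) < max_steps:
        prompt = f"Goal: {state.goal}\nDone so far: {state.steps_done}\nNext step?"
        decision = llm(prompt)
        if decision.strip() == "DONE":
            state.finished = True
        else:
            state.steps_done.append(decision)
    return state
```

Even in this toy version, the agentic path needs an explicit state object, a stopping condition, and a step limit, none of which the interactive path requires.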

What to ask about context before touching architecture

Once you know the interaction model, understand the context. Specifically, is it known upfront, or does the system have to find it?

Questions worth surfacing before making any retrieval decisions:

  • How large is the document corpus? A small corpus might live directly in the prompt. A large one means you've identified a retrieval problem that needs its own solution.
  • How often does the content change? A rapidly changing corpus may require an entirely separate system to keep context fresh.
  • Where does the content live today? This informs the retrieval approach, not the other way around.

This is where decisions like RAG or vector database selection start to make sense, after you understand what the context looks like, not before.
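As a rough sketch of how the answers to those questions drive the decision, the function below picks a context strategy from corpus size and change rate. Every number here (the context window, the share of the prompt reserved for documents, the change-rate threshold) is a hypothetical placeholder, not a recommendation, and `Corpus` and `choose_context_strategy` are names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Corpus:
    total_tokens: int      # rough size of all context documents
    changes_per_day: int   # how often content is added or edited


def choose_context_strategy(
    corpus: Corpus,
    context_window_tokens: int = 100_000,   # assumed model limit
    prompt_budget_fraction: float = 0.5,    # assumed share of the window for documents
    refresh_threshold_per_day: int = 100,   # assumed cutoff for "rapidly changing"
) -> dict:
    """Decide whether context fits in the prompt and whether freshness needs its own system."""
    budget = int(context_window_tokens * prompt_budget_fraction)
    fits_in_prompt = corpus.total_tokens <= budget
    return {
        "context": "inline_in_prompt" if fits_in_prompt else "retrieval",
        "needs_refresh_pipeline": corpus.changes_per_day >= refresh_threshold_per_day,
    }
```

With these placeholder defaults, a 20,000-token corpus that changes twice a day stays inline with no refresh pipeline, while a 5,000,000-token corpus that changes 500 times a day calls for both retrieval and a separate freshness system.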

Why reads and writes change the design

Reads are generally safe to hand off to an LLM. The consequences of a mistake are limited.

Writes are a different situation. If a chatbot can make account changes on behalf of a user, upgrade a subscription, or transfer funds, you now have a design problem around:

  • Authorization: Who has the right to take that action?
  • Reversibility: What happens if the LLM takes the wrong one?
  • Scope: What is the system actually allowed to do?

Surfacing the read/write shape early gets these questions on the table before they become blind spots. It also gives you something concrete to reason about when you get to guardrails and validation later.
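One way to make the three questions concrete is a small gate that every LLM-proposed action passes through before it runs. This is a sketch under simplifying assumptions: the action names, the `ALLOWED_ACTIONS` allowlist, and the rule that reads within scope are always allowed are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm_with_user"
    DENY = "deny"


@dataclass(frozen=True)
class Action:
    name: str
    is_write: bool
    reversible: bool


# Scope: an explicit allowlist of what the system may do at all (hypothetical names).
ALLOWED_ACTIONS = {"get_balance", "upgrade_subscription", "transfer_funds"}


def gate(action: Action, user_permissions: set[str]) -> Decision:
    """Check scope, then authorization, then reversibility for an LLM-proposed action."""
    if action.name not in ALLOWED_ACTIONS:       # Scope
        return Decision.DENY
    if not action.is_write:                      # Reads pass in this simplified sketch
        return Decision.ALLOW
    if action.name not in user_permissions:      # Authorization
        return Decision.DENY
    if not action.reversible:                    # Reversibility
        return Decision.CONFIRM
    return Decision.ALLOW
```

The point of the sketch is placement: the checks sit outside the model, so a wrong decision by the LLM is caught by ordinary code rather than by another prompt.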

Build interviewer trust

One of the most effective strategies in any system design interview is working breadth-first. Start at a high level, break the problem down progressively, and identify design challenges at each layer before going deep on any single component.

Breadth-first thinking generates a visible to-do list even when you run out of time. If you've surfaced questions about write authorization, context retrieval complexity, and failure handling in the first half of the interview, those are on record. The interviewer knows you were thinking about them.

Technical interview feedback often includes notes on things the candidate never mentioned. There's a meaningful difference between not solving a problem and never raising it.

Breadth-first thinking also helps you identify the most critical component early, so you can spend the limited interview time on the part of the system that matters most.

Designing for when the AI is wrong

This is where AI system design diverges most clearly from traditional system design, and where senior-level signal tends to show up.

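In a conventional system, failures are mostly predictable: a request times out or a service returns an error. In an AI-powered system, the failure mode is a response that's confidently wrong, and the system doesn't know it.

The question worth asking early is: what's the blast radius? A wrong answer to a read is usually a contained mistake. A wrong write, such as a funds transfer made on the user's behalf, may not be reversible.

Raising the consequences of a wrong answer before you design the validation layer demonstrates that you understand how AI systems actually fail in production.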
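A blast-radius estimate can then decide how much validation an output gets. The sketch below is one hypothetical way to express that: the three levels, the classification rule, and the check names are assumptions for illustration, not an established taxonomy.

```python
from enum import IntEnum


class BlastRadius(IntEnum):
    LOW = 1      # a wrong answer is an annoyance
    MEDIUM = 2   # a wrong answer misleads a user or changes recoverable state
    HIGH = 3     # a wrong output causes an effect that cannot be undone


def classify(is_write: bool, reversible: bool, user_facing: bool) -> BlastRadius:
    """Estimate blast radius from the read/write shape (assumed rule)."""
    if is_write and not reversible:
        return BlastRadius.HIGH
    if is_write or user_facing:
        return BlastRadius.MEDIUM
    return BlastRadius.LOW


# Hypothetical check names; a real validation layer would implement each one.
VALIDATION_PLAN = {
    BlastRadius.LOW: ["log_for_offline_review"],
    BlastRadius.MEDIUM: ["check_against_sources", "log_for_offline_review"],
    BlastRadius.HIGH: [
        "check_against_sources",
        "require_human_approval",
        "log_for_offline_review",
    ],
}


def checks_for(is_write: bool, reversible: bool, user_facing: bool) -> list[str]:
    """Return the validation steps to run before an output takes effect."""
    return VALIDATION_PLAN[classify(is_write, reversible, user_facing)]
```

Stating the levels first makes the validation layer a consequence of the failure analysis rather than a generic add-on.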

Practice AI system design with Formation

Formation's Studio Sessions include live, mentor-led system design practice built around the kinds of questions engineers are seeing in real interviews right now.

If you want structured preparation before your next loop, learn more about the Formation Fellowship.

Frequently asked questions

What's the biggest mistake engineers make in AI system design interviews?

Jumping to architectural decisions like database selection or retrieval strategy before understanding the system being built. Those decisions matter, but they're downstream of product-level questions about users, interaction models, and context.

What is the difference between an interactive and agentic AI system?

An interactive system responds to user requests in real time. An agentic system works through a task independently, makes its own decisions about next steps, and determines when output is ready to return. The distinction drives significant architectural differences, especially around state management and latency.

What product questions should you ask in an AI system design interview?

Start with who the system is for and how users interact with it. Establish whether context is known upfront or discovered at runtime. Ask about the read/write shape of the system. And raise what happens when the AI is wrong. The answers to these questions are what drive the right engineering decisions.

How do you show senior-level judgment in an AI system design interview?

Work breadth-first to surface design problems across the whole system before going deep on any one component. Identify the most critical part early. Reason clearly about failure modes and their consequences.