
AI-Proof Technical Interview Questions: What Still Works in 2026

ClarityHire Team (Editorial) · 2026-06-09 · 6 min read

The problem with "AI-proof"

Strictly speaking, no interview question is AI-proof. A determined candidate with a second monitor and fast typing can launder almost any answer through an LLM. The realistic goal is not impossibility but unfavorable economics: questions where the cost of using AI well is higher than the cost of just answering.

When AI use stops being a shortcut, it stops being a problem. The patterns below are the ones we see hold up across our own assessment library and across customer interview loops in 2026.

Pattern 1: Anchor every question in a piece of code the candidate must read first

A standalone prompt — "write a function that does X" — is in the training set. A prompt that depends on 80 lines of code the candidate has never seen is not.

Concrete shape:

Give a small repo or file (40–200 lines).
Ask a question whose answer requires understanding that specific code: "what would happen if two requests hit processOrder concurrently with the same order id?"
Then ask for a fix.

An LLM with no view of the file can guess a generic answer. A candidate who has read the code can give a specific one. The gap between the two is the signal.
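As a rough illustration of the kind of file this pattern calls for, the TypeScript sketch below shows a hypothetical processOrder with a check-then-act race, followed by one possible fix. Only the name processOrder comes from the example question above; the in-memory order store, chargeCard, and the in-flight guard are assumptions invented for this sketch.

```typescript
// Hypothetical interview artifact: every name except processOrder is invented.
type OrderRecord = { id: string; amountCents: number; status: "pending" | "paid" };

const orders = new Map<string, OrderRecord>();
const charges: string[] = []; // stands in for a payment provider's ledger

// Simulates a network call: it yields to the event loop before charging.
async function chargeCard(orderId: string): Promise<void> {
  await Promise.resolve();
  charges.push(orderId);
}

// Racy version: the status check and the status write are separated by an
// await, so two concurrent calls can both see "pending" and both charge.
async function processOrder(orderId: string): Promise<void> {
  const order = orders.get(orderId);
  if (!order || order.status !== "pending") return;
  await chargeCard(orderId);
  order.status = "paid";
}

// One possible fix: claim the order synchronously, before the first await.
const inFlight = new Set<string>();
async function processOrderOnce(orderId: string): Promise<void> {
  const order = orders.get(orderId);
  if (!order || order.status !== "pending" || inFlight.has(orderId)) return;
  inFlight.add(orderId);
  try {
    await chargeCard(orderId);
    order.status = "paid";
  } finally {
    inFlight.delete(orderId);
  }
}

// Fires two concurrent calls with the same order id against each version.
async function demo(): Promise<{ racyCharges: number; fixedCharges: number }> {
  orders.set("o1", { id: "o1", amountCents: 500, status: "pending" });
  await Promise.all([processOrder("o1"), processOrder("o1")]);
  const racyCharges = charges.length;

  charges.length = 0;
  orders.set("o2", { id: "o2", amountCents: 500, status: "pending" });
  await Promise.all([processOrderOnce("o2"), processOrderOnce("o2")]);
  return { racyCharges, fixedCharges: charges.length };
}
```

A candidate who has read this file can name the exact line where the race opens (the await between the check and the write). The in-process guard is only one option and would not help across multiple server instances, which is itself a useful follow-up question.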

This is the same logic behind the fix-the-codebase format, applied to question wording rather than question format.

Pattern 2: Ambiguous requirements that punish premature coding

LLMs are trained to produce confident, complete-looking solutions. They are bad at sitting with ambiguity. Questions that reward asking clarifying questions before writing code naturally select against AI pasting.

Example prompt: "We want to add a 'mark as favorite' feature to our notes app. Walk me through how you'd build it."

A senior engineer asks: per-user or global? Sync across devices? Order preserved? What happens when a note is deleted? Each clarifying question is signal. An AI-pasted answer skips straight to a favorites table schema and never surfaces a single tradeoff.

Score the number of meaningful clarifications before code, not just the code. This is the same principle behind a system design rubric that doesn't reward buzzwords — both penalize confident-sounding but context-blind answers.

Pattern 3: Follow-up questions that require ownership of the previous answer

The single most AI-resistant move in any interview is the live follow-up. An LLM can write code, but a candidate relaying its output cannot defend that code in real time with the interviewer watching.

Useful follow-up patterns:

"Why did you pick a hash map here? What would change if the inputs were sorted?"
"Walk me through what happens if this function is called from two threads."
"I'm going to change the requirements: instead of one user, it's a million. What breaks first?"
"You wrote const result = x ?? 0. What's the difference between ?? and || here?"

Candidates who wrote their own code can riff on it. Candidates who pasted can usually answer one follow-up, sometimes two, and then collapse. Three follow-ups make a near-perfect filter, and the same playbook that works on take-home submissions works inside live rounds.
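For the last question in that list, the short TypeScript sketch below shows what a strong answer would cover. The function names and the retry setting are hypothetical; the operator behavior is standard JavaScript.

```typescript
// `??` falls back only on null or undefined; `||` falls back on any falsy value.
// Returns both results side by side so the difference is visible.
function compareFallbacks(x: unknown): { nullish: unknown; or: unknown } {
  return { nullish: x ?? 0, or: x || 0 };
}

// With a fallback of 0, most inputs give the same result under both operators.
// They diverge for falsy values that are not nullish: NaN, "" and false.
const fromNaN = compareFallbacks(NaN);     // nullish keeps NaN, or gives 0
const fromEmpty = compareFallbacks("");    // nullish keeps "", or gives 0
const fromFalse = compareFallbacks(false); // nullish keeps false, or gives 0

// Hypothetical setting where 0 is a meaningful value rather than "missing".
function retriesWithNullish(retries?: number): number {
  return retries ?? 3; // 0 stays 0
}
function retriesWithOr(retries?: number): number {
  return retries || 3; // 0 silently becomes 3
}
```

The word "here" in the question matters: with a numeric x and a fallback of 0, the two operators disagree only on NaN, which is the detail a candidate who wrote the line is more likely to know.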

Pattern 4: Questions with novel constraints, not novel topics

You do not need an exotic algorithm to defeat training-set memorization. You need a small twist on a familiar problem. The twist forces actual reasoning rather than pattern-matched recall.

Concrete twists that work:

A custom data shape ("the input is a stream of {userId, eventType, ts} objects, not an array of integers").
An unusual cost model ("reads are free, writes cost 100x — design accordingly").
A constraint that flips the obvious solution ("you cannot use any hash-based structure").
A real-world dimension ("the function will run in a Lambda configured with 128 MB of memory").

The classic problem ("find duplicates") is in every model's training data. The same problem with {userId, ts} events and a memory cap is not — at least not in a form the model can paste directly. The candidate has to adapt, and adaptation is the skill you are actually hiring for.
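To make the contrast concrete, the TypeScript sketch below puts the classic "find duplicates" answer next to one possible answer under two of the twists above: event-shaped input and no hash-based structure. The UserEvent type, the function names, and the choice to treat two events as duplicates when all three fields match are assumptions made for illustration.

```typescript
type UserEvent = { userId: string; eventType: string; ts: number };

// Classic version: duplicates in an array of integers, using a Set.
function findDuplicateInts(values: number[]): number[] {
  const seen = new Set<number>();
  const dupes = new Set<number>();
  for (const v of values) {
    if (seen.has(v)) dupes.add(v);
    seen.add(v);
  }
  return Array.from(dupes);
}

// Total order over events: by userId, then eventType, then timestamp.
function compareEvents(a: UserEvent, b: UserEvent): number {
  if (a.userId !== b.userId) return a.userId < b.userId ? -1 : 1;
  if (a.eventType !== b.eventType) return a.eventType < b.eventType ? -1 : 1;
  return a.ts - b.ts;
}

// Twisted version: no Set or Map allowed, so sort a copy and compare
// neighbors. Each duplicated event is reported once.
function findDuplicateEvents(events: UserEvent[]): UserEvent[] {
  const sorted = events.slice().sort(compareEvents);
  const dupes: UserEvent[] = [];
  let prev: UserEvent | undefined;
  let lastReported: UserEvent | undefined;
  for (const current of sorted) {
    const repeatsPrev = prev !== undefined && compareEvents(prev, current) === 0;
    const alreadyReported =
      lastReported !== undefined && compareEvents(lastReported, current) === 0;
    if (repeatsPrev && !alreadyReported) {
      dupes.push(current);
      lastReported = current;
    }
    prev = current;
  }
  return dupes;
}
```

Swapping the hash set for a sort trades O(n) time for O(n log n), and the copy still holds every event in memory. Whether that is acceptable under a memory cap is exactly the kind of adaptation the candidate should be asked to reason about.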

Pattern 5: "Explain a tradeoff you have personally made" questions

Pure experience-elicitation questions are extremely hard to fake on the fly. The LLM can generate a plausible-sounding war story; it cannot make the story specific to a system this candidate actually shipped, and it definitely cannot answer three targeted follow-ups about that specific system.

The pattern:

Ask a tradeoff question: "Tell me about a time you picked a worse-on-paper solution because the better one was wrong for the context."
After they answer, ask: "What was the metric you were optimizing for?"
Then: "What was the strongest argument against your choice?"
Then: "What would you do differently now?"

Combined with a structured behavioral rubric, these questions are nearly impossible to fake at the level of detail a live interview demands. The candidate who actually shipped the system answers in 90 seconds. The candidate who is asking ChatGPT in another tab takes 30 seconds to start, gives a generic answer, and fumbles the second follow-up.

What does not work in 2026

For completeness, these are the patterns that now carry almost no signal:

Classic LeetCode-style algorithm prompts — pasted into any LLM, solved in seconds.
"Write a function that…" without surrounding code — same problem.
Trivia-style language questions ("what's the difference between let and var?") — solved by the model's first token.
"Design Twitter" / "design Uber" prompts — every YouTube interview-prep channel covers these, and any LLM will hand back the reference architecture.

If your current loop relies on these, your hiring signal is degrading whether you have noticed or not. Elsewhere, we covered the broader question of whether coding assessments still work with AI; the short answer is yes, but only if you redesign the questions.

Pair design with detection

Even well-designed questions benefit from a second layer. We score every assessment with keystroke biometrics for paste-burst patterns and run an LLM-coherence pass over the final submission. The point is not to catch every cheater — it is to remove the easy path so candidates self-select toward actually doing the work.
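To give a sense of what a paste-burst signal could look like, here is a deliberately simplified TypeScript sketch. It is not the scoring described above: the EditorInsert event shape, the function name, and both thresholds are assumptions chosen only for illustration.

```typescript
// Hypothetical editor event: how many characters were inserted, and when.
type EditorInsert = { tsMs: number; insertedChars: number };

// Groups inserts separated by at most maxGapMs into one burst. A burst of at
// least burstChars characters is treated as pasted. Returns the share of all
// inserted characters that arrived in such bursts (0 to 1).
// Both thresholds are illustrative guesses, not calibrated values.
function pasteBurstShare(
  events: EditorInsert[],
  burstChars: number = 40,
  maxGapMs: number = 50
): number {
  let total = 0;
  let pasted = 0;
  let burst = 0;
  let lastTs: number | undefined;
  for (const e of events) {
    total += e.insertedChars;
    const continuesBurst = lastTs !== undefined && e.tsMs - lastTs <= maxGapMs;
    if (!continuesBurst) {
      // The previous burst has ended: count it if it was large enough.
      if (burst >= burstChars) pasted += burst;
      burst = 0;
    }
    burst += e.insertedChars;
    lastTs = e.tsMs;
  }
  if (burst >= burstChars) pasted += burst; // close the final burst
  return total === 0 ? 0 : pasted / total;
}
```

A high share is a prompt for a human to look closer, not a verdict; candidates legitimately paste their own snippets, boilerplate, and test data.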

If you want the full taxonomy of signals, our "integrity report explained" article walks through what we surface and how to read it.

What to do next

Audit your current question bank against the five patterns above. Anything that:

Has no code or context to read before answering,
Has a single "correct" textbook solution,
Cannot be followed up on for three rounds, or
Is recognizably a Big Tech reference problem,

is now a low-signal question. Rewrite it or retire it. Replace it with a question that is anchored, ambiguous, follow-up-friendly, twisted, or experience-driven, ideally several at once.
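If your question bank lives in a spreadsheet or a repo, the audit above can be run as a simple filter. The TypeScript sketch below assumes a hypothetical BankQuestion record with one hand-labeled flag per criterion; the field names and sample questions are invented for illustration.

```typescript
// Hypothetical record: one flag per audit criterion, filled in by a reviewer.
type BankQuestion = {
  title: string;
  hasCodeOrContextToRead: boolean;
  hasSingleTextbookSolution: boolean;
  supportsThreeFollowUps: boolean;
  isKnownReferenceProblem: boolean;
};

// Returns every reason a question counts as low-signal; empty means it passes.
function lowSignalReasons(q: BankQuestion): string[] {
  const reasons: string[] = [];
  if (!q.hasCodeOrContextToRead) reasons.push("no code or context to read first");
  if (q.hasSingleTextbookSolution) reasons.push("single textbook solution");
  if (!q.supportsThreeFollowUps) reasons.push("cannot sustain three follow-ups");
  if (q.isKnownReferenceProblem) reasons.push("recognizable reference problem");
  return reasons;
}

const questionBank: BankQuestion[] = [
  {
    title: "Reverse a linked list",
    hasCodeOrContextToRead: false,
    hasSingleTextbookSolution: true,
    supportsThreeFollowUps: false,
    isKnownReferenceProblem: true,
  },
  {
    title: "Fix the race in processOrder",
    hasCodeOrContextToRead: true,
    hasSingleTextbookSolution: false,
    supportsThreeFollowUps: true,
    isKnownReferenceProblem: false,
  },
];

// Questions to rewrite or retire, with the reasons attached.
const auditFindings = questionBank
  .map((q) => ({ title: q.title, reasons: lowSignalReasons(q) }))
  .filter((finding) => finding.reasons.length > 0);
```

Keeping the reasons alongside each flagged question makes the rewrite easier, since each reason maps directly to one of the five patterns.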

The teams whose hiring signal is holding up in 2026 are the teams that did this work in 2024 and 2025. The teams complaining "candidates pass our interview and can't code on day one" mostly have not.
