The Staff Engineer Interview Loop: What Senior Loops Get Wrong
Why most staff loops produce senior offers
The single most common failure mode in staff engineer hiring is running the senior loop with a longer system design round and calling it a staff loop. Every interviewer scores the candidate against senior anchors. The debrief discusses whether the candidate is a "strong senior". The hiring manager pushes for staff because the role is open at staff. The offer either gets downleveled or it goes out at staff for someone who will struggle inside six months.
The reason this keeps happening is that the rubric for "senior" is well understood — solid coding, owns a service, mentors juniors, designs features end-to-end — and the rubric for "staff" is not. Staff engineering is the discipline of choosing the right problem, aligning a group of teams behind a solution, and being correct about technical trade-offs at a scope where being wrong is expensive. None of those things are visible in a 45-minute coding round.
This post is a loop design for staff and above. It assumes you already run a competent senior loop and want to test the additional dimensions that distinguish staff from a fast senior. It does not assume any particular company size.
The four dimensions that actually distinguish staff
A useful staff rubric scores against these, with explicit anchors at senior, staff, and senior staff:
- Scope of judgement. Can the candidate identify the right problem to work on, in a domain they have not seen before, given partial information and competing constraints? Senior engineers solve the problem you give them. Staff engineers tell you which problem to solve.
- Architectural decision quality. Can they reason about a system at a granularity where the trade-offs are real — cost, latency, blast radius, organisational coupling — and defend a choice in writing? This is not the same as recalling a textbook design.
- Cross-team influence. Can they describe a time they changed the direction of work outside their direct team, and articulate the mechanisms — the document, the meeting, the prototype — that made it happen? "I gave feedback in a review" is not staff-level. "I wrote the design doc that consolidated three competing proposals into one" is.
- Technical mentorship at distance. Can they raise the bar of engineers they do not manage, on topics they are not the team's expert on? This is the bar that separates staff from a senior engineer with extra years.
Every stage of the loop should be designed to probe one or two of these, not to re-test the senior bar.
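One way to make that check concrete is to write down which dimensions each stage claims to probe and look for gaps. The sketch below is illustrative only: the dimension identifiers are shortened from the list above, and the example loop (`senior_style_loop`) with its stage names is a hypothetical stand-in for "the senior loop with a longer system design round", not data from any real loop.

```python
# The four dimensions named in the rubric above.
DIMENSIONS = (
    "scope_of_judgement",
    "architectural_decision_quality",
    "cross_team_influence",
    "technical_mentorship_at_distance",
)


def coverage_gaps(stage_probes):
    """Return (untested_dimensions, overloaded_stages) for a loop.

    stage_probes maps a stage name to the list of dimensions it probes.
    A stage is "overloaded" if it tries to probe more than two dimensions.
    """
    probed = {dim for dims in stage_probes.values() for dim in dims}
    unknown = probed - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    untested = [dim for dim in DIMENSIONS if dim not in probed]
    overloaded = [stage for stage, dims in stage_probes.items() if len(dims) > 2]
    return untested, overloaded


# Hypothetical example: a senior loop with a longer system design round.
senior_style_loop = {
    "coding_round_1": [],
    "coding_round_2": [],
    "long_system_design": ["architectural_decision_quality"],
    "behavioural": [],
}

untested, overloaded = coverage_gaps(senior_style_loop)
print("untested dimensions:", untested)
print("overloaded stages:", overloaded)
```

A loop that leaves any dimension in `untested` is, by this rubric, still a senior loop, whatever the job title on the requisition says.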
The loop, stage by stage
Total: ~5 hours of live candidate time plus the async design document, ~10 hours of interviewer time. Six stages:
Stage 1: Hiring manager screen (60 min)
Not a fit chat. A working conversation about a real problem in your stack, framed loosely enough that the candidate has to ask questions before proposing anything. The hiring manager is listening for: do they narrow the problem before solving it? Do they ask about constraints that a senior would skip — org structure, on-call cost, migration risk?
This stage replaces the recruiter-led "tell me about yourself" round. Save the logistics for a separate 20-minute scheduling call.
Stage 2: A written design document (4–8 hours, async)
The single highest-signal stage in a staff loop, and the one most companies skip because they worry it is too much work to ask of a candidate. It is not. Senior candidates routinely complete 2-hour take-homes; a staff candidate will complete a longer, focused design exercise because writing it is also how they evaluate the role.
The prompt: a real-ish, intentionally underspecified problem from your domain. "We want to introduce a workflow engine for the orchestration our payments team currently does in ad-hoc Lambdas. Write a 3–8 page document that argues for an approach, names the alternatives you considered, and identifies the top three risks."
What the document tells you that a 60-minute whiteboard cannot:
- How the candidate structures an argument when they have time to think.
- Whether they cite trade-offs that map to your actual constraints, or hand-wave at general principles.
- Whether they identify failure modes the team has not yet discussed.
- How they write to an audience of other staff engineers, which is most of the job.
Score the document on its own rubric, before the next round. A weak document does not get rescued by a strong whiteboard.
Stage 3: Design document deep-dive (75 min)
A panel of two — the hiring manager and one staff or principal engineer — works through the document with the candidate. The candidate does not present; the panel reads the document beforehand and the time is spent on follow-up. The format:
- First 15 minutes: the alternatives. "You ruled out a step-functions approach. Why? What would change if our latency budget were 50 ms instead of 5 s?"
- Next 30 minutes: the failure modes. Walk through how the system breaks under three specific conditions you supply. Watch how the candidate updates their design in real time.
- Last 30 minutes: the migration. "We have 47 existing Lambdas that need to move to this. Walk me through the first 90 days."
This stage tests the first two dimensions jointly, scope of judgement and architectural decision quality: did the candidate identify the right problem, and is their design actually defensible when the constraints change?
Stage 4: Coding round (60 min)
Still required, even for staff. Two reasons: a staff engineer who cannot write code at all is rare but a costly hire, and the round tells you whether they can still pair with senior engineers on the work they actually need to do.
Skip LeetCode. Use a debugging or refactoring round on real-ish code — 200–400 lines, a non-trivial bug, an explicit request to leave the codebase in better shape than they found it. Score on the coding rubric you use for senior candidates, but weight diagnostic-thinking and code-review-quality higher than throughput.
A staff candidate who fluently identifies the bug, names two more issues they will not fix, and explains the smallest safe refactor is doing exactly what the job requires. One who out-codes a senior candidate on speed is not.
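The reweighting can be sketched as a weighted average over the sub-scores of the senior coding rubric. Everything numeric here is an assumption: the post only says diagnostic thinking and code-review quality should count for more than throughput, so the 2:2:1 weights, the sub-score names, and the 1–4 scale in the example are hypothetical.

```python
# Hypothetical weights: diagnostic thinking and code-review quality
# each count twice as much as raw throughput.
WEIGHTS = {
    "diagnostic_thinking": 2,
    "code_review_quality": 2,
    "throughput": 1,
}


def weighted_coding_score(scores, weights=WEIGHTS):
    """Weighted average of rubric sub-scores (all on the same scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Two illustrative candidates scored 1-4 on each sub-score.
careful = {"diagnostic_thinking": 4, "code_review_quality": 4, "throughput": 2}
fast = {"diagnostic_thinking": 2, "code_review_quality": 2, "throughput": 4}

print("careful:", weighted_coding_score(careful))
print("fast:", weighted_coding_score(fast))
```

Under these assumed weights the careful candidate scores higher than the fast one, which is the ordering the round is meant to produce.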
Stage 5: Cross-team influence interview (60 min)
This is the round most staff loops do not run, and it is the one that most reliably catches the false positive — the senior engineer who solves problems brilliantly but cannot move a group of teams.
A structured behavioural interview, four prompts deep, each one drilling into a different mechanism:
- "Tell me about a technical position you held that was unpopular with the team. What changed your mind, or what changed theirs?"
- "Describe a time you stopped a project that should not have shipped. How did you do it without losing the trust of the people working on it?"
- "Walk me through the last technical document you wrote that another team adopted as the basis for their work."
- "Tell me about a time you mentored an engineer outside your reporting chain on something they were better at than you a year later. What did you do?"
The candidate's answers should name people, documents, meetings, and outcomes. Vague answers — "we aligned, we built consensus, we shipped" — fail this round. The job is mechanism-rich; the interview should be too.
Stage 6: Bar raiser (45 min)
A staff or principal engineer from outside the team, evaluating the candidate against the company's staff bar, not the team's need. Their question to themselves at the end: "Would I want to work with this person on a problem I cared about?"
Hold a debrief the same day with locked scorecards. The format matters more at staff because the disagreements are about scope and judgement, not whether someone can code, and those are exactly the disagreements that a charismatic interviewer in the room can dissolve.
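With all six stages laid out, the live time budget can be tallied from the durations given in the stage headings. This is a rough sketch: the two-person panel for the deep-dive comes from the text, while one interviewer per remaining stage is an assumption, and the tally deliberately leaves out document review, scoring, and the debrief.

```python
# (stage, live minutes, interviewers present).
# Durations come from the stage headings; interviewer counts other than
# the two-person deep-dive panel are assumed to be one.
LIVE_STAGES = [
    ("Hiring manager screen", 60, 1),
    ("Design document deep-dive", 75, 2),
    ("Coding round", 60, 1),
    ("Cross-team influence interview", 60, 1),
    ("Bar raiser", 45, 1),
]

# The written design document is asynchronous: 4 to 8 hours of candidate time.
ASYNC_DOC_HOURS = (4, 8)


def live_candidate_hours(stages):
    """Hours the candidate spends in live interviews."""
    return sum(minutes for _, minutes, _ in stages) / 60


def live_interviewer_hours(stages):
    """Interviewer-hours spent in the room, excluding prep and debrief."""
    return sum(minutes * people for _, minutes, people in stages) / 60


live = live_candidate_hours(LIVE_STAGES)
print("candidate, live:", live, "hours")
print("candidate, total:", live + ASYNC_DOC_HOURS[0], "to", live + ASYNC_DOC_HOURS[1], "hours")
print("interviewers, in the room:", live_interviewer_hours(LIVE_STAGES), "hours")
```

The gap between the in-room figure and the full interviewer budget is the time spent reading and scoring the document and holding the debrief, which is worth budgeting explicitly.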
What to leave out
A few things the staff loop should not include, and the reasons each is a trap:
- Two coding rounds. One is enough. A second tells you nothing new and signals to the candidate that you do not know what you are hiring for.
- A system design round that is just "design Twitter". Generic prompts test memorisation. Use a problem from your actual domain so the candidate has to reason rather than recall.
- A "leadership principles" interview that is the behavioural round with a different name. If you already run a structured behavioural round, folding it into one well-designed cross-team round is stronger than running two thin ones.
- A presentation round, unless presenting is part of the job. Asking the candidate to present 30 minutes of past work is fine if the role requires public presentations to engineering. Otherwise it tests the wrong skill.
Calibrating interviewers to the staff bar
The hardest part of running a staff loop is not the round design — it is making sure interviewers know what they are scoring. Three concrete steps:
- Write anchored scoring guides for each dimension. A 3-out-of-4 on "architectural decision quality" should mean a specific thing — for example, "identifies the load balancing trade-off without prompting and explains the cost of getting it wrong". A 4 should mean "raises a constraint the prompt did not contain and reframes the problem". This is the same scorecard discipline that works at senior level, with different anchors.
- Run two practice debriefs before the first real one. Take two recent staff offers (one accepted, one declined) and have the loop interviewers score them blind, then debrief. The point is to surface where the panel disagrees about what staff means, before a real candidate is on the line. This is exactly the calibration most teams skip and most regret skipping.
- Track the downlevel rate. If staff-applied candidates get a staff offer 30% of the time and a senior offer 50% of the time, your hiring manager screen is letting too many senior candidates through. If the loop instead ends in a no-hire 60% of the time, your inbound source is wrong. Either way, the metric tells you which part of the pipeline to fix.
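Tracking the downlevel rate needs nothing more than a count of loop outcomes for staff-applied candidates. The sketch below is a minimal version: the outcome labels and function names are hypothetical, and the thresholds simply reuse the 50% and 60% figures from the example above as defaults rather than as recommended cut-offs.

```python
from collections import Counter

OUTCOMES = ("staff", "senior", "no_hire")


def outcome_rates(outcomes):
    """Share of staff-applied candidates ending in each outcome."""
    if not outcomes:
        raise ValueError("no outcomes recorded")
    counts = Counter(outcomes)
    return {label: counts[label] / len(outcomes) for label in OUTCOMES}


def diagnose(rates, senior_threshold=0.5, no_hire_threshold=0.6):
    """Flag the part of the pipeline the rates point at.

    Thresholds are illustrative defaults, not calibrated values.
    """
    findings = []
    if rates["senior"] >= senior_threshold:
        findings.append("hiring manager screen is letting too many senior candidates through")
    if rates["no_hire"] >= no_hire_threshold:
        findings.append("inbound source is wrong")
    return findings


# Ten illustrative staff-applied candidates: 3 staff, 5 senior, 2 no-hire.
history = ["staff"] * 3 + ["senior"] * 5 + ["no_hire"] * 2
rates = outcome_rates(history)
print(rates)
print(diagnose(rates))
```

With small samples the rates will swing widely, so the number is more useful as a trend over several quarters than as a single reading.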
What to do next
If you are setting up or fixing a staff engineer loop:
- Write the four-dimension rubric this week. Even a v1 with rough anchors beats no rubric at all.
- Add the written design document if you do not already have one. It is the highest-signal round and the cheapest to introduce.
- If you run two coding rounds, replace one of them with a cross-team influence interview. The trade is almost always worth it.
- Run two calibration debriefs on past candidates before the next live loop.
- Audit the last three staff hires against the rubric retrospectively. If two of them score weakly on a dimension you did not test, that is the round you are missing.
Staff hiring is not senior hiring with a longer system design. It is a different rubric, a different loop, and a different debrief. The teams that figure that out hire staff engineers who actually do the job. The ones that do not hire senior engineers with a different title and wonder why their architectural debt keeps growing.