A System Design Interview Rubric That Doesn't Reward Buzzwords
The buzzword problem
A candidate who says "we'd put a Redis cache in front, shard by user ID, use Kafka for the event bus, and run it in Kubernetes" sounds senior. They might be senior. They also might have memorized a YouTube video. A rubric that scores buzzwords cannot tell the difference.
The fix is to score the reasoning between the words, not the words themselves.
Five rubric dimensions worth scoring
1. Requirements clarification
Did the candidate ask before drawing? "What's the read/write ratio? How many users? What's the latency budget? What does failure mode X look like?" A senior engineer treats the prompt as ambiguous. A junior one treats it as a spec.
Score: Did they uncover at least one constraint that meaningfully changes the design?
2. Trade-off articulation
For every component choice — caching, sharding, consistency model — did they name the trade-off? "Postgres is fine here because the write volume is low and we want transactions" beats "we'd use Postgres" even if the answer is identical.
Score: Number of design choices accompanied by a stated alternative and a reason for the pick.
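One way to make this count concrete during the interview is to log each design choice as you hear it and tally only the ones that came with both an alternative and a reason. The sketch below is a minimal illustration, not a prescribed tool: the `DesignChoice` record, its field names, and the sample notes are all hypothetical (only the Postgres reasoning comes from the example above).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DesignChoice:
    """One component decision as noted by the interviewer (hypothetical record)."""
    component: str
    pick: str
    alternative: Optional[str] = None  # what the candidate said they considered instead
    reason: Optional[str] = None       # why they picked this over the alternative


def justified_choices(choices: List[DesignChoice]) -> int:
    """Count choices that came with BOTH a stated alternative and a reason."""
    return sum(1 for c in choices if c.alternative and c.reason)


# Illustrative interview notes; the alternatives here are assumptions.
notes = [
    DesignChoice("primary store", "Postgres",
                 alternative="a key-value store",
                 reason="write volume is low and we want transactions"),
    DesignChoice("event bus", "Kafka"),                # named, never justified
    DesignChoice("cache", "Redis", reason="it is fast"),  # reason, but no alternative
]

print(justified_choices(notes))
```

With these notes only the Postgres choice counts: three components were named, but one was justified.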
3. Failure-mode reasoning
What happens when the cache cluster falls over? When the message queue lags? When the leader fails over? Senior engineers anticipate failure. Less-experienced engineers design only the happy path.
Score: Did they identify the system's most likely failure mode unprompted?
4. Cost and operational awareness
A design that costs $40k/month for a side project is wrong. A design that requires a 24/7 on-call rotation for a feature with 100 users is wrong. Cost awareness (money, complexity, headcount) separates engineers who have run systems from engineers who have only designed them on paper.
Score: Did they reason about cost or operational burden unprompted?
5. Communication under correction
When you push back — "wait, but what if X?" — does the candidate update gracefully, dig in defensibly, or panic? All three are signal. Updating gracefully is what you want. Defending a position you've thought through is fine. Panicking and pivoting wildly is not.
Score: Quality of response to one targeted pushback.
How to use it
- Score each dimension 1–4, with written anchors describing what each level looks like.
- Score independently before debriefing with other interviewers.
- Submit the rubric before reading anyone else's score. ClarityHire's interview reports lock the rubric so it cannot be edited after seeing peer scores.
- Weight dimensions to the role. A staff-engineer loop should weight failure-mode reasoning and cost awareness more heavily; a senior loop can put more weight on trade-off articulation.
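The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how ClarityHire or any other tool implements it: the dimension keys, the `Scorecard` class, the `peer_scorecards` helper, and especially the weight values are hypothetical and only show the shape of the idea (1-4 scores, a lock on submission, role-specific weights).

```python
DIMENSIONS = (
    "requirements",
    "trade_offs",
    "failure_modes",
    "cost_and_ops",
    "communication",
)

# Hypothetical weights: illustrative numbers only, tune them per loop.
ROLE_WEIGHTS = {
    "staff": {"requirements": 1, "trade_offs": 1, "failure_modes": 2,
              "cost_and_ops": 2, "communication": 1},
    "senior": {"requirements": 1, "trade_offs": 2, "failure_modes": 1,
               "cost_and_ops": 1, "communication": 1},
}


class Scorecard:
    """One interviewer's scores; read-only once submitted."""

    def __init__(self, interviewer):
        self.interviewer = interviewer
        self.submitted = False
        self._scores = {}

    def set_score(self, dimension, level):
        if self.submitted:
            raise RuntimeError("scorecard is locked after submission")
        if dimension not in DIMENSIONS:
            raise KeyError(dimension)
        if level not in (1, 2, 3, 4):
            raise ValueError("level must be an integer from 1 to 4")
        self._scores[dimension] = level

    def submit(self):
        missing = [d for d in DIMENSIONS if d not in self._scores]
        if missing:
            raise ValueError(f"unscored dimensions: {missing}")
        self.submitted = True

    def weighted_score(self, role):
        """Weighted mean, on the same 1-4 scale as the raw scores."""
        if not self.submitted:
            raise RuntimeError("submit the scorecard before aggregating")
        weights = ROLE_WEIGHTS[role]
        total = sum(weights[d] * self._scores[d] for d in DIMENSIONS)
        return total / sum(weights.values())


def peer_scorecards(viewer, all_cards):
    """Reveal peers' submitted cards only after the viewer has submitted."""
    if not viewer.submitted:
        raise PermissionError("submit your own scorecard first")
    return [c for c in all_cards if c is not viewer and c.submitted]


# Example: strong everywhere except failure-mode reasoning.
card = Scorecard("interviewer_a")
for dim, level in zip(DIMENSIONS, (3, 3, 1, 3, 3)):
    card.set_score(dim, level)
card.submit()
print(round(card.weighted_score("staff"), 2), round(card.weighted_score("senior"), 2))
```

With these illustrative weights the same raw scores come out to 2.43 for a staff loop and 2.67 for a senior loop, because the weak failure-mode score counts double in the staff weighting. Keeping the result on the 1-4 scale means the level anchors still apply to the aggregate.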
The rubric does not prevent a candidate with memorized answers from passing. It does prevent that candidate from passing easily, and that is most of the value.