How to Write an AI Tool Policy for Technical Interviews
Why "no AI" is not a policy anymore
A year ago, "candidates may not use AI tools during this interview" felt like a sufficient policy. In 2026 it is not. Two things changed.
First, the rules are no longer consistent across the industry. Meta, Canva, Shopify, Rippling, and Coinbase explicitly encourage AI use in some technical rounds. Amazon and Goldman Sachs disqualify candidates for it. Anthropic flipped its own policy twice in a single quarter. A candidate interviewing at three companies in the same week is now expected to track three different rulebooks.
Second, the tools have changed. A candidate with an LLM in another tab is the floor. The ceiling is a desktop AI agent that watches the shared screen, a wireless earpiece coaching answers in real time, and a second laptop running an interview-copilot service designed specifically to be invisible during video calls. Telling that candidate "please do not use AI" is not a policy — it is a wish.
A policy that holds up needs to specify what is allowed per stage, what is explicitly forbidden, how disclosure works, and what happens when the rule is broken. This post is the template we recommend to hiring managers writing theirs from scratch.
The four-tier framework
Most interview loops have three or four distinct stages. Each stage has a different goal, so each stage needs a different AI rule. Pick a tier per stage from this menu:
- AI-forbidden. No AI assistance of any kind. Suitable for a tight live round whose goal is verifying the candidate can reason in real time without external help.
- AI-allowed-with-disclosure. Candidate may use AI but must disclose how they used it in the writeup. Suitable for take-home assignments where the work product is the thing you grade.
- AI-allowed-with-verification. Candidate may use AI; a live follow-up walkthrough probes whether they understand what they submitted. Suitable for any async stage whose output feeds into a live debrief.
- AI-required. Candidate is expected to use AI tools, screen-share their interactions, and is evaluated on how they collaborate with the model. Suitable for one round in roles where day-to-day work is AI-assisted.
The mistake is to pick one tier and apply it to the entire loop. The fix is to map each stage to the tier that matches what you are actually trying to measure.
A worked example: a four-stage engineering loop
Here is what this looks like for a typical mid-to-senior software role:
| Stage | Format | AI tier | Why |
|---|---|---|---|
| 1. Resume + AI screen | Async | Allowed-with-disclosure | The screen is a filter, not a final judgment, so the candidate's AI use matters little here. |
| 2. Take-home (2-hour cap) | Async | Allowed-with-verification | You want to see how they think, but you'll verify live in stage 3. |
| 3. Live follow-up + extension | Live | Forbidden | The single stage where you need raw, unaided reasoning. |
| 4. AI pair-programming round | Live | Required | Evaluates AI-collaboration skill, which is now a job-relevant competency. |
This sequence — async with AI, then live without, then live with — is the version of the async-then-sync flow most teams are converging on for engineering roles. It tests the candidate's unaided thinking exactly once, in the right place, while respecting that the rest of their work life is AI-assisted.
For non-engineering roles (PM, design, sales), the same template works with different stage shapes. Pick the one stage where AI use would invalidate the signal and make it the forbidden round.
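The stage-to-tier mapping in the table can also be written down as data, which makes the "one tier for the whole loop" mistake easy to catch mechanically. The following is a minimal sketch, not a reference implementation: the names `Tier`, `Stage`, and `check_loop` are hypothetical, and the three checks are one reading of the guidance in this post rather than fixed rules.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FORBIDDEN = "AI-forbidden"
    ALLOWED_WITH_DISCLOSURE = "AI-allowed-with-disclosure"
    ALLOWED_WITH_VERIFICATION = "AI-allowed-with-verification"
    REQUIRED = "AI-required"


@dataclass(frozen=True)
class Stage:
    name: str
    live: bool  # True for a live round, False for an async stage
    tier: Tier


# The four-stage engineering loop from the table above.
ENGINEERING_LOOP = [
    Stage("Resume + AI screen", live=False, tier=Tier.ALLOWED_WITH_DISCLOSURE),
    Stage("Take-home", live=False, tier=Tier.ALLOWED_WITH_VERIFICATION),
    Stage("Live follow-up + extension", live=True, tier=Tier.FORBIDDEN),
    Stage("AI pair-programming round", live=True, tier=Tier.REQUIRED),
]


def check_loop(stages):
    """Return a list of problems; an empty list means the loop passes.

    The checks are assumptions drawn from this post, not universal rules.
    """
    problems = []
    tiers = [s.tier for s in stages]

    # One tier applied to the entire loop is the mistake described above.
    if len(stages) > 1 and len(set(tiers)) == 1:
        problems.append("every stage uses the same tier")

    # Unaided reasoning is tested exactly once.
    forbidden = tiers.count(Tier.FORBIDDEN)
    if forbidden != 1:
        problems.append(
            f"expected exactly one AI-forbidden round, found {forbidden}"
        )

    # A verification tier needs a later live stage to verify in.
    for i, stage in enumerate(stages):
        if stage.tier is Tier.ALLOWED_WITH_VERIFICATION:
            if not any(later.live for later in stages[i + 1:]):
                problems.append(f"{stage.name}: no later live stage to verify in")

    return problems
```

Calling `check_loop(ENGINEERING_LOOP)` returns an empty list for the loop above. A non-engineering loop can reuse the same structure with different stage names.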
What to put in writing
Three places the policy must appear, in identical language:
The job posting. One line: "Our technical interview includes one AI-forbidden live round and one AI-required pair-programming round. Other stages allow AI with disclosure." This lets candidates self-select before they apply and pre-empts the policy-shock complaint.
The interview invite email. Three to five sentences per stage. Be explicit about the tools you mean (ChatGPT, Claude, Copilot inline suggestions, voice agents, second laptops, interview-copilot services). The vague "please do not use external assistance" framing gives a dishonest candidate ambiguity to hide behind, and gives an honest candidate anxiety about doing something accidentally wrong.
The opening of the interview itself. Thirty seconds at the start of every live round restating the rule. Candidates forget. Restating it is also a quiet integrity test: a candidate who has AI primed will often react visibly.
ClarityHire's interview scheduling emails include a per-stage policy block that pulls from the assessment template, so the same wording reaches the candidate that the interviewer is enforcing — no drift between what was promised and what is being scored.
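One way to keep the wording identical across all three places is to store a single canonical sentence per tier and have every channel embed it unchanged. The sketch below is illustrative only and does not describe how ClarityHire implements its policy block. The dictionary `TIER_WORDING`, the function names, and the surrounding email and opening text are all hypothetical.

```python
# Canonical wording per tier, paraphrased from the tier definitions in this post.
# Edit the sentence here and every channel picks up the change.
TIER_WORDING = {
    "forbidden": "No AI assistance of any kind is permitted in this round.",
    "disclosure": "You may use AI, and you must disclose how you used it in your writeup.",
    "verification": "You may use AI; a live follow-up will ask you to walk through what you submitted.",
    "required": "You are expected to use the AI tool we provide and to share your screen while you work with it.",
}


def stage_rule(stage_name, tier):
    """Build the one sentence that states the rule for a stage."""
    return f"{stage_name}: {TIER_WORDING[tier]}"


def render_channels(stage_name, tier):
    """Wrap the same rule sentence for each place the policy appears.

    Each channel adds framing around the rule but never rewrites it.
    """
    rule = stage_rule(stage_name, tier)
    return {
        "job_posting": rule,
        "invite_email": f"About your upcoming interview. {rule} Reply to this email if anything is unclear.",
        "interview_opening": f"Before we start, a reminder of the rule for this round. {rule}",
    }


if __name__ == "__main__":
    for channel, text in render_channels("Live follow-up", "forbidden").items():
        print(f"[{channel}] {text}")
```

Because every channel is generated from `stage_rule`, a wording change cannot reach the invite email without also reaching the interviewer's opening script.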
Enforcement: what each tier actually requires
The hard part of a policy is not writing it. It is enforcing it without becoming surveillance theater.
For AI-forbidden rounds:
- Camera on. Full-screen share, not just the IDE window. This catches tab strips and other windows.
- Audio through the candidate's own microphone and speakers; no AirPods or other earbuds if you can help it, since an earpiece is an easy channel for a coach.
- Real-time integrity signals: keystroke biometrics to flag burst-paste events, tab-switch counting to catch alt-tabs, face continuity to spot a hand-off.
- A 5-minute conversational walkthrough at the end: "talk me through the trickiest part of what you just did." A coached candidate will struggle; a real one will not.
None of these signals is conclusive on its own. Used together, they catch the obvious cases and leave the gray-area cases for a human reviewer.
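To make "used together" concrete, here is a rough sketch of how the signals listed above might be combined into a triage label instead of a verdict. Everything in it is an assumption made for illustration: the `SessionSignals` fields, the thresholds `PASTE_CHARS` and `TAB_SWITCH_LIMIT`, and the rule that one fired signal goes to a reviewer while two or more are raised during the round. Real thresholds would need tuning against your own sessions.

```python
from dataclasses import dataclass


@dataclass
class SessionSignals:
    insert_sizes: list[int]      # characters added by each editor insert event
    tab_switches: int            # times the interview window lost focus
    face_continuity_breaks: int  # times face tracking lost the candidate


# Hypothetical thresholds, chosen only for this example.
PASTE_CHARS = 300
TAB_SWITCH_LIMIT = 3


def fired_signals(signals):
    """Return the names of the integrity signals that fired in a session."""
    fired = []
    if any(size >= PASTE_CHARS for size in signals.insert_sizes):
        fired.append("burst_paste")
    if signals.tab_switches > TAB_SWITCH_LIMIT:
        fired.append("tab_switching")
    if signals.face_continuity_breaks > 0:
        fired.append("face_continuity")
    return fired


def triage(signals):
    """Map fired signals to a next step. This never produces a verdict."""
    fired = fired_signals(signals)
    if not fired:
        return "no_flag", fired
    if len(fired) == 1:
        # A single signal is a gray area: leave it to a human reviewer.
        return "human_review", fired
    # Several signals together: raise it with the candidate during the round.
    return "raise_in_round", fired
```

The function returns the fired signal names alongside the label so the reviewer sees what triggered the flag, not just that one exists.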
For AI-allowed-with-verification rounds:
- Run code coherence analysis over the submission to flag patterns characteristic of LLM-generated code: textbook-prose comments, defensive error handling for cases that were not tested, sudden idiom shifts across functions.
- Use the coherence report as context for the live follow-up, not as a verdict. Ask "walk me through this function" about the sections the analysis flagged.
- Score the writeup as heavily as the code. A candidate who copy-pasted from an LLM can submit working code; they usually cannot write a coherent rationale for the design choices.
For AI-required rounds:
- Provide the AI tool. Do not ask the candidate to use their own subscription — it favors candidates who pay for the best plan.
- Record the prompts, not just the code. The conversation transcript is the thing you grade. (We cover the scoring rubric in a separate post on evaluating AI collaboration skills.)
- Have the candidate narrate as they prompt. The reasoning is the signal.
How to handle disclosure honestly
If your policy allows AI in a stage, you need a disclosure mechanism that does not punish candidates for telling the truth. Two rules:
- Disclose what, not how much. Asking "did you use AI?" is binary and useless. Asking "name the parts of this submission where AI was the primary author, and what you did to verify them" is concrete and useful. The candidate who says "I asked Claude for the regex and then wrote tests to convince myself it was right" is showing exactly the workflow you want.
- Never penalize disclosed AI use in an AI-allowed stage. If a candidate followed the rule and you punish them anyway, you are training the next candidate to hide it. The whole policy collapses.
Combine these with the verification step and you get a stage that distinguishes the candidate who used AI thoughtfully from the candidate who pasted output verbatim — without forcing either to lie.
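A disclosure form built around "what, not how much" can be as small as a list of entries, each naming a part of the submission and the check the candidate ran on it. The sketch below shows one possible shape and how it could feed the verification step. `DisclosureEntry` and `follow_up_prompts` are hypothetical names, and the prompt wording is only an example.

```python
from dataclasses import dataclass


@dataclass
class DisclosureEntry:
    part: str          # part of the submission where AI was the primary author
    tool: str          # the tool the candidate used
    verification: str  # what the candidate did to verify it; may be empty


def follow_up_prompts(entries):
    """Turn disclosure entries into questions for the live walkthrough.

    Nothing here scores the candidate. Disclosed use only shapes what the
    interviewer asks about.
    """
    prompts = []
    for entry in entries:
        if entry.verification.strip():
            prompts.append(
                f"You verified {entry.part} by: {entry.verification}. "
                "Walk me through what that check covers."
            )
        else:
            prompts.append(
                f"No verification is listed for {entry.part}. "
                "Walk me through how it works."
            )
    return prompts


if __name__ == "__main__":
    entries = [
        DisclosureEntry("the regex", "Claude", "wrote tests against sample inputs"),
        DisclosureEntry("the retry loop", "Claude", ""),
    ]
    for prompt in follow_up_prompts(entries):
        print(prompt)
```

The design choice worth noting is that the form has no field for a score or a percentage, so there is nothing on it to penalize.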
The breach-of-policy decision
What happens when the integrity layer flags a clear breach in a forbidden round? Three reasonable responses, in increasing severity:
- Pause and ask directly. "I noticed a 1,200-character paste at minute 23. Walk me through that section." Most honest cases — copying boilerplate from the problem statement, pasting their own scratch work — resolve here.
- Continue with a noted concern. Document the signal in the hiring report for the debrief. Let the panel weigh it alongside the other rounds.
- End the interview. Reserve for unambiguous evidence: a visible second person on camera, an audible coach, a paste that the candidate cannot remotely explain. Most interviewers under-react to clear evidence and over-react to ambiguous signal; bias toward addressing clear cases promptly and letting the rest play out.
Decide the escalation ladder before the loop opens, not in the moment. Write it into the interviewer training doc next to the rubric.
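Writing the ladder down as a small decision function is one way to force those choices ahead of time. The sketch below encodes one reading of the three responses above, with an extra `RESOLVED` outcome for cases that clear up once the candidate explains. The enum names and the exact branching are assumptions, and a real training doc may draw the lines differently.

```python
from enum import Enum


class Evidence(Enum):
    SECOND_PERSON_ON_CAMERA = "second person on camera"
    AUDIBLE_COACH = "audible coach"
    LARGE_PASTE = "large paste"
    OTHER_SIGNAL = "other signal"


class Explanation(Enum):
    NOT_ASKED = "not asked yet"
    SATISFACTORY = "satisfactory"
    PARTIAL = "partial"
    NONE = "none"


class Response(Enum):
    PAUSE_AND_ASK = "pause and ask directly"
    RESOLVED = "resolved, no further action"
    CONTINUE_WITH_NOTED_CONCERN = "continue with a noted concern"
    END_INTERVIEW = "end the interview"


# Evidence treated as unambiguous without needing an explanation.
UNAMBIGUOUS = {Evidence.SECOND_PERSON_ON_CAMERA, Evidence.AUDIBLE_COACH}


def next_step(evidence, explanation=Explanation.NOT_ASKED):
    """Pick the response for a flagged breach in a forbidden round."""
    if evidence in UNAMBIGUOUS:
        return Response.END_INTERVIEW
    if explanation is Explanation.NOT_ASKED:
        # Always ask before escalating an ambiguous signal.
        return Response.PAUSE_AND_ASK
    if explanation is Explanation.SATISFACTORY:
        return Response.RESOLVED
    if evidence is Evidence.LARGE_PASTE and explanation is Explanation.NONE:
        # A paste the candidate cannot explain at all counts as unambiguous.
        return Response.END_INTERVIEW
    return Response.CONTINUE_WITH_NOTED_CONCERN
```

For example, `next_step(Evidence.LARGE_PASTE)` returns `Response.PAUSE_AND_ASK`, and only a follow-up call with the candidate's explanation can move the case further up the ladder.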
Calibrating the policy across interviewers
Two interviewers running the same loop with different mental models of the AI policy will produce inconsistent decisions. Three concrete moves to keep them aligned:
- A 20-minute calibration session per quarter. Run a recorded example through the loop and have every interviewer score it. The scores will diverge wildly the first time; that is the point.
- Locked scoring before the debrief. If interviewers can edit their scores during the meeting, the policy becomes a negotiation. Use structured scorecards that commit each interviewer's rating before the panel meets.
- An explicit "policy notes" field on the scorecard. Each interviewer documents what AI signals they saw and how they weighted them. The next debrief surfaces the disagreement rather than dissolving it.
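The locked-scoring and calibration moves above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular scorecard product: the `Scorecard` class, the `open_debrief` gate, and the use of max-minus-min as the divergence measure are all hypothetical choices.

```python
class Scorecard:
    """One interviewer's ratings, frozen once lock() is called."""

    def __init__(self, interviewer):
        self.interviewer = interviewer
        self.scores = {}
        self.policy_notes = ""  # AI signals seen and how they were weighted
        self.locked = False

    def set_score(self, criterion, value):
        if self.locked:
            raise PermissionError(f"{self.interviewer}'s scorecard is locked")
        self.scores[criterion] = value

    def lock(self):
        self.locked = True


def open_debrief(cards):
    """Refuse to start the debrief until every scorecard is committed."""
    unlocked = [c.interviewer for c in cards if not c.locked]
    if unlocked:
        raise RuntimeError(f"unlocked scorecards: {', '.join(unlocked)}")
    return cards


def score_spread(cards, criterion):
    """Gap between the highest and lowest rating for one criterion.

    A large spread in a calibration session marks a criterion where
    interviewers hold different mental models of the policy.
    """
    values = [c.scores[criterion] for c in cards]
    return max(values) - min(values)
```

In a calibration session, computing `score_spread` per criterion on the recorded example gives the panel a short list of where to focus the 20 minutes.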
What to do next
If you are writing or rewriting your AI policy this week:
- Pick the one round in your loop where unaided reasoning matters most. Make it AI-forbidden.
- Pick the one round, if any, where AI use is part of the actual job. Make it AI-required.
- Make everything else AI-allowed with a verification step. This is the safest default for the largest number of stages.
- Write the policy into the job posting, the invite email, and the interview opening. Same words, three places.
- Decide your enforcement signals per stage before the next candidate interviews. Decide your escalation ladder before the first breach happens.
A short, specific, per-stage policy beats a long, vague, one-size-fits-all rule almost every time. Candidates will respect it, interviewers can enforce it, and the loop produces the signal you actually wanted.