How to Detect Cursor and Copilot Use in Coding Interviews

ClarityHire Team (Editorial) · 9 min read

Why the ChatGPT playbook misses Cursor and Copilot

The standard cheat-detection advice for coding interviews is built around one assumption: the candidate alt-tabs to a chat window, asks for a solution, and pastes it back. Paste size, tab-switch duration, and burst-typing rate all key off that workflow.

In 2026, that assumption is wrong for the AI tools most candidates actually use. Cursor, GitHub Copilot, Windsurf, Continue, and the new wave of invisible-overlay tools like Cluely don't paste. They sit inside the editor, accept suggestions on Tab, and emit characters into the buffer in a way that looks far more like typing than like a paste event. A candidate using Cursor well will never trigger a paste alert on your platform and will never switch tabs. The integrity report from a naive screening tool will read clean.

This is a different problem, and it needs a different signal stack. This post walks through the behavioural fingerprint of in-editor AI assistants, the specific moments where they break, and how to design the live round so that an honest user of these tools still produces interpretable signal.

What in-editor AI actually looks like to the platform

Three categories of tool, three different traces:

  • Tab-completion agents (Copilot, Cursor Tab, JetBrains AI). A "ghost text" suggestion appears after a brief pause; the candidate hits Tab; 40–300 characters land in the buffer in a single tick. To a keystroke recorder, that tick can read as either a burst of fast keystrokes or as a single insertion event, depending on how the editor dispatches input.
  • Agentic editors (Cursor Composer, Windsurf Cascade, Claude Code, Aider). The candidate types or speaks a natural-language instruction in a side panel; the agent rewrites a chunk of the file. From outside, this looks like a series of multi-line edits with no corresponding keystrokes between them.
  • Invisible overlay tools (Cluely, Interview Solver, Interview Coder). A separate process reads the screen and shows the candidate an answer in an overlay that is hidden from screen-capture APIs. The candidate then re-types or transcribes from it. The on-platform trace is "candidate typed it themselves", but the rhythm is wrong.

The candidate is not pasting. They are accepting, transcribing, or instructing. Each of these has a tell.

Signal 1: Tab-accept bursts inside a typing stream

The cleanest fingerprint of a tab-completion tool is the shape of the keystroke stream. Hand-typed code on a 30-character line looks like 30 keypresses spaced 80–250 ms apart, with the occasional 1–2 second pause. A Copilot Tab acceptance on the same line looks like one event at +0 ms followed by 30 characters at +1 ms each — or in some editors, a single buffer-insertion with no per-character timing at all.

ClarityHire's keystroke recorder classifies any contiguous run of input where the inter-key delay falls below ~5 ms as a machine-emitted block. The block is logged separately from typed characters and shows up on the integrity report as a distinct event type — not a paste, not a keystroke, an accept.
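A minimal sketch of that classification in Python, assuming the recorder emits one event per insertion with a millisecond timestamp. The names InputEvent, Segment, and classify are hypothetical, as is the MIN_BLOCK_CHARS cut-off (added so that an editor's auto-closed bracket pair is not counted as an accept). Only the ~5 ms gap comes from the description above, and this is not ClarityHire's actual implementation.

```python
from dataclasses import dataclass
from typing import List

ACCEPT_GAP_MS = 5.0   # the "~5 ms" inter-key threshold described above
MIN_BLOCK_CHARS = 8   # hypothetical: ignore tiny bursts such as auto-closed "()"


@dataclass
class InputEvent:
    t_ms: float  # timestamp of the insertion, in milliseconds
    text: str    # characters inserted by this single event


@dataclass
class Segment:
    kind: str        # "typed" or "accept"
    start_ms: float  # timestamp of the first event in the segment
    text: str        # all characters inserted in the segment


def classify(events: List[InputEvent]) -> List[Segment]:
    """Split an insertion stream into typed stretches and accept blocks."""
    # 1. Group events into runs whose internal gaps are below the threshold.
    runs: List[List[InputEvent]] = []
    for ev in events:
        if runs and ev.t_ms - runs[-1][-1].t_ms < ACCEPT_GAP_MS:
            runs[-1].append(ev)
        else:
            runs.append([ev])

    # 2. A run that delivers enough characters is machine-emitted. This also
    #    covers editors that dispatch one multi-character insertion event.
    #    Pastes are assumed to be logged separately and not passed in here.
    segments: List[Segment] = []
    for run in runs:
        text = "".join(e.text for e in run)
        kind = "accept" if len(text) >= MIN_BLOCK_CHARS else "typed"
        if kind == "typed" and segments and segments[-1].kind == "typed":
            segments[-1].text += text  # merge adjacent hand-typed keys
        else:
            segments.append(Segment(kind, run[0].t_ms, text))
    return segments
```

Fed four hand-typed keys, then 30 characters arriving 1 ms apart, then one more typed key, classify returns three segments: typed, accept, typed.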

What to look for in the report:

  • A series of 5–10 accept events per minute, each 40–250 characters long, interleaved with normal typing. This is the canonical Copilot/Cursor rhythm.
  • Accept blocks that complete entire functions in one event (300+ characters). This is agentic-editor behaviour, not tab completion.
  • Zero pastes, zero tab switches, but the candidate produces a 200-line working solution in 22 minutes flat. The volume itself is the signal.

None of these is a smoking gun on its own. Together, on a problem the candidate has never seen, they are.
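The three report patterns above can be sketched as a small flagging function. This is an illustration, not a scoring rule: the Accept type, the flag names, and the 8-lines-per-minute volume cut-off (loosely derived from the 200-lines-in-22-minutes example) are all assumptions, and the accept rate is averaged over the whole session rather than measured per minute.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Accept:
    t_ms: float   # when the accept block landed
    n_chars: int  # how many characters it inserted


def rhythm_flags(accepts: List[Accept], duration_min: float, pastes: int,
                 tab_switches: int, final_lines: int) -> Set[str]:
    """Return the report patterns that are present in a session."""
    flags: Set[str] = set()
    if duration_min <= 0:
        return flags

    # Pattern 1: 5+ tab-sized accepts (40-250 chars) per minute on average.
    tab_sized = [a for a in accepts if 40 <= a.n_chars <= 250]
    if len(tab_sized) / duration_min >= 5:
        flags.add("tab_completion_rhythm")

    # Pattern 2: at least one accept large enough to be a whole function.
    if any(a.n_chars >= 300 for a in accepts):
        flags.add("function_sized_accept")

    # Pattern 3: clean paste/tab-switch record but unusually high volume.
    # The 8 lines/minute threshold is a hypothetical value.
    if pastes == 0 and tab_switches == 0 and final_lines / duration_min >= 8:
        flags.add("volume_without_pastes")

    return flags
```

A reviewer would treat a single flag as a prompt to look closer and reserve real concern for sessions where all three come back.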

Signal 2: The cursor jumps the candidate cannot do by hand

When a human writes code, the cursor moves the way humans move cursors: arrow keys, Home/End, the occasional click into another line. When an agentic editor rewrites a region, the cursor teleports — the next edit is two functions away from the previous one, with no intervening navigation. When a tool refactors across files, the active file changes without a Cmd-P fuzzy-find.

The collaborative editor used in live coding rounds records cursor position and active document on every change. In the timeline, "teleport edits" stand out: an edit at line 80 immediately followed by an edit at line 12 of a different file, with no visible navigation and no scroll. A human can do that with hotkeys, but the cluster of such moves across a round is what matters. Two or three is normal. Twenty in 20 minutes is the agent at work.
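One way to count teleport edits from such a timeline, as a rough sketch. It assumes the editor log can be reduced to "edit" and "nav" events (arrow keys, clicks, scrolls, file opens), each carrying the active file and line. The TimelineEvent type and the three-line tolerance are hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimelineEvent:
    kind: str  # "edit" or "nav" (any visible navigation or scroll)
    file: str  # active document when the event fired
    line: int  # cursor line when the event fired


def count_teleports(timeline: List[TimelineEvent], max_line_jump: int = 3) -> int:
    """Count edits that land far from the previous edit with no navigation."""
    teleports = 0
    last_edit: Optional[TimelineEvent] = None
    navigated = False  # has any navigation happened since the last edit?

    for ev in timeline:
        if ev.kind == "nav":
            navigated = True
            continue
        if last_edit is not None and not navigated:
            changed_file = ev.file != last_edit.file
            jumped_far = abs(ev.line - last_edit.line) > max_line_jump
            if changed_file or jumped_far:
                teleports += 1
        last_edit = ev
        navigated = False
    return teleports
```

The count only means something as a rate over the round: two or three is within what hotkeys explain, while twenty in 20 minutes is not.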

Signal 3: Typing rhythm that does not match the candidate's own baseline

Tab-acceptance gives the candidate a finished line. They still have to type the instruction that summoned it, or type the next line themselves. So the keystroke stream becomes bimodal: long stretches of machine-emitted code separated by short stretches of human typing.

Keystroke biometrics on the typed portions are still useful: the candidate's dwell-and-flight fingerprint should be consistent across the session. But the ratio of hand-typed to machine-emitted characters is the new signal. A Cursor-heavy candidate types maybe 15–25% of the characters in the final file. A non-user types 95%+. The ratio is visible in the report, and it does not depend on classifying any single event correctly.
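The ratio itself is simple arithmetic. A sketch, assuming the session has already been split into (kind, text) pairs where kind is "typed" or "accept"; that representation and the function name are hypothetical. It measures the share of inserted characters rather than characters surviving in the final file, which is a simplification.

```python
from typing import List, Optional, Tuple


def typed_ratio(segments: List[Tuple[str, str]]) -> Optional[float]:
    """Share of inserted characters that were hand-typed, or None if no input."""
    typed = sum(len(text) for kind, text in segments if kind == "typed")
    total = sum(len(text) for _, text in segments)
    if total == 0:
        return None  # nothing was inserted, so there is no ratio to report
    return typed / total
```

For a session with 4 typed characters and a 16-character accept block, the function returns 0.2, inside the 15–25% band described above.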

This is also the signal that catches the invisible-overlay tools. A candidate reading from a Cluely overlay and re-typing the answer types at an unusually steady pace, with very few corrections, because they are transcribing rather than composing. The edit distance between the raw keystroke stream and the final file collapses toward zero: almost every character typed survives into the submission. Real authorship has backspaces, renames, and walked-back attempts; transcription does not.
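Two cheap proxies for "steady pace, few corrections" are the share of keypresses that are Backspace and the coefficient of variation of the inter-key gaps. The sketch below assumes the recorder exposes (timestamp_ms, key) pairs. The function name and the choice of these two statistics are illustrative assumptions, not a validated detector, and no threshold is implied.

```python
import statistics
from typing import Dict, List, Tuple


def transcription_stats(keys: List[Tuple[float, str]]) -> Dict[str, float]:
    """Summarise correction rate and pace steadiness for a typed stretch.

    keys: (timestamp_ms, key) pairs, where key is a character or "Backspace".
    """
    if len(keys) < 3:
        raise ValueError("need at least three keypresses")

    # Gaps between consecutive keypresses, in milliseconds.
    gaps = [later[0] - earlier[0] for earlier, later in zip(keys, keys[1:])]
    mean_gap = statistics.mean(gaps)

    # Coefficient of variation: low values mean metronome-like typing.
    pace_cv = statistics.pstdev(gaps) / mean_gap if mean_gap > 0 else 0.0

    corrections = sum(1 for _, key in keys if key == "Backspace")
    return {
        "correction_rate": corrections / len(keys),
        "pace_cv": pace_cv,
    }
```

Both numbers are most useful relative to the same candidate's behaviour elsewhere in the session, not as absolute cut-offs.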

Signal 4: The submission is more coherent than the process

Run code coherence analysis on the final file and you get a separate read. Cursor Composer and Claude Code produce highly coherent files: uniformly idiomatic, consistently named, with defensive error handling for cases the candidate did not exercise. That is a different failure mode from ChatGPT-stitched code, which tends to be incoherent. Cursor code is too clean for the visible process.

The diagnostic question is the one the coherence judge already asks: does this look like one person wrote it, end to end, in 30 minutes, while talking to me on video? A human under interview pressure produces a file with at least one rough edge — a leftover console.log, a function they meant to rename, a comment that contradicts the code. A Cursor-Composer file rarely has any of those. The absence of mess is the signal.
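Part of that check can be mechanised, very crudely. The sketch below scans a submission for leftover debug statements and TODO-style markers. The pattern list and the rough_edges name are assumptions, and it cannot see the subtler rough edges mentioned above, such as a comment that contradicts the code. An empty result is a prompt for the coherence question, not an answer to it.

```python
import re
from typing import List

# Hypothetical, deliberately small list of "human mess" markers.
ROUGH_EDGE_PATTERNS = [
    r"\bconsole\.log\(",       # leftover JavaScript debug output
    r"\bdebugger\b",           # leftover breakpoint statement
    r"\b(TODO|FIXME|XXX)\b",   # notes-to-self left in the file
]


def rough_edges(source: str) -> List[str]:
    """Return 'line_number: text' for each line matching a marker pattern."""
    hits: List[str] = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(re.search(pattern, line) for pattern in ROUGH_EDGE_PATTERNS):
            hits.append(f"{number}: {line.strip()}")
    return hits
```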

What to ask in the room

Detection is half the answer. The other half is asking the question that an agent user cannot deflect:

  1. Pick a non-obvious line and ask why. "You used Map here instead of an object — what was the trade-off?" A real author picks one. An accept-and-move-on user shrugs.
  2. Force a small extension. "Add a flag that makes this case-insensitive." Five lines, no Tab acceptance allowed. The candidate either writes it fluently or stalls. Either is signal.
  3. Ask about a bug you can see. Pre-plant a subtle issue in the prompt or scaffolding. Agentic tools tend to "fix" it without acknowledging it. Ask the candidate to walk you through what they changed and why. The honest user explains; the agent user narrates the diff.

These follow-up questions are the same instrument you would use for an AI-pasted take-home. The point is not to catch the tool — it is to find out whether the candidate has the relationship with the code that the role requires.

How to design the round so the signal is interpretable

Two structural choices make all of the above easier to read:

  • Set the rule before the round starts. "You can use any tool you want. Cursor, Copilot, your own snippets. We will ask you to extend and defend what you write." This is the open-book framing, and it converts a detection problem into an interpretation problem. A candidate who declares Cursor use up front and explains their work is fine. A candidate who hides it and cannot explain is the one you want to filter.
  • Reserve five minutes of the round for unaided edits. Tell the candidate up front: the last five minutes are a small extension, no AI tools. Watch how they type when the assistant is gone. This single block produces a clean baseline to compare the rest of the round against.

These two rules give your integrity report something to anchor against. Without them, you are inferring rhythm against an empty prior; with them, you have a known-baseline section in every interview.

What to do next

If you run technical interviews and your current integrity report only tracks pastes and tab switches:

  1. Add an accept-block detector to your keystroke logging, or pick a coding platform that ships one. Without it, Cursor users are invisible to you.
  2. Add the unaided-five-minutes block to every live round this week. It costs you almost nothing and gives you a baseline for every candidate, not just the suspicious ones.
  3. Update your candidate-facing instructions to say what is and is not allowed. Most candidates will respect a clear rule; the few who do not will hide it badly.
  4. Stop relying on tab-switch counts as a primary signal. In 2026, the cheating tool the candidate is using does not require them to leave the editor.

The arms race against in-editor AI is not winnable by detection alone. It is winnable by interview design that makes honest tool use indistinguishable from non-use, and dishonest tool use impossible to defend in the room. The signal you want is not "did they use Cursor" — it is "do they know what their own code does." Build the round around that question and the detection problem mostly solves itself.
