
How to Detect Remote Desktop and TeamViewer in Coding Interviews

ClarityHire Team (Editorial) · 2026-06-10 · 8 min read

The threat is real, but it isn't what most teams imagine

The classic worry: a candidate installs TeamViewer or AnyDesk, hands their session to a stronger engineer in another country, and lets that engineer solve the coding interview while the candidate stays on camera nodding. Variations include Microsoft RDP, Chrome Remote Desktop, Parsec, and lately VS Code Live Share misused for impersonation.

It happens. Paid interview-completion services advertise on LinkedIn. The fraction of remote technical interviews with detectable remote-control involvement sits in the low single-digit percentages: small, but far from zero, and concentrated in roles where the offer is large enough to make the fee economical.

The bad news: a lockdown browser does not stop this. The remote-control software runs on the candidate's OS; the lockdown browser sees only its own DOM. The good news: the moment a second human takes over the session, the behavior changes in ways your data captures. This post walks through the signals that detect remote desktop impersonation reliably, and explains why surveillance-heavy approaches both miss the cases that matter and alienate the candidates you want to hire.

What "remote desktop impersonation" actually looks like in your data

When a remote operator takes the candidate's mouse and keyboard, several things change at once:

The keystroke fingerprint shifts mid-session. Dwell times, flight times, and burst structure are different per person; the operator is, by definition, a different person.
Input latency spikes and becomes irregular. Even a fast RDP link adds 30–120 ms on top of the operator's underlying network latency, and the variance is the giveaway, not the mean.
Mouse movement micropatterns change. Acceleration curves, click-release timing, and idle drift are surprisingly individual.
The webcam continues to show the original candidate, who is now visibly not the person driving the cursor. The audio channel often goes quiet for stretches that don't match the work happening on screen.
On-screen activity decouples from the speech pattern. The candidate explains what they are "thinking about" while the typing accelerates or pauses on a different rhythm.

No single one of these is proof. Stacked together, they are.

Signal 1: keystroke fingerprint drift

This is the strongest single signal and the cheapest to capture. Keystroke biometrics builds a typing fingerprint from the first ~300 keystrokes, then watches whether the rest of the session matches. When a remote operator takes over, dwell-time and flight-time distributions shift visibly, often within the first 30 seconds of takeover.

What you are looking for:

Mid-session fingerprint divergence. First 10 minutes look like one person; minute 25 onward looks like someone else. The XGBoost authorship score swings well outside the noise band.
Baseline mismatch. The candidate's warm-up exercise had a relaxed, error-prone typing pattern; the coding round suddenly has crisp, confident, no-backspace typing. People don't change typing personalities in 5 minutes.

ClarityHire's integrity report surfaces both signals automatically with timestamp ranges, which is what makes them actionable in a walk-through review.
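As a rough illustration of the drift check described above, the sketch below builds a baseline from the first 300 keystrokes and scores later windows against it. It is not the XGBoost authorship model mentioned earlier: it swaps in a much simpler standardized mean difference on dwell and flight times. The event layout (key-down and key-up timestamps in milliseconds), the function names, and the window size are all assumptions made for the example.

```python
from statistics import mean, stdev

# Assumed event layout: (key_down_ms, key_up_ms), sorted by key_down_ms.
KeyEvent = tuple[float, float]


def timing_features(events: list[KeyEvent]) -> tuple[list[float], list[float]]:
    """Return (dwell_times, flight_times) in milliseconds."""
    # Dwell time: how long each key is held down.
    dwell = [up - down for down, up in events]
    # Flight time: gap between releasing one key and pressing the next.
    flight = [events[i + 1][0] - events[i][1] for i in range(len(events) - 1)]
    return dwell, flight


def drift_score(baseline: list[float], window: list[float]) -> float:
    """Distance of the window mean from the baseline mean, in baseline std devs."""
    sd = stdev(baseline)
    if sd == 0:
        return 0.0 if mean(window) == mean(baseline) else float("inf")
    return abs(mean(window) - mean(baseline)) / sd


def fingerprint_drift(events: list[KeyEvent], baseline_size: int = 300,
                      window_size: int = 100) -> list[tuple[int, float, float]]:
    """Score each later window against the first `baseline_size` keystrokes.

    Returns (window_start_index, dwell_drift, flight_drift) per window.
    """
    base_dwell, base_flight = timing_features(events[:baseline_size])
    scores = []
    for start in range(baseline_size, len(events) - window_size + 1, window_size):
        dwell, flight = timing_features(events[start:start + window_size])
        scores.append((start,
                       drift_score(base_dwell, dwell),
                       drift_score(base_flight, flight)))
    return scores
```

A window whose scores jump well above the earlier windows is a candidate for the timestamped ranges a reviewer would look at. What counts as "well above" has to be tuned on real sessions, and a production system would compare full distributions rather than means.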

Signal 2: input latency and jitter

When the candidate is typing directly on their machine, keystroke-to-render latency is dominated by their browser's event loop — typically under 20ms with low variance. When a remote operator drives the session through RDP / TeamViewer / Parsec, two things change:

Mean latency rises. The operator's keystrokes travel over their internet to the candidate's machine, then render. A 40–120ms shift is common.
Variance explodes. Network jitter on the operator's link compounds with the candidate's. Standard deviation grows much faster than the mean.

You can capture this client-side without anything invasive: record the timestamp gap between keydown events and the corresponding DOM update. A stable session has a tight distribution. A session with a remote operator looks like network telemetry: occasional 300ms outliers, irregular shape, and the pattern starts mid-session rather than being present from the first keystroke.

Signal 3: mouse micropatterns

The keyboard is the primary signal; the mouse is the corroborating one. Mouse movement has individual micropatterns that change when control is transferred:

Acceleration curves. Each person flicks the cursor differently — some snap to targets, others arc to them.
Click-release intervals. Some people double-click in 80ms, others in 220ms.
Idle micro-drift. When the cursor is "stationary," it isn't actually — fingers shift on the mouse and produce tiny continuous movements. These vanish under RDP because the cursor is rendered remotely.

The last one is a reliable RDP tell. A remote desktop session typically has stretches where the cursor is perfectly still for seconds at a time, which almost never happens on direct input.
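A minimal sketch of that stillness check follows. It assumes the client records the cursor position at a fixed polling rate as (t_seconds, x, y) tuples, since browsers only emit mouse-move events when the cursor actually moves. The two-second minimum is an illustrative guess, not a tuned threshold.

```python
def still_stretches(samples: list[tuple[float, int, int]],
                    min_seconds: float = 2.0) -> list[tuple[float, float]]:
    """Return (start, end) times of stretches with zero cursor movement.

    `samples` is a time-sorted list of (t_seconds, x, y) polled positions.
    """
    stretches = []
    run_start = None  # time at which the current motionless run began
    for prev, cur in zip(samples, samples[1:]):
        moved = (cur[1], cur[2]) != (prev[1], prev[2])
        if not moved and run_start is None:
            run_start = prev[0]
        if moved and run_start is not None:
            if prev[0] - run_start >= min_seconds:
                stretches.append((run_start, prev[0]))
            run_start = None
    # Close a run that is still open when the recording ends.
    if run_start is not None and samples[-1][0] - run_start >= min_seconds:
        stretches.append((run_start, samples[-1][0]))
    return stretches
```

Counting these stretches per minute, and comparing the count before and after a suspected takeover, keeps this a corroborating signal rather than a verdict: trackpads, keyboard-only workflows, and a hand resting off the mouse all produce perfectly still cursors too.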

Signal 4: face continuity and audio sync

The integrity layer in video interviews tracks face continuity across the session. In an impersonation attempt, the candidate's face is still on camera — that part doesn't change. What changes is the relationship between what's on screen and what they say:

Audio-action desync. The candidate narrates "I'm about to refactor this loop" while the cursor has already done it 6 seconds earlier — they're describing the operator's actions a beat late, the same way a sportscaster lags the play.
Eye-gaze direction. Their gaze drifts somewhere other than their own screen — often to a phone or second monitor where the operator is in a video call coaching them.
Long muted-mic stretches during heavy code production. Real engineers narrate or grumble while working. Long, complete silence during productive coding bursts is suspicious.

See our note on verifying candidate identity in online interviews for how face continuity composes with the other signals.
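The "long silence during productive coding" signal reduces to an interval overlap once typing bursts and silent stretches have been extracted. The sketch below assumes both arrive as time-sorted, non-overlapping (start_seconds, end_seconds) lists; how they are extracted from the keystroke log and the audio track is outside its scope, and the function names are hypothetical.

```python
Interval = tuple[float, float]


def overlap_seconds(a: list[Interval], b: list[Interval]) -> float:
    """Total overlap between two sorted, non-overlapping interval lists."""
    total, i, j = 0.0, 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if end > start:
            total += end - start
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def silent_coding_fraction(typing_bursts: list[Interval],
                           silences: list[Interval]) -> float:
    """Fraction of typing time during which the microphone was silent."""
    typing_total = sum(end - start for start, end in typing_bursts)
    if typing_total == 0:
        return 0.0
    return overlap_seconds(typing_bursts, silences) / typing_total
```

A high fraction is only suggestive. Some candidates work quietly by habit, so this value belongs next to the other signals in the report, not in front of them.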

Signal 5: paste structure and burst timing

Even when no remote desktop is in play, an operator is often feeding the candidate code via chat. The result is the same code-coherence patterns that show up in AI-pasted submissions:

Large clean pastes after long silences
Variable naming style that flips between functions
Code with defensive error handling for cases the candidate never tested
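The first pattern in this list, a large clean paste after a long silence, is the easiest to check mechanically. The sketch below assumes an editor log of (t_seconds, kind, chars) tuples where kind is "key" or "paste". The log format and both thresholds are assumptions for illustration.

```python
def suspicious_pastes(events: list[tuple[float, str, int]],
                      min_chars: int = 200,
                      min_idle_seconds: float = 60.0
                      ) -> list[tuple[float, int, float]]:
    """Flag large pastes that follow a long idle period.

    Returns (paste_time, chars_pasted, idle_seconds_before) per flagged paste.
    """
    flagged = []
    last_activity = None  # timestamp of the previous editor event
    for t, kind, chars in events:
        if kind == "paste" and chars >= min_chars:
            # With no earlier activity, idle time is measured from session start.
            idle = t - last_activity if last_activity is not None else t
            if idle >= min_idle_seconds:
                flagged.append((t, chars, idle))
        last_activity = t
    return flagged
```

Naming-style flips and untested defensive handling need code-level analysis and are better left to the human walk-through than to a heuristic like this one.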

Combined with the keystroke and latency signals, this gives you a multi-layer composite. You almost never need a single "smoking gun" — the composite score is the verdict.
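One simple way to combine the layers is a weighted average that routes high scores to a reviewer. The sketch below assumes each signal has already been normalized to a 0 to 1 score. The signal names, weights, and threshold are hypothetical and are not ClarityHire's actual scoring.

```python
def composite_score(signals: dict[str, float],
                    weights: dict[str, float]) -> float:
    """Weighted average of per-signal scores in [0, 1].

    Signals missing from the session (no webcam, no mouse use) are skipped
    and the remaining weights are renormalized.
    """
    used = {name: w for name, w in weights.items() if name in signals}
    total_weight = sum(used.values())
    if total_weight == 0:
        return 0.0
    return sum(signals[name] * w for name, w in used.items()) / total_weight


def review_decision(score: float, threshold: float = 0.6) -> str:
    """Map a composite score to an action; there is no reject outcome."""
    return "human_review" if score >= threshold else "no_action"


# Hypothetical weights: keystroke drift is weighted highest because the
# text above calls it the strongest single signal.
EXAMPLE_WEIGHTS = {
    "keystroke": 0.40,
    "latency": 0.25,
    "mouse": 0.15,
    "audio": 0.10,
    "paste": 0.10,
}
```

The decision function deliberately has only two outcomes. A high composite sends the session to a person with the timestamp ranges attached; it never rejects a candidate on its own.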

Why lockdown browsers are the wrong answer

The instinct is to ban remote desktop software with a process scan. Three problems:

The scan can't run in a browser. A browser-based lockdown sees only its own DOM. Detecting OS-level RDP requires installing native software, which most candidates correctly refuse.
Determined cheaters use a phone off-screen instead. If you block TeamViewer, the next attempt is a phone with the problem on screen and a Telegram call to the operator. You've spent your candidate goodwill blocking the wrong attack.
False positives hit honest candidates. Plenty of engineers run accessibility tools, screen recorders, or work laptops with corporate MDM that triggers "remote control software detected" warnings. Auto-rejecting them is worse than missing the actual impersonators.

The right model is the one outlined in our piece on proctoring tools: collect behavioral signals continuously and unobtrusively, surface anomalies to a human reviewer, and never auto-reject.

What to do during the interview

Three practices that catch impersonation without surveillance theater:

Set up a 5-minute baseline. Have the candidate type a short prompt at the start ("introduce yourself in writing"). This calibrates the keystroke fingerprint and the latency distribution. Any mid-session shift is then measured against a baseline you actually have.
Camera on, full screen visible. Not because you'll see TeamViewer in their browser tabs — you won't — but because the social cost of obvious off-screen coordination rises with the camera on.
Mix typing with conversation. Ask process questions while they code: "talk through what you're doing right now." An operator-driven candidate falls out of sync with the actions on their screen.

The walk-through is still your best instrument

For any candidate the signals flag, the resolution is the same as it is for AI assistance: a 10-minute walk-through where they explain their own code. A candidate who wrote it can defend it. A candidate whose operator wrote it improvises vaguely, contradicts themselves, or quietly asks for the problem to be reframed. The walk-through resolves almost every flagged case in one direction or the other.

If the walk-through is also faked — operator on a parallel call coaching the explanation — the keystroke fingerprint during the walk-through coding follow-up (you do ask them to write a tiny modification, right?) is your final check.

What to do next

Three concrete moves:

Add a short typed baseline at the start of your live coding round, long enough to collect the ~300 keystrokes the fingerprint needs, so keystroke biometrics has something to compare against.
Capture keystroke-to-render latency client-side and review it as part of your post-interview report, not in real time.
Make the walk-through coding follow-up mandatory on every live round — that single change closes the impersonation gap more than any anti-fraud tool.

Detection isn't about catching everyone. It's about making impersonation expensive enough that the small fraction of bad actors stop trying, while the roughly 97% of candidates who are honest never notice the layer is there.

remote desktop, teamviewer, interview integrity, impersonation, cheat detection