Data Engineer Assessment Template

A ready-to-run data engineering hiring test covering SQL, schema design, pipeline design, and data quality — with live SQL execution.

Duration

75 minutes

Questions

Level

Mid-Level

Passing Score

70%

What this template measures

Every skill needed for a data engineer hire, covered across MCQ, coding, and essay questions.

SQL Fluency

Joins, window functions, CTEs, analytical queries.

Schema Design

Normalization, star schemas, SCD, partitioning.

Pipeline Design

DAG design and scheduling with Airflow, dbt, and Dagster.

Data Quality

Null handling, deduplication, freshness, lineage.

Streaming Basics

Kafka, Kinesis, exactly-once semantics.

Python Data Stack

pandas, PySpark basics.

Sample questions from this template

A preview of the questions you'll see when you use this template.

Question 1 (Multiple Choice, Medium)

In a star schema, fact tables typically contain:

A. Descriptive attributes about business entities
B. Foreign keys and measurable metrics
C. Slowly changing data with history
D. Only denormalized dimensions
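
For readers unfamiliar with the terminology in this question, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table and column names (dim_customer, dim_date, fact_sales) are illustrative assumptions and are not part of the template.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes about business entities.
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name TEXT,
    region TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    calendar_date TEXT
);
-- The fact table references the dimensions and stores numeric measures.
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    revenue REAL
);
INSERT INTO dim_customer VALUES (1, 'Ada', 'EMEA'), (2, 'Bo', 'APAC');
INSERT INTO dim_date VALUES (20240101, '2024-01-01');
INSERT INTO fact_sales VALUES
    (1, 20240101, 2, 50.0),
    (2, 20240101, 1, 20.0),
    (1, 20240101, 1, 25.0);
""")

# A typical analytical query: aggregate a measure, grouped by a dimension attribute.
revenue_by_region = conn.execute("""
    SELECT c.region, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
print(revenue_by_region)
```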

Question 2 (Coding, Medium, SQL)

Given `sessions(user_id, session_start, page)`, write a query that returns each user's first 3 pages in order of time for each session.
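
One possible approach, sketched with Python's built-in sqlite3 module so it runs without a database server. It assumes each row is a single page view timestamped by session_start, and it reads the task as "the first 3 pages per user in time order"; other readings of the question are possible. The sample rows and the visit_order alias are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (user_id INTEGER, session_start TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        # Deliberately inserted out of order to show that the query sorts by time.
        (1, "2024-01-01 09:10", "/docs"),
        (1, "2024-01-01 09:00", "/home"),
        (1, "2024-01-01 09:15", "/signup"),
        (1, "2024-01-01 09:05", "/pricing"),
        (2, "2024-01-01 10:02", "/blog"),
        (2, "2024-01-01 10:00", "/home"),
    ],
)

FIRST_PAGES_SQL = """
WITH ordered AS (
    SELECT user_id,
           page,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY session_start, page  -- page breaks timestamp ties
           ) AS visit_order
    FROM sessions
)
SELECT user_id, visit_order, page
FROM ordered
WHERE visit_order <= 3
ORDER BY user_id, visit_order
"""

first_pages = conn.execute(FIRST_PAGES_SQL).fetchall()
for row in first_pages:
    print(row)
```

ROW_NUMBER is used rather than RANK so that exactly three rows come back per user even when timestamps tie.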

Question 3 (Coding, Hard, SQL)

Given `orders(id, user_id, amount, created_at)`, write a query that returns, for each user, the running 7-day total, the running 28-day total, and the user's rank by 28-day total across all users. Use a single query with window functions.
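
A hedged sketch of one interpretation, again using Python's built-in sqlite3 module (it needs SQLite 3.28 or later for RANGE frames with numeric offsets). It assumes created_at is stored as a 'YYYY-MM-DD' string, measures both windows back from each user's most recent order, and ranks users on that latest 28-day total. The sample rows and the recency alias are illustrative; on PostgreSQL the frame would more likely be written with an interval offset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, user_id INTEGER, amount REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, 10.0, "2024-01-01"),
        (2, 1, 20.0, "2024-01-05"),
        (3, 1, 30.0, "2024-01-20"),
        (4, 2, 100.0, "2024-01-19"),
        (5, 2, 5.0, "2024-01-20"),
    ],
)

RUNNING_TOTALS_SQL = """
WITH running AS (
    SELECT
        user_id,
        -- julianday() turns the date into a number so RANGE can count days.
        SUM(amount) OVER (
            PARTITION BY user_id
            ORDER BY julianday(created_at)
            RANGE BETWEEN 6 PRECEDING AND CURRENT ROW
        ) AS total_7d,
        SUM(amount) OVER (
            PARTITION BY user_id
            ORDER BY julianday(created_at)
            RANGE BETWEEN 27 PRECEDING AND CURRENT ROW
        ) AS total_28d,
        ROW_NUMBER() OVER (
            PARTITION BY user_id
            ORDER BY created_at DESC, id DESC
        ) AS recency
    FROM orders
)
SELECT user_id,
       total_7d,
       total_28d,
       RANK() OVER (ORDER BY total_28d DESC) AS rank_28d
FROM running
WHERE recency = 1          -- keep only each user's most recent order
ORDER BY rank_28d, user_id
"""

running_totals = conn.execute(RUNNING_TOTALS_SQL).fetchall()
for row in running_totals:
    print(row)
```

A RANGE frame is used instead of ROWS so the window spans calendar days rather than a fixed number of orders. The CTE is needed because window functions cannot be nested directly.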

Question 4 (Coding, Medium, Python)

Write an Airflow DAG definition that:

- Runs daily at 2am UTC
- Extracts yesterday's data from Postgres
- Loads it into S3 as Parquet
- Triggers a downstream dbt model on success
- Retries 3 times with exponential backoff on failure
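
The sketch below is not an Airflow DAG. It is a plain-Python illustration of two of the requirements above: selecting "yesterday" as a half-open UTC window, and retrying with exponential backoff. The function names (yesterday_window, backoff_delays, run_with_retries) are hypothetical helpers. In Airflow itself, retries are normally configured through task arguments such as `retries`, `retry_delay`, and `retry_exponential_backoff`, and its delay calculation differs in detail from the simple doubling shown here.

```python
import time
from datetime import datetime, timedelta, timezone


def yesterday_window(now_utc):
    """Return [start, end) covering the UTC day before `now_utc`."""
    today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return today - timedelta(days=1), today


def backoff_delays(retries, base_seconds):
    """Delay before each retry, doubling every time."""
    return [base_seconds * 2 ** attempt for attempt in range(retries)]


def run_with_retries(task, retries=3, base_seconds=60, sleep=time.sleep):
    """Call `task`; on failure wait, then retry up to `retries` times."""
    delays = backoff_delays(retries, base_seconds)
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])


if __name__ == "__main__":
    # A run scheduled for 02:00 UTC processes the previous calendar day.
    start, end = yesterday_window(datetime(2024, 3, 10, 2, 0, tzinfo=timezone.utc))
    print(start.isoformat(), end.isoformat())
    print(backoff_delays(3, 60))
```

Deriving the window from the scheduled run time rather than the wall clock keeps a rerun of the same day idempotent: it extracts the same rows each time.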

Question 5 (Essay, Hard)

Your dbt pipeline produces incorrect results one day per week. Walk through how you'd investigate — what you'd check first, what tools you'd use, how you'd reproduce.

Scoring rubric

How candidates are evaluated on this template.

Dimension | Description | Weight
SQL Correctness | Queries return correct, efficient results. | 40%
Schema Design | Models reflect real business needs, handle change. | 20%
Pipeline Design | Idempotent, observable, failure-friendly DAGs. | 20%
Data Quality Sense | Anticipates nulls, dupes, and freshness issues. | 10%
Communication | Explains tradeoffs in writing clearly. | 10%

Frequently asked questions

Which SQL flavor does this template use?

PostgreSQL by default, with DuckDB for analytical queries. Variants for BigQuery, Snowflake, and Redshift are available.

Can I customize this template?

Yes. Every question, time limit, weighting, and rubric dimension is fully editable. Use the template as a starting point and tailor it to your role and seniority level.

Does this template include AI cheat detection?

Yes. All ClarityHire assessment templates ship with code coherence AI, keystroke biometrics, and paste detection enabled by default. You can adjust the integrity level per role.

Can candidates see sample questions before starting?

Yes. Each template supports unscored practice questions so candidates warm up before the real assessment begins. That way you measure skill, not test anxiety.

Related assessment templates

Other role-specific templates you might want to customize.

Data Scientist Template

Python Developer Template

SQL Analyst Template

Launch Your Data Engineering Assessment Today

Customize this template and invite candidates in minutes.