SQL Interview Questions: 50 by Pattern + Mini Drills

February 14, 2026

TL;DR

If you want to get good at SQL interviews, stop collecting random questions and start practicing by pattern. This question bank gives you fifty high-yield prompts grouped by what interviews actually probe: joins, aggregation, conditional logic, window functions, time series, funnels, and cohorts. Each section also includes mini drills that train the real skill: choosing the right grain, preventing double counting, and narrating your logic clearly. Use one loop every time—confirm grain, define the metric, write a baseline query, then add edge cases and validation—so follow-ups feel like extensions of your reasoning, not surprises.

Why SQL interviews feel harder than “just writing SQL”

Most candidates can write a query that works on a clean, happy-path dataset.

What interviews test is whether your query still works when reality shows up: duplicated keys, nulls, changing definitions of “active,” multiple events per user per day, and joins that silently multiply rows. If you have ever felt confident and then realized you double-counted after a join, that’s the interview.

So the goal here is not “more SQL.” It’s fewer wrong assumptions and a stronger validation habit.

How to use this SQL interview question bank

Treat this as a rotation, not a checklist.

Pick one pattern category for the week, do two prompts under time pressure, then redo one prompt and improve only one thing: correctness, clarity, or speed. The redo matters more than the novelty.

If you want to practice speaking while querying, start with your checklist in Interview cheat sheets, then run a timed rep in Beyz practice mode.

The answer framework: GRAIN → METRIC → SHAPE → CHECK

This talk-track prevents the most common SQL interview failures: vague definitions, wrong join logic, and unvalidated metrics.

Grain

State what one row represents in each table you touch.

I want to confirm the grain: is this one row per user, per order, or per event?

Metric

Define what you are actually calculating and what counts.

I will compute daily active users as distinct users who had at least one qualifying event that day.

Shape

Write a baseline query first, then add complexity.

I will start with a filtered dataset, then aggregate, then join dimensions.

Check

Validate like an analyst.

I will sanity check row counts after joins, spot check a small sample, and confirm null handling and deduping rules.
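Applied to the DAU example above, the loop might look like the sketch below. This is PostgreSQL-flavored SQL against a hypothetical events(user_id, event_time, event_name) table; the list of qualifying event names is an assumption you would confirm with the interviewer.

```sql
-- GRAIN: one row per event. METRIC: DAU = distinct users with at least
-- one qualifying event that day. Event names here are assumptions.
SELECT
  event_time::date           AS activity_date,
  COUNT(DISTINCT user_id)    AS daily_active_users
FROM events
WHERE event_name IN ('login', 'purchase')
GROUP BY 1
ORDER BY 1;

-- CHECK: raw rows vs distinct users confirms deduping actually mattered.
SELECT COUNT(*) AS raw_events, COUNT(DISTINCT user_id) AS distinct_users
FROM events;
```

Narrating the comments out loud, in that order, is the talk track.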

Pattern map: what each SQL pattern is testing

| Pattern | What interviewers are probing | Common pitfalls |
| --- | --- | --- |
| Filtering and grouping | Can you define a metric precisely | Wrong date filter, missing WHERE logic |
| Joins and set logic | Do you understand row multiplication | Double counting, joining on non-unique keys |
| Conditional aggregation | Can you compute multiple metrics cleanly | Incorrect CASE logic, NULL surprises |
| CTEs and subqueries | Can you structure complex queries | Unreadable nesting, wrong intermediate grain |
| Window functions | Can you compute changes over time | Using GROUP BY when you need row context |
| Time series | Can you align time windows correctly | Boundary mistakes, missing days |
| Funnels and sessions | Can you model journeys | Not deduping steps, unclear definitions |
| Cohorts and retention | Can you anchor users correctly | Cohort leakage, mixed time windows |
| Data quality and performance | Can you debug and optimize | Ignoring duplicates, scanning too much |

SQL interview questions: 50 by pattern

A. Filtering and fundamentals

  1. Return all users created in the last thirty days, excluding internal email domains.
  2. Find the top cities by number of unique users who logged in this month.
  3. List orders with a missing shipping address and a non-null payment timestamp.
  4. For each user, return their first seen date based on an events table.
  5. Compute the percentage of users who have never placed an order.

B. Aggregations and group by

  1. Compute daily revenue and daily order count.
  2. Compute average order value by country.
  3. Find the share of revenue contributed by the top products.
  4. Find users with more than one purchase in a single day.
  5. Compute median session duration by platform.
  6. Compute the ratio of paid users to total active users by week.

C. Joins and set logic

  1. Join users to orders and return users who have never purchased.
  2. Return products that have been viewed but never purchased.
  3. Find users who purchased but never triggered a “trial started” event.
  4. Compute revenue by marketing channel using orders joined to attribution.
  5. For each order, attach the most recent user plan at the time of purchase.
  6. Compare inner join vs left join outcomes and explain when each is correct.

D. Conditional aggregation and case logic

  1. In one query, compute new users, returning users, and total active users by day.
  2. Compute the number of users who did step A and step B in the same day.
  3. Compute churned users where the last activity is older than a defined window.
  4. Compute refund rate by product category with clear null handling.

E. CTEs and subqueries

  1. Build a CTE for “eligible users,” then compute conversion rate to purchase.
  2. Use a subquery to find each user’s first purchase, then compute time to first purchase.
  3. Create a reusable CTE that maps users to weekly active status.
  4. Write the same logic using a CTE and using a window function; compare readability.

F. Window functions and ranking

  1. Rank products by revenue within each category.
  2. Return the top item per user by purchase count.
  3. Compute a running total of revenue by day.
  4. Compute day over day revenue change and percent change.
  5. Compute a seven day moving average of daily active users.
  6. Identify the first event per user and the next event timestamp.
  7. Find users whose activity streak is at least a defined length.
  8. Compute retention by counting users active in consecutive weeks.
  9. Detect anomalies by comparing today’s metric to a rolling baseline.
  10. Deduplicate events by keeping only the latest event per user per day.

G. Time series and date logic

  1. Fill missing dates in a daily metrics table and carry forward a baseline value.
  2. Compute month to date revenue and compare to the prior month to date.
  3. Compute weekly active users aligned to a specific week start.
  4. Compute time between two event types per user.
  5. Bucket users by “days since last activity” and count each bucket.
  6. Compute cohort size by signup month and activity in later months.

H. Product analytics: funnels, cohorts, experiments

  1. Build a funnel conversion rate from signup to activation to purchase.
  2. Compute funnel drop-off at each step with deduping rules.
  3. Build cohorts by first purchase date and compute repeat purchase rate.
  4. Compute retention as users returning in a later time window after first use.
  5. Compute experiment metrics by variant with guardrails for sample ratio mismatch.
  6. Explain how you would define “active user” for a product and implement it in SQL.

I. Data quality, debugging, and performance

  1. Identify duplicate keys in a dimension table and quantify impact on joins.
  2. Find rows that violate expected constraints, such as negative revenue or impossible timestamps.
  3. Optimize a slow query by reducing scanned data, pushing filters earlier, and simplifying joins.

Mini drills: practice like an interview, not like homework

These drills are short on purpose. Your job is to narrate assumptions and validate the result.

Drill one: define grain and prevent double counting

Tables

  • users(user_id, created_at, country)
  • events(user_id, event_time, event_name)
  • orders(order_id, user_id, order_time, revenue)

Prompt

Compute daily active users and daily revenue for the last fourteen days, then join them into one daily table.

Self-check

  • Are you counting distinct users for activity?
  • Does the join multiply rows?
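One safe shape, using the drill's tables in PostgreSQL-flavored SQL: aggregate each table down to the daily grain first, then join. Joining the raw tables before aggregating is exactly what multiplies rows.

```sql
-- Aggregate to one row per day in each CTE, THEN join: the join can no
-- longer multiply rows because both sides share the daily grain.
WITH daily_active AS (
  SELECT event_time::date AS d, COUNT(DISTINCT user_id) AS dau
  FROM events
  WHERE event_time >= CURRENT_DATE - INTERVAL '14 days'
  GROUP BY 1
),
daily_revenue AS (
  SELECT order_time::date AS d, SUM(revenue) AS revenue
  FROM orders
  WHERE order_time >= CURRENT_DATE - INTERVAL '14 days'
  GROUP BY 1
)
SELECT a.d, a.dau, COALESCE(r.revenue, 0) AS revenue
FROM daily_active a
LEFT JOIN daily_revenue r ON a.d = r.d
ORDER BY a.d;
```

The LEFT JOIN plus COALESCE keeps days with activity but no orders, which is usually the intent.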

Drill two: left join “never” questions

Prompt

Find users who have never purchased, and return their signup date and country.

Self-check

  • Are you filtering on the joined table in a way that turns your left join into an inner join?
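A minimal anti-join sketch for this drill, using the tables defined above:

```sql
-- Anti-join: keep only users with no matching order row.
SELECT u.user_id, u.created_at AS signup_date, u.country
FROM users u
LEFT JOIN orders o ON o.user_id = u.user_id
WHERE o.user_id IS NULL;
```

The classic mistake is adding a condition like `o.order_time >= some_date` to the WHERE clause: that discards the NULL-extended rows and silently turns the left join into an inner join. Conditions on the right-hand table belong in the ON clause.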

Drill three: conditional aggregation muscle memory

Prompt

In one query, compute total users, active users, and purchasers by day.

Self-check

  • Are you using distinct counts consistently?
  • Are you handling nulls in case expressions?
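One possible shape in PostgreSQL-flavored SQL, using the drill-one tables (the 'purchase' event name is an assumption):

```sql
WITH daily AS (
  SELECT
    event_time::date AS d,
    COUNT(DISTINCT user_id) AS active_users,
    -- CASE with no ELSE yields NULL, and COUNT(DISTINCT ...) ignores
    -- NULLs: that is the null-handling doing the filtering for you.
    COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN user_id END)
      AS purchasers
  FROM events
  GROUP BY 1
)
SELECT
  d.d,
  (SELECT COUNT(*) FROM users u WHERE u.created_at::date <= d.d)
    AS total_users,  -- cumulative signups as of that day
  d.active_users,
  d.purchasers
FROM daily d
ORDER BY d.d;
```

Using DISTINCT inside every conditional count keeps the three metrics comparable.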

Drill four: window function sanity

Prompt

For each user, return their most recent event and the time since that event.

Self-check

  • Does your window partition match the user?
  • Are you sorting in the correct direction?
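A common shape: rank events per user with a window function, then keep rank one. Sketch against the drill's events table:

```sql
WITH ranked AS (
  SELECT
    user_id, event_name, event_time,
    -- Partition per user, newest first: rn = 1 is the latest event.
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time DESC) AS rn
  FROM events
)
SELECT user_id, event_name, event_time,
       NOW() - event_time AS time_since_event
FROM ranked
WHERE rn = 1;
```

If you partitioned by something other than user_id, or ordered ASC, you would quietly get the wrong row per user, which is what the self-check is probing.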

Drill five: rolling metrics

Prompt

Compute a moving average of daily revenue and compare today vs the rolling average.

Self-check

  • Does your window frame match your intent?
  • Are missing days handled correctly?
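A sketch of a seven-day moving average over the drill's orders table:

```sql
WITH daily_rev AS (
  SELECT order_time::date AS d, SUM(revenue) AS revenue
  FROM orders
  GROUP BY 1
)
SELECT
  d,
  revenue,
  -- ROWS counts the previous 6 ROWS, not the previous 6 DAYS.
  AVG(revenue) OVER (ORDER BY d ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
    AS moving_avg_7,
  revenue
    - AVG(revenue) OVER (ORDER BY d ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
    AS diff_vs_avg
FROM daily_rev
ORDER BY d;
```

The frame comment is the trap: if days are missing from daily_rev, a ROWS frame silently stretches the window beyond seven calendar days. Joining to a generated calendar first makes the window honest.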

Drill six: funnel with deduping

Prompt

Compute conversion from signup to activated to purchase, counting each user once per step.

Self-check

  • Are you deduping multiple events per user?
  • Is the step order defined clearly?
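One way to express this in PostgreSQL-flavored SQL (the three event names are assumptions): collapse each user to their first occurrence of each step, then count users whose steps happened in order.

```sql
WITH steps AS (
  -- MIN(...) per user dedupes repeated events in one pass.
  SELECT
    user_id,
    MIN(CASE WHEN event_name = 'signup'    THEN event_time END) AS signup_at,
    MIN(CASE WHEN event_name = 'activated' THEN event_time END) AS activated_at,
    MIN(CASE WHEN event_name = 'purchase'  THEN event_time END) AS purchased_at
  FROM events
  GROUP BY user_id
)
SELECT
  COUNT(*) FILTER (WHERE signup_at IS NOT NULL)           AS signed_up,
  -- Timestamp comparisons enforce step order; NULLs drop out naturally.
  COUNT(*) FILTER (WHERE activated_at >= signup_at)       AS activated,
  COUNT(*) FILTER (WHERE purchased_at >= activated_at)    AS purchased
FROM steps;
```

Whether later steps must follow earlier ones in time is exactly the definition worth stating out loud before writing the query.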

Drill seven: cohort retention

Prompt

Cohort users by first activity week and compute whether they return in later weeks.

Self-check

  • Is cohort assignment stable?
  • Are you mixing time windows incorrectly?
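A sketch of weekly cohort retention against the drill's events table, in PostgreSQL-flavored SQL:

```sql
WITH first_week AS (
  -- Cohort anchor: each user's first-ever activity week. Computed once,
  -- so cohort assignment cannot drift as new data arrives.
  SELECT user_id, DATE_TRUNC('week', MIN(event_time)) AS cohort_week
  FROM events
  GROUP BY user_id
),
activity AS (
  SELECT DISTINCT user_id, DATE_TRUNC('week', event_time) AS active_week
  FROM events
)
SELECT
  f.cohort_week,
  (a.active_week::date - f.cohort_week::date) / 7 AS weeks_since_cohort,
  COUNT(DISTINCT a.user_id) AS active_users
FROM first_week f
JOIN activity a USING (user_id)
GROUP BY 1, 2
ORDER BY 1, 2;
```

Week zero in each cohort is its size, so retention rates fall out by dividing each later week by week zero.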

Drill eight: debugging duplicates

Prompt

A join between orders and products increased revenue totals. Identify why and fix it.

Self-check

  • Is the join key unique on the dimension side?
  • Do you need a deduped dimension CTE?
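A two-step sketch, assuming a products(product_id, updated_at, ...) dimension and that orders carries a product_id (both assumptions beyond drill one's schema); DISTINCT ON is PostgreSQL-specific.

```sql
-- Step 1: prove the diagnosis. Any rows here mean the join fans out.
SELECT product_id, COUNT(*) AS copies
FROM products
GROUP BY product_id
HAVING COUNT(*) > 1;

-- Step 2: dedupe the dimension BEFORE joining, keeping one row per key.
WITH products_dedup AS (
  SELECT DISTINCT ON (product_id) *
  FROM products
  ORDER BY product_id, updated_at DESC  -- keep the latest version
)
SELECT SUM(o.revenue) AS total_revenue
FROM orders o
LEFT JOIN products_dedup p ON p.product_id = o.product_id;
```

Comparing SUM(revenue) before and after the fix, and against the orders table alone, is the row-count sanity check from the framework.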

If you want a broader set of prompts beyond SQL, you can build a cross-round rotation starting from the Interview Questions & Answers hub. For a technical split that covers multiple rounds, pair this with the System Design Question Bank and a timed set from the Coding Interview Question Bank.

Before/After: how candidates actually improve SQL interview performance

Before: The candidate could write queries, but follow-ups kept exposing shaky definitions. Metrics were correct on the happy path, then broke under joins, duplicates, or edge cases. Explanations sounded like “I think this works,” instead of “Here’s why it’s correct.”

After: They practiced by pattern and used one loop every time: confirm grain, define the metric, write a baseline query, then add joins. After each join, they sanity checked row counts. They also forced themselves to narrate each assumption, and they spot-checked a small sample before trusting the result. Follow-ups became easier because validation was built into the workflow, not bolted on at the end.

If you want this to feel closer to a real interview, run one drill out loud under time pressure in Beyz practice mode, then redo the same prompt later and fix only one weakness: grain, joins, or validation.


Frequently Asked Questions

What is a SQL interview question bank?

A SQL interview question bank is a curated set of prompts that covers recurring patterns in analyst and data science interviews. The best banks are grouped by pattern so you can reuse the same approach across many questions, instead of memorizing isolated tricks.

How should I practice SQL for a data analyst interview?

Practice by pattern and follow one repeatable loop: confirm table grain, define the metric, write a baseline query, then add joins and edge cases. Finish with a sanity check and explain your logic out loud so your reasoning is visible.

What SQL topics matter most for interviews?

Most interviews focus on joins, aggregation, conditional aggregation, window functions, and time-based analysis. You should also be comfortable debugging nulls, duplicates, and grain mismatches, because that is where real queries fail.

Do I need window functions for SQL interviews?

Often, yes. Window functions are a clean way to compute rankings, running totals, and period-over-period metrics. Even when a solution works without them, windows usually make your intent easier to explain.

How do I avoid common SQL interview mistakes?

State assumptions early, confirm the grain of each table, and sanity check row counts after joins. When you compute a metric, spot-check a small sample and watch for null handling, duplicate keys, and accidental double counting.
