DISTINCT — unique values

Lesson Description

Remove duplicates with DISTINCT and choose the correct uniqueness grain for the question being asked. You will learn when DISTINCT is the right tool and when grouping logic is a better fit.

Easy Project

Mini project: Turn this lesson into a real-world query artifact by writing one clean business report query and validating output quality.

DISTINCT — eliminate duplicate rows

DISTINCT removes duplicate rows from the result. It considers the combination of all selected columns.

Syntax

SELECT DISTINCT column1, column2
FROM table_name;

Finding unique values in a column

-- How many different departments do employees belong to?
SELECT DISTINCT department_id
FROM employees
ORDER BY department_id;

DISTINCT across multiple columns

When you list multiple columns, DISTINCT de-duplicates on the combination:

-- Unique (department, job) pairs
SELECT DISTINCT department_id, job_id
FROM employees
ORDER BY department_id, job_id;

COUNT(DISTINCT col) — count unique values

SELECT COUNT(DISTINCT department_id) AS unique_departments
FROM employees;

Counts only the distinct non-NULL values in the column.

When to use DISTINCT

Exploring data: quickly see what values exist.
After a JOIN that multiplies rows: de-duplicate when the same row can appear via multiple join paths.
Checking cardinality: how many unique values does a column have?

Alternatives to DISTINCT

For large tables, GROUP BY is often faster and more explicit:

SELECT department_id, COUNT(*) AS headcount
FROM employees
GROUP BY department_id;

Tips

DISTINCT applies to the entire row, not just the first column.
ORDER BY can only reference columns that appear in the SELECT when DISTINCT is used.
DISTINCT ON (col) (PostgreSQL-only) keeps one row per distinct value of a specific column.

Learning context

This lesson is part of SQL Fundamentals. Focus on the core idea in DISTINCT — unique values, then validate with deliberate practice.

What to master

Understand clause order and intent
Write deterministic result queries
Use clear aliases and readable structure

Common mistakes

Skipping ORDER BY in final outputs
Using joins before validating base rows
Not checking row counts after changes

High-level strategy

Start simple, then layer complexity
Run often and compare expected shape
Refactor with CTEs when logic grows

Task ladder

Baseline query: get a correct first output.
Add one complication: edge case, join, filter, or ranking rule.
Re-run and refine until output is stable and explainable.

Transparent data checks

Check row count before and after each major clause.
Validate one manual sample row for correctness.
Confirm final ordering and column naming for trust.

Retention loop

Run the solution once.
Rewrite from memory with one variation.
Explain the logic in plain English in one sentence.

Logical reasoning for commands

WHERE

Why: Limits rows to only the business-relevant subset.

Memory cue: Filter early to reduce noise.

ORDER BY

Why: Makes output deterministic and reviewable.

Memory cue: No ORDER BY means no guaranteed row order.

AS alias

Why: Makes output columns readable for teams and reports.

Memory cue: If the name is clear, the query is easier to trust.

Concept check

Quick check: DISTINCT — unique values

3 quick questions. One at a time. Instant score at the end.

Exercise 1

Unique department IDs

Enterprise Workforce AnalyticseasyPostgreSQL

Return the distinct department_id values from the employees table, ordered ascending.

Autocomplete: start typing table/column namesPaste supported

No results yet

Run a query to see results here

Exercise 2

Unique product category IDs

Commerce Operations IntelligenceeasyPostgreSQL

From the store sandbox, return the distinct category_id values from the products table, ordered ascending.

Autocomplete: start typing table/column namesPaste supported

No results yet

Run a query to see results here

Exercise 3

Count distinct doctor specialties

Clinical Care PerformanceeasyPostgreSQL

From the hospital sandbox, return a single value specialty_count — the number of distinct specialties across all doctors.

Autocomplete: start typing table/column namesPaste supported

No results yet

Run a query to see results here

Exercise 4

Unique department and job combinations

Enterprise Workforce AnalyticseasyPostgreSQL

Return distinct (department_id, job_id) pairs from the employees table, ordered by department_id, then job_id.

Autocomplete: start typing table/column namesPaste supported

No results yet

Run a query to see results here

SQL Fundamentals

Lesson 4 / 5

DISTINCT — unique values

0/4 solved4 to unlock next

PreviousLocked

PrevLocked