Set Operations
Set Operations
🌍 Real-World Analogy
Imagine you have two stacks of business cards:
- UNION — Combine the two stacks, keeping only one copy of duplicates
- UNION ALL — Combine all cards from both stacks, keeping duplicates
- INTERSECT — Pick out only those who appear in both stacks
- EXCEPT — Remove from the first stack anyone who also appears in the second
Set operations perform "merge, intersect, and difference" operations on two query result sets.
🎯 Core Concepts
UNION — Deduplicated Merge
Merges two result sets, automatically removing duplicate rows.
SELECT city FROM employees
UNION
SELECT city FROM departments;
UNION ALL — Non-Deduplicated Merge
Merges two result sets, retaining all duplicate rows. Better performance because no deduplication is needed.
SELECT city FROM employees
UNION ALL
SELECT city FROM departments;
INTERSECT — Intersection
Returns rows that exist in both result sets.
SELECT city FROM employees
INTERSECT
SELECT city FROM departments;
EXCEPT / MINUS — Difference
Returns rows that exist in the first result set but do not exist in the second.
-- EXCEPT (SQL Server, PostgreSQL)
SELECT city FROM employees
EXCEPT
SELECT city FROM departments;
-- MINUS (Oracle)
SELECT city FROM employees
MINUS
SELECT city FROM departments;
Column Matching Rules
When using set operations, the two SELECT statements must satisfy:
| Rule | Description |
|---|---|
| Same column count | Both SELECTs must have the same number of columns |
| Compatible types | Corresponding columns must have compatible data types |
| ORDER BY at the end | Can only be used once at the very end of the statement |
-- Correct: two columns, types match
SELECT first_name, salary FROM employees
UNION
SELECT department_name, budget FROM departments;
-- Wrong: different column count
SELECT first_name, salary FROM employees
UNION
SELECT department_name; -- ❌ Column count mismatch
💡 When to Use Which Operation
| Operation | Scenario | Deduplication | Performance |
|---|---|---|---|
| UNION | Need to merge and deduplicate | ✅ Yes | Slower |
| UNION ALL | Merge without deduplication | ❌ No | Faster |
| INTERSECT | Find common parts | ✅ Yes | Medium |
| EXCEPT | Find differences | ✅ Yes | Medium |
UNION ALL for better performance.
📝 Basic Syntax
-- UNION syntax
SELECT column1, column2 FROM table1
UNION
SELECT column1, column2 FROM table2
[ORDER BY column1];
-- UNION ALL syntax
SELECT column1, column2 FROM table1
UNION ALL
SELECT column1, column2 FROM table2;
-- INTERSECT syntax
SELECT column1, column2 FROM table1
INTERSECT
SELECT column1, column2 FROM table2;
-- EXCEPT syntax
SELECT column1, column2 FROM table1
EXCEPT
SELECT column1, column2 FROM table2;
- In set operations, column names are determined by the first SELECT
- ORDER BY can only appear at the end and typically uses column ordinal numbers
- Each SELECT can have its own WHERE, GROUP BY
📌 Examples
Example: Query All Name Sources
Merge employee names and department names into one result set.
-- View all "name" sources: employees + department managers
SELECT first_name AS name, 'Employee' AS source
FROM employees
UNION ALL
SELECT department_name, 'Department'
FROM departments
ORDER BY source;
Explanation: A constant column is added to mark the data source for easy distinction.
Example: Find Cities With Employees But No Departments
-- Cities with employees minus cities with departments
SELECT city FROM employees
EXCEPT
SELECT city FROM departments;
Approach: Use EXCEPT as a difference operation to quickly find "cities that only appear in employees."
🎬 Scenario Walkthroughs
Scenario 1: Merge Multi-Channel Customer Lists
The company has online and offline customer tables that need to be merged and deduplicated to create a complete customer directory.
-- Online customers
SELECT customer_name, email, 'Online' AS channel
FROM online_customers
UNION
-- Offline customers
SELECT customer_name, email, 'Offline'
FROM offline_customers
ORDER BY customer_name;
Key point: Use UNION instead of UNION ALL to automatically remove duplicate customers.
Scenario 2: Compare Sales Differences Between Two Months
Find products that had sales this month but not last month.
-- Products sold this month
SELECT product_id FROM orders
WHERE order_date >= '2026-06-01'
EXCEPT
-- Products sold last month
SELECT product_id FROM orders
WHERE order_date >= '2026-05-01'
AND order_date < '2026-06-01';
Key point: EXCEPT is naturally suited for "finding differences" scenarios.
❓ FAQ
Q: When should I use UNION vs UNION ALL? A: If you don't need deduplication, prefer
UNION ALLbecause it doesn't require an extra deduplication step and performs better. Only useUNIONwhen you specifically need deduplication.
Q: Will mismatched column names in set operations cause errors? A: No, the result set column names are determined by the first
SELECT. However, the column count and data types of corresponding columns must match.
Q: What's the difference between INTERSECT and JOIN? A:
INTERSECTmatches entire rows for intersection, whileJOINassociates tables based on specified conditions. UseINTERSECTwhen you need to "find identical rows," and useJOINwhen you need to "associate columns from different tables by a field."
Q: Does MySQL support INTERSECT and EXCEPT? A: MySQL 8.0 and later supports
INTERSECTandEXCEPT. For earlier versions, you need to simulate them usingINNER JOINandNOT EXISTS/LEFT JOIN ... IS NULL.
📖 Summary
| Operation | Purpose | Deduplication |
|---|---|---|
| UNION | Merge two result sets | ✅ |
| UNION ALL | Merge two result sets (keep duplicates) | ❌ |
| INTERSECT | Intersection of two result sets | ✅ |
| EXCEPT | Difference of two result sets | ✅ |
- Column count and types must match
ORDER BYcan only be placed at the end of the statement- Prefer
UNION ALLwhen you're sure there are no duplicates for better performance
📝 Exercises
- Use
UNIONto merge employees with salary above 8000 from theemployeestable and department names with budget above 100000 from thedepartmentstable. - Use
INTERSECTto find cities that exist in both theemployeesanddepartmentstables. - Use
EXCEPTto find cities that have departments but no employees assigned. - Think about it: What happens if the column types of two query results don't fully match (e.g., INT and VARCHAR) in a set operation?
Next Lesson
Next we'll learn Constraints and Keys — understand how to use PRIMARY KEY, FOREIGN KEY, and other constraints to ensure data integrity.



