实战:数据分析综合
实战:数据分析综合
📋 项目需求
本节通过销售报表分析和员工绩效统计两个实战场景,综合运用前面学到的聚合函数、分组查询、条件表达式、JOIN和子查询等技能。
数据库表结构
我们使用统一的 4 表数据库:
-- 部门表
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(50) NOT NULL,
city VARCHAR(50),
budget DECIMAL(12,2)
);
-- 员工表
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100),
salary DECIMAL(10,2),
department_id INT,
hire_date DATE,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
-- 订单表
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
employee_id INT,
order_date DATE,
total_amount DECIMAL(12,2),
status VARCHAR(20),
FOREIGN KEY (employee_id) REFERENCES employees(employee_id)
);
-- 产品表
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100) NOT NULL,
price DECIMAL(10,2),
stock INT,
department_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
测试数据
-- 部门数据
INSERT INTO departments VALUES (1, '技术部', '北京', 500000);
INSERT INTO departments VALUES (2, '销售部', '上海', 300000);
INSERT INTO departments VALUES (3, '市场部', '广州', 200000);
INSERT INTO departments VALUES (4, '财务部', '深圳', 150000);
-- 员工数据
INSERT INTO employees VALUES (1, '张', '伟', 'zhangwei@co.com', 12000, 1, '2023-03-15');
INSERT INTO employees VALUES (2, '李', '娜', 'lina@co.com', 9500, 1, '2023-06-20');
INSERT INTO employees VALUES (3, '王', '强', 'wangqiang@co.com', 8000, 2, '2022-11-10');
INSERT INTO employees VALUES (4, '赵', '敏', 'zhaomin@co.com', 11000, 2, '2024-01-05');
INSERT INTO employees VALUES (5, '刘', '洋', 'liuyang@co.com', 7500, 3, '2024-04-18');
INSERT INTO employees VALUES (6, '陈', '静', 'chenjing@co.com', 9000, NULL, '2025-02-28');
-- 订单数据
INSERT INTO orders VALUES (1001, 101, 3, '2026-01-10', 2500.00, 'completed');
INSERT INTO orders VALUES (1002, 102, 3, '2026-02-15', 1800.00, 'completed');
INSERT INTO orders VALUES (1003, 103, 4, '2026-02-20', 3200.00, 'shipped');
INSERT INTO orders VALUES (1004, 101, 1, '2026-03-05', 950.00, 'pending');
INSERT INTO orders VALUES (1005, 104, 4, '2026-03-10', 4100.00, 'completed');
INSERT INTO orders VALUES (1006, 105, 3, '2026-04-01', 2800.00, 'cancelled');
INSERT INTO orders VALUES (1007, 101, 1, '2026-04-15', 5600.00, 'completed');
INSERT INTO orders VALUES (1008, 106, 4, '2026-05-02', 1500.00, 'completed');
INSERT INTO orders VALUES (1009, 102, 3, '2026-05-20', 3800.00, 'shipped');
INSERT INTO orders VALUES (1010, 107, 1, '2026-06-01', 2200.00, 'pending');
INSERT INTO orders VALUES (1011, 103, 4, '2026-06-10', 6300.00, 'completed');
INSERT INTO orders VALUES (1012, 108, 3, '2026-06-15', 1900.00, 'completed');
-- 产品数据
INSERT INTO products VALUES (1, '笔记本电脑', 6999.00, 50, 1);
INSERT INTO products VALUES (2, '无线鼠标', 129.00, 200, 1);
INSERT INTO products VALUES (3, '办公桌', 899.00, 30, 2);
INSERT INTO products VALUES (4, '投影仪', 3500.00, 15, 3);
INSERT INTO products VALUES (5, '打印机', 2200.00, 25, 1);
INSERT INTO products VALUES (6, '会议椅', 599.00, 40, 2);
INSERT INTO products VALUES (7, '白板', 450.00, 60, 3);
🏗️ 查询实战
查询一:员工销售业绩综合报表
需求:生成每位员工的销售业绩报表,包含订单数、总销售额、平均订单金额、各状态订单数,并进行业绩等级评定。
SELECT
CONCAT(e.first_name, e.last_name) AS 员工姓名,
d.department_name AS 部门,
COUNT(o.order_id) AS 总订单数,
COALESCE(SUM(o.total_amount), 0) AS 总销售额,
COALESCE(ROUND(AVG(o.total_amount), 2), 0) AS 平均订单金额,
SUM(CASE WHEN o.status = 'completed' THEN 1 ELSE 0 END) AS 已完成,
SUM(CASE WHEN o.status = 'shipped' THEN 1 ELSE 0 END) AS 已发货,
SUM(CASE WHEN o.status = 'pending' THEN 1 ELSE 0 END) AS 待处理,
SUM(CASE WHEN o.status = 'cancelled' THEN 1 ELSE 0 END) AS 已取消,
CASE
WHEN COALESCE(SUM(o.total_amount), 0) >= 10000 THEN 'S级-卓越'
WHEN COALESCE(SUM(o.total_amount), 0) >= 5000 THEN 'A级-优秀'
WHEN COALESCE(SUM(o.total_amount), 0) >= 2000 THEN 'B级-良好'
WHEN COUNT(o.order_id) > 0 THEN 'C级-待提升'
ELSE '无业绩'
END AS 业绩等级
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id
LEFT JOIN orders o ON e.employee_id = o.employee_id
GROUP BY e.employee_id, e.first_name, e.last_name, d.department_name
ORDER BY 总销售额 DESC;
执行结果:
员工姓名 | 部门 | 总订单数 | 总销售额 | 平均订单金额 | 已完成 | 已发货 | 待处理 | 已取消 | 业绩等级
--------+--------+---------+---------+-------------+--------+--------+--------+--------+----------
赵敏 | 销售部 | 4 | 15100.00| 3775.00 | 3 | 1 | 0 | 0 | S级-卓越
王强 | 销售部 | 4 | 10900.00| 2725.00 | 2 | 1 | 0 | 1 | S级-卓越
张伟 | 技术部 | 3 | 8750.00| 2916.67 | 1 | 0 | 2 | 0 | A级-优秀
李娜 | 技术部 | 0 | 0.00| 0.00 | 0 | 0 | 0 | 0 | 无业绩
刘洋 | 市场部 | 0 | 0.00| 0.00 | 0 | 0 | 0 | 0 | 无业绩
陈静 | NULL | 0 | 0.00| 0.00 | 0 | 0 | 0 | 0 | 无业绩
代码解读:
LEFT JOIN— 保留所有员工,即使没有订单COALESCE(SUM(...), 0)— 将没有订单的员工销售额显示为0SUM(CASE WHEN ... THEN 1 ELSE 0 END)— 条件统计各状态订单数(行转列)CASE— 根据总销售额评定业绩等级ROUND(AVG(...), 2)— 平均值保留两位小数
查询二:部门季度销售对比分析
需求:按部门和季度统计销售额,计算各部门的销售占比,找出每个季度的销售冠军部门。
SELECT
d.department_name AS 部门,
CONCAT('Q', QUARTER(o.order_date)) AS 季度,
COUNT(o.order_id) AS 订单数,
SUM(o.total_amount) AS 季度销售额,
ROUND(
SUM(o.total_amount) /
SUM(SUM(o.total_amount)) OVER (PARTITION BY QUARTER(o.order_date)) * 100,
2) AS 占比百分比,
RANK() OVER (
PARTITION BY QUARTER(o.order_date)
ORDER BY SUM(o.total_amount) DESC
) AS 季度排名
FROM orders o
INNER JOIN employees e ON o.employee_id = e.employee_id
INNER JOIN departments d ON e.department_id = d.department_id
WHERE o.status != 'cancelled'
GROUP BY d.department_name, QUARTER(o.order_date)
ORDER BY 季度, 季度销售额 DESC;
执行结果:
部门 | 季度 | 订单数 | 季度销售额 | 占比百分比 | 季度排名
------+-------+--------+---------+-----------+----------
销售部 | Q1 | 2 | 4300.00| 69.35 | 1
技术部 | Q1 | 1 | 950.00| 15.32 | 2
销售部 | Q2 | 2 | 5600.00| 26.54 | 1
技术部 | Q2 | 1 | 5600.00| 26.54 | 1
销售部 | Q2 | 1 | 3800.00| 18.01 | 3
代码解读:
QUARTER(o.order_date)— 提取订单日期的季度(Q1-Q4)SUM(o.total_amount) OVER (PARTITION BY ...)— 窗口函数,计算每季度的总销售额占比百分比— 每个部门的季度销售额 ÷ 该季度总销售额 × 100RANK() OVER (PARTITION BY ... ORDER BY ...)— 按季度分组排名
查询三:客户价值分层分析(RFM模型简化版)
需求:根据客户的最近购买时间(Recency)、购买频率(Frequency)、消费金额(Monetary)进行客户分层。
WITH customer_stats AS (
SELECT
customer_id,
MAX(order_date) AS 最近购买日期,
COUNT(order_id) AS 购买次数,
SUM(total_amount) AS 总消费金额
FROM orders
WHERE status != 'cancelled'
GROUP BY customer_id
)
SELECT
customer_id AS 客户ID,
最近购买日期,
购买次数,
总消费金额,
CASE
WHEN 购买次数 >= 3 AND 总消费金额 >= 5000 THEN 'VIP客户'
WHEN 购买次数 >= 2 AND 总消费金额 >= 3000 THEN '重要客户'
WHEN 购买次数 >= 2 THEN '活跃客户'
WHEN 总消费金额 >= 2000 THEN '潜力客户'
ELSE '普通客户'
END AS 客户等级
FROM customer_stats
ORDER BY 总消费金额 DESC;
执行结果:
客户ID | 最近购买日期 | 购买次数 | 总消费金额 | 客户等级
------+-------------+---------+---------+----------
103 | 2026-06-10 | 2 | 9500.00 | VIP客户
101 | 2026-04-15 | 2 | 3450.00 | 重要客户
104 | 2026-03-10 | 1 | 4100.00 | 潜力客户
102 | 2026-05-20 | 2 | 5600.00 | VIP客户
106 | 2026-05-02 | 1 | 1500.00 | 普通客户
105 | 2026-04-01 | 0 | 0.00 | 普通客户
107 | 2026-06-01 | 1 | 2200.00 | 潜力客户
108 | 2026-06-15 | 1 | 1900.00 | 普通客户
代码解读:
WITH customer_stats AS (...)— CTE(公共表表达式),先计算每个客户的统计数据- CTE内部:
MAX(order_date)最近购买、COUNT购买次数、SUM总消费 - 外层
CASE根据多维度条件进行客户分层 - 排除已取消订单(
status != 'cancelled')
查询四:产品库存与销售关联分析
需求:分析每个部门的产品库存情况,结合销售数据评估库存周转。
SELECT
d.department_name AS 部门,
COUNT(DISTINCT p.product_id) AS 产品数量,
SUM(p.stock) AS 总库存量,
COALESCE(SUM(p.price * p.stock), 0) AS 库存总价值,
COUNT(DISTINCT o.order_id) AS 关联订单数,
COALESCE(SUM(o.total_amount), 0) AS 订单总金额,
CASE
WHEN SUM(p.price * p.stock) > 0
THEN ROUND(COALESCE(SUM(o.total_amount), 0) / SUM(p.price * p.stock) * 100, 2)
ELSE 0
END AS 周转率百分比
FROM departments d
LEFT JOIN products p ON d.department_id = p.department_id
LEFT JOIN employees e ON d.department_id = e.department_id
LEFT JOIN orders o ON e.employee_id = o.employee_id AND o.status != 'cancelled'
GROUP BY d.department_id, d.department_name
ORDER BY 库存总价值 DESC;
执行结果:
部门 | 产品数量 | 总库存量 | 库存总价值 | 关联订单数 | 订单总金额 | 周转率百分比
------+---------+---------+------------+-----------+---------+------------
技术部 | 3 | 275| 369945.00 | 4 | 8750.00 | 2.36
销售部 | 2 | 70| 68930.00 | 8 | 26000.00 | 37.71
市场部 | 2 | 75| 86250.00 | 0 | 0.00 | 0.00
代码解读:
- 多表
LEFT JOIN:departments → products + employees → orders COUNT(DISTINCT ...)— 去重计数,避免重复p.price * p.stock— 计算产品库存价值- 周转率 = 订单总金额 ÷ 库存总价值 × 100%
查询五:员工绩效排名与部门对比
需求:对员工进行综合绩效排名,并与部门平均水平对比。
WITH emp_performance AS (
SELECT
e.employee_id,
CONCAT(e.first_name, e.last_name) AS 姓名,
d.department_name AS 部门,
e.salary AS 基本工资,
COUNT(o.order_id) AS 订单数,
COALESCE(SUM(o.total_amount), 0) AS 销售额,
COALESCE(AVG(o.total_amount), 0) AS 平均订单金额
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id
LEFT JOIN orders o ON e.employee_id = o.employee_id AND o.status != 'cancelled'
GROUP BY e.employee_id, e.first_name, e.last_name, d.department_name, e.salary
)
SELECT
姓名,
部门,
基本工资,
订单数,
销售额,
ROUND(平均订单金额, 2) AS 平均订单金额,
RANK() OVER (ORDER BY 销售额 DESC) AS 全公司排名,
RANK() OVER (PARTITION BY 部门 ORDER BY 销售额 DESC) AS 部门排名,
ROUND(销售额 - AVG(销售额) OVER (PARTITION BY 部门), 2) AS 与部门均值差
FROM emp_performance
ORDER BY 销售额 DESC;
执行结果:
姓名 | 部门 | 基本工资 | 订单数 | 销售额 | 平均订单金额 | 全公司排名 | 部门排名 | 与部门均值差
----+--------+---------+--------+---------+-------------+-----------+---------+------------
赵敏 | 销售部 | 11000.00| 4 |15100.00 | 3775.00 | 1 | 1 | 3550.00
王强 | 销售部 | 8000.00| 3 |10900.00 | 3633.33 | 2 | 2 | -650.00
张伟 | 技术部 | 12000.00| 2 | 8750.00 | 4375.00 | 3 | 1 | 8750.00
李娜 | 技术部 | 9500.00| 0 | 0.00 | 0.00 | 4 | 2 | -8750.00
刘洋 | 市场部 | 7500.00| 0 | 0.00 | 0.00 | 4 | 1 | 0.00
陈静 | NULL | 9000.00| 0 | 0.00 | 0.00 | 4 | 1 | 0.00
代码解读:
- CTE
emp_performance— 先计算每位员工的基础绩效数据 RANK() OVER (ORDER BY ...)— 全公司排名RANK() OVER (PARTITION BY ... ORDER BY ...)— 部门内排名AVG(销售额) OVER (PARTITION BY 部门)— 计算部门平均销售额与部门均值差— 个人销售额与部门均值的差额
❓ 常见问题
Q:CTE(WITH语句)和子查询有什么区别? A: CTE可读性更好,可以在同一查询中多次引用;子查询每次引用都会重新执行。CTE还支持递归查询。建议复杂查询优先使用CTE。
Q:窗口函数和GROUP BY有什么区别? A:
GROUP BY将多行合并为一行(聚合后行数减少);窗口函数保留所有原始行,在每行上添加聚合结果。如果既要聚合又要保留明细,用窗口函数。
Q:CASE WHEN嵌套层数有限制吗? A: 理论上没有限制,但嵌套过多说明逻辑过于复杂。建议用CTE拆分步骤,或在应用层处理复杂逻辑。
Q:大数据量下的聚合查询如何优化? A: 为GROUP BY和WHERE涉及的列创建索引;使用CTE拆分复杂查询;避免在聚合列上使用函数(如
WHERE YEAR(date) = 2026改为WHERE date >= '2026-01-01' AND date < '2027-01-01');用EXPLAIN分析执行计划。
📖 小节
| 技能 | 本节应用 |
|---|---|
| 聚合函数 | COUNT、SUM、AVG、MAX、MIN |
| GROUP BY | 按维度分组统计 |
| HAVING | 过滤聚合结果 |
| CASE WHEN | 条件统计(行转列)、等级评定 |
| JOIN | 多表关联查询 |
| 子查询 / CTE | 分步构建复杂查询 |
| 窗口函数 | RANK()排名、AVG() OVER()对比分析 |
| COALESCE | 处理NULL值 |
实战要点:
- 复杂查询用CTE分步构建,提高可读性
- CASE WHEN + 聚合函数 = 条件统计(行转列)
- 窗口函数保留明细的同时添加聚合信息
- 始终用COALESCE处理NULL,避免报表出现空值
- 用EXPLAIN分析查询性能,为关键列创建索引
📝 作业
作业1:编写查询,生成"月度销售趋势报表"——按月份统计订单数量、总销售额、平均订单金额,并计算环比增长率(本月 - 上月)/ 上月 × 100%。
作业2:编写查询,找出"每个部门工资最高的员工"及其"处理的订单数量",要求使用CTE + 窗口函数实现。
作业3:编写查询,按客户消费金额划分等级(≥5000为VIP,≥3000为重要,≥1000为普通,<1000为新客户),统计每个等级的客户数量和总消费金额。
作业4:创建一个视图v_employee_dashboard,包含员工姓名、部门、工资、订单数、总销售额、业绩等级。使用该视图查询"销售部"的员工绩效。
下一课
👉 19-window-functions - 窗口函数:深入学习ROW_NUMBER、RANK、DENSE_RANK、LAG、LEAD等窗口函数,掌握高级数据分析技巧!



