Practice: Advanced Features Integration
Practice: Advanced Features Integration
Project Requirements
This lesson will comprehensively apply advanced SQL features through four practical scenarios:
- Ranking queries: Use window functions to implement various types of rankings
- Cumulative calculations: Implement running totals and moving averages
- Recursive queries: Use CTEs to handle hierarchical data
- Complex business logic: Combine transactions for batch business operations
Data Preparation
First, ensure we have a unified four-table database structure:
-- Departments table
CREATE TABLE IF NOT EXISTS departments (
id INTEGER PRIMARY KEY,
department_name TEXT NOT NULL,
manager_id INTEGER,
location TEXT
);
-- Employees table
CREATE TABLE IF NOT EXISTS employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
department_id INTEGER,
position TEXT,
salary REAL,
hire_date TEXT,
manager_id INTEGER,
FOREIGN KEY (department_id) REFERENCES departments(id)
);
-- Products table
CREATE TABLE IF NOT EXISTS products (
id INTEGER PRIMARY KEY,
product_name TEXT NOT NULL,
category TEXT,
price REAL,
stock_quantity INTEGER
);
-- Orders table
CREATE TABLE IF NOT EXISTS orders (
id INTEGER PRIMARY KEY,
customer_name TEXT,
product_id INTEGER,
quantity INTEGER,
amount REAL,
order_date TEXT,
status TEXT,
FOREIGN KEY (product_id) REFERENCES products(id)
);
-- Insert test data
INSERT OR IGNORE INTO departments VALUES
(1, 'Tech', 1, 'Beijing'),
(2, 'Marketing', 4, 'Shanghai'),
(3, 'Finance', 6, 'Beijing');
INSERT OR IGNORE INTO employees VALUES
(1, 'Alice', 1, 'Senior Engineer', 15000, '2020-01-15', NULL),
(2, 'Bob', 1, 'Mid-level Engineer', 12000, '2021-03-20', 1),
(3, 'Charlie', 1, 'Junior Engineer', 8000, '2022-06-10', 1),
(4, 'Diana', 2, 'Marketing Director', 18000, '2019-05-01', NULL),
(5, 'Eve', 2, 'Marketing Specialist', 9000, '2021-08-15', 4),
(6, 'Frank', 3, 'Finance Manager', 16000, '2020-02-28', NULL),
(7, 'Grace', 3, 'Accountant', 10000, '2021-11-05', 6),
(8, 'Henry', 1, 'Intern Engineer', 5000, '2023-07-01', 2);
INSERT OR IGNORE INTO products VALUES
(1, 'Laptop', 'Electronics', 6999, 50),
(2, 'Wireless Mouse', 'Electronics', 199, 200),
(3, 'Mechanical Keyboard', 'Electronics', 599, 100),
(4, 'Office Chair', 'Furniture', 1299, 30),
(5, 'Monitor', 'Electronics', 2499, 80);
INSERT OR IGNORE INTO orders VALUES
(1, 'CustomerA', 1, 2, 13998, '2024-01-10', 'completed'),
(2, 'CustomerB', 2, 5, 995, '2024-01-15', 'completed'),
(3, 'CustomerA', 3, 3, 1797, '2024-02-01', 'completed'),
(4, 'CustomerC', 1, 1, 6999, '2024-02-15', 'pending'),
(5, 'CustomerB', 4, 2, 2598, '2024-03-01', 'completed'),
(6, 'CustomerD', 5, 4, 9996, '2024-03-10', 'completed'),
(7, 'CustomerA', 2, 10, 1990, '2024-03-15', 'completed'),
(8, 'CustomerC', 3, 1, 599, '2024-04-01', 'cancelled');
Practice 1: Ranking Queries
Employee Salary Ranking
-- Various ranking queries
WITH salary_ranking AS (
SELECT
e.id,
e.name,
d.department_name,
e.salary,
e.hire_date,
-- Company-wide salary ranking
RANK() OVER (ORDER BY e.salary DESC) AS company_rank,
-- Company-wide salary ranking (no ties)
ROW_NUMBER() OVER (ORDER BY e.salary DESC) AS company_row_num,
-- Department-level salary ranking
RANK() OVER (
PARTITION BY e.department_id
ORDER BY e.salary DESC
) AS dept_rank,
-- Salary percentile ranking
PERCENT_RANK() OVER (ORDER BY e.salary DESC) AS percentile,
-- Salary grouping (4 groups)
NTILE(4) OVER (ORDER BY e.salary DESC) AS salary_quartile
FROM employees e
JOIN departments d ON e.department_id = d.id
)
SELECT
name,
department_name,
salary,
company_rank AS "Company Rank",
dept_rank AS "Dept Rank",
CASE salary_quartile
WHEN 1 THEN 'Top Tier'
WHEN 2 THEN 'Upper Mid'
WHEN 3 THEN 'Lower Mid'
WHEN 4 THEN 'Bottom Tier'
END AS "Salary Band",
ROUND(percentile * 100, 1) || '%' AS "Percentile"
FROM salary_ranking
ORDER BY company_rank;
Product Sales Ranking
-- Product sales and revenue ranking
WITH product_sales AS (
SELECT
p.id,
p.product_name,
p.category,
p.price,
COALESCE(SUM(o.quantity), 0) AS total_quantity,
COALESCE(SUM(o.amount), 0) AS total_revenue,
COUNT(o.id) AS order_count
FROM products p
LEFT JOIN orders o ON p.id = o.product_id AND o.status = 'completed'
GROUP BY p.id, p.product_name, p.category, p.price
)
SELECT
product_name AS "Product",
category AS "Category",
price AS "Unit Price",
total_quantity AS "Total Qty",
total_revenue AS "Total Revenue",
order_count AS "Orders",
RANK() OVER (ORDER BY total_revenue DESC) AS "Revenue Rank",
RANK() OVER (
PARTITION BY category
ORDER BY total_quantity DESC
) AS "Category Qty Rank",
DENSE_RANK() OVER (ORDER BY total_quantity DESC) AS "Sales Rank"
FROM product_sales
ORDER BY total_revenue DESC;
Practice 2: Cumulative Calculations
Monthly Cumulative Sales
-- Monthly sales and cumulative calculations
WITH monthly_sales AS (
SELECT
strftime('%Y-%m', order_date) AS month,
SUM(amount) AS monthly_total,
COUNT(*) AS order_count
FROM orders
WHERE status = 'completed'
GROUP BY strftime('%Y-%m', order_date)
)
SELECT
month AS "Month",
monthly_total AS "Monthly Sales",
order_count AS "Orders",
-- Cumulative sales
SUM(monthly_total) OVER (ORDER BY month) AS "Cumulative Sales",
-- Cumulative orders
SUM(order_count) OVER (ORDER BY month) AS "Cumulative Orders",
-- Month-over-month growth rate
ROUND(
(monthly_total - LAG(monthly_total) OVER (ORDER BY month))
/ LAG(monthly_total) OVER (ORDER BY month) * 100,
2
) AS "MoM Growth %",
-- Difference from previous month
monthly_total - LAG(monthly_total) OVER (ORDER BY month) AS "MoM Diff"
FROM monthly_sales
ORDER BY month;
Moving Average Calculation
-- Moving average of order amounts
WITH order_details AS (
SELECT
o.id,
o.order_date,
o.amount,
o.customer_name,
-- 3-day moving average
AVG(o.amount) OVER (
ORDER BY o.order_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS avg_3day,
-- 5-day moving average
AVG(o.amount) OVER (
ORDER BY o.order_date
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
) AS avg_5day,
-- Cumulative average
AVG(o.amount) OVER (ORDER BY o.order_date) AS cumulative_avg,
-- Previous order amount
LAG(o.amount, 1) OVER (ORDER BY o.order_date) AS prev_amount,
-- Running max order amount
MAX(o.amount) OVER (ORDER BY o.order_date) AS running_max
FROM orders o
WHERE o.status = 'completed'
)
SELECT
order_date AS "Order Date",
amount AS "Amount",
ROUND(avg_3day, 2) AS "3-Day MA",
ROUND(avg_5day, 2) AS "5-Day MA",
ROUND(cumulative_avg, 2) AS "Cumulative Avg",
running_max AS "Running Max"
FROM order_details
ORDER BY order_date;
Practice 3: Recursive Queries
Organization Hierarchy Query
-- Recursive query for organization hierarchy
WITH RECURSIVE org_hierarchy AS (
-- Anchor: top-level managers (employees with no manager)
SELECT
e.id,
e.name,
e.position,
e.manager_id,
0 AS level,
e.name AS path,
e.salary
FROM employees e
WHERE e.manager_id IS NULL
UNION ALL
-- Recursive part: find subordinates
SELECT
e.id,
e.name,
e.position,
e.manager_id,
oh.level + 1,
oh.path || ' -> ' || e.name,
e.salary
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.id
)
SELECT
CASE level
WHEN 0 THEN ''
WHEN 1 THEN '├─ '
WHEN 2 THEN '│ ├─ '
ELSE '│ │ ├─ '
END || name AS "Org Chart",
position AS "Position",
level AS "Level",
path AS "Reporting Path",
salary AS "Salary"
FROM org_hierarchy
ORDER BY path;
Department Level Statistics
-- Statistics on personnel hierarchy and salary distribution per department
WITH dept_stats AS (
SELECT
d.id,
d.department_name,
COUNT(e.id) AS employee_count,
COALESCE(SUM(e.salary), 0) AS total_salary,
COALESCE(AVG(e.salary), 0) AS avg_salary,
COALESCE(MAX(e.salary), 0) AS max_salary,
COALESCE(MIN(e.salary), 0) AS min_salary
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.department_name
)
SELECT
department_name AS "Department",
employee_count AS "Employees",
ROUND(total_salary, 2) AS "Total Salary",
ROUND(avg_salary, 2) AS "Avg Salary",
max_salary AS "Max Salary",
min_salary AS "Min Salary",
max_salary - min_salary AS "Salary Gap",
-- Salary distribution
CASE
WHEN avg_salary > 15000 THEN 'High Salary'
WHEN avg_salary > 10000 THEN 'Medium Salary'
ELSE 'Base Salary'
END AS "Salary Level"
FROM dept_stats
ORDER BY total_salary DESC;
Practice 4: Complex Business Logic
Batch Order Processing (Transaction)
-- Process batch orders using transactions
BEGIN TRANSACTION;
-- Savepoint: for easy rollback
SAVEPOINT before_process;
-- 1. Update product inventory for completed orders
UPDATE products
SET stock_quantity = stock_quantity - (
SELECT COALESCE(SUM(o.quantity), 0)
FROM orders o
WHERE o.product_id = products.id
AND o.status = 'completed'
AND o.order_date >= '2024-01-01'
)
WHERE id IN (
SELECT DISTINCT product_id
FROM orders
WHERE status = 'completed'
AND order_date >= '2024-01-01'
);
-- 2. Check if any inventory went negative
SELECT
CASE
WHEN MIN(stock_quantity) < 0 THEN 'ERROR'
ELSE 'OK'
END AS inventory_check
FROM products;
-- 3. If inventory check passes, generate monthly report
INSERT INTO monthly_report (month, total_revenue, total_orders)
SELECT
strftime('%Y-%m', order_date),
SUM(amount),
COUNT(*)
FROM orders
WHERE status = 'completed'
GROUP BY strftime('%Y-%m', order_date);
-- Commit the transaction
COMMIT;
-- View results
SELECT * FROM products ORDER BY id;
Customer Value Analysis
-- Comprehensive customer value analysis
WITH customer_analysis AS (
SELECT
customer_name,
COUNT(*) AS order_count,
SUM(amount) AS total_spent,
AVG(amount) AS avg_order_value,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order,
-- Calculate customer lifetime (days)
julianday(MAX(order_date)) - julianday(MIN(order_date)) AS customer_lifetime,
-- Days since last purchase
julianday('now') - julianday(MAX(order_date)) AS days_since_last_order
FROM orders
WHERE status = 'completed'
GROUP BY customer_name
),
customer_rfm AS (
SELECT
*,
-- R score: recency rating (1-5, 5 = most recent)
CASE
WHEN days_since_last_order <= 30 THEN 5
WHEN days_since_last_order <= 60 THEN 4
WHEN days_since_last_order <= 90 THEN 3
WHEN days_since_last_order <= 180 THEN 2
ELSE 1
END AS r_score,
-- F score: frequency rating
CASE
WHEN order_count >= 5 THEN 5
WHEN order_count >= 3 THEN 4
WHEN order_count >= 2 THEN 3
WHEN order_count >= 1 THEN 2
ELSE 1
END AS f_score,
-- M score: monetary value rating
NTILE(5) OVER (ORDER BY total_spent) AS m_score
FROM customer_analysis
)
SELECT
customer_name AS "Customer",
order_count AS "Orders",
ROUND(total_spent, 2) AS "Total Spent",
ROUND(avg_order_value, 2) AS "Avg Order Value",
first_order AS "First Order",
last_order AS "Last Order",
r_score AS "R Score",
f_score AS "F Score",
m_score AS "M Score",
r_score + f_score + m_score AS "RFM Total",
CASE
WHEN r_score + f_score + m_score >= 12 THEN 'VIP Customer'
WHEN r_score + f_score + m_score >= 9 THEN 'Important Customer'
WHEN r_score + f_score + m_score >= 6 THEN 'Regular Customer'
ELSE 'Low Value Customer'
END AS "Customer Tier"
FROM customer_rfm
ORDER BY r_score + f_score + m_score DESC;
Salary Band Analysis
-- Salary band analysis report
WITH salary_bands AS (
SELECT
e.id,
e.name,
d.department_name,
e.salary,
e.hire_date,
-- Calculate years of employment
(julianday('now') - julianday(e.hire_date)) / 365.25 AS years_employed,
-- Salary band classification
CASE
WHEN e.salary >= 15000 THEN 'Band A (15000+)'
WHEN e.salary >= 12000 THEN 'Band B (12000-14999)'
WHEN e.salary >= 8000 THEN 'Band C (8000-11999)'
WHEN e.salary >= 5000 THEN 'Band D (5000-7999)'
ELSE 'Band E (<5000)'
END AS salary_band
FROM employees e
JOIN departments d ON e.department_id = d.id
)
SELECT
salary_band AS "Salary Band",
COUNT(*) AS "Employees",
ROUND(AVG(salary), 2) AS "Avg Salary",
ROUND(AVG(years_employed), 1) AS "Avg Tenure",
MIN(salary) AS "Min Salary",
MAX(salary) AS "Max Salary",
GROUP_CONCAT(name, ', ') AS "Employees"
FROM salary_bands
GROUP BY salary_band
ORDER BY salary_band;
❓ FAQ
Q: What's the difference between window functions and GROUP BY? A: GROUP BY merges multiple rows into one (aggregation), while window functions retain all original rows and simply add calculated results to each row. Window functions are suitable for scenarios where you need to view both details and summaries simultaneously.
Q: Do recursive CTEs have a depth limit? A: SQLite defaults to a recursion depth of 1000, which can be adjusted via
sqlite3_limit(db, SQLITE_LIMIT_RECURSION_DEPTH, n). Excessive recursion depth can cause performance issues; consider optimizing query logic.
Q: Does a transaction automatically roll back if an error occurs? A: No, it doesn't roll back automatically. You need to explicitly use a ROLLBACK statement, or use SAVEPOINT and ROLLBACK TO SAVEPOINT for partial rollback. In SQLite, if using BEGIN...COMMIT, you need to manually roll back when encountering errors.
Q: How can I optimize complex query performance? A: 1) Use EXPLAIN QUERY PLAN to analyze the execution plan; 2) Create indexes on frequently queried columns; 3) Avoid using functions on columns in WHERE clauses; 4) Use CTEs wisely to improve readability; 5) For frequently queried results, consider creating views.
📖 Summary
Through this practice lesson we mastered:
- Implementing various ranking queries with window functions
- Cumulative calculations and moving average implementation techniques
- Recursive CTE techniques for handling hierarchical data
- Transaction processing in complex business logic
- Customer value analysis (RFM model)
- Advanced query techniques like salary band analysis
📝 Exercises
-
Ranking Exercise: Write a query to find the top 2 longest-tenured employees in each department, showing their name, department, hire date, and years of service.
-
Cumulative Calculation: Write a query to calculate monthly sales for each product category, showing:
- Monthly sales
- Cumulative sales
- Month-over-month growth rate
- The category's sales as a percentage of total monthly sales
-
Recursive Query: Suppose products have a category hierarchy (Electronics -> Computer Accessories -> Mice). Design a category table and write a recursive query to display the complete category path.
-
Comprehensive Exercise: Design an "Employee Performance Evaluation" query that combines the following metrics:
- Sales performance (joined with orders table)
- Years of service
- Salary ranking within department
- Provide an overall score and recommendation
Next Lesson → 25-database-design.md



