Database Design

Database Design

🎯 Life Analogy

Imagine you're opening a library:

📚 Core Concepts

1. Requirements Analysis

Before writing any SQL, answer these questions:

Question Example
What entities does the system manage? Users, articles, comments, tags
What are the relationships between entities? One user writes many articles (one-to-many)
What queries need to be supported? Search by tag, sort by time
Estimated data volume? Millions of articles, tens of millions of comments
Read-heavy or write-heavy? Blog systems are read-heavy

2. ER Diagram (Entity-Relationship Diagram)

ER diagrams describe data models using three core elements:

TEXT
[Entity] —— Attribute1, Attribute2, Attribute3
   |
  (Relationship)  cardinality: 1:1, 1:N, M:N
   |
[Entity] —— Attribute1, Attribute2

Common Relationship Types:

Relationship Example Implementation
One-to-One (1:1) User ↔ User Details Foreign key or merge into one table
One-to-Many (1:N) User → Article Add user_id foreign key in the articles table
Many-to-Many (M:N) Article ↔ Tag Junction table article_tag

3. Normalization Theory

First Normal Form (1NF): Columns must be atomic

SQL
-- ❌ Violates 1NF: phone column stores multiple values
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    phones VARCHAR(200)  -- '138xxx,139xxx,137xxx'
);

-- ✅ Conforms to 1NF: each field stores a single atomic value
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(50)
);
CREATE TABLE user_phones (
    id INT PRIMARY KEY,
    user_id INT,
    phone VARCHAR(20)
);

Second Normal Form (2NF): Non-key columns must fully depend on the entire primary key

SQL
-- ❌ Violates 2NF: in order items, product_name depends only on product_id, not on order_id
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    product_name VARCHAR(100),  -- Partial dependency
    quantity INT,
    PRIMARY KEY (order_id, product_id)
);

-- ✅ Conforms to 2NF: split into a products table
CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT,
    PRIMARY KEY (order_id, product_id)
);

Third Normal Form (3NF): Non-key columns must not have transitive dependencies

SQL
-- ❌ Violates 3NF: city_name depends on city_id, city_id depends on id → transitive dependency
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    city_id INT,
    city_name VARCHAR(50)  -- Transitive dependency
);

-- ✅ Conforms to 3NF: split into a cities table
CREATE TABLE cities (
    id INT PRIMARY KEY,
    name VARCHAR(50)
);
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    city_id INT
);

BCNF (Boyce-Codd Normal Form)

Stricter than 3NF: every determinant must be a candidate key. In practice, satisfying 3NF is usually sufficient.

4. Denormalization

Sometimes for query performance, redundancy is intentionally introduced:

SQL
-- Store comment count redundantly in the articles table to avoid COUNT every time
ALTER TABLE articles ADD COLUMN comment_count INT DEFAULT 0;

-- Update the redundant field
UPDATE articles SET comment_count = (
    SELECT COUNT(*) FROM comments WHERE comments.article_id = articles.id
);

When to denormalize:

5. Table Relationship Design

Primary Key Choices:

Type Pros Cons Use Cases
Auto-increment AUTO_INCREMENT Ordered, compact, fast inserts Predictable, sharding issues Most business tables
UUID Globally unique, unpredictable Large size, unordered indexes are slow Distributed systems
Composite primary key Semantically clear Foreign key references are cumbersome Junction tables, association tables
Snowflake ID Ordered, globally unique Requires extra dependencies Distributed high-concurrency
SQL
-- Auto-increment primary key
CREATE TABLE users (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50)
);

-- UUID primary key
CREATE TABLE users (
    id CHAR(36) PRIMARY KEY DEFAULT (UUID()),
    name VARCHAR(50)
);

-- Composite primary key (junction table)
CREATE TABLE article_tag (
    article_id BIGINT,
    tag_id BIGINT,
    PRIMARY KEY (article_id, tag_id)
);

6. Field Naming Conventions

Convention ✅ Recommended ❌ Avoid
Use lowercase + underscores user_name UserName, username
Table names in plural users user, tbl_user
Foreign key field names user_id uid, userId
Boolean fields is_deleted deleted, flag
Timestamp fields created_at createTime, add_time
Monetary fields DECIMAL(10,2) FLOAT, DOUBLE

7. Design Best Practices

  1. Every table must have a primary key
  2. Add indexes on foreign key columns (even if the database doesn't enforce foreign key constraints)
  3. Choose the smallest sufficient data type: TINYINT instead of INT for status values
  4. Use DECIMAL for monetary values, not floating-point types
  5. Use DATETIME or TIMESTAMP for time fields, not strings
  6. Reserve extension fields: extra JSON or status with bitwise operations
  7. Soft delete instead of physical delete: is_deleted TINYINT DEFAULT 0

💡 Basic Syntax

SQL
-- Basic template for creating a table
CREATE TABLE table_name (
    id BIGINT AUTO_INCREMENT PRIMARY KEY COMMENT 'Primary key',
    -- Business fields
    name VARCHAR(100) NOT NULL COMMENT 'Name',
    status TINYINT NOT NULL DEFAULT 1 COMMENT 'Status: 1-active 0-disabled',
    -- Foreign key field
    user_id BIGINT NOT NULL COMMENT 'User ID',
    -- Timestamp fields
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Created time',
    updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Updated time',
    -- Indexes
    INDEX idx_user_id (user_id),
    INDEX idx_status (status)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Table comment';
💡 Tip: Spending 1 extra hour on design can save 10 hours of refactoring later. Draw an ER diagram on paper first, then write SQL.

Example: Design a Simple Student Course Selection System (Difficulty ⭐)

Requirements: Manage students, courses, and enrollment records.

SQL
-- Students table
CREATE TABLE students (
    id BIGINT AUTO_INCREMENT PRIMARY KEY COMMENT 'Student ID',
    student_no VARCHAR(20) NOT NULL UNIQUE COMMENT 'Student number',
    name VARCHAR(50) NOT NULL COMMENT 'Name',
    gender TINYINT NOT NULL DEFAULT 0 COMMENT 'Gender: 0-unknown 1-male 2-female',
    enrollment_year INT NOT NULL COMMENT 'Enrollment year',
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Students table';

-- Courses table
CREATE TABLE courses (
    id BIGINT AUTO_INCREMENT PRIMARY KEY COMMENT 'Course ID',
    course_no VARCHAR(20) NOT NULL UNIQUE COMMENT 'Course number',
    name VARCHAR(100) NOT NULL COMMENT 'Course name',
    credit DECIMAL(3,1) NOT NULL COMMENT 'Credits',
    teacher VARCHAR(50) COMMENT 'Instructor',
    max_students INT NOT NULL DEFAULT 60 COMMENT 'Max enrollment',
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Courses table';

-- Enrollments table (junction table, many-to-many relationship)
CREATE TABLE enrollments (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    student_id BIGINT NOT NULL COMMENT 'Student ID',
    course_id BIGINT NOT NULL COMMENT 'Course ID',
    score DECIMAL(5,2) COMMENT 'Score',
    enrolled_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Enrollment time',
    UNIQUE KEY uk_student_course (student_id, course_id),
    INDEX idx_course_id (course_id),
    INDEX idx_student_id (student_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Enrollments table';
▶ Try it Yourself

Example: Design Core Tables for an E-commerce Order System (Difficulty ⭐⭐)

SQL
-- Product categories table
CREATE TABLE categories (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL COMMENT 'Category name',
    parent_id BIGINT DEFAULT NULL COMMENT 'Parent category ID, NULL for top-level',
    sort_order INT NOT NULL DEFAULT 0 COMMENT 'Sort order',
    INDEX idx_parent_id (parent_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Product categories table';

-- Products table
CREATE TABLE products (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    category_id BIGINT NOT NULL COMMENT 'Category ID',
    name VARCHAR(200) NOT NULL COMMENT 'Product name',
    price DECIMAL(10,2) NOT NULL COMMENT 'Selling price',
    stock INT NOT NULL DEFAULT 0 COMMENT 'Stock',
    status TINYINT NOT NULL DEFAULT 1 COMMENT 'Status: 1-listed 0-unlisted',
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_category (category_id),
    INDEX idx_status_price (status, price)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Products table';

-- Orders table (denormalization: redundantly store shipping address snapshot)
CREATE TABLE orders (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    order_no VARCHAR(32) NOT NULL UNIQUE COMMENT 'Order number',
    user_id BIGINT NOT NULL COMMENT 'User ID',
    total_amount DECIMAL(12,2) NOT NULL COMMENT 'Total order amount',
    status TINYINT NOT NULL DEFAULT 0 COMMENT 'Status: 0-pending payment 1-paid 2-shipped 3-completed 4-cancelled',
    receiver_name VARCHAR(50) NOT NULL COMMENT 'Receiver name',
    receiver_phone VARCHAR(20) NOT NULL COMMENT 'Receiver phone',
    receiver_address VARCHAR(500) NOT NULL COMMENT 'Shipping address',
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    paid_at DATETIME COMMENT 'Payment time',
    INDEX idx_user_id (user_id),
    INDEX idx_status (status),
    INDEX idx_created_at (created_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Orders table';

-- Order items table
CREATE TABLE order_items (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    order_id BIGINT NOT NULL COMMENT 'Order ID',
    product_id BIGINT NOT NULL COMMENT 'Product ID',
    product_name VARCHAR(200) NOT NULL COMMENT 'Product name (snapshot)',
    product_price DECIMAL(10,2) NOT NULL COMMENT 'Unit price (snapshot)',
    quantity INT NOT NULL COMMENT 'Quantity',
    subtotal DECIMAL(12,2) NOT NULL COMMENT 'Subtotal',
    INDEX idx_order_id (order_id),
    INDEX idx_product_id (product_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Order items table';
▶ Try it Yourself

Design Highlights:

🏢 Scenario 1: Design a Blog System Database

Requirements: Support user registration, writing articles, tagging, and posting comments.

SQL
-- Users table
CREATE TABLE blog_users (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(100) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL,
    avatar_url VARCHAR(500),
    bio TEXT,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Blog users table';

-- Articles table
CREATE TABLE blog_articles (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    category_id BIGINT,
    title VARCHAR(200) NOT NULL,
    content TEXT NOT NULL,
    status TINYINT NOT NULL DEFAULT 0 COMMENT '0-draft 1-published 2-unlisted',
    view_count INT NOT NULL DEFAULT 0,
    like_count INT NOT NULL DEFAULT 0,
    published_at DATETIME,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_user_id (user_id),
    INDEX idx_category (category_id),
    INDEX idx_status_published (status, published_at DESC)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Articles table';

-- Tags table
CREATE TABLE blog_tags (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL UNIQUE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Tags table';

-- Article-tag junction table
CREATE TABLE blog_article_tag (
    article_id BIGINT NOT NULL,
    tag_id BIGINT NOT NULL,
    PRIMARY KEY (article_id, tag_id),
    INDEX idx_tag_id (tag_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Article-tag association table';

-- Comments table (supports nested comments)
CREATE TABLE blog_comments (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    article_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    parent_id BIGINT COMMENT 'Parent comment ID, NULL for top-level comments',
    content TEXT NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_article_id (article_id),
    INDEX idx_parent_id (parent_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Comments table';

🏢 Scenario 2: Multi-Tenant SaaS System Design

Requirements: A single system serving multiple enterprise clients with data isolation.

SQL
-- Tenants table
CREATE TABLE tenants (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL COMMENT 'Company name',
    plan VARCHAR(20) NOT NULL DEFAULT 'free' COMMENT 'Plan: free/basic/pro',
    max_users INT NOT NULL DEFAULT 10,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Tenants table';

-- Users table (each record belongs to one tenant)
CREATE TABLE tenant_users (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    tenant_id BIGINT NOT NULL COMMENT 'Tenant ID',
    username VARCHAR(50) NOT NULL,
    role VARCHAR(20) NOT NULL DEFAULT 'member',
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE KEY uk_tenant_username (tenant_id, username),
    INDEX idx_tenant_id (tenant_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Tenant users table';

-- All business tables include tenant_id field
CREATE TABLE tenant_orders (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    tenant_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    amount DECIMAL(10,2) NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_tenant_id (tenant_id),
    INDEX idx_tenant_user (tenant_id, user_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Tenant orders table';
💡 Key: Every SQL query must include a WHERE tenant_id = ? condition to prevent data leakage.

❓ FAQ

Q: When should I split tables, and when should I keep one large table? A: Split when certain fields are updated much more frequently than others (e.g., user basic info vs. login logs); keep one table when fields are frequently queried together.

Q: How do I choose between soft delete and physical delete? A: Use soft delete (is_deleted) for core data like finances and orders; use physical delete for auxiliary data like logs and caches.

Q: What scenarios are JSON fields suitable for? A: Storing extension attributes with inconsistent structure and low query frequency, such as user preferences or product specifications. Don't put fields that need indexed queries into JSON.

Q: Which is better, auto-increment IDs or UUIDs? A: Prefer auto-increment IDs for single-server systems (ordered, compact); consider UUIDs or snowflake algorithms for distributed systems (globally unique).

📖 Summary

This lesson systematically covered the complete database design methodology:

📝 Exercises

  1. Design a database for an online exam system, including: users, exam papers, questions, options, answer records, and scores. Draw an ER diagram and write the CREATE TABLE statements.
  2. Normalize the following table: orders(order_id, customer_name, customer_phone, product_name, product_price, quantity), splitting it to at least 3NF.
  3. Choose an appropriate primary key strategy (auto-increment vs. UUID) for your project and explain the reasoning.

Next Lesson →26-sql-injection.md

Web-Tutorial.com

Web-Tutorial Tech Team

A team of developers maintaining programming tutorials. Each tutorial is written and reviewed by developers with expertise in that field. We work to keep our content accurate and reliable — if you spot an issue, please let us know.

100%

🙏 帮我们做得更好

我们是刚上线的编程教程站,几个人的小团队,精力有限。页面虽经检查,难免还有疏漏——链接失效、排版错乱、内容有误、语言生硬……

如果您发现了,麻烦告诉我们,我们会在收到反馈后第一时间进行修复,再次感谢您的光临 🙏