---
id: sm-database-schema-design
name: "database-schema-design"
url: https://skills.yangsir.net/skill/sm-database-schema-design
author: supercent-io
domain: ai-data-management-analysis
tags: ["sql-databases", "nosql-databases", "data-modeling", "er-diagrams", "database-architecture"]
install_count: 12200
rating: 4.50 (484 reviews)
github: https://github.com/supercent-io/skills-template
---

# database-schema-design

> 此技能用于新项目的数据库模式设计、现有模式迁移、性能优化、数据集成及安全考量，确保数据库结构合理高效。

**Stats**: 12,200 installs · 4.5/5 (484 reviews)

## Before / After 对比

### 通过优化数据库模式设计，全面提升系统性能、数据一致性与长期可维护性，确保业务高效运行

## Readme

# database-schema-design

Database Schema Design When to use this skill Lists specific situations where this skill should be triggered: New Project: Database schema design for a new application Schema Refactoring: Redesigning an existing schema for performance or scalability Relationship Definition: Implementing 1:1, 1:N, N:M relationships between tables Migration: Safely applying schema changes Performance Issues: Index and schema optimization to resolve slow queries Input Format The required and optional input information to collect from the user: Required Information Database Type: PostgreSQL, MySQL, MongoDB, SQLite, etc. Domain Description: What data will be stored (e.g., e-commerce, blog, social media) Key Entities: Core data objects (e.g., User, Product, Order) Optional Information Expected Data Volume: Small (<10K rows), Medium (10K-1M), Large (>1M) (default: Medium) Read/Write Ratio: Read-heavy, Write-heavy, Balanced (default: Balanced) Transaction Requirements: Whether ACID is required (default: true) Sharding/Partitioning: Whether large data distribution is needed (default: false) Input Example Design a database for an e-commerce platform: - DB: PostgreSQL - Entities: User, Product, Order, Review - Relationships: - A User can have multiple Orders - An Order contains multiple Products (N:M) - A Review is linked to a User and a Product - Expected data: 100,000 users, 10,000 products - Read-heavy (frequent product lookups) Instructions Specifies the step-by-step task sequence to follow precisely. Step 1: Define Entities and Attributes Identify core data objects and their attributes. Tasks: Extract nouns from business requirements → entities List each entity's attributes (columns) Determine data types (VARCHAR, INTEGER, TIMESTAMP, JSON, etc.) Designate Primary Keys (UUID vs Auto-increment ID) Example (E-commerce): Users - id: UUID PRIMARY KEY - email: VARCHAR(255) UNIQUE NOT NULL - username: VARCHAR(50) UNIQUE NOT NULL - password_hash: VARCHAR(255) NOT NULL - created_at: TIMESTAMP DEFAULT NOW() - updated_at: TIMESTAMP DEFAULT NOW() Products - id: UUID PRIMARY KEY - name: VARCHAR(255) NOT NULL - description: TEXT - price: DECIMAL(10, 2) NOT NULL - stock: INTEGER DEFAULT 0 - category_id: UUID REFERENCES Categories(id) - created_at: TIMESTAMP DEFAULT NOW() Orders - id: UUID PRIMARY KEY - user_id: UUID REFERENCES Users(id) - total_amount: DECIMAL(10, 2) NOT NULL - status: VARCHAR(20) DEFAULT 'pending' - created_at: TIMESTAMP DEFAULT NOW() OrderItems (Junction table) - id: UUID PRIMARY KEY - order_id: UUID REFERENCES Orders(id) ON DELETE CASCADE - product_id: UUID REFERENCES Products(id) - quantity: INTEGER NOT NULL - price: DECIMAL(10, 2) NOT NULL Step 2: Design Relationships and Normalization Define relationships between tables and apply normalization. Tasks: 1:1 relationship: Foreign Key + UNIQUE constraint 1:N relationship: Foreign Key N:M relationship: Create junction table Determine normalization level (1NF ~ 3NF) Decision Criteria: OLTP systems → normalize to 3NF (data integrity) OLAP/analytics systems → denormalization allowed (query performance) Read-heavy → minimize JOINs with partial denormalization Write-heavy → full normalization to eliminate redundancy Example (ERD Mermaid): erDiagram Users ||--o{ Orders : places Orders ||--|{ OrderItems : contains Products ||--o{ OrderItems : "ordered in" Categories ||--o{ Products : categorizes Users ||--o{ Reviews : writes Products ||--o{ Reviews : "reviewed by" Users { uuid id PK string email UK string username UK string password_hash timestamp created_at } Products { uuid id PK string name decimal price int stock uuid category_id FK } Orders { uuid id PK uuid user_id FK decimal total_amount string status timestamp created_at } OrderItems { uuid id PK uuid order_id FK uuid product_id FK int quantity decimal price } Step 3: Establish Indexing Strategy Design indexes for query performance. Tasks: Primary Keys automatically create indexes Columns frequently used in WHERE clauses → add indexes Foreign Keys used in JOINs → indexes Consider composite indexes (WHERE col1 = ? AND col2 = ?) UNIQUE indexes (email, username, etc.) Checklist: Indexes on frequently queried columns Indexes on Foreign Key columns Composite index order optimized (high selectivity columns first) Avoid excessive indexes (degrades INSERT/UPDATE performance) Example (PostgreSQL): -- Primary Keys (auto-indexed) CREATE TABLE users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), email VARCHAR(255) UNIQUE NOT NULL, -- UNIQUE = auto-indexed username VARCHAR(50) UNIQUE NOT NULL, password_hash VARCHAR(255) NOT NULL, created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW() ); -- Foreign Keys + explicit indexes CREATE TABLE orders ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, total_amount DECIMAL(10, 2) NOT NULL, status VARCHAR(20) DEFAULT 'pending', created_at TIMESTAMP DEFAULT NOW() ); CREATE INDEX idx_orders_user_id ON orders(user_id); CREATE INDEX idx_orders_status ON orders(status); CREATE INDEX idx_orders_created_at ON orders(created_at); -- Composite index (status and created_at frequently queried together) CREATE INDEX idx_orders_status_created ON orders(status, created_at DESC); -- Products table CREATE TABLE products ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name VARCHAR(255) NOT NULL, description TEXT, price DECIMAL(10, 2) NOT NULL CHECK (price >= 0), stock INTEGER DEFAULT 0 CHECK (stock >= 0), category_id UUID REFERENCES categories(id), created_at TIMESTAMP DEFAULT NOW() ); CREATE INDEX idx_products_category ON products(category_id); CREATE INDEX idx_products_price ON products(price); -- price range search CREATE INDEX idx_products_name ON products(name); -- product name search -- Full-text search (PostgreSQL) CREATE INDEX idx_products_name_fts ON products USING GIN(to_tsvector('english', name)); CREATE INDEX idx_products_description_fts ON products USING GIN(to_tsvector('english', description)); Step 4: Set Up Constraints and Triggers Add constraints to ensure data integrity. Tasks: NOT NULL: required columns UNIQUE: columns that must be unique CHECK: value range constraints (e.g., price >= 0) Foreign Key + CASCADE option Set default values Example: CREATE TABLE products ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name VARCHAR(255) NOT NULL, price DECIMAL(10, 2) NOT NULL CHECK (price >= 0), stock INTEGER DEFAULT 0 CHECK (stock >= 0), discount_percent INTEGER CHECK (discount_percent >= 0 AND discount_percent <= 100), category_id UUID REFERENCES categories(id) ON DELETE SET NULL, created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW() ); -- Trigger: auto-update updated_at CREATE OR REPLACE FUNCTION update_updated_at_column() RETURNS TRIGGER AS $$ BEGIN NEW.updated_at = NOW(); RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER update_products_updated_at BEFORE UPDATE ON products FOR EACH ROW EXECUTE FUNCTION update_updated_at_column(); Step 5: Write Migration Scripts Write migrations that safely apply schema changes. Tasks: UP migration: apply changes DOWN migration: rollback Wrap in transactions Prevent data loss (use ALTER TABLE carefully) Example (SQL migration): -- migrations/001_create_initial_schema.up.sql BEGIN; CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; CREATE TABLE users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), email VARCHAR(255) UNIQUE NOT NULL, username VARCHAR(50) UNIQUE NOT NULL, password_hash VARCHAR(255) NOT NULL, created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW() ); CREATE TABLE categories ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name VARCHAR(100) UNIQUE NOT NULL, parent_id UUID REFERENCES categories(id) ); CREATE TABLE products ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name VARCHAR(255) NOT NULL, description TEXT, price DECIMAL(10, 2) NOT NULL CHECK (price >= 0), stock INTEGER DEFAULT 0 CHECK (stock >= 0), category_id UUID REFERENCES categories(id), created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW() ); CREATE INDEX idx_products_category ON products(category_id); CREATE INDEX idx_products_price ON products(price); COMMIT; -- migrations/001_create_initial_schema.down.sql BEGIN; DROP TABLE IF EXISTS products CASCADE; DROP TABLE IF EXISTS categories CASCADE; DROP TABLE IF EXISTS users CASCADE; COMMIT; Output format Defines the exact format that deliverables should follow. Basic Structure project/ ├── database/ │ ├── schema.sql # full schema │ ├── migrations/ │ │ ├── 001_create_users.up.sql │ │ ├── 001_create_users.down.sql │ │ ├── 002_create_products.up.sql │ │ └── 002_create_products.down.sql │ ├── seeds/ │ │ └── sample_data.sql # test data │ └── docs/ │ ├── ERD.md # Mermaid ERD diagram │ └── SCHEMA.md # schema documentation └── README.md ERD Diagram (Mermaid Format) # Database Schema ## Entity Relationship Diagram \`\`\`mermaid erDiagram Users ||--o{ Orders : places Orders ||--|{ OrderItems : contains Products ||--o{ OrderItems : "ordered in" Users { uuid id PK string email UK string username UK } Products { uuid id PK string name decimal price } \`\`\` ## Table Descriptions ### users - **Purpose**: Store user account information - **Indexes**: email, username - **Estimated rows**: 100,000 ### products - **Purpose**: Product catalog - **Indexes**: category_id, price, name - **Estimated rows**: 10,000 Constraints Specifies mandatory rules and prohibited actions. Mandatory Rules (MUST) Primary Key Required: Define a Primary Key on every table Unique record identification Ensures referential integrity Explicit Foreign Keys: Tables with relationships must define Foreign Keys Specify ON DELETE CASCADE/SET NULL options Prevent orphan records Use NOT NULL Appropriately: Required columns must be NOT NULL Clearly specify nullable vs. non-nullable Providing defaults is recommended Prohibited Actions (MUST NOT) Avoid EAV Pattern Abuse: Use the Entity-Attribute-Value pattern only in special cases Query complexity increases dramatically Performance degradation Excessive Denormalization: Be careful when denormalizing for performance Data consistency issues Risk of update anomalies No Plaintext Storage of Sensitive Data: Never store passwords, card numbers, etc. in plaintext Hashing/encryption is mandatory Legal liability issues Security Rules Principle of Least Privilege: Grant only the necessary permissions to application DB accounts SQL Injection Prevention: Use Prepared Statements / Parameterized Queries Encrypt Sensitive Columns: Consider encrypting personally identifiable information at rest Examples Demonstrates how to apply the skill through real-world use cases. Example 1: Blog Platform Schema Situation: Database design for a Medium-style blog platform User Request: Design a PostgreSQL schema for a blog platform: - Users can write multiple posts - Posts can have multiple tags (N:M) - Users can like and bookmark posts - Comment feature (with nested replies) Final Result: -- Users CREATE TABLE users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), email VARCHAR(255) UNIQUE NOT NULL, username VARCHAR(50) UNIQUE NOT NULL, bio TEXT, avatar_url VARCHAR(500), created_at TIMESTAMP DEFAULT NOW() ); -- Posts CREATE TABLE posts ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), author_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, title VARCHAR(255) NOT NULL, slug VARCHAR(255) UNIQUE NOT NULL, content TEXT NOT NULL, published_at TIMESTAMP, created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW() ); CREATE INDEX idx_posts_author ON posts(author_id); CREATE INDEX idx_posts_published ON posts(published_at); CREATE INDEX idx_posts_slug ON posts(slug); -- Tags CREATE TABLE tags ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name VARCHAR(50) UNIQUE NOT NULL, slug VARCHAR(50) UNIQUE NOT NULL ); -- Post-Tag relationship (N:M) CREATE TABLE post_tags ( post_id UUID REFERENCES posts(id) ON DELETE CASCADE, tag_id UUID REFERENCES tags(id) ON DELETE CASCADE, PRIMARY KEY (post_id, tag_id) ); CREATE INDEX idx_post_tags_post ON post_tags(post_id); CREATE INDEX idx_post_tags_tag ON post_tags(tag_id); -- Likes CREATE TABLE post_likes ( user_id UUID REFERENCES users(id) ON DELETE CASCADE, post_id UUID REFERENCES posts(id) ON DELETE CASCADE, created_at TIMESTAMP DEFAULT NOW(), PRIMARY KEY (user_id, post_id) ); -- Bookmarks CREATE TABLE post_bookmarks ( user_id UUID REFERENCES users(id) ON DELETE CASCADE, post_id UUID REFERENCES posts(id) ON DELETE CASCADE, created_at TIMESTAMP DEFAULT NOW(), PRIMARY KEY (user_id, post_id) ); -- Comments (self-referencing for nested comments) CREATE TABLE comments ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), post_id UUID NOT NULL REFERENCES posts(id) ON DELETE CASCADE, author_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, parent_comment_id UUID REFERENCES comments(id) ON DELETE CASCADE, content TEXT NOT NULL, created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW() ); CREATE INDEX idx_comments_post ON comments(post_id); CREATE INDEX idx_comments_author ON comments(author_id); CREATE INDEX idx_comments_parent ON comments(parent_comment_id); Example 2: MongoDB Schema (NoSQL) Situation: MongoDB schema for a real-time chat app User Request: Design a MongoDB schema for a real-time chat app. Reads are very frequent, and message history needs to be retrieved quickly. Final Result: // users collection { _id: ObjectId, username: String, // indexed, unique email: String, // indexed, unique avatar_url: String, status: String, // 'online', 'offline', 'away' last_seen: Date, created_at: Date } // conversations collection (denormalized - read-optimized) { _id: ObjectId, participants: [ // indexed { user_id: ObjectId, username: String, avatar_url: String } ], last_message: { // denormalized for fast recent-message retrieval content: String, sender_id: ObjectId, sent_at: Date }, unread_counts: { // unread message count per participant "user_id_1": 5, "user_id_2": 0 }, created_at: Date, updated_at: Date } // messages collection { _id: ObjectId, conversation_id: ObjectId, // indexed sender_id: ObjectId, content: String, attachments: [ { type: String, // 'image', 'file', 'video' url: String, filename: String } ], read_by: [ObjectId], // array of user IDs who have read the message sent_at: Date, // indexed edited_at: Date } // Indexes db.users.createIndex({ username: 1 }, { unique: true }); db.users.createIndex({ email: 1 }, { unique: true }); db.conversations.createIndex({ "participants.user_id": 1 }); db.conversations.createIndex({ updated_at: -1 }); db.messages.createIndex({ conversation_id: 1, sent_at: -1 }); db.messages.createIndex({ sender_id: 1 }); Design Highlights: Denormalization for read optimization (embedding last_message) Indexes on frequently accessed fields Using array fields (participants, read_by) Best practices Quality Improvement Naming Convention Consistency: Use snake_case for table/column names users, post_tags, created_at Be consistent with plurals/singulars (tables plural, columns singular, etc.) Consider Soft Delete: Use logical deletion instead of physical deletion for important data deleted_at TIMESTAMP (NULL = active, NOT NULL = deleted) Allows recovery of accidentally deleted data Audit trail Timestamps Required: Include created_at and updated_at in most tables Data tracking and debugging Time-series analysis Efficiency Improvements Partial Indexes: Minimize index size with conditional indexes CREATE INDEX idx_posts_published ON posts(published_at) WHERE published_at IS NOT NULL; Materialized Views: Cache complex aggregate queries as Materialized Views Partitioning: Partition large tables by date/range Common Issues Issue 1: N+1 Query Problem Symptom: Multiple DB calls when a single query would suffice Cause: Individual lookups in a loop without JOINs Solution: -- ❌ Bad example: N+1 queries SELECT * FROM posts; -- 1 time -- for each post SELECT * FROM users WHERE id = ?; -- N times -- ✅ Good example: 1 query SELECT posts.*, users.username, users.avatar_url FROM posts JOIN users ON posts.author_id = users.id; Issue 2: Slow JOINs Due to Unindexed Foreign Keys Symptom: JOIN queries are very slow Cause: Missing index on Foreign Key column Solution: CREATE INDEX idx_orders_user_id ON orders(user_id); CREATE INDEX idx_order_items_order_id ON order_items(order_id); CREATE INDEX idx_order_items_product_id ON order_items(product_id); Issue 3: UUID vs Auto-increment Performance Symptom: Insert performance degradation when using UUID Primary Keys Cause: UUIDs are random, causing index fragmentation Solution: PostgreSQL: Use uuid_generate_v7() (time-ordered UUID) MySQL: Use UUID_TO_BIN(UUID(), 1) Or consider using Auto-increment BIGINT References Official Documentation PostgreSQL Documentation MySQL Documentation MongoDB Schema Design Best Practices Tools dbdiagram.io - ERD diagram creation PgModeler - PostgreSQL modeling tool Prisma - ORM + migrations Learning Resources Database Design Course (freecodecamp) Use The Index, Luke - SQL indexing guide Metadata Version Current Version: 1.0.0 Last Updated: 2025-01-01 Compatible Platforms: Claude, ChatGPT, Gemini Related Skills api-design: Schema design alongside API design performance-optimization: Query performance optimization Tags #database #schema #PostgreSQL #MySQL #MongoDB #SQL #NoSQL #migration #ERDWeekly Installs11.3KRepositorysupercent-io/sk…templateGitHub Stars53First SeenJan 24, 2026Security AuditsGen Agent Trust HubPassSocketPassSnykPassInstalled oncodex11.2Kgemini-cli11.2Kopencode11.2Kgithub-copilot11.2Kcursor11.2Kamp11.2K

---
*Source: https://skills.yangsir.net/skill/sm-database-schema-design*
*Markdown mirror: https://skills.yangsir.net/api/skill/sm-database-schema-design/markdown*