数据库工程精通
完整的数据库设计、优化、迁移和操作系统。从模式设计到生产监控 —— 涵盖 PostgreSQL、MySQL、SQLite 和通用 SQL 模式。
第一阶段 — 模式设计
设计简报
在编写任何 DDL 之前,请填写此表:
project: ""
domain: ""
primary_use_case: "OLTP | OLAP | mixed"
expected_scale:
rows_year_1: ""
rows_year_3: ""
concurrent_users: ""
read_write_ratio: "80:20 | 50:50 | 20:80"
compliance: [] # GDPR, HIPAA, PCI-DSS, SOX
multi_tenancy: "none | schema-per-tenant | row-level | database-per-tenant"
规范化决策框架
| 形式 | 规则 | 何时反规范化 |
|---|---|---|
| 1NF | 无重复组,原子值 | 永不跳过 |
| 2NF | 无复合键的部分依赖 | 永不跳过 |
| 3NF | 无传递依赖 | 报告表,读重聚合 |
| BCNF | 每个决定因素都是候选键 | 除非复杂的键关系,否则很少需要 |
反规范化触发器:
- 查询连接 > 4 表一致
- 读延迟 > 100ms 在索引查询上
- 缓存失效复杂性超过反规范化维护
- 报告查询阻塞 OLTP 工作负载
命名约定
Tables: snake_case, plural (users, order_items, payment_methods)
Columns: snake_case, singular (first_name, created_at, is_active)
PKs: id (bigint/uuid) or {table_singular}_id
FKs: {referenced_table_singular}_id
Indexes: idx_{table}_{columns}
Constraints: chk_{table}_{rule}, uq_{table}_{columns}, fk_{table}_{ref}
Enums: Use VARCHAR + CHECK, not DB enums (easier to migrate)
Booleans: is_, has_, can_ 前缀 (is_active, has_subscription)
Timestamps: _at 后缀 (created_at, updated_at, deleted_at)
列类型决策树
Text < 255 chars, fixed set? → VARCHAR(N) + CHECK
Text < 255 chars, variable? → VARCHAR(255)
Text > 255 chars? → TEXT
Whole numbers < 2B? → INTEGER
Whole numbers > 2B? → BIGINT
Money/financial? → NUMERIC(precision, scale) — NEVER float
True/false? → BOOLEAN
Date only? → DATE
Date + time? → TIMESTAMPTZ (always with timezone)
Unique identifier? → UUID (distributed) or BIGSERIAL (single DB)
JSON/flexible schema? → JSONB (Postgres) or JSON (MySQL)
Binary/file? → Store in object storage, reference by URL
IP address? → INET (Postgres) or VARCHAR(45)
Geospatial? → PostGIS geometry/geography types
基本表模板
CREATE TABLE {table_name} (
id BIGSERIAL PRIMARY KEY,
-- domain columns here --
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_by BIGINT REFERENCES users(id),
version INTEGER NOT NULL DEFAULT 1, -- optimistic locking
-- soft delete (optional)
deleted_at TIMESTAMPTZ,
-- multi-tenant (optional)
tenant_id BIGINT NOT NULL REFERENCES tenants(id)
);
-- Updated_at trigger (PostgreSQL)
CREATE OR REPLACE FUNCTION update_modified_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
NEW.version = OLD.version + 1;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_{table_name}_updated
BEFORE UPDATE ON {table_name}
FOR EACH ROW
EXECUTE FUNCTION update_modified_column();
关系模式
一对多:
-- 父表
CREATE TABLE departments (id BIGSERIAL PRIMARY KEY, name VARCHAR(100) NOT NULL);
-- 子表
CREATE TABLE employees (
id BIGSERIAL PRIMARY KEY,
department_id BIGINT NOT NULL REFERENCES departments(id) ON DELETE RESTRICT,
-- ON DELETE 选项:RESTRICT (safe default), CASCADE (children die), SET NULL
);
CREATE INDEX idx_employees_department_id ON employees(department_id);
多对多:
CREATE TABLE user_roles (
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
role_id BIGINT NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
granted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
granted_by BIGINT REFERENCES users(id),
PRIMARY KEY (user_id, role_id)
);
自引用(层次结构):
CREATE TABLE categories (
id BIGSERIAL PRIMARY KEY,
parent_id BIGINT REFERENCES categories(id) ON DELETE CASCADE,
name VARCHAR(100) NOT NULL,
depth INTEGER NOT NULL DEFAULT 0,
path TEXT NOT NULL DEFAULT '' -- materialized path: '/1/5/12/'
);
CREATE INDEX idx_categories_parent ON categories(parent_id);
CREATE INDEX idx_categories_path ON categories(path text_pattern_ops);
多态(如果必须,尽量避免):
-- 推荐:单独的 FKs
CREATE TABLE comments (
id BIGSERIAL PRIMARY KEY,
post_id BIGINT REFERENCES posts(id),
ticket_id BIGINT REFERENCES tickets(id),
body TEXT NOT NULL,
CONSTRAINT chk_one_parent CHECK (
(post_id IS NOT NULL)::int + (ticket_id IS NOT NULL)::int = 1
)
);
第二阶段 — 索引策略
索引类型选择
| 索引类型 | 使用时 | 示例 |
|---|---|---|
| B-tree (default) | 等值,范围,排序,LIKE ‘prefix%’ | CREATE INDEX idx_users_email ON users(email) |
| Hash | 仅等值,无范围 | CREATE INDEX idx_sessions_token ON sessions USING hash(token) |
| GIN | JSONB, 全文搜索,数组,tsvector | CREATE INDEX idx_products_tags ON products USING gin(tags) |
| GiST | 地理空间,范围类型,最近邻 | CREATE INDEX idx_locations_geom ON locations USING gist(geom) |
| BRIN | 非常大表具有自然排序(时间序列) | CREATE INDEX idx_events_created ON events USING brin(created_at) |
| Partial | 行的子集 | CREATE INDEX idx_orders_pending ON orders(created_at) WHERE status = 'pending' |
| Covering | 包含列以避免表查找 | CREATE INDEX idx_orders_user ON orders(user_id) INCLUDE (status, total) |
索引规则
- **总是索引:**外键,WHERE/JOIN/ORDER BY 中的列
- **永不索引:**单独的低基数列(布尔值,3个值的状态) —— 在复合中组合
- **复合顺序:**首先选择最具选择性的列,然后从左到右匹配查询模式
- **注意写入开销:**每个索引减慢 INSERT/UPDATE。>8 索引在写入重的表上 = 审查
- **未使用索引审计:**每月运行 —— 删除扫描次数为 0 的索引
查找未使用索引(PostgreSQL)
SELECT schemaname, tablename, indexname, idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelid NOT IN (
SELECT conindid FROM pg_constraint WHERE contype IN ('p', 'u')
)
ORDER BY pg_relation_size(indexrelid) DESC;
查找缺失索引(PostgreSQL)
SELECT relname, seq_scan, seq_tup_read,
idx_scan, seq_tup_read / GREATEST(seq_scan, 1) as avg_tuples_per_scan
FROM pg_stat_user_tables
WHERE seq_scan > 100 AND seq_tup_read > 10000
ORDER BY seq_tup_read DESC;
-- 高 seq_scan + 高 seq_tup_read = 缺失索引候选
第三阶段 — 查询优化
EXPLAIN 解释
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...;
查询计划中的红旗:
| 模式 | 问题 | 修复 |
|---|---|---|
| 大表上的 Seq Scan | 缺失索引 | 添加适当索引 |
| 大外层的嵌套循环 | O(n×m) 连接 | 添加连接列上的索引,考虑哈希连接 |
| 高成本的排序 | 缺少 ORDER BY 的索引 | 添加匹配排序顺序的索引 |
| 磁盘上溢出的哈希连接 | work_mem 太低 | 增加 work_mem 或减少结果集 |
| 许多重新检查的位图堆扫描 | 选择性低的索引 | 更具选择性的索引或部分索引 |
| 每个行执行的子计划(相关子查询) | 执行次数多 | 重写为 JOIN 或 lateral |
| 行估计严重错误 | 统计数据过时 | ANALYZE 表 |
查询反模式 & 修复
*1. 生产中的 SELECT :
-- 坏:获取所有列,破坏覆盖索引
SELECT * FROM orders WHERE user_id = 123;
-- 好:显式列
SELECT id, status, total, created_at FROM orders WHERE user_id = 123;
2. N+1 查询:
-- 坏:用户 1 个查询 + 订单 N 个查询
SELECT id FROM users WHERE active = true; -- 返回 100 行
SELECT * FROM orders WHERE user_id = ?; -- 调用 100 次
-- 好:单个 JOIN 或 IN
SELECT u.id, o.id, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.active = true;
3. 索引列上的函数:
-- 坏:不能在 created_at 上使用索引
WHERE EXTRACT(YEAR FROM created_at) = 2025
-- 好:范围扫描使用索引
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01'
-- 坏:不能在 email 上使用索引
WHERE LOWER(email) = 'user@example.com'
-- 好:表达式索引
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
4. 杀死索引的 OR 条件:
-- 坏:经常导致 Seq Scan
WHERE status = 'pending' OR status = 'processing'
-- 好:IN 使用索引
WHERE status IN ('pending', 'processing')
5. 带有 OFFSET 的分页:
-- 坏:OFFSET 10000 扫描并丢弃 10000 行
SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 10000;
-- 好:关键集分页
SELECT * FROM products WHERE id > :last_seen_id ORDER BY id LIMIT 20;
6. 大表上的 COUNT(*):
-- 坏:全表扫描
SELECT COUNT(*) FROM events;
-- 好:近似计数(PostgreSQL)
SELECT reltuples::bigint FROM pg_class WHERE relname = 'events';
-- 或维护一个计数器缓存表
窗口函数参考
-- 运行总数
SELECT id, amount, SUM(amount) OVER (ORDER BY created_at) as running_total FROM payments;
-- 组内的排名
SELECT *, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) as dept_rank FROM employees;
-- 上一行/下一行
SELECT *, LAG(amount) OVER (ORDER BY created_at) as prev_amount,
LEAD(amount) OVER (ORDER BY created_at) as next_amount FROM payments;
-- 移动平均
SELECT *, AVG(amount) OVER (ORDER BY created_at ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as ma_7 FROM daily_sales;
-- 总百分比
SELECT *, amount / SUM(amount) OVER () * 100 as pct_of_total FROM line_items WHERE order_id = 1;
CTE 模式
-- 递归:组织结构遍历
WITH RECURSIVE org AS (
SELECT id, name, manager_id, 1 as depth FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, o.depth + 1
FROM employees e JOIN org o ON e.manager_id = o.id
WHERE o.depth < 10 -- 安全限制
)
SELECT * FROM org ORDER BY depth, name;
-- 数据管道:清理 → 转换 → 聚合
WITH cleaned AS (
SELECT *, TRIM(LOWER(email)) as clean_email FROM raw_signups WHERE email IS NOT NULL
),
deduped AS (
SELECT DISTINCT ON (clean_email) * FROM cleaned ORDER BY clean_email, created_at DESC
)
SELECT DATE_TRUNC('week', created_at) as week, COUNT(*) FROM deduped GROUP BY 1 ORDER BY 1;
第四阶段 — 迁移
迁移安全规则
- 永不 在生产中重命名列/表而不经过多步骤过程
- 永不 在有数据的现有表上添加 NOT NULL 而不设置 DEFAULT
- 永不 删除应用程序代码仍然引用的列
- 总是 在生产数据的副本上首先测试迁移
- 总是 有回滚计划(下迁移)
- 总是 在生产中的模式变更之前进行备份
安全迁移模式
添加列(安全):
-- 第 1 步:添加可空列
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- 第 2 步:回填(批量!)
UPDATE users SET phone = '' WHERE phone IS NULL AND id BETWEEN 1 AND 10000;
-- 第 3 步:回填后添加 NOT NULL
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
ALTER TABLE users ALTER COLUMN phone SET DEFAULT '';
重命名列(安全的多步骤):
-- 第 1 步:添加新列
ALTER TABLE users ADD COLUMN full_name VARCHAR(200);
-- 第 2 步:在应用程序代码中双写(写入旧的 + 新的)
-- 第 3 步:回填
UPDATE users SET full_name = name WHERE full_name IS NULL;
-- 第 4 步:将应用程序切换为从新列读取
-- 第 5 步:确认没有读取后删除旧列
ALTER TABLE users DROP COLUMN name;
不锁定添加索引(PostgreSQL):
CREATE INDEX CONCURRENTLY idx_orders_customer ON orders(customer_id);
-- 耗时更长但不锁定表
大表回填(批处理):
-- 不要:一次事务更新百万行
-- 做:分批
DO $$
DECLARE
batch_size INT := 5000;
affected INT;
BEGIN
LOOP
UPDATE users SET normalized_email = LOWER(email)
WHERE normalized_email IS NULL AND id IN (
SELECT id FROM users WHERE normalized_email IS NULL LIMIT batch_size
);
GET DIAGNOSTICS affected = ROW_COUNT;
RAISE NOTICE 'Updated % rows', affected;
EXIT WHEN affected = 0;
COMMIT;
END LOOP;
END $$;
迁移文件模板
-- 迁移:YYYYMMDDHHMMSS_description.sql
-- 作者:[name]
-- 工单:[JIRA/Linear ID]
-- 风险:low|medium|high
-- 回滚:见 DOWN 部分
-- 预计时间:[生产数据量]
-- 需要:[先决条件迁移]
-- ========== UP ==========
BEGIN;
-- [DDL/DML here]
COMMIT;
-- ========== DOWN ==========
-- BEGIN;
-- [Rollback DDL/DML here]
-- COMMIT;
-- ========== VERIFY ==========
-- [确认迁移成功的查询]
-- SELECT COUNT(*) FROM ... WHERE ...;
第五阶段 — 性能监控
关键指标仪表板
health_metrics:
connections:
active: "SELECT count(*) FROM pg_stat_activity WHERE state = 'active'"
idle: "SELECT count(*) FROM pg_stat_activity WHERE state = 'idle'"
max: "SHOW max_connections"
threshold: "active > 80% of max = ALERT"
cache_hit_ratio:
query: |
SELECT ROUND(100.0 * sum(heap_blks_hit) /
NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0), 2) as ratio
FROM pg_statio_user_tables
healthy: "> 99%"
warning: "< 95%"
critical: "< 90%"
index_hit_ratio:
query: |
SELECT ROUND(100.0 * sum(idx_blks_hit) /
NULLIF(sum(idx_blks_hit) + sum(idx_blks_read), 0), 2) as ratio
FROM pg_statio_user_indexes
healthy: "> 99%"
table_bloat:
query: |
SELECT relname, n_dead_tup, n_live_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup, 0), 2) as dead_pct
FROM pg_stat_user_tables WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC LIMIT 10
action: "VACUUM ANALYZE {table} when dead_pct > 20%"
slow_queries:
query: |
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC LIMIT 20
action: "Optimize top 5 by total_exec_time first"
replication_lag:
query: |
SELECT EXTRACT(EPOCH FROM replay_lag) as lag_seconds
FROM pg_stat_replication
warning: "> 5 seconds"
critical: "> 30 seconds"
表大小分析
SELECT
relname as table,
pg_size_pretty(pg_total_relation_size(relid)) as total_size,
pg_size_pretty(pg_relation_size(relid)) as table_size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as index_size,
n_live_tup as row_count
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;
锁定监控
-- 查找阻塞查询
SELECT
blocked.pid as blocked_pid,
blocked.query as blocked_query,
blocking.pid as blocking_pid,
blocking.query as blocking_query,
NOW() - blocked.query_start as blocked_duration
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid
JOIN pg_locks kl ON kl.locktype = bl.locktype AND kl.relation = bl.relation AND kl.pid != bl.pid
JOIN pg_stat_activity blocking ON blocking.pid = kl.pid
WHERE NOT bl.granted;
第六阶段 — 备份与恢复
备份策略决策
| 方法 | RPO | 速度 | 使用时 |
|---|---|---|---|
| pg_dump (logical) | 点时间 | 慢于 >50GB | 小型中型数据库,跨版本迁移 |
| pg_basebackup (physical) | 连续(带 WAL) | 快速 | 大型数据库,同版本恢复 |
| WAL 归档(PITR) | 秒 | N/A(连续) | 生产与接近零 RPO |
| 复制提升 | 秒 | 即时 | HA 故障转移 |
备份命令
# 逻辑备份(压缩)
pg_dump -Fc -Z 9 -j 4 -d mydb -f backup_$(date +%Y%m%d_%H%M%S).dump
# 恢复
pg_restore -d mydb -j 4 --clean --if-exists backup_20260216.dump
# 仅模式
pg_dump -s -d mydb -f schema.sql
# 单表
pg_dump -t orders -d mydb -f orders_backup.dump
# 物理备份
pg_basebackup -D /backup/base -Ft -z -P -X stream
备份验证清单
- [ ] 备份完成无错误
- [ ] 备份文件大小在预期范围内(不是异常小)
- [ ] 恢复到测试数据库成功
- [ ] 行数与生产匹配(抽查 5 个表)
- [ ] 应用程序可以连接并查询恢复的数据库
- [ ] 对恢复的备份运行自动化测试套件
- [ ] 备份加密验证(如果需要)
- [ ] 确认离线副本
第七阶段 — 安全
访问控制清单
-- 创建应用程序角色(最小权限)
CREATE ROLE app_user LOGIN PASSWORD 'use-vault-not-plaintext';
GRANT CONNECT ON DATABASE mydb TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
-- NO: GRANT ALL, superuser, CREATE, DROP
-- 仅限读取角色,用于分析
CREATE ROLE analyst LOGIN PASSWORD 'use-vault';
GRANT CONNECT ON DATABASE mydb TO analyst;
GRANT USAGE ON SCHEMA public TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;
-- 行级安全(多租户)
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
USING (tenant_id = current_setting('app.tenant_id')::bigint);
SQL 注入防护
规则 1:永远不要将用户输入拼接到 SQL 字符串中
规则 2:始终使用参数化查询/预准备语句
规则 3:如果动态验证和白名单表/列名称
规则 4:对于 CRUD,使用 ORM,仅对复杂查询使用原始 SQL
规则 5:审计日志记录不寻常的查询模式(UNION, DROP, --)
数据保护
-- 加密敏感列(应用级)
-- 存储:pgp_sym_encrypt(data, key)
-- 读取:pgp_sym_decrypt(encrypted_col, key)
-- 审计跟踪表
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
table_name VARCHAR(100) NOT NULL,
record_id BIGINT NOT NULL,
action VARCHAR(10) NOT NULL, -- INSERT, UPDATE, DELETE
old_data JSONB,
new_data JSONB,
changed_by BIGINT REFERENCES users(id),
changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
ip_address INET
);
-- 通用审计触发器
CREATE OR REPLACE FUNCTION audit_trigger() RETURNS TRIGGER AS $$
BEGIN
INSERT INTO audit_log (table_name, record_id, action, old_data, new_data, changed_by)
VALUES (
TG_TABLE_NAME,
COALESCE(NEW.id, OLD.id),
TG_OP,
CASE WHEN TG_OP != 'INSERT' THEN to_jsonb(OLD) END,
CASE WHEN TG_OP != 'DELETE' THEN to_jsonb(NEW) END,
current_setting('app.user_id', true)::bigint
);
RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
第八阶段 — PostgreSQL 配置调整
按服务器大小的基本设置
| 设置 | 小型(4GB RAM) | 中型(16GB) | 大型(64GB+) |
|---|---|---|---|
| shared_buffers | 1GB | 4GB | 16GB |
| effective_cache_size | 3GB | 12GB | 48GB |
| work_mem | 16MB | 64MB | 256MB |
| maintenance_work_mem | 256MB | 1GB | 2GB |
| max_connections | 100 | 200 | 300 |
| wal_buffers | 64MB | 128MB | 256MB |
| random_page_cost | 1.1 (SSD) | 1.1 (SSD) | 1.1 (SSD) |
| effective_io_concurrency | 200 (SSD) | 200 (SSD) | 200 (SSD) |
| max_parallel_workers_per_gather | 2 | 4 | 8 |
连接池(PgBouncer)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction # transaction pooling (best for most apps)
max_client_conn = 1000 # accept up to 1000 app connections
default_pool_size = 25 # 25 actual DB connections per database
reserve_pool_size = 5 # extra connections for burst
reserve_pool_timeout = 3 # seconds before using reserve
server_idle_timeout = 300 # close idle server connections after 5 min
第九阶段 — 常见模式
软删除
-- 添加到表
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMPTZ;
CREATE INDEX idx_users_active ON users(id) WHERE deleted_at IS NULL;
-- 应用程序查询始终过滤
SELECT * FROM users WHERE deleted_at IS NULL AND ...;
-- 或使用视图
CREATE VIEW active_users AS SELECT * FROM users WHERE deleted_at IS NULL;
乐观锁定
UPDATE products SET
price = 29.99,
version = version + 1,
updated_at = NOW()
WHERE id = 123 AND version = 5; -- 预期版本
-- 如果 0 行受影响 → 并发修改 → 重试或错误
事件源表
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
aggregate_type VARCHAR(50) NOT NULL,
aggregate_id UUID NOT NULL,
event_type VARCHAR(100) NOT NULL,
event_data JSONB NOT NULL,
metadata JSONB DEFAULT '{}',
version INTEGER NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (aggregate_id, version)
);
CREATE INDEX idx_events_aggregate ON events(aggregate_id, version);
CREATE INDEX idx_events_type ON events(event_type, created_at);
时间序列优化
-- 按月分区
CREATE TABLE metrics (
id BIGSERIAL,
sensor_id INTEGER NOT NULL,
value NUMERIC(12,4) NOT NULL,
recorded_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (recorded_at);
CREATE TABLE metrics_2026_01 PARTITION OF metrics
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE metrics_2026_02 PARTITION OF metrics
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
-- 通过 cron 或 pg_partman 自动创建未来分区
-- 使用 BRIN 索引进行时间序列
CREATE INDEX idx_metrics_time ON metrics USING brin(recorded_at);
全文搜索(PostgreSQL)
-- 添加搜索列
ALTER TABLE articles ADD COLUMN search_vector tsvector;
CREATE INDEX idx_articles_search ON articles USING gin(search_vector);
-- 填充
UPDATE articles SET search_vector =
setweight(to_tsvector('english', COALESCE(title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(body, '')), 'B');
-- 搜索排名
SELECT id, title, ts_rank(search_vector, query) as rank
FROM articles, plainto_tsquery('english', 'database optimization') query
WHERE search_vector @@ query
ORDER BY rank DESC LIMIT 20;
JSONB 模式
-- 存储灵活属性
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
attributes JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- 索引特定 JSON 路径
CREATE INDEX idx_products_color ON products((attributes->>'color'));
-- 或 GIN 用于任何键查找
CREATE INDEX idx_products_attrs ON products USING gin(attributes);
-- 查询模式
SELECT * FROM products WHERE attributes->>'color' = 'red';
SELECT * FROM products WHERE attributes @> '{"size": "large"}';
SELECT * FROM products WHERE attributes ? 'warranty';
第十阶段 — 运维手册
紧急情况:数据库过载
-- 1. 查找并终止长时间运行的查询
SELECT pid, NOW() - query_start as duration, query
FROM pg_stat_activity WHERE state = 'active' AND query_start < NOW() - INTERVAL '5 minutes'
ORDER BY duration DESC;
-- 终止特定查询
SELECT pg_cancel_backend(pid); -- 优雅
SELECT pg_terminate_backend(pid); -- 强制
-- 2. 检查锁定争用(见第五阶段)
-- 3. 临时减少最大连接数
-- 在 pgbouncer 中:暂停数据库,减少池,恢复
-- 4. 检查是否需要 VACUUM
SELECT relname, n_dead_tup, last_autovacuum FROM pg_stat_user_tables
WHERE n_dead_tup > 100000 ORDER BY n_dead_tup DESC;
紧急情况:磁盘已满
# 1. 检查什么占用空间
du -sh /var/lib/postgresql/*/main/ 2>/dev/null || du -sh /var/lib/mysql/
# 2. 清理 WAL 文件(PostgreSQL) —— 小心
# 首先检查复制槽位状态
SELECT slot_name, active FROM pg_replication_slots;
# 删除消耗 WAL 的不活动槽
SELECT pg_drop_replication_slot('unused_slot');
# 3. 对最大表执行 VACUUM FULL(锁定表!)
VACUUM FULL large_table;
# 4. 删除旧备份 / 日志
find /backups -name "*.dump" -mtime +7 -delete
每周维护清单
- [ ] 查看慢查询日志(总时间前 10)
- [ ] 检查索引使用情况 —— 删除未使用的,添加缺失的
- [ ] 验证备份成功并测试恢复
- [ ] 检查表膨胀 —— 安排需要的 VACUUM
- [ ] 查看连接计数趋势
- [ ] 查看磁盘空间轨迹
- [ ] 查看复制延迟
- [ ] 更新表统计信息:
ANALYZE;
第十一阶段 — 数据库比较快速参考
| 特性 | PostgreSQL | MySQL (InnoDB) | SQLite |
|---|---|---|---|
| 最适合 | 复杂查询,扩展 | 网络应用,读重 | 嵌入式,开发,小型应用 |
| 最大大小 | 无限(实际) | 无限(实际) | 281 TB(实际 ~1TB) |
| JSON 支持 | JSONB(可索引,快速) | JSON(有限索引) | JSON1 扩展 |
| 全文搜索 | 内置(tsvector) | 内置(FULLTEXT) | FTS5 扩展 |
| 窗口函数 | 全支持 | 全支持(8.0+) | 全支持(3.25+) |
| CTEs | 递归 + 物化 | 递归(8.0+) | 递归(3.8+) |
| 分区 | 声明式 + 列表/范围/哈希 | 范围/列表/哈希/键 | 无 |
| 行级安全 | 是 | 否(使用视图) | 否 |
| 复制 | 流 + 逻辑 | 二进制日志 | 无(使用 Litestream) |
| 连接模型 | 每个连接一个进程 | 每个连接一个线程 | 进程内 |
质量评分标准(0-100)
| 维度 | 权重 | 0(差) | 5(好) | 10(优秀) |
|---|---|---|---|---|
| 模式设计 | 20% | 无规范化,无约束 | 3NF,FKs,适当类型 | 优化规范化形式,所有约束,审计字段 |
| 索引 | 15% | PK 之外无索引 | FKs 和常见查询上的索引 | 覆盖索引,部分索引,无未使用索引 |
| 查询质量 | 20% | SELECT *,N+1,无 EXPLAIN | 特定列,JOINs,基本优化 | 键集分页,窗口函数,优化计划 |
| 迁移安全 | 10% | 原始 DDL,无回滚 | 版本文件,上下 | 零停机时间,批量回填,同时索引 |
| 安全 | 15% | 超级用户访问,无审计 | 最小权限,参数化查询 | RLS,加密,审计触发器,定期访问审查 |
| 监控 | 10% | 无监控 | 基本警报连接/磁盘 | 全仪表板,慢查询分析,主动调整 |
| 备份/恢复 | 10% | 无备份 | 每日转储 | PITR,测试恢复,离线副本 |
分数解释: <40 = 重大风险 | 40-60 = 需要工作 | 60-80 = 稳固 | 80-90 = 专业 | 90+ = 专家
自然语言命令
- “为 [domain] 设计一个模式” → 第一阶段完整设计过程
- “优化这个查询:[SQL]” → EXPLAIN 分析 + 重写
- “为 [query pattern] 添加一个索引” → 索引类型选择 + 创建
- “编写一个迁移以 [change]” → 安全迁移与回滚
- “审计这个数据库” → 全面评分所有维度
- “为 [database] 设置监控” → 第五阶段仪表板查询
- “审查这个模式” → 命名,类型,约束,关系检查
- “帮助我 [PostgreSQL/MySQL/SQLite] [topic]” → 平台特定指导
- “故障排除慢查询” → pg_stat_statements 分析 + 顶部修复
- “计划一个备份策略” → 第六阶段决策框架
- “使这个表成为多租户” → RLS + tenant_id 模式
- “将这个转换为使用分区” → 第九阶段时间序列模式