数据库备份与恢复
概览
实施全面的备份和灾难恢复策略。涵盖备份类型、保留策略、恢复测试和恢复时间目标(RTO/RPO)。
何时使用
- 备份自动化设置
- 灾难恢复计划
- 恢复测试程序
- 备份保留策略
- 点时间恢复(PITR)
- 跨区域备份复制
- 合规性和审计要求
PostgreSQL备份策略
全库备份
pg_dump - 文本格式:
# 简单全库备份
pg_dump -h localhost -U postgres -F p database_name > backup.sql
# 压缩备份
pg_dump -h localhost -U postgres -F p database_name | gzip > backup.sql.gz
# 详细输出备份
pg_dump -h localhost -U postgres -F p -v database_name > backup.sql 2>&1
# 排除特定表
pg_dump -h localhost -U postgres database_name \
--exclude-table=temp_* --exclude-table=logs > backup.sql
pg_dump - 自定义二进制格式:
# 自定义二进制格式(大型数据库更佳)
pg_dump -h localhost -U postgres -F c database_name > backup.dump
# 并行作业以加快备份(PostgreSQL 9.3+)
pg_dump -h localhost -U postgres -F c -j 4 \
--load-via-partition-root database_name > backup.dump
# 备份特定模式
pg_dump -h localhost -U postgres -n public database_name > backup.dump
# 获取备份信息
pg_dump_all -h localhost -U postgres > all_databases.sql
pg_basebackup - 物理备份:
# 用于流复制的基础备份
pg_basebackup -h localhost -D ./backup_data -U replication_user -v -P
# 标记备份以归档
pg_basebackup -h localhost -D ./backup_data \
-U replication_user -l "backup_$(date +%Y%m%d)" -v -P
# Tar格式压缩
pg_basebackup -h localhost -D - -U replication_user \
-Ft -z -l "backup_$(date +%s)" | tar -xz -C ./backups/
增量和差异备份
WAL归档设置:
-- postgresql.conf配置
-- wal_level = replica
-- archive_mode = on
-- archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
-- archive_timeout = 300
-- 监控WAL归档
SELECT
name,
setting
FROM pg_settings
WHERE name LIKE 'archive%';
-- 检查WAL目录
-- ls -lh $PGDATA/pg_wal/
-- 列出归档的WAL
-- ls -lh /archive/
连续WAL备份:
#!/bin/bash
# 带有WAL归档的备份脚本
BACKUP_DIR="/backups"
DB_NAME="production"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# 创建基础备份
pg_basebackup -h localhost -D $BACKUP_DIR/base_$TIMESTAMP \
-U backup_user -v
# 归档WAL文件
WAL_DIR=$BACKUP_DIR/wal_$TIMESTAMP
mkdir -p $WAL_DIR
cp /var/lib/postgresql/14/main/pg_wal/* $WAL_DIR/
# 压缩备份
tar -czf $BACKUP_DIR/backup_$TIMESTAMP.tar.gz \
$BACKUP_DIR/base_$TIMESTAMP $BACKUP_DIR/wal_$TIMESTAMP
# 验证备份
pg_basebackup -h localhost -U backup_user --analyze
# 上传到S3
aws s3 cp $BACKUP_DIR/backup_$TIMESTAMP.tar.gz \
s3://backup-bucket/postgres/
MySQL备份策略
全库备份
mysqldump - 文本格式:
# 简单全库备份
mysqldump -h localhost -u root -p database_name > backup.sql
# 所有数据库
mysqldump -h localhost -u root -p --all-databases > all_databases.sql
# 带有刷新权限和触发器
mysqldump -h localhost -u root -p \
--flush-privileges --triggers --routines \
database_name > backup.sql
# 并行备份(MySQL 5.7.11+)
mydumper -h localhost -u root -p password \
-o ./backup_dir --threads 4 --compress
备份特定表:
# 备份特定表
mysqldump -h localhost -u root -p database_name table1 table2 > tables.sql
# 排除表
mysqldump -h localhost -u root -p database_name \
--ignore-table=database_name.temp_table \
--ignore-table=database_name.logs > backup.sql
二进制日志备份
启用二进制日志:
-- 检查二进制日志状态
SHOW VARIABLES LIKE 'log_bin%';
-- 配置在my.cnf
-- [mysqld]
-- log-bin = mysql-bin
-- binlog_format = ROW
-- 查看二进制日志
SHOW BINARY LOGS;
-- 获取当前位置
SHOW MASTER STATUS;
二进制日志备份:
# 备份二进制日志
MYSQL_PWD="password" mysqldump -h localhost -u root \
--single-transaction --flush-logs --all-databases > backup.sql
# 复制二进制日志
cp /var/log/mysql/mysql-bin.* /backup/binlogs/
# 备份增量变化
mysqlbinlog /var/log/mysql/mysql-bin.000001 > binlog_backup.sql
恢复程序
PostgreSQL恢复
从文本备份恢复:
# 删除并重新创建数据库
psql -h localhost -U postgres -c "DROP DATABASE IF EXISTS database_name;"
psql -h localhost -U postgres -c "CREATE DATABASE database_name;"
# 从文本备份恢复
psql -h localhost -U postgres database_name < backup.sql
# 详细输出恢复
psql -h localhost -U postgres -1 database_name < backup.sql 2>&1 | tee restore.log
从二进制备份恢复:
# 从自定义格式恢复
pg_restore -h localhost -U postgres -d database_name \
-v backup.dump
# 并行恢复(更快)
pg_restore -h localhost -U postgres -d database_name \
-j 4 -v backup.dump
# 测试恢复(不提交测试恢复)
pg_restore --list backup.dump > restore_plan.txt
点时间恢复(PITR):
# 列出可用备份和WAL归档
ls -lh /archive/
# 恢复到特定时间点
pg_basebackup -h localhost -D ./recovery_data \
-U replication_user -c fast
# 创建recovery.conf
cat > ./recovery_data/recovery.conf << EOF
recovery_target_timeline = 'latest'
recovery_target_xid = '1000000'
recovery_target_time = '2024-01-15 14:30:00'
recovery_target_name = 'before_bad_update'
EOF
# 启动PostgreSQL恢复
pg_ctl -D ./recovery_data start
MySQL恢复
从SQL备份恢复:
# 恢复全库
mysql -h localhost -u root -p < backup.sql
# 恢复特定数据库
mysql -h localhost -u root -p database_name < database_backup.sql
# 恢复进度
pv backup.sql | mysql -h localhost -u root -p database_name
带二进制日志恢复:
# 从备份恢复然后应用二进制日志
mysql -h localhost -u root -p < backup.sql
# 从备份获取起始二进制日志位置
grep "SET @@GLOBAL.GTID_PURGED=" backup.sql
# 备份后应用二进制日志
mysqlbinlog /var/log/mysql/mysql-bin.000005 \
--start-position=12345 | \
mysql -h localhost -u root -p database_name
点时间恢复:
# 恢复基础备份
mysql -h localhost -u root -p database_name < base_backup.sql
# 应用特定时间的二进制日志
mysqlbinlog /var/log/mysql/mysql-bin.000005 \
--stop-datetime='2024-01-15 14:30:00' | \
mysql -h localhost -u root -p database_name
备份验证
PostgreSQL - 备份完整性检查:
# 验证备份文件
pg_dump --analyze --schema-only database_name > /dev/null && echo "Backup OK"
# 测试恢复程序
createdb test_restore
pg_restore -d test_restore backup.dump
psql -d test_restore -c "SELECT COUNT(*) FROM information_schema.tables;"
dropdb test_restore
MySQL - 备份完整性:
# 检查备份文件语法
mysql -h localhost -u root -p < backup.sql --dry-run
# 验证校验和
md5sum backup.sql
# 保存校验和:echo "abc123def456 backup.sql" > backup.sql.md5
md5sum -c backup.sql.md5
自动化备份计划
PostgreSQL - Cron备份:
#!/bin/bash
# backup.sh - 每日备份脚本
BACKUP_DIR="/backups/postgresql"
RETENTION_DAYS=30
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# 创建备份
pg_dump -h localhost -U postgres mydb | gzip > \
$BACKUP_DIR/backup_$TIMESTAMP.sql.gz
# 删除旧备份
find $BACKUP_DIR -name "backup_*.sql.gz" -mtime +$RETENTION_DAYS -delete
# 上传到S3
aws s3 cp $BACKUP_DIR/backup_$TIMESTAMP.sql.gz \
s3://backup-bucket/postgresql/
# 记录备份
echo "$TIMESTAMP: Backup completed" >> /var/log/db_backup.log
Crontab条目:
# 每日2 AM备份
0 2 * * * /scripts/backup.sh
# 每小时备份
0 * * * * /scripts/hourly_backup.sh
# 每周全库备份
0 3 0 * * /scripts/weekly_backup.sh
备份保留策略
PostgreSQL - 保留策略:
-- 创建保留跟踪
CREATE TABLE backup_retention_policy (
backup_id UUID PRIMARY KEY,
database_name VARCHAR(255),
backup_date TIMESTAMP,
backup_type VARCHAR(20), -- 'full', 'incremental', 'wal'
retention_days INT,
expires_at TIMESTAMP GENERATED ALWAYS AS
(backup_date + INTERVAL '1 day' * retention_days) STORED
);
-- 示例保留期
INSERT INTO backup_retention_policy VALUES
('backup-001', 'production', NOW(), 'full', 30),
('backup-002', 'production', NOW(), 'incremental', 7),
('backup-003', 'staging', NOW(), 'full', 7);
-- 查询到期备份
SELECT backup_id, expires_at
FROM backup_retention_policy
WHERE expires_at < NOW();
RTO/RPO计划
恢复时间目标(RTO):系统必须多快恢复
恢复点目标(RPO):可以接受的数据丢失量
示例:
- RTO:1小时(系统必须在1小时内恢复)
- RPO:15分钟(可以接受不超过15分钟的数据丢失)
备份频率:每15分钟一次(以满足RPO)
复制延迟:< 5分钟(以满足RTO)
最佳实践清单
✅ 定期测试恢复程序 ✅ 实施自动化备份 ✅ 监控备份成功 ✅ 加密备份文件 ✅ 异地存储备份 ✅ 文档化恢复程序 ✅ 跟踪备份保留策略 ✅ 监控备份性能
❌ 不要依赖未经测试的备份 ❌ 不要跳过备份验证 ❌ 不要在同一个服务器上存储备份 ❌ 不要使用弱加密 ❌ 不要忘记备份保留限制