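Ansible Automation
Overview
Automate infrastructure provisioning, configuration management, and application deployment across multiple servers using Ansible playbooks, roles, and dynamic inventory.
When to Use
- Configuration management
- Application deployment
- Infrastructure patching and updates
- Multi-server orchestration
- Cloud instance provisioning
- Container management
- Database administration
- Security and compliance automation
Implementation Examples
1. Playbook Structure and Best Practices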
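# site.yml - main playbook
---
- name: Deploy application stack
  hosts: all
  become: yes  # package, service, and user tasks in the roles need root
  gather_facts: yes
  serial: 1  # rolling deployment, one host at a time
  pre_tasks:
    - name: Show host information
      debug:
        var: inventory_hostname
      tags: [always]
  roles:
    - common
    - docker
    - application
  post_tasks:
    - name: Verify deployment
      uri:
        url: "http://{{ inventory_hostname }}:8080/health"
        status_code: 200
      register: health_check
      until: health_check.status == 200  # older Ansible releases ignore retries without until
      retries: 3
      delay: 10
      tags: [verify]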
# roles/common/tasks/main.yml
---
- name: Update system package cache
  apt:
    update_cache: yes
    cache_valid_time: 3600
  when: ansible_os_family == 'Debian'
- name: Install required packages
  package:
    name: "{{ packages }}"
    state: present
  vars:
    packages:
      - curl
      - git
      - htop
      - python3-pip
- name: Configure sysctl settings
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    sysctl_set: yes
    state: present
  loop:
    - name: net.core.somaxconn
      value: 65535
    - name: net.ipv4.tcp_max_syn_backlog
      value: 65535
    - name: fs.file-max
      value: 2097152
- name: Create application user
  user:
    name: appuser
    shell: /bin/bash
    home: /home/appuser
    createhome: yes
    state: present
# roles/docker/tasks/main.yml
---
- name: Install Docker prerequisites
  package:
    name: "{{ docker_packages }}"
    state: present
  vars:
    docker_packages:
      - apt-transport-https
      - ca-certificates
      - curl
      - gnupg
      - lsb-release
# Note: apt-key is deprecated on recent Debian/Ubuntu releases
- name: Add Docker GPG key
  apt_key:
    url: https://download.docker.com/linux/ubuntu/gpg
    state: present
- name: Add Docker repository
  apt_repository:
    repo: "deb https://download.docker.com/linux/ubuntu {{ ansible_distribution_release }} stable"
    state: present
- name: Install Docker
  package:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
    state: present
- name: Start Docker service
  systemd:
    name: docker
    enabled: yes
    state: started
- name: Add user to docker group
  user:
    name: appuser
    groups: docker
    append: yes
# roles/application/tasks/main.yml
---
- name: Clone application repository
  git:
    repo: "{{ app_repo_url }}"
    dest: "/home/appuser/app"
    version: "{{ app_version }}"
    force: yes
  become: yes
  become_user: appuser
- name: Copy environment configuration
  template:
    src: .env.j2
    dest: "/home/appuser/app/.env"
    owner: appuser
    group: appuser
    mode: '0600'
  notify: Restart application
- name: Build Docker image
  docker_image:
    name: "myapp:{{ app_version }}"
    build:
      path: "/home/appuser/app"
      pull: yes
    source: build
    state: present
  become: yes
- name: Start application container
  docker_container:
    name: myapp
    image: "myapp:{{ app_version }}"
    state: started
    restart_policy: always
    ports:
      - "8080:8080"
    volumes:
      - /home/appuser/app:/app:ro
    env:
      NODE_ENV: "{{ environment }}"
      LOG_LEVEL: "{{ log_level }}"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
# roles/application/handlers/main.yml
# Handlers live in the role's handlers/ directory, not in the tasks file.
---
- name: Restart application
  docker_container:
    name: myapp
    state: started
    restart: yes  # docker_container has no "restarted" state
2. Inventory and Variables
# inventory/hosts.ini
[webservers]
web1 ansible_host=10.0.1.10
web2 ansible_host=10.0.1.11
web3 ansible_host=10.0.1.12
[databases]
db1 ansible_host=10.0.2.10 db_role=primary
db2 ansible_host=10.0.2.11 db_role=replica
[all:vars]
ansible_user=ubuntu
ansible_ssh_private_key_file=~/.ssh/id_rsa
ansible_python_interpreter=/usr/bin/python3
# inventory/group_vars/webservers.yml
---
app_version: "1.2.3"
app_repo_url: "https://github.com/myorg/myapp.git"
environment: production  # caution: "environment" is also a playbook keyword, so Ansible warns about this reserved name
log_level: INFO
# inventory/host_vars/web1.yml
---
server_role: primary
max_connections: 500
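The overview mentions dynamic inventory, while the files above are static. As a minimal sketch, assuming the hosts run on AWS EC2 and the amazon.aws collection (with boto3) is installed, an inventory plugin file could look like the following. The file name, region, and the Environment and Role tags are hypothetical and not part of the original project.

```yaml
# inventory/aws_ec2.yml - hypothetical dynamic inventory
# (the plugin only accepts file names ending in aws_ec2.yml or aws_ec2.yaml)
---
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1                      # assumed region
filters:
  instance-state-name: running     # skip stopped or terminated instances
  "tag:Environment": production    # assumed tag
keyed_groups:
  - key: tags.Role                 # assumed tag; yields groups such as role_webserver
    prefix: role
hostnames:
  - private-ip-address
compose:
  ansible_host: private_ip_address
```

Running `ansible-inventory -i inventory/aws_ec2.yml --graph` should then print the groups the plugin discovered, which is a quick way to check the filters before pointing a playbook at them.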
3. Ansible Deployment Script
#!/bin/bash
# ansible-deploy.sh - deploy with Ansible
set -euo pipefail
ENVIRONMENT="${1:-dev}"
PLAYBOOK="${2:-site.yml}"
INVENTORY="inventory/hosts.ini"
LIMIT="${3:-all}"
echo "Deploying with Ansible: $PLAYBOOK"
echo "Environment: $ENVIRONMENT"
echo "Limit: $LIMIT"
# Syntax check
echo "Checking Ansible syntax..."
ansible-playbook --syntax-check \
  -i "$INVENTORY" \
  -e "environment=$ENVIRONMENT" \
  "$PLAYBOOK"
# Dry run
echo "Performing dry run..."
ansible-playbook \
  -i "$INVENTORY" \
  -e "environment=$ENVIRONMENT" \
  -l "$LIMIT" \
  --check \
  "$PLAYBOOK"
# Ask for confirmation
read -p "Continue with deployment? (y/n): " -r
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
  echo "Deployment cancelled"
  exit 1
fi
# Run the playbook
echo "Running playbook..."
ansible-playbook \
  -i "$INVENTORY" \
  -e "environment=$ENVIRONMENT" \
  -l "$LIMIT" \
  -v \
  "$PLAYBOOK"
echo "Deployment complete!"
# Run verification
echo "Running post-deployment verification..."
ansible-playbook \
  -i "$INVENTORY" \
  -e "environment=$ENVIRONMENT" \
  -l "$LIMIT" \
  verify.yml
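The script's last step runs verify.yml, which is not shown in this document. A minimal sketch of what such a playbook might contain, assuming the application listens on port 8080 and exposes /health as in site.yml; the task layout, timeouts, and retry counts are illustrative assumptions.

```yaml
# verify.yml - hypothetical post-deployment checks (not part of the original project)
---
- name: Verify deployed application
  hosts: webservers
  gather_facts: no
  tasks:
    - name: Wait for the application port to accept connections
      wait_for:
        port: 8080
        timeout: 60
    - name: Check the health endpoint from the host itself
      uri:
        url: "http://localhost:8080/health"
        status_code: 200
      register: health
      until: health.status == 200
      retries: 5
      delay: 5
```

Both tasks run on the managed host, so localhost here refers to each web server rather than the control node.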
4. Configuration Template
# roles/application/templates/.env.j2
# Environment configuration
NODE_ENV={{ environment }}
LOG_LEVEL={{ log_level }}
PORT=8080
# Database configuration
DATABASE_URL=postgresql://{{ db_user }}:{{ db_password }}@{{ db_host }}:5432/{{ db_name }}
DATABASE_POOL_SIZE=20
DATABASE_TIMEOUT=30000
# Cache configuration
REDIS_URL=redis://{{ redis_host }}:6379
CACHE_TTL=3600
# Application configuration
APP_NAME=MyApp
APP_VERSION={{ app_version }}
WORKERS={{ ansible_processor_vcpus }}
# API configuration
API_TIMEOUT=30000
API_RATE_LIMIT=1000
# Monitoring
SENTRY_DSN={{ sentry_dsn | default('') }}
DATADOG_API_KEY={{ datadog_api_key | default('') }}
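The template reads db_user, db_password, db_host, db_name, and redis_host, none of which are defined in the variable files shown earlier. One common arrangement, sketched here under assumed file paths and placeholder values, keeps plain settings in one file and secrets in a second file encrypted with ansible-vault.

```yaml
# inventory/group_vars/all/vars.yml - hypothetical plain-text settings
---
db_user: myapp                          # assumed value
db_name: myapp                          # assumed value
db_host: 10.0.2.10                      # db1 from inventory/hosts.ini
redis_host: 10.0.2.10                   # assumed; no Redis host appears in the inventory
db_password: "{{ vault_db_password }}"  # indirection keeps the variable name searchable

# inventory/group_vars/all/vault.yml - hypothetical secrets file
# Encrypt it with: ansible-vault encrypt inventory/group_vars/all/vault.yml
---
vault_db_password: "change-me"          # placeholder, replace before encrypting
```

With the vault file encrypted, playbook runs need the vault password, for example by adding `--ask-vault-pass` to the ansible-playbook command line.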
Ansible Commands
# List all hosts in the inventory
ansible all -i inventory/hosts.ini --list-hosts
# Run an ad-hoc command
ansible webservers -i inventory/hosts.ini -m ping
# Run a playbook
ansible-playbook -i inventory/hosts.ini site.yml
# Syntax check
ansible-playbook --syntax-check site.yml
# Dry run
ansible-playbook -i inventory/hosts.ini site.yml --check
# Run only tasks with a specific tag
ansible-playbook -i inventory/hosts.ini site.yml -t deploy
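The last command selects a tag named deploy, but the site.yml shown earlier only defines the always and verify tags. As a hypothetical excerpt, the application role could be tagged like this so that `-t deploy` has something to select.

```yaml
# Hypothetical excerpt of a playbook: tag the application role as "deploy"
---
- name: Deploy application stack
  hosts: webservers
  roles:
    - role: application
      tags: [deploy]   # every task in the role inherits this tag
```

Tasks tagged always still run alongside the selected tag, which is why the host information task in site.yml would appear in a `-t deploy` run as well.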
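Best Practices
✅ DO
- Use roles for modularity
- Implement proper error handling
- Use templates for configuration files
- Use handlers so services restart only when something changed
- Use serial execution for rolling updates
- Implement health checks
- Keep inventory in version control
- Protect sensitive data with Ansible Vault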
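The error-handling and health-check items above can be combined with block and rescue. The following is a minimal sketch, assuming the container and port from the application role; previous_app_version is a hypothetical variable that would have to be supplied by the caller.

```yaml
# Hypothetical task-file excerpt: deploy, then roll back if the health check fails
- name: Deploy with rollback on failure
  block:
    - name: Start the new container
      docker_container:
        name: myapp
        image: "myapp:{{ app_version }}"
        state: started
        ports:
          - "8080:8080"
    - name: Check the health endpoint
      uri:
        url: "http://localhost:8080/health"
        status_code: 200
      register: health
      until: health.status == 200
      retries: 3
      delay: 10
  rescue:
    - name: Roll back to the previous image
      docker_container:
        name: myapp
        image: "myapp:{{ previous_app_version }}"  # hypothetical variable
        state: started
        ports:
          - "8080:8080"
    - name: Mark the host as failed after the rollback
      fail:
        msg: "Deployment of {{ app_version }} failed on {{ inventory_hostname }}; rolled back."
```

Ending the rescue section with an explicit fail keeps the host marked as failed, so a play using serial: 1 stops instead of rolling the broken version out to the next host.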
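❌ DON’T
- Run command/shell tasks unconditionally
- Copy configuration files without using templates
- Skip check mode before a real run
- Mix environments in a single inventory
- Hardcode values
- Ignore error handling
- Use shell for simple tasks that a module can handle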
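The command/shell items above are easier to follow with an example. This sketch shows three common guards; the init-db script and its marker file are hypothetical, and the creates guard only helps if the script really writes that file.

```yaml
# Hypothetical task-file excerpt: keeping command tasks idempotent
- name: Initialise the application database once
  command: /home/appuser/app/bin/init-db         # hypothetical script
  args:
    creates: /home/appuser/app/.db_initialised   # task is skipped when this file exists
- name: Read the installed Docker version
  command: docker --version
  register: docker_version
  changed_when: false                            # read-only command never reports a change
- name: Create the log directory with a module instead of "shell: mkdir -p"
  file:
    path: /home/appuser/app/logs                 # hypothetical path
    state: directory
    owner: appuser
    group: appuser
    mode: '0755'
```

The file module already checks the current state before acting, so it reports a change only when the directory was missing or its ownership or mode differed.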