Name: fluxcd
Description: GitOps continuous delivery toolkit for Kubernetes with Flux CD. Use when implementing GitOps workflows, declarative deployments, Helm chart automation, Kustomize overlays, image update automation, multi-tenancy, or Git-based continuous delivery.
Triggers: flux, fluxcd, gitops, kustomization, helmrelease, gitrepository, helmrepository, imagerepository, imagepolicy, image automation, source controller, continuous delivery, kubernetes deployment automation, helm automation, kustomize automation, git sync, declarative deployment.
Allowed tools: Read, Grep, Glob, Edit, Write, Bash
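Flux CD GitOps Toolkit
Overview
Flux CD is a declarative GitOps continuous delivery solution for Kubernetes. It continuously reconciles the state of your cluster with the configuration stored in a Git repository.
When to use this skill:
- Implementing GitOps workflows for Kubernetes
- Automating Helm chart deployments and upgrades
- Managing Kustomize overlays across environments
- Automating container image updates from a registry
- Setting up multi-tenant Kubernetes with isolated teams
- Integrating Git-based continuous delivery pipelines
- Managing infrastructure and application dependencies
- Implementing progressive delivery with canary deployments
Core Architecture
Flux is composed of specialized controllers, each handling one aspect of GitOps:
Source Controller
- GitRepository: fetches artifacts from Git repositories
- HelmRepository: fetches Helm charts from chart repositories
- HelmChart: fetches charts from a GitRepository or HelmRepository source
- Bucket: fetches artifacts from S3-compatible storage
Kustomize Controller
- Kustomization: applies Kustomize overlays and manages reconciliation
- Supports dependency ordering and health checks
- Prunes resources that were removed from the source
Helm Controller
- HelmRelease: manages Helm chart installs and upgrades
- Supports automated remediation and tests
- Handles rollback on failure
Notification Controller
- Provider: defines a notification endpoint (Slack, MS Teams, etc.)
- Alert: sends alerts based on resource events
- Receiver: handles webhook notifications from external systems
Image Automation Controllers
- ImageRepository: scans a container registry for image metadata
- ImagePolicy: defines the rules for selecting an image tag
- ImageUpdateAutomation: updates the Git repository with new image tags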
Installation and Bootstrap
Prerequisites
# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash
# Or use Homebrew
brew install fluxcd/tap/flux
# Verify the installation
flux --version
Bootstrap with GitHub
# Export a GitHub personal access token
export GITHUB_TOKEN=<your-token>
# Bootstrap Flux
flux bootstrap github \
  --owner=<github-username> \
  --repository=<repo-name> \
  --branch=main \
  --path=clusters/production \
  --personal \
  --components-extra=image-reflector-controller,image-automation-controller
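Bootstrap with GitLab
export GITLAB_TOKEN=<your-token>
# The owner is a group here, so --personal (user-owned repository) is omitted
flux bootstrap gitlab \
  --owner=<gitlab-group> \
  --repository=<repo-name> \
  --branch=main \
  --path=clusters/production
Pre-commit Validation
Check your cluster and manifests before committing:
# Check prerequisites and the health of the installed Flux controllers
flux check
# Validate the manifests in a directory with a server-side dry run
kubectl apply --dry-run=server -f clusters/production/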
Repository Structure Best Practices
Standard Layout
├── clusters/
│   ├── production/
│   │   ├── flux-system/          # Flux components (managed by bootstrap)
│   │   ├── infrastructure.yaml   # Infrastructure sources and Kustomizations
│   │   └── apps.yaml             # Application sources and Kustomizations
│   └── staging/
│       ├── flux-system/
│       ├── infrastructure.yaml
│       └── apps.yaml
├── infrastructure/
│   ├── base/                     # Base infrastructure
│   │   ├── ingress-nginx/
│   │   ├── cert-manager/
│   │   └── sealed-secrets/
│   └── overlays/
│       ├── production/
│       └── staging/
└── apps/
    ├── base/
    │   ├── app1/
    │   └── app2/
    └── overlays/
        ├── production/
        └── staging/
Multi-tenant Layout
├── clusters/
│   └── production/
│       ├── flux-system/
│       ├── tenants/
│       │   ├── team-a.yaml       # Team A namespace and RBAC
│       │   └── team-b.yaml       # Team B namespace and RBAC
│       └── infrastructure.yaml
├── tenants/
│   ├── base/
│   │   ├── team-a/
│   │   │   ├── namespace.yaml
│   │   │   ├── rbac.yaml
│   │   │   └── sync.yaml         # The team's GitRepository + Kustomization
│   │   └── team-b/
│   │       ├── namespace.yaml
│   │       ├── rbac.yaml
│   │       └── sync.yaml
│   └── overlays/
│       └── production/
└── teams/                        # Separate repository or path per team
    ├── team-a-repo/
    └── team-b-repo/
GitRepository and Kustomization
Basic GitRepository
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: https://github.com/org/repo
  secretRef:
    name: flux-system
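GitRepository Limited to Specific Paths
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  ref:
    branch: main
  url: https://github.com/org/apps-repo
  ignore: |
    # Exclude everything at the top level
    /*
    # Re-include the parent directory first (gitignore rules cannot
    # re-include a path whose parent is still excluded)
    !/apps
    /apps/*
    # Include the specific path
    !/apps/production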
Basic Kustomization
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/production
  prune: true
  wait: true
  timeout: 5m0s
Kustomization with Dependencies
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infrastructure
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/production
  prune: true
  wait: true
  timeout: 5m0s
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: app-name
      namespace: app-namespace
  postBuild:
    substitute:
      cluster_name: production
      domain: example.com
    substituteFrom:
      - kind: ConfigMap
        name: cluster-vars
Variable Substitution
Create a ConfigMap for cluster-specific variables:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-vars
  namespace: flux-system
data:
  cluster_name: production
  cluster_region: us-east-1
  domain: example.com
Use the variables in your manifests:
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  cluster: ${cluster_name}
  region: ${cluster_region}
  url: https://app.${domain}
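When a variable may be undefined on some clusters, a default keeps the rendered manifest valid. The following is a minimal sketch, assuming the bash-style `${var:=default}` form accepted by Flux post-build substitution. The `legacy-script` ConfigMap is a hypothetical resource, included only to show how an object can opt out of substitution through the `kustomize.toolkit.fluxcd.io/substitute: disabled` annotation.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  cluster: ${cluster_name}
  # Falls back to us-east-1 when cluster_region is not defined
  region: ${cluster_region:=us-east-1}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: legacy-script          # hypothetical resource
  namespace: default
  annotations:
    # Leave every ${...} in this object untouched
    kustomize.toolkit.fluxcd.io/substitute: disabled
data:
  run.sh: |
    echo "Home is ${HOME}"
```

Opting out is mainly useful for embedded shell scripts, where `${...}` is meant for the shell rather than for Flux.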
Multi-tenancy Patterns
Namespace Isolation
Flux supports multi-tenant clusters in which each team has an isolated namespace with its own GitRepository sources and Kustomizations.
Tenant Bootstrap Pattern
# clusters/production/tenants/team-a.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-reconciler
  namespace: team-a
---
# A RoleBinding (not a ClusterRoleBinding) limits cluster-admin to the team-a namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-reconciler
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: team-a-reconciler
    namespace: team-a
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: team-a-repo
  namespace: team-a
spec:
  interval: 1m
  url: https://github.com/org/team-a-repo
  ref:
    branch: main
---
# spec.validation was dropped from the v1 API, so it is not set here
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-a-apps
  namespace: team-a
spec:
  interval: 10m
  serviceAccountName: team-a-reconciler
  sourceRef:
    kind: GitRepository
    name: team-a-repo
  path: ./apps
  prune: true
Tenant RBAC Restrictions
Restrict the tenant reconciler to its own namespace:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-reconciler
  namespace: team-a
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-reconciler
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: team-a-reconciler
subjects:
  - kind: ServiceAccount
    name: team-a-reconciler
    namespace: team-a
Cross-tenant Dependencies
Teams can depend on shared infrastructure while staying isolated:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-a-apps
  namespace: team-a
spec:
  interval: 10m
  dependsOn:
    - name: shared-ingress
      namespace: flux-system
    - name: shared-monitoring
      namespace: flux-system
  sourceRef:
    kind: GitRepository
    name: team-a-repo
  path: ./apps
  prune: true
Helm Integration
Flux integrates closely with Helm for chart-based deployments.
HelmRepository and HelmRelease
HelmRepository
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 1h0s
  url: https://charts.bitnami.com/bitnami
HelmRepository with Authentication
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: private-charts
  namespace: flux-system
spec:
  interval: 1h0s
  url: https://charts.example.com
  secretRef:
    name: helm-charts-auth
---
apiVersion: v1
kind: Secret
metadata:
  name: helm-charts-auth
  namespace: flux-system
type: Opaque
stringData:
  username: user
  password: pass
Basic HelmRelease
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: nginx-ingress
  namespace: ingress-nginx
spec:
  interval: 10m0s
  chart:
    spec:
      chart: ingress-nginx
      version: "4.8.x"
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
        namespace: flux-system
      interval: 1h0s
  values:
    controller:
      service:
        type: LoadBalancer
HelmRelease with valuesFrom
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  chart:
    spec:
      chart: my-app
      version: "1.0.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts
        namespace: flux-system
  values:
    replicas: 2
  valuesFrom:
    - kind: ConfigMap
      name: app-config
      valuesKey: values.yaml
    - kind: Secret
      name: app-secrets
      valuesKey: secrets.yaml
HelmRelease with Tests and Rollback
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  chart:
    spec:
      chart: my-app
      version: "1.0.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts
        namespace: flux-system
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
    cleanupOnFail: true
  test:
    enable: true
  rollback:
    cleanupOnFail: true
    recreate: true
  values:
    image:
      tag: v1.0.0
HelmRelease with Dependencies
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  dependsOn:
    - name: cert-manager
      namespace: cert-manager
    - name: nginx-ingress
      namespace: ingress-nginx
  chart:
    spec:
      chart: my-app
      version: "1.0.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts
        namespace: flux-system
  values:
    ingress:
      enabled: true
      className: nginx
Secret Management with SOPS
Install SOPS and Age
# Install SOPS
brew install sops
# Install Age
brew install age
# Generate an Age key
age-keygen -o age.agekey
# Print the public key to use in .sops.yaml
age-keygen -y age.agekey
Configure SOPS
Create .sops.yaml in the repository root:
creation_rules:
  - path_regex: .*/production/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p
  - path_regex: .*/staging/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p
Create an Encrypted Secret
# Create the Secret manifest
cat <<EOF > secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: apps
stringData:
  username: admin
  password: supersecret
EOF
# Encrypt it with SOPS
sops --encrypt --in-place secret.yaml
# Decrypt it to inspect the contents
sops --decrypt secret.yaml
Configure Flux for SOPS Decryption
Create a Secret that holds the Age private key:
cat age.agekey | kubectl create secret generic sops-age \
  --namespace=flux-system \
  --from-file=age.agekey=/dev/stdin
Configure the Kustomization to decrypt:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/production
  prune: true
  decryption:
    provider: sops
    secretRef:
      name: sops-age
SOPS with Multiple Keys
For team collaboration, list several Age recipients:
creation_rules:
  - path_regex: .*/production/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: >-
      age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p,
      age1zvkyg2lqzraa2lnjvqej32nkuu0ues2s82hzrye869xeexvn73equnujwj,
      age1penhr3v0pklzv6lqrvt3zyqhfvqffkjn5j2qhzc8xr7q8vpfck4q7n8k3f
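Image Automation
Flux can detect new container image versions and update the manifests in Git automatically.
Image Automation Architecture
The image automation workflow is built from three resources:
- ImageRepository - scans the container registry for available tags
- ImagePolicy - defines the tag selection rule (semver, alphabetical, or numerical, optionally filtered by a regular expression)
- ImageUpdateAutomation - commits the updated image tags back to Git
Image Automation Workflow
Container registry
  |
  | (scan tags)
  v
ImageRepository
  |
  | (filter and select)
  v
ImagePolicy
  |
  | (update manifests)
  v
ImageUpdateAutomation
  |
  | (commit to Git)
  v
GitRepository
  |
  | (reconcile)
  v
Kustomization
  |
  v
Kubernetes cluster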
ImageRepository
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  image: ghcr.io/org/my-app
  interval: 1m0s
ImageRepository with Authentication
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  image: registry.example.com/org/my-app
  interval: 1m0s
  secretRef:
    name: registry-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials
  namespace: flux-system
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <base64-encoded-docker-config>
ImagePolicy - Semantic Versioning
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    semver:
      range: 1.0.x
ImagePolicy - Alphabetical
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app-develop
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    alphabetical:
      order: asc
  filterTags:
    pattern: "^develop-[a-f0-9]+-(?P<ts>[0-9]+)"
    extract: "$ts"
ImagePolicy - Numerical
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app-build
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    numerical:
      order: asc
  filterTags:
    pattern: "^build-(?P<num>[0-9]+)"
    extract: "$num"
ImageUpdateAutomation
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Updated.Files -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $_ := .Updated.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
        {{ end -}}

        Images:
        {{ range .Updated.Images -}}
        - {{.}}
        {{ end -}}
  update:
    path: ./apps/production
    strategy: Setters
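Manifest with an Image Update Marker
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: apps
spec:
  template:
    spec:
      containers:
        - name: app
          image: ghcr.io/org/my-app:1.0.0 # {"$imagepolicy": "flux-system:my-app"}
Image Automation Best Practices
Environment strategy:
- Enable automation in development/staging environments first
- Use manual approval for production (PR-based workflow)
- Test policy rules before rolling them out
Tag strategy:
- Use semantic versioning for releases (e.g. 1.0.x, >=1.0.0)
- Use regular expressions for branch-based tags (e.g. ^develop-.*)
- Use numerical ordering for build numbers
Security:
- Scan images before deployment (integrate with CI)
- Use private registries with authentication
- Enable image signature verification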
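As an illustration of the tag strategy above, the sketch below selects the highest release at or above 1.0.0 instead of pinning to a single patch line. It assumes the `my-app` ImageRepository defined earlier; the policy name `my-app-releases` is hypothetical.

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app-releases        # hypothetical name
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    semver:
      # Any stable release from 1.0.0 upwards; pre-release tags are skipped
      range: ">=1.0.0"
```

An open-ended range like this also picks up new major versions, so a bounded range such as `>=1.0.0 <2.0.0` may be the safer choice for production.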
ImageUpdateAutomation with a Push Branch
For PR-based workflows:
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    push:
      branch: image-updates
    commit:
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
      messageTemplate: |
        Automated image update by Flux
        [ci skip]
  update:
    path: ./apps/production
    strategy: Setters
Notifications
Slack Provider
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: flux-notifications
  secretRef:
    name: slack-webhook-url
---
apiVersion: v1
kind: Secret
metadata:
  name: slack-webhook-url
  namespace: flux-system
stringData:
  address: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
Kustomization Failure Alert
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: kustomization-failures
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: "*"
  exclusionList:
    - ".*health check failed.*"
HelmRelease Event Alert
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: helm-releases
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: HelmRelease
      name: "*"
      namespace: "*"
  # summary is sent as plain text; the object name and namespace are already part of each event
  summary: "Helm release event"
Microsoft Teams Provider
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: msteams
  namespace: flux-system
spec:
  type: msteams
  secretRef:
    name: msteams-webhook-url
---
apiVersion: v1
kind: Secret
metadata:
  name: msteams-webhook-url
  namespace: flux-system
stringData:
  address: https://outlook.office.com/webhook/YOUR/WEBHOOK/URL
GitHub Webhook Receiver
apiVersion: notification.toolkit.fluxcd.io/v1
kind: Receiver
metadata:
  name: github-receiver
  namespace: flux-system
spec:
  type: github
  events:
    - "ping"
    - "push"
  secretRef:
    name: github-webhook-token
  resources:
    - kind: GitRepository
      name: flux-system
---
apiVersion: v1
kind: Secret
metadata:
  name: github-webhook-token
  namespace: flux-system
type: Opaque
stringData:
  token: <webhook-secret>
Multi-cluster Setup
Fleet Repository Structure
fleet-infra/
├── clusters/
│   ├── production/
│   │   ├── flux-system/
│   │   └── cluster-config.yaml
│   ├── staging/
│   │   ├── flux-system/
│   │   └── cluster-config.yaml
│   └── development/
│       ├── flux-system/
│       └── cluster-config.yaml
├── infrastructure/
│   ├── base/
│   └── overlays/
│       ├── production/
│       ├── staging/
│       └── development/
└── apps/
    ├── base/
    └── overlays/
        ├── production/
        ├── staging/
        └── development/
Cluster-specific Configuration
Production cluster (clusters/production/cluster-config.yaml):
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/overlays/production
  prune: true
  wait: true
  postBuild:
    substitute:
      cluster_name: production
      cluster_region: us-east-1
      replicas: "3"
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infrastructure
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/overlays/production
  prune: true
  postBuild:
    substitute:
      cluster_name: production
      domain: prod.example.com
Multi-cluster with Cluster API
Manage several clusters from one management cluster by pointing a Kustomization at a remote kubeconfig, such as the one Cluster API generates:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-staging
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./clusters/staging
  prune: true
  kubeConfig:
    secretRef:
      name: staging-kubeconfig
---
apiVersion: v1
kind: Secret
metadata:
  name: staging-kubeconfig
  namespace: flux-system
type: Opaque
data:
  value: <base64-encoded-kubeconfig>
Dependency Management
Infrastructure Layer Dependencies
# Base infrastructure
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds
  namespace: flux-system
spec:
  interval: 1h
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/crds
  prune: false # Never prune CRDs automatically
---
# Depends on the CRDs
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cert-manager
  namespace: flux-system
spec:
  interval: 10m
  dependsOn:
    - name: crds
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/cert-manager
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: cert-manager
      namespace: cert-manager
---
# Depends on cert-manager
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 10m
  dependsOn:
    - name: cert-manager
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/ingress-nginx
Application Dependencies
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: database
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/database
  healthChecks:
    - apiVersion: apps/v1
      kind: StatefulSet
      name: postgresql
      namespace: database
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: backend
  namespace: flux-system
spec:
  interval: 5m
  dependsOn:
    - name: database
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/backend
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: frontend
  namespace: flux-system
spec:
  interval: 5m
  dependsOn:
    - name: backend
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/frontend
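Best Practices
1. Resource Organization
- Separate concerns: keep infrastructure, applications, and cluster configuration in separate directories
- Use overlays: rely on Kustomize overlays for environment-specific configuration
- Namespace isolation: use separate namespaces for different teams or applications
2. Reconciliation Intervals
- Infrastructure: 1 hour (stable resources that rarely change)
- Applications: 10 minutes (balances responsiveness against API load)
- Development: 1 to 5 minutes (faster feedback during active development)
- Source repositories: 1 to 5 minutes (detect changes quickly)
3. Pruning Strategy
- Enable pruning: set prune: true on Kustomizations so that resources removed from Git are deleted from the cluster
- Exception for CRDs: set prune: false on CRD Kustomizations to prevent accidental deletion
- Test before production: exercise pruning in a non-production environment first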
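Pruning can also be switched off for a single object while the rest of the Kustomization keeps `prune: true`. The sketch below assumes the `kustomize.toolkit.fluxcd.io/prune: disabled` annotation; the PersistentVolumeClaim `app-data` is a hypothetical example of data that should outlive its manifest.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data               # hypothetical resource
  namespace: apps
  annotations:
    # Flux leaves this object in place even if it is removed from Git
    kustomize.toolkit.fluxcd.io/prune: disabled
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```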
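4. Health Checks
Always define health checks for critical resources:
spec:
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: critical-app
      namespace: apps
    - apiVersion: v1
      kind: Service
      name: critical-service
      namespace: apps
5. Suspending Reconciliation
Suspend reconciliation temporarily when needed:
# Suspend a Kustomization
flux suspend kustomization apps
# Resume reconciliation
flux resume kustomization apps
6. Forcing Reconciliation
Trigger an immediate reconciliation:
# Reconcile a specific Kustomization, fetching its source first
flux reconcile kustomization apps --with-source
# Reconcile a HelmRelease
flux reconcile helmrelease my-app -n apps
7. Monitoring and Debugging
# Check the status of the Flux components
flux check
# List all Flux resources
flux get all
# Show the status of a specific resource
flux get kustomization infrastructure
# View logs
flux logs --level=error --all-namespaces
# Export the current Flux objects from the cluster
flux export source git flux-system
flux export kustomization --all
8. Version Control
- Commit often: small, atomic commits are easier to debug
- Meaningful messages: describe what changed and why, not only what
- Branch protection: require reviews on the main/production branches
- Tag releases: use Git tags to track application versions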
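Git tags can drive deployments directly: instead of following a branch, a GitRepository can follow the newest tag in a semver range. This is a minimal sketch; the object name `apps-release` and the range are assumptions chosen for illustration.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: apps-release           # hypothetical name
  namespace: flux-system
spec:
  interval: 5m0s
  url: https://github.com/org/apps-repo
  ref:
    # Track the newest tag in the 1.x line rather than a branch
    semver: ">=1.0.0 <2.0.0"
```

With this in place, publishing a new tag is what promotes a change, which keeps the branch free for work that is not ready to ship.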
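9. Security
- Encrypt secrets: always use SOPS or an external secret manager
- RBAC: enforce strict RBAC policies for multi-tenancy
- Network policies: define network policies to isolate namespaces
- Image scanning: integrate container image scanning into CI/CD
- Policy enforcement: use tools such as OPA Gatekeeper or Kyverno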
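For the network policy item, a common starting point is to admit ingress only from pods in the same namespace. The sketch below uses the `team-a` namespace from the tenant examples; the policy name is hypothetical, and traffic from an ingress controller in another namespace would need an additional rule.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace   # hypothetical name
  namespace: team-a
spec:
  # Applies to every pod in the namespace
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        # An empty podSelector without a namespaceSelector matches
        # pods in this namespace only
        - podSelector: {}
```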
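10. Disaster Recovery
# Back up the Flux configuration
flux export source git --all > sources.yaml
# Include Helm sources so HelmReleases can resolve their charts after a restore
flux export source helm --all >> sources.yaml
flux export kustomization --all > kustomizations.yaml
flux export helmrelease --all > helmreleases.yaml
# Restore from the backup
kubectl apply -f sources.yaml
kubectl apply -f kustomizations.yaml
kubectl apply -f helmreleases.yaml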
Common Patterns
Progressive Delivery with Flagger
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: flagger
  namespace: flagger-system
spec:
  interval: 10m
  chart:
    spec:
      chart: flagger
      version: "1.x"
      sourceRef:
        kind: HelmRepository
        name: flagger
        namespace: flux-system
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
  namespace: apps
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
External Secrets Operator Integration
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: external-secrets
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/external-secrets
  prune: true
---
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secretsmanager
  namespace: apps
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secrets
  namespace: apps
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: app-secrets
    creationPolicy: Owner
  data:
    - secretKey: db-password
      remoteRef:
        key: prod/app/database
        property: password
Troubleshooting
Common Issues
Issue: Kustomization stuck "in progress"
# Check the Kustomization status
flux get kustomization infrastructure
# View detailed events
kubectl describe kustomization infrastructure -n flux-system
# Check the logs
kubectl logs -n flux-system deploy/kustomize-controller
Issue: HelmRelease fails to install
# Get the HelmRelease status
flux get helmrelease my-app -n apps
# View the Helm release history
helm history my-app -n apps
# Check the Helm controller logs
kubectl logs -n flux-system deploy/helm-controller
Issue: Image automation does not update manifests
# Check the ImageRepository status
flux get image repository my-app
# Check the ImagePolicy status
flux get image policy my-app
# View the image automation logs
kubectl logs -n flux-system deploy/image-reflector-controller
kubectl logs -n flux-system deploy/image-automation-controller
Issue: Source reconciliation fails
# Check the GitRepository status
flux get source git flux-system
# View the source controller logs
kubectl logs -n flux-system deploy/source-controller
# Reconcile manually
flux reconcile source git flux-system
Debug Mode
Enable debug logging:
# Patch the controller to enable debug logging
kubectl patch deployment kustomize-controller \
  -n flux-system \
  --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--log-level=debug"}]'
Performance Optimization
Reduce API Server Load
spec:
  interval: 1h        # Longer interval for stable resources
  retryInterval: 5m   # Retry less often after an error
Optimize Git Operations
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 5m
  ref:
    branch: main
  url: https://github.com/org/repo
  ignore: |
    # Keep these paths out of the artifact to reduce its size
    *.md
    docs/
    examples/
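Parallel Reconciliation
Raise the number of concurrent reconciliations by adding the --concurrent argument to each controller:
# On a bootstrapped cluster, put the same patch in flux-system/kustomization.yaml
# so that Flux does not revert it on the next reconciliation
for c in kustomize-controller helm-controller; do
  kubectl patch deployment "$c" -n flux-system --type='json' \
    -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--concurrent=10"}]'
done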
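Summary
Flux CD provides a powerful declarative approach to managing Kubernetes deployments through GitOps. Key takeaways:
- Bootstrap once: use flux bootstrap to set up Flux in the cluster
- Organize thoughtfully: structure your repository for clarity and maintainability
- Layer dependencies: build infrastructure before applications
- Protect secrets: use SOPS or an external secret manager
- Monitor actively: set up alerts and check Flux status regularly
- Automate carefully: use image automation in non-production environments first
- Multi-tenancy: use namespaces and RBAC to isolate teams
- Test changes: validate in lower environments before production
Key Decision Points
Choosing GitRepository vs HelmRepository:
- GitRepository: for custom manifests, Kustomize overlays, or Helm charts stored in Git
- HelmRepository: for public or private Helm chart repositories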
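When the chart lives in Git, a HelmRelease can reference a GitRepository and give the chart as a path. The sketch below assumes a chart at the hypothetical path `./charts/my-app` inside the `flux-system` repository.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  chart:
    spec:
      # Path to the chart directory inside the Git repository (hypothetical)
      chart: ./charts/my-app
      sourceRef:
        kind: GitRepository
        name: flux-system
        namespace: flux-system
      # Rebuild the chart on every new Git revision, not only when
      # the version in Chart.yaml changes
      reconcileStrategy: Revision
```

`reconcileStrategy: Revision` suits charts that change without a version bump; the default `ChartVersion` is the better fit when every chart change is released with a new version.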
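Choosing Kustomization vs HelmRelease:
- Kustomization: for raw manifests, ConfigMaps, Secrets, and Kustomize overlays
- HelmRelease: for packaged Helm charts customized through values
Image automation strategy:
- Direct commits: development/staging environments with fast iteration
- PR workflow: production environments that require review and approval
- Disabled: mission-critical production with manual deployment gates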
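The "disabled" option can be expressed declaratively by suspending the automation object rather than deleting it. A minimal sketch, reusing the `my-app` ImageUpdateAutomation from the image automation section and assuming its `spec.suspend` field:

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: my-app
  namespace: flux-system
spec:
  # No commits are made while this is true; set it to false to resume
  suspend: true
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
  update:
    path: ./apps/production
    strategy: Setters
```

Keeping the object in Git with `suspend: true` records the decision in the repository history, and the ImageRepository and ImagePolicy keep reporting the latest tag for manual promotion.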
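Multi-tenancy approach:
- Namespace isolation: teams share a cluster and are separated by namespace
- Cluster isolation: each team gets a dedicated cluster
- Hybrid: core teams share a cluster, external teams are isolated
Secret management:
- SOPS: Git-native, age/PGP encryption, well suited to small teams
- External Secrets Operator: integrates with AWS Secrets Manager, Vault, and GCP Secret Manager
- Sealed Secrets: Kubernetes-native, one-way encryption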
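For comparison with the SOPS and External Secrets examples, a Sealed Secrets manifest looks roughly like the sketch below. It assumes the Sealed Secrets controller is installed in the cluster; the ciphertext is a placeholder for the output of `kubeseal`.

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: app-secrets
  namespace: apps
spec:
  encryptedData:
    # Placeholder: produced by kubeseal with the controller's public key
    password: <ciphertext-produced-by-kubeseal>
  template:
    metadata:
      name: app-secrets
      namespace: apps
```

Only the in-cluster controller holds the private key, so this file is safe to commit and Flux applies it like any other manifest, with no decryption settings on the Kustomization.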
By following these patterns and practices, you can build reliable, automated deployment pipelines that scale with your organization.