Flux CD GitOps Toolkit Skill: fluxcd

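Flux CD is a GitOps continuous delivery toolkit for Kubernetes that automates deployments, manages configuration, and enables declarative delivery. It suits GitOps workflows, Helm chart automation, Kustomize overlays, image update automation, and multi-tenancy. Keywords: Flux CD, GitOps, Kubernetes, continuous delivery, automated deployment, cloud native, CI/CD.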

CI/CD, updated 3/24/2026

Name: fluxcd
Description: GitOps continuous delivery toolkit for Kubernetes, built on Flux CD. Use it when implementing GitOps workflows, declarative deployments, Helm chart automation, Kustomize overlays, image update automation, multi-tenancy, or Git-based continuous delivery.
Triggers: flux, fluxcd, gitops, kustomization, helmrelease, gitrepository, helmrepository, imagerepository, imagepolicy, image automation, source controller, continuous delivery, kubernetes deployment automation, helm automation, kustomize automation, git sync, declarative deployment.
Allowed tools: Read, Grep, Glob, Edit, Write, Bash

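Flux CD GitOps Toolkit

Overview

Flux CD is a declarative GitOps continuous delivery solution for Kubernetes. It continuously reconciles the state of your Kubernetes cluster with the configuration stored in a Git repository.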

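When to use this skill:

  • Implementing GitOps workflows for Kubernetes
  • Automating Helm chart deployments and upgrades
  • Managing Kustomize overlays across environments
  • Automating container image updates from a registry
  • Setting up multi-tenant Kubernetes with isolated teams
  • Integrating Git-based continuous delivery pipelines
  • Managing infrastructure and application dependencies
  • Implementing progressive delivery with canary deployments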

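Core architecture

Flux is composed of specialized controllers, each handling one aspect of GitOps:

Source controller

  • GitRepository: fetches artifacts from a Git repository
  • HelmRepository: fetches the index of a Helm chart repository
  • HelmChart: fetches a chart from a GitRepository or HelmRepository source
  • Bucket: fetches artifacts from S3-compatible storage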

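Kustomize controller

  • Kustomization: applies Kustomize overlays and manages reconciliation
  • Supports dependency ordering and health checks
  • Prunes resources that were removed from the source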

Helm controller

  • HelmRelease: manages Helm chart installs and upgrades
  • Supports automatic remediation and Helm tests
  • Handles rollback on failure

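Notification controller

  • Provider: defines a notification endpoint (Slack, MS Teams, and so on)
  • Alert: sends alerts based on resource events
  • Receiver: handles incoming webhooks from external systems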

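Image automation controllers

  • ImageRepository: scans a container registry for image metadata (image-reflector-controller)
  • ImagePolicy: defines the rules for selecting an image tag (image-reflector-controller)
  • ImageUpdateAutomation: writes new image tags back to the Git repository (image-automation-controller)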

Installation and bootstrap

Prerequisites

# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash

# Or use Homebrew
brew install fluxcd/tap/flux

# Verify the installation
flux --version

Bootstrapping with GitHub

# Export a GitHub personal access token
export GITHUB_TOKEN=<your-token>

# Bootstrap Flux
flux bootstrap github \
  --owner=<github-username> \
  --repository=<repo-name> \
  --branch=main \
  --path=clusters/production \
  --personal \
  --components-extra=image-reflector-controller,image-automation-controller

Bootstrapping with GitLab

export GITLAB_TOKEN=<your-token>

flux bootstrap gitlab \
  --owner=<gitlab-group> \
  --repository=<repo-name> \
  --branch=main \
  --path=clusters/production

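Pre-commit validation

Check the cluster and your manifests before committing:

# Check cluster prerequisites and Flux component health (this does not validate manifest files)
flux check

# Validate the manifests with a server-side dry run
kubectl apply --dry-run=server -f clusters/production/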

Repository structure best practices

Standard layout

├── clusters/
│   ├── production/
│   │   ├── flux-system/           # Flux components (managed by bootstrap)
│   │   ├── infrastructure.yaml    # Infrastructure sources and Kustomizations
│   │   └── apps.yaml              # Application sources and Kustomizations
│   └── staging/
│       ├── flux-system/
│       ├── infrastructure.yaml
│       └── apps.yaml
├── infrastructure/
│   ├── base/                      # Base infrastructure
│   │   ├── ingress-nginx/
│   │   ├── cert-manager/
│   │   └── sealed-secrets/
│   └── overlays/
│       ├── production/
│       └── staging/
└── apps/
    ├── base/
    │   ├── app1/
    │   └── app2/
    └── overlays/
        ├── production/
        └── staging/

Multi-tenant layout

├── clusters/
│   └── production/
│       ├── flux-system/
│       ├── tenants/
│       │   ├── team-a.yaml        # Team A namespace and RBAC
│       │   └── team-b.yaml        # Team B namespace and RBAC
│       └── infrastructure.yaml
├── tenants/
│   ├── base/
│   │   ├── team-a/
│   │   │   ├── namespace.yaml
│   │   │   ├── rbac.yaml
│   │   │   └── sync.yaml          # The team's GitRepository + Kustomization
│   │   └── team-b/
│   │       ├── namespace.yaml
│   │       ├── rbac.yaml
│   │       └── sync.yaml
│   └── overlays/
│       └── production/
└── teams/                         # A separate repository or path per team
    ├── team-a-repo/
    └── team-b-repo/

GitRepository and Kustomization

Basic GitRepository

apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: https://github.com/org/repo
  secretRef:
    name: flux-system

GitRepository restricted to specific paths

apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m0s
  ref:
    branch: main
  url: https://github.com/org/apps-repo
  ignore: |
    # Exclude everything
    /*
    # Re-include specific paths
    !/apps/production/

Basic Kustomization

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/production
  prune: true
  wait: true
  timeout: 5m0s

Kustomization with dependencies

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infrastructure
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/production
  prune: true
  # wait: true is left out here: when wait is enabled, Flux ignores the healthChecks list below
  timeout: 5m0s
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: app-name
      namespace: app-namespace
  postBuild:
    substitute:
      cluster_name: production
      domain: example.com
    substituteFrom:
      - kind: ConfigMap
        name: cluster-vars

Variable substitution

Create a ConfigMap that holds the cluster-specific variables:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-vars
  namespace: flux-system
data:
  cluster_name: production
  cluster_region: us-east-1
  domain: example.com

Reference the variables in a manifest:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  cluster: ${cluster_name}
  region: ${cluster_region}
  url: https://app.${domain}
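
To make the substitution step concrete, here is a minimal Python sketch of how `${var}` placeholders resolve against the variables from the ConfigMap. It is an illustration only, not Flux's implementation: the `substitute` helper and its regex are hypothetical, the sketch replaces unknown names with an empty string, and it leaves out the bash-style forms such as `${var:=default}` that Flux also accepts.

```python
import re

# Matches ${name}. Bash-style forms like ${name:=default} are not handled here.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(text: str, variables: dict[str, str]) -> str:
    """Replace each ${name} with its value; unknown names become empty strings."""
    return _VAR.sub(lambda m: variables.get(m.group(1), ""), text)


# Values copied from the cluster-vars ConfigMap above.
cluster_vars = {
    "cluster_name": "production",
    "cluster_region": "us-east-1",
    "domain": "example.com",
}

manifest = (
    "cluster: ${cluster_name}\n"
    "region: ${cluster_region}\n"
    "url: https://app.${domain}"
)

print(substitute(manifest, cluster_vars))
```

Because the substitution is purely textual, a typo in a variable name does not fail loudly in this sketch; it simply renders an empty value, which is worth checking for in review.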

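Multi-tenancy patterns

Namespace isolation

Flux supports multi-tenant clusters in which each team has an isolated namespace with its own GitRepository sources and Kustomizations.

Tenant bootstrap pattern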

# clusters/production/tenants/team-a.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-reconciler
  namespace: team-a
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-reconciler
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: team-a-reconciler
    namespace: team-a
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: team-a-repo
  namespace: team-a
spec:
  interval: 1m
  url: https://github.com/org/team-a-repo
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-a-apps
  namespace: team-a
spec:
  interval: 10m
  serviceAccountName: team-a-reconciler
  sourceRef:
    kind: GitRepository
    name: team-a-repo
  path: ./apps
  prune: true

Tenant RBAC restrictions

Restrict the tenant reconciler to its own namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-reconciler
  namespace: team-a
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-reconciler
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: team-a-reconciler
subjects:
  - kind: ServiceAccount
    name: team-a-reconciler
    namespace: team-a

Cross-tenant dependencies

Teams can depend on shared infrastructure while staying isolated:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-a-apps
  namespace: team-a
spec:
  interval: 10m
  dependsOn:
    - name: shared-ingress
      namespace: flux-system
    - name: shared-monitoring
      namespace: flux-system
  sourceRef:
    kind: GitRepository
    name: team-a-repo
  path: ./apps
  prune: true

Helm integration

Flux integrates closely with Helm for chart-based deployments.

HelmRepository and HelmRelease

HelmRepository

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 1h0s
  url: https://charts.bitnami.com/bitnami

HelmRepository with authentication

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: private-charts
  namespace: flux-system
spec:
  interval: 1h0s
  url: https://charts.example.com
  secretRef:
    name: helm-charts-auth
---
apiVersion: v1
kind: Secret
metadata:
  name: helm-charts-auth
  namespace: flux-system
type: Opaque
stringData:
  username: user
  password: pass

Basic HelmRelease

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: nginx-ingress
  namespace: ingress-nginx
spec:
  interval: 10m0s
  chart:
    spec:
      chart: ingress-nginx
      version: "4.8.x"
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
        namespace: flux-system
      interval: 1h0s
  values:
    controller:
      service:
        type: LoadBalancer

HelmRelease with valuesFrom

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  chart:
    spec:
      chart: my-app
      version: "1.0.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts
        namespace: flux-system
  values:
    replicas: 2
  valuesFrom:
    - kind: ConfigMap
      name: app-config
      valuesKey: values.yaml
    - kind: Secret
      name: app-secrets
      valuesKey: secrets.yaml
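
The precedence between valuesFrom and values is easiest to see as a merge. The sketch below assumes that valuesFrom entries are merged in the order listed and that the inline values are applied last, so they win on conflicts. `deep_merge`, `compose_values`, and the sample dictionaries are hypothetical names used only for illustration.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base updated with override; nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def compose_values(values_from: list[dict], inline: dict) -> dict:
    """Merge valuesFrom sources in order, then apply the inline values last."""
    result: dict = {}
    for source in values_from:
        result = deep_merge(result, source)
    return deep_merge(result, inline)


# Hypothetical contents of the app-config ConfigMap and app-secrets Secret.
config_map_values = {"replicas": 1, "image": {"tag": "v1.0.0"}}
secret_values = {"db": {"password": "example-only"}}

final = compose_values([config_map_values, secret_values], {"replicas": 2})
print(final)
```

In this sketch the inline `replicas: 2` overrides the ConfigMap's value, while keys that appear in only one source pass through unchanged.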

HelmRelease with tests and rollback

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  chart:
    spec:
      chart: my-app
      version: "1.0.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts
        namespace: flux-system
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
    cleanupOnFail: true
  test:
    enable: true
  rollback:
    cleanupOnFail: true
    recreate: true
  values:
    image:
      tag: v1.0.0

HelmRelease with dependencies

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-app
  namespace: apps
spec:
  interval: 10m0s
  dependsOn:
    - name: cert-manager
      namespace: cert-manager
    - name: nginx-ingress
      namespace: ingress-nginx
  chart:
    spec:
      chart: my-app
      version: "1.0.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts
        namespace: flux-system
  values:
    ingress:
      enabled: true
      className: nginx

Secret management with SOPS

Installing SOPS and Age

# Install SOPS
brew install sops

# Install Age
brew install age

# Generate an Age key
age-keygen -o age.agekey

# Print the public key to use in .sops.yaml
age-keygen -y age.agekey

Configuring SOPS

Create .sops.yaml in the repository root:

creation_rules:
  - path_regex: .*/production/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p
  - path_regex: .*/staging/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p
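
The following Python sketch shows how these rules are meant to be read, under two assumptions about SOPS behavior: the first creation rule whose path_regex matches a file is the one applied, and encrypted_regex is matched against key names to decide which values get encrypted. The helper names `rule_for` and `should_encrypt` and the recipient labels are hypothetical, and Python's `re` stands in for the regex engine SOPS uses.

```python
import re

# Mirrors the two creation_rules above; the recipient strings are placeholders.
creation_rules = [
    {"path_regex": r".*/production/.*\.yaml", "recipient": "production-age-key"},
    {"path_regex": r".*/staging/.*\.yaml", "recipient": "staging-age-key"},
]
ENCRYPTED_REGEX = re.compile(r"^(data|stringData)$")


def rule_for(path: str):
    """Return the first rule whose path_regex matches the path, or None."""
    for rule in creation_rules:
        if re.search(rule["path_regex"], path):
            return rule
    return None


def should_encrypt(key: str) -> bool:
    """Values under keys matching encrypted_regex are encrypted; others stay readable."""
    return ENCRYPTED_REGEX.search(key) is not None


print(rule_for("apps/production/secret.yaml"))
print([k for k in ("apiVersion", "kind", "metadata", "stringData") if should_encrypt(k)])
```

Restricting encryption to data and stringData keeps the Secret's kind, name, and namespace in plain text, so the manifest remains reviewable in Git.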

Creating an encrypted Secret

# Create the Secret manifest
cat <<EOF > secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: apps
stringData:
  username: admin
  password: supersecret
EOF

# Encrypt with SOPS
sops --encrypt --in-place secret.yaml

# Decrypt to inspect
sops --decrypt secret.yaml

Configuring Flux for SOPS decryption

Create a Secret from the Age private key:

cat age.agekey | kubectl create secret generic sops-age \
  --namespace=flux-system \
  --from-file=age.agekey=/dev/stdin

Configure the Kustomization to decrypt:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/production
  prune: true
  decryption:
    provider: sops
    secretRef:
      name: sops-age

SOPS with multiple keys

For team collaboration, list multiple Age recipients:

creation_rules:
  - path_regex: .*/production/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: >-
      age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p,
      age1zvkyg2lqzraa2lnjvqej32nkuu0ues2s82hzrye869xeexvn73equnujwj,
      age1penhr3v0pklzv6lqrvt3zyqhfvqffkjn5j2qhzc8xr7q8vpfck4q7n8k3f

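Image automation

Flux can detect new container image versions and update the manifests in Git automatically.

Image automation architecture

The image automation workflow consists of three resources:

  1. ImageRepository - scans a container registry for available tags
  2. ImagePolicy - defines the tag selection rule (semver, alphabetical, or numerical, optionally with a regex tag filter)
  3. ImageUpdateAutomation - commits the updated image tags back to Git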

Image automation workflow

Container registry
       |
       | (scan tags)
       v
ImageRepository
       |
       | (filter and select)
       v
  ImagePolicy
       |
       | (update manifests)
       v
ImageUpdateAutomation
       |
       | (commit to Git)
       v
   GitRepository
       |
       | (reconcile)
       v
  Kustomization
       |
       v
   Kubernetes cluster

ImageRepository

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  image: ghcr.io/org/my-app
  interval: 1m0s

ImageRepository with authentication

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  image: registry.example.com/org/my-app
  interval: 1m0s
  secretRef:
    name: registry-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials
  namespace: flux-system
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <base64-encoded-docker-config>

ImagePolicy - semver

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    semver:
      range: 1.0.x

ImagePolicy - alphabetical

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app-develop
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    alphabetical:
      order: asc
  filterTags:
    pattern: "^develop-[a-f0-9]+-(?P<ts>[0-9]+)"
    extract: "$ts"

ImagePolicy - numerical

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: my-app-build
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: my-app
  policy:
    numerical:
      order: asc
  filterTags:
    pattern: "^build-(?P<num>[0-9]+)"
    extract: "$num"
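
A small Python sketch can show what the numerical policy above is expected to pick. It assumes that filterTags.pattern discards non-matching tags, that the extracted group is compared as an integer, and that order: asc selects the highest number while the full original tag is what gets used. `select_numerical` and the sample tags are hypothetical, and Python's `re` stands in for Flux's regex engine.

```python
import re


def select_numerical(tags, pattern, group, order="asc"):
    """Pick a tag by the integer captured in `group`; asc selects the highest."""
    regex = re.compile(pattern)
    candidates = []
    for tag in tags:
        match = regex.search(tag)
        if match:
            # Compare on the extracted number, but remember the full tag.
            candidates.append((int(match.group(group)), tag))
    if not candidates:
        return None
    pick = max if order == "asc" else min
    return pick(candidates)[1]


tags = ["build-9", "build-10", "build-2", "latest"]
print(select_numerical(tags, r"^build-(?P<num>[0-9]+)", "num"))
```

Comparing as integers is the point of this policy: sorted as plain strings, build-9 would rank above build-10.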

ImageUpdateAutomation

apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Updated.Files -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $_ := .Updated.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
        {{ end -}}

        Images:
        {{ range .Updated.Images -}}
        - {{.}}
        {{ end -}}
  update:
    path: ./apps/production
    strategy: Setters

Manifest with an image update marker

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: apps
spec:
  template:
    spec:
      containers:
        - name: app
          image: ghcr.io/org/my-app:1.0.0 # {"$imagepolicy": "flux-system:my-app"}
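
The marker comment is what ties a manifest line to an ImagePolicy. The Python sketch below illustrates the idea with a line-based rewrite: any `image:` line carrying a `$imagepolicy` marker gets its value replaced with the image that the named policy selected. This is a simplification for illustration, not how the image-automation-controller is implemented; `apply_setters`, the regex, and the tag `1.0.3` are hypothetical.

```python
import re

# An "image:" line followed by a {"$imagepolicy": "<namespace>:<name>"} marker comment.
MARKER = re.compile(
    r'^(?P<head>\s*(?:-\s*)?image:\s*)(?P<ref>\S+)'
    r'(?P<tail>\s*#\s*\{"\$imagepolicy":\s*"(?P<policy>[^"]+)"\}\s*)$'
)


def apply_setters(manifest: str, latest: dict[str, str]) -> str:
    """Rewrite marked image lines to the image selected by the referenced policy."""
    out = []
    for line in manifest.splitlines():
        match = MARKER.match(line)
        if match and match.group("policy") in latest:
            new_ref = latest[match.group("policy")]
            line = f'{match.group("head")}{new_ref}{match.group("tail")}'
        out.append(line)
    return "\n".join(out)


line = '          image: ghcr.io/org/my-app:1.0.0 # {"$imagepolicy": "flux-system:my-app"}'
print(apply_setters(line, {"flux-system:my-app": "ghcr.io/org/my-app:1.0.3"}))
```

Lines without a marker are left untouched, which is why unmarked images never change no matter what the policies select.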

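Image automation best practices

Environment strategy:

  • Enable automation in development/staging environments first
  • Use manual approval for production (PR-based workflow)
  • Test policy rules before rolling them out

Tag strategy:

  • Use semver ranges for releases (e.g. 1.0.x, >=1.0.0)
  • Use a filterTags regex for branch-based tags (e.g. ^develop-.*)
  • Use the numerical policy for build numbers

Security:

  • Scan images before deployment (integrate with CI)
  • Use private registries with authentication
  • Enable image signature verification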

ImageUpdateAutomation with a push branch

For PR-based workflows:

apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    push:
      branch: image-updates
    commit:
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
      messageTemplate: |
        Automated image update by Flux

        [ci skip]
  update:
    path: ./apps/production
    strategy: Setters

Notifications

Slack provider

apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: flux-notifications
  secretRef:
    name: slack-webhook-url
---
apiVersion: v1
kind: Secret
metadata:
  name: slack-webhook-url
  namespace: flux-system
stringData:
  address: https://hooks.slack.com/services/YOUR/WEBHOOK/URL

Alert on Kustomization failures

apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: kustomization-failures
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: "*"
  exclusionList:
    - ".*health check failed.*"
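
The filtering this Alert performs can be sketched in a few lines of Python. The sketch assumes that eventSeverity: error forwards only error events while info forwards everything, and that each exclusionList entry is a regular expression tested against the event message. `should_alert` and the sample events are hypothetical.

```python
import re

# info admits every event; error admits only errors.
SEVERITY_RANK = {"info": 0, "error": 1}


def should_alert(event: dict, min_severity: str, exclusion_list: list[str]) -> bool:
    """Return True if the event passes the severity gate and no exclusion matches."""
    if SEVERITY_RANK[event["severity"]] < SEVERITY_RANK[min_severity]:
        return False
    return not any(re.search(pattern, event["message"]) for pattern in exclusion_list)


exclusions = [".*health check failed.*"]
events = [
    {"severity": "info", "message": "Reconciliation finished"},
    {"severity": "error", "message": "health check failed after 5m"},
    {"severity": "error", "message": "kustomize build failed"},
]

for event in events:
    print(event["message"], "->", should_alert(event, "error", exclusions))
```

Only the build failure would reach Slack here: the info event is below the severity threshold, and the health check failure is suppressed by the exclusion pattern.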

Alert on HelmRelease events

apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: helm-releases
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: HelmRelease
      name: "*"
      namespace: apps
  summary: "Helm release events in the apps namespace"

Microsoft Teams provider

apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: msteams
  namespace: flux-system
spec:
  type: msteams
  secretRef:
    name: msteams-webhook-url
---
apiVersion: v1
kind: Secret
metadata:
  name: msteams-webhook-url
  namespace: flux-system
stringData:
  address: https://outlook.office.com/webhook/YOUR/WEBHOOK/URL

GitHub webhook receiver

apiVersion: notification.toolkit.fluxcd.io/v1
kind: Receiver
metadata:
  name: github-receiver
  namespace: flux-system
spec:
  type: github
  events:
    - "ping"
    - "push"
  secretRef:
    name: github-webhook-token
  resources:
    - kind: GitRepository
      name: flux-system
---
apiVersion: v1
kind: Secret
metadata:
  name: github-webhook-token
  namespace: flux-system
type: Opaque
stringData:
  token: <webhook-secret>

Multi-cluster setup

Fleet repository structure

fleet-infra/
├── clusters/
│   ├── production/
│   │   ├── flux-system/
│   │   └── cluster-config.yaml
│   ├── staging/
│   │   ├── flux-system/
│   │   └── cluster-config.yaml
│   └── development/
│       ├── flux-system/
│       └── cluster-config.yaml
├── infrastructure/
│   ├── base/
│   └── overlays/
│       ├── production/
│       ├── staging/
│       └── development/
└── apps/
    ├── base/
    └── overlays/
        ├── production/
        ├── staging/
        └── development/

Cluster-specific configuration

Production cluster (clusters/production/cluster-config.yaml):

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/overlays/production
  prune: true
  wait: true
  postBuild:
    substitute:
      cluster_name: production
      cluster_region: us-east-1
      replicas: "3"
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infrastructure
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/overlays/production
  prune: true
  postBuild:
    substitute:
      cluster_name: production
      domain: prod.example.com

Multi-cluster with Cluster API

Apply manifests to a remote cluster, for example one provisioned by Cluster API, through a kubeconfig Secret:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-staging
  namespace: flux-system
spec:
  interval: 10m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./clusters/staging
  prune: true
  kubeConfig:
    secretRef:
      name: staging-kubeconfig
---
apiVersion: v1
kind: Secret
metadata:
  name: staging-kubeconfig
  namespace: flux-system
type: Opaque
data:
  value: <base64-encoded-kubeconfig>

Dependency management

Infrastructure layer dependencies

# Base infrastructure
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: crds
  namespace: flux-system
spec:
  interval: 1h
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/crds
  prune: false # Never prune CRDs automatically
---
# Depends on the CRDs
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cert-manager
  namespace: flux-system
spec:
  interval: 10m
  dependsOn:
    - name: crds
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/cert-manager
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: cert-manager
      namespace: cert-manager
---
# Depends on cert-manager
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 10m
  dependsOn:
    - name: cert-manager
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/ingress-nginx

Application dependencies

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: database
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/database
  healthChecks:
    - apiVersion: apps/v1
      kind: StatefulSet
      name: postgresql
      namespace: database
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: backend
  namespace: flux-system
spec:
  interval: 5m
  dependsOn:
    - name: database
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/backend
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: frontend
  namespace: flux-system
spec:
  interval: 5m
  dependsOn:
    - name: backend
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/frontend
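
The rollout order implied by these dependsOn links can be checked with a short Python sketch. Flux itself does not compute a global plan; each Kustomization simply waits until the ones it depends on are ready. The sketch only derives the order that results, using the standard library's `graphlib` (Python 3.9+). The `depends_on` mapping mirrors the three manifests above.

```python
from graphlib import CycleError, TopologicalSorter

# Each key lists the Kustomizations it depends on, as in the manifests above.
depends_on = {
    "database": [],
    "backend": ["database"],
    "frontend": ["backend"],
}

# static_order() yields dependencies before the resources that need them.
order = list(TopologicalSorter(depends_on).static_order())
print(order)

# A cycle would leave every member waiting on another forever; detect it early.
try:
    list(TopologicalSorter({"a": ["b"], "b": ["a"]}).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print("cycle detected:", has_cycle)
```

Running a check like this in CI is one way to catch circular dependsOn references before they reach a cluster.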

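Best practices

1. Resource organization

  • Separate concerns: keep infrastructure, applications, and cluster configuration in separate directories
  • Use overlays: rely on Kustomize overlays for environment-specific configuration
  • Namespace isolation: use separate namespaces for different teams or applications

2. Reconciliation intervals

  • Infrastructure: 1 hour (stable resources that rarely change)
  • Applications: 10 minutes (balances responsiveness against API load)
  • Development: 1 to 5 minutes (faster feedback during active development)
  • Source repositories: 1 to 5 minutes (detect changes quickly)

3. Pruning strategy

  • Enable pruning: set prune: true on Kustomizations so deleted resources are cleaned up
  • CRD exception: set prune: false on CRD Kustomizations to prevent accidental deletion
  • Test before production: try pruning in a non-production environment first

4. Health checks

Always define health checks for critical resources: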

spec:
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: critical-app
      namespace: apps
    - apiVersion: v1
      kind: Service
      name: critical-service
      namespace: apps

5. Suspending reconciliation

Pause reconciliation temporarily when needed:

# Suspend a Kustomization
flux suspend kustomization apps

# Resume reconciliation
flux resume kustomization apps

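6. Forcing reconciliation

Trigger an immediate reconciliation:

# Reconcile a specific Kustomization, refreshing its source first
flux reconcile kustomization apps --with-source

# Reconcile a HelmRelease
flux reconcile helmrelease my-app -n apps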

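7. Monitoring and debugging

# Check the status of the Flux components
flux check

# List all Flux resources
flux get all

# Show the status of a specific resource
flux get kustomization infrastructure

# View logs
flux logs --level=error --all-namespaces

# Export the current Flux resource definitions
flux export source git flux-system
flux export kustomization --all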

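8. Version control

  • Commit often: small, atomic commits are easier to debug
  • Meaningful messages: describe what changed and why, not just what
  • Branch protection: require reviews on main/production branches
  • Tag releases: use Git tags to track application versions

9. Security

  • Encrypt secrets: always use SOPS or an external secret manager
  • RBAC: enforce strict RBAC policies for multi-tenancy
  • Network policies: define network policies for namespace isolation
  • Image scanning: integrate container image scanning into CI/CD
  • Policy enforcement: use tools such as OPA Gatekeeper or Kyverno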

10. Disaster recovery

# Back up the Flux configuration
flux export source git --all > sources.yaml
flux export kustomization --all > kustomizations.yaml
flux export helmrelease --all > helmreleases.yaml

# Restore from the backup
kubectl apply -f sources.yaml
kubectl apply -f kustomizations.yaml
kubectl apply -f helmreleases.yaml

Common patterns

Progressive delivery with Flagger

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: flagger
  namespace: flagger-system
spec:
  interval: 10m
  chart:
    spec:
      chart: flagger
      version: "1.x"
      sourceRef:
        kind: HelmRepository
        name: flagger
        namespace: flux-system
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
  namespace: apps
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m

External Secrets Operator integration

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: external-secrets
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/external-secrets
  prune: true
---
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secretsmanager
  namespace: apps
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secrets
  namespace: apps
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: SecretStore
  target:
    name: app-secrets
    creationPolicy: Owner
  data:
    - secretKey: db-password
      remoteRef:
        key: prod/app/database
        property: password

Troubleshooting

Common issues

Issue: a Kustomization is stuck in an in-progress state

# Check the Kustomization status
flux get kustomization infrastructure

# View detailed events
kubectl describe kustomization infrastructure -n flux-system

# Check the logs
kubectl logs -n flux-system deploy/kustomize-controller

Issue: a HelmRelease fails to install

# Get the HelmRelease status
flux get helmrelease my-app -n apps

# View the Helm release history
helm history my-app -n apps

# Check the Helm controller logs
kubectl logs -n flux-system deploy/helm-controller

Issue: image automation does not update the manifests

# Check the ImageRepository status
flux get image repository my-app

# Check the ImagePolicy status
flux get image policy my-app

# View the image automation logs
kubectl logs -n flux-system deploy/image-reflector-controller
kubectl logs -n flux-system deploy/image-automation-controller

Issue: source reconciliation fails

# Check the GitRepository status
flux get source git flux-system

# View the source controller logs
kubectl logs -n flux-system deploy/source-controller

# Reconcile manually
flux reconcile source git flux-system

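Debug mode

Enable debug logging. On a bootstrapped cluster, Flux may revert a manual patch like this one the next time flux-system reconciles, so commit the change to Git if it needs to persist:

# Patch the controller to enable debug logging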
kubectl patch deployment kustomize-controller \
  -n flux-system \
  --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--log-level=debug"}]'

Performance optimization

Reducing API server load

spec:
  interval: 1h # Longer interval for stable resources
  retryInterval: 5m # Retry less often after a failure

Optimizing Git operations

apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 5m
  ref:
    branch: main
  url: https://github.com/org/repo
  ignore: |
    # Exclude files that Flux does not need from the artifact
    *.md
    docs/
    examples/

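Parallel reconciliation

Raise the number of concurrent reconciliations by patching the controller Deployments in flux-system/kustomization.yaml (flux install has no concurrency flags):

patches:
  - target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller)"
    patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --concurrent=10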

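Summary

Flux CD offers a powerful, declarative way to manage Kubernetes deployments through GitOps. Key takeaways:

  1. Bootstrap once: use flux bootstrap to set up Flux in the cluster
  2. Organize deliberately: structure your repository for clarity and maintainability
  3. Layer dependencies: build infrastructure before applications
  4. Protect secrets: use SOPS or an external secret manager
  5. Monitor actively: set up alerts and check Flux status regularly
  6. Automate with care: start image automation in non-production environments
  7. Multi-tenancy: use namespaces and RBAC to isolate teams
  8. Test changes: validate in lower environments before production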

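Key decision points

Choosing GitRepository vs HelmRepository:

  • GitRepository: for custom manifests, Kustomize overlays, or Helm charts stored in Git
  • HelmRepository: for public or private Helm chart repositories

Choosing Kustomization vs HelmRelease:

  • Kustomization: for plain manifests, ConfigMaps, Secrets, and Kustomize overlays
  • HelmRelease: for packaged Helm charts with customized values

Image automation strategy:

  • Direct commit: development/staging environments with fast iteration
  • PR workflow: production environments that require review and approval
  • Disabled: mission-critical production with manual deployment gates

Multi-tenancy approach:

  • Namespace isolation: teams share a cluster and are separated by namespace
  • Cluster isolation: each team gets a dedicated cluster
  • Hybrid: core teams share a cluster, external teams are isolated

Secret management:

  • SOPS: Git-native, age/pgp encryption, a good fit for small teams
  • External Secrets Operator: integrates with AWS Secrets Manager, Vault, and GCP Secret Manager
  • Sealed Secrets: Kubernetes-native, asymmetric encryption that only the in-cluster controller can decrypt

By following these patterns and practices, you can build reliable, automated deployment pipelines that scale with your organization.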