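Python Observability Patterns Skill python-observability-patterns

This skill covers observability for Python applications across the three pillars: structured logging (structlog), application metrics (Prometheus), and distributed tracing (OpenTelemetry). It provides code patterns and best practices for log aggregation, performance-metric collection, and request tracing in production, aimed at reliable, maintainable backend services and microservices. Keywords: Python monitoring, logging, application performance management, microservice observability, Prometheus, OpenTelemetry, structlog.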

DevOps · 0 installs · 1 view · Updated 2/28/2026

name: python-observability-patterns
description: "Observability patterns for Python applications. Triggers on: logging, metrics, tracing, OpenTelemetry, Prometheus, observability, monitoring, structlog, correlation ID."
compatibility: "Python 3.10+. Requires structlog, opentelemetry-api, prometheus-client."
allowed-tools: "Read Write"
depends-on: [python-async-patterns]
related-skills: [python-fastapi-patterns, python-cli-patterns]

Python Observability Patterns

Logging, metrics, and tracing for production applications.

Structured logging with structlog

import logging  # needed for logging.INFO below

import structlog

# Configure structlog once at application startup
structlog.configure(
    processors=[
        # Merge values bound with structlog.contextvars.bind_contextvars()
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
    # Drop anything below INFO before it reaches the processors
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)

logger = structlog.get_logger()

# Usage
logger.info("user_created", user_id=123, email="test@example.com")
# Example output (key order and timestamp precision may differ):
# {"event": "user_created", "user_id": 123, "email": "test@example.com", "level": "info", "timestamp": "2024-01-15T10:00:00Z"}

Request context propagation

import structlog
from contextvars import ContextVar
from uuid import uuid4

from fastapi import FastAPI

app = FastAPI()  # reuse your existing app instance if you already have one

request_id_var: ContextVar[str] = ContextVar("request_id", default="")

def bind_request_context(request_id: str | None = None) -> str:
    """Bind a request ID to the logging context and return it."""
    rid = request_id or str(uuid4())
    request_id_var.set(rid)
    structlog.contextvars.bind_contextvars(request_id=rid)
    return rid

# FastAPI middleware
@app.middleware("http")
async def request_context_middleware(request, call_next):
    # Start from a clean context, then reuse the caller's ID or generate one
    structlog.contextvars.clear_contextvars()
    request_id = bind_request_context(request.headers.get("X-Request-ID"))
    try:
        response = await call_next(request)
    finally:
        # Clear the bound values even when the handler raises
        structlog.contextvars.clear_contextvars()
    response.headers["X-Request-ID"] = request_id
    return response
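
The pattern above depends on each request seeing only its own context values. The following is a minimal standard-library sketch of that property, not part of the skill itself: `handle_request` and `main` are hypothetical names, and the ID format is made up for illustration. It runs two simulated requests concurrently and checks that each reads back the ID it set, because asyncio runs every task in its own copy of the current context.

```python
import asyncio
from contextvars import ContextVar
from uuid import uuid4

# Stand-in for the request_id_var used in the middleware example.
request_id_var: ContextVar[str] = ContextVar("request_id", default="")


async def handle_request(name: str, seen: dict[str, tuple[str, str]]) -> None:
    """Simulate one request: bind an ID, yield to the event loop, read it back."""
    rid = f"{name}-{uuid4().hex[:8]}"
    request_id_var.set(rid)
    await asyncio.sleep(0)  # give the other task a chance to set its own ID
    seen[name] = (rid, request_id_var.get())


async def main() -> dict[str, tuple[str, str]]:
    seen: dict[str, tuple[str, str]] = {}
    # gather() wraps each coroutine in a task, and each task gets a context copy.
    await asyncio.gather(handle_request("a", seen), handle_request("b", seen))
    return seen


results = asyncio.run(main())
for name, (bound, read_back) in results.items():
    print(name, bound == read_back)
```

Values set inside a task do not flow back to the code that spawned it, which is why the middleware binds the ID before calling the handler rather than inside it.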

Prometheus metrics

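import time  # needed for time.perf_counter() below

from prometheus_client import (
    CONTENT_TYPE_LATEST,
    Counter,
    Gauge,
    Histogram,
    generate_latest,
)
from fastapi import FastAPI, Response

app = FastAPI()  # reuse your existing app instance if you already have one

# Define metrics
REQUEST_COUNT = Counter(
    "http_requests_total",
    "Total number of HTTP requests",
    ["method", "endpoint", "status"]
)

REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request latency in seconds",
    ["method", "endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
)

ACTIVE_CONNECTIONS = Gauge(
    "active_connections",
    "Number of active connections"
)

# Middleware that records metrics
@app.middleware("http")
async def metrics_middleware(request, call_next):
    ACTIVE_CONNECTIONS.inc()
    start = time.perf_counter()

    try:
        response = await call_next(request)
    finally:
        # Decrement even if the handler raises, so the gauge cannot drift upward
        ACTIVE_CONNECTIONS.dec()

    duration = time.perf_counter() - start
    # Caution: raw paths such as /users/123 produce one label value per ID.
    # Prefer the route template where available to keep label cardinality bounded.
    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=request.url.path,
        status=response.status_code
    ).inc()
    REQUEST_LATENCY.labels(
        method=request.method,
        endpoint=request.url.path
    ).observe(duration)

    return response

# Metrics endpoint
@app.get("/metrics")
async def metrics():
    return Response(
        content=generate_latest(),
        # Full Prometheus exposition content type, including the format version
        media_type=CONTENT_TYPE_LATEST
    )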
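
To make the `buckets` choice above more concrete, here is a plain-Python sketch of how Prometheus histogram buckets are counted: each bucket is cumulative and counts every observation less than or equal to its upper bound, with an implicit `+Inf` bucket at the end. This is an illustration only, not `prometheus_client`'s implementation; `cumulative_bucket_counts` and the sample durations are hypothetical.

```python
from bisect import bisect_left

# Same upper bounds (in seconds) as REQUEST_LATENCY in the example above.
BUCKETS = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]


def cumulative_bucket_counts(durations, bounds=BUCKETS):
    """Return {upper_bound: observations <= upper_bound}, including +Inf."""
    per_bucket = [0] * (len(bounds) + 1)  # the last slot is the +Inf bucket
    for d in durations:
        # Index of the first bound that is >= d (bounds are inclusive).
        per_bucket[bisect_left(bounds, d)] += 1

    counts = {}
    running = 0
    for bound, n in zip([*bounds, float("inf")], per_bucket):
        running += n
        counts[bound] = running
    return counts


# Made-up request durations in seconds.
observed = [0.004, 0.03, 0.05, 0.2, 0.7, 7.0]
counts = cumulative_bucket_counts(observed)
for bound, count in counts.items():
    print(f"le={bound}: {count}")
```

The 7.0 s request only shows up in the `+Inf` bucket, so with these bounds any quantile estimate above 5 s is unreliable. Pick bounds that bracket the latencies you expect to alert on.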

OpenTelemetry tracing

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Setup
provider = TracerProvider()
# insecure=True sends plaintext gRPC, assuming a local collector without TLS;
# drop it and use a TLS endpoint in production.
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# Manual instrumentation
# validate() and charge() are placeholders for your own coroutines.
async def process_order(order_id: int):
    with tracer.start_as_current_span("process_order") as span:
        span.set_attribute("order_id", order_id)

        # Spans opened inside the block become children of process_order
        with tracer.start_as_current_span("validate_order"):
            await validate(order_id)

        with tracer.start_as_current_span("charge_payment"):
            await charge(order_id)
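Quick reference

Library              Purpose
structlog            Structured logging
prometheus-client    Metrics collection
opentelemetry        Distributed tracing

Metric type    Use cases
Counter        Total requests, error counts
Histogram      Latency, sizes
Gauge          Current connections, queue size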

Additional resources

  • ./references/structured-logging.md - structlog configuration, formatters
  • ./references/metrics.md - Prometheus patterns, custom metrics
  • ./references/tracing.md - OpenTelemetry, distributed tracing

Assets

  • ./assets/logging-config.py - production logging configuration

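See also

Prerequisites:

  • python-async-patterns - async context propagation

Related skills:

  • python-fastapi-patterns - API middleware for metrics and tracing
  • python-cli-patterns - CLI logging patterns

Integration skills:

  • python-database-patterns - database query tracing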