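A/B Test Design Skill

Overview

A specialized skill for statistical experiment design and analysis. It enables product teams to design rigorous experiments, calculate sample sizes, and interpret results with statistical confidence.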

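Capabilities

Experiment Design

Calculate the sample size an experiment requires
Design experiment variants and hypotheses
Define success metrics and guardrail metrics
Create experiment documentation templates
Design multi-variant tests (A/B/n)
Plan sequential and Bayesian experiments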
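
The sample size calculation listed above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: it assumes a two-sided test on a conversion rate, treats mde as a relative lift over the baseline, and introduces a power parameter (default 0.8) that does not appear in the input schema. The function names are hypothetical.

```javascript
// Standard normal CDF using the Abramowitz-Stegun erf approximation (7.1.26).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Inverse normal CDF by bisection: dependency-free and accurate enough here.
function normalQuantile(p) {
  let lo = -10;
  let hi = 10;
  for (let i = 0; i < 100; i++) {
    const mid = (lo + hi) / 2;
    if (normalCdf(mid) < p) lo = mid; else hi = mid;
  }
  return (lo + hi) / 2;
}

// Per-variant sample size for a two-sided test comparing two proportions.
// ASSUMPTION: mde is a relative lift (0.10 means +10% over baseline).
// ASSUMPTION: power is an extra parameter, not part of the input schema.
function sampleSizePerVariant({ baseline, mde, confidenceLevel = 0.95, power = 0.8 }) {
  const p1 = baseline;
  const p2 = baseline * (1 + mde);
  const zAlpha = normalQuantile(1 - (1 - confidenceLevel) / 2);
  const zBeta = normalQuantile(power);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p2 - p1) ** 2);
}

const n = sampleSizePerVariant({ baseline: 0.05, mde: 0.10 });
console.log(`users needed per variant: ${n}`);
```

With a 5% baseline and a 10% relative lift, this works out to roughly 31,000 users per variant. In practice a statistics library would normally replace the hand-rolled normal quantile.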

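Statistical Analysis

Verify the statistical significance of results
Calculate practical significance and effect sizes
Detect interaction effects and segment-level effects
Perform power analysis
Calculate confidence intervals
Apply multiple-comparison corrections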
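
As a rough sketch of the significance, confidence interval, and multiple-comparison items above, the following uses a two-proportion z-test, a Wald interval on the difference, and the Holm step-down correction. These are common choices, not necessarily the methods the skill applies. The function names are hypothetical and the counts in the usage lines are made up for illustration.

```javascript
// Standard normal CDF using the Abramowitz-Stegun erf approximation (7.1.26).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Two-sided z-test for a difference in conversion rates.
// zCritical defaults to the 95% two-sided critical value.
function twoProportionTest(control, treatment, zCritical = 1.959964) {
  const p1 = control.conversions / control.users;
  const p2 = treatment.conversions / treatment.users;
  const diff = p2 - p1;
  // Pooled standard error for the hypothesis test (H0: both rates are equal).
  const pooled = (control.conversions + treatment.conversions)
    / (control.users + treatment.users);
  const sePooled = Math.sqrt(
    pooled * (1 - pooled) * (1 / control.users + 1 / treatment.users));
  const z = diff / sePooled;
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));
  // Unpooled standard error for the Wald interval on the absolute difference.
  const se = Math.sqrt(
    (p1 * (1 - p1)) / control.users + (p2 * (1 - p2)) / treatment.users);
  return {
    diff,
    relativeLift: diff / p1,
    z,
    pValue,
    ci: [diff - zCritical * se, diff + zCritical * se],
  };
}

// Holm step-down correction: returns adjusted p-values in the input order.
function holmAdjust(pValues) {
  const m = pValues.length;
  const order = pValues.map((p, i) => [p, i]).sort((a, b) => a[0] - b[0]);
  const adjusted = new Array(m);
  let runningMax = 0;
  order.forEach(([p, i], rank) => {
    runningMax = Math.max(runningMax, Math.min(1, (m - rank) * p));
    adjusted[i] = runningMax;
  });
  return adjusted;
}

// Illustrative counts only, not real experiment data.
const result = twoProportionTest(
  { users: 20000, conversions: 1000 },
  { users: 20000, conversions: 1100 });
console.log(result.pValue.toFixed(3), result.ci);
console.log(holmAdjust([0.01, 0.04, 0.03]));
```

The Holm correction matters for A/B/n tests, where each treatment is compared against control and the unadjusted p-values would overstate significance.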

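Decision Support

Recommend ship / iterate / kill decisions
Identify segment-specific impacts
Assess long-term versus short-term effects
Generate experiment reports
Track experiment velocity metrics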
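
One way a ship / iterate / kill recommendation could be derived is sketched below. The rule, the field names, and the minPracticalEffect threshold are all hypothetical and only illustrate the idea of combining statistical significance, practical significance, and guardrail checks. The skill's real decision criteria are not specified in this document.

```javascript
// HYPOTHETICAL decision rule; field names and thresholds are illustrative.
// diff: observed absolute lift in the primary metric
// ci: [lower, upper] confidence interval on that lift
// minPracticalEffect: smallest lift worth shipping
// regressedGuardrails: names of guardrail metrics that got worse
function recommendDecision({ diff, ci, minPracticalEffect, regressedGuardrails }) {
  const [lower, upper] = ci;
  // The whole interval is below zero: the change is credibly harmful.
  if (upper < 0) return 'kill';
  // Significant, large enough to matter, and no guardrail regressed.
  const significant = lower > 0;
  const practical = diff >= minPracticalEffect;
  if (significant && practical && regressedGuardrails.length === 0) return 'ship';
  // Everything else: inconclusive, too small, or a guardrail needs attention.
  return 'iterate';
}

const decision = recommendDecision({
  diff: 0.005,
  ci: [0.0006, 0.0094],
  minPracticalEffect: 0.002,
  regressedGuardrails: [],
});
console.log(decision);
```

Checking the harmful case first keeps a regressed primary metric from being masked by the other conditions.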

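Target Processes

This skill integrates with the following processes:

product-market-fit.js - validation experiments for product-market fit hypotheses
conversion-funnel-analysis.js - conversion funnel optimization experiments
beta-program.js - A/B testing during the beta phase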

Input Schema

{
  "type": "object",
  "properties": {
    "experimentType": {
      "type": "string",
      "enum": ["ab", "multivariate", "sequential", "bandit"],
      "description": "The type of experiment to design"
    },
    "hypothesis": {
      "type": "string",
      "description": "The hypothesis to test"
    },
    "primaryMetric": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "baseline": { "type": "number" },
        "mde": { "type": "number", "description": "Minimum detectable effect" }
      }
    },
    "guardrailMetrics": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Metrics that must not regress"
    },
    "trafficAllocation": {
      "type": "number",
      "description": "Percentage of traffic allocated to the experiment"
    },
    "confidenceLevel": {
      "type": "number",
      "default": 0.95,
      "description": "Statistical confidence level"
    }
  },
  "required": ["experimentType", "hypothesis", "primaryMetric"]
}
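
The constraints in the input schema can be checked by hand before calling the skill. The sketch below mirrors only what the schema states (required fields, the enum, basic types, and the confidenceLevel default). The validateInput helper is hypothetical, and a real integration would more likely use a JSON Schema validator library.

```javascript
const EXPERIMENT_TYPES = ['ab', 'multivariate', 'sequential', 'bandit'];

// HYPOTHETICAL helper: checks an input object against the schema's rules
// and applies the documented default for confidenceLevel.
function validateInput(input) {
  const errors = [];
  for (const key of ['experimentType', 'hypothesis', 'primaryMetric']) {
    if (input[key] === undefined) errors.push(`missing required field: ${key}`);
  }
  if (input.experimentType !== undefined
      && !EXPERIMENT_TYPES.includes(input.experimentType)) {
    errors.push(`experimentType must be one of: ${EXPERIMENT_TYPES.join(', ')}`);
  }
  if (input.hypothesis !== undefined && typeof input.hypothesis !== 'string') {
    errors.push('hypothesis must be a string');
  }
  const metric = input.primaryMetric;
  if (metric !== undefined) {
    if (typeof metric !== 'object' || metric === null || Array.isArray(metric)) {
      errors.push('primaryMetric must be an object');
    } else {
      if (metric.name !== undefined && typeof metric.name !== 'string') {
        errors.push('primaryMetric.name must be a string');
      }
      for (const key of ['baseline', 'mde']) {
        if (metric[key] !== undefined && typeof metric[key] !== 'number') {
          errors.push(`primaryMetric.${key} must be a number`);
        }
      }
    }
  }
  if (input.guardrailMetrics !== undefined
      && !(Array.isArray(input.guardrailMetrics)
        && input.guardrailMetrics.every((m) => typeof m === 'string'))) {
    errors.push('guardrailMetrics must be an array of strings');
  }
  for (const key of ['trafficAllocation', 'confidenceLevel']) {
    if (input[key] !== undefined && typeof input[key] !== 'number') {
      errors.push(`${key} must be a number`);
    }
  }
  const value = { ...input, confidenceLevel: input.confidenceLevel ?? 0.95 };
  return { valid: errors.length === 0, errors, value };
}

const checked = validateInput({
  experimentType: 'ab',
  hypothesis: 'Social proof increases pricing page conversion',
  primaryMetric: { name: 'pricing_page_conversion', baseline: 0.05, mde: 0.10 },
});
console.log(checked.valid, checked.value.confidenceLevel);
```

The schema does not state value ranges (for example, whether trafficAllocation is bounded to 0-100), so this sketch does not enforce any.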

Output Schema

{
  "type": "object",
  "properties": {
    "experimentPlan": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "hypothesis": { "type": "string" },
        "variants": { "type": "array", "items": { "type": "object" } },
        "sampleSize": { "type": "number" },
        "duration": { "type": "string" },
        "metrics": { "type": "object" }
      }
    },
    "powerAnalysis": {
      "type": "object",
      "properties": {
        "requiredSampleSize": { "type": "number" },
        "estimatedDuration": { "type": "string" },
        "power": { "type": "number" }
      }
    },
    "implementation": {
      "type": "object",
      "properties": {
        "trackingEvents": { "type": "array", "items": { "type": "string" } },
        "segmentation": { "type": "array", "items": { "type": "string" } },
        "rolloutPlan": { "type": "string" }
      }
    },
    "analysisFramework": {
      "type": "object",
      "properties": {
        "primaryAnalysis": { "type": "string" },
        "secondaryAnalyses": { "type": "array", "items": { "type": "string" } },
        "decisionCriteria": { "type": "object" }
      }
    }
  }
}
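
The schema does not say how estimatedDuration is computed. A plausible sketch, under stated assumptions, divides the total required sample by the daily traffic that enters the experiment. Here dailyVisitors and variantCount are hypothetical inputs that are not part of either schema, the sample size is taken to be per variant, and trafficAllocation is read as a percentage from 0 to 100.

```javascript
// HYPOTHETICAL duration estimate; not the skill's documented method.
// ASSUMPTION: sampleSizePerVariant counts users needed in each variant.
// ASSUMPTION: trafficAllocation is a percentage (0-100) of daily visitors.
function estimateDurationDays({
  sampleSizePerVariant,
  variantCount,
  dailyVisitors,
  trafficAllocation,
  roundToWeeks = false,
}) {
  const dailyInExperiment = dailyVisitors * (trafficAllocation / 100);
  const totalNeeded = sampleSizePerVariant * variantCount;
  const days = Math.ceil(totalNeeded / dailyInExperiment);
  // Running whole weeks is a common way to average out day-of-week effects.
  return roundToWeeks ? Math.ceil(days / 7) * 7 : days;
}

const plan = {
  sampleSizePerVariant: 31231,
  variantCount: 2,
  dailyVisitors: 10000,
  trafficAllocation: 50,
};
console.log(estimateDurationDays(plan));
console.log(estimateDurationDays({ ...plan, roundToWeeks: true }));
```

With these illustrative numbers, 62,462 users are needed at 5,000 per day, which gives 13 days, or 14 when rounded up to whole weeks.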

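Usage Example

// Design a two-variant test for the pricing page.
const experimentDesign = await executeSkill('ab-test-design', {
  experimentType: 'ab',
  hypothesis: 'Adding social proof to the pricing page will increase the conversion rate by 10%',
  primaryMetric: {
    name: 'pricing_page_conversion',
    baseline: 0.05, // current conversion rate (5%)
    mde: 0.10       // 10% lift, matching the hypothesis (read here as relative to baseline)
  },
  guardrailMetrics: ['revenue_per_visitor', 'bounce_rate'],
  trafficAllocation: 50, // percentage of traffic in the experiment
  confidenceLevel: 0.95
});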

Dependencies

A statistics library for power analysis
Experimentation platform integration (Optimizely, LaunchDarkly, etc.)