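name: mongodb-schema-design
version: "2.1.0"
description: Master MongoDB schema design and data modeling patterns. Learn embedding versus referencing, relationships, normalization, and schema evolution. Use when designing databases, normalizing data, or optimizing queries.
sasmp_version: "1.3.0"
bonded_agent: 03-mongodb-data-modeling
bond_type: PRIMARY_BOND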

Production-Grade Skill Configuration

capabilities:
  - Embedding strategies
  - Referencing strategies
  - Relationship modeling
  - Schema patterns
  - Evolution planning

input_validation:
  required_context:
    - Use-case domain
    - Access patterns
  optional_context:
    - Data volume estimates
    - Update frequency
    - Query requirements

output_format:
  schema_design: object
  collection_definitions: array
  relationship_diagram: string
  validation_rules: object
  evolution_strategy: string

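error_handling:
  common_errors:
    - code: SCHEMA001
      condition: "Unbounded array growth"
      recovery: "Use references instead of embedding for one-to-many relationships"
    - code: SCHEMA002
      condition: "Document exceeds 16MB"
      recovery: "Split the document, use GridFS, or reference external data"
    - code: SCHEMA003
      condition: "Denormalization update anomalies"
      recovery: "Implement a synchronization mechanism or reduce denormalization"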

prerequisites:
  mongodb_version: "4.0+"
  required_knowledge:
    - Document model
    - CRUD operations
  design_inputs:
    - "List of access patterns"
    - "Expected data volume"

testing:
  unit_test_template: |
    // Validate the schema design
    const doc = { /* sample document */ }
    const validation = await db.command({ collMod: 'collection', validator: jsonSchema })
    expect(validation.ok).toBe(1)

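MongoDB Schema Design

Master data modeling and schema patterns.

Quick Start

One-to-One: Embedded

// User with a single address - embed when the two are always accessed together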
{
  _id: ObjectId('...'),
  name: 'John',
  email: 'john@example.com',
  address: {
    street: '123 Main St',
    city: 'New York',
    zip: '10001'
  }
}

One-to-Many: Embedded Array

// User with multiple tags and posts - embed when the array size is bounded
{
  _id: ObjectId('...'),
  name: 'John',
  tags: ['mongodb', 'database', 'nosql'],
  posts: [
    { _id: 1, title: 'Post 1', content: '...' },
    { _id: 2, title: 'Post 2', content: '...' }
  ]
}
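Embedding stays safe only while the array is bounded. One way to enforce a cap is to combine `$push` with the `$each` and `$slice` modifiers. The sketch below only builds the update document; the helper name `capped_push_update` and the cap of 50 are assumptions made for illustration.

```python
def capped_push_update(field, item, max_items):
    """Build an update that appends `item` to `field` and keeps only the
    newest `max_items` elements (hypothetical helper, not a driver API)."""
    if max_items <= 0:
        raise ValueError("max_items must be positive")
    return {
        "$push": {
            field: {
                "$each": [item],       # $slice is only valid alongside $each
                "$slice": -max_items,  # a negative value keeps the last N elements
            }
        }
    }


new_post = {"_id": 3, "title": "Post 3", "content": "..."}
update = capped_push_update("posts", new_post, 50)
```

With PyMongo, the result could be passed to `users.update_one({"_id": user_id}, update)`. If older posts must be kept, moving them to their own collection is the safer choice.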

One-to-Many: References

// User with many orders - use references when the set may grow large
{
  _id: ObjectId('user1'),
  name: 'John',
  email: 'john@example.com'
}

// Orders collection
{
  _id: ObjectId('order1'),
  customerId: ObjectId('user1'),
  total: 99.99
}

Many-to-Many: Array of References

// Product with categories
{
  _id: ObjectId('product1'),
  name: 'Laptop',
  categoryIds: [
    ObjectId('electronics'),
    ObjectId('computers')
  ]
}

// Categories collection
{
  _id: ObjectId('electronics'),
  name: 'Electronics'
}
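A reference array can be traversed in both directions. The sketch below builds the two queries as plain Python structures: a filter that finds products in a category (an equality match against an array field matches any element), and a `$lookup` pipeline that resolves a product's `categoryIds`. The collection name `categories` and both helper names are assumptions.

```python
def products_in_category_filter(category_id):
    """Filter for products whose categoryIds array contains category_id."""
    return {"categoryIds": category_id}


def product_with_categories_pipeline(product_id, categories_collection="categories"):
    """Aggregation pipeline that loads one product plus its category documents."""
    return [
        {"$match": {"_id": product_id}},
        {
            "$lookup": {
                "from": categories_collection,
                "localField": "categoryIds",  # array field: each element is matched
                "foreignField": "_id",
                "as": "categories",
            }
        },
    ]


pipeline = product_with_categories_pipeline("product1")
```

An index on `categoryIds` (a multikey index) would likely be needed for the reverse query to stay fast on large collections.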

Schema Patterns

Attribute Pattern

// Store variable attributes flexibly
{
  _id: ObjectId('...'),
  productName: 'T-Shirt',
  attributes: [
    { key: 'color', value: 'blue' },
    { key: 'size', value: 'L' },
    { key: 'material', value: 'cotton' }
  ]
}
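The main benefit of this layout is that one compound index can serve every attribute. A minimal sketch, assuming the field names from the example above; `ATTRIBUTE_INDEX` and `attribute_filter` are hypothetical names.

```python
# Index specification in the (field, direction) form PyMongo's create_index accepts.
ATTRIBUTE_INDEX = [("attributes.key", 1), ("attributes.value", 1)]


def attribute_filter(key, value):
    """Match documents where a single array element has both key and value.

    $elemMatch matters here: without it, key and value could match
    different elements of the attributes array.
    """
    return {"attributes": {"$elemMatch": {"key": key, "value": value}}}


blue_items = attribute_filter("color", "blue")
```

With PyMongo this might be used as `products.create_index(ATTRIBUTE_INDEX)` followed by `products.find(blue_items)`, where `products` is an assumed collection handle.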

Polymorphic Pattern

// Different document types in the same collection
{
  _id: ObjectId('...'),
  type: 'email',
  to: 'user@example.com',
  subject: 'Hello'
}

{
  _id: ObjectId('...'),
  type: 'sms',
  phoneNumber: '+1234567890',
  message: 'Hi there'
}
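Because each `type` carries its own fields, application code usually dispatches on the discriminator before writing. The sketch below checks required fields per type in plain Python; treating every field shown above as required is an assumption, as is the helper name `missing_fields`.

```python
# Required fields per document type (assumed from the two examples above).
REQUIRED_FIELDS = {
    "email": {"to", "subject"},
    "sms": {"phoneNumber", "message"},
}


def missing_fields(doc):
    """Return the sorted list of required fields absent from doc."""
    doc_type = doc.get("type")
    if doc_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown notification type: {doc_type!r}")
    return sorted(REQUIRED_FIELDS[doc_type] - set(doc))


incomplete_sms = {"type": "sms", "phoneNumber": "+1234567890"}
problems = missing_fields(incomplete_sms)
```

The same rules could also be enforced server-side with a `$jsonSchema` validator that uses `oneOf`, one branch per type.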

Tree Structure: Adjacency List

// Parent-child relationship
{
  _id: ObjectId('...'),
  name: 'Electronics',
  parent: null
}

{
  _id: ObjectId('...'),
  name: 'Computers',
  parent: ObjectId('electronics')
}
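With an adjacency list, finding direct children is a simple query on `parent`, but finding all descendants requires recursion. The `$graphLookup` aggregation stage can do that server-side. The sketch below builds such a pipeline; the collection name `categories`, the output field names, and the helper name are assumptions.

```python
def descendants_pipeline(root_id, collection="categories", max_depth=None):
    """Pipeline returning the root node with all of its descendants attached."""
    graph_lookup = {
        "from": collection,
        "startWith": "$_id",         # begin from the root document's _id
        "connectFromField": "_id",   # then follow each found node's _id
        "connectToField": "parent",  # to documents whose parent equals it
        "as": "descendants",
        "depthField": "depth",       # 0 for direct children, 1 for grandchildren
    }
    if max_depth is not None:
        graph_lookup["maxDepth"] = max_depth
    return [
        {"$match": {"_id": root_id}},
        {"$graphLookup": graph_lookup},
    ]


pipeline = descendants_pipeline("electronics", max_depth=2)
```

An index on `parent` would likely help here, since each recursion step queries that field.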

Versioning Pattern

// Track document history
{
  _id: ObjectId('...'),
  name: 'Product',
  description: 'Latest description',
  versions: [
    { v: 1, name: 'Product', description: 'Original', date: ISODate(...) },
    { v: 2, name: 'Product', description: 'Updated', date: ISODate(...) }
  ]
}
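One way to maintain this shape is to snapshot the current values into `versions` in the same update that overwrites them, so both changes land in one atomic single-document write. The sketch below assumes that `versions` holds prior states and that `name` and `description` are the tracked fields; `versioned_update` is a hypothetical helper.

```python
from datetime import datetime, timezone


def versioned_update(current, changes, now=None, tracked=("name", "description")):
    """Build an update that archives the current tracked fields, then applies changes."""
    now = now or datetime.now(timezone.utc)
    snapshot = {"v": len(current.get("versions", [])) + 1}
    snapshot.update({field: current[field] for field in tracked if field in current})
    snapshot["date"] = now
    return {"$set": dict(changes), "$push": {"versions": snapshot}}


current = {
    "name": "Product",
    "description": "Latest description",
    "versions": [{"v": 1}, {"v": 2}],
}
update = versioned_update(
    current,
    {"description": "Newer description"},
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

The `versions` array grows with every edit, so a frequently edited document may be better served by a separate history collection.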

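Design Principles

Advantages of Embedding

Related data is fetched in a single query
Related data is updated atomically, because it lives in one document
No joins required

Advantages of Referencing

Avoids data duplication
Keeps documents smaller
Supports flexible relationships
Lets related data grow independently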

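Decision Tree

Does the related data grow without bound?
  Yes → Use references
  No → Consider embedding

Is the related data often accessed on its own?
  Yes → Use references
  No → Consider embedding

Must the related data be updated atomically with its parent?
  Yes → Use embedding (single-document writes are atomic)
  No → Use references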
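The three questions can be encoded as a small helper that reports the recommendation for each one. This is a sketch: the function name and return shape are assumptions, and it deliberately does not merge the answers, since the tree gives no precedence when they conflict.

```python
def embedding_advice(grows_unbounded, accessed_separately, needs_atomic_update):
    """Return the decision tree's recommendation for each question."""
    return {
        "growth": "reference" if grows_unbounded else "consider embedding",
        "access": "reference" if accessed_separately else "consider embedding",
        "atomicity": "embed" if needs_atomic_update else "reference",
    }


# Example: a user's shipping address. Bounded, read with the user, updated together.
advice = embedding_advice(
    grows_unbounded=False,
    accessed_separately=False,
    needs_atomic_update=True,
)
```

When the answers disagree, unbounded growth is usually the hardest constraint to work around, because of the 16MB document limit.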

Python Design Examples

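from bson import ObjectId
from pymongo import MongoClient

# Assumed connection string and database name; adjust for your deployment
db = MongoClient('mongodb://localhost:27017')['app']
users, orders = db['users'], db['orders']

# User with an embedded address
users.insert_one({
    'name': 'John',
    'email': 'john@example.com',
    'address': {
        'street': '123 Main St',
        'city': 'New York'
    }
})

# User with order references
user_id = ObjectId()  # generated once so the user and the order share the same id
users.insert_one({
    '_id': user_id,
    'name': 'John'
})

orders.insert_one({
    'userId': user_id,
    'total': 99.99
})

# Join each user with their orders using $lookup
users_with_orders = list(users.aggregate([
    { '$lookup': {
        'from': 'orders',
        'localField': '_id',
        'foreignField': 'userId',
        'as': 'orders'
    }}
]))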

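Best Practices

✅ Embed data that is always accessed together
✅ Use references for unbounded arrays
✅ Keep documents under the 16MB limit
✅ Design around your query patterns
✅ Denormalize carefully, and only for performance
✅ Plan for schema evolution
✅ Use schema validation
✅ Document design decisions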
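The schema-validation practice can be sketched as a `$jsonSchema` validator for the user document with an embedded address. Which fields are required, the email pattern, and the `moderate` validation level are assumptions chosen for illustration; `coll_mod_command` is a hypothetical helper.

```python
USER_VALIDATOR = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string"},
            "email": {"bsonType": "string", "pattern": "^.+@.+$"},
            "address": {
                "bsonType": "object",
                "required": ["street", "city"],
                "properties": {
                    "street": {"bsonType": "string"},
                    "city": {"bsonType": "string"},
                    "zip": {"bsonType": "string"},
                },
            },
        },
    }
}


def coll_mod_command(collection, validator, level="moderate"):
    """Build a collMod command that attaches the validator to a collection."""
    # Key order matters: the command name must come first.
    return {"collMod": collection, "validator": validator, "validationLevel": level}


command = coll_mod_command("users", USER_VALIDATOR)
```

With PyMongo, `db.command(command)` would apply it. The `moderate` level skips updates to existing documents that were already invalid, which can ease a gradual schema migration.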