TensorFlow模型训练器Skill tensorflow-trainer

TensorFlow模型训练器是一个用于自动化深度学习模型训练、评估和部署的专业技能。它支持Keras API和自定义训练循环,集成了分布式训练策略、TensorBoard可视化、回调函数管理以及生产环境模型导出功能。该技能能够处理从数据加载、模型训练到SavedModel/TFLite导出的完整机器学习流水线,适用于大规模分布式训练和边缘设备部署场景。关键词:TensorFlow训练,Keras模型,分布式训练,TensorBoard,模型导出,深度学习,机器学习,AI模型部署,回调函数,混合精度训练。

深度学习 3 次安装 39 次浏览 更新于 2/23/2026

名称: tensorflow-训练器 描述: 具备回调函数、分布式策略和TensorBoard集成的TensorFlow/Keras模型训练技能。 允许使用的工具:

  • 读取
  • 写入
  • Bash
  • Glob
  • Grep

tensorflow-训练器

概述

具备回调函数、分布式策略、TensorBoard集成以及生产就绪模型导出能力的TensorFlow/Keras模型训练技能。

能力

  • 使用回调函数的Keras模型训练
  • 使用tf.GradientTape的自定义训练循环
  • 分布式策略配置(MirroredStrategy, MultiWorkerMirroredStrategy, TPUStrategy)
  • TensorBoard日志记录与可视化
  • 用于TF Serving的SavedModel导出
  • 用于边缘部署的TFLite转换
  • 混合精度训练

目标流程

  • 带有实验跟踪的模型训练流水线
  • 分布式训练编排
  • 模型部署流水线

工具与库

  • TensorFlow
  • Keras
  • TensorBoard
  • TensorFlow Serving
  • TensorFlow Lite

输入模式

{
  "type": "object",
  "required": ["modelConfig", "dataConfig", "trainingConfig"],
  "properties": {
    "modelConfig": {
      "type": "object",
      "properties": {
        "modelPath": { "type": "string" },
        "modelType": { "type": "string", "enum": ["sequential", "functional", "subclassed"] }
      }
    },
    "dataConfig": {
      "type": "object",
      "properties": {
        "trainPath": { "type": "string" },
        "valPath": { "type": "string" },
        "batchSize": { "type": "integer" },
        "prefetch": { "type": "boolean" }
      }
    },
    "trainingConfig": {
      "type": "object",
      "properties": {
        "epochs": { "type": "integer" },
        "optimizer": { "type": "string" },
        "learningRate": { "type": "number" },
        "loss": { "type": "string" },
        "metrics": { "type": "array", "items": { "type": "string" } },
        "callbacks": { "type": "array", "items": { "type": "string" } },
        "distributionStrategy": { "type": "string" }
      }
    },
    "exportConfig": {
      "type": "object",
      "properties": {
        "savedModelPath": { "type": "string" },
        "tflitePath": { "type": "string" },
        "servingSignatures": { "type": "array", "items": { "type": "string" } }
      }
    }
  }
}

输出模式

{
  "type": "object",
  "required": ["status", "metrics", "modelPath"],
  "properties": {
    "status": {
      "type": "string",
      "enum": ["success", "error", "early_stopped"]
    },
    "metrics": {
      "type": "object",
      "properties": {
        "loss": { "type": "number" },
        "valLoss": { "type": "number" },
        "accuracy": { "type": "number" },
        "valAccuracy": { "type": "number" },
        "epochsTrained": { "type": "integer" }
      }
    },
    "modelPath": {
      "type": "string"
    },
    "savedModelPath": {
      "type": "string"
    },
    "tensorboardLogDir": {
      "type": "string"
    },
    "history": {
      "type": "object",
      "description": "包含每个epoch所有指标的完整训练历史记录"
    }
  }
}

使用示例

{
  kind: 'skill',
  title: '训练TensorFlow模型',
  skill: {
    name: 'tensorflow-trainer',
    context: {
      modelConfig: {
        modelPath: 'models/cnn_model.py',
        modelType: 'functional'
      },
      dataConfig: {
        trainPath: 'data/train',
        valPath: 'data/val',
        batchSize: 64,
        prefetch: true
      },
      trainingConfig: {
        epochs: 50,
        optimizer: 'adam',
        learningRate: 0.001,
        loss: 'sparse_categorical_crossentropy',
        metrics: ['accuracy'],
        callbacks: ['early_stopping', 'model_checkpoint', 'tensorboard']
      }
    }
  }
}