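Monitor system health, track performance metrics, and analyze telemetry data.
Chat Commands
System Metrics
/metrics  Show current metrics
/metrics system  CPU, memory, latency
/metrics api  API performance statistics
/metrics ws  WebSocket health
Trading Metrics
/metrics trades  Trade execution statistics
/metrics fills  Fill rate metrics
/metrics latency  Order latency statistics
/metrics errors  Error rates
Custom Metrics
/metrics track <name> <value>  Track a custom metric
/metrics query <name>  Query a metric's history
/metrics alert <name> > 100  Set a metric alert
Export and Reporting
/metrics export csv  Export to CSV
/metrics report daily  Generate a daily report
/metrics dashboard  Open the metrics dashboard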
TypeScript API Reference
Creating the Metrics Service
import { createMetricsService } from 'clodds/metrics';
const metrics = createMetricsService({
  // Collection
  collectInterval: 5000, // ms
  retention: '30d',
  // Storage
  storage: 'sqlite',
  dbPath: './metrics.db',
  // Export
  enablePrometheus: true,
  prometheusPort: 9090,
});
// Start collecting
await metrics.start();
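The configuration above, and the later `window` and `period` options, use duration strings such as `'30d'`, `'5m'`, and `'24h'`. The source does not say how these strings are parsed, so the following is only a minimal sketch. The helper name `parseDuration` and the supported unit set (`s`, `m`, `h`, `d`) are assumptions, not part of the documented API.

```typescript
// Hypothetical helper: converts strings like '5m', '24h', '30d' to milliseconds.
// The unit set (s, m, h, d) is an assumption, not taken from the clodds docs.
const UNIT_MS: Record<string, number> = {
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

function parseDuration(text: string): number {
  // One or more digits followed by exactly one unit letter
  const match = /^(\d+)([smhd])$/.exec(text.trim());
  if (!match) {
    throw new Error(`Unrecognized duration: ${text}`);
  }
  const amount = Number(match[1]);
  const unitMs = UNIT_MS[match[2]];
  return amount * unitMs;
}

console.log(parseDuration('30d')); // 2592000000
console.log(parseDuration('5m'));  // 300000
```

Rejecting anything that does not match the pattern keeps a typo such as `'30 days'` from silently becoming a zero-length retention period.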
System Metrics
const system = await metrics.getSystemMetrics();
console.log('=== System Health ===');
console.log(`CPU usage: ${system.cpuUsage}%`);
console.log(`Memory: ${system.memoryUsed}MB / ${system.memoryTotal}MB`);
console.log(`Uptime: ${system.uptimeHours} hours`);
console.log(`Active connections: ${system.activeConnections}`);
console.log(`Event loop lag: ${system.eventLoopLag}ms`);
API Metrics
const api = await metrics.getApiMetrics();
console.log('=== API Performance ===');
console.log(`Total requests: ${api.totalRequests}`);
console.log(`Requests/sec: ${api.requestsPerSecond}`);
console.log(`Average latency: ${api.avgLatency}ms`);
console.log(`P50 latency: ${api.p50Latency}ms`);
console.log(`P95 latency: ${api.p95Latency}ms`);
console.log(`P99 latency: ${api.p99Latency}ms`);
console.log(`Error rate: ${api.errorRate}%`);
console.log('\nBy endpoint:');
for (const endpoint of api.byEndpoint) {
  console.log(`  ${endpoint.path}: ${endpoint.avgLatency}ms (${endpoint.calls} calls)`);
}
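The source does not describe how the P50, P95, and P99 fields are derived. As a rough illustration of why the percentile fields are reported alongside the average, here is a sketch using the nearest-rank method. The `percentile` function and the sample data are made up for this example, and the library may well use a different estimator.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('No samples');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift in the rank
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// 98 fast requests and 2 slow outliers (made-up data)
const sampleLatenciesMs: number[] = [...new Array<number>(98).fill(20), 900, 1500];
const sampleMeanMs =
  sampleLatenciesMs.reduce((sum, v) => sum + v, 0) / sampleLatenciesMs.length;

console.log(sampleMeanMs);                      // 43.6
console.log(percentile(sampleLatenciesMs, 50)); // 20
console.log(percentile(sampleLatenciesMs, 99)); // 900
```

The mean of 43.6ms suggests a healthy service, while the P99 of 900ms shows that one request in a hundred is more than forty times slower than the typical one.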
WebSocket Metrics
const ws = await metrics.getWebSocketMetrics();
console.log('=== WebSocket Health ===');
console.log(`Active connections: ${ws.activeConnections}`);
console.log(`Messages/sec: ${ws.messagesPerSecond}`);
console.log(`Average message size: ${ws.avgMessageSize} bytes`);
console.log(`Reconnections: ${ws.reconnections}`);
console.log(`Dropped messages: ${ws.droppedMessages}`);
console.log('\nBy feed:');
for (const feed of ws.byFeed) {
  console.log(`  ${feed.name}: ${feed.messagesPerSecond}/s, ${feed.latency}ms latency`);
}
Trade Execution Metrics
const trades = await metrics.getTradeMetrics();
console.log('=== Trade Execution ===');
console.log(`Total orders: ${trades.totalOrders}`);
console.log(`Fill rate: ${trades.fillRate}%`);
console.log(`Partial fills: ${trades.partialFillRate}%`);
console.log(`Rejection rate: ${trades.rejectionRate}%`);
console.log(`Average fill time: ${trades.avgFillTime}ms`);
console.log(`Average slippage: ${trades.avgSlippage}%`);
console.log('\nBy platform:');
for (const platform of trades.byPlatform) {
  console.log(`  ${platform.name}:`);
  console.log(`    Fill rate: ${platform.fillRate}%`);
  console.log(`    Average latency: ${platform.avgLatency}ms`);
}
Latency Breakdown
const latency = await metrics.getLatencyBreakdown();
console.log('=== Latency Breakdown ===');
console.log(`Total order latency: ${latency.total}ms`);
console.log(`  Signal processing: ${latency.signalProcessing}ms`);
console.log(`  Order construction: ${latency.orderConstruction}ms`);
console.log(`  Network round trip: ${latency.networkRoundTrip}ms`);
console.log(`  Exchange processing: ${latency.exchangeProcessing}ms`);
console.log(`  Confirmation: ${latency.confirmation}ms`);
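A breakdown like this is most useful for finding the stage that dominates. The sketch below ranks stages by their share of the summed time. The stage names mirror the fields printed above, but the numbers are invented, `stageShares` is a hypothetical helper, and the source does not state that the stages add up exactly to `latency.total`.

```typescript
// Stage names mirror the breakdown fields; the millisecond values are made up.
const sketchStagesMs: Record<string, number> = {
  signalProcessing: 2,
  orderConstruction: 1,
  networkRoundTrip: 45,
  exchangeProcessing: 30,
  confirmation: 22,
};

interface StageShare {
  stage: string;
  ms: number;
  sharePct: number;
}

// Returns stages sorted from slowest to fastest with their rounded share
// of the summed stage time.
function stageShares(stages: Record<string, number>): StageShare[] {
  const total = Object.values(stages).reduce((sum, ms) => sum + ms, 0);
  return Object.entries(stages)
    .map(([stage, ms]) => ({
      stage,
      ms,
      sharePct: total === 0 ? 0 : Math.round((ms / total) * 100),
    }))
    .sort((a, b) => b.ms - a.ms);
}

const rankedStages = stageShares(sketchStagesMs);
console.log(rankedStages[0].stage);    // networkRoundTrip
console.log(rankedStages[0].sharePct); // 45
```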
Error Metrics
const errors = await metrics.getErrorMetrics();
console.log('=== Error Rates ===');
console.log(`Total errors: ${errors.totalErrors}`);
console.log(`Error rate: ${errors.errorRate}%`);
console.log(`Errors per hour: ${errors.errorsPerHour}`);
console.log('\nBy type:');
for (const type of errors.byType) {
  console.log(`  ${type.name}: ${type.count} (${type.percentage}%)`);
}
console.log('\nBy platform:');
for (const platform of errors.byPlatform) {
  console.log(`  ${platform.name}: ${platform.errorRate}%`);
}
Custom Metrics
// Track a custom metric
metrics.track('edge_detected', 1, {
  market: 'trump-2028',
  edgeSize: 0.05,
});
// Increment a counter
metrics.increment('trades_executed');
// Set a gauge
metrics.gauge('active_positions', 5);
// Time an operation
const timer = metrics.startTimer('order_execution');
// ...execute the order...
timer.end();
// Record a histogram observation
metrics.histogram('slippage', 0.003, {
  platform: 'polymarket',
});
Querying Metrics
const query = await metrics.query({
  metric: 'edge_detected',
  period: '7d',
  aggregation: 'sum',
  groupBy: 'market',
});
console.log('Edges detected by market:');
for (const row of query.results) {
  console.log(`  ${row.market}: ${row.value} detections`);
}
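To make the `aggregation: 'sum'` and `groupBy: 'market'` options concrete, here is a sketch of a grouped sum over tagged data points. It illustrates the general idea only. `SketchPoint`, `sumByTag`, and the second market name are hypothetical, and the actual query presumably runs against the configured storage backend rather than in memory.

```typescript
// A recorded data point with free-form string tags (hypothetical shape).
interface SketchPoint {
  value: number;
  tags: Record<string, string>;
}

// Sums point values per distinct value of the given tag.
// Points that lack the tag are grouped under 'unknown'.
function sumByTag(points: SketchPoint[], tag: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const point of points) {
    const key = point.tags[tag] ?? 'unknown';
    totals.set(key, (totals.get(key) ?? 0) + point.value);
  }
  return totals;
}

const sketchPoints: SketchPoint[] = [
  { value: 1, tags: { market: 'trump-2028' } },
  { value: 1, tags: { market: 'example-market' } }, // made-up market name
  { value: 1, tags: { market: 'trump-2028' } },
  { value: 1, tags: {} },
];

const edgeTotals = sumByTag(sketchPoints, 'market');
console.log(edgeTotals.get('trump-2028'));     // 2
console.log(edgeTotals.get('example-market')); // 1
console.log(edgeTotals.get('unknown'));        // 1
```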
Metric Alerts
// Set alert thresholds
metrics.setAlert({
  metric: 'error_rate',
  condition: '>',
  threshold: 5, // fire above a 5% error rate
  window: '5m',
  action: 'notify',
});
metrics.setAlert({
  metric: 'latency_p99',
  condition: '>',
  threshold: 1000, // fire above 1000ms
  window: '1m',
  action: 'escalate',
});
// Alert handler
metrics.on('alert', (alert) => {
  console.log(`🚨 Alert: ${alert.metric} ${alert.condition} ${alert.threshold}`);
  console.log(`  Current value: ${alert.currentValue}`);
});
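The source does not explain how `window` interacts with `threshold`. One plausible reading, sketched below, is that the rule compares the mean of all samples inside the trailing window against the threshold. That interpretation, along with the `SketchSample`, `SketchRule`, and `ruleFires` names, is an assumption made for illustration.

```typescript
interface SketchSample {
  timestampMs: number;
  value: number;
}

interface SketchRule {
  condition: '>' | '<';
  threshold: number;
  windowMs: number;
}

// Returns true when the mean of samples in the trailing window satisfies the
// rule. An empty window never fires.
function ruleFires(samples: SketchSample[], rule: SketchRule, nowMs: number): boolean {
  const windowStart = nowMs - rule.windowMs;
  const recent = samples.filter(
    (s) => s.timestampMs > windowStart && s.timestampMs <= nowMs,
  );
  if (recent.length === 0) {
    return false;
  }
  const mean = recent.reduce((sum, s) => sum + s.value, 0) / recent.length;
  return rule.condition === '>' ? mean > rule.threshold : mean < rule.threshold;
}

// Error rate above 5% over a 5-minute (300000ms) window
const errorRateRule: SketchRule = { condition: '>', threshold: 5, windowMs: 300000 };
const errorRateSamples: SketchSample[] = [
  { timestampMs: 0, value: 20 },     // too old to count at t = 600000
  { timestampMs: 400000, value: 4 },
  { timestampMs: 500000, value: 8 },
];

console.log(ruleFires(errorRateSamples, errorRateRule, 600000));  // true (mean of 4 and 8 is 6)
console.log(ruleFires(errorRateSamples, errorRateRule, 1000000)); // false (no samples in window)
```

Averaging over the window makes the rule less sensitive to a single spike than checking only the latest sample, at the cost of reacting more slowly.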
Exporting Metrics
// Export to CSV
await metrics.export({
  format: 'csv',
  metrics: ['api_latency', 'trade_fill_rate', 'error_rate'],
  period: '30d',
  outputPath: './metrics-export.csv',
});
// Export in Prometheus format
const prometheusFormat = metrics.toPrometheus();
// Export to JSON
const jsonMetrics = await metrics.toJSON({
  period: '24h',
});
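The source does not show what `toPrometheus()` returns. For orientation, the Prometheus text exposition format consists of `# TYPE` lines followed by one sample line per series. The sketch below renders that general shape. `SketchSeries` and `renderPrometheus` are hypothetical names, only counters and gauges are covered, and the input is assumed to be already grouped by metric name.

```typescript
interface SketchSeries {
  metric: string;
  kind: 'counter' | 'gauge';
  labels: Record<string, string>;
  value: number;
}

// Label values must escape backslashes, double quotes, and newlines.
function escapeLabelValue(raw: string): string {
  return raw.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
}

// Renders series as Prometheus text, emitting one TYPE line per metric name.
function renderPrometheus(series: SketchSeries[]): string {
  const lines: string[] = [];
  const declared = new Set<string>();
  for (const s of series) {
    if (!declared.has(s.metric)) {
      lines.push(`# TYPE ${s.metric} ${s.kind}`);
      declared.add(s.metric);
    }
    const labelText = Object.entries(s.labels)
      .map(([key, val]) => `${key}="${escapeLabelValue(val)}"`)
      .join(',');
    lines.push(
      labelText.length > 0
        ? `${s.metric}{${labelText}} ${s.value}`
        : `${s.metric} ${s.value}`,
    );
  }
  return lines.join('\n') + '\n';
}

const promText = renderPrometheus([
  { metric: 'trades_executed', kind: 'counter', labels: { platform: 'polymarket' }, value: 12 },
  { metric: 'active_positions', kind: 'gauge', labels: {}, value: 5 },
]);
console.log(promText);
// # TYPE trades_executed counter
// trades_executed{platform="polymarket"} 12
// # TYPE active_positions gauge
// active_positions 5
```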
Generating Reports
const report = await metrics.generateReport({
  type: 'daily',
  include: ['summary', 'api', 'trades', 'errors'],
});
console.log('=== Daily Metrics Report ===');
console.log(`Date: ${report.date}`);
console.log('\nSummary:');
console.log(`  Uptime: ${report.summary.uptime}%`);
console.log(`  Total requests: ${report.summary.totalRequests}`);
console.log(`  Total trades: ${report.summary.totalTrades}`);
console.log(`  Error rate: ${report.summary.errorRate}%`);
console.log('\nHighlights:');
for (const highlight of report.highlights) {
  console.log(`  - ${highlight}`);
}
Real-Time Streaming
// Stream metrics in real time
const stream = metrics.stream(['cpu', 'memory', 'latency']);
stream.on('data', (data) => {
  console.log(`CPU: ${data.cpu}%, Memory: ${data.memory}MB, Latency: ${data.latency}ms`);
});
// Stop the stream when it is no longer needed
stream.stop();
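Metric Types
| Type | Description | Example |
|------|-------------|---------|
| Counter | Monotonically increasing value | trades_total |
| Gauge | Value at a point in time | active_positions |
| Histogram | Distribution of observed values | latency_ms |
| Timer | Duration measurement | order_execution_time |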
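To illustrate how the four types differ in behavior, here is a set of minimal in-memory versions. The classes and the `startSketchTimer` function are hypothetical and are not the library's implementation. The histogram assumes its bucket bounds are given in ascending order.

```typescript
// Counter: can only go up.
class SketchCounter {
  private total = 0;
  increment(by: number = 1): void {
    if (by < 0) throw new Error('Counters only go up');
    this.total += by;
  }
  get value(): number {
    return this.total;
  }
}

// Gauge: holds the most recently set value.
class SketchGauge {
  private current = 0;
  set(next: number): void {
    this.current = next;
  }
  get value(): number {
    return this.current;
  }
}

// Histogram: counts observations per bucket. Slot i counts values greater
// than bounds[i - 1] and at most bounds[i]; the last slot is the overflow.
class SketchHistogram {
  private readonly bounds: number[];
  private readonly counts: number[];
  constructor(bounds: number[]) {
    this.bounds = bounds;
    this.counts = new Array<number>(bounds.length + 1).fill(0);
  }
  observe(observed: number): void {
    const found = this.bounds.findIndex((b) => observed <= b);
    const slot = found === -1 ? this.bounds.length : found;
    this.counts[slot] += 1;
  }
  snapshot(): number[] {
    return [...this.counts];
  }
}

// Timer: measures wall-clock time between start and end, in milliseconds.
function startSketchTimer(): { end: () => number } {
  const startedAt = Date.now();
  return { end: () => Date.now() - startedAt };
}

const tradesTotal = new SketchCounter();
tradesTotal.increment();
tradesTotal.increment(2);

const activePositions = new SketchGauge();
activePositions.set(5);
activePositions.set(3);

const latencyHistogram = new SketchHistogram([10, 100, 1000]);
[4, 50, 50, 2500].forEach((ms) => latencyHistogram.observe(ms));

const sketchTimer = startSketchTimer();
const elapsedMs = sketchTimer.end();

console.log(tradesTotal.value);           // 3
console.log(activePositions.value);       // 3
console.log(latencyHistogram.snapshot()); // [ 1, 2, 0, 1 ]
console.log(elapsedMs >= 0);              // true
```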
Built-in Metrics
| Category | Metrics |
|----------|---------|
| System | cpu_usage, memory_used, uptime, connections |
| API | request_count, latency_p50/p95/p99, error_rate |
| WebSocket | messages_per_sec, lag, reconnections |
| Trading | orders_total, fill_rate, slippage, execution_time |
| Errors | error_count, error_rate_by_type |
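Best Practices
- Monitor latency percentiles: P99 exposes tail latency that the average hides
- Set alerts proactively: catch problems before users notice them
- Track custom metrics: cover business-specific KPIs
- Review daily reports: spot trends early
- Export for analysis: use external tools for deeper analysis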