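name: pycse
description: Use when performing regression analysis with confidence intervals, solving ODEs, fitting models to experimental data, or caching expensive scientific computations. Provides convenience wrappers around scipy that automatically compute confidence intervals and prediction bounds for linear, nonlinear, and polynomial regression.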

pycse - Python Computations in Science and Engineering

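Overview

pycse extends numpy/scipy with convenience functions that return confidence intervals directly from regression calls, which makes statistical analysis faster and less error-prone. Instead of extracting the covariance matrix and computing the intervals by hand, you get them back from the fit itself.

Core value: turns 100+ lines of scipy boilerplate into about 10 lines of clear, reusable code.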

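When to Use

Use pycse when:

Fitting models to experimental data and you need parameter confidence intervals
Performing regression analysis (linear, nonlinear, polynomial)
Comparing models with statistical criteria (BIC, R²)
Generating predictions with error bounds
Caching the results of expensive computations
Reading data from Google Sheets into pandas
Solving ODEs (wraps scipy with a more convenient interface)

Do not use when:

scipy alone already meets your needs (both are valid choices)
You need custom optimization beyond least squares
Your model is one that pycse does not support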

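Quick Reference

Task	pycse function	Returns
Linear regression	`regress(A, y, alpha=0.05)`	`p, pint, se`
Nonlinear regression	`nlinfit(model, x, y, p0, alpha=0.05)`	`p, pint, se`
Polynomial fit	`polyfit(x, y, deg, alpha=0.05)`	`p, pint, se`
Prediction intervals	`predict(X, y, pars, XX, alpha=0.05)`	`prediction, intervals`
Nonlinear prediction	`nlpredict(X, y, model, loss, popt, xnew)`	`prediction, bounds`
Model comparison	`bic(x, y, model, popt)`	`bic_value`
Linear BIC	`lbic(X, y, popt)`	`bic_value`
R-squared	`Rsquared(y, Y)`	`r2_value`
ODE solver	`ivp(f, tspan, y0, **kwargs)`	`solution`

All regression functions return (p, pint, se), where:

p = fitted parameters
pint = confidence intervals for the parameters, one [lower, upper] row per parameter (alpha=0.05 gives 95% intervals)
se = standard errors of the parameters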
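
The quick reference lists `ivp`, and pycse is described as a wrapper around scipy. The following is a minimal sketch of the underlying `scipy.integrate.solve_ivp` call for a first-order decay, dC/dt = -kC. The rate constant, initial value, and time span are assumptions chosen for illustration. Check the `pycse.ivp` docstring for the keyword arguments it actually forwards.

```python
import numpy as np
from scipy.integrate import solve_ivp

k = 0.2      # assumed rate constant, for illustration only
C0 = 100.0   # assumed initial concentration

def dCdt(t, C):
    """Right-hand side of dC/dt = -k*C."""
    return -k * C

# Integrate from t=0 to t=10 and report the solution at 11 evenly spaced times
t_eval = np.linspace(0, 10, 11)
sol = solve_ivp(dCdt, (0, 10), [C0], t_eval=t_eval, rtol=1e-8, atol=1e-10)

# Compare against the analytical solution C0 * exp(-k t)
exact = C0 * np.exp(-k * sol.t)
max_err = np.max(np.abs(sol.y[0] - exact))
print(f"max abs error vs analytical solution: {max_err:.2e}")
```

`sol.t` holds the time points and `sol.y` has one row per state variable, so `sol.y[0]` is the concentration profile.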

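Common Patterns

Nonlinear Regression with Confidence Intervals

import numpy as np
import pycse

# Data
time = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
concentration = np.array([100, 82, 67, 55, 45, 37, 30, 25, 20, 17, 14])

# Model: C(t) = C0 * exp(-k * t)
def model(t, C0, k):
    return C0 * np.exp(-k * t)

# Fit with 95% confidence intervals; [100, 0.1] is the initial guess for (C0, k)
p, pint, se = pycse.nlinfit(model, time, concentration, [100, 0.1])

# Report each parameter with the half-width of its confidence interval
print(f"C0 = {p[0]:.2f} ± {pint[0,1] - p[0]:.2f}")
print(f"k = {p[1]:.4f} ± {pint[1,1] - p[1]:.4f}")

# That's it! No manual covariance extraction or t-distribution lookup.

Compared with scipy alone: you would have to extract the covariance matrix, compute the standard errors, look up the t-distribution critical value, and build the intervals by hand (50+ lines of code).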
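
For reference, the manual scipy route described above can be sketched as follows. It uses the same data and model, with `scipy.optimize.curve_fit` for the fit and `scipy.stats.t` for the critical value. This is the textbook calculation, not pycse's source, so the numbers may differ slightly from what `nlinfit` reports.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import t as t_dist

time = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
concentration = np.array([100, 82, 67, 55, 45, 37, 30, 25, 20, 17, 14], dtype=float)

def model(t, C0, k):
    return C0 * np.exp(-k * t)

alpha = 0.05
popt, pcov = curve_fit(model, time, concentration, p0=[100, 0.1])

dof = len(time) - len(popt)            # residual degrees of freedom
se = np.sqrt(np.diag(pcov))            # standard errors from the covariance matrix
tval = t_dist.ppf(1 - alpha / 2, dof)  # two-sided critical value

# One [lower, upper] row per parameter
pint = np.column_stack([popt - tval * se, popt + tval * se])
for name, value, (lo, hi) in zip(["C0", "k"], popt, pint):
    print(f"{name} = {value:.4f}, 95% CI: [{lo:.4f}, {hi:.4f}]")
```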

Linear Regression

import numpy as np
import pycse

# Design matrix A and observations y
A = np.array([[1, 2], [1, 3], [1, 4], [1, 5]])  # columns: [intercept, x]
y = np.array([3, 5, 7, 9])

# Fit: y = p[0] + p[1]*x
p, pint, se = pycse.regress(A, y)

print(f"Intercept: {p[0]:.2f}, 95% CI: [{pint[0,0]:.2f}, {pint[0,1]:.2f}]")
print(f"Slope: {p[1]:.2f}, 95% CI: [{pint[1,0]:.2f}, {pint[1,1]:.2f}]")
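
To make explicit what a call like `regress` saves you, here is a sketch of the standard least-squares confidence interval computed with numpy and `scipy.stats.t`. It illustrates the textbook formulas and is not pycse's implementation. The data are synthetic (an assumed line plus noise) because the four points in the example above lie exactly on y = 2x - 1, which leaves zero residuals and degenerate intervals.

```python
import numpy as np
from scipy.stats import t as t_dist

# Synthetic data: y = 1 + 2x plus Gaussian noise (assumed values, for illustration)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)

A = np.column_stack([np.ones_like(x), x])   # design matrix, shape (n, 2)
n, k = A.shape

# Least-squares estimate of the parameters
p, *_ = np.linalg.lstsq(A, y, rcond=None)

residuals = y - A @ p
sigma2 = residuals @ residuals / (n - k)     # unbiased residual variance
cov = sigma2 * np.linalg.inv(A.T @ A)        # parameter covariance matrix
se = np.sqrt(np.diag(cov))                   # standard errors

alpha = 0.05
tval = t_dist.ppf(1 - alpha / 2, n - k)      # two-sided critical value
pint = np.column_stack([p - tval * se, p + tval * se])

print(f"Intercept: {p[0]:.3f}, 95% CI: [{pint[0,0]:.3f}, {pint[0,1]:.3f}]")
print(f"Slope: {p[1]:.3f}, 95% CI: [{pint[1,0]:.3f}, {pint[1,1]:.3f}]")
```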

Polynomial Fitting

import numpy as np
import pycse

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([1.5, 3.8, 8.2, 14.9, 23.5, 34.8, 48.2, 64.1])

# Fit a quadratic. Coefficients are ordered from the highest power down
# (np.vander convention): y = p[0]*x^2 + p[1]*x + p[2]
p, pint, se = pycse.polyfit(x, y, deg=2)

print(f"Coefficients: {p}")
print(f"95% CI: {pint}")

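Predictions with Error Bounds

import numpy as np
import pycse

# After fitting (A, y and p come from the linear regression example above)
x_new = np.array([11, 12, 13])

# Linear prediction: build the design matrix for the new points
X_new = np.column_stack([np.ones(len(x_new)), x_new])
# *_ absorbs any extra return values (some pycse versions also return
# the standard error of the predictions as a third value)
y_pred, intervals, *_ = pycse.predict(A, y, p, X_new)

print(f"Predictions: {y_pred}")
print(f"95% intervals: {intervals}")

# Nonlinear prediction (here time, concentration, model and p come from
# the nlinfit example); the loss is the sum of squared residuals
y_pred_nl, bounds, *_ = pycse.nlpredict(time, concentration, model,
                                        lambda p: np.sum((concentration - model(time, *p))**2),
                                        p, x_new)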
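
The idea behind a prediction interval for a linear model can be sketched with the textbook formula, in which the prediction variance is sigma² · (1 + x0ᵀ(AᵀA)⁻¹x0). This is an illustration on synthetic data with assumed values. pycse's own routines may compute the interval differently, so treat the sketch as a cross-check rather than a replacement.

```python
import numpy as np
from scipy.stats import t as t_dist

# Synthetic data: y = 1 + 2x plus noise (assumed values, for illustration)
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)
A = np.column_stack([np.ones_like(x), x])
n, k = A.shape

# Fit and estimate the residual variance
p, *_ = np.linalg.lstsq(A, y, rcond=None)
sigma2 = np.sum((y - A @ p) ** 2) / (n - k)
AtA_inv = np.linalg.inv(A.T @ A)

# New points, extrapolating beyond the fitted range
x_new = np.array([11.0, 12.0, 13.0])
X_new = np.column_stack([np.ones_like(x_new), x_new])
y_pred = X_new @ p

# Leverage term x0^T (A^T A)^-1 x0, computed for each new row
leverage = np.einsum("ij,jk,ik->i", X_new, AtA_inv, X_new)
pred_se = np.sqrt(sigma2 * (1.0 + leverage))

tval = t_dist.ppf(0.975, n - k)  # 95% two-sided critical value
intervals = np.column_stack([y_pred - tval * pred_se, y_pred + tval * pred_se])
print(intervals)
```

The standard error grows as the new points move further from the centre of the fitted data, which is why extrapolated predictions carry wider intervals.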

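Model Comparison

import numpy as np
import pycse

# x, y: the data from the polynomial fitting example
# Fit two models
p1, _, _ = pycse.polyfit(x, y, deg=1)  # linear
p2, _, _ = pycse.polyfit(x, y, deg=2)  # quadratic

# Design matrices matching the two fits
# (assumes polyfit builds them with np.vander, highest power first)
X1 = np.vander(x, 2)
X2 = np.vander(x, 3)

# Compare using BIC (lower is better)
bic1 = pycse.lbic(X1, y, p1)
bic2 = pycse.lbic(X2, y, p2)

print(f"Linear BIC: {bic1:.2f}")
print(f"Quadratic BIC: {bic2:.2f}")
print(f"Better model: {'quadratic' if bic2 < bic1 else 'linear'}")

# R-squared as a goodness-of-fit measure for the quadratic model
r2 = pycse.Rsquared(y, X2 @ p2)
print(f"R² = {r2:.4f}")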
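
For intuition about the two criteria, the sketch below computes them by hand with numpy only. It uses the common least-squares form BIC = n·ln(RSS/n) + k·ln(n) and R² = 1 - SS_res/SS_tot. The helper names `gaussian_bic` and `r_squared` are hypothetical, and pycse's `bic`/`lbic` may differ by constant terms, so compare values only within one implementation.

```python
import numpy as np

def gaussian_bic(y, y_fit, n_params):
    """BIC for a least-squares fit with Gaussian errors (up to a constant)."""
    n = len(y)
    rss = np.sum((y - y_fit) ** 2)
    return n * np.log(rss / n) + n_params * np.log(n)

def r_squared(y, y_fit):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([1.5, 3.8, 8.2, 14.9, 23.5, 34.8, 48.2, 64.1])

# Fit polynomials of degree 1 and 2 and score each one
fits = {}
for deg in (1, 2):
    coeffs = np.polyfit(x, y, deg)
    y_fit = np.polyval(coeffs, x)
    fits[deg] = (gaussian_bic(y, y_fit, deg + 1), r_squared(y, y_fit))
    print(f"degree {deg}: BIC = {fits[deg][0]:.2f}, R² = {fits[deg][1]:.4f}")
```

The k·ln(n) term penalises the extra parameter, so the quadratic only wins if it reduces the residual sum of squares enough to offset that penalty.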

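Unique Features

Persistent Hash-Based Caching

Cache expensive computations to disk. This is especially useful for molecular dynamics, DFT calculations, or long-running simulations.

from pycse.hashcache import HashCache, JsonCache, SqlCache

# Decorator approach
@HashCache()
def expensive_simulation(param1, param2):
    # Long-running computation (complex_calculation is a placeholder for your own code)
    result = complex_calculation(param1, param2)
    return result

# First call: runs the computation and caches the result
result1 = expensive_simulation(1.0, 2.0)

# Second call with the same arguments: retrieved from the cache (near-instant)
result2 = expensive_simulation(1.0, 2.0)

# SqlCache supports searching cached results
@SqlCache(name='my_sim_cache')
def simulation(x, y):
    return complex_calc(x, y)  # placeholder for your own code

# Search the cache
cache = SqlCache(name='my_sim_cache')
results = cache.search({'x': 1.0})  # find all cached results with x=1.0

Cache types:

HashCache: pickle-based (fastest)
JsonCache: JSON format (human-readable, compatible with maggma)
SqlCache: SQLite, with search() support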
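
The idea behind hash-based disk caching can be sketched with the standard library alone: hash the function name and arguments, use the digest as a file name, and return the pickled result when that file already exists. The `disk_cache` decorator below is a hypothetical illustration, not pycse's implementation, which likely handles details such as source-code changes and storage backends differently.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

def disk_cache(cache_dir):
    """Hypothetical decorator: cache results on disk, keyed by a hash of the call."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Hash the function name and arguments to get a stable file name
            key_bytes = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
            path = cache_dir / (hashlib.sha256(key_bytes).hexdigest() + ".pkl")
            if path.exists():
                # Cache hit: load the stored result instead of recomputing
                return pickle.loads(path.read_bytes())
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))
            return result
        return wrapper
    return decorator

calls = []  # records every real execution of the function body

@disk_cache(tempfile.mkdtemp())
def slow_square(x):
    calls.append(x)
    return x * x

first = slow_square(3.0)   # computed and written to disk
second = slow_square(3.0)  # read back from disk; the body does not run again
print(first, second, len(calls))
```

Because the key depends only on the function name and arguments, changing the function body without clearing the cache directory would return stale results. That is one reason to prefer a maintained implementation for real work.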

Google Sheets Integration

import pycse
from pycse.utils import read_gsheet

# Read directly from a Google Sheet into a pandas DataFrame
url = "https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID/edit"
df = read_gsheet(url)

# Now use the data with pycse functions
# (model and p0 are the model function and initial guess for your fit)
x = df['time'].values
y = df['concentration'].values
p, pint, se = pycse.nlinfit(model, x, y, p0)

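Fuzzy Comparisons

For floating-point comparisons with a tolerance:

import numpy as np
from pycse.utils import feq, fgt, flt, fge, fle

# Check whether a value is close enough to the target
# (calculated_pi is your own computed estimate)
if feq(calculated_pi, np.pi, epsilon=1e-6):
    print("Converged!")

# Fuzzy greater-than (value and threshold are your own numbers)
if fgt(value, threshold, epsilon=1e-8):
    print("Value exceeds the threshold (within tolerance)")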
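
To show why a tolerance matters, here is a small self-contained sketch. The helpers `feq_sketch` and `fgt_sketch` are hypothetical stand-ins written for this example. They mirror the intent of fuzzy equality and fuzzy greater-than but are not copied from pycse, whose exact definitions and default epsilon may differ.

```python
def feq_sketch(x, y, epsilon=1e-9):
    """True if x and y differ by no more than epsilon."""
    return abs(x - y) <= epsilon

def fgt_sketch(x, y, epsilon=1e-9):
    """True if x is greater than y and not fuzzily equal to it."""
    return x > y and not feq_sketch(x, y, epsilon)

total = 0.1 + 0.2                        # not exactly 0.3 in binary floating point

exact_equal = (total == 0.3)             # strict comparison fails
fuzzy_equal = feq_sketch(total, 0.3)     # tolerant comparison succeeds

barely_above = fgt_sketch(0.3 + 1e-12, 0.3)  # within tolerance, so not "greater"
clearly_above = fgt_sketch(0.4, 0.3)

print(exact_equal, fuzzy_equal, barely_above, clearly_above)
```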

Installation

pip install pycse

Requirements: Python 3.6+, numpy, scipy

Common Mistakes

❌ Forgetting the initial guess for a nonlinear fit:

# Fails: nlinfit requires an initial parameter guess
p, pint, se = pycse.nlinfit(model, x, y)  # missing p0!

✅ Correct:

p, pint, se = pycse.nlinfit(model, x, y, p0=[100, 0.1])

❌ Wrong shape passed to regress():

# regress expects A to be 2D with shape (n_observations, n_parameters)
A = x  # 1D array: wrong!
p, pint, se = pycse.regress(A, y)

✅ Correct:

# Add an intercept column
A = np.column_stack([np.ones(len(x)), x])  # shape: (n, 2)
p, pint, se = pycse.regress(A, y)

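When to Use pycse vs scipy

Use pycse when:

You need confidence intervals (pycse returns them automatically)
You run many regressions in one workflow (consistent interface)
You want prediction intervals with error bounds
You need to cache expensive computations
You are integrating with Google Sheets

Use scipy when:

You need custom optimization methods
You are doing complex constrained optimization
You need functionality that pycse does not expose
You are building low-level computational tools

Both are valid! pycse wraps scipy for convenience; it is not a replacement.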

Additional Resources

GitHub: https://github.com/jkitchin/pycse
Documentation: https://kitchingroup.cheme.cmu.edu/pycse/
Examples: the pycse book covers scientific computing topics in 400+ pages