name: atheris
type: fuzzer
description: >
  Atheris is a coverage-guided Python fuzzer based on libFuzzer. Use it to fuzz pure Python code and Python C extensions.

Atheris

Atheris is a coverage-guided Python fuzzer based on libFuzzer. It supports fuzzing pure Python code and Python C extensions, and it integrates with AddressSanitizer to detect memory corruption.

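When to Use

Fuzzer	Best for	Complexity
Atheris	Python code and C extensions	Low to medium
Hypothesis	Property-based testing	Low
python-afl	AFL-style fuzzing	Medium

Choose Atheris when:

You need coverage-guided fuzzing of pure Python code
You are testing Python C extensions for memory corruption
You want to integrate with the libFuzzer ecosystem
You need AddressSanitizer support

Quick Start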

import sys
import atheris

@atheris.instrument_func
def test_one_input(data: bytes):
    if len(data) == 4:
        if data[0] == 0x46:  # "F"
            if data[1] == 0x55:  # "U"
                if data[2] == 0x5A:  # "Z"
                    if data[3] == 0x5A:  # "Z"
                        raise RuntimeError("You caught me")

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()

Run:

python fuzz.py

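Installation

Atheris supports 32-bit and 64-bit Linux, and macOS. We recommend fuzzing on Linux because it is simpler to manage and usually faster.

Prerequisites

Python 3.7 or later
A recent version of clang (preferably the latest release)
For Docker users: Docker Desktop

Linux/macOS

uv pip install atheris

Docker Environment (Recommended)

For a fully working Linux environment with all dependencies configured: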

# https://hub.docker.com/_/python
ARG PYTHON_VERSION=3.11

FROM python:$PYTHON_VERSION-slim-bookworm

RUN python --version

RUN apt update && apt install -y \
    ca-certificates \
    wget \
    && rm -rf /var/lib/apt/lists/*

# LLVM builds versions 15-19 for Debian 12 (Bookworm)
# https://apt.llvm.org/bookworm/dists/
ARG LLVM_VERSION=19

RUN echo "deb http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-$LLVM_VERSION main" > /etc/apt/sources.list.d/llvm.list
RUN echo "deb-src http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-$LLVM_VERSION main" >> /etc/apt/sources.list.d/llvm.list
RUN wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key > /etc/apt/trusted.gpg.d/apt.llvm.org.asc

RUN apt update && apt install -y \
    build-essential \
    clang-$LLVM_VERSION \
    && rm -rf /var/lib/apt/lists/*

ENV APP_DIR "/app"
RUN mkdir $APP_DIR
WORKDIR $APP_DIR

ENV VIRTUAL_ENV "/opt/venv"
RUN python -m venv $VIRTUAL_ENV
ENV PATH "$VIRTUAL_ENV/bin:$PATH"

# https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#step-1-compiling-your-extension
ENV CC="clang-$LLVM_VERSION"
ENV CFLAGS "-fsanitize=address,fuzzer-no-link"
ENV CXX="clang++-$LLVM_VERSION"
ENV CXXFLAGS "-fsanitize=address,fuzzer-no-link"
ENV LDSHARED="clang-$LLVM_VERSION -shared"
ENV LDSHAREDXX="clang++-$LLVM_VERSION -shared"
ENV ASAN_SYMBOLIZER_PATH="/usr/bin/llvm-symbolizer-$LLVM_VERSION"

# Allow the Atheris build to find the libFuzzer library
# https://github.com/google/atheris#building-from-source
RUN LIBFUZZER_LIB=$($CC -print-file-name=libclang_rt.fuzzer_no_main-$(uname -m).a) \
    python -m pip install --no-binary atheris atheris

# https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#option-a-sanitizerlibfuzzer-preloads
# This path hard-codes python3.11; update it if PYTHON_VERSION changes
ENV LD_PRELOAD "$VIRTUAL_ENV/lib/python3.11/site-packages/asan_with_fuzzer.so"

# 1. Skip memory allocation failures for now; they are common and low impact (DoS)
# 2. https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#leak-detection
ENV ASAN_OPTIONS "allocator_may_return_null=1,detect_leaks=0"

CMD ["/bin/bash"]

Build and run:

docker build -t atheris .
docker run -it atheris

Verification

python -c "import atheris; print(atheris.__version__)"

Writing a Harness

Harness Structure for Pure Python

import sys
import atheris

@atheris.instrument_func
def test_one_input(data: bytes):
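    """
    Fuzzing entry point. Called with random byte sequences.

    Args:
        data: Random bytes generated by the fuzzer
    """
    # Add input validation if needed
    if len(data) < 1:
        return

    # Call the target function
    try:
        your_target_function(data)
    except ValueError:
        # Expected exceptions should be caught
        pass
    # Let unexpected exceptions crash (that is what we are looking for!)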

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()

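Harness Rules

Do	Don't
Use `@atheris.instrument_func` for coverage	Forget to instrument the target code
Catch expected exceptions	Catch all exceptions indiscriminately
Use `atheris.instrument_imports()` for libraries	Import modules after `atheris.Setup()`
Keep the harness deterministic	Use randomness or time-based behavior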
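The determinism rule can be spot-checked before a campaign starts. The following is a minimal sketch, not an Atheris feature: the helpers `outcome` and `is_deterministic` and both example harnesses are hypothetical names introduced here. It replays one input several times and compares only the coarse outcome (normal return versus exception type), so it will miss subtler nondeterminism such as differing coverage.

```python
import itertools


def outcome(harness, data: bytes) -> str:
    """Summarise one harness run as 'ok' or the raised exception's type name."""
    try:
        harness(data)
        return "ok"
    except Exception as exc:
        return type(exc).__name__


def is_deterministic(harness, data: bytes, runs: int = 5) -> bool:
    """Return True if repeated runs on identical input give a single outcome."""
    return len({outcome(harness, data) for _ in range(runs)}) == 1


def stable_harness(data: bytes) -> None:
    # Behaviour depends only on the input bytes.
    if data.startswith(b"FUZZ"):
        raise RuntimeError("input-dependent failure")


_calls = itertools.count()


def flaky_harness(data: bytes) -> None:
    # Anti-pattern: the outcome depends on hidden global state, not the input.
    if next(_calls) % 2:
        raise RuntimeError("state-dependent failure")


print(is_deterministic(stable_harness, b"FUZZ"))
print(is_deterministic(flaky_harness, b"FUZZ"))
```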

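See also: for detailed harness-writing techniques, patterns for handling complex inputs, and advanced strategies, see the fuzz-harness-writing technique skill.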

Fuzzing Pure Python Code

To fuzz larger parts of an application or library, use the instrumentation functions:

import sys
import atheris
with atheris.instrument_imports():
    import your_module
    from another_module import target_function

def test_one_input(data: bytes):
    target_function(data)

atheris.Setup(sys.argv, test_one_input)
atheris.Fuzz()

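Instrumentation options:

atheris.instrument_func - decorator that instruments a single function
atheris.instrument_imports() - context manager that instruments every module imported inside it
atheris.instrument_all() - instruments all Python code, system-wide

Fuzzing Python C Extensions

Python C extensions must be compiled with specific flags to enable instrumentation and sanitizers.

Environment Configuration

If you use the provided Dockerfile, these are already configured. For a local setup: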

export CC="clang"
export CFLAGS="-fsanitize=address,fuzzer-no-link"
export CXX="clang++"
export CXXFLAGS="-fsanitize=address,fuzzer-no-link"
export LDSHARED="clang -shared"

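Example: Fuzzing cbor2

Install the extension from source:

CBOR2_BUILD_C_EXTENSION=1 python -m pip install --no-binary cbor2 cbor2==5.6.4

The --no-binary flag ensures the C extension is compiled locally, with instrumentation enabled.

Create cbor2-fuzz.py: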

import sys
import atheris

# Importing from _cbor2 ensures the C implementation is used
from _cbor2 import loads

def test_one_input(data: bytes):
    try:
        loads(data)
    except Exception:
        # We are looking for memory corruption, not Python exceptions
        pass

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()

Run:

python cbor2-fuzz.py

Important: when running locally (outside Docker), you must set LD_PRELOAD manually.

Corpus Management

Creating an Initial Corpus

mkdir corpus
# Add seed inputs
echo "test data" > corpus/seed1
echo '{"key": "value"}' > corpus/seed2

Run with the corpus:

python fuzz.py corpus/
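Seed creation can also be scripted, which helps when seeds contain bytes that are awkward to produce with `echo`. The sketch below is one possible approach under stated assumptions: the helper `write_seeds` is hypothetical, and naming each file after the SHA-1 of its content is a choice made here so that duplicate seeds collapse into one file. Unlike `echo`, it writes the bytes exactly, with no trailing newline.

```python
import hashlib
import tempfile
from pathlib import Path


def write_seeds(corpus_dir: Path, seeds: list[bytes]) -> list[Path]:
    """Write each seed into corpus_dir, named by the SHA-1 of its content."""
    corpus_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for seed in seeds:
        path = corpus_dir / hashlib.sha1(seed).hexdigest()
        path.write_bytes(seed)
        written.append(path)
    return written


# Illustrative seeds; real ones should be valid inputs for your target.
SEEDS = [b"test data", b'{"key": "value"}', b'{"key": "value"}']

# A temporary directory keeps the demo self-cleaning; use Path("corpus") in practice.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_seeds(Path(tmp) / "corpus", SEEDS)
    unique_files = sorted(p.name for p in (Path(tmp) / "corpus").iterdir())

print(len(paths), len(unique_files))
```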

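Corpus Minimization

Atheris inherits corpus minimization from libFuzzer. With -merge=1, inputs from old_corpus/ that add coverage are copied into new_corpus/:

python fuzz.py -merge=1 new_corpus/ old_corpus/

See also: for corpus creation strategies, dictionaries, and seed selection, see the fuzzing-corpus technique skill.

Running Campaigns

Basic Run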

python fuzz.py

With a Corpus Directory

python fuzz.py corpus/

Common Options

# Run for 10 minutes
python fuzz.py -max_total_time=600

# Limit input size
python fuzz.py -max_len=1024

# Run multiple workers
python fuzz.py -workers=4 -jobs=4

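Interpreting Output

Output	Meaning
`NEW cov: X`	New coverage found; the corpus grows
`pulse cov: X`	Periodic status update
`exec/s: X`	Executions per second (throughput)
`corp: X/Yb`	Corpus size: X inputs, Y bytes total
`ERROR: libFuzzer`	Crash detected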
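For long campaigns it can help to extract these fields programmatically, for example to plot coverage over time. The following is a rough sketch that assumes status lines carry the fields from the table above in that order; the `SAMPLE` line is illustrative rather than captured from a real run, and `parse_status` is a hypothetical helper, not part of Atheris or libFuzzer.

```python
import re
from typing import Optional

# Matches the execution counter, event name, and the cov/corp/exec/s fields.
STATUS_RE = re.compile(
    r"#(?P<execs>\d+)\s+(?P<event>\w+)\s+"
    r"cov: (?P<cov>\d+).*?"
    r"corp: (?P<inputs>\d+)/(?P<size>\d+\w*).*?"
    r"exec/s: (?P<rate>\d+)"
)


def parse_status(line: str) -> Optional[dict]:
    """Return the status fields of one output line, or None if it has none."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "executions": int(match["execs"]),
        "event": match["event"],
        "coverage": int(match["cov"]),
        "corpus_inputs": int(match["inputs"]),
        "corpus_size": match["size"],
        "execs_per_sec": int(match["rate"]),
    }


# Illustrative line shaped like fuzzer status output (assumed layout).
SAMPLE = "#4096\tNEW    cov: 17 ft: 23 corp: 5/42b lim: 4096 exec/s: 2048 rss: 38Mb"
status = parse_status(SAMPLE)
print(status)
```

Lines that do not match, such as startup banners, return `None` and can simply be skipped.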

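Sanitizer Integration

AddressSanitizer (ASan)

When you use the provided Docker environment or compile with the appropriate flags, AddressSanitizer is integrated automatically.

For a local setup:

export CFLAGS="-fsanitize=address,fuzzer-no-link"
export CXXFLAGS="-fsanitize=address,fuzzer-no-link"

Configure ASan behavior:

export ASAN_OPTIONS="allocator_may_return_null=1,detect_leaks=0"

LD_PRELOAD Configuration

For native extension fuzzing: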

export LD_PRELOAD="$(python -c 'import atheris; print(atheris.path())')/asan_with_fuzzer.so"
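Because `LD_PRELOAD` must be set before the interpreter starts, it can be useful to locate the library without importing Atheris. The sketch below is an assumption-laden helper, not an Atheris API: `find_preload` and `find_atheris_preload` are hypothetical names, and the search checks both the `site-packages` directory and the package directory because the exact install location may vary. The self-check runs against a fake directory layout, so it works even when Atheris is not installed.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Optional

PRELOAD_NAME = "asan_with_fuzzer.so"


def find_preload(package_init: Path) -> Optional[Path]:
    """Look for the preload library one level above, or next to, a package."""
    package_dir = package_init.parent
    for directory in (package_dir.parent, package_dir):
        candidate = directory / PRELOAD_NAME
        if candidate.is_file():
            return candidate
    return None


def find_atheris_preload() -> Optional[Path]:
    """Locate the library for an installed Atheris without importing it."""
    spec = importlib.util.find_spec("atheris")
    if spec is None or spec.origin is None:
        return None  # Atheris is not installed in this environment
    return find_preload(Path(spec.origin))


# Self-check against a fake site-packages layout (no Atheris required).
with tempfile.TemporaryDirectory() as tmp:
    site_packages = Path(tmp)
    (site_packages / "atheris").mkdir()
    init_file = site_packages / "atheris" / "__init__.py"
    init_file.touch()
    (site_packages / PRELOAD_NAME).touch()
    found = find_preload(init_file)
    found_name = found.name if found else None
    found_in_site_packages = found is not None and found.parent == site_packages

print(find_atheris_preload())
```

A wrapper script could export the printed path as `LD_PRELOAD` and then launch the harness in a fresh process.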

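See also: for detailed sanitizer configuration, common issues, and advanced flags, see the address-sanitizer and undefined-behavior-sanitizer technique skills.

Common Sanitizer Issues

Problem	Solution
`LD_PRELOAD` not set	Export `LD_PRELOAD` pointing to `asan_with_fuzzer.so`
Memory allocation failures	Set `ASAN_OPTIONS=allocator_may_return_null=1`
Leak detection noise	Set `ASAN_OPTIONS=detect_leaks=0`
Missing symbolizer	Set `ASAN_SYMBOLIZER_PATH` to the `llvm-symbolizer` path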

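Advanced Usage

Tips and Tricks

Tip	Why it helps
Use `atheris.instrument_imports()` early	Ensures all imports are instrumented for coverage
Start with a small `max_len`	Faster initial fuzzing; increase gradually
Use a dictionary for structured formats	Helps the fuzzer discover format tokens
Run multiple parallel instances	Better coverage exploration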
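The dictionary tip can be made concrete with a small generator. libFuzzer dictionaries are plain text files with one `name="value"` entry per line, where backslash, double quote, and non-printable bytes are escaped. The sketch below renders such a file from a token map; the helper names and the JSON tokens are illustrative assumptions, not part of Atheris.

```python
def escape_token(token: bytes) -> str:
    """Escape a token for use inside a quoted dictionary entry."""
    out = []
    for byte in token:
        char = chr(byte)
        if char in ('"', "\\"):
            out.append("\\" + char)        # quote and backslash need a backslash
        elif 0x20 <= byte < 0x7F:
            out.append(char)               # printable ASCII is kept as-is
        else:
            out.append(f"\\x{byte:02X}")   # everything else becomes \xNN
    return "".join(out)


def render_dictionary(tokens: dict[str, bytes]) -> str:
    """Render one name="value" line per token."""
    lines = [f'{name}="{escape_token(tok)}"' for name, tok in tokens.items()]
    return "\n".join(lines) + "\n"


# Illustrative tokens for a JSON-like format.
JSON_TOKENS = {
    "kw_true": b"true",
    "kw_null": b"null",
    "quote": b'"',
    "newline": b"\n",
}

dictionary_text = render_dictionary(JSON_TOKENS)
print(dictionary_text, end="")
```

After saving the output to a file such as `json.dict`, pass it to the fuzzer with libFuzzer's `-dict` option, for example `python fuzz.py -dict=json.dict corpus/`.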

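Custom Instrumentation

Fine-tune what gets instrumented:

import atheris

# Instrument only specific modules
with atheris.instrument_imports():
    import target_module
# The harness code itself is not instrumented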

def test_one_input(data: bytes):
    target_module.parse(data)

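Performance Tuning

Setting	Impact
`-max_len=N`	Smaller values = faster execution
`-workers=N -jobs=N`	Parallel fuzzing for faster coverage
`ASAN_OPTIONS=fast_unwind_on_malloc=0`	Better stack traces, slower execution

UndefinedBehaviorSanitizer (UBSan)

Add UBSan to catch additional bugs: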

export CFLAGS="-fsanitize=address,undefined,fuzzer-no-link"
export CXXFLAGS="-fsanitize=address,undefined,fuzzer-no-link"

Note: if you use the containerized setup, change the flags in the Dockerfile.

Real-World Examples

Example: Pure Python Parser

import sys
import atheris
import json

@atheris.instrument_func
def test_one_input(data: bytes):
    try:
        # Fuzz Python's JSON parser
        json.loads(data.decode('utf-8', errors='ignore'))
    except (ValueError, UnicodeDecodeError):
        pass

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()

Example: HTTP Response Parsing

import sys
import atheris

with atheris.instrument_imports():
    from urllib3 import HTTPResponse
    from io import BytesIO

def test_one_input(data: bytes):
    try:
        # Fuzz HTTP response parsing
        fake_response = HTTPResponse(
            body=BytesIO(data),
            headers={},
            preload_content=False
        )
        fake_response.read()
    except Exception:
        pass

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()

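Troubleshooting

Problem	Cause	Solution
No coverage increase	Poor seed corpus or target not instrumented	Add better seeds; verify `instrument_imports()`
Slow execution	ASan overhead or large inputs	Reduce `max_len`; use `ASAN_OPTIONS=fast_unwind_on_malloc=1`
Import errors	Modules imported before instrumentation	Move imports into the `instrument_imports()` context
Segfault with no ASan output	Missing `LD_PRELOAD`	Set `LD_PRELOAD` to the path of `asan_with_fuzzer.so`
Build failures	Wrong compiler or missing flags	Verify `CC`, `CFLAGS`, and the clang version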

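Atheris GitHub repository: the official repository, with installation instructions, examples, and documentation for fuzzing pure Python code and native extensions.

Native extension fuzzing guide: a comprehensive guide covering compilation flags, LD_PRELOAD setup, sanitizer configuration, and troubleshooting for Python C extensions.

Continuously fuzzing Python C extensions: a Trail of Bits blog post covering CI/CD integration, ClusterFuzzLite setup, and real-world examples of fuzzing Python C extensions in continuous integration pipelines.

ClusterFuzzLite Python integration: a guide to integrating Atheris fuzzing into CI/CD pipelines with ClusterFuzzLite for automated continuous fuzzing.

Video Resources

Videos and tutorials are available in the main Atheris documentation and in libFuzzer resources.

Skill	Use case
fuzz-harness-writing	Detailed guide to writing effective harnesses
address-sanitizer	Memory error detection during fuzzing
undefined-behavior-sanitizer	Catching undefined behavior in C extensions
coverage-analysis	Measuring and improving code coverage
fuzzing-corpus	Building and managing seed corpora

Skill	When to consider
hypothesis	Property-based testing with type-aware generation
python-afl	AFL-style fuzzing for Python when Atheris is not available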
