Puppeteer Skill (puppeteer)

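Puppeteer is a Node.js library developed by Google for automating headless Chrome or Chromium. It is mainly used for web scraping, automated testing, PDF generation, and page screenshots. Because it drives a real browser and can simulate user actions, it handles dynamically rendered pages well, which makes it useful in front-end development, testing, and DevOps work.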

0 installs · 2 views · Updated 2/28/2026

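name: puppeteer
description: Browser automation and PDF generation with Puppeteer (Google). Supports headless Chrome control for web scraping, screenshots, PDF generation, and automated testing.
metadata:
  short-description: Browser automation and PDF generation
  source:
    repository: https://github.com/puppeteer/puppeteer
    license: Apache-2.0
    stars: 89k+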

Puppeteer Tool

Description

A headless Chrome/Chromium automation tool for PDF generation, screenshots, web scraping, and testing.

Source

Repository: https://github.com/puppeteer/puppeteer
License: Apache-2.0

Installation

npm install puppeteer

Usage Examples

Generating a PDF from HTML

import puppeteer from 'puppeteer';

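async function generatePDF(html: string, outputPath: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // 'networkidle0' waits until there are no open network connections,
    // so referenced fonts and images have finished loading.
    await page.setContent(html, { waitUntil: 'networkidle0' });

    await page.pdf({
      path: outputPath,
      format: 'A4',
      margin: { top: '20mm', right: '20mm', bottom: '20mm', left: '20mm' },
      printBackground: true, // include CSS backgrounds in the output
    });
  } finally {
    // Always close the browser, even if rendering fails.
    await browser.close();
  }
}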

// Usage example (top-level await requires an ES module)
const html = `
  <html>
    <head><style>body { font-family: Arial; }</style></head>
    <body><h1>Invoice #001</h1><p>Total: $100.00</p></body>
  </html>
`;
await generatePDF(html, 'invoice.pdf');
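
When the HTML passed to `generatePDF` is assembled from dynamic data, the values should be escaped before they are interpolated into the markup. The following is a minimal sketch of that step, not part of Puppeteer itself: `Invoice`, `escapeHtml`, and `renderInvoiceHtml` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical invoice shape, used only for this illustration.
interface Invoice {
  number: string;
  total: string;
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Build a complete HTML document that can be handed to page.setContent().
function renderInvoiceHtml(invoice: Invoice): string {
  return `<!DOCTYPE html>
<html>
  <head><meta charset="utf-8"><style>body { font-family: Arial; }</style></head>
  <body><h1>Invoice #${escapeHtml(invoice.number)}</h1><p>Total: ${escapeHtml(invoice.total)}</p></body>
</html>`;
}
```

The result can then be rendered with `generatePDF(renderInvoiceHtml({ number: '001', total: '$100.00' }), 'invoice.pdf')`.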

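Screenshots

async function takeScreenshot(url: string, outputPath: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    await page.setViewport({ width: 1920, height: 1080 });
    // 'networkidle2' tolerates up to two open connections, which suits
    // pages that keep long-polling or analytics requests alive.
    await page.goto(url, { waitUntil: 'networkidle2' });

    await page.screenshot({
      path: outputPath,
      fullPage: true, // capture the whole scrollable page, not just the viewport
      type: 'png',
    });
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
}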

Web Scraping

async function scrapeData(url: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    await page.goto(url, { waitUntil: 'domcontentloaded' });

    // The callback runs inside the page, so it can use the DOM directly;
    // only serializable values are returned to Node.js.
    const data = await page.evaluate(() => {
      const items = document.querySelectorAll('.product');
      return Array.from(items).map(item => ({
        title: item.querySelector('h2')?.textContent?.trim(),
        price: item.querySelector('.price')?.textContent?.trim(),
      }));
    });

    return data;
  } finally {
    // Always close the browser, even if scraping fails.
    await browser.close();
  }
}
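
The rows returned by `scrapeData` are raw strings, and either field may be `undefined` when an element is missing. A post-processing step like the one below is one possible way to clean them up. It is a sketch under stated assumptions: `RawProduct`, `Product`, `parsePrice`, and `normalizeProducts` are hypothetical names, and prices are assumed to use `.` as the decimal separator and `,` as the thousands separator.

```typescript
// Shape of one scraped row; a field is undefined when its element was not found.
interface RawProduct {
  title?: string;
  price?: string;
}

// Cleaned row with a numeric price.
interface Product {
  title: string;
  price: number;
}

// Convert a label such as "$1,299.50" to a number.
// Assumes "." is the decimal separator; returns null if no number is present.
function parsePrice(label: string): number | null {
  const cleaned = label.replace(/[^0-9.]/g, '');
  const value = Number.parseFloat(cleaned);
  return Number.isNaN(value) ? null : value;
}

// Drop incomplete rows and convert the remaining prices to numbers.
function normalizeProducts(rows: RawProduct[]): Product[] {
  const products: Product[] = [];
  for (const row of rows) {
    if (!row.title || !row.price) continue;
    const price = parsePrice(row.price);
    if (price !== null) products.push({ title: row.title, price });
  }
  return products;
}
```

A typical call would be `normalizeProducts(await scrapeData(url))`.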

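Form Automation

async function submitForm(url: string, formData: Record<string, string>) {
  // headless: false opens a visible browser window, which helps when debugging.
  const browser = await puppeteer.launch({ headless: false });
  try {
    const page = await browser.newPage();

    await page.goto(url);

    // Fill in the form fields; each key of formData is a CSS selector.
    for (const [selector, value] of Object.entries(formData)) {
      await page.type(selector, value);
    }

    // Submit. Start waiting for the navigation before clicking, so a fast
    // page load cannot finish before the wait is registered.
    await Promise.all([
      page.waitForNavigation(),
      page.click('button[type="submit"]'),
    ]);
  } finally {
    // Always close the browser, even if submission fails.
    await browser.close();
  }
}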
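
`submitForm` expects the keys of `formData` to be CSS selectors. If the fields are easier to address by their `name` attribute, a small adapter can build that map. This is only a sketch: `toSelectorMap` is a hypothetical helper, and it assumes every target field carries a unique `name` attribute.

```typescript
// Map plain field names to attribute selectors of the form [name="..."].
function toSelectorMap(fields: Record<string, string>): Record<string, string> {
  const selectors: Record<string, string> = {};
  for (const [name, value] of Object.entries(fields)) {
    // Escape backslashes and double quotes so the name is safe inside the quoted attribute value.
    const safeName = name.replace(/["\\]/g, '\\$&');
    selectors[`[name="${safeName}"]`] = value;
  }
  return selectors;
}
```

For example, `toSelectorMap({ email: 'user@example.com' })` yields a map whose single key is `[name="email"]`, which can be passed directly as the `formData` argument.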

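PDF Options

// Commonly used options for page.pdf(). This is a simplified subset;
// Puppeteer exports its own PDFOptions type.
interface PDFOptions {
  path?: string;
  scale?: number;                    // 0.1 to 2, default 1
  displayHeaderFooter?: boolean;
  headerTemplate?: string;
  footerTemplate?: string;
  printBackground?: boolean;
  landscape?: boolean;
  pageRanges?: string;               // e.g. '1-5, 8, 11-13'
  format?: 'Letter' | 'Legal' | 'A4' | 'A3';  // other paper sizes are also accepted
  width?: string | number;
  height?: string | number;
  margin?: {
    top?: string | number;
    right?: string | number;
    bottom?: string | number;
    left?: string | number;
  };
}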
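
The header and footer options work together: `displayHeaderFooter` must be enabled, the templates are HTML strings, and the margins need to leave room for them. The sketch below shows one way to assemble such an options object. `HeaderFooterPdfOptions` and `buildReportPdfOptions` are hypothetical names; the `title`, `pageNumber`, and `totalPages` classes are placeholders that Puppeteer fills in when it prints the templates.

```typescript
// Local subset of the page.pdf() options, limited to the fields set below.
interface HeaderFooterPdfOptions {
  path: string;
  format: 'A4';
  scale: number;
  displayHeaderFooter: boolean;
  headerTemplate: string;
  footerTemplate: string;
  printBackground: boolean;
  margin: { top: string; right: string; bottom: string; left: string };
}

// Shared inline style; header and footer templates do not inherit the page's CSS.
const templateStyle = 'font-size:9px; width:100%; text-align:center;';

function buildReportPdfOptions(path: string, scale = 1): HeaderFooterPdfOptions {
  // Reject values outside the documented 0.1 to 2 range before calling page.pdf().
  if (scale < 0.1 || scale > 2) {
    throw new RangeError(`scale must be between 0.1 and 2, got ${scale}`);
  }
  return {
    path,
    format: 'A4',
    scale,
    displayHeaderFooter: true,
    // class="title" is replaced with the document title.
    headerTemplate: `<div style="${templateStyle}"><span class="title"></span></div>`,
    // class="pageNumber" and class="totalPages" are replaced per page.
    footerTemplate:
      `<div style="${templateStyle}">Page <span class="pageNumber"></span>` +
      ` of <span class="totalPages"></span></div>`,
    printBackground: true,
    // Top and bottom margins leave space for the header and footer.
    margin: { top: '25mm', right: '20mm', bottom: '25mm', left: '20mm' },
  };
}
```

The returned object can be passed as `await page.pdf(buildReportPdfOptions('report.pdf'))`.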

Tags

browser, pdf, screenshot, automation, scraping

Compatibility

  • Codex: ✅
  • Claude Code: ✅