name: puppeteer description: 使用Puppeteer(Google)进行浏览器自动化和PDF生成。支持无头Chrome控制,用于网页爬虫、截图、PDF生成和自动化测试。 metadata: short-description: 浏览器自动化和PDF生成 source: repository: https://github.com/puppeteer/puppeteer license: Apache-2.0 stars: 89k+
Puppeteer 工具
描述
用于PDF生成、截图、网页爬虫和测试的无头Chrome/Chromium自动化工具。
来源
- 代码库:puppeteer/puppeteer
- 许可证:Apache-2.0
- 维护者:Google
安装
npm install puppeteer
使用示例
从HTML生成PDF
import puppeteer from 'puppeteer';
async function generatePDF(html: string, outputPath: string) {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent(html, { waitUntil: 'networkidle0' });
await page.pdf({
path: outputPath,
format: 'A4',
margin: { top: '20mm', right: '20mm', bottom: '20mm', left: '20mm' },
printBackground: true,
});
await browser.close();
}
// 使用示例
const html = `
<html>
<head><style>body { font-family: Arial; }</style></head>
<body><h1>发票 #001</h1><p>总计:$100.00</p></body>
</html>
`;
await generatePDF(html, 'invoice.pdf');
截图
async function takeScreenshot(url: string, outputPath: string) {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 });
await page.goto(url, { waitUntil: 'networkidle2' });
await page.screenshot({
path: outputPath,
fullPage: true,
type: 'png',
});
await browser.close();
}
网页爬虫
async function scrapeData(url: string) {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(url, { waitUntil: 'domcontentloaded' });
const data = await page.evaluate(() => {
const items = document.querySelectorAll('.product');
return Array.from(items).map(item => ({
title: item.querySelector('h2')?.textContent?.trim(),
price: item.querySelector('.price')?.textContent?.trim(),
}));
});
await browser.close();
return data;
}
表单自动化
async function submitForm(url: string, formData: Record<string, string>) {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto(url);
// 填写表单字段
for (const [selector, value] of Object.entries(formData)) {
await page.type(selector, value);
}
// 提交
await page.click('button[type="submit"]');
await page.waitForNavigation();
await browser.close();
}
PDF选项
interface PDFOptions {
path?: string;
scale?: number; // 0.1 - 2, 默认 1
displayHeaderFooter?: boolean;
headerTemplate?: string;
footerTemplate?: string;
printBackground?: boolean;
landscape?: boolean;
pageRanges?: string; // '1-5, 8, 11-13'
format?: 'Letter' | 'Legal' | 'A4' | 'A3';
width?: string;
height?: string;
margin?: { top, right, bottom, left };
}
标签
浏览器, pdf, 截图, 自动化, 爬虫
兼容性
- Codex: ✅
- Claude Code: ✅