Selenium vs Playwright vs Puppeteer 2026: 35-55 pages/min winner

January 9, 2026 · 7 min read

Software Developer & Automation Specialist

I build production AI agents, web scrapers, and automation pipelines. Most of what I publish here comes from the actual problems they run into: proxies that get banned, anti-bot stacks that fingerprint your client, RAG that drifts when the underlying data moves. Stack: Python, TypeScript, Go, FastAPI, LangChain, Crawlee, Playwright, deployed on AWS, GCP, and Cloudflare.

Quick Answer

For new scraping projects in 2026, Playwright wins: it runs Chromium, Firefox, and WebKit from one install with built-in auto-wait and trace viewer, hitting ~35–55 pages/min sequentially on static URLs. Puppeteer 25 is a leaner Node library that drives Chrome over CDP and Firefox over WebDriver BiDi, with lower idle RAM. Selenium 4 still leads when WebDriver Grid, Java/C#, or BiDi network logging are non-negotiable.

If you are choosing a driver for web scraping and automation in 2026, the decision is mostly about protocol, waiting model, and browser coverage—not brand loyalty. This guide compares Selenium, Playwright, and Puppeteer feature by feature, sketches realistic performance expectations, shows minimal starter code for each, and ends with Playwright on Apify Crawlee as the default production path.

Quick verdict

Playwright is the best choice for web scraping in 2026: it is faster than Selenium for typical DOM automation, covers more browsers and language bindings than Puppeteer, and ships with built-in auto-waiting. Selenium 4 is best for legacy test suites or BiDi-mandated environments.

Use Puppeteer when you are Chrome-only (or Chrome + Firefox via BiDi), Node-only, and want a minimal CDP wrapper. Use Selenium when you must integrate with existing WebDriver-based QA, non-Node stacks, or Selenium Grid that already standardised on WebDriver.

Comparison at a glance

Dimension	Selenium	Puppeteer	Playwright
Performance	Slowest for typical DOM automation (WebDriver round-trips; manual waits add latency)	Fast on Chromium (direct CDP over WebSocket)	Fast on Chromium; comparable to Puppeteer for same work; less polling than Selenium
API quality	Verbose; waits are mostly explicit (`WebDriverWait`, expected conditions)	Lean, low-level CDP-centric API	Strong auto-waiting, locators, tracing, codegen
Browser support	Chrome, Firefox, Safari/WebKit, Edge (via drivers)	Chrome/Chromium by default; Firefox via WebDriver BiDi; no WebKit	Chromium, Firefox, WebKit out of the box
Headless support	Yes (driver + browser flags)	Yes (headless by default in v25; old "shell" mode opt-in)	Yes (consistent headless across engines)
Community & docs	Huge (oldest ecosystem); tons of Stack Overflow answers	Large (Chrome automation); Node-centric	Very large; active scraping/automation content
Best use case	Legacy WebDriver tests, orgs standardised on Selenium grids	Chrome-only bots, PDF/screenshot microservices, Node CDP scripts	Default for new scraping: multi-browser, reliable waits, Crawlee integration

Feature-by-feature comparison

Feature	Selenium	Puppeteer	Playwright
Wire protocol	W3C WebDriver + BiDi (Selenium 4)	CDP (WebSocket) or WebDriver BiDi (Puppeteer 25)	Playwright protocol (WebSocket; CDP where needed)
Auto-waiting	Manual (expected conditions, sleeps)	Partial; you often wait for selectors/navigation yourself	Built-in actionability checks before clicks/fills
Contexts / isolation	New driver/session per profile (heavy)	BrowserContexts exist but ergonomics weaker than Playwright	First-class BrowserContext (cookies, storage, proxy per context)
Parallelism	Scale via grid/workers; more moving parts	Parallel pages; watch memory and shared state	Parallel contexts/pages; designed for concurrency
Language bindings	Java, Python, C#, Ruby, JS, …	Node.js	Node, Python, .NET, Java
Stealth / fingerprinting	Community patches (fragile across versions)	Same cat-and-mouse as any Chromium driver	Same; pair with proxies and sensible crawl policy
Screenshots / PDF	Yes	Yes	Yes
Tracing / debugging	Varies by binding	CDP tools	Built-in trace viewer, codegen, UI mode

Scraping takeaway: Selenium pays a per-command HTTP tax and pushes wait complexity to you. Puppeteer and Playwright talk WebSocket to the browser and can react to events instead of polling. Playwright goes further with unified cross-browser binaries and actionability before interactions—fewer flakes on React/Vue SPAs.
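To make the waiting-model difference concrete, here is a minimal standard-library sketch of what an explicit polling wait does conceptually. The names `wait_until`, `FakePage`, and `find_heading` are hypothetical and exist only for illustration; this is not Selenium's implementation, just the general pattern behind `WebDriverWait`-style waits.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(predicate: Callable[[], T], timeout: float = 5.0, interval: float = 0.1) -> T:
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout:.1f}s")
        # Sleeping between checks is the latency cost of polling:
        # the element may have appeared right after the previous check.
        time.sleep(interval)


class FakePage:
    """Hypothetical stand-in for a page whose <h1> only appears after a few checks."""

    def __init__(self, ready_after: int) -> None:
        self.checks = 0
        self.ready_after = ready_after

    def find_heading(self) -> str:
        self.checks += 1
        return "Example Domain" if self.checks >= self.ready_after else ""
```

With a WebDriver client, each check inside such a loop is a separate command sent to the driver, so a shorter interval lowers latency but increases round-trips. Event-driven drivers avoid this trade-off by reacting when the browser reports a change.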

Performance and benchmarks

Micro-benchmarks differ by site, hardware, and whether you measure raw navigation, DOM queries, or end-to-end “scrape 1k listings”. Published comparisons and production experience generally align on:

WebDriver (Selenium) — Higher latency per operation than CDP/WebSocket drivers because commands are serialized HTTP requests to chromedriver (or other drivers). Throughput drops when you add explicit sleeps or aggressive polling for dynamic UI.
Puppeteer vs Playwright (Chromium) — Often similar raw speed for equivalent CDP work. Differences show up in ergonomics (auto-wait, contexts, fixtures) and multi-browser needs, not a universal 2× gap either way.
Flake rate — Playwright’s auto-waiting usually wins on SPA-heavy targets because fewer race conditions mean fewer retries and lower total wall time in production.

Treat vendor numbers as order-of-magnitude hints. Your own benchmark should use representative URLs, realistic concurrency, and the same proxy and headless settings you run in production.
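A benchmark of your own can be as small as the sketch below. It assumes you wrap each driver in a `fetch(url)` callable of your own (a hypothetical name, not part of any library) and run the same URL list through each wrapper; the harness itself uses only the standard library.

```python
import time
from typing import Callable, Iterable


def pages_per_minute(pages: int, elapsed_seconds: float) -> float:
    """Convert a page count and wall-clock duration into pages per minute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return pages * 60.0 / elapsed_seconds


def benchmark(fetch: Callable[[str], object], urls: Iterable[str]) -> dict:
    """Run `fetch` sequentially over `urls` and report throughput and failures."""
    ok = failed = 0
    start = time.perf_counter()
    for url in urls:
        try:
            fetch(url)
            ok += 1
        except Exception:
            # Count the failure but keep going, so one bad URL does not abort the run.
            failed += 1
    elapsed = time.perf_counter() - start
    return {
        "ok": ok,
        "failed": failed,
        "elapsed_s": elapsed,
        "pages_per_min": pages_per_minute(ok, elapsed) if elapsed > 0 else 0.0,
    }
```

As a sanity check on the units, 35–55 pages/min sequentially works out to roughly 1.1–1.7 seconds per page (60 / 55 ≈ 1.09, 60 / 35 ≈ 1.71). Only successful pages count toward throughput here, so a flaky driver is penalised rather than rewarded for failing fast.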

When to use each

Choose Playwright for new scrapers, multi-browser checks, SPA sites, and Crawlee-based jobs on Apify. It is the best general-purpose scraping + automation stack in 2026.
Choose Puppeteer if you are all-in on Chrome (optionally Firefox via WebDriver BiDi), want a smaller API surface, or maintain existing Puppeteer code with no WebKit requirement.
Choose Selenium if you already run Selenium Grid, corporate QA mandates WebDriver, or you need a specific legacy binding. For greenfield data extraction, it is rarely the best default.
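The rules of thumb above can be written down as a small decision function. This is only one possible encoding: the `Requirements` fields and the order of the checks are assumptions made for illustration, and real projects will weigh more factors.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical project constraints used to pick a driver."""

    needs_webkit: bool = False             # Safari-engine coverage required
    has_selenium_grid: bool = False        # existing Selenium Grid investment
    webdriver_mandated: bool = False       # QA or policy requires WebDriver
    node_only: bool = False                # team works exclusively in Node.js
    existing_puppeteer_code: bool = False  # Puppeteer scripts already in production


def choose_driver(req: Requirements) -> str:
    """Return 'selenium', 'puppeteer', or 'playwright' following the guide's rules of thumb."""
    if req.has_selenium_grid or req.webdriver_mandated:
        return "selenium"
    if req.needs_webkit:
        return "playwright"
    if req.node_only and req.existing_puppeteer_code:
        return "puppeteer"
    # Greenfield scraping with no hard constraints.
    return "playwright"
```

For example, `choose_driver(Requirements())` returns `"playwright"`, which reflects the guide's default for greenfield work.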

Getting started: minimal examples

Examples are intentionally tiny—replace selectors and URLs with your targets.

Selenium (Python)

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # starts a local Chrome session
try:
    driver.get("https://example.com")
    # Explicit wait: poll for up to 15 s until the <h1> is visible.
    wait = WebDriverWait(driver, 15)
    el = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1")))
    print(el.text)
finally:
    driver.quit()  # release the browser even if the wait times out

Playwright (Node.js)

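const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait for the network to go quiet; the locator below also auto-waits for the <h1>.
    await page.goto('https://example.com', { waitUntil: 'networkidle' });
    const text = await page.locator('h1').innerText();
    console.log(text);
  } finally {
    await browser.close(); // close the browser even if navigation or the locator fails
  }
})();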

Puppeteer (Node.js)

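const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch(); // headless by default in Puppeteer 25
  try {
    const page = await browser.newPage();
    // 'networkidle2' resolves once at most two network connections remain open.
    await page.goto('https://example.com', { waitUntil: 'networkidle2' });
    // $eval runs the callback in the page against the first matching element.
    const text = await page.$eval('h1', (el) => el.innerText);
    console.log(text);
  } finally {
    await browser.close(); // close the browser even if navigation or $eval throws
  }
})();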

For Python Playwright, run `pip install playwright`, then `playwright install` to download the browser binaries; the API mirrors the Node version closely.
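As a rough Python counterpart to the Node example, the sketch below uses Playwright's synchronous API. The function name `fetch_heading` and its parameters are illustrative choices, and the example assumes the package and browser binaries are already installed.

```python
def fetch_heading(url: str, selector: str = "h1") -> str:
    """Open `url` in headless Chromium and return the text of the first `selector` match."""
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            # Locators auto-wait for the element to exist before reading its text.
            return page.locator(selector).first.inner_text()
        finally:
            browser.close()  # close the browser even if navigation or the locator fails
```

Calling `fetch_heading("https://example.com")` should return the page's `<h1>` text. The snake_case names (`new_page`, `wait_until`, `inner_text`) correspond to the camelCase methods in the Node API.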

Using Playwright with Apify Crawlee

Crawlee is an open-source crawling and browser-automation library that wraps Playwright (and Puppeteer) with queues, retries, session rotation, storage, and scaling hooks—the pieces raw scripts usually bolt on by hand.
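To show what "queues and retries" means in practice, here is a standard-library sketch of the kind of loop a raw script ends up writing by hand. It is not Crawlee's API; `crawl_with_retries`, `handler`, and `max_retries` are hypothetical names, and a real crawler would add backoff, sessions, and persistent storage on top.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple


def crawl_with_retries(
    urls: Iterable[str],
    handler: Callable[[str], dict],
    max_retries: int = 2,
) -> Tuple[List[dict], Dict[str, str]]:
    """Process URLs from a FIFO queue, re-queueing failures up to `max_retries` times."""
    results: List[dict] = []
    failures: Dict[str, str] = {}
    seen = set()
    queue = deque()
    for url in urls:
        if url not in seen:  # skip duplicate URLs so each is processed once
            seen.add(url)
            queue.append((url, 0))  # (url, retries used so far)
    while queue:
        url, attempts = queue.popleft()
        try:
            results.append(handler(url))
        except Exception as exc:
            if attempts < max_retries:
                # Retry later, behind the work that is already queued.
                queue.append((url, attempts + 1))
            else:
                failures[url] = repr(exc)
    return results, failures
```

The function returns the collected items plus a map of URLs that still failed after the retry budget was spent, which is the minimum bookkeeping needed to rerun only the failures.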

On Apify you run Crawlee-based Actors on managed browsers and proxies, with scheduling, webhooks, and datasets for JSON/CSV export. That is the practical path from “works on my laptop” to scheduled production scraping without operating your own grid.

Concrete next step: create a free Apify account, start from a Playwright + Crawlee Actor template in the store, and route output to a dataset for downstream pipelines.

Run browser scraping on Apify’s free plan →

Frequently Asked Questions

Is Selenium slower than Playwright and Puppeteer?

For typical DOM automation, yes: WebDriver’s HTTP round-trip model and manual waiting usually lose wall-clock time versus WebSocket drivers. Absolute numbers depend on the site, concurrency, and proxies, so benchmark your own URLs.

Should I use Puppeteer or Playwright for Chromium-only scraping?

Either works. Playwright adds stronger auto-waiting, richer fixtures, and an easier path to Firefox/WebKit if requirements change. Puppeteer stays valid for small Chromium-only services.

Is Selenium obsolete in 2026?

No. It remains relevant for WebDriver-centric test orgs and multi-language grids. For new data-extraction projects, Playwright (often via Crawlee) is usually the better default.

Should I use Scrapy or Playwright?

They solve different layers. Scrapy excels at large-scale static HTML crawling. Use Playwright when JavaScript rendering, login flows, or complex UI interactions are required, often as a downstream fetcher in a hybrid design.

How do I scale Playwright scraping in production?

Use Crawlee for queueing, retries, and sessions; run on Apify for infrastructure, proxies, and datasets. Avoid unbounded parallel browser instances on a single VM; cap concurrency and memory instead.
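One common way to cap concurrency is a semaphore around each unit of work. The sketch below uses only `asyncio`; `run_bounded`, `worker`, and `max_concurrency` are hypothetical names, and in a real scraper `worker` would open a page or browser context.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def run_bounded(
    urls: Iterable[str],
    worker: Callable[[str], Awaitable[object]],
    max_concurrency: int = 4,
) -> List[object]:
    """Run `worker` over `urls` with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url: str) -> object:
        # Tasks beyond the limit wait here until a slot frees up.
        async with semaphore:
            return await worker(url)

    # return_exceptions=True keeps one failing page from cancelling the rest;
    # failed URLs appear in the result list as exception objects.
    return await asyncio.gather(*(guarded(u) for u in urls), return_exceptions=True)
```

Results come back in the same order as the input URLs. A reasonable starting value for `max_concurrency` depends on how much memory each browser page uses on your machine, so measure before raising it.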

Which driver is best at avoiding bot detection?

No driver is invisible. Combine realistic concurrency, residential/datacenter proxy strategy, and site-appropriate throttling. For hard targets, evaluate dedicated unblocking products such as Bright Data: https://get.brightdata.com/8xa6yqyp2zxn

Start building: Apify free plan ($5 monthly platform credits) → · Browse Playwright/Crawlee templates in the Store →