use-apify.com

Playwright: guides & tutorials

Automate Chromium, Firefox, and WebKit with one API: resilient waits, contexts, and network hooks for modern scraping—ideal for Apify browser Actors.

Playwright automates Chromium, Firefox, and WebKit with one API, making it a top choice for scraping JavaScript-heavy sites that plain HTTP requests cannot reach. It offers resilient auto-waiting, browser contexts for isolation, and network interception for blocking or capturing requests. These guides show how to build scrapers that survive modern dynamic pages.
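As a rough illustration of contexts and network interception together, here is a minimal sketch using Playwright's Python sync API. The set of blocked resource types and the helper names `should_block` and `fetch_title` are assumptions made for this example, not Playwright defaults.

```python
# Resource types to skip. This set is an illustrative choice, not a Playwright default.
BLOCKED_RESOURCE_TYPES = frozenset({"image", "media", "font"})


def should_block(resource_type: str) -> bool:
    """Return True when a request of this type can be dropped to save bandwidth."""
    return resource_type in BLOCKED_RESOURCE_TYPES


def fetch_title(url: str) -> str:
    """Open `url` in an isolated context, block heavy assets, and return the page title."""
    # Imported lazily so should_block() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    def handle(route):
        # Abort requests for heavy assets, let everything else through.
        if should_block(route.request.resource_type):
            route.abort()
        else:
            route.continue_()

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()  # fresh cookies and storage per context
        context.route("**/*", handle)    # intercept every request in this context
        page = context.new_page()
        page.goto(url)                   # waits for the load event by default
        title = page.title()
        browser.close()
        return title
```

Calling `fetch_title("https://example.com")` requires `pip install playwright` and `playwright install chromium`. Routing at the context level means every page opened in that context inherits the same blocking rule.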

Playwright shines on infinite scroll, logins, and content that loads after the initial response, and it pairs cleanly with Crawlee for queuing and Apify for cloud runs. Below you will find tutorials for browser automation, anti-detection tactics, and patterns for adding proxies and retries so your Playwright scrapers stay stable at scale.
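One common retry pattern is exponential backoff with a little jitter. The sketch below is library-agnostic and uses only the standard library. The names `backoff_delays` and `with_retries`, and the default values, are assumptions for illustration; Crawlee and Apify ship their own retry handling.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Delays between attempts: base, 2*base, 4*base, ... each capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts - 1)]


def with_retries(
    task: Callable[[], T],
    attempts: int = 3,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying on any exception; re-raise after the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts, base)
    for i in range(attempts):
        try:
            return task()
        except Exception:
            if i == attempts - 1:
                raise
            # Small random jitter so parallel workers do not retry in lockstep.
            sleep(delays[i] + random.uniform(0, 0.1 * base))
    raise AssertionError("unreachable")
```

A Playwright navigation could be wrapped as `with_retries(lambda: page.goto(url))`, ideally switching to a different proxy between attempts.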

Browser Automation for Web Scraping: Playwright, Puppeteer, and Selenium Deep Dive (2026)

March 19, 2026 · 7 min read

Yassine El Haddad

Software Developer & Automation Specialist

When simple HTTP requests fail — because content is JavaScript-rendered, login is required, or pagination is AJAX-driven — you need a real browser. Playwright, Puppeteer, and Selenium are the three dominant tools. This deep dive covers when to use each, advanced techniques (network interception, CDP access, fingerprint evasion), and how to run them at scale on Apify.

Apify · 5 min read

Crawlee Node.js Tutorial: Production Web Scraping Without the Boilerplate (2026)

March 19, 2026 · 5 min read

Yassine El Haddad

Software Developer & Automation Specialist

Crawlee is an open-source Node.js framework from Apify that bundles everything a production scraper needs: request deduplication, auto-retry, proxy rotation, session management, persistent storage, and Playwright/Puppeteer/HTTP crawlers under one API.

Where raw Playwright requires wiring all those pieces manually, Crawlee provides them out of the box — letting you focus on extraction logic.

Freshness note: Examples verified against Crawlee 3.x (March 2026). Install crawlee@latest to get the current release.
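To make "request deduplication" concrete, here is a conceptual sketch of a queue that skips URLs it has already seen. It is written in plain Python and is not Crawlee's implementation. The normalization rules in `unique_key` and the `DedupQueue` class are assumptions chosen for illustration.

```python
from collections import deque
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def unique_key(url: str) -> str:
    """Normalize a URL: lowercase scheme and host, drop the fragment and trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


class DedupQueue:
    """FIFO queue that enqueues each normalized URL at most once."""

    def __init__(self) -> None:
        self._seen = set()
        self._pending = deque()

    def add(self, url: str) -> bool:
        """Enqueue `url`; return False if an equivalent URL was already added."""
        key = unique_key(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._pending.append(url)
        return True

    def pop(self) -> Optional[str]:
        """Return the next URL to crawl, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None
```

A framework adds persistence, retries, and concurrency on top of this idea, which is the wiring the teaser refers to.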

Guide · 6 min read

Handling Dynamic Websites: AJAX, Infinite Scroll, and Single-Page Apps (2026)

March 19, 2026 · 6 min read

Yassine El Haddad

Software Developer & Automation Specialist

Static HTML scrapers fail on modern sites: content loaded via AJAX, infinite scroll feeds, and single-page apps (SPAs) render in the browser, not the initial response. View Source shows a near-empty shell; Inspect Element reveals the full DOM. This guide covers three approaches—API interception (fastest), Playwright waits (reliable), and scroll/click triggers (for infinite scroll)—plus SPA-specific techniques and a Playwright vs httpx vs Crawlee comparison. Deploy on Apify for managed execution.
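The scroll-trigger approach usually comes down to one loop: scroll, count items, and stop once the count no longer grows. The sketch below keeps that loop independent of any browser so it can be tested with a fake feed. The function name `scroll_until_stable` and its defaults are assumptions for illustration.

```python
from typing import Callable


def scroll_until_stable(
    count_items: Callable[[], int],
    scroll_once: Callable[[], None],
    max_rounds: int = 20,
    patience: int = 2,
) -> int:
    """Scroll until the item count stops growing for `patience` rounds in a row.

    Returns the final item count. `max_rounds` bounds the loop on endless feeds.
    """
    last = count_items()
    idle = 0
    for _ in range(max_rounds):
        scroll_once()
        current = count_items()
        if current > last:
            last, idle = current, 0  # new items arrived, reset the idle counter
        else:
            idle += 1
            if idle >= patience:
                break                # nothing new for several rounds: assume the end
    return last


# Wiring to Playwright's sync API (the ".item" selector is hypothetical):
#   scroll_until_stable(
#       count_items=lambda: page.locator(".item").count(),
#       scroll_once=lambda: (page.mouse.wheel(0, 4000), page.wait_for_timeout(500)),
#   )
```

Counting items rather than comparing page height tends to be more robust on feeds that reserve space with placeholders, though either signal can work.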

Anti Detection · 7 min read

How to Bypass Cloudflare When Web Scraping (2026): Every Method Ranked

March 19, 2026 · 7 min read

Yassine El Haddad

Software Developer & Automation Specialist

Cloudflare Bot Management (including Turnstile, Bot Score, and Managed Rules) is the most common blocker scrapers hit in 2026. It combines TLS fingerprinting, JavaScript challenges, behavioral analysis, and IP reputation scoring — none of which raw requests or fetch can handle.

This guide ranks every bypass method by effectiveness, complexity, and cost.

Legal note: Only scrape data you have a legitimate reason to access. Cloudflare protection is the site's choice; bypassing it may violate the site's terms of service and, in some jurisdictions, laws such as the CFAA. Always check robots.txt and review the terms before scraping.

IPRoyal · 4 min read

IPRoyal Residential Proxies Setup Guide: Python, Node.js, and Playwright (2026)

March 19, 2026 · 4 min read

Yassine El Haddad

Software Developer & Automation Specialist

IPRoyal provides residential rotating proxies with a 32M+ IP pool, starting at approximately $7/GB for small purchases and scaling to ~$1.75/GB at 500 GB — with bandwidth that never expires. This guide covers the exact configuration for Python, Node.js, Playwright, and Crawlee.

Freshness note: Endpoint and format verified March 2026. Check IPRoyal dashboard for current credentials format.
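For the Playwright side of a proxy setup, the `proxy` launch option takes a `server` plus optional `username` and `password`. The sketch below converts a single proxy URL into that shape. The helper name `playwright_proxy` and the host `proxy.example.com:12321` are placeholders, not IPRoyal's actual endpoint, so take the real values from your provider's dashboard.

```python
from urllib.parse import unquote, urlsplit


def playwright_proxy(proxy_url: str) -> dict:
    """Split `scheme://user:pass@host:port` into Playwright's proxy settings dict."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError("proxy URL needs a host and a port")
    config = {"server": f"{parts.scheme}://{parts.hostname}:{parts.port}"}
    # Credentials may be percent-encoded in the URL; Playwright expects them decoded.
    if parts.username:
        config["username"] = unquote(parts.username)
    if parts.password:
        config["password"] = unquote(parts.password)
    return config


# Usage with Playwright's sync API (placeholder host and credentials):
#   browser = p.chromium.launch(
#       proxy=playwright_proxy("http://user:secret@proxy.example.com:12321")
#   )
```

Keeping credentials in one URL, for example in an environment variable, makes it straightforward to reuse the same setting across Python HTTP clients and Playwright.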

Browser automation · 8 min read

Playwright vs Puppeteer vs Selenium 2026: 3 Browsers, 1 Winner

March 19, 2026 · 8 min read

Yassine El Haddad

Software Developer & Automation Specialist

Quick Answer

Default to Playwright for new browser automation: one install drives Chromium, Firefox, and WebKit with built-in auto-wait, ~50–80 MB per context, and Crawlee/Apify integration. Pick Puppeteer 25 for Chrome-only Node services that benefit from raw CDP or the new WebDriver BiDi transport. Keep Selenium 4 when you need Java/C#, Selenium Grid, or BiDi network logging across legacy suites.

In 2026, three tools dominate browser automation: Playwright (Microsoft), Puppeteer 25 (Google, now with WebDriver BiDi), and Selenium 4 (cross-vendor, BiDi-capable). The right choice depends on your use case: new projects should default to Playwright; Puppeteer fits Chrome-focused, lightweight needs; Selenium remains for Java/C# teams and legacy systems. Run Playwright on Apify.

Node.js · 6 min read

Playwright Web Scraping Tutorial 2026: From Zero to Production

March 19, 2026 · 6 min read

Yassine El Haddad

Software Developer & Automation Specialist

Playwright is the dominant browser automation library for web scraping in 2026: faster than Selenium, more reliable than Puppeteer, and with native support for Chromium, Firefox, and WebKit. This tutorial takes you from install to a production-ready scraper in under an hour.

Freshness note: Examples updated for Playwright v1.58.x (current as of March 2026). Check the official changelog for the latest release notes.

AI · 6 min read

Stagehand: AI-Powered Browser Automation That Combines Code and Natural Language (2026)

March 19, 2026 · 6 min read

Yassine El Haddad

Software Developer & Automation Specialist

Stagehand is Browserbase's open-source framework that extends Playwright with LLM-powered natural language commands. You can call page.act("click the login button") or page.extract({ instruction: "...", schema: {...} }) instead of writing brittle selectors. Stagehand shines on complex, dynamic UIs; plain Playwright remains better for high-volume scraping where LLM latency and cost matter. Use Apify for scalable browser automation.

Guide · 9 min read

Complete Guide to Web Scraping with Python in 2026: Tools, Code, and Best Practices

March 19, 2026 · 9 min read

Yassine El Haddad

Software Developer & Automation Specialist

Python remains the dominant language for web scraping in 2026. Whether you need static HTML parsing, JavaScript-rendered pages, or production-grade crawlers, the Python ecosystem delivers: requests, BeautifulSoup, httpx, Playwright, Scrapy, and Crawlee for Python. This guide covers the full stack—libraries, comparison tables, code examples, and data storage—so you can choose and build with confidence. Try Apify for managed Python Actors or run Crawlee Python locally.

Beginner · 5 min read

What Is a Headless Browser? Complete Guide for Web Scraping (2026)

March 19, 2026 · 5 min read

Yassine El Haddad

Software Developer & Automation Specialist

A headless browser is a full web browser (Chromium, Firefox, or WebKit) that runs without a graphical interface. It executes JavaScript, renders HTML/CSS, handles cookies, and behaves much like a visible browser, but it is controlled programmatically and runs on servers without a display.

For web scraping, headless browsers are the solution for sites that don't work with simple HTTP requests.

Frequently Asked Questions

Playwright is a browser automation library by Microsoft that controls Chromium, Firefox, and WebKit. Use it for scraping when a site requires JavaScript to render its content: product listings loaded via React, prices fetched via AJAX, or login flows that set session cookies. If you can get the data from a plain HTTP request (no JavaScript needed), an HTTP client paired with an HTML parser such as Cheerio is faster and cheaper. Playwright is the right choice when the site only works in a real browser.

Playwright supports Chromium, Firefox, and WebKit; Puppeteer targets Chrome and Chromium first, supports Firefox through WebDriver BiDi, and does not drive WebKit. Both are actively maintained and work well for scraping, but Playwright is the current recommendation for new projects. Puppeteer has a larger Stack Overflow footprint, which can be helpful when debugging obscure issues. If you are starting fresh, choose Playwright: the broader browser support and built-in auto-waiting are worth it.

To run a Playwright scraper on a schedule, deploy it as an Apify Actor and use Apify's built-in scheduler, so there is no server to maintain and no cron job to manage. Apify runs the browser in the cloud, handles proxy rotation, stores results in a dataset, and can notify you via webhook or email when the run completes. For teams that want everything managed, this reaches production faster than setting up your own infrastructure.

Playwright has excellent documentation and the core API is intuitive; most developers can write a working scraper within a few hours. The main learning curve is debugging async timing issues (waiting for elements to appear after navigation) and handling anti-bot detection on protected sites. This blog has step-by-step tutorials for common Playwright scraping patterns if you are getting started.
