
Bright Data Scraping Browser: AI-Powered Headless Browsing (2026)

· 7 min read
Yassine El Haddad
Software Developer & Automation Specialist

I build production AI agents, web scrapers, and automation pipelines. Most of what I publish here comes from the actual problems they run into: proxies that get banned, anti-bot stacks that fingerprint your client, RAG that drifts when the underlying data moves. Stack: Python, TypeScript, Go, FastAPI, LangChain, Crawlee, Playwright, deployed on AWS, GCP, and Cloudflare.

Bright Data Scraping Browser is a cloud-hosted headless Chrome that you connect to over the Chrome DevTools Protocol (CDP). It is compatible with Playwright and Puppeteer and bundles browser fingerprint randomization, CAPTCHA solving, and IP rotation, so no separate proxy configuration is required. For JavaScript-heavy sites protected by Cloudflare, or by anti-bot systems on the level of LinkedIn and Amazon, it is often the most direct path to reliable extraction.

What Bright Data Scraping Browser is

Scraping Browser is a managed browser service. You connect your Playwright or Puppeteer script to a Bright Data CDP endpoint. Bright Data handles:

  • Browser fingerprint randomization — Canvas, WebGL, fonts, screen resolution
  • CAPTCHA solving — Automated resolution of reCAPTCHA, hCaptcha, and similar
  • IP rotation — Residential or datacenter proxies built into the session
  • Request unblocking — Retries, cookie handling, header normalization

You write standard Playwright or Puppeteer code. No proxy setup, no CAPTCHA libraries — just connect and scrape.

How it works

  1. Authenticate — Bright Data gives you a CDP URL: wss://brd-customer-CUSTOMER_ID-zone-scraping_browser:PASSWORD@brd.superproxy.io:9222
  2. Connect — call playwright.chromium.connectOverCDP() or puppeteer.connect() with that URL
  3. Scrape — Use normal page.goto(), page.locator(), etc. Bright Data manages the rest.

The browser runs in Bright Data's cloud. Traffic exits through their proxy network. Each new page or session can use a different IP. CAPTCHAs are solved automatically (when supported).
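As a minimal sketch of step 1, the CDP URL can be assembled from its parts rather than pasted as one string. The function name, parameter names, and the percent-encoding of the password are assumptions for illustration; the host, port, and username pattern are taken from the URL format quoted above. Check the dashboard for the exact URL your account expects.

```javascript
// Sketch: build the Scraping Browser CDP URL from its components.
// buildCdpUrl and its parameter names are hypothetical, not a Bright Data API.
function buildCdpUrl({
  customerId,
  password,
  zone = 'scraping_browser',
  host = 'brd.superproxy.io',
  port = 9222,
}) {
  if (!customerId || !password) {
    throw new Error('customerId and password are required');
  }
  const username = `brd-customer-${customerId}-zone-${zone}`;
  // Assumption: percent-encode the password so characters such as "@" or ":"
  // do not break URL parsing. Verify that your client accepts the encoded form.
  return `wss://${username}:${encodeURIComponent(password)}@${host}:${port}`;
}
```

Keeping the customer ID and password in separate environment variables makes it easier to rotate the password without touching the rest of the URL.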

Setup: Playwright + Bright Data

const { chromium } = require('playwright');

// Full CDP URL (wss://...), read from the environment so credentials stay out of source control.
const CDP_URL = process.env.BRIGHT_DATA_SCRAPING_BROWSER_URL;

async function scrapeWithBrightData(url) {
  // Attach to the remote browser instead of launching a local one.
  const browser = await chromium.connectOverCDP(CDP_URL);

  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle' });
    const title = await page.title();
    const content = await page.locator('main').first().innerText();
    return { title, content };
  } finally {
    // Always release the remote session, even if navigation or extraction fails.
    await browser.close();
  }
}

scrapeWithBrightData('https://example.com').then(console.log).catch(console.error);

Set the BRIGHT_DATA_SCRAPING_BROWSER_URL environment variable to your Bright Data CDP URL. The exact URL for your account is shown in the Bright Data dashboard under Scraping Browser → Integration.

Code example: JavaScript-heavy site

For sites that load content via AJAX or single-page apps, add waitForSelector or waitForLoadState:

const { chromium } = require('playwright');

async function scrapeSPA(url) {
  const browser = await chromium.connectOverCDP(process.env.BRIGHT_DATA_SCRAPING_BROWSER_URL);

  try {
    const page = await browser.newPage();
    // Return once the initial HTML is parsed, then wait for the rendered listing.
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    await page.waitForSelector('.product-listing', { timeout: 15_000 });

    // Runs in the page context: map each product card to plain data.
    return await page.locator('.product-card').evaluateAll((nodes) =>
      nodes.map((el) => ({
        title: el.querySelector('.title')?.textContent?.trim(),
        price: el.querySelector('.price')?.textContent?.trim(),
      }))
    );
  } finally {
    // Close the remote session even if the selector never appears.
    await browser.close();
  }
}

Scraping Browser handles Cloudflare and similar challenges before your script runs. If the page loads, extraction proceeds as usual.

When to use Scraping Browser

Scenario | Use Scraping Browser?
LinkedIn, Amazon, Cloudflare-protected | ✓ Yes — built-in unblocking
Heavy JavaScript, SPAs | ✓ Yes — full browser rendering
Simple HTML sites | ✗ No — overkill, use HTTP + proxy
Raw speed critical | ✗ No — slower than direct requests
CAPTCHA-heavy flows | ✓ Yes — automated solving
High-volume, low-complexity | ✗ No — datacenter proxy + HTTP cheaper

Use Scraping Browser when anti-bot systems, JavaScript rendering, or CAPTCHAs are blocking you. For static HTML or APIs, plain HTTP requests through datacenter proxies (for example, via Apify's proxy configuration) are more cost-effective.

Pricing

Scraping Browser is priced per GB consumed (data transferred through the browser):

Plan | Price per GB | Typical use / monthly price
Pay-as-you-go | ~$8.40/GB | Ad-hoc, low volume
Starter | ~$7/GB (71 GB included) | ~$499/mo
Professional | ~$6/GB (166 GB included) | ~$999/mo
Business | ~$5/GB (399 GB included) | ~$1,999/mo

For comparison, datacenter proxies typically cost $1–2/GB. Scraping Browser adds browser rendering, fingerprinting, and CAPTCHA solving, so reserve it for sites that require it. See the Bright Data pricing page for current rates.
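To make the arithmetic concrete, here is a rough sketch that estimates monthly cost for a given traffic volume using the plan figures listed in this section. It assumes that traffic beyond the included GB is billed at the plan's per-GB rate, which the figures do not confirm; treat the output as a ballpark and check current pricing before budgeting.

```javascript
// Plan figures copied from this section (approximate).
// Assumption: overage beyond the included GB is billed at the plan's per-GB rate.
const PLANS = [
  { name: 'Pay-as-you-go', monthlyFee: 0, includedGb: 0, perGb: 8.4 },
  { name: 'Starter', monthlyFee: 499, includedGb: 71, perGb: 7 },
  { name: 'Professional', monthlyFee: 999, includedGb: 166, perGb: 6 },
  { name: 'Business', monthlyFee: 1999, includedGb: 399, perGb: 5 },
];

// Monthly fee plus any overage for the given usage.
function estimateMonthlyCost(plan, gbUsed) {
  const overageGb = Math.max(0, gbUsed - plan.includedGb);
  return plan.monthlyFee + overageGb * plan.perGb;
}

// Return the plan with the lowest estimated cost for this usage.
function cheapestPlan(gbUsed) {
  return PLANS
    .map((plan) => ({ name: plan.name, cost: estimateMonthlyCost(plan, gbUsed) }))
    .reduce((best, candidate) => (candidate.cost < best.cost ? candidate : best));
}
```

Under these assumptions, 10 GB per month is cheapest on pay-as-you-go (about $84), while 100 GB per month favors Starter ($499 plus 29 GB of overage at $7, or $702).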

Comparison: Scraping Browser vs alternatives

Attribute | Bright Data Scraping Browser | Playwright Cloud (Apify) | Browserless
Unblocking | Built-in, strongest | Via proxy add-on | Limited
CAPTCHA solving | Included | External/optional | None
Playwright/Puppeteer | Native CDP | Native | Native
Proxy model | Included, residential/datacenter | Bring your own | Optional
Pricing | Per GB (~$5–8/GB) | Compute units | Per-session/subscription
Best for | Anti-bot sites, enterprise | Apify ecosystem, Actors | Self-host, simple setups

Winner for hard targets: Bright Data — maximum unblocking out of the box. Winner for Apify users: Playwright Cloud integrates with Actors. See Bright Data vs Apify for a full comparison.

When NOT to use

  • Simple HTML sites — HTTP client + proxy is faster and cheaper
  • APIs — No browser needed; use fetch or axios with proxies
  • Highest throughput — Browser rendering adds latency; direct requests scale better
  • Budget-sensitive, high volume — Consider Apify + datacenter proxies first

Common pitfalls and fixes

  • Connection refused — Ensure the CDP URL is correct and your account has Scraping Browser enabled. Check for typos in the zone name or password.
  • Slow page loads — Scraping Browser adds latency for fingerprinting and proxy routing. For time-sensitive flows, consider caching or parallelizing requests.
  • Inconsistent extraction — Add explicit waits (waitForSelector, waitForLoadState) before reading the DOM. SPAs often need networkidle or a content-specific selector.
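For the parallelizing suggestion in the list above, one possible approach is a small concurrency-limited runner, so several pages load at once without opening an unbounded number of remote sessions. The helper name and the default-free signature are illustrative; the right limit depends on your plan and target site.

```javascript
// Sketch: run an async worker over items with at most `limit` calls in flight.
// mapWithConcurrency is a hypothetical helper, not part of Playwright or Bright Data.
async function mapWithConcurrency(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error('limit must be a positive integer');
  }
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until the list is exhausted.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results; // Same order as the input items.
}
```

With the scrapeWithBrightData function from the setup section, usage could look like `await mapWithConcurrency(urls, 3, (url) => scrapeWithBrightData(url))`. A rejected worker rejects the whole call, so wrap the worker in try/catch if partial results are acceptable.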

Integrate with Apify

You can use Bright Data Scraping Browser from Apify Actors. Configure the CDP URL as an environment variable and connect in your Actor's launchContext or equivalent. Some Apify Actors support custom browser endpoints — check the Actor docs. For full proxy control within Apify, see the proxy configuration guide.

Alternatively, use Bright Data's proxy network directly in Apify: configure a custom proxy URL in your Actor input to route traffic through Bright Data. This gives you Bright Data's IPs without the full Scraping Browser stack — useful when you only need proxy rotation, not browser-level unblocking.

Match the tool to the target

Use Scraping Browser only when HTTP + proxy fails. For most sites, Apify Actors with residential proxies are sufficient. Save Scraping Browser for LinkedIn, Cloudflare, and similar hard targets.
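The "HTTP first, browser only on failure" rule can be sketched as a fallback wrapper. The status codes and text markers below are rough heuristics chosen for illustration, not an exhaustive or official block signature, and fetchHtml and scrapeInBrowser are hypothetical caller-supplied functions (for example, an HTTP client with a proxy and a Scraping Browser scraper).

```javascript
// Heuristic only: statuses and markers often seen on block or challenge pages.
const BLOCK_STATUSES = new Set([403, 429, 503]);
const CHALLENGE_MARKERS = ['just a moment', 'captcha', 'access denied'];

function looksBlocked(status, body) {
  if (BLOCK_STATUSES.has(status)) return true;
  const text = String(body || '').toLowerCase();
  return CHALLENGE_MARKERS.some((marker) => text.includes(marker));
}

// Try the cheap HTTP path first; fall back to the browser only when it fails.
// fetchHtml(url) should resolve to { status, body }; scrapeInBrowser(url) to the page content.
async function scrapeCheapestFirst(url, { fetchHtml, scrapeInBrowser }) {
  try {
    const { status, body } = await fetchHtml(url);
    if (!looksBlocked(status, body)) {
      return { via: 'http', body };
    }
  } catch (err) {
    // Network or proxy errors also trigger the browser fallback.
  }
  return { via: 'browser', body: await scrapeInBrowser(url) };
}
```

The marker list will produce false positives on pages that legitimately mention CAPTCHAs, so tune it per target before relying on it.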




Frequently Asked Questions

What is Bright Data Scraping Browser?

A cloud-hosted headless Chrome that you connect to over CDP. It is Playwright- and Puppeteer-compatible, with built-in fingerprinting, CAPTCHA solving, and IP rotation. No proxy configuration is needed.

How do I connect Playwright to Scraping Browser?

Use playwright.chromium.connectOverCDP() with the CDP URL from Bright Data. The URL format is wss://brd-customer-*-zone-scraping_browser:PASSWORD@brd.superproxy.io:9222

When should I use Scraping Browser instead of HTTP + proxy?

Use Scraping Browser for anti-bot sites (LinkedIn, Cloudflare), JS-heavy SPAs, and CAPTCHA-heavy flows. Use HTTP + proxy for simple HTML or APIs.

How much does Scraping Browser cost?

Pricing is per GB consumed, typically $5–8.40/GB depending on plan. That is more expensive than datacenter proxies ($1–2/GB), but it includes browser rendering and unblocking.

Can I use Scraping Browser with Apify?

Yes. Pass the Bright Data CDP URL as an environment variable in your Apify Actor. Some Actors support custom browser endpoints. You can also use Bright Data proxies with Apify's proxy configuration.

Does Scraping Browser solve CAPTCHAs?

Yes. Bright Data includes automated CAPTCHA solving for reCAPTCHA and similar challenges. No separate CAPTCHA service is needed.

Common mistakes and fixes

Connection timeout to Bright Data CDP endpoint

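Verify the credentials in the CDP URL, which should follow the brd-customer-CUSTOMER_ID-zone-scraping_browser:PASSWORD format. Check that your network allows outbound WebSocket connections to brd.superproxy.io on port 9222. Ensure your account has Scraping Browser enabled.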
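One way to catch a malformed URL before attempting a connection is a quick structural check. This is a sketch based only on the URL format quoted earlier in this article (wss scheme, brd-customer-…-zone-… username, password, explicit port); the function name and messages are illustrative, and a URL that passes can still be rejected by Bright Data.

```javascript
// Sketch: sanity-check a Scraping Browser CDP URL before connecting.
// validateCdpUrl is a hypothetical helper; it checks shape only, not credentials.
function validateCdpUrl(raw) {
  let parsed;
  try {
    parsed = new URL(raw);
  } catch (err) {
    return ['not a parseable URL'];
  }

  const problems = [];
  if (parsed.protocol !== 'wss:') {
    problems.push('protocol should be wss://');
  }
  if (!/^brd-customer-.+-zone-.+$/.test(parsed.username)) {
    problems.push('username should look like brd-customer-<id>-zone-<zone>');
  }
  if (!parsed.password) {
    problems.push('password is missing');
  }
  if (!parsed.port) {
    problems.push('explicit port is missing');
  }
  return problems; // An empty array means the URL has the expected shape.
}
```

Calling `validateCdpUrl(process.env.BRIGHT_DATA_SCRAPING_BROWSER_URL)` at startup and logging any returned problems gives a clearer failure than a generic connection timeout.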

Pages load but data extraction fails

Add waitForSelector or waitForLoadState before extraction. JavaScript-heavy sites may need longer timeouts or a content-specific selector.