
Playwright web scraping: How to make your scripts faster

29 December 2025 | 28 min read

Playwright web scraping can be fast, reliable, and surprisingly simple if you know where the time actually goes. This guide breaks down the practical techniques that make Playwright scripts run quicker without turning them into fragile hacks.

We'll cover setup choices, browser modes, navigation timing, resource blocking, parallel execution, and basic anti-bot strategies. Everything is focused on real performance wins, not theory. If you already use Playwright and want it to feel snappier in production, this article walks you through exactly how to do that.


Quick answer (TL;DR)

If Playwright feels slow, you're probably waiting too long and downloading too much. The fastest wins are: wait for domcontentloaded, block heavy assets, avoid blind sleeps, and wait for specific selectors.

// Node 20+ (ESM)
//
// This is intentionally simple + production-ish:
// - fast navigation (`domcontentloaded`)
// - block heavy assets (images/fonts/media + common analytics)
// - no blind sleeps (wait for specific selectors)
// - fewer Playwright round trips (bulk DOM read via evaluate)
// - basic stealth-ish tuning (UA/viewport/headers) without doing cringe evasion

import { chromium } from 'playwright';

const URL = process.env.URL ?? 'https://example.com';

const browser = await chromium.launch({
  headless: true,
  // SlowMo is useful for debugging, but it *literally* slows things down.
  // slowMo: 50,
});

const context = await browser.newContext({
  // Light fingerprint tuning (not magic, just avoids default-bot vibes)
  userAgent:
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  viewport: { width: 1366, height: 768 },
  locale: 'en-US',
  timezoneId: 'UTC',

  // Optional: reduce "where did this request come from?" suspicion for some sites
  extraHTTPHeaders: {
    'Accept-Language': 'en-US,en;q=0.9',
  },
});

const page = await context.newPage();

// (Optional) If you do API-heavy sites, keeping a single context and reusing pages
// across many URLs is often faster than creating a new context per URL.

// --- 1) Resource blocking / request hygiene ----------------------------------
// This is one of the biggest speed wins in scraping.
// Tip: keep JS enabled by default; blocking JS often breaks pages.
// Start with images/fonts/media, then add patterns you know you don't need.
await page.route('**/*', (route) => {
  const req = route.request();
  const type = req.resourceType();

  // 1) Block heavy asset types
  if (type === 'image' || type === 'font' || type === 'media') {
    return route.abort();
  }

  // 2) Block obvious trackers by hostname (exact or subdomain)
  let host = '';
  try {
    host = new URL(req.url()).hostname;
  } catch {
    // If URL parsing fails, just let it through
    return route.continue();
  }

  // Keep this list short + boring. Add only what you're sure you don't need.
  const blockedHosts = new Set([
    'www.google-analytics.com',
    'ssl.google-analytics.com',
    'analytics.google.com',
    'www.googletagmanager.com',
    'stats.g.doubleclick.net',
    'doubleclick.net',
    'connect.facebook.net',
    'static.hotjar.com',
    'script.hotjar.com',
    'cdn.segment.com',
  ]);

  const isBlockedHost =
    blockedHosts.has(host) ||
    // Optional: block subdomains for these bases (still pretty safe)
    host.endsWith('.doubleclick.net');

  if (isBlockedHost) {
    return route.abort();
  }

  return route.continue();
});

// --- 2) Disable animations ---------------------------------------------------
// Avoid flaky waits + reduce some page work.
// Note: addStyleTag only affects the document that's currently loaded, and the
// injected style is lost on the next navigation. We'll use addInitScript instead,
// so the style is injected on every navigation *before* the page paints.
await page.addInitScript(() => {
  const style = document.createElement('style');
  style.textContent = `
    *, *::before, *::after {
      transition: none !important;
      animation: none !important;
      scroll-behavior: auto !important;
    }
  `;
  document.documentElement.appendChild(style);
});

// --- 3) Fast navigation + targeted waiting ----------------------------------
// 'domcontentloaded' is usually the sweet spot for scraping.
// 'load' waits for images, fonts, etc. (which we're blocking anyway).
await page.goto(URL, { waitUntil: 'domcontentloaded', timeout: 30_000 });

// Always wait for something that proves your target content exists.
// Avoid page.waitForTimeout() unless you're debugging a flaky site.
await page.waitForSelector('h1', { timeout: 10_000 });

// --- 4) Bulk DOM read in one evaluate() --------------------------------------
// Playwright calls like page.textContent() are round trips.
// Doing one DOM extraction inside evaluate is usually faster and cleaner.
const data = await page.evaluate(() => {
  const text = (sel) => document.querySelector(sel)?.textContent?.trim() ?? null;

  return {
    url: location.href,
    title: text('h1'),
    // Grab a few links as a demo
    links: Array.from(document.querySelectorAll('a'))
      .map((a) => a.getAttribute('href'))
      .filter(Boolean)
      .slice(0, 10),
  };
});

console.log(JSON.stringify(data, null, 2));

await context.close();
await browser.close();

Resource blocking alone can be a game changer. Check out: How to block resources in Playwright and Python?

Choosing the right setup for fast Playwright scraping

When people say Playwright web scraping is slow, it's usually not Playwright itself. It's the setup. A few early decisions can make your scripts feel clean and fast, or heavy and annoying.

One important choice is why you're using Playwright in the first place. If the site renders content with JavaScript, blocks simple HTTP clients, or relies on user-like behavior, Playwright makes sense. If it doesn't, Playwright is probably the wrong tool — and that's a good thing. You can also check: Why do we need Playwright?

  • Playwright is a full browser automation tool. That's why it works so well on modern sites. It runs real browsers and behaves like a real user. That power comes with overhead, so your goal is to avoid unnecessary work from the start.
  • Another key factor is language choice. Node.js tends to feel faster because Playwright is implemented and optimized for JavaScript first, with fewer cross-language hops between your code and the browser. Node usually gets features first, startup time is usually lower, and examples are everywhere. Python is still totally valid, but it often has a bit more overhead and slightly slower execution.
    • Since we'll be using Node 20+, that's a solid base. Modern Node gives you better async handling, faster startup, and fewer weird edge cases. That alone already helps performance.
  • Browser choice also matters. Chromium is usually the fastest and most predictable for scraping. Firefox and WebKit are great for testing, but if speed is the priority, Chromium is the default for a reason.
  • Finally, keep your environment lean. Don't install extra browsers you don't need. Don't run headful mode unless debugging. And don't load Playwright into a giant app if a small script will do.

All of this sounds basic, but these choices compound fast.

Installing Playwright with Python and Node.js

Installing Playwright is straightforward in both ecosystems, but there are a few practical differences worth knowing.

Node.js

Create a new project and install Playwright:

npm init -y
npm install playwright

Install the browser binaries (Chromium, Firefox, WebKit):

npx playwright install

If you only care about speed and want Chromium only:

npx playwright install chromium

That's it. You can now run Playwright scripts immediately.

Python

Install Playwright with pip:

pip install playwright

Learn about web scraping with Python and Playwright.

Then install the browser binaries:

playwright install

If you only want Chromium:

playwright install chromium

Python users should pay attention to async usage. Mixing sync code into async Playwright flows can silently kill performance.

If you're new to Playwright or deciding which language to use, this page covers common questions and tradeoffs in plain terms:
Common questions about web scraping with Playwright

Headless vs headful mode: speed tradeoffs

So, let's get this straight:

  • Headless mode is faster because the browser skips everything related to rendering pixels on your screen. There's no visible window, no real GPU compositing, and far less work spent painting frames and animations. For Playwright web scraping, headless should be your default.
    • The speed difference isn't usually dramatic for a single page, but it adds up fast when you scrape many pages or run scripts in parallel. Less rendering work means lower CPU usage, lower memory pressure, and more stable throughput under load.
  • Headful mode still matters. It's extremely useful when you're figuring out selectors, debugging scrolling or lazy-loaded content, or understanding why a page isn't reaching the state you expect. Adding a small slowMo delay can also make timing issues obvious by forcing each action to be visible.

Heads up: "Headless" isn't one perfectly consistent mode. Behavior can differ between Playwright's bundled Chromium (default) and running system Chrome/Edge via channel. If you see weird CI-vs-local diffs (rendering, timing, bot checks), try the same script with a different channel and see if it reproduces.

The important part is discipline: headful is a debugging tool, not a production setting. Leaving it on by accident is a common reason Playwright scripts feel "mysteriously slow."

Headless example:

import { chromium } from 'playwright';

const browser = await chromium.launch({ headless: true });

Headful example for debugging:

import { chromium } from 'playwright';

const browser = await chromium.launch({ headless: false, slowMo: 50 });

Use headful to fix problems. Use headless to go fast.

Browser choice: Chromium vs Firefox vs WebKit

Not all browsers behave the same when scraping, and the choice can affect both speed and stability. In practice, Chromium is usually the fastest and most reliable option for Playwright web scraping.

Chromium starts up quicker, tends to handle JavaScript-heavy pages more smoothly, and has fewer edge cases with modern front-end frameworks. Because most sites are developed and tested primarily against Chromium-based browsers, you're also less likely to hit unexpected rendering or timing issues.

So, here's the rundown:

  • Chromium is the default for a reason. It offers the best balance of startup speed, execution speed, and compatibility. If performance is your priority, this should be your first choice.
  • Firefox can be slightly slower to launch and sometimes behaves differently with newer JavaScript or CSS features. It's useful when you need browser diversity or want to verify behavior across engines, but it's rarely the fastest option.
  • WebKit is excellent for testing Safari-specific behavior, but for scraping workloads it's often the slowest and the most sensitive to timing issues.

Unless you have a specific reason to test multiple engines, sticking to Chromium keeps your setup simpler, faster, and more predictable. For most scraping jobs, that predictability matters more than theoretical browser coverage.

Optimizing page load and navigation speed

Navigation is where Playwright web scraping can burn most of your time. By default, Playwright is conservative about when a page is considered "ready." It waits for more than just HTML: images, fonts, analytics scripts, ads, and other third-party resources all count toward a full load.

For scraping, that default is often unnecessary. You usually don't care about pixels finishing, ads loading, or background scripts doing their thing. You care about one thing: when the data you want exists in the DOM. So, your job is simple: wait for the earliest moment that data is available, extract it, and move on.

Once you optimize navigation timing, selectors start to matter more. Instead of waiting for "the page," you wait for proof that the content you need is there. That usually means solid, specific selectors tied to real page structure.

If you need a quick refresher on CSS selectors in Playwright, this is handy: How to find elements by CSS selectors in Playwright?

Using page.goto() with waitUntil

When you call page.goto(), Playwright has to decide when navigation is "done." The waitUntil option controls that decision, and it has a direct impact on speed.

  • domcontentloaded fires when the initial HTML is downloaded and parsed. At that point, the DOM exists and can usually be queried. For many scraping tasks, this is the earliest moment you actually need.
  • The load event, in turn, happens later and waits for additional resources like images, stylesheets, and fonts. Those are often irrelevant for scraping, which makes load slower than necessary.

In practice, domcontentloaded is the best default for Playwright web scraping. It returns control earlier while still being reliable on most sites.

Basic example:

import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

// Returns as soon as the DOM is ready, without waiting for images or fonts
await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });

// now you can start selecting and extracting
const title = await page.textContent('h1');

await browser.close();

Key points:

  • browser.newPage() opens a new tab in the default browser context.
  • waitUntil: 'domcontentloaded' usually returns much earlier than waitUntil: 'load'.
    • This works best when the data you need is rendered synchronously or very early.
    • For JavaScript-heavy sites that load data after initial render, you should pair domcontentloaded with a targeted waitForSelector().
  • page.textContent('h1') extracts content using a CSS selector.
  • browser.close() shuts down cleanly (don't skip this!)

Finally, avoid relying on waitUntil: 'networkidle' for scraping. It waits for network silence and can be slow or unreliable on pages that keep background requests alive.

Reducing wait times with page.waitForSelector()

Blind sleeps are one of the easiest ways to slow down Playwright web scraping. Fixed delays guess how long a page might take to load. Sometimes they're too short and cause flaky failures. Most of the time they're too long and quietly waste seconds on every page.

page.waitForSelector() waits for exactly what you need and nothing more. As soon as the element appears in the DOM, Playwright continues immediately. No guessing, no padding.

Basic example:

await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });

// Wait only for the element that proves the data exists
await page.waitForSelector('.product-card');

const titles = await page.$$eval('.product-card h2', els =>
  els.map(el => el.textContent)
);

A few important details:

  • waitForSelector() resolves when the element matches the selector and is visible (default: state: 'visible'). Use state: 'attached' to resolve as soon as the element exists in the DOM, visible or not (see the example below).
  • It automatically times out if the selector never appears, which makes failures explicit.
  • The more specific your selector, the faster and more reliable the wait.
  • This works especially well after waitUntil: 'domcontentloaded', where the DOM exists but data may still be loading.

Avoid generic selectors like body or .container. They often appear too early and don't actually guarantee your target data is ready.
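
If you only need the element to exist in the DOM rather than be visible (for example, content that's rendered but hidden behind a tab), state: 'attached' resolves even earlier. A small example, reusing the .product-card selector from above:

// Resolve as soon as the element is in the DOM, visible or not
await page.waitForSelector('.product-card', { state: 'attached', timeout: 10_000 });

// Locator-based equivalent
await page.locator('.product-card').first().waitFor({ state: 'attached', timeout: 10_000 });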

Disabling animations and transitions via page.addStyleTag()

Animations and transitions look nice for humans, but they slow down automation. They delay clicks, scrolling, and element state changes, and they're a common source of flaky interactions in Playwright web scraping. Disabling animations makes page interactions more immediate and predictable, which helps both speed and stability.

You can inject a small CSS snippet to turn animations off:

await page.addStyleTag({
  content: `
    *,
    *::before,
    *::after {
      transition: none !important;
      animation: none !important;
    }
  `
});

This works well if you only navigate once. However, addStyleTag() only affects the document that's currently loaded, and the injected style is lost on navigation. If your script visits multiple pages, you'll need to inject the CSS again after each navigation.

For scraping scripts that visit many pages, a better option is injecting the CSS before any page loads using addInitScript():

await page.addInitScript(() => {
  const style = document.createElement('style');
  style.textContent = `
    *,
    *::before,
    *::after {
      transition: none !important;
      animation: none !important;
    }
  `;
  document.documentElement.appendChild(style);
});

With animations disabled from the start, interactions become faster and more consistent. Fewer artificial delays, fewer timing issues, and scripts that behave the same run after run.

Using timeouts to fail fast and stay fast

Timeouts don't make Playwright faster directly, but they prevent your scrapers from getting slow in the worst possible way: hanging forever. When a page is blocked, broken, or behaving unexpectedly, fast failure keeps your overall runtime under control.

By default, Playwright's timeouts are fairly generous. For scraping, tighter limits usually make more sense. You want scripts to give up quickly when something goes wrong and move on to the next task. A common pattern is setting global defaults:

page.setDefaultTimeout(10_000);
page.setDefaultNavigationTimeout(30_000);

This means selector waits fail after 10 seconds, and full navigations fail after 30 seconds unless overridden.
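
To actually benefit from fast failures, catch the timeout and move on instead of letting one bad URL stall the whole run. A rough sketch, assuming a urls array and a selector that proves the data exists (both placeholders):

import { errors } from 'playwright';

for (const url of urls) {
  try {
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    await page.waitForSelector('.product-card');
    // extract data here
  } catch (err) {
    if (err instanceof errors.TimeoutError) {
      console.warn(`Skipping ${url}: timed out`);
      continue; // move on instead of hanging on a broken page
    }
    throw err; // anything else is a real bug worth surfacing
  }
}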

A few practical rules:

  • Shorter timeouts surface problems earlier instead of masking them.
  • Fast failures matter even more when running scrapers in parallel.
  • Combine timeouts with precise selectors for predictable behavior.
  • If a site is consistently slow, fix the cause instead of increasing timeouts.

Reducing bandwidth with resource blocking

If you want the biggest speed boost in Playwright web scraping, this is usually it. Most pages load a lot of stuff you don't need: images, fonts, videos, ads, analytics, and third-party trackers. All of that costs bandwidth, CPU, and time.

Resource blocking cuts that noise. Pages become lighter, navigation finishes sooner, and your script spends more time extracting data instead of waiting for junk to load. This also scales extremely well. The more pages you scrape, the bigger the payoff. Blocking unnecessary resources on one page might save a few hundred milliseconds. Blocking them across thousands of pages saves minutes.

If you want a deeper walkthrough and Python examples too, this one is a good reference: How to block resources in Playwright and Python?

Blocking images, fonts, and media using route.abort()

Playwright can intercept every network request a page makes. You register a route handler, inspect each request, and decide whether it should be allowed or blocked. This gives you very fine-grained control over what actually gets downloaded.

Here's a minimal example that blocks common heavy assets:

import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

// Intercept all requests
await page.route('**/*', (route) => {
  const type = route.request().resourceType();

  // Block heavy, non-essential assets
  if (type === 'image' || type === 'font' || type === 'media') {
    return route.abort();
  }

  // Allow everything else (HTML, JS, XHR, fetch, etc.)
  return route.continue();
});

await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });

// scrape your stuff here

await browser.close();

What's happening here:

  • page.route('**/*', ...) intercepts all outgoing requests.
  • route.request().resourceType() tells you what kind of asset it is.
  • route.abort() blocks it, so the browser doesn't download it.
  • route.continue() lets the request go through normally.

This is safe for many scraping tasks because the data you care about is usually delivered via HTML or XHR/fetch responses, not images or fonts. If the page looks broken, that's fine. You're scraping data, not judging design.

One important rule: don't block JavaScript by default. Blocking scripts often breaks pages entirely. Start with images, fonts, and media, then expand cautiously if needed.

Filtering third-party scripts and trackers

A lot of page weight comes from things you don't need at all: analytics, ads, tracking pixels, A/B testing scripts, and various third-party widgets. These requests slow down navigation, increase network chatter, and add noise, but they almost never matter for scraping.

Unlike images or fonts, these resources are usually JavaScript. Blocking them by URL pattern instead of resource type lets you keep first-party scripts working while cutting out the junk. That balance is important. You want fewer requests, not a broken page.

Example:

await page.route('**/*', (route) => {
  const req = route.request();
  const type = req.resourceType();

  // Only bother checking stuff trackers commonly use
  if (!['script', 'xhr', 'fetch', 'ping', 'beacon'].includes(type)) {
    return route.continue();
  }

  let url, pageHost;
  try {
    url = new URL(req.url());
    pageHost = new URL(page.url()).hostname;
  } catch {
    return route.continue();
  }

  // Rough "same site" check
  const base = (h) => h.split('.').slice(-2).join('.');
  const isThirdParty = base(url.hostname) !== base(pageHost);
  if (!isThirdParty) return route.continue();

  // Known tracker hosts / domains (keep short + boring)
  const blocked = [
    'google-analytics.com',
    'googletagmanager.com',
    'doubleclick.net',
    'facebook.net',
    'hotjar.com',
    'segment.com',
  ];

  if (blocked.some((d) => url.hostname === d || url.hostname.endsWith(`.${d}`))) {
    return route.abort();
  }

  return route.continue();
});

A few practical rules:

  • Start conservative. Block only obvious third-party trackers first.
  • Prefer blocking by domain or substring, not broad patterns.
  • Avoid blocking site-owned scripts unless you're sure they're irrelevant.
  • If a page stops rendering data, unblock the last thing you added and retry.

Used carefully, URL-based blocking can significantly reduce request count and speed up navigation without breaking functionality.

DevTools-based resource monitoring for optimization

Before blocking resources aggressively, it helps to understand what a page actually loads. Guessing works sometimes, but a quick look in browser DevTools gives you real data and saves a lot of trial and error.

Open the site in a normal browser, open DevTools, and switch to the Network tab. Reload the page and let it fully settle. You'll usually see dozens or hundreds of requests, many of which have nothing to do with the data you want.

A few things to look for:

  • Sort by size to spot heavy images, fonts, and media files.
  • Sort by time or waterfall to find requests that delay page readiness.
  • Check domains to separate first-party resources from third-party trackers.
  • Look at request types (JS, XHR, fetch, image) to understand what drives rendering vs data loading.

Patterns show up quickly. Analytics and ad platforms tend to repeat across pages. Image CDNs and font providers are easy wins. Long-running scripts that don't affect your target content are good blocking candidates. Once you've identified what's heavy and unnecessary, you can confidently block those resources in Playwright using page.route(). This makes your optimization deliberate instead of experimental.

A good rule of thumb: observe first, block second. A few minutes in DevTools can save hours of debugging later and keeps your scraping scripts fast without breaking pages.
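
You can also do a rough version of this audit from Playwright itself, which helps when the page behaves differently under automation. The sketch below simply tallies finished requests by resource type and hostname so you can see what's worth blocking; nothing here is specific to any particular site:

const stats = new Map();

page.on('requestfinished', (request) => {
  const type = request.resourceType();
  const host = new URL(request.url()).hostname;
  const key = `${type} ${host}`;
  stats.set(key, (stats.get(key) ?? 0) + 1);
});

// Use 'load' here on purpose: we want the full, unoptimized picture
await page.goto('https://example.com', { waitUntil: 'load' });

// Print the noisiest type/host combinations first
const sorted = [...stats.entries()].sort((a, b) => b[1] - a[1]);
console.log(sorted.slice(0, 20));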

Advanced techniques to speed up Playwright scrapers

Once the basics are dialed in, there are a few techniques that can make Playwright web scraping even faster. These go beyond simple waits and blocking assets. Instead of treating pages as black boxes, you start reacting to what actually changes and skipping work you don't need.

At this level, the goal is efficiency. Fewer DOM interactions, fewer page transitions, and less waiting for things that don't affect your data. Used correctly, these techniques can dramatically reduce scrape time on complex or JavaScript-heavy sites.

Many of these optimizations pair well with network-level scraping, where you extract data directly from XHR or fetch responses instead of reading the rendered page. If you're interested in that approach, this is a good read: How to capture background requests and responses in Playwright?

Using scrollIntoViewIfNeeded() for infinite scroll

Infinite scroll pages are a common speed trap. The usual approach is scrolling the entire page in a loop and sprinkling in fixed delays, which wastes time and often misses content or scrolls too far.

A more efficient approach is scrolling only when necessary and only to elements you actually care about. Playwright's scrollIntoViewIfNeeded() does exactly that. It scrolls just enough to bring an element into view, and only if it isn't already visible.

Basic example:

const items = await page.$$('.item');

for (const item of items) {
  await item.scrollIntoViewIfNeeded();
  // extract data from item here
}

This works because scrollIntoViewIfNeeded() is conditional. If the element is already visible, nothing happens. No unnecessary movement and no artificial sleeps. This technique works best on pages where items are already in the DOM but not yet visible.

For pages that load new items dynamically as you scroll, you can combine it with selector-based waits. Example with dynamic loading:

// Wait for at least one item to exist
await page.waitForSelector('.item');

// Scroll to the last known item to trigger loading more
const items = await page.$$('.item');
const lastItem = items[items.length - 1];

await lastItem.scrollIntoViewIfNeeded();

After scrolling, wait for new items to appear, then repeat. You scroll only when there's something new to load, which keeps infinite scrolling fast and controlled.

One important limitation: if a site loads content only in response to global scroll events (not element visibility), this method may not trigger loading. In those cases, a controlled page.mouse.wheel() or page.evaluate(() => window.scrollBy(...)) loop may still be necessary.
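
Here's a rough sketch of that fallback, assuming items match an .item selector and new batches load as you scroll. The stop condition is a simple "no new items appeared" check rather than a blind sleep:

let previousCount = await page.locator('.item').count();

for (let i = 0; i < 20; i++) { // hard cap so the loop can't run forever
  // Scroll roughly one viewport to trigger the site's scroll handler
  await page.mouse.wheel(0, 800);

  try {
    // Wait briefly for the item count to grow instead of sleeping blindly
    await page.waitForFunction(
      (prev) => document.querySelectorAll('.item').length > prev,
      previousCount,
      { timeout: 2_000 },
    );
  } catch {
    break; // no new items appeared, assume we've reached the end
  }

  previousCount = await page.locator('.item').count();
}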

Evaluating JavaScript for smoother DOM interaction

Every Playwright call that crosses from Node into the browser has a cost. Calls like page.textContent(), page.getAttribute(), or repeated page.$() queries each involve a round trip between your script and the browser process. On small pages that's fine. On large pages or inside loops, that overhead adds up fast.

Running logic directly inside the page with page.evaluate() avoids that back-and-forth. You execute JavaScript in the browser context, collect the data you need, and send it back in one response.

Example:

const data = await page.evaluate(() => {
  return Array.from(document.querySelectorAll('.item')).map(el => ({
    title: el.querySelector('h2')?.textContent,
    price: el.querySelector('.price')?.textContent
  }));
});

Instead of dozens of Playwright calls, this makes a single trip into the page and returns plain JSON-friendly data. Fewer protocol hops mean smoother execution and better performance, especially when scraping lists or tables.

A few practical guidelines:

  • Use evaluate() when you need to extract many related elements.
  • Keep the logic simple and self-contained. You can't access Node variables or modules inside evaluate(); anything you need from your script must be passed in explicitly (see the sketch after this list).
  • Avoid putting complex business logic there; use it for data extraction, not orchestration.
  • Always return serializable data (strings, numbers, arrays, objects).
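
A small sketch of passing data in: the function runs inside the browser, so anything from your Node script has to go through the second argument of evaluate(). The selectors here are just placeholders:

const selectors = { item: '.item', title: 'h2', price: '.price' };

const rows = await page.evaluate((sel) => {
  return Array.from(document.querySelectorAll(sel.item)).map((el) => ({
    title: el.querySelector(sel.title)?.textContent?.trim() ?? null,
    price: el.querySelector(sel.price)?.textContent?.trim() ?? null,
  }));
}, selectors); // serialized and handed to the browser-side function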

Intercepting and modifying requests with route.fulfill()

Sometimes the fastest request is the one you don't make. With route.fulfill(), you can intercept a network request and respond with your own data instead of letting the browser hit the network. This is useful when a page repeatedly requests the same predictable resources: configuration files, feature flags, localization data, or static API responses that don't affect the content you're scraping.

Instead of waiting on the network every time, you short-circuit the request entirely. Example:

await page.route('**/config.json', (route) => {
  route.fulfill({
    status: 200,
    contentType: 'application/json',
    body: JSON.stringify({ featureEnabled: false })
  });
});

From the page's perspective, this looks like a perfectly normal response. No external request is made, no latency is added, and the page continues loading immediately.

A few important guidelines:

  • Use route.fulfill() only for predictable, non-critical data.
  • Avoid faking API responses that control authentication, pricing, or business logic unless you fully understand the impact.
  • Keep mocked responses minimal: just enough to satisfy the page.
  • This works best for config files, flags, experiments, and static metadata.

Parallel scraping with multiple browser contexts

If you want real speed, you need parallelism. Running pages one by one leaves most of your CPU and network idle. In Playwright, the safest and most efficient way to scale is using multiple browser contexts, not multiple browser instances.

Browser contexts are isolated like incognito windows. Each context has its own cookies, storage, and session state, but they all share the same underlying browser process. That makes them much lighter and faster than launching separate browsers.

Basic pattern:

import { chromium } from 'playwright';

const CONCURRENCY = 5;
const URL = 'https://example.com';

const browser = await chromium.launch();

try {
  const contexts = await Promise.all(
    Array.from({ length: CONCURRENCY }, () => browser.newContext())
  );

  // Make sure every context gets closed even if some worker throws
  await Promise.all(
    contexts.map(async (context) => {
      let page;
      try {
        page = await context.newPage();
        await page.goto(URL, { waitUntil: 'domcontentloaded', timeout: 30_000 });

        // scrape stuff here

      } finally {
        // Close page first, then context (both are safe to call even if partially created)
        try { await page?.close(); } catch {}
        try { await context.close(); } catch {}
      }
    })
  );
} finally {
  // Always close the browser, even if Promise.all rejects
  await browser.close();
}

Key points to note:

  • Contexts are cheap compared to full browser instances. Reusing a context across multiple pages is faster than recreating it, as long as cookies and session state aren't an issue.
  • Isolation between contexts reduces cross-request contamination from cookies and localStorage.
  • Sharing a single browser process keeps memory usage and startup costs low.
  • Running pages in parallel allows network and CPU work to overlap efficiently.

A few practical rules:

  • Start small. Concurrency of 3–5 contexts is a good baseline.
  • Increase gradually and watch CPU, memory, and error rates.
  • Too much parallelism can slow everything down or trigger blocks.
  • Combine this with resource blocking for best results.
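
To put those rules together, here's a minimal worker-pool sketch: a shared queue of URLs, a handful of contexts, and each worker pulling the next URL until the queue is empty. The URL list and the extraction line are placeholders:

import { chromium } from 'playwright';

const urls = ['https://example.com/1', 'https://example.com/2']; // your real list here
const CONCURRENCY = 4;

const browser = await chromium.launch();
const queue = [...urls];
const results = [];

try {
  await Promise.all(
    Array.from({ length: CONCURRENCY }, async () => {
      const context = await browser.newContext();
      const page = await context.newPage();
      try {
        let url;
        // shift() is safe here: Node runs this loop on a single thread
        while ((url = queue.shift()) !== undefined) {
          await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30_000 });
          results.push({ url, title: await page.textContent('h1') });
        }
      } finally {
        await context.close();
      }
    })
  );
} finally {
  await browser.close();
}

console.log(results);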

Proxy and anti-bot strategies for faster access

Speed isn't just about how fast your code runs. If a site blocks you, rate-limits you, or throws challenges, everything slows down or stops. Smart anti-bot handling often makes Playwright web scraping faster simply because you spend less time retrying requests and less time dealing with failures.

At this stage, performance and access are tightly linked. A scraper that looks reasonably normal and behaves predictably will often run faster than an aggressively optimized one that keeps getting interrupted. The goal here isn't to be invisible or "undetectable." It's to look normal enough that the site lets you work without friction. Fewer challenges, fewer blocks, fewer retries, and smoother overall performance.

Playwright itself is a neutral automation tool. It's owned and maintained by Microsoft, which is worth knowing for trust, stability, and long-term support: Who owns Playwright?

Using proxies (rotate per context/session)

Using a single IP is fine for small jobs or internal tools. At scale, it's one of the fastest ways to get throttled or blocked. Rotating residential proxies spread requests across many IPs, which keeps access stable and avoids slowdowns caused by rate limits and challenges.

In practice, proxies are usually attached at the browser launch or browser context level. Each browser or context then runs through a different IP, giving you natural-looking traffic distribution. Rotation usually means choosing a proxy per context/batch, not per request.

Basic example at browser launch:

import { chromium } from 'playwright';

const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy-host:port',
    username: 'user',
    password: 'pass'
  }
});

This works well when each browser instance represents a single identity. For higher throughput, many setups rotate proxies by creating multiple contexts, each configured with a different proxy.
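
If your provider gives you a pool of endpoints, a sketch of the per-context version looks like this. The hosts and credentials are placeholders, and depending on your Playwright version and platform, Chromium may also need a proxy option at launch for per-context proxies to take effect:

const proxies = [
  { server: 'http://proxy-1:8000', username: 'user', password: 'pass' },
  { server: 'http://proxy-2:8000', username: 'user', password: 'pass' },
];

const contexts = await Promise.all(
  proxies.map((proxy) => browser.newContext({ proxy }))
);

// Each context now routes its traffic through a different IP
// while keeping its own cookies and storage.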

  • Fewer blocks and captchas mean fewer retries.
  • Less throttling keeps navigation times consistent.
  • Long-running jobs don't slow down over time as IPs get flagged.

A few practical rules:

  • Prefer residential or mobile proxies for sites with aggressive bot protection.
  • Don't rotate too aggressively. One proxy per context or per batch is usually enough.
  • Bad or overloaded proxies can slow things down more than no proxy at all.
  • Always test proxy latency. A slow proxy can erase all your Playwright optimizations.

Randomizing user-agent and viewport

Using the same fingerprint over and over is another easy way to get flagged. You don't need heavy stealth tricks to improve this. Small, realistic variations already go a long way and cost almost nothing in performance.

The simplest and safest knobs to turn are the user agent and viewport size. These values are cheap to change and commonly vary between real users.

You can randomize them per browser context:

const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
  viewport: { width: 1366, height: 768 }
});

In practice, you rotate these values across runs or batches. For example, alternate between a few common desktop resolutions and a small set of modern Chrome user agents.
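
A simple sketch of that rotation, picking one combination per context from short, boring lists (the exact UA strings and sizes are just examples):

const userAgents = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];

const viewports = [
  { width: 1366, height: 768 },
  { width: 1440, height: 900 },
  { width: 1920, height: 1080 },
];

const pick = (arr) => arr[Math.floor(Math.random() * arr.length)];

const context = await browser.newContext({
  userAgent: pick(userAgents),
  viewport: pick(viewports),
});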

A few important guidelines:

  • Keep values realistic and boring. Common desktop sizes and real browser UAs work best.
  • Avoid extreme or inconsistent combinations (for example, mobile UA with a huge desktop viewport).
  • Don't randomize on every single page unless you have a reason. Per context or per session is enough.
  • Fingerprint variation helps access stability, which indirectly improves speed by reducing blocks and retries.

Avoiding detection with stealth plugins and delays

Stealth tools can help, but they're not magic, and they're rarely your biggest speed win. Think of them as friction reducers, not invisibility cloaks. The biggest gains still come from good waits, clean navigation, sane request patterns, and not doing obviously bot-like things.

If you want a practical stealth layer in Playwright, the most commonly used option is playwright-extra paired with the stealth plugin from the puppeteer-extra family (it works with Playwright too). It applies a set of small patches that smooth over obvious automation signals without requiring heavy customization.

Typical setup:

import { chromium } from 'playwright-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';

chromium.use(StealthPlugin());

const browser = await chromium.launch({ headless: true });

What this usually helps with:

  • Reducing obvious automation flags (navigator.webdriver, etc.)
  • Smoothing browser fingerprints just enough to avoid basic checks
  • Improving stability on sites with light bot detection

All in all, these plugins are optional. For many sites, solid waits, resource blocking, and sane request rates are enough without any stealth plugin.

What it doesn't do:

  • It won't bypass serious bot protection on its own
  • It won't fix bad scraping logic
  • It won't compensate for aggressive request rates or broken waits

In addition to stealth patches, small, natural delays between actions can help without killing performance. The key is moderation.

await page.click('.next-button');
await page.waitForTimeout(200 + Math.random() * 200);

A few practical rules:

  • Avoid perfectly timed loops and instant action chains.
  • Don't add delays everywhere: only between major actions.
  • Randomize slightly, not wildly.
  • Prefer fewer retries over slower retries.

Slight imperfection often makes scripts more stable and effectively faster in real-world runs, because you spend less time dealing with blocks, retries, and broken sessions. Stealth is a seasoning, not the main ingredient.

Start scraping faster with Playwright

Playwright web scraping can be very fast if you tune it right. Smart waits, resource blocking, clean navigation, and controlled parallelism already get you most of the way there. For many projects, that's more than enough.

There are cases where browser automation still starts to feel heavy: large-scale scraping, aggressive anti-bot systems, or setups where managing proxies, fingerprints, and retries becomes a job of its own. At that point, speed isn't just about code anymore; it's about infrastructure.

If you hit that wall, ScrapingBee can be a solid alternative or complement to Playwright. It handles browsers, proxies, and blocking automatically, so you can focus on extracting data instead of fighting defenses and babysitting infrastructure.

Playwright remains a great tool for control and flexibility. Knowing when to optimize it (and when to lean on a web scraping API) is what keeps your scraping fast, reliable, and sane.

Conclusion

Speed in Playwright web scraping comes down to intent. Don't wait for what you don't need. Don't load what you won't use. And don't fight the page when you can work with it.

Most slow scripts aren't slow because Playwright is heavy. They're slow because they treat every page like a full browser session instead of a targeted data grab. Once you tighten navigation timing, block unnecessary resources, and parallelize safely, Playwright becomes a very efficient scraping tool.

At that point, performance stops being mysterious. You know exactly where time goes, and you control it.

If you want to go deeper, the guides linked throughout this article (resource blocking, background requests, CSS selectors) are worth your time.

Dial it in, keep it simple, and your scrapers will fly.

Frequently asked questions (FAQs)

Is Playwright good for web scraping?

Yes. Playwright web scraping works well on modern sites that rely heavily on JavaScript or block basic HTTP clients. It behaves like a real browser, which makes it reliable for dynamic content, login flows, and interactive pages where traditional scraping tools fail.

How do you make Playwright faster?

You make Playwright faster by waiting less and loading less. Use domcontentloaded, block heavy assets, avoid blind sleeps, evaluate JavaScript in bulk, and run tasks in parallel with browser contexts. Most speed issues come from defaults, not limitations.

Should I use Chromium, Firefox, or WebKit for scraping?

Chromium is usually the best choice for scraping. It starts faster, behaves more predictably on modern websites, and has fewer quirks. Firefox and WebKit are useful for testing, but they're rarely faster or more stable for scraping workloads.

How do I debug slow Playwright scripts?

Run scripts in headful mode, enable slow motion, and watch where time is spent. Check navigation waits, selector waits, and network requests in DevTools. Slow scripts usually wait for the wrong events or load resources they don't need.

Kevin Sahin

Kevin worked in the web scraping industry for 10 years before co-founding ScrapingBee. He is also the author of the Java Web Scraping Handbook.