Apify n8n Integration: Complete Guide (Actors, Datasets & Workflows)

February 10, 2026 · 4 min read

Software Developer & Automation Specialist

I build production AI agents, web scrapers, and automation pipelines. Most of what I publish here comes from the actual problems those systems run into: proxies that get banned, anti-bot stacks that fingerprint your client, RAG that drifts when the underlying data moves. Stack: Python, TypeScript, Go, FastAPI, LangChain, Crawlee, Playwright, deployed on AWS, GCP, and Cloudflare.

n8n is a workflow automation tool that pairs well with Apify because it handles long-running jobs, large JSON payloads, and branching logic without the tight synchronous limits of simpler connectors. This guide covers setup, the official Apify node, webhook patterns for slow Actors, example workflows, and operational tips.

Quick Answer

Apify integrates with n8n via the official Apify node. Trigger Actor runs, read datasets, and build automated data pipelines without writing code.

Why pair Apify with n8n?

Apify runs Actors (scrapers, crawlers, automations) in the cloud and stores results in datasets and key-value stores.
n8n orchestrates triggers (schedule, webhook, manual), transforms data, and pushes to CRMs, warehouses, Slack, OpenAI, and more.

Asynchronous wins: A scrape may take minutes. n8n can start a run and end the execution, then pick up in a new execution when Apify calls your webhook. That avoids holding a fragile HTTP connection open for the whole scrape.
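
To make the hand-off concrete, here is a minimal sketch of what the second execution does with the webhook body. It assumes Apify's default webhook payload template (with `eventType` and a `resource` object describing the run); the function name and the returned shape are my own, not part of any library.

```python
from typing import Any, Optional


def extract_run_info(payload: dict[str, Any]) -> Optional[dict[str, str]]:
    """Return run and dataset IDs from a webhook body, or None if the run did not succeed."""
    # Assumption: the webhook uses Apify's default payload template.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    run_id = resource.get("id")
    dataset_id = resource.get("defaultDatasetId")
    # A customized payload template may omit these fields; treat that as "nothing to do".
    if not run_id or not dataset_id:
        return None
    return {"runId": run_id, "datasetId": dataset_id}
```

In n8n the same check is typically an IF node on the webhook body followed by Get Dataset Items; the Python version is only there to show the logic.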

Setup

1. Apify account and token

Sign up at Apify.
Open Settings → Integrations → API tokens and create a token with the scopes you need.
In n8n, store it as a credential (HTTP Header Auth or the Apify node’s dedicated credential).

2. Install the official Apify community node

In n8n: Settings → Community Nodes → Install
Package: @apify/n8n-nodes-apify (maintained by Apify).

You get nodes such as Run Actor, Run Task, Get Dataset Items, and related operations with first-class Apify response shapes.

3. Quick test workflow

Manual trigger → Apify: Run Actor (pick a small Actor, e.g. a test scraper).
Either wait for finish (short runs only) or run async and continue via webhook (below).
Apify: Get Dataset Items → Spreadsheet or HTTP to your stack.

Key nodes and patterns

Pattern	When to use	Outline
Sync “wait for finish”	Runs under ~1–2 minutes	Single execution; simple mental model.
Async run + webhook	Production scrapes	Paste the n8n Webhook trigger URL into the Actor's webhooks for the ACTOR.RUN.SUCCEEDED event; a second workflow fetches the dataset.
Schedule → Actor	Daily/hourly data	Cron trigger → Run Actor → Get Dataset Items → destination.
HTTP + API	n8n where community nodes cannot be installed	`POST /v2/acts/:actorId/runs`, then `GET /v2/datasets/:id/items`, each with an `Authorization: Bearer <token>` header.
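
For the HTTP + API row, this is roughly what the two calls look like outside n8n. It is a sketch using only the Python standard library; the helper names are mine, and in n8n you would put the same URL, method, and header into HTTP Request nodes.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build POST /v2/acts/:actorId/runs with the Actor input as the JSON body."""
    # In URLs, "username/actor-name" is written as "username~actor-name".
    return urllib.request.Request(
        url=f"{API_BASE}/v2/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


def build_items_request(dataset_id: str, token: str) -> urllib.request.Request:
    """Build GET /v2/datasets/:id/items, asking for JSON output."""
    return urllib.request.Request(
        url=f"{API_BASE}/v2/datasets/{dataset_id}/items?format=json",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def send(request: urllib.request.Request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

The run response wraps the run object in `data`, so `send(build_run_request(...))["data"]["defaultDatasetId"]` is the ID to pass to `build_items_request` once the run has finished.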

Pagination: Large datasets should use limit/offset or the node’s iteration—never assume one response holds millions of rows.
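
A limit/offset loop can be sketched like this. `fetch_page` is a hypothetical callable you supply (for example, a wrapper around the dataset items endpoint); the loop stops only on an empty page, because filtered requests can return fewer items than the limit without being the last page.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], list], page_size: int = 1000) -> Iterator[dict]:
    """Yield every item by calling fetch_page(limit, offset) until a page comes back empty."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        if not page:
            break
        yield from page
        # Advance by the requested window, not by len(page), so filtered pages do not shift the offset.
        offset += page_size
```

The cost of this conservative stop condition is one extra request at the end of the dataset.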

Example workflows

A) Competitor pricing → Google Sheets

Cron (e.g. daily).
Run Actor with your e-commerce or universal scraper Actor ID and input (start URLs).
Get Dataset Items.
Google Sheets append or upsert on SKU/URL.
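
The "upsert on SKU/URL" step is the part that usually needs thought, so here is a small sketch of the logic with an in-memory dict standing in for the sheet. The field names `sku` and `url` are assumptions about your Actor's output, not something every scraper emits.

```python
def upsert_rows(existing: dict[str, dict], items: list[dict]) -> dict[str, int]:
    """Merge scraped items into `existing`, keyed by SKU (falling back to URL)."""
    counts = {"inserted": 0, "updated": 0, "skipped": 0}
    for item in items:
        key = item.get("sku") or item.get("url")
        if not key:
            # No stable key means we cannot upsert safely; skip instead of appending blindly.
            counts["skipped"] += 1
            continue
        if key in existing:
            existing[key].update(item)
            counts["updated"] += 1
        else:
            existing[key] = dict(item)
            counts["inserted"] += 1
    return counts
```

Running the same dataset through twice updates the same rows instead of adding new ones, which is the behavior you want from a daily job.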

B) Leads → CRM with filtering

Webhook (from your form or CRM) supplies a list of company domains.
Run Actor (e.g. website or directory Actor).
IF node: require email and job title match.
HubSpot / Salesforce create or update contact.
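
The IF node condition in this workflow amounts to a predicate like the one sketched here. The field names `email` and `jobTitle` and the keyword list are hypothetical; adjust them to whatever your Actor actually returns. The email check is a loose shape test, not full validation.

```python
import re

# Loose shape check: something@something.tld with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_qualified(lead: dict, title_keywords: tuple[str, ...]) -> bool:
    """True when the lead has a plausible email and a job title containing any keyword."""
    email = (lead.get("email") or "").strip()
    title = (lead.get("jobTitle") or "").lower()
    if not EMAIL_RE.match(email):
        return False
    return any(keyword.lower() in title for keyword in title_keywords)
```

For example, `is_qualified(lead, ("marketing", "growth"))` would let through a "Head of Marketing" with an email and drop an engineer or a lead with no email.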

C) Docs crawl → embeddings (RAG)

Weekly Cron.
Run Actor: Website Content Crawler (or similar) against your docs root URL.
Split In Batches on Markdown/text items.
OpenAI (embeddings) → vector DB node or HTTP to Pinecone/Qdrant.
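
What Split In Batches does for this workflow can be sketched as a generator. I am assuming each crawled item carries its content in a `markdown` field with `text` as a fallback; check your Actor's actual output before relying on those names.

```python
from typing import Iterable, Iterator


def batched_texts(
    items: Iterable[dict], batch_size: int, text_field: str = "markdown"
) -> Iterator[list[str]]:
    """Group non-empty page texts into lists of at most batch_size for an embeddings call."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: list[str] = []
    for item in items:
        text = (item.get(text_field) or item.get("text") or "").strip()
        if not text:
            # Skip pages that produced no content rather than embedding empty strings.
            continue
        batch.append(text)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each yielded list maps to one request to the embeddings API, which keeps request sizes predictable.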

Tips for reliable pipelines

Prefer webhooks for anything that might exceed HTTP timeouts in n8n or intermediate proxies.
Validate JSON: missing fields after a site redesign should not crash the whole run—use IF / Set with defaults.
Rate-limit downstream APIs: Airtable, HubSpot, and others will 429 if you blast thousands of rows; add Wait or batch size caps.
Log runId and datasetId in your own tables for audit and replay.
Idempotency: use stable keys (URL, SKU) so retries do not duplicate rows.
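
Two of these tips, defaults for missing fields and logging `runId` and `datasetId`, fit in one small normalization step. This is a sketch with a hypothetical row schema (`title`, `price`, `inStock`); the point is the pattern, not the field names.

```python
from typing import Optional

# Hypothetical schema: fields the destination expects, with safe defaults.
DEFAULTS = {"title": "", "price": None, "inStock": False}


def normalize(item: dict, run_id: str, dataset_id: str) -> Optional[dict]:
    """Return a row with defaults filled in and audit IDs attached, or None if there is no URL."""
    url = item.get("url")
    if not url:
        # Without a stable key the row cannot be upserted or replayed; drop it.
        return None
    known = {k: v for k, v in item.items() if k in DEFAULTS and v is not None}
    row = {**DEFAULTS, **known}
    row.update({"url": url, "runId": run_id, "datasetId": dataset_id})
    return row
```

A site redesign that drops `price` then yields rows with `price` set to `None` instead of a failed run, and every row can be traced back to the run that produced it.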

For deeper API detail, see Apify + n8n integration in our docs.

Next step

Pick one high-value Actor, run it manually from n8n, then add a schedule once the dataset shape is stable.

Frequently Asked Questions

Can I use Apify with n8n?

Yes. Install the Apify community node package in n8n to run Actors and tasks and to read datasets.

How do I handle long-running Actors in n8n?

Run the Actor asynchronously and use Apify webhooks to call n8n when the run succeeds or fails; then fetch dataset items in a new execution.

Is n8n a better fit than Zapier for Apify workflows?

Often yes for heavy workloads: n8n handles branching, large JSON, and self-hosting better; Zapier’s short step timeouts can break long scrapes.

Do I need to write code?

Not for many flows: the Apify node plus built-in n8n nodes are enough. Code nodes are optional for complex transforms.

Where do I find the input format for an Actor?

Each Actor’s page on Apify documents its input schema and examples. Start from the Apify Store and copy the input from a successful manual run.

Common mistakes and fixes