Apify n8n Integration: Complete Guide (Actors, Datasets & Workflows)
n8n is a workflow automation tool that pairs well with Apify because it handles long-running jobs, large JSON payloads, and branching logic without the tight synchronous limits of simpler connectors. This guide covers setup, the official Apify node, webhook patterns for slow Actors, example workflows, and operational tips.
Apify integrates with n8n via the official Apify node. Trigger Actor runs, read datasets, and build automated data pipelines without writing code.
Why pair Apify with n8n?
- Apify runs Actors (scrapers, crawlers, automations) in the cloud and stores results in datasets and key-value stores.
- n8n orchestrates triggers (schedule, webhook, manual), transforms data, and pushes to CRMs, warehouses, Slack, OpenAI, and more.
Asynchronous wins: a scrape may take minutes. n8n can start a run and end the execution, and a Webhook-triggered workflow picks up when Apify calls your webhook. This avoids holding a fragile long HTTP connection open.
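
The receiving side of this pattern mostly needs to pull the run ID and dataset ID out of the webhook request body. The following is a minimal Python sketch, assuming Apify's default webhook payload layout (eventType, eventData.actorRunId, resource.defaultDatasetId). The function name and the returned dictionary are illustrative, and a custom payload template would change the field names.

```python
from typing import Any


def parse_apify_webhook(body: dict[str, Any]) -> dict[str, Any]:
    """Extract the fields a follow-up workflow needs from a webhook body.

    Assumes the default Apify payload template; adjust the keys if a
    custom payload template is configured on the webhook.
    """
    resource = body.get("resource") or {}
    event_data = body.get("eventData") or {}
    event_type = body.get("eventType", "")
    return {
        "event_type": event_type,
        "succeeded": event_type == "ACTOR.RUN.SUCCEEDED",
        # The run ID appears in eventData and on the run object itself.
        "run_id": event_data.get("actorRunId") or resource.get("id"),
        "dataset_id": resource.get("defaultDatasetId"),
    }


# Example body, trimmed to the fields used above (all values are made up).
sample = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "eventData": {"actorId": "actor123", "actorRunId": "run456"},
    "resource": {"id": "run456", "status": "SUCCEEDED", "defaultDatasetId": "ds789"},
}
info = parse_apify_webhook(sample)
print(info["run_id"], info["dataset_id"])
```

In n8n the same lookups can be done with expressions on the Webhook node's output; the sketch only makes the payload shape explicit.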
Setup
1. Apify account and token
- Sign up at Apify.
- Open Settings → Integrations → API tokens and create a token with the scopes you need.
- In n8n, store it as a credential (HTTP Header Auth or the Apify node’s dedicated credential).
2. Install the official Apify community node
In n8n: Settings → Community Nodes → Install
Package: @apify/n8n-nodes-apify (maintained by Apify).
You get nodes such as Run Actor, Run Task, Get Dataset Items, and related operations with first-class Apify response shapes.
3. Quick test workflow
- Manual trigger → Apify: Run Actor (pick a small Actor, e.g. a test scraper).
- Either wait for finish (short runs only) or run async and continue via webhook (below).
- Apify: Get Dataset Items → Spreadsheet or HTTP to your stack.
Key nodes and patterns
| Pattern | When to use | Outline |
|---|---|---|
| Sync “wait for finish” | Runs under ~1–2 minutes | Single execution; simple mental model. |
| Async run + webhook | Production scrapes | Paste the n8n Webhook trigger URL into the Actor's webhooks for the ACTOR.RUN.SUCCEEDED event; a second workflow fetches the dataset. |
| Schedule → Actor | Daily/hourly data | Cron trigger → Run Actor → Get Dataset Items → destination. |
| HTTP + API | Air-gapped n8n | POST /v2/acts/:actorId/runs then GET /v2/datasets/:id/items with Bearer token. |
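
For the HTTP + API row, the two requests can be sketched in Python with only the standard library. This is an illustration under assumptions: the Actor ID, token, and input below are placeholders, the helper names are made up, and the input shape depends entirely on the Actor you call.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that starts an Actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/v2/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_items_request(dataset_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET that reads a dataset's items as JSON."""
    return urllib.request.Request(
        url=f"{API_BASE}/v2/datasets/{dataset_id}/items?format=json",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def start_run(actor_id: str, token: str, actor_input: dict) -> dict:
    """Send the run request and return the run object from the `data` field."""
    request = build_run_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]


# Placeholder values; nothing is sent until start_run() is called.
req = build_run_request(
    "username~my-actor", "TOKEN", {"startUrls": [{"url": "https://example.com"}]}
)
print(req.get_method(), req.full_url)
```

The run object returned by start_run() carries the run's id and defaultDatasetId, which is what the second request needs.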
Pagination: Large datasets should use limit/offset or the node’s iteration—never assume one response holds millions of rows.
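
A limit/offset loop might look like the Python sketch below. The fetch_page callable is hypothetical: in practice it would wrap GET /v2/datasets/:id/items with offset and limit query parameters, but here it is faked with an in-memory list so the loop can be run on its own.

```python
from typing import Callable, Iterator


def iter_dataset_items(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 1000,
) -> Iterator[dict]:
    """Yield every item by requesting pages until one comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means the dataset is exhausted
        offset += len(page)


# Stand-in for the real API call: a fake dataset of 2,500 rows.
fake_dataset = [{"row": i} for i in range(2500)]
calls: list[tuple[int, int]] = []


def fake_fetch(offset: int, limit: int) -> list[dict]:
    calls.append((offset, limit))
    return fake_dataset[offset : offset + limit]


items = list(iter_dataset_items(fake_fetch, page_size=1000))
print(len(items), len(calls))
```

Because the function is a generator, downstream steps can process rows page by page instead of holding the whole dataset in memory.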
Example workflows
A) Competitor pricing → Google Sheets
- Cron (e.g. daily).
- Run Actor with your e-commerce or universal scraper Actor ID and input (start URLs).
- Get Dataset Items.
- Google Sheets append or upsert on SKU/URL.
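
The upsert step in this workflow comes down to merging new rows into existing ones by a stable key. Here is a small Python sketch of that logic, assuming each item carries a url field (a hypothetical field name; a SKU would work the same way). The Google Sheets node does this for you when configured to match on a column.

```python
def upsert_rows(existing: list[dict], incoming: list[dict], key: str = "url") -> list[dict]:
    """Merge incoming rows into existing ones, matching on `key`.

    Rows with a known key are updated in place; new keys are appended;
    rows without a usable key are skipped rather than duplicated.
    """
    by_key = {row[key]: dict(row) for row in existing}
    for item in incoming:
        value = item.get(key)
        if value in (None, ""):
            continue
        by_key.setdefault(value, {}).update(item)
    return list(by_key.values())


sheet = [{"url": "a", "price": 10}, {"url": "b", "price": 20}]
scraped = [{"url": "b", "price": 18}, {"url": "c", "price": 5}, {"price": 1}]
merged = upsert_rows(sheet, scraped)
print(merged)
```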
B) Leads → CRM with filtering
- Webhook (from your form or CRM) supplies a list of company domains.
- Run Actor (e.g. website or directory Actor).
- IF node: require email and job title match.
- HubSpot / Salesforce create or update contact.
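
The IF node's condition can be expressed as a small predicate. The Python sketch below assumes the Actor returns email and jobTitle fields and that "match" means a seniority keyword in the title; both the field names and the pattern are illustrative assumptions, not part of any Actor's documented output.

```python
import re


def is_qualified(lead: dict, title_pattern: str = r"\b(head|director|vp|chief)\b") -> bool:
    """Return True only if the lead has a plausible email and a matching title."""
    email = (lead.get("email") or "").strip()
    title = (lead.get("jobTitle") or "").strip()
    # Deliberately loose check: something@domain.tld, not full validation.
    has_email = "@" in email and "." in email.split("@")[-1]
    title_matches = re.search(title_pattern, title, re.IGNORECASE) is not None
    return has_email and title_matches


leads = [
    {"email": "a@example.com", "jobTitle": "VP of Sales"},
    {"email": "", "jobTitle": "Director"},
    {"email": "b@example.com", "jobTitle": "Intern"},
    {"email": None, "jobTitle": None},
]
qualified = [lead for lead in leads if is_qualified(lead)]
print(len(qualified))
```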
C) Docs crawl → embeddings (RAG)
- Weekly Cron.
- Run Actor: Website Content Crawler (or similar) against your docs root URL.
- Split In Batches on Markdown/text items.
- OpenAI (embeddings) → vector DB node or HTTP to Pinecone/Qdrant.
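
What Split In Batches does before the embeddings call can be sketched in Python as follows. The field names (url, markdown, text) are assumptions about the crawler's output, and the batch size is arbitrary; the embeddings request itself is left out.

```python
from typing import Iterable, Iterator


def texts_for_embedding(items: Iterable[dict]) -> Iterator[dict]:
    """Keep only items with usable content, preferring Markdown over plain text."""
    for item in items:
        content = (item.get("markdown") or item.get("text") or "").strip()
        if content:
            yield {"url": item.get("url"), "content": content}


def batched(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Group items into lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


pages = [
    {"url": "/a", "markdown": "# A"},
    {"url": "/b", "text": "plain b"},
    {"url": "/c", "markdown": "   "},  # empty after stripping, so skipped
    {"url": "/d", "markdown": "# D"},
]
batches = list(batched(texts_for_embedding(pages), size=2))
print([len(b) for b in batches])
```

Each batch would then go to the embeddings API in one request, which keeps request counts and payload sizes predictable.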
Tips for reliable pipelines
- Prefer webhooks for anything that might exceed HTTP timeouts in n8n or intermediate proxies.
- Validate JSON: missing fields after a site redesign should not crash the whole run—use IF / Set with defaults.
- Rate-limit downstream APIs: Airtable, HubSpot, and others will 429 if you blast thousands of rows; add Wait or batch size caps.
- Log runId and datasetId in your own tables for audit and replay.
- Idempotency: use stable keys (URL, SKU) so retries do not duplicate rows.
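
Two of these tips, defaults for missing fields and backing off on 429 responses, can be sketched in Python as below. The default field names, the send callable, and the retry settings are all hypothetical; in n8n the equivalents are Set/IF nodes and a Wait node or the node's retry settings.

```python
import time
from typing import Callable

DEFAULTS = {"title": "", "price": None, "inStock": False}


def with_defaults(item: dict, defaults: dict = DEFAULTS) -> dict:
    """Fill missing or null fields so downstream steps see a stable shape."""
    cleaned = dict(defaults)
    cleaned.update({k: v for k, v in item.items() if v is not None})
    return cleaned


def send_with_backoff(
    send: Callable[[dict], int],
    row: dict,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` (which returns an HTTP status) and retry on 429.

    Waits base_delay, then 2x, then 4x, ... between attempts.
    """
    for attempt in range(max_attempts):
        status = send(row)
        if status != 429:
            return 200 <= status < 300
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False


# Fake downstream API: rate-limited twice, then accepts the row.
responses = iter([429, 429, 200])
waits: list[float] = []
ok = send_with_backoff(
    lambda row: next(responses),
    with_defaults({"title": "Widget", "price": None}),
    sleep=waits.append,  # record the delays instead of actually sleeping
)
print(ok, waits)
```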
For deeper API detail, see Apify + n8n integration in our docs.
Pick one high-value Actor, run it manually from n8n, then add a schedule once the dataset shape is stable.
Apify does integrate with n8n: install the Apify community node package to run Actors and tasks and to read datasets.
For long-running Actors, run the Actor asynchronously and use Apify webhooks to call n8n when the run succeeds or fails, then fetch the dataset items in a new execution.
For heavy workloads, n8n is often a better fit than Zapier: it handles branching, large JSON, and self-hosting better, and Zapier's short step timeouts can break long scrapes.
Many flows need no code: the Apify node plus built-in n8n nodes are enough, and Code nodes are optional for complex transforms.
For Actor input, each Actor's page on Apify documents the input schema with examples. Start from the Apify Store and copy an Actor's input from a successful manual run.




