Lead Generation with Web Scraping: Data, Sources & CRM Workflow (2026)
Purchased lists go stale fast, overlap with competitors, and bounce. Scraping public business and professional signals—then enriching, validating, and syncing—gives you a pipeline you control.
Web scraping automates B2B lead collection from Google Maps, LinkedIn, company directories, and websites. Tools like Apify turn hours of manual prospecting into minutes of configured runs plus quality checks.
This article covers what data to collect, where to get it, an automation workflow, enrichment, and CRM integration. For ready-made tools, start with lead generation in the Apify Store.
What data to collect (and why)
| Field | Why it matters | Typical source |
|---|---|---|
| Company / place name | Messaging, dedupe | Maps, directories |
| Website / domain | Key for email guessing & tech lookup | Maps, footer, LinkedIn |
| Location & category | ICP fit | Maps, industry sites |
| Phone | Call workflows, dedupe | Maps, site |
| Role / title | Personalization | LinkedIn, listings |
| Person name | Sequences | LinkedIn, bylines |
| Source URL | Audit trail | Every scrape |
Rule: settle on one stable business key (domain + region, or Maps placeId where available) before you enrich people.
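As a minimal sketch of that rule, the function below derives a key from a scraped row. The field names (`placeId`, `website`, `region`) and the `place:` / `domain:` prefixes are assumptions for illustration; match them to whatever your Actor's output actually contains.

```python
from typing import Optional
from urllib.parse import urlparse


def bare_host(website: str) -> str:
    """Lowercase host without scheme, path, port, or a leading 'www.'."""
    raw = website.strip().lower()
    if "://" not in raw:
        raw = "//" + raw  # lets urlparse treat the value as a network location
    host = urlparse(raw).hostname or ""
    return host[4:] if host.startswith("www.") else host


def business_key(row: dict) -> Optional[str]:
    """Prefer a Maps place ID; fall back to domain + region; else no key."""
    place_id = (row.get("placeId") or "").strip()
    if place_id:
        return f"place:{place_id}"
    host = bare_host(row.get("website") or "")
    region = (row.get("region") or "").strip().lower()
    if host and region:
        return f"domain:{host}|{region}"
    return None  # hold the row back from enrichment until it has a key
```

Rows that return `None` are the ones to fix or drop before you spend enrichment credits on them.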
Best sources for B2B leads
- Google Maps — Local density: restaurants, clinics, agencies, contractors. Structured names, phones, sites, categories.
- LinkedIn — Titles and companies for decision-makers (stay on public data and tool terms).
- Company directories & niche listings — Industry portals, review sites, registries (search the Store for the domain).
- Company websites — Contact pages, team pages, generic inboxes, social links.
For hard targets (strict anti-bot defenses, large residential proxy needs), teams sometimes pair Apify runs with dedicated proxy vendors such as Bright Data; most SMB lead gen works with default Apify platform options.
Automation workflow (step by step)
Use this as a template; swap Actors to match your sources.
| Step | Action | Output |
|---|---|---|
| 1 | Define ICP — geo, category, company size, title keywords | Written filter rules |
| 2 | Discover — run a Google Maps or directory Actor with tight queries | Table of businesses + websites |
| 3 | Extract contacts from sites — feed website URLs to a contact / email Actor | Emails, phones, socials |
| 4 | Add humans — optional LinkedIn Actor pass on company or role queries | Names, titles, profile URLs |
| 5 | Normalize — one row per business; standard domains & phones | Clean CSV/JSON |
| 6 | Enrich — email finders (e.g. Hunter, Apollo) from name + domain | Guessed emails + scores |
| 7 | Validate — SMTP or verification vendor | “Valid” only for outreach |
| 8 | Dedupe — domain + market or place ID | CRM-safe file |
| 9 | Sync — webhook → Make, Zapier, or n8n → HubSpot, Pipedrive, Salesforce | Live CRM |
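The phone half of step 5 could look roughly like the sketch below. It assumes most numbers are North American (default country code `1`); that default and the function name are illustrative choices, and a dedicated phone-parsing library is a better fit for international data.

```python
import re


def normalize_phone(phone: str, default_country_code: str = "1") -> str:
    """Return a '+<digits>' string where the shape is recognizable."""
    digits = re.sub(r"\D", "", phone)  # drop spaces, dashes, parentheses
    if not digits:
        return ""
    if phone.strip().startswith("+"):
        return "+" + digits  # already carries a country code
    if len(digits) == 10:
        return "+" + default_country_code + digits
    if len(digits) == 11 and digits.startswith(default_country_code):
        return "+" + digits
    return digits  # unknown shape: keep digits only and review manually
```

Storing one canonical format per phone makes the later dedupe step a plain string comparison.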
Orchestration sketch:
Maps / directory Actor → dataset → contact Actor → (optional) LinkedIn Actor → enrich → validate → CRM
On Apify, each step is a run; use webhooks or scheduled exports to avoid manual downloads.
Example chain (local B2B)
- Input: query `commercial HVAC contractors`, city, max results (start small, e.g. 25).
- Maps run: collect `name`, `phone`, `website`, `category`, `rating`.
- Website pass: Contact Details Scraper or Email & Phone Extractor on each `website`.
- Filter: drop rows with no domain; flag `info@` / `contact@` vs person-like patterns if your team cares.
- Validate before any cold email.
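The filter step in this chain can be sketched in a few lines. The row shape (`domain`, `emails`) is an assumption, and the set of generic local parts beyond `info` and `contact` is an illustrative guess you should tune for your market.

```python
# Local parts treated as shared inboxes; only info@ and contact@ come from
# the text above, the rest are assumptions.
GENERIC_LOCAL_PARTS = {"info", "contact", "hello", "sales", "support", "office", "admin"}


def classify_email(email: str) -> str:
    """Label an address 'generic' or 'person-like' by its local part."""
    local = email.strip().lower().split("@", 1)[0]
    return "generic" if local in GENERIC_LOCAL_PARTS else "person-like"


def filter_and_flag(rows: list) -> list:
    """Drop rows with no domain and annotate each remaining email."""
    kept = []
    for row in rows:
        if not (row.get("domain") or "").strip():
            continue  # no domain means nothing to enrich or dedupe on
        flagged = dict(row)  # copy so the input rows stay untouched
        flagged["email_types"] = {
            email: classify_email(email) for email in (row.get("emails") or [])
        }
        kept.append(flagged)
    return kept
```

Flagging rather than deleting generic inboxes keeps them available for teams that do want to write to `info@`.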
For Maps-specific setup, see Scrape Google Maps. For LinkedIn boundaries, see Scrape LinkedIn.
Enrichment: what “done” looks like
- Minimum viable lead: company + domain + validated channel (email or phone you’re allowed to use).
- Sales-ready lead: + role/title + LinkedIn or proof of seniority + ICP tags.
- Never skip validation for email: high bounce rates hurt domain reputation (many teams aim for a bounce rate below 5%).
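One way to make these definitions checkable is a small tiering function plus a bounce-rate helper. The field names (`validated_email`, `icp_tags`, and so on) are hypothetical; the tiers simply restate the three bullets above.

```python
def lead_tier(lead: dict) -> str:
    """Classify a lead as 'incomplete', 'minimum-viable', or 'sales-ready'."""
    has_channel = bool(lead.get("validated_email") or lead.get("validated_phone"))
    if not (lead.get("company") and lead.get("domain") and has_channel):
        return "incomplete"
    has_seniority = bool(lead.get("linkedin_url") or lead.get("seniority_proof"))
    if lead.get("title") and has_seniority and lead.get("icp_tags"):
        return "sales-ready"
    return "minimum-viable"


def bounce_rate(sent: int, bounced: int) -> float:
    """Share of sent emails that bounced, as a fraction between 0 and 1."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return bounced / sent
```

For example, 8 bounces on 200 sends gives `bounce_rate(200, 8) == 0.04`, which sits under the 5% target mentioned above.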
CRM integration
- No-code: Apify webhook on successful run → Zapier / Make → create or update Company + Contact.
- Automation-heavy: n8n + Apify for branching (e.g. only if email valid).
- Custom: Pull results from the Apify dataset API in your own worker and upsert them through your CRM's API.
Map fields explicitly: domain → company match, email → contact key, source URL → custom property for compliance review.
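A rough sketch of that mapping, written as a plain function so it can sit in any worker or automation step. The payload shape and property names (`match_on`, `scrape_source_url`) are hypothetical and do not correspond to any specific CRM's API; translate them to your CRM's real field names.

```python
def to_crm_payload(lead: dict) -> dict:
    """Build an upsert payload: domain matches the company, email keys the contact."""
    if not lead.get("domain"):
        raise ValueError("domain is required to match the company")
    if not lead.get("email"):
        raise ValueError("email is required as the contact key")
    return {
        "company": {
            "match_on": "domain",
            "domain": lead["domain"].strip().lower(),
            "name": lead.get("company", ""),
        },
        "contact": {
            "match_on": "email",
            "email": lead["email"].strip().lower(),
            "title": lead.get("title", ""),
            # kept as a custom property so compliance can trace the origin
            "scrape_source_url": lead.get("source_url", ""),
        },
    }
```

Failing loudly on a missing domain or email keeps half-keyed records from creating duplicates in the CRM.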
Lead quality vs volume
One hundred validated, ICP-tight leads beat ten thousand generic rows. Filter early on category, geography, rating, and title keywords—not only after you pay for enrichment.
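An early filter of this kind might look like the sketch below. The `ICP` fields and row keys are assumptions; the one deliberate design choice is that a missing rating or title does not disqualify a row, since business rows usually arrive before any person data.

```python
from dataclasses import dataclass


@dataclass
class ICP:
    categories: set          # lowercase category names to accept
    cities: set              # lowercase city names to accept
    min_rating: float = 0.0
    title_keywords: tuple = ()


def matches_icp(row: dict, icp: ICP) -> bool:
    """True when a row passes category, geography, rating, and title checks."""
    if (row.get("category") or "").lower() not in icp.categories:
        return False
    if (row.get("city") or "").lower() not in icp.cities:
        return False
    rating = row.get("rating")
    if rating is not None and rating < icp.min_rating:
        return False
    title = row.get("title")
    if icp.title_keywords and title is not None:
        if not any(keyword in title.lower() for keyword in icp.title_keywords):
            return False
    return True
```

Run it right after discovery so only matching rows reach the paid enrichment and validation steps.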
Use category search to find maintained Actors before writing custom code.
Legal and compliance (short)
Laws and platform terms vary by country and channel. Use public data, respect robots/terms where they apply, document purpose, and give recipients clear opt-out for email. This is not legal advice—see Is web scraping legal? and counsel for your markets.
The strongest sources are Google Maps for local businesses (name, phone, website, category), LinkedIn for public professional signals (titles, companies), niche directories for industry lists, and company websites for published contacts. Chain sources so each step adds a field you actually use.
Apify runs pre-built Actors on a schedule, stores results in datasets, and connects to Zapier, Make, n8n, and APIs—so discovery, extraction, and handoff to CRM require far less manual copy-paste than browser-only workflows.
LinkedIn typically does not expose verified personal emails in public views. Common pattern: scrape public name + company, derive the corporate domain, then use an email finder and validator. Always comply with LinkedIn’s terms and applicable privacy laws.
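The "derive the domain, then guess" step can be illustrated with a small candidate generator. The five patterns are common conventions chosen here as an assumption, not a verified list, and every candidate still has to pass through a validator before use.

```python
import re


def email_candidates(full_name: str, domain: str) -> list:
    """Generate unverified address guesses from a name and a corporate domain."""
    # Keep ASCII letters only; accented names need transliteration first.
    parts = [re.sub(r"[^a-z]", "", part) for part in full_name.lower().split()]
    parts = [part for part in parts if part]
    if len(parts) < 2 or not domain:
        return []  # need at least a first and last name plus a domain
    first, last = parts[0], parts[-1]
    local_parts = [
        f"{first}.{last}",
        f"{first}{last}",
        f"{first[0]}{last}",
        first,
        f"{first}_{last}",
    ]
    return [f"{local}@{domain.strip().lower()}" for local in local_parts]
```

Email finder services do a more informed version of this by learning which pattern a given company actually uses.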
Mailbox providers track bounces. Sustained high bounce rates hurt deliverability for your whole domain. Validation (NeverBounce, ZeroBounce, or similar) reduces risk before CRM import or sequences.
Normalize domains, standardize phone formats, dedupe on domain plus market or a stable place ID from Maps, then use CRM upsert APIs or matching rules in your automation tool.
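A sketch of the dedupe-and-merge step, assuming rows are dictionaries with optional `placeId`, `domain`, and `market` fields (names are illustrative). The first row seen for a key wins, and later duplicates only fill in fields that are still empty.

```python
def dedupe(rows: list) -> tuple:
    """Return (unique_rows, unkeyed_rows), merging duplicates by business key."""
    merged = {}   # dicts preserve insertion order, so output order is stable
    unkeyed = []  # rows with neither a place ID nor domain + market
    for row in rows:
        place_id = (row.get("placeId") or "").strip()
        domain = (row.get("domain") or "").strip().lower()
        market = (row.get("market") or "").strip().lower()
        if place_id:
            key = ("place", place_id)
        elif domain and market:
            key = ("domain", domain, market)
        else:
            unkeyed.append(row)
            continue
        if key not in merged:
            merged[key] = dict(row)
        else:
            for field_name, value in row.items():
                if value and not merged[key].get(field_name):
                    merged[key][field_name] = value  # fill gaps, never overwrite
    return list(merged.values()), unkeyed
```

Keeping unkeyed rows in a separate list makes it easy to review them instead of silently importing possible duplicates.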
Whether scraping leads is legal depends on jurisdiction, data type (B2B vs personal), and how you use and market to contacts. GDPR, CAN-SPAM, CASL, and platform terms all matter. Consult qualified counsel for your use case; our legality guide is an overview only.




