How to Extract YouTube Transcripts Without the YouTube API (2026)

· 7 min read
Yassine El Haddad
Software Developer & Automation Specialist

I build production AI agents, web scrapers, and automation pipelines. Most of what I publish here comes from the actual problems they run into: proxies that get banned, anti-bot stacks that fingerprint your client, RAG that drifts when the underlying data moves. Stack: Python, TypeScript, Go, FastAPI, LangChain, Crawlee, Playwright, deployed on AWS, GCP, and Cloudflare.

What is a YouTube transcript scraper?

A YouTube transcript scraper extracts the spoken text from YouTube videos, either from existing captions or by transcribing the audio when captions are unavailable. The YouTube Data API v3 does not provide transcript text for arbitrary public videos: its caption download endpoint requires OAuth authorization from an account with permission to edit the video. A dedicated scraper reads the same public caption infrastructure used by the YouTube player's "Open transcript" panel, without requiring a Google Cloud project or API key.

YouTube Transcript Scraper — Captions & AI Fallback is the actor this guide uses throughout. It handles single videos, full playlists, and entire channels — and falls back to built-in Whisper AI for videos that have no captions.

YouTube Data API vs. a transcript scraper

| Feature | YouTube Data API v3 | Transcript scraper |
|---|---|---|
| Provides transcript text | No | Yes |
| Requires Google Cloud project | Yes | No |
| Has a daily quota | Yes (10,000 units free) | No |
| Handles caption-free videos | N/A | Yes (with AI fallback) |
| Output formats | N/A | JSON, text, LLM, SRT, VTT |
| Price | Free (within quota) | Pay-per-result |

Step 1 — Open the Actor

Go to apify.com/codepoetry/youtube-transcript-ai-scraper and click Try for free. Every new Apify account gets free credits — no credit card required.

Step 2 — Paste YouTube URLs

The startUrls field accepts:

  • Individual videos: https://www.youtube.com/watch?v=dQw4w9WgXcQ
  • Short links: https://youtu.be/dQw4w9WgXcQ
  • Playlists: https://www.youtube.com/playlist?list=PL...
  • Channels: https://www.youtube.com/@channelname

Playlists and channels expand into individual videos automatically. Use maxResults to cap how many videos are processed (default: 10).
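
Before submitting a long list of links, it can help to check locally that each one matches a shape listed above. The following is a minimal sketch using only the standard library; the helper names classify_youtube_url and build_start_urls are hypothetical, and the matching rules are an approximation of the four URL types, not the actor's own validation logic.

```python
from urllib.parse import urlparse, parse_qs


def classify_youtube_url(url: str) -> str:
    """Return 'video', 'playlist', 'channel', or 'unknown' for a URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path
    query = parse_qs(parsed.query)

    if host == "youtu.be" and len(path) > 1:
        return "video"  # short link: the path carries the video ID
    if host not in ("youtube.com", "m.youtube.com"):
        return "unknown"
    if path == "/watch" and "v" in query:
        return "video"
    if path == "/playlist" and "list" in query:
        return "playlist"
    if path.startswith("/@") and len(path) > 2:
        return "channel"
    return "unknown"


def build_start_urls(urls):
    """Keep recognised URLs only, in the {"url": ...} shape startUrls uses."""
    return [{"url": u} for u in urls if classify_youtube_url(u) != "unknown"]
```

The list returned by build_start_urls can be passed as the startUrls value of the run input.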

Step 3 — Choose output formats

| Format | Field | Use it when |
|---|---|---|
| JSON | transcript_json | Timestamped segments [{start, end, text}], for subtitle editing or custom parsing |
| Text | transcript_text | Plain text joined with spaces, for search indexing or simple NLP |
| LLM | transcript_llm | Filler tokens stripped, for RAG pipelines, summarisation, fine-tuning |
| SRT | transcript_srt | Standard subtitle file, for video players and editing software |
| VTT | transcript_vtt | WebVTT, for HTML5 <video> elements |

Request multiple formats in a single run by setting outputFormats: ["json", "llm"].
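
As a rough sketch, the format table can be mirrored in a small lookup so that a script knows which dataset fields to expect for the formats it requested. Only "json" and "llm" appear verbatim as outputFormats values in this guide; the names "text", "srt", and "vtt" are assumed to follow the same pattern, and the helpers build_run_input and expected_fields are hypothetical.

```python
# Format name -> dataset field, following the table above.
# "text", "srt" and "vtt" are assumed names; check the actor's input schema.
FORMAT_FIELDS = {
    "json": "transcript_json",
    "text": "transcript_text",
    "llm": "transcript_llm",
    "srt": "transcript_srt",
    "vtt": "transcript_vtt",
}


def build_run_input(urls, formats, max_results=10):
    """Build a run input dict, rejecting format names not in the table."""
    unknown = [f for f in formats if f not in FORMAT_FIELDS]
    if unknown:
        raise ValueError(f"Unsupported output format(s): {unknown}")
    return {
        "startUrls": [{"url": u} for u in urls],
        "outputFormats": list(formats),
        "maxResults": max_results,
    }


def expected_fields(formats):
    """Return the dataset fields that the requested formats should produce."""
    return [FORMAT_FIELDS[f] for f in formats]
```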

Step 4 — AI Fallback for caption-free videos

Most transcript tools fail silently when a video has no captions. YouTube Transcript Scraper — Captions & AI Fallback handles this differently:

  1. Check — the actor looks for native captions (manual first, then auto-generated) in your requested languages.
  2. Extract — if found, captions are fetched and formatted immediately. No audio download. Cost: $0.001 per video (free plan).
  3. Transcribe — if no captions exist and aiFallback: true, the actor downloads the audio and runs a bundled faster-whisper model on Apify's compute. Cost: $0.012/min of video duration (free plan).
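
The three steps above can be sketched as a small cost estimator. The two prices are the free-plan figures quoted in the list; everything else is an assumption made for illustration: that AI minutes are billed fractionally, that the AI path does not also incur the per-video caption fee, and that error items cost nothing. The function name estimate_video_cost is hypothetical.

```python
CAPTION_COST_PER_VIDEO = 0.001  # USD, free-plan figure quoted above
AI_COST_PER_MINUTE = 0.012      # USD per minute of video, free-plan figure


def estimate_video_cost(duration_seconds, has_captions, ai_fallback):
    """Return (path, estimated_cost_usd) for a single video.

    path is "captions", "ai", or "error" (no captions and no fallback).
    """
    if has_captions:
        return "captions", CAPTION_COST_PER_VIDEO
    if not ai_fallback:
        return "error", 0.0  # assumed: error items are not charged
    minutes = duration_seconds / 60  # assumed: fractional-minute billing
    return "ai", round(minutes * AI_COST_PER_MINUTE, 4)
```

Under these assumptions, a 213-second video without captions would cost roughly $0.04 to transcribe with the fallback enabled.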

Spend safeguards:

  • maxAiMinutes: N — hard cap on total AI minutes per run. Videos beyond the limit get an AI_MINUTES_LIMIT_REACHED error item; the run continues for the rest.
  • skipAiFallbackIfLongerThan: N — skip AI for any video longer than N minutes. Useful for channels that mix short clips with long recordings.
  • dryRun: true — scan all URLs and return would_need_ai and estimated_ai_min per video without transcribing or charging anything. Always use this before running an unknown playlist.
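
A dry run is most useful when its output is summarised before the real run. The sketch below assumes that would_need_ai and estimated_ai_min sit at the top level of each dataset item (the guide names the fields but not their position) and reuses the free-plan AI price; summarise_dry_run is a hypothetical helper.

```python
AI_COST_PER_MINUTE = 0.012  # USD, free-plan figure quoted in this guide


def summarise_dry_run(items, max_ai_minutes=None):
    """Summarise dry-run items: how many videos need AI and a rough cost."""
    needs_ai = [it for it in items if it.get("would_need_ai")]
    total_min = sum(float(it.get("estimated_ai_min") or 0) for it in needs_ai)
    # A maxAiMinutes cap limits how many of those minutes can be billed.
    billable = total_min if max_ai_minutes is None else min(total_min, max_ai_minutes)
    return {
        "videos_total": len(items),
        "videos_needing_ai": len(needs_ai),
        "estimated_ai_min": total_min,
        "estimated_ai_cost_usd": round(billable * AI_COST_PER_MINUTE, 2),
    }
```

Comparing estimated_ai_min against the intended maxAiMinutes value shows how many videos are likely to come back as AI_MINUTES_LIMIT_REACHED.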

Step 5 — Export results

Results land in the Apify dataset. Download as JSON, CSV, or Excel from the Storage tab. Or consume programmatically:

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("codepoetry/youtube-transcript-ai-scraper").call(
    run_input={
        "startUrls": [{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}],
        "languages": ["en"],
        "outputFormats": ["json", "llm"],
        "aiFallback": False,
    }
)

# Failed items carry an error_code, so skip them before reading the transcript.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("error_code"):
        print("Failed:", item["error_code"])
        continue
    print(item["metadata"]["title"])
    print(item["transcript_llm"][:300])

Install the client: pip install apify-client

Get your API token from Apify Console → Settings → Integrations.
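
Rather than pasting the token into the script, one option is to read it from the environment. This is a minimal sketch; the variable name APIFY_TOKEN is a naming choice made here, and get_apify_token is a hypothetical helper.

```python
import os


def get_apify_token(env=os.environ):
    """Read the API token from the environment instead of hardcoding it."""
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set the APIFY_TOKEN environment variable first")
    return token
```

The result can then replace the "YOUR_API_TOKEN" placeholder when constructing ApifyClient.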

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('codepoetry/youtube-transcript-ai-scraper').call({
  startUrls: [{ url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ' }],
  languages: ['en'],
  outputFormats: ['json', 'llm'],
  aiFallback: false,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
  console.log(item.metadata.title);
  console.log(item.transcript_llm?.slice(0, 300));
});

Install the client: npm install apify-client

Output structure

Each successful item includes:

{
  "metadata": {
    "id": "dQw4w9WgXcQ",
    "title": "Rick Astley - Never Gonna Give You Up",
    "channel": "Rick Astley",
    "duration": 213,
    "view_count": 1757728410,
    "upload_date": "20091025"
  },
  "language": "en",
  "is_auto_generated": false,
  "is_ai_generated": false,
  "transcript_json": [
    { "start": 18.5, "end": 21.0, "text": "We're no strangers to love" }
  ],
  "transcript_text": "We're no strangers to love You know the rules and so do I ...",
  "transcript_llm": "We're no strangers to love You know the rules and so do I ..."
}

Failed items include an error_code field (NO_CAPTIONS_AVAILABLE, LANGUAGE_NOT_FOUND, AGE_RESTRICTED, etc.) so you can filter programmatically without stopping the batch.
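
One way to act on error_code is to split a batch into successes and failures grouped by code. This is a minimal sketch that assumes error_code is absent or empty on successful items; split_results is a hypothetical helper.

```python
from collections import defaultdict


def split_results(items):
    """Separate successful items from failures grouped by error_code."""
    ok, failed = [], defaultdict(list)
    for item in items:
        code = item.get("error_code")
        if code:
            failed[code].append(item)
        else:
            ok.append(item)
    return ok, dict(failed)
```

The failures grouped under NO_CAPTIONS_AVAILABLE, for instance, are natural candidates for a second run with aiFallback enabled.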

Use cases

  • RAG / LLM knowledge base — use transcript_llm to build domain-specific corpora from lecture recordings, conference talks, or expert interviews
  • Subtitle generation — request SRT or VTT for your own videos as a starting point for manually reviewed captions
  • Content research — extract spoken keywords from a competitor's channel at scale
  • SEO — turn video transcripts into text content that ranks alongside YouTube results on Google
  • Sentiment and NLP — run topic modelling or named entity recognition over a large transcript corpus
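
For the RAG use case, transcripts usually need to be split into chunks before embedding. The following is a simple word-based chunker with overlap, offered as a sketch; the default sizes are arbitrary illustrative values, and chunk_words is a hypothetical helper, not part of the actor.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks that overlap by a fixed amount."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of some duplicated text in the index.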
Frequently Asked Questions

Do I need a YouTube API key or a Google Cloud project?

No. YouTube Transcript Scraper — Captions & AI Fallback accesses publicly available caption data, the same data visible in the YouTube player's Open transcript panel. No Google Cloud project or YouTube API credentials are needed. You authenticate with your Apify account only.

What happens when a video has no captions?

With aiFallback disabled (default), the actor returns an error item with error_code NO_CAPTIONS_AVAILABLE and continues processing the rest of the batch. With aiFallback enabled, it downloads the audio and transcribes it using a built-in faster-whisper model, with no external API key required. The output has the same structure as a native caption result, with is_ai_generated: true.

How do I keep AI transcription costs under control?

Use maxAiMinutes to set a hard cap on AI minutes per run. Use skipAiFallbackIfLongerThan to skip videos above a certain length. Run dryRun: true first to preview which videos need AI and what the estimated cost would be. Dry runs charge nothing.

Can I extract transcripts from an entire channel or playlist?

Yes. Paste the channel URL (youtube.com/@channelname) and the actor expands it into individual videos automatically. Use maxResults to cap how many are fetched. Videos are returned newest-first for channels; playlists follow their defined order.

What is the transcript_llm format?

transcript_llm is the plain text transcript with [Music], (laughter), and other filler tokens stripped and whitespace normalised. It is ready to pass directly to a LangChain Document, a LlamaIndex TextNode, or any vector store ingestion pipeline without post-processing.
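
To make the idea of stripped filler tokens concrete, here is a rough local approximation of that kind of cleanup. It is not the actor's implementation: the pattern simply removes any short bracketed or parenthesised aside, which may be broader or narrower than the actor's own rules, and strip_fillers is a hypothetical helper.

```python
import re

# Short bracketed or parenthesised cues such as [Music] or (laughter).
# Approximation only: any aside of up to 40 characters is removed.
_CUE = re.compile(r"\[[^\]]{1,40}\]|\([^)]{1,40}\)")


def strip_fillers(text: str) -> str:
    """Remove cue markers and collapse runs of whitespace."""
    without_cues = _CUE.sub(" ", text)
    return " ".join(without_cues.split())
```

A helper like this can be useful for bringing transcript_text from older runs closer to the transcript_llm style without re-running the actor.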