claude-real-video Lets Any LLM Watch Videos Locally
- claude-real-video (crv) is an MIT-licensed open-source tool that turns any video URL or local file into a folder of key frames, a transcript, and a MANIFEST.txt that Claude, ChatGPT, or Gemini can ingest directly.
- Unlike Gemini's native pipeline, which samples frames at a fixed 1fps interval, claude-real-video uses a single ffmpeg select pass to grab every scene change with a density floor (configurable via --fps-floor), so fast-cut reels and static screencasts are both covered.
- The tool dedupes frames using real pixel difference on downscaled RGB against a sliding window (--dedup-window), avoiding the blind spots of perceptual hashes on flat colors or equal-luma hue changes; --report outputs report.html showing every keep/drop decision with its diff percentage.
- claude-real-video reuses existing subtitles (.srt/.vtt sidecar or embedded track) before falling back to OpenAI Whisper for transcription, and can preserve the full lossless audio track via --keep-audio for models that can listen (Gemini, GPT-4o).
- Installation is pip install claude-real-video, with an optional [whisper] extra; ffmpeg, ffprobe, and (for URLs) yt-dlp must be on PATH, and it runs on macOS, Windows, and Linux with Python 3.10+.
- Login-gated videos (e.g., private YouTube/Instagram content) are supported via a Netscape-format cookie file passed with --cookies, with the README explicitly warning users not to ship credentials in repos.
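
The sliding-window dedup described above can be illustrated with a small sketch. This is not the tool's actual code: the function names, the per-channel tolerance, and the 5% keep threshold are all assumptions chosen for illustration, and frames are modeled as flat lists of RGB tuples standing in for downscaled images.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Frame = Sequence[Pixel]  # a downscaled frame, flattened to RGB pixels


def diff_percent(a: Frame, b: Frame, tolerance: int = 16) -> float:
    """Percentage of pixels where any RGB channel differs by more than `tolerance`.

    `tolerance` is a hypothetical parameter, not a documented crv option.
    """
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    changed = sum(
        1
        for pa, pb in zip(a, b)
        if max(abs(ca - cb) for ca, cb in zip(pa, pb)) > tolerance
    )
    return 100.0 * changed / len(a)


def dedupe(
    frames: Sequence[Frame], window: int = 3, min_diff: float = 5.0
) -> List[Tuple[int, bool, float]]:
    """Return (index, kept, closest_diff_percent) for every frame.

    A frame is kept only if it differs by at least `min_diff` percent from
    every one of the last `window` kept frames (both values are assumed).
    """
    recent: Deque[Frame] = deque(maxlen=window)
    decisions: List[Tuple[int, bool, float]] = []
    for index, frame in enumerate(frames):
        # With nothing kept yet, treat the frame as 100% different.
        closest = min((diff_percent(frame, kept) for kept in recent), default=100.0)
        keep = closest >= min_diff
        if keep:
            recent.append(frame)
        decisions.append((index, keep, closest))
    return decisions
```

Comparing against a window of kept frames rather than only the previous one is what lets a sequence like A, B, A drop the second A. The per-frame decision list is the kind of data a keep/drop report could be built from.
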
Why it matters: Most existing video-to-LLM paths either ignore the picture (ChatGPT reads transcripts), refuse video entirely (Claude), or upload to a cloud with rigid 1fps sampling that misses fast cuts (Gemini). claude-real-video's local, scene-change-driven extraction gives developers analyzing video more meaningful frames per context token, a direct cost-and-accuracy win for anyone building on top of Claude, ChatGPT, or Gemini without sending data to a third party.
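
To see why scene-change selection with a density floor covers both fast cuts and static footage, here is a minimal simulation. It is a sketch under assumptions, not the tool's ffmpeg expression: the function name, the 0.3 scene threshold, and the reading of the floor as "at least one frame every 1/fps_floor seconds" are all hypothetical.

```python
from typing import List, Sequence, Tuple


def select_frames(
    frames: Sequence[Tuple[float, float]],
    scene_threshold: float = 0.3,
    fps_floor: float = 0.5,
) -> List[float]:
    """Pick timestamps from (timestamp_seconds, scene_score) pairs.

    A frame is selected when its scene score exceeds `scene_threshold`
    (a cut), or when at least 1 / fps_floor seconds have passed since
    the last selected frame (the assumed density floor).
    """
    if fps_floor <= 0:
        raise ValueError("fps_floor must be positive")
    max_gap = 1.0 / fps_floor
    selected: List[float] = []
    for timestamp, score in frames:
        is_cut = score > scene_threshold
        gap_exceeded = not selected or timestamp - selected[-1] >= max_gap
        if is_cut or gap_exceeded:
            selected.append(timestamp)
    return selected


# Fast-cut reel: two cuts inside one second, which 1fps sampling could miss.
reel = [(0.0, 0.0), (0.25, 0.9), (0.5, 0.8), (0.75, 0.0)]
# Static screencast: no cuts at all, so only the floor produces frames.
screencast = [(float(t), 0.0) for t in range(5)]

print(select_frames(reel))        # cuts are captured individually
print(select_frames(screencast))  # floor yields one frame every 2 seconds
```

In this model the reel yields a frame at each cut, while the screencast still yields frames at a steady minimum rate even though nothing triggers scene detection.
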




