Crawl4AI

Open-source LLM-friendly web crawler

Consider with caveats · API · Free tier
Audited by Jack Phillips · Updated June 2026

Overall score

3.9 / 5

  • SME fit: 4/5 (flat pricing + free tier · technical setup)
  • JTBD: 5/5 (clearly named, measurable job)
  • Integration: 4/5 (API + 6 integrations)
  • Trust: 2/5 (growing, founded 2024)
  • Quality: 5/5 (4.8 on GitHub, 40,000 reviews)
  • Compliance: 2/5 (customer-choice residency)

About

Crawl4AI is an Apache 2.0 open-source Python crawler and scraper built specifically for LLM and RAG pipelines. It outputs clean Markdown, supports CSS, XPath, or LLM-based extraction, and ships with browser hooks, stealth mode, and parallel crawling.

Best for: Developer-led SMEs and data teams building RAG or agent pipelines who want a free, self-hosted scraper they can fully control.
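
As a rough illustration of the Markdown output described above, the sketch below crawls a single page and returns its Markdown. It assumes the `crawl4ai` package and its browser dependencies are already installed; `fetch_markdown`, `main`, and the example URL are hypothetical names chosen for this sketch, and the exact result fields may differ between library versions.

```python
import asyncio


async def fetch_markdown(url: str) -> str:
    """Crawl one page and return its Markdown rendering (illustrative helper)."""
    # Imported lazily so this module still loads where crawl4ai is not installed.
    # Assumption: crawl4ai and its browser setup are available at call time.
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        if not result.success:
            raise RuntimeError(f"Crawl failed for {url}: {result.error_message}")
        # Convert to str so the caller gets plain Markdown text.
        return str(result.markdown)


def main() -> None:
    # Placeholder URL; replace with the page you actually want to crawl.
    print(asyncio.run(fetch_markdown("https://example.com")))
```

Calling `main()` performs a real network request, so it is left uninvoked here. The returned string can be passed straight to a chunker or embedding step in a RAG pipeline.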

Pricing

  • Open Source

    Monthly: n/a
    Annual /mo: Free
    Billing: flat
    Notes: Full Python library; Docker image; all features; community support · Apache 2.0 license; self-hosted

Key features

  • Clean Markdown output for LLMs
  • CSS, XPath, and LLM-based extraction
  • Parallel crawling with chunking
  • Stealth mode and proxy support
  • Docker self-hosting
  • Hooks and session re-use

Integrations

OpenAI · Anthropic · Ollama · LangChain · LlamaIndex · Docker

Trust & compliance

Stage range: Idea → Growth
Founded: 2024
Status: active
SOC 2: unknown
GDPR: unknown
Data residency: customer choice
External rating: 4.8 on GitHub (40,000 reviews)
Last verified: Jun 2026