Skip to content
Back to all tools

Groq

Sub-second LPU inference — Llama 3.1 8B at 840 tokens/sec for $0.05/M input

APIFree tier
Jack Phillips
Audited by Jack Phillips · Updated June 2026
Visit site

Overall score

3.1/ 5
SME fit3/5
usage-metered pricing + free tier
JTBD4/5
solid named JTBD
Integration3/5
API
Trust5/5
mature, founded 2016
Quality1/5
no public rating
Compliance2/5
compliance unknown

About

Groq runs language models on its own LPU (Language Processing Unit) hardware, optimized for inference speed. Throughput is the differentiator: Llama 3.1 8B at 840 tokens/sec, GPT OSS 20B at 1,000 tokens/sec — multiples faster than GPU-based competitors. Pricing is per-token PAYG at the cheapest end of the inference market.

Best for: Real-time AI experiences where latency matters — voice agents, streaming chat, autocomplete. The free API key + cheap per-token rates make Groq the default speed-comparison benchmark before considering Together AI or OpenAI.

Pricing

  • Pay-as-you-go

    Monthly
    Free
    Annual /mo
    Free
    Billing
    usage
    Notes
    Free API key on signup;Llama 3.1 8B Instant: $0.05/M input, $0.08/M output;GPT OSS 20B: $0.075/M input, $0.30/M output;GPT OSS 120B: $0.15/M input, $0.60/M output;Llama 3.3 70B: $0.59/M input, $0.79/M output;OpenAI-compatible API · Linear pricing across the catalog. No platform fee.
  • Enterprise

    Monthly
    n/a
    Annual /mo
    n/a
    Billing
    flat
    Notes
    Enterprise-only models (Minimax M2.5, Qwen3-VL 32B);On-premises deployments;Custom SLAs;Volume pricing · Contact sales — required for on-prem and gated models.

Key features

  • Custom LPU hardware (vs GPU competitors)
  • Llama 3.1 8B: 840 tokens/sec at $0.05/M input
  • Free API key with no specific limit on signup
  • OpenAI-compatible endpoints
  • Wide model catalog (Llama, GPT OSS, Qwen, DeepSeek)
  • Sub-second response times
  • Linear, predictable token pricing

Integrations

OpenAI-compatible APILangChainLlamaIndexHeliconeOpenRouter

Trust & compliance

Stage range
Solopreneur → Seed
Founded
2016
Status
active
SOC 2
unknown
GDPR
unknown
Data residency
unknown
External rating
n/a
Last verified
Jun 2026

Reviews

Be the first to share your experience.

Pairs well with