Summary

Groq provides a high-speed AI inference engine, the LPU™ Inference Engine, available through cloud and on-premise solutions. They offer API access for developers to integrate various openly-available AI models, including large language models, text-to-speech, and automatic speech recognition models. Groq also provides enterprise solutions for large-scale deployments and custom model requests.

Features
4/13
See all

Must Have

4 of 5

Conversational AI

API Access

Fine-Tuning & Custom Models

Enterprise Solutions

Safety & Alignment Framework

Other

0 of 8

Image Generation

Code Generation

Multimodal AI

Research & Publications

Security & Red Teaming

Synthetic Media Provenance

Threat Intelligence Reporting

Global Affairs & Policy

Pricing
Usage-based
See all

Llama 4 Scout (17Bx16E)

$0.11 per use
  • 460 Tokens per Second

Llama 4 Scout (17Bx16E)

$0.34 per use
  • 460 Tokens per Second

Llama 4 Maverick (17Bx128E)

$0.20 per use
  • 581 Tokens per Second

Llama 4 Maverick (17Bx128E)

$0.60 per use
  • 581 Tokens per Second

Llama Guard 4 12B 128k

$0.20 per use
  • 325 Tokens per Second

Llama Guard 4 12B 128k

$0.20 per use
  • 325 Tokens per Second

DeepSeek R1 Distill Llama 70B

$0.75 per use
  • 275 Tokens per Second

DeepSeek R1 Distill Llama 70B

$0.99 per use
  • 275 Tokens per Second

Qwen3 32B 131k

$0.29 per use
  • 491 Tokens per Second

Qwen3 32B 131k

$0.59 per use
  • 491 Tokens per Second

Qwen QwQ 32B (Preview) 128k

$0.29 per use
  • 400 Tokens per Second

Qwen QwQ 32B (Preview) 128k

$0.39 per use
  • 400 Tokens per Second

Mistral Saba 24B

$0.79 per use
  • 330 Tokens per Second

Mistral Saba 24B

$0.79 per use
  • 330 Tokens per Second

Llama 3.3 70B Versatile 128k

$0.59 per use
  • 275 Tokens per Second

Llama 3.3 70B Versatile 128k

$0.79 per use
  • 275 Tokens per Second

Llama 3.1 8B Instant 128k

$0.05 per use
  • 750 Tokens per Second

Llama 3.1 8B Instant 128k

$0.08 per use
  • 750 Tokens per Second

Llama 3 70B 8k

$0.59 per use
  • 330 Tokens per Second

Llama 3 70B 8k

$0.79 per use
  • 330 Tokens per Second

Llama 3 8B 8k

$0.05 per use
  • 1250 Tokens per Second

Llama 3 8B 8k

$0.08 per use
  • 1250 Tokens per Second

Gemma 2 9B 8k

$0.20 per use
  • 500 Tokens per Second

Gemma 2 9B 8k

$0.20 per use
  • 500 Tokens per Second

Llama Guard 3 8B 8k

$0.20 per use
  • 765 Tokens per Second

Llama Guard 3 8B 8k

$0.20 per use
  • 765 Tokens per Second

PlayAI Dialog v1.0

$50.00 per use
  • 140 Characters /s

Whisper V3 Large

$0.11 per use
  • 189x Speed Factor

Whisper Large v3 Turbo

$0.04 per use
  • 216x Speed Factor

Distil-Whisper

$0.02 per use
  • 250x Speed Factor
Rationale

Groq offers an AI inference engine with API access for developers, supporting various large language models for conversational AI. They explicitly mention 'Enterprise Access' for custom and large-scale needs, and their pricing page states 'Other models are available for specific customer requests including fine tuned models,' indicating support for custom models. While they focus on inference speed, the core functionalities align with the OpenAI Platform's offerings for developers and enterprises.