Fireworks AI

fireworks.ai
Summary

Fireworks AI provides a platform for developers and enterprises to build and deploy generative AI applications. It offers fast inference for open-source LLMs and image models, along with tools for fine-tuning, custom model deployment, and enterprise-grade features like security, compliance, and scalability.

Features
8/13

Must Have

5 of 5

Conversational AI

API Access

Safety & Alignment Framework

Fine-Tuning & Custom Models

Enterprise Solutions

Other

3 of 8

Image Generation

Code Generation

Multimodal AI

Research & Publications

Security & Red Teaming

Synthetic Media Provenance

Threat Intelligence Reporting

Global Affairs & Policy

Pricing
Usage-based

Serverless Inference - Text and Vision - Less than 4B parameters

$0.10 per use

Serverless Inference - Text and Vision - 4B - 16B parameters

$0.20 per use

Serverless Inference - Text and Vision - More than 16B parameters

$0.90 per use

Serverless Inference - Text and Vision - MoE 0B - 56B parameters (e.g. Mixtral 8x7B)

$0.50 per use

Serverless Inference - Text and Vision - MoE 56.1B - 176B parameters (e.g. DBRX, Mixtral 8x22B)

$1.20 per use

Serverless Inference - Text and Vision - DeepSeek V3

$0.90 per use

Serverless Inference - Text and Vision - DeepSeek R1 (Fast)

$3.00 per use

Serverless Inference - Text and Vision - DeepSeek R1 0528 (Fast)

$3.00 per use

Serverless Inference - Text and Vision - DeepSeek R1 (Basic)

$0.55 per use

Serverless Inference - Text and Vision - Meta Llama 3.1 405B

$3.00 per use

Serverless Inference - Text and Vision - Meta Llama 4 Maverick (Basic)

$0.22 per use

Serverless Inference - Text and Vision - Meta Llama 4 Scout (Basic)

$0.15 per use

Serverless Inference - Text and Vision - Qwen3 235B

$0.22 per use

Serverless Inference - Text and Vision - Qwen3 30B

$0.15 per use

Speech to Text (STT) - Whisper-v3-large

$0.00 per use

Speech to Text (STT) - Whisper-v3-large-turbo

$0.00 per use

Speech to Text (STT) - Streaming transcription service

$0.00 per use

Image Generation - All Non-Flux Models (SDXL, Playground, etc)

$0.00 per use

Image Generation - All Non-Flux Models (SDXL, Playground, etc) with ControlNet

$0.00 per use

Image Generation - FLUX.1 [dev]

$0.00 per use
  • N/A on serverless

Image Generation - FLUX.1 [schnell]

$0.00 per use
  • N/A on serverless

Embeddings - up to 150M

$0.01 per use

Embeddings - 150M - 350M

$0.02 per use

Fine Tuning - Models up to 16B parameters

$0.50 per use

Fine Tuning - Models 16.1B - 80B

$3.00 per use

Fine Tuning - DeepSeek R1 / V3

$10.00 per use

On-Demand Deployments - A100 80 GB GPU

$2.90 per use

On-Demand Deployments - H100 80 GB GPU

$5.80 per use

On-Demand Deployments - H200 141 GB GPU

$6.99 per use

On-Demand Deployments - B200 180 GB GPU

$11.99 per use

On-Demand Deployments - AMD MI300X

$4.99 per use
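
The listing quotes every rate "per use" without naming the billing unit. As a rough sketch only, the example below assumes the serverless text rates are charged per 1M tokens and the on-demand GPU rates per GPU-hour; neither unit is stated in this listing, and the function and dictionary names are hypothetical. It shows how a usage-based bill could be estimated from the listed rates under those assumptions.

```python
from decimal import Decimal

# Rates copied from the listing above. The billing units are ASSUMPTIONS:
# the listing only says "per use".
SERVERLESS_RATE_USD = {  # assumed unit: USD per 1M tokens
    "under_4b": Decimal("0.10"),
    "4b_to_16b": Decimal("0.20"),
    "over_16b": Decimal("0.90"),
}
GPU_RATE_USD = {  # assumed unit: USD per GPU-hour
    "A100_80GB": Decimal("2.90"),
    "H100_80GB": Decimal("5.80"),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def serverless_cost(tier: str, tokens: int) -> Decimal:
    """Estimated serverless cost in USD for a given token count."""
    return SERVERLESS_RATE_USD[tier] * Decimal(tokens) / TOKENS_PER_UNIT


def on_demand_cost(gpu: str, hours: Decimal, gpu_count: int = 1) -> Decimal:
    """Estimated on-demand cost in USD for a number of GPUs over some hours."""
    return GPU_RATE_USD[gpu] * hours * gpu_count


def break_even_tokens_per_hour(tier: str, gpu: str) -> int:
    """Tokens per hour at which serverless spend equals one GPU-hour."""
    return int(GPU_RATE_USD[gpu] / SERVERLESS_RATE_USD[tier] * TOKENS_PER_UNIT)


if __name__ == "__main__":
    print(serverless_cost("over_16b", 2_000_000))
    print(on_demand_cost("H100_80GB", Decimal("1.5"), gpu_count=2))
    print(break_even_tokens_per_hour("over_16b", "H100_80GB"))
```

The break-even figure compares list prices only. Whether a dedicated GPU can actually sustain that token throughput depends on the model and workload, which this listing does not cover.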

Rationale

Fireworks AI offers a platform for deploying and fine-tuning generative AI models, including LLMs and image models, with a strong emphasis on fast inference. It provides API access, support for custom models and fine-tuning, and dedicated enterprise solutions with security and compliance features. The platform explicitly mentions support for text, audio, and image modalities, along with function calling and JSON mode for structured outputs, which are key to the Conversational AI and Multimodal AI features. The security and compliance emphasis of its enterprise offerings is what aligns it with the Safety & Alignment Framework feature.