Fireworks AI
fireworks.ai
Summary
Fireworks AI provides a platform for developers and enterprises to build and deploy generative AI applications. It offers fast inference for open-source LLMs and image models, along with tools for fine-tuning, custom model deployment, and enterprise-grade features like security, compliance, and scalability.
Features
8 of 13
Must Have
5 of 5
Conversational AI
API Access
Safety & Alignment Framework
Fine-Tuning & Custom Models
Enterprise Solutions
Other
3 of 8
Image Generation
Code Generation
Multimodal AI
Research & Publications
Security & Red Teaming
Synthetic Media Provenance
Threat Intelligence Reporting
Global Affairs & Policy
Pricing
Usage-based
Serverless Inference - Text and Vision - Less than 4B parameters
Serverless Inference - Text and Vision - 4B - 16B parameters
Serverless Inference - Text and Vision - More than 16B parameters
Serverless Inference - Text and Vision - MoE 0B - 56B parameters (e.g. Mixtral 8x7B)
Serverless Inference - Text and Vision - MoE 56.1B - 176B parameters (e.g. DBRX, Mixtral 8x22B)
Serverless Inference - Text and Vision - DeepSeek V3
Serverless Inference - Text and Vision - DeepSeek R1 (Fast)
Serverless Inference - Text and Vision - DeepSeek R1 0528 (Fast)
Serverless Inference - Text and Vision - DeepSeek R1 (Basic)
Serverless Inference - Text and Vision - Meta Llama 3.1 405B
Serverless Inference - Text and Vision - Meta Llama 4 Maverick (Basic)
Serverless Inference - Text and Vision - Meta Llama 4 Scout (Basic)
Serverless Inference - Text and Vision - Qwen3 235B
Serverless Inference - Text and Vision - Qwen3 30B
Speech to Text (STT) - Whisper-v3-large
Speech to Text (STT) - Whisper-v3-large-turbo
Speech to Text (STT) - Streaming transcription service
Image Generation - All Non-Flux Models (SDXL, Playground, etc)
Image Generation - All Non-Flux Models (SDXL, Playground, etc) with ControlNet
Image Generation - FLUX.1 [dev] (N/A on serverless)
Image Generation - FLUX.1 [schnell] (N/A on serverless)
Embeddings - up to 150M
Embeddings - 150M - 350M
Fine Tuning - Models up to 16B parameters
Fine Tuning - Models 16.1B - 80B
Fine Tuning - DeepSeek R1 / V3
On-Demand Deployments - A100 80 GB GPU
On-Demand Deployments - H100 80 GB GPU
On-Demand Deployments - H200 141 GB GPU
On-Demand Deployments - B200 180 GB GPU
On-Demand Deployments - AMD MI300X
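The size-based serverless tiers in the list above can be read as a simple lookup from parameter count to line item. The following is a minimal sketch of that reading, not Fireworks AI's own logic: the function name `serverless_tier` and the `is_moe` flag are hypothetical, the tier labels are copied from the list, and two points are assumptions: the boundary values 4B and 16B are placed in the "4B - 16B" tier, and any MoE size above 56B (including the unlisted gap between 56B and 56.1B) is placed in the larger MoE tier. Models that have their own named line item (for example DeepSeek V3 or Meta Llama 3.1 405B) are not handled here.

```python
from typing import Optional


def serverless_tier(params_b: float, is_moe: bool = False) -> Optional[str]:
    """Map a parameter count (in billions) to a size-based tier label.

    Hypothetical helper: labels come from the pricing list, boundary
    handling is an assumption. Returns None when no size tier applies.
    """
    if params_b <= 0:
        raise ValueError("parameter count must be positive")

    if is_moe:
        # Mixture-of-experts models use their own two size bands.
        if params_b <= 56:
            return "MoE 0B - 56B parameters"
        if params_b <= 176:
            # Assumption: sizes in (56, 56.1) also fall in this band.
            return "MoE 56.1B - 176B parameters"
        return None  # Larger MoE models have no size-based tier in the list.

    # Dense models: three bands, with 4B and 16B assumed inclusive in the middle one.
    if params_b < 4:
        return "Less than 4B parameters"
    if params_b <= 16:
        return "4B - 16B parameters"
    return "More than 16B parameters"
```

For example, calling `serverless_tier(56, is_moe=True)` with the nominal 8x7B size of Mixtral selects the smaller MoE band, matching the example given in the list.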
Rationale
Fireworks AI offers a platform for deploying and fine-tuning generative AI models, including LLMs and image models, with a strong emphasis on fast inference. It provides API access, support for custom models and fine-tuning, and dedicated enterprise solutions with security and compliance features. The platform explicitly lists support for text, audio, and image modalities, along with function calling and JSON mode for structured outputs, which are key to the Conversational AI and Multimodal AI features. Its enterprise offerings also highlight security and compliance, which is the basis for marking the Safety & Alignment Framework feature as covered.