# Available Models
> **Subscriptions:** Always-On, LoRA, and Embedding models are included in every subscription.
## Always-On Models
These models are included in all Standard and Pro subscriptions. Per-token pricing is also available with usage-based billing.
> **Full model intelligence:** For our self-hosted Synthetic models, we run the weights directly and do not perform any quantization on either the weights or the KV cache.
| Model | Provider | Context length | Status |
|---|---|---|---|
| hf:moonshotai/Kimi-K2.5 | Synthetic | 256k tokens | ✓ Included |
| hf:nvidia/Kimi-K2.5-NVFP4 | Synthetic | 256k tokens | ✓ Included |
| hf:zai-org/GLM-4.7 | Synthetic | 198k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-R1-0528 | Fireworks | 128k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-V3-0324 | Fireworks | 128k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-V3.2 | Fireworks | 159k tokens | ✓ Included |
| hf:meta-llama/Llama-3.3-70B-Instruct | Fireworks | 128k tokens | ✓ Included |
| hf:MiniMaxAI/MiniMax-M2.1 | Fireworks | 192k tokens | ✓ Included |
| hf:moonshotai/Kimi-K2-Instruct-0905 | Fireworks | 256k tokens | ✓ Included |
| hf:moonshotai/Kimi-K2-Thinking | Fireworks | 256k tokens | ✓ Included |
| hf:openai/gpt-oss-120b | Fireworks | 128k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-V3 | Together AI | 128k tokens | ✓ Included |
| hf:Qwen/Qwen3-235B-A22B-Thinking-2507 | Together AI | 256k tokens | ✓ Included |
| hf:Qwen/Qwen3-Coder-480B-A35B-Instruct | Together AI | 256k tokens | ✓ Included |
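The sketch below shows a minimal chat request against one of the Always-On models. It assumes the API is OpenAI-compatible and exposes a chat/completions endpoint; the base URL and API-key environment variable are placeholders, so substitute the values from your account settings.

```python
import os

from openai import OpenAI

# Placeholder base URL and API-key variable: substitute the values from
# your own account settings.
client = OpenAI(
    base_url="https://api.example.com/v1",
    api_key=os.environ["API_KEY"],
)

# Any model ID from the Always-On table above can go in the `model` field.
response = client.chat.completions.create(
    model="hf:moonshotai/Kimi-K2-Instruct-0905",
    messages=[
        {"role": "user", "content": "In two sentences, what is a low-rank adapter?"},
    ],
)

print(response.choices[0].message.content)
```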
## LoRA Models
> **What's a LoRA?** Low-rank adapters ("LoRAs") are small, efficient fine-tunes that run on top of existing models. They can modify a model to be much more effective at specific tasks.
We support LoRAs for the following base models:
| Model | Provider | Context length | Status |
|---|---|---|---|
| meta-llama/Llama-3.2-1B-Instruct | Together AI | 128k tokens | ✓ Included |
| meta-llama/Llama-3.2-3B-Instruct | Together AI | 128k tokens | ✓ Included |
| meta-llama/Meta-Llama-3.1-8B-Instruct | Together AI | 128k tokens | ✓ Included |
| meta-llama/Meta-Llama-3.1-70B-Instruct | Together AI | 128k tokens | ✓ Included |
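As a rough sketch of how a trained LoRA might be used, the example below passes an adapter identifier in place of a base-model ID. The adapter name is invented for illustration, the exact way adapters are addressed may differ, and the client configuration is the same placeholder setup as in the chat example above.

```python
from openai import OpenAI

# Same placeholder client configuration as in the chat example above.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# Hypothetical adapter identifier: a real LoRA trained on top of
# meta-llama/Meta-Llama-3.1-8B-Instruct would have its own name.
response = client.chat.completions.create(
    model="my-org/llama-3.1-8b-support-lora",
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)

print(response.choices[0].message.content)
```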
## Embedding Models
Embedding models convert text into numerical vectors for search, clustering, and other applications.
There's no additional charge for using embeddings, and embeddings requests don't count against your subscription rate limit.
| Model | Provider | Context length | Status |
|---|---|---|---|
| hf:nomic-ai/nomic-embed-text-v1.5 | Fireworks | 8k tokens | ✓ Included |
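Below is a minimal sketch of an embeddings request, again assuming an OpenAI-compatible embeddings endpoint and the same placeholder client configuration as above.

```python
from openai import OpenAI

# Same placeholder client configuration as in the chat example above.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# Embed a couple of short documents with the included embedding model.
result = client.embeddings.create(
    model="hf:nomic-ai/nomic-embed-text-v1.5",
    input=[
        "LoRA adapters are small fine-tunes layered on top of a base model.",
        "Embedding models map text to numerical vectors for search and clustering.",
    ],
)

vectors = [item.embedding for item in result.data]
print(len(vectors), len(vectors[0]))  # number of inputs, embedding dimensionality
```

Because embeddings requests don't count against your subscription rate limit, batching documents like this is a cheap way to build a search or clustering index.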
## Getting Started
Ready to start using our models? Check out:
- Getting Started Guide - Your first API call
- `chat/completions` - Most popular endpoint for conversations
Need help choosing the right model? Join our Discord community for recommendations!