# Available Models

## Subscriptions

Always-On, LoRA, and Embedding models are included in every subscription.

## Always-On Models

These models are included in all subscriptions. Per-token, usage-based billing is also available.

### Model Details

More information about each model is available via the `/openai/v1/models` endpoint.
| Model | Provider | Context length | Status |
|---|---|---|---|
| hf:MiniMaxAI/MiniMax-M2.5 | Synthetic | 187k tokens | ✓ Included |
| hf:moonshotai/Kimi-K2.5 | Synthetic | 256k tokens | ✓ Included |
| hf:nvidia/Kimi-K2.5-NVFP4 | Synthetic | 256k tokens | ✓ Included |
| hf:nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 | Synthetic | 256k tokens | ✓ Included |
| hf:zai-org/GLM-4.7 | Synthetic | 198k tokens | ✓ Included |
| hf:zai-org/GLM-4.7-Flash | Synthetic | 192k tokens | ✓ Included |
| hf:zai-org/GLM-5 | Synthetic | 192k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-V3.2 | Fireworks | 159k tokens | ✓ Included |
| hf:MiniMaxAI/MiniMax-M2.1 | Fireworks | 192k tokens | ✓ Included |
| hf:moonshotai/Kimi-K2-Instruct-0905 | Fireworks | 256k tokens | ✓ Included |
| hf:moonshotai/Kimi-K2-Thinking | Fireworks | 256k tokens | ✓ Included |
| hf:openai/gpt-oss-120b | Fireworks | 128k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-R1-0528 | Together AI | 128k tokens | ✓ Included |
| hf:deepseek-ai/DeepSeek-V3 | Together AI | 128k tokens | ✓ Included |
| hf:meta-llama/Llama-3.3-70B-Instruct | Together AI | 128k tokens | ✓ Included |
| hf:Qwen/Qwen3-235B-A22B-Thinking-2507 | Together AI | 256k tokens | ✓ Included |
| hf:Qwen/Qwen3-Coder-480B-A35B-Instruct | Together AI | 256k tokens | ✓ Included |
| hf:Qwen/Qwen3.5-397B-A17B | Together AI | 256k tokens | ✓ Included |
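As a sketch of how the `/openai/v1/models` endpoint might be queried, assuming standard OpenAI-compatible response shapes (the base URL and environment-variable name below are placeholders, not part of this document):

```python
import json
import os
import urllib.request


def fetch_models(base_url: str, api_key: str) -> dict:
    """GET the OpenAI-compatible /openai/v1/models endpoint and return parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}/openai/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def included_model_ids(models_response: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style list response: {"data": [{"id": ...}, ...]}."""
    return [m["id"] for m in models_response.get("data", [])]


# Usage (network call; requires a real endpoint and key):
#   models = fetch_models("https://api.example.com", os.environ["API_KEY"])
#   print(included_model_ids(models))
```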
## LoRA Models

### What's a LoRA?

Low-rank adapters, called "LoRAs", are small, efficient fine-tunes that run on top of existing models. They can make a model much more effective at specific tasks.

We support LoRAs for the following base models:
| Model | Provider | Context length | Status |
|---|---|---|---|
| meta-llama/Llama-3.2-1B-Instruct | Together AI | 128k tokens | ✓ Included |
| meta-llama/Llama-3.2-3B-Instruct | Together AI | 128k tokens | ✓ Included |
| meta-llama/Meta-Llama-3.1-8B-Instruct | Together AI | 128k tokens | ✓ Included |
| meta-llama/Meta-Llama-3.1-70B-Instruct | Together AI | 128k tokens | ✓ Included |
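Conceptually, a LoRA leaves the base weights frozen and learns a low-rank update: instead of fine-tuning a full weight matrix `W`, it trains two small matrices `A` and `B` so the adapted layer computes `x @ W + (alpha / r) * (x @ A @ B)`. A minimal NumPy sketch (shapes, rank, and scaling chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 64, 64, 4  # layer sizes and LoRA rank (r << d)
alpha = 8                   # LoRA scaling factor

W = rng.standard_normal((d_in, d_out))       # frozen base weight
A = rng.standard_normal((d_in, r)) * 0.01    # trainable down-projection
B = np.zeros((r, d_out))                     # trainable up-projection (zero-init)


def lora_forward(x):
    # Base path plus scaled low-rank update; only A and B are trained.
    return x @ W + (alpha / r) * (x @ A @ B)


x = rng.standard_normal((2, d_in))
# With B initialised to zero, the adapter starts out as a no-op:
assert np.allclose(lora_forward(x), x @ W)
```

The adapter adds only `d_in * r + r * d_out` trainable parameters, a small fraction of the `d_in * d_out` in the frozen base matrix, which is why LoRAs are cheap to train and serve.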
## Embedding Models

Embedding models convert text into numerical vectors for search, clustering, and other applications.

There's no additional charge for using embeddings, and embedding requests don't count against your subscription rate limit.
| Model | Provider | Context length | Status |
|---|---|---|---|
| hf:nomic-ai/nomic-embed-text-v1.5 | Fireworks | 8k tokens | ✓ Included |
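Embedding vectors are typically compared with cosine similarity for search and clustering. A minimal sketch (the toy 3-dimensional vectors below stand in for real model output, which is much longer):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" for illustration only.
query = [0.1, 0.9, 0.2]
doc_a = [0.1, 0.8, 0.3]  # points in a similar direction to the query
doc_b = [0.9, 0.1, 0.0]  # points in a very different direction

assert cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b)
```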
## Getting Started

Ready to start using our models? Check out:

- Getting Started Guide: your first API call
- `chat/completions`: the most popular endpoint for conversations

Need help choosing the right model? Join our Discord community for recommendations!
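A first `chat/completions` call might look like the following sketch. The request body follows the standard OpenAI-compatible format; the base URL, environment-variable name, and model choice are placeholders for illustration:

```python
import json
import os
import urllib.request


def build_payload(model: str, user_message: str) -> dict:
    """Minimal OpenAI-style chat request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }


def chat_completion(base_url: str, api_key: str, model: str, user_message: str) -> dict:
    """POST an OpenAI-compatible chat/completions request and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/openai/v1/chat/completions",
        data=json.dumps(build_payload(model, user_message)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (network call; requires a real endpoint and key):
#   reply = chat_completion("https://api.example.com", os.environ["API_KEY"],
#                           "hf:zai-org/GLM-5", "Hello!")
#   print(reply["choices"][0]["message"]["content"])
```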