WHAT LLM PROVIDER?
[Chart: Model Comparison Chart. Color legend by model: GPT-4, Llama 3.1 405B, Llama 3.2 90B, Llama 3.1 70B, Llama 3.2 11B, Llama 3.1 8B, Llama 3.2 3B, Llama 3.2 1B, Gemini 1.5 Pro, Gemini 1.5 Flash, Gemma 2 27B, Gemma 2 9B, Claude 3.5 Sonnet, Claude 3 Opus, Mistral Large 2, Mixtral 8x22B, Jamba 1.5 Large, DeepSeek-V2, Qwen2 72B, Yi-Large]
Model Comparison
Name | Provider | Creator | Model | Price (USD per 1M tokens) | Input Price (USD per 1M tokens) | Output Price (USD per 1M tokens) | Output Speed | Context Window | Latency | Quality Index |
---|---|---|---|---|---|---|---|---|---|---|
o1-preview | OpenAI | OpenAI | o1-preview | $26.25 | $15.00 | $60.00 | 30.9 tokens/s | 128000 tokens | 32.62 s | 85 |
o1-mini | OpenAI | OpenAI | o1-mini | $5.25 | $3.00 | $12.00 | 70.2 tokens/s | 128000 tokens | 14.58 s | 82 |
GPT-4o | OpenAI | OpenAI | GPT-4o | $4.38 | $2.50 | $10.00 | 124.2 tokens/s | 128000 tokens | 0.42 s | 77 |
GPT-4o (May '24) | OpenAI | OpenAI | GPT-4o (May '24) | $7.50 | $5.00 | $15.00 | 112.6 tokens/s | 128000 tokens | 0.42 s | 77 |
GPT-4o (May '24) | Microsoft Azure | OpenAI | GPT-4o (May '24) | $7.50 | $5.00 | $15.00 | 107.2 tokens/s | 128000 tokens | 0.37 s | 77 |
GPT-4o mini | OpenAI | OpenAI | GPT-4o mini | $0.26 | $0.15 | $0.60 | 98.8 tokens/s | 128000 tokens | 0.45 s | 71 |
Llama 3.1 405B | Replicate | Meta | Llama 3.1 405B | $9.50 | $9.50 | $9.50 | 18.7 tokens/s | 128000 tokens | 1.13 s | 72 |
Llama 3.1 405B | Hyperbolic | Meta | Llama 3.1 405B | $4.00 | $4.00 | $4.00 | 16.1 tokens/s | 128000 tokens | 0.91 s | 72 |
Llama 3.1 405B | Amazon Bedrock | Meta | Llama 3.1 405B | $7.99 | $5.32 | $16.00 | 13.1 tokens/s | 128000 tokens | 1.78 s | 72 |
Llama 3.1 405B | OctoAI | Meta | Llama 3.1 405B | $4.50 | $3.00 | $9.00 | 59.6 tokens/s | 128000 tokens | 0.3 s | 72 |
Llama 3.1 405B | Lepton AI | Meta | Llama 3.1 405B | $2.80 | $2.80 | $2.80 | 20.1 tokens/s | 128000 tokens | 1.04 s | 72 |
Llama 3.1 405B | Microsoft Azure | Meta | Llama 3.1 405B | $8.00 | $5.33 | $16.00 | 8.4 tokens/s | 128000 tokens | 4.38 s | 72 |
Llama 3.1 405B | Fireworks | Meta | Llama 3.1 405B | $3.00 | $3.00 | $3.00 | 69.3 tokens/s | 128000 tokens | 0.64 s | 72 |
Llama 3.1 405B | Deepinfra | Meta | Llama 3.1 405B | $1.79 | $1.79 | $1.79 | 22.7 tokens/s | 33000 tokens | 0.45 s | 72 |
Llama 3.1 405B | SambaNova | Meta | Llama 3.1 405B | $6.25 | $5.00 | $10.00 | 127.8 tokens/s | 8000 tokens | 1.5 s | 72 |
Llama 3.1 405B | Databricks | Meta | Llama 3.1 405B | $7.50 | $5.00 | $15.00 | 28.6 tokens/s | 128000 tokens | 0.67 s | 72 |
Llama 3.1 405B Turbo | Together.ai Turbo | Meta | Llama 3.1 405B | $3.50 | $3.50 | $3.50 | 87.3 tokens/s | 128000 tokens | 0.72 s | 72 |
Llama 3.2 90B (Vision) | Hyperbolic | Meta | Llama 3.2 90B (Vision) | $0.40 | $0.40 | $0.40 | 40.2 tokens/s | 128000 tokens | 0.57 s | 66 |
Llama 3.2 90B (Vision) | Amazon Bedrock | Meta | Llama 3.2 90B (Vision) | $2.00 | $2.00 | $2.00 | 18.2 tokens/s | 128000 tokens | 0.54 s | 66 |
Llama 3.2 90B (Vision) | Fireworks | Meta | Llama 3.2 90B (Vision) | $0.90 | $0.90 | $0.90 | 50.5 tokens/s | 128000 tokens | 0.44 s | 66 |
Llama 3.2 90B (Vision) | Deepinfra | Meta | Llama 3.2 90B (Vision) | $0.36 | $0.35 | $0.40 | 24.7 tokens/s | 128000 tokens | 0.41 s | 66 |
Llama 3.2 90B (Vision) Turbo | Together.ai Turbo | Meta | Llama 3.2 90B (Vision) | $1.20 | $1.20 | $1.20 | 54.6 tokens/s | 128000 tokens | 0.38 s | 66 |
Llama 3.1 70B | Cerebras | Meta | Llama 3.1 70B | $0.60 | $0.60 | $0.60 | 569.2 tokens/s | 8000 tokens | 0.43 s | 65 |
Llama 3.1 70B | Hyperbolic | Meta | Llama 3.1 70B | $0.40 | $0.40 | $0.40 | 29.1 tokens/s | 128000 tokens | 0.69 s | 65 |
Llama 3.1 70B | Amazon Bedrock | Meta | Llama 3.1 70B | $0.99 | $0.99 | $0.99 | 31.6 tokens/s | 128000 tokens | 0.71 s | 65 |
Llama 3.1 70B | OctoAI | Meta | Llama 3.1 70B | $0.90 | $0.90 | $0.90 | 59.6 tokens/s | 128000 tokens | 0.36 s | 65 |
Llama 3.1 70B | Lepton AI | Meta | Llama 3.1 70B | $0.80 | $0.80 | $0.80 | 52.8 tokens/s | 128000 tokens | 0.58 s | 65 |
Llama 3.1 70B | Microsoft Azure | Meta | Llama 3.1 70B | $2.90 | $2.68 | $3.54 | 20.3 tokens/s | 128000 tokens | 0.64 s | 65 |
Llama 3.1 70B | Fireworks | Meta | Llama 3.1 70B | $0.90 | $0.90 | $0.90 | 86.2 tokens/s | 128000 tokens | 0.39 s | 65 |
Llama 3.1 70B | Deepinfra | Meta | Llama 3.1 70B | $0.36 | $0.35 | $0.40 | 27.6 tokens/s | 128000 tokens | 0.31 s | 65 |
Llama 3.1 70B | Groq | Meta | Llama 3.1 70B | $0.64 | $0.59 | $0.79 | 249.7 tokens/s | 128000 tokens | 0.44 s | 65 |
Llama 3.1 70B | SambaNova | Meta | Llama 3.1 70B | $0.75 | $0.60 | $1.20 | 419.7 tokens/s | 8000 tokens | 0.81 s | 65 |
Llama 3.1 70B | Databricks | Meta | Llama 3.1 70B | $1.50 | $1.00 | $3.00 | 54.6 tokens/s | 128000 tokens | 0.58 s | 65 |
Llama 3.1 70B | Perplexity | Meta | Llama 3.1 70B | $1.00 | $1.00 | $1.00 | 47.7 tokens/s | 128000 tokens | 0.33 s | 65 |
Llama 3.1 70B Turbo | Together.ai Turbo | Meta | Llama 3.1 70B | $0.88 | $0.88 | $0.88 | 27 tokens/s | 128000 tokens | 0.73 s | 65 |
Llama 3.2 11B (Vision) | Amazon Bedrock | Meta | Llama 3.2 11B (Vision) | $0.35 | $0.35 | $0.35 | 41.7 tokens/s | 128000 tokens | 0.4 s | 54 |
Llama 3.2 11B (Vision) | Fireworks | Meta | Llama 3.2 11B (Vision) | $0.20 | $0.20 | $0.20 | 119.8 tokens/s | 128000 tokens | 0.33 s | 54 |
Llama 3.2 11B (Vision) | Deepinfra | Meta | Llama 3.2 11B (Vision) | $0.06 | $0.06 | $0.06 | 77.4 tokens/s | 128000 tokens | 0.27 s | 54 |
Llama 3.2 11B (Vision) Turbo | Together.ai Turbo | Meta | Llama 3.2 11B (Vision) | $0.18 | $0.18 | $0.18 | 152.2 tokens/s | 128000 tokens | 0.32 s | 54 |
Llama 3.1 8B | Cerebras | Meta | Llama 3.1 8B | $0.10 | $0.10 | $0.10 | 2023 tokens/s | 8000 tokens | 0.46 s | 53 |
Llama 3.1 8B | Hyperbolic | Meta | Llama 3.1 8B | $0.10 | $0.10 | $0.10 | 93.4 tokens/s | 128000 tokens | 0.52 s | 53 |
Llama 3.1 8B | Amazon Bedrock | Meta | Llama 3.1 8B | $0.22 | $0.22 | $0.22 | 89.8 tokens/s | 128000 tokens | 0.4 s | 53 |
Llama 3.1 8B | OctoAI | Meta | Llama 3.1 8B | $0.15 | $0.15 | $0.15 | 174.9 tokens/s | 128000 tokens | 0.26 s | 53 |
Llama 3.1 8B | Lepton AI | Meta | Llama 3.1 8B | $0.07 | $0.07 | $0.07 | 206.7 tokens/s | 128000 tokens | 0.38 s | 53 |
Llama 3.1 8B | Microsoft Azure | Meta | Llama 3.1 8B | $0.38 | $0.30 | $0.61 | 54.3 tokens/s | 128000 tokens | 0.42 s | 53 |
Llama 3.1 8B | Fireworks | Meta | Llama 3.1 8B | $0.20 | $0.20 | $0.20 | 264.1 tokens/s | 128000 tokens | 0.28 s | 53 |
Llama 3.1 8B | Deepinfra | Meta | Llama 3.1 8B | $0.06 | $0.06 | $0.06 | 86.2 tokens/s | 128000 tokens | 0.22 s | 53 |
Llama 3.1 8B | Groq | Meta | Llama 3.1 8B | $0.06 | $0.05 | $0.08 | 751.7 tokens/s | 128000 tokens | 0.38 s | 53 |
Llama 3.1 8B | SambaNova | Meta | Llama 3.1 8B | $0.13 | $0.10 | $0.20 | 1018.6 tokens/s | 8000 tokens | 0.39 s | 53 |
Llama 3.1 8B | Perplexity | Meta | Llama 3.1 8B | $0.20 | $0.20 | $0.20 | 161.8 tokens/s | 128000 tokens | 0.2 s | 53 |
Llama 3.1 8B Turbo | Together.ai Turbo | Meta | Llama 3.1 8B | $0.18 | $0.18 | $0.18 | 145.6 tokens/s | 128000 tokens | 0.58 s | 53 |
Llama 3.2 3B | Hyperbolic | Meta | Llama 3.2 3B | $0.10 | $0.10 | $0.10 | 192.3 tokens/s | 128000 tokens | 0.5 s | 47 |
Llama 3.2 3B | Amazon Bedrock | Meta | Llama 3.2 3B | $0.15 | $0.15 | $0.15 | 142.4 tokens/s | 128000 tokens | 0.44 s | 47 |
Llama 3.2 3B | Lepton AI | Meta | Llama 3.2 3B | $0.03 | $0.03 | $0.03 | 116.1 tokens/s | 128000 tokens | 0.43 s | 47 |
Llama 3.2 3B | Fireworks | Meta | Llama 3.2 3B | $0.10 | $0.10 | $0.10 | 257.5 tokens/s | 128000 tokens | 0.32 s | 47 |
Llama 3.2 3B | Deepinfra | Meta | Llama 3.2 3B | $0.04 | $0.03 | $0.05 | 104.1 tokens/s | 128000 tokens | 0.26 s | 47 |
Llama 3.2 3B | Groq | Meta | Llama 3.2 3B | $0.06 | $0.06 | $0.06 | 1406.1 tokens/s | 8000 tokens | 0.34 s | 47 |
Llama 3.2 3B | SambaNova | Meta | Llama 3.2 3B | $0.10 | $0.08 | $0.16 | 1565.7 tokens/s | 4000 tokens | 0.34 s | 47 |
Llama 3.2 3B Turbo | Together.ai Turbo | Meta | Llama 3.2 3B | $0.06 | $0.06 | $0.06 | 124.8 tokens/s | 128000 tokens | 0.3 s | 47 |
Llama 3.2 1B | Amazon Bedrock | Meta | Llama 3.2 1B | $0.10 | $0.10 | $0.10 | 303 tokens/s | 128000 tokens | 0.38 s | 27 |
Llama 3.2 1B | Fireworks | Meta | Llama 3.2 1B | $0.10 | $0.10 | $0.10 | 507.1 tokens/s | 128000 tokens | 0.29 s | 27 |
Llama 3.2 1B | Deepinfra | Meta | Llama 3.2 1B | $0.01 | $0.01 | $0.02 | 167 tokens/s | 128000 tokens | 0.23 s | 27 |
Llama 3.2 1B | Groq | Meta | Llama 3.2 1B | $0.04 | $0.04 | $0.04 | 3129.3 tokens/s | 8000 tokens | 0.48 s | 27 |
Llama 3.2 1B | SambaNova | Meta | Llama 3.2 1B | $0.05 | $0.04 | $0.08 | 2473.9 tokens/s | 4000 tokens | 0.35 s | 27 |
Gemini 1.5 Pro (Sep '24) (Vertex) | Google (Vertex) | Google | Gemini 1.5 Pro | $2.19 | $1.25 | $5.00 | 59.4 tokens/s | 2000000 tokens | 0.44 s | 80 |
Gemini 1.5 Pro (Sep '24) (AI Studio) | Google (AI Studio) | Google | Gemini 1.5 Pro | $2.19 | $1.25 | $5.00 | 61.3 tokens/s | 2000000 tokens | 0.81 s | 80 |
Gemini 1.5 Flash-8B (AI Studio) | Google (AI Studio) | Google | Gemini 1.5 Flash-8B | $0.07 | $0.04 | $0.15 | 283.2 tokens/s | 1000000 tokens | 0.53 s | 73 |
Gemini 1.5 Flash (Sep '24) (Vertex) | Google (Vertex) | Google | Gemini 1.5 Flash | $0.13 | $0.07 | $0.30 | 203.8 tokens/s | 1000000 tokens | 0.26 s | 73 |
Gemini 1.5 Flash (Sep '24) (AI Studio) | Google (AI Studio) | Google | Gemini 1.5 Flash | $0.13 | $0.07 | $0.30 | 209.3 tokens/s | 1000000 tokens | 0.4 s | 73 |
Gemma 2 27B | Together.ai | Google | Gemma 2 27B | $0.80 | $0.80 | $0.80 | 65.5 tokens/s | 8000 tokens | 0.48 s | 61 |
Gemma 2 9B | Deepinfra | Google | Gemma 2 9B | $0.06 | $0.06 | $0.06 | 71.1 tokens/s | 8000 tokens | 0.32 s | 46 |
Gemma 2 9B | Groq | Google | Gemma 2 9B | $0.20 | $0.20 | $0.20 | 665.6 tokens/s | 8000 tokens | 0.19 s | 46 |
Gemma 2 9B | Together.ai | Google | Gemma 2 9B | $0.30 | $0.30 | $0.30 | 114.9 tokens/s | 8000 tokens | 0.44 s | 46 |
Gemini 1.5 Flash (May '24) (Vertex) | Google (Vertex) | Google | Gemini 1.5 Flash | $0.13 | $0.07 | $0.30 | 303.9 tokens/s | 1000000 tokens | 0.27 s | 73 |
Gemini 1.5 Flash (May '24) (AI Studio) | Google (AI Studio) | Google | Gemini 1.5 Flash | $0.13 | $0.07 | $0.30 | 309.2 tokens/s | 1000000 tokens | 0.36 s | 73 |
Gemini 1.5 Pro (May '24) (Vertex) | Google (Vertex) | Google | Gemini 1.5 Pro | $5.25 | $3.50 | $10.50 | 64.3 tokens/s | 2000000 tokens | 0.5 s | 80 |
Gemini 1.5 Pro (May '24) (AI Studio) | Google (AI Studio) | Google | Gemini 1.5 Pro | $5.25 | $3.50 | $10.50 | 65.4 tokens/s | 2000000 tokens | 0.79 s | 80 |
Claude 3.5 Sonnet | Amazon Bedrock | Anthropic | Claude 3.5 Sonnet | $6.00 | $3.00 | $15.00 | 53.5 tokens/s | 200000 tokens | 0.95 s | 77 |
Claude 3.5 Sonnet | Anthropic | Anthropic | Claude 3.5 Sonnet | $6.00 | $3.00 | $15.00 | 90.8 tokens/s | 200000 tokens | 0.8 s | 77 |
Claude 3 Opus | Amazon Bedrock | Anthropic | Claude 3 Opus | $30.00 | $15.00 | $75.00 | 23.6 tokens/s | 200000 tokens | 1.67 s | 70 |
Claude 3 Opus | Anthropic | Anthropic | Claude 3 Opus | $30.00 | $15.00 | $75.00 | 28.4 tokens/s | 200000 tokens | 2.98 s | 70 |
Claude 3 Haiku | Amazon Bedrock | Anthropic | Claude 3 Haiku | $0.50 | $0.25 | $1.25 | 120.8 tokens/s | 200000 tokens | 0.48 s | 54 |
Claude 3 Haiku | Anthropic | Anthropic | Claude 3 Haiku | $0.50 | $0.25 | $1.25 | 145.4 tokens/s | 200000 tokens | 0.45 s | 54 |
Mistral Large 2 | Mistral | Mistral AI | Mistral Large 2 | $3.00 | $2.00 | $6.00 | 31.5 tokens/s | 128000 tokens | 0.75 s | 73 |
Mistral Large 2 | Amazon Bedrock | Mistral AI | Mistral Large 2 | $4.50 | $3.00 | $9.00 | 42.8 tokens/s | 128000 tokens | 0.42 s | 73 |
Mistral Large 2 | Microsoft Azure | Mistral AI | Mistral Large 2 | $4.50 | $3.00 | $9.00 | 53.7 tokens/s | 128000 tokens | 0.44 s | 73 |
Mixtral 8x22B | Mistral | Mistral AI | Mixtral 8x22B | $3.00 | $2.00 | $6.00 | 64.6 tokens/s | 65000 tokens | 0.59 s | 61 |
Mixtral 8x22B | OctoAI | Mistral AI | Mixtral 8x22B | $1.20 | $1.20 | $1.20 | 92.4 tokens/s | 65000 tokens | 0.33 s | 61 |
Mixtral 8x22B | Fireworks | Mistral AI | Mixtral 8x22B | $1.20 | $1.20 | $1.20 | 79.3 tokens/s | 65000 tokens | 0.32 s | 61 |
Mixtral 8x22B | Together.ai | Mistral AI | Mixtral 8x22B | $1.20 | $1.20 | $1.20 | 61 tokens/s | 65000 tokens | 0.42 s | 61 |
Mistral Small (Sep '24) | Mistral | Mistral AI | Mistral Small | $0.30 | $0.20 | $0.60 | 76.2 tokens/s | 128000 tokens | 0.5 s | 60 |
Pixtral 12B | Mistral | Mistral AI | Pixtral 12B | $0.15 | $0.15 | $0.15 | 79.9 tokens/s | 128000 tokens | 0.58 s | 56 |
Pixtral 12B | Hyperbolic | Mistral AI | Pixtral 12B | $0.10 | $0.10 | $0.10 | 80.1 tokens/s | 128000 tokens | 0.52 s | 56 |
Mistral NeMo | Mistral | Mistral AI | Mistral NeMo | $0.15 | $0.15 | $0.15 | 130.1 tokens/s | 128000 tokens | 0.52 s | 52 |
Mistral NeMo | OctoAI | Mistral AI | Mistral NeMo | $0.20 | $0.20 | $0.20 | 158.5 tokens/s | 128000 tokens | 0.31 s | 52 |
Mistral NeMo | Deepinfra | Mistral AI | Mistral NeMo | $0.13 | $0.13 | $0.13 | 57.7 tokens/s | 128000 tokens | 0.25 s | 52 |
Mixtral 8x7B | Mistral | Mistral AI | Mixtral 8x7B | $0.70 | $0.70 | $0.70 | 86.4 tokens/s | 33000 tokens | 0.56 s | 42 |
Mixtral 8x7B | Replicate | Mistral AI | Mixtral 8x7B | $0.47 | $0.30 | $1.00 | 89.5 tokens/s | 33000 tokens | 0.55 s | 42 |
Mixtral 8x7B | Amazon Bedrock | Mistral AI | Mixtral 8x7B | $0.51 | $0.45 | $0.70 | 68.7 tokens/s | 33000 tokens | 0.36 s | 42 |
Mixtral 8x7B | OctoAI | Mistral AI | Mixtral 8x7B | $0.45 | $0.45 | $0.45 | 81.5 tokens/s | 33000 tokens | 0.33 s | 42 |
Mixtral 8x7B | Lepton AI | Mistral AI | Mixtral 8x7B | $0.50 | $0.50 | $0.50 | 105.4 tokens/s | 33000 tokens | 0.48 s | 42 |
Mixtral 8x7B | Fireworks | Mistral AI | Mixtral 8x7B | $0.50 | $0.50 | $0.50 | 98.3 tokens/s | 33000 tokens | 0.29 s | 42 |
Mixtral 8x7B | Deepinfra | Mistral AI | Mixtral 8x7B | $0.24 | $0.24 | $0.24 | 40.5 tokens/s | 33000 tokens | 0.28 s | 42 |
Mixtral 8x7B | Groq | Mistral AI | Mixtral 8x7B | $0.24 | $0.24 | $0.24 | 543.6 tokens/s | 33000 tokens | 0.22 s | 42 |
Mixtral 8x7B | Databricks | Mistral AI | Mixtral 8x7B | $0.63 | $0.50 | $1.00 | 86.7 tokens/s | 33000 tokens | 0.48 s | 42 |
Mixtral 8x7B | Together.ai | Mistral AI | Mixtral 8x7B | $0.60 | $0.60 | $0.60 | 104.3 tokens/s | 33000 tokens | 0.44 s | 42 |
Codestral-Mamba | Mistral | Mistral AI | Codestral-Mamba | $0.25 | $0.25 | $0.25 | 94.1 tokens/s | 256000 tokens | 0.68 s | 36 |
Command-R+ | Amazon Bedrock | Cohere | Command-R+ | $6.00 | $3.00 | $15.00 | 44.9 tokens/s | 128000 tokens | 0.58 s | 56 |
Command-R+ | Cohere | Cohere | Command-R+ | $4.38 | $2.50 | $10.00 | 65.2 tokens/s | 128000 tokens | 0.29 s | 56 |
Command-R | Amazon Bedrock | Cohere | Command-R | $0.75 | $0.50 | $1.50 | 102.6 tokens/s | 128000 tokens | 0.39 s | 51 |
Command-R | Cohere | Cohere | Command-R | $0.26 | $0.15 | $0.60 | 111.5 tokens/s | 128000 tokens | 0.22 s | 51 |
Command-R+ (Apr '24) | Amazon Bedrock | Cohere | Command-R+ | $6.00 | $3.00 | $15.00 | 45.2 tokens/s | 128000 tokens | 0.57 s | 46 |
Command-R+ (Apr '24) | Cohere | Cohere | Command-R+ | $6.00 | $3.00 | $15.00 | 41.4 tokens/s | 128000 tokens | 0.32 s | 46 |
Command-R+ (Apr '24) | Microsoft Azure | Cohere | Command-R+ | $6.00 | $3.00 | $15.00 | 45.6 tokens/s | 128000 tokens | 0.69 s | 46 |
Command-R (Mar '24) | Amazon Bedrock | Cohere | Command-R | $0.75 | $0.50 | $1.50 | 102.6 tokens/s | 128000 tokens | 0.39 s | 36 |
Command-R (Mar '24) | Cohere | Cohere | Command-R | $0.75 | $0.50 | $1.50 | 153 tokens/s | 128000 tokens | 0.2 s | 36 |
Command-R (Mar '24) | Microsoft Azure | Cohere | Command-R | $0.75 | $0.50 | $1.50 | 103.9 tokens/s | 128000 tokens | 0.52 s | 36 |
Sonar 3.1 Small | Perplexity | Perplexity | Sonar 3.1 Small | $0.20 | $0.20 | $0.20 | 131.8 tokens/s | 131000 tokens | 0.19 s | N/A |
Sonar 3.1 Large | Perplexity | Perplexity | Sonar 3.1 Large | $1.00 | $1.00 | $1.00 | 58.8 tokens/s | 131000 tokens | 0.24 s | N/A |
Phi-3 Medium 14B | Microsoft Azure | Microsoft | Phi-3 Medium 14B | $0.30 | $0.17 | $0.68 | 51.2 tokens/s | 128000 tokens | 0.44 s | N/A |
DBRX | Databricks | Databricks | DBRX | $1.13 | $0.75 | $2.25 | 84.7 tokens/s | 33000 tokens | 0.49 s | 49 |
DBRX | Together.ai | Databricks | DBRX | $1.20 | $1.20 | $1.20 | 104.8 tokens/s | 33000 tokens | 0.36 s | 49 |
ALT 11B | Reka | Reka AI | ALT 11B | $0.16 | $0.16 | $0.16 | 147.6 tokens/s | 128000 tokens | 0.21 s | 49 |
ALT 7B | Reka | Reka AI | ALT 7B | $0.10 | $0.10 | $0.10 | 152.3 tokens/s | 128000 tokens | 0.18 s | 49 |
ALT 3B | OctoAI | Reka AI | ALT 3B | $0.05 | $0.05 | $0.05 | 91.5 tokens/s | 128000 tokens | 0.12 s | 49 |
Jamba 1.5 Large | AI21 Labs | AI21 Labs | Jamba 1.5 Large | $3.50 | $2.00 | $8.00 | 59.2 tokens/s | 256000 tokens | 1.01 s | 64 |
Jamba 1.5 Large | Microsoft Azure | AI21 Labs | Jamba 1.5 Large | $3.50 | $2.00 | $8.00 | 51.1 tokens/s | 256000 tokens | 0.69 s | 64 |
Jamba 1.5 Mini | AI21 Labs | AI21 Labs | Jamba 1.5 Mini | $0.25 | $0.20 | $0.40 | 161.2 tokens/s | 256000 tokens | 0.85 s | 46 |
Jamba 1.5 Mini | Microsoft Azure | AI21 Labs | Jamba 1.5 Mini | $0.25 | $0.20 | $0.40 | 82.5 tokens/s | 256000 tokens | 0.5 s | 46 |
DeepSeek-Coder-V2 | DeepSeek | DeepSeek | DeepSeek-Coder-V2 | $0.17 | $0.14 | $0.28 | 16 tokens/s | 128000 tokens | 1.12 s | 67 |
DeepSeek-V2 | DeepSeek | DeepSeek | DeepSeek-V2 | $0.17 | $0.14 | $0.28 | 16.4 tokens/s | 128000 tokens | 1.15 s | 66 |
DeepSeek-V2.5 | DeepSeek | DeepSeek | DeepSeek-V2.5 | $0.17 | $0.14 | $0.28 | 16 tokens/s | 128000 tokens | 1.13 s | 66 |
DeepSeek-V2.5 | Hyperbolic | DeepSeek | DeepSeek-V2.5 | $2.00 | $2.00 | $2.00 | 7.6 tokens/s | 128000 tokens | 0.84 s | N/A |
Qwen2.5 72B | Hyperbolic | Alibaba | Qwen2.5 72B | $0.40 | $0.40 | $0.40 | 34.4 tokens/s | 131000 tokens | 0.63 s | 75 |
Qwen2.5 72B | Deepinfra | Alibaba | Qwen2.5 72B | $0.36 | $0.35 | $0.40 | 35.8 tokens/s | 131000 tokens | 0.29 s | 75 |
Qwen2 72B | Fireworks | Alibaba | Qwen2 72B | $0.90 | $0.90 | $0.90 | 58.9 tokens/s | 128000 tokens | 0.33 s | 69 |
Qwen2 72B | Deepinfra | Alibaba | Qwen2 72B | $0.36 | $0.35 | $0.40 | 31.1 tokens/s | 33000 tokens | 0.45 s | 69 |
Qwen2 72B | Together.ai | Alibaba | Qwen2 72B | $0.90 | $0.90 | $0.90 | 65.4 tokens/s | 33000 tokens | 0.44 s | 69 |
Yi-Large | Fireworks | 01.AI | Yi-Large | $3.00 | $3.00 | $3.00 | 63 tokens/s | 32000 tokens | 0.48 s | 58 |
GPT-4 Turbo | OpenAI | OpenAI | GPT-4 Turbo | $15.00 | $10.00 | $30.00 | 33.1 tokens/s | 128000 tokens | 0.67 s | 74 |
GPT-4 Turbo | Microsoft Azure | OpenAI | GPT-4 Turbo | $15.00 | $10.00 | $30.00 | 49.5 tokens/s | 128000 tokens | 0.53 s | 74 |
GPT-3.5 Turbo | OpenAI | OpenAI | GPT-3.5 Turbo | $0.75 | $0.50 | $1.50 | 86.4 tokens/s | 16000 tokens | 0.43 s | 52 |
GPT-3.5 Turbo | Microsoft Azure | OpenAI | GPT-3.5 Turbo | $0.75 | $0.50 | $1.50 | 89.1 tokens/s | 16000 tokens | 0.34 s | 52 |
GPT-3.5 Turbo Instruct | OpenAI | OpenAI | GPT-3.5 Turbo Instruct | $1.63 | $1.50 | $2.00 | 107.3 tokens/s | 4000 tokens | 0.41 s | N/A |
GPT-3.5 Turbo Instruct | Microsoft Azure | OpenAI | GPT-3.5 Turbo Instruct | $1.63 | $1.50 | $2.00 | 127.3 tokens/s | 4000 tokens | 0.61 s | N/A |
Llama 3 70B | Replicate | Meta | Llama 3 70B | $1.18 | $0.65 | $2.75 | 47.9 tokens/s | 8000 tokens | 0.55 s | 62 |
Llama 3 70B | Hyperbolic | Meta | Llama 3 70B | $0.40 | $0.40 | $0.40 | 29.2 tokens/s | 8000 tokens | 1.6 s | 62 |
Llama 3 70B | Amazon Bedrock | Meta | Llama 3 70B | $2.86 | $2.65 | $3.50 | 52 tokens/s | 8000 tokens | 0.46 s | 62 |
Llama 3 70B | OctoAI | Meta | Llama 3 70B | $0.90 | $0.90 | $0.90 | 69.4 tokens/s | 8000 tokens | 0.34 s | 62 |
Llama 3 70B | Lepton AI | Meta | Llama 3 70B | $0.80 | $0.80 | $0.80 | 30.1 tokens/s | 8000 tokens | 0.82 s | 62 |
Llama 3 70B | Microsoft Azure | Meta | Llama 3 70B | $2.90 | $2.68 | $3.54 | 18.3 tokens/s | 8000 tokens | 0.81 s | 62 |
Llama 3 70B | Fireworks | Meta | Llama 3 70B | $0.90 | $0.90 | $0.90 | 107 tokens/s | 8000 tokens | 0.33 s | 62 |
Llama 3 70B | Deepinfra | Meta | Llama 3 70B | $0.36 | $0.35 | $0.40 | 21.6 tokens/s | 8000 tokens | 0.32 s | 62 |
Llama 3 70B | Groq | Meta | Llama 3 70B | $0.64 | $0.59 | $0.79 | 316.4 tokens/s | 8000 tokens | 0.23 s | 62 |
Llama 3 70B (Reference, FP16) | Together.ai (Reference, FP16) | Meta | Llama 3 70B | $0.90 | $0.90 | $0.90 | 108.2 tokens/s | 8000 tokens | 0.65 s | N/A |
Llama 3 70B (Turbo, FP8) | Together.ai (Turbo, FP8) | Meta | Llama 3 70B | $0.88 | $0.88 | $0.88 | 85 tokens/s | 8000 tokens | 0.51 s | N/A |
Llama 3 8B | Replicate | Meta | Llama 3 8B | $0.10 | $0.05 | $0.25 | 58.7 tokens/s | 8000 tokens | 0.57 s | 53 |
Llama 3 8B | Amazon Bedrock | Meta | Llama 3 8B | $0.38 | $0.30 | $0.60 | 78.4 tokens/s | 8000 tokens | 0.35 s | 53 |
Llama 3 8B | Lepton AI | Meta | Llama 3 8B | $0.07 | $0.07 | $0.07 | 84.1 tokens/s | 8000 tokens | 1.08 s | 53 |
Llama 3 8B | Microsoft Azure | Meta | Llama 3 8B | $0.38 | $0.30 | $0.61 | 73.4 tokens/s | 8000 tokens | 0.48 s | 53 |
Llama 3 8B | Fireworks | Meta | Llama 3 8B | $0.20 | $0.20 | $0.20 | 118.3 tokens/s | 8000 tokens | 0.36 s | 53 |
Llama 3 8B | Deepinfra | Meta | Llama 3 8B | $0.06 | $0.06 | $0.06 | 105.1 tokens/s | 8000 tokens | 0.21 s | 53 |
Llama 3 8B | Groq | Meta | Llama 3 8B | $0.06 | $0.05 | $0.08 | 1202.2 tokens/s | 8000 tokens | 0.3 s | 53 |
Llama 3 8B | Together.ai | Meta | Llama 3 8B | $0.20 | $0.20 | $0.20 | 291.8 tokens/s | 8000 tokens | 0.47 s | 53 |
Llama 2 Chat 70B | Replicate | Meta | Llama 2 Chat 70B | $1.18 | $0.65 | $2.75 | 48.1 tokens/s | 4000 tokens | 0.71 s | 39 |
Llama 2 Chat 70B | Amazon Bedrock | Meta | Llama 2 Chat 70B | $2.10 | $1.95 | $2.56 | 36.7 tokens/s | 4000 tokens | 0.47 s | 39 |
Llama 2 Chat 70B | OctoAI | Meta | Llama 2 Chat 70B | $0.90 | $0.90 | $0.90 | 174.1 tokens/s | 4000 tokens | 0.25 s | 39 |
Llama 2 Chat 70B | Microsoft Azure | Meta | Llama 2 Chat 70B | $1.60 | $1.54 | $1.77 | N/A | 4000 tokens | N/A | 39 |
Llama 2 Chat 13B | Amazon Bedrock | Meta | Llama 2 Chat 13B | $0.81 | $0.75 | $1.00 | 53.1 tokens/s | 4000 tokens | 0.41 s | 36 |
Llama 2 Chat 13B | OctoAI | Meta | Llama 2 Chat 13B | $0.20 | $0.20 | $0.20 | 170.8 tokens/s | 4000 tokens | 0.24 s | 36 |
Llama 2 Chat 13B | Together.ai | Meta | Llama 2 Chat 13B | $0.30 | $0.30 | $0.30 | 52.1 tokens/s | 4000 tokens | 0.5 s | 36 |
Llama 2 Chat 7B | Replicate | Meta | Llama 2 Chat 7B | $0.10 | $0.05 | $0.25 | 124.3 tokens/s | 4000 tokens | 0.55 s | 29 |
Llama 2 Chat 7B | Microsoft Azure | Meta | Llama 2 Chat 7B | $0.56 | $0.52 | $0.67 | N/A | 4000 tokens | N/A | 29 |
Gemini 1.0 Pro (AI Studio) | Google (AI Studio) | Google | Gemini 1.0 Pro | $0.75 | $0.50 | $1.50 | 97.7 tokens/s | 33000 tokens | 1.21 s | N/A |
Claude 3 Sonnet | Amazon Bedrock | Anthropic | Claude 3 Sonnet | $6.00 | $3.00 | $15.00 | 47.1 tokens/s | 200000 tokens | 0.81 s | 77 |
Claude 3 Sonnet | Anthropic | Anthropic | Claude 3 Sonnet | $6.00 | $3.00 | $15.00 | 64.9 tokens/s | 200000 tokens | 0.79 s | 77 |
Mistral Large | Mistral | Mistral AI | Mistral Large | $6.00 | $4.00 | $12.00 | 31.6 tokens/s | 33000 tokens | 0.76 s | 56 |
Mistral Large | Amazon Bedrock | Mistral AI | Mistral Large | $6.00 | $4.00 | $12.00 | 37.3 tokens/s | 33000 tokens | 0.39 s | 56 |
Mistral Large | Microsoft Azure | Mistral AI | Mistral Large | $6.00 | $4.00 | $12.00 | 40.1 tokens/s | 33000 tokens | 0.57 s | 56 |
Mistral Small (Feb '24) | Mistral | Mistral AI | Mistral Small | $1.50 | $1.00 | $3.00 | 76.6 tokens/s | 33000 tokens | 0.58 s | 50 |
Mistral Small (Feb '24) | Microsoft Azure | Mistral AI | Mistral Small | $1.50 | $1.00 | $3.00 | 52.8 tokens/s | 33000 tokens | 0.45 s | 50 |
Mistral 7B | Mistral | Mistral AI | Mistral 7B | $0.25 | $0.25 | $0.25 | 96.6 tokens/s | 33000 tokens | 0.49 s | 24 |
Mistral 7B | Replicate | Mistral AI | Mistral 7B | $0.10 | $0.05 | $0.25 | 61.7 tokens/s | 33000 tokens | 30.44 s | 24 |
Mistral 7B | Amazon Bedrock | Mistral AI | Mistral 7B | $0.16 | $0.15 | $0.20 | 79.1 tokens/s | 33000 tokens | 0.35 s | 24 |
Mistral 7B | OctoAI | Mistral AI | Mistral 7B | $0.15 | $0.15 | $0.15 | 179.1 tokens/s | 33000 tokens | 0.21 s | 24 |
Mistral 7B | Lepton AI | Mistral AI | Mistral 7B | $0.07 | $0.07 | $0.07 | 102.9 tokens/s | 33000 tokens | 0.97 s | 24 |
Mistral 7B | Deepinfra | Mistral AI | Mistral 7B | $0.06 | $0.06 | $0.06 | 111.1 tokens/s | 33000 tokens | 0.2 s | 24 |
Mistral 7B | Perplexity | Mistral AI | Mistral 7B | $0.20 | $0.20 | $0.20 | 123.4 tokens/s | 16000 tokens | 0.22 s | 24 |
Mistral 7B | Together.ai | Mistral AI | Mistral 7B | $0.20 | $0.20 | $0.20 | 126.2 tokens/s | 8000 tokens | 0.32 s | 24 |
Codestral | Mistral | Mistral AI | Codestral | $0.30 | $0.20 | $0.60 | 47.8 tokens/s | 33000 tokens | 0.57 s | N/A |
Mistral Medium | Mistral | Mistral AI | Mistral Medium | $4.09 | $2.75 | $8.10 | 38 tokens/s | 33000 tokens | 0.86 s | 70 |
Llama 3.1 8B | Nebius AI | Meta | Llama 3.1 8B | $0.03 | $0.02 | $0.06 | 30 tokens/s | 128000 tokens | N/A | 53 |
Llama 3.1 8B | Nebius AI Fast | Meta | Llama 3.1 8B | $0.04 | $0.03 | $0.09 | 155 tokens/s | 128000 tokens | N/A | 53 |
Llama 3.1 70B | Nebius AI | Meta | Llama 3.1 70B | $0.20 | $0.13 | $0.40 | 25 tokens/s | 128000 tokens | N/A | 65 |
Llama 3.1 70B | Nebius AI Fast | Meta | Llama 3.1 70B | $0.36 | $0.25 | $0.70 | 140 tokens/s | 128000 tokens | N/A | 65 |
Llama 3.1 405B | Nebius AI | Meta | Llama 3.1 405B | $1.50 | $1.00 | $3.00 | 20 tokens/s | 128000 tokens | N/A | 72 |
Mistral-Nemo-Instruct-2407 | Nebius AI | Mistral AI | Mistral NeMo | $0.06 | $0.04 | $0.12 | 30 tokens/s | 128000 tokens | N/A | 52 |
Mistral-Nemo-Instruct-2407 | Nebius AI Fast | Mistral AI | Mistral NeMo | $0.12 | $0.08 | $0.24 | 100 tokens/s | 128000 tokens | N/A | 52 |
Mixtral-8x7B-Instruct-v0.1 | Nebius AI | Mistral AI | Mixtral 8x7B | $0.12 | $0.08 | $0.24 | 23 tokens/s | 33000 tokens | N/A | 42 |
Mixtral-8x7B-Instruct-v0.1 | Nebius AI Fast | Mistral AI | Mixtral 8x7B | $0.23 | $0.15 | $0.45 | 143 tokens/s | 33000 tokens | N/A | 42 |
Mixtral-8x22B-Instruct-v0.1 | Nebius AI | Mistral AI | Mixtral 8x22B | $0.60 | $0.40 | $1.20 | 23 tokens/s | 65000 tokens | N/A | 61 |
Mixtral-8x22B-Instruct-v0.1 | Nebius AI Fast | Mistral AI | Mixtral 8x22B | $1.05 | $0.70 | $2.10 | 135 tokens/s | 65000 tokens | N/A | 61 |
Qwen2.5-Coder-7B | Nebius AI | Alibaba | Qwen2.5-Coder-7B | $0.04 | $0.03 | $0.09 | 70 tokens/s | 32000 tokens | N/A | 74 |
Qwen2.5-Coder-7B | Nebius AI Fast | Alibaba | Qwen2.5-Coder-7B | $0.31 | $0.31 | $0.30 | 132 tokens/s | 32000 tokens | N/A | 74 |
Qwen2.5-Coder-7B-Instruct | Nebius AI | Alibaba | Qwen2.5-Coder-7B | $0.04 | $0.03 | $0.09 | 70 tokens/s | 32000 tokens | N/A | 74 |
Qwen2.5-Coder-7B-Instruct | Nebius AI Fast | Alibaba | Qwen2.5-Coder-7B | $0.31 | $0.31 | $0.30 | 132 tokens/s | 32000 tokens | N/A | 74 |
DeepSeek-Coder-V2-Lite-Instruct | Nebius AI | DeepSeek | DeepSeek-Coder-V2-Lite | $0.06 | $0.04 | $0.12 | 30 tokens/s | 128000 tokens | N/A | N/A |
DeepSeek-Coder-V2-Lite-Instruct | Nebius AI Fast | DeepSeek | DeepSeek-Coder-V2-Lite | $0.12 | $0.08 | $0.24 | 50 tokens/s | 128000 tokens | N/A | N/A |
Phi-3-mini-4k-instruct | Nebius AI | Microsoft | Phi-3-mini-4k | $0.06 | $0.04 | $0.13 | 13 tokens/s | 4000 tokens | N/A | N/A |
Phi-3-mini-4k-instruct | Nebius AI Fast | Microsoft | Phi-3-mini-4k | $0.20 | $0.13 | $0.40 | 40 tokens/s | 4000 tokens | N/A | N/A |
OLMo-7B-Instruct | Nebius AI | AllenAI | OLMo-7B | $0.12 | $0.08 | $0.24 | 25 tokens/s | 2000 tokens | N/A | N/A |
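
The table does not say how the Price column is derived from Input Price and Output Price. The listed values look consistent with a weighted average that counts input tokens three times as heavily as output tokens, but that 3:1 ratio is inferred from the rows, not stated by the source. The Python sketch below recomputes the blend for a few rows under that assumption. It also gives a rough response-time estimate that treats Latency as the delay before the first token and Output Speed as a constant generation rate, which is likewise an assumption. The function names `blended_price` and `estimated_seconds` are hypothetical helpers introduced for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output price, in USD per 1M tokens.

    The default 3:1 weighting is an assumption inferred from the table rows.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def estimated_seconds(latency_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Rough wall-clock estimate: latency plus generation time at a constant rate.

    Assumes Latency is the time before the first token arrives.
    """
    return latency_s + output_tokens / tokens_per_s


# (name, input price, output price, listed Price) copied from the table above.
rows = [
    ("o1-preview (OpenAI)", 15.00, 60.00, 26.25),
    ("GPT-4o (OpenAI)", 2.50, 10.00, 4.38),
    ("Claude 3 Opus (Anthropic)", 15.00, 75.00, 30.00),
    ("Llama 3.1 70B (Groq)", 0.59, 0.79, 0.64),
]

for name, input_price, output_price, listed in rows:
    computed = blended_price(input_price, output_price)
    print(f"{name}: computed {computed:.4f}, listed {listed:.2f}")

# Example: time for a 500-token reply using the GPT-4o (OpenAI) row values.
print(f"GPT-4o, 500 tokens: {estimated_seconds(0.42, 124.2, 500):.2f} s")
```

Changing `input_weight` and `output_weight` to match a real workload (for example, long prompts with short answers) gives a more representative per-token cost than the single Price column.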