Compare LLMs across providers to find the best fit for your needs.
Model Name | Creator | Provider | Quality Index | Context Window | Price/1M Tokens | Speed (tokens/s) | Latency (s) |
---|---|---|---|---|---|---|---|
Aya Expanse 32B | Cohere | Cohere | 20 | 128,000 | $0.75 | 118.8 | 0.17s |
Aya Expanse 8B | Cohere | Cohere | 16 | 8,000 | $0.75 | 167.3 | 0.21s |
Claude 2.1 | Anthropic | Amazon Bedrock | 24 | 200,000 | $12.00 | 29.6 | 1.81s |
Claude 2.1 | Anthropic | Anthropic | 24 | 200,000 | $12.00 | 13.7 | 0.92s |
Claude 3 Opus | Anthropic | Amazon Bedrock | 35 | 200,000 | $30.00 | 27.1 | 1.20s |
Claude 3 Opus | Anthropic | Google Vertex | 35 | 200,000 | $30.00 | 21.2 | 2.42s |
Claude 3 Opus | Anthropic | Anthropic | 35 | 200,000 | $30.00 | 28.3 | 1.15s |
Claude 3 Sonnet | Anthropic | Amazon Bedrock | 28 | 200,000 | $6.00 | 64.8 | 0.74s |
Claude 3 Sonnet | Anthropic | Anthropic | 28 | 200,000 | $6.00 | 59.4 | 0.56s |
Claude 3.5 Haiku | Anthropic | Amazon Bedrock (Standard) | 35 | 200,000 | $1.60 | 54.4 | 1.32s |
Claude 3.5 Haiku | Anthropic | Amazon Bedrock (Latency Optimized) | 35 | 200,000 | $2.00 | 92.8 | 0.51s |
Claude 3.5 Haiku | Anthropic | Google Vertex | 35 | 200,000 | $1.60 | 65.9 | 0.76s |
Claude 3.5 Haiku | Anthropic | Anthropic | 35 | 200,000 | $1.60 | 64.1 | 1.34s |
Claude 3.5 Sonnet (Oct) | Anthropic | Amazon Bedrock | 44 | 200,000 | $6.00 | 49.9 | 0.95s |
Claude 3.5 Sonnet (Oct) | Anthropic | Google Vertex | 44 | 200,000 | $6.00 | 79.6 | 0.87s |
Claude 3.5 Sonnet (Oct) | Anthropic | Anthropic | 44 | 200,000 | $6.00 | 77.8 | 1.41s |
Claude 3.7 Sonnet | Anthropic | Amazon Bedrock | 48 | 200,000 | $6.00 | 53.0 | 1.05s |
Claude 3.7 Sonnet | Anthropic | Google Vertex | 48 | 200,000 | $6.00 | 78.1 | 0.91s |
Claude 3.7 Sonnet | Anthropic | Anthropic | 48 | 200,000 | $6.00 | 78.7 | 1.19s |
Claude 3.7 Sonnet Thinking | Anthropic | Amazon Bedrock | 57 | 200,000 | $6.00 | 76.3 | 1.48s |
Claude 3.7 Sonnet Thinking | Anthropic | Anthropic | 57 | 200,000 | $6.00 | 88.3 | 1.37s |
Claude 4 Opus | Anthropic | Amazon Bedrock | 58 | 200,000 | $30.00 | 22.4 | 3.64s |
Claude 4 Opus | Anthropic | Google Vertex | 58 | 200,000 | $30.00 | 63.2 | 1.62s |
Claude 4 Opus | Anthropic | Anthropic | 58 | 200,000 | $30.00 | 63.8 | 1.84s |
Claude 4 Opus Thinking | Anthropic | Amazon Bedrock | 64 | 200,000 | $30.00 | 19.1 | 3.47s |
Claude 4 Opus Thinking | Anthropic | Google Vertex | 64 | 200,000 | $30.00 | 59.2 | 1.62s |
Claude 4 Opus Thinking | Anthropic | Anthropic | 64 | 200,000 | $30.00 | 65.5 | 1.83s |
Claude 4 Sonnet | Anthropic | Amazon Bedrock | 53 | 200,000 | $6.00 | 63.8 | 1.30s |
Claude 4 Sonnet | Anthropic | Google Vertex | 53 | 200,000 | $6.00 | 83.8 | 1.25s |
Claude 4 Sonnet | Anthropic | Anthropic | 53 | 200,000 | $6.00 | 80.3 | 1.63s |
Claude 4 Sonnet Thinking | Anthropic | Amazon Bedrock | 61 | 200,000 | $6.00 | 44.0 | 1.17s |
Claude 4 Sonnet Thinking | Anthropic | Google Vertex | 61 | 200,000 | $6.00 | 72.8 | 1.17s |
Claude 4 Sonnet Thinking | Anthropic | Anthropic | 61 | 200,000 | $6.00 | 85.0 | 1.44s |
Codestral (Jan '25) | Mistral | Mistral | 28 | 256,000 | $0.45 | 163.8 | 0.30s |
Codestral (Jan '25) | Mistral | Google Vertex | 28 | 128,000 | $0.45 | 150.5 | 0.14s |
Command A | Cohere | Cohere | 40 | 256,000 | $4.38 | 165.6 | 0.21s |
Command-R | Cohere | Amazon Bedrock | 15 | 128,000 | $0.75 | 107.6 | 0.34s |
Command-R | Cohere | Cohere | 15 | 128,000 | $0.26 | 72.1 | 0.19s |
Command-R (Mar '24) | Cohere | Amazon Bedrock | 15 | 128,000 | $0.75 | 107.2 | 0.33s |
Command-R (Mar '24) | Cohere | Cohere | 15 | 128,000 | $0.75 | 179.0 | 0.15s |
Command-R+ | Cohere | Amazon Bedrock | 21 | 128,000 | $6.00 | 48.1 | 0.49s |
Command-R+ | Cohere | Cohere | 21 | 128,000 | $4.38 | 48.6 | 0.26s |
Command-R+ (Apr '24) | Cohere | Amazon Bedrock | 20 | 128,000 | $6.00 | 48.1 | 0.49s |
Command-R+ (Apr '24) | Cohere | Cohere | 20 | 128,000 | $6.00 | 58.8 | 0.22s |
DeepSeek R1 (Jan '25) | DeepSeek | Lambda | 60 | 164,000 | $0.95 | 39.8 | 0.33s |
DeepSeek R1 (Jan '25) | DeepSeek | Hyperbolic | 60 | 128,000 | $2.00 | 94.1 | 0.96s |
DeepSeek R1 (Jan '25) | DeepSeek | Amazon Bedrock | 60 | 128,000 | $2.36 | 229.3 | 0.37s |
DeepSeek R1 (Jan '25) | DeepSeek | Nebius (Base) | 60 | 128,000 | $1.20 | 35.0 | 0.62s |
DeepSeek R1 (Jan '25) | DeepSeek | Nebius (Fast) | 60 | 128,000 | $3.00 | 80.9 | 0.65s |
DeepSeek R1 (Jan '25) | DeepSeek | Microsoft Azure | 60 | 128,000 | $2.36 | 120.8 | 0.78s |
DeepSeek R1 (Jan '25) | DeepSeek | Fireworks (Fast) | 60 | 164,000 | $4.25 | 111.1 | 0.55s |
DeepSeek R1 (Jan '25) | DeepSeek | Fireworks (Base) | 60 | 164,000 | $0.96 | 90.4 | 0.50s |
DeepSeek R1 (Jan '25) | DeepSeek | Deepinfra (Turbo, FP4) | 60 | 33,000 | $1.50 | 207.2 | 0.24s |
DeepSeek R1 (Jan '25) | DeepSeek | Deepinfra | 60 | 64,000 | $0.88 | 122.2 | 0.26s |
DeepSeek R1 (Jan '25) | DeepSeek | FriendliAI | 60 | 128,000 | $4.00 | 94.2 | 0.48s |
DeepSeek R1 (Jan '25) | DeepSeek | Novita (Turbo) | 60 | 64,000 | $1.15 | 32.6 | 0.79s |
DeepSeek R1 (Jan '25) | DeepSeek | Novita | 60 | 64,000 | $4.00 | 32.8 | 0.76s |
DeepSeek R1 (Jan '25) | DeepSeek | Together.ai | 60 | 128,000 | $4.00 | 335.1 | 0.68s |
DeepSeek R1 (Jan '25) | DeepSeek | kluster.ai | 60 | 128,000 | $3.50 | 76.8 | 0.52s |
DeepSeek R1 0528 (May '25) | DeepSeek | Lambda | 68 | 164,000 | $0.92 | 39.2 | 0.32s |
DeepSeek R1 0528 (May '25) | DeepSeek | DeepSeek | 68 | 64,000 | $0.96 | 26.3 | 3.07s |
DeepSeek R1 0528 (May '25) | DeepSeek | Parasail | 68 | 164,000 | $1.59 | 109.8 | 0.45s |
DeepSeek R1 0528 (May '25) | DeepSeek | Hyperbolic | 68 | 164,000 | $3.00 | 105.2 | 0.99s |
DeepSeek R1 0528 (May '25) | DeepSeek | Nebius AI Studio base | 68 | 164,000 | $1.00 | 103.1 | 0.61s |
DeepSeek R1 0528 (May '25) | DeepSeek | CentML | 68 | 64,000 | $0.00 | 87.5 | 0.82s |
DeepSeek R1 0528 (May '25) | DeepSeek | Microsoft Azure | 68 | 128,000 | $2.36 | 119.0 | 0.75s |
DeepSeek R1 0528 (May '25) | DeepSeek | Fireworks (Fast) | 68 | 164,000 | $4.25 | 267.1 | 0.47s |
DeepSeek R1 0528 (May '25) | DeepSeek | Deepinfra | 68 | 164,000 | $0.91 | 84.2 | 0.28s |
DeepSeek R1 0528 (May '25) | DeepSeek | Novita | 68 | 128,000 | $1.15 | 117.3 | 0.56s |
DeepSeek R1 0528 (May '25) | DeepSeek | GMI | 68 | 131,000 | $1.18 | 63.6 | 0.61s |
DeepSeek R1 0528 (May '25) | DeepSeek | SambaNova | 68 | 33,000 | $5.50 | 131.8 | 4.57s |
DeepSeek R1 0528 (May '25) | DeepSeek | Together.ai | 68 | 128,000 | $4.00 | 338.5 | 0.69s |
DeepSeek R1 0528 (May '25) | DeepSeek | Together.ai (Throughput) | 68 | 128,000 | $0.96 | 24.2 | 1.92s |
DeepSeek R1 0528 (May '25) | DeepSeek | kluster.ai | 68 | 164,000 | $3.50 | 80.0 | 0.52s |
DeepSeek R1 0528 (May '25) (Vertex) | DeepSeek | Google (Vertex) | 68 | 128,000 | $0.00 | 121.3 | 0.38s |
DeepSeek R1 0528 Qwen3 8B | DeepSeek | Parasail | 52 | 131,000 | $0.06 | 124.9 | 0.41s |
DeepSeek R1 0528 Qwen3 8B | DeepSeek | Novita | 52 | 128,000 | $0.07 | 90.7 | 0.79s |
DeepSeek R1 Distill Llama 70B | DeepSeek | Lambda | 48 | 128,000 | $0.30 | 64.6 | 0.28s |
DeepSeek R1 Distill Llama 70B | DeepSeek | Cerebras | 48 | 66,000 | $0.94 | 2473.0 | 0.21s |
DeepSeek R1 Distill Llama 70B | DeepSeek | Nebius (Base) | 48 | 128,000 | $0.38 | 59.5 | 0.56s |
DeepSeek R1 Distill Llama 70B | DeepSeek | Deepinfra | 48 | 128,000 | $0.17 | 32.5 | 0.38s |
DeepSeek R1 Distill Llama 70B | DeepSeek | Novita | 48 | 32,000 | $0.80 | 32.5 | 0.59s |
DeepSeek R1 Distill Llama 70B | DeepSeek | GMI | 48 | 24,000 | $0.38 | 35.9 | 0.82s |
DeepSeek R1 Distill Llama 70B | DeepSeek | Groq | 48 | 128,000 | $0.81 | 368.3 | 0.17s |
DeepSeek R1 Distill Llama 70B | DeepSeek | SambaNova | 48 | 131,000 | $0.88 | 315.5 | 1.67s |
DeepSeek R1 Distill Llama 8B | DeepSeek | Novita | 34 | 32,000 | $0.04 | 55.9 | 0.74s |
DeepSeek R1 Distill Qwen 14B | DeepSeek | Novita | 49 | 64,000 | $0.15 | 44.0 | 1.24s |
DeepSeek R1 Distill Qwen 14B | DeepSeek | GMI | 49 | 131,000 | $0.20 | 82.8 | 0.63s |
DeepSeek R1 Distill Qwen 14B | DeepSeek | Together.ai | 49 | 128,000 | $1.60 | 166.5 | 0.26s |
DeepSeek R1 Distill Qwen 32B | DeepSeek | Deepinfra | 52 | 128,000 | $0.09 | 31.8 | 0.30s |
DeepSeek R1 Distill Qwen 32B | DeepSeek | Novita | 52 | 64,000 | $0.30 | 21.2 | 1.34s |
DeepSeek V3 (Dec '24) | DeepSeek | Hyperbolic (FP8) | 46 | 128,000 | $0.25 | 31.0 | 1.42s |
DeepSeek V3 (Dec '24) | DeepSeek | Nebius | 46 | 128,000 | $0.75 | 33.8 | 0.61s |
DeepSeek V3 (Dec '24) | DeepSeek | Microsoft Azure | 46 | 128,000 | $2.00 | 74.9 | 0.53s |
DeepSeek V3 (Dec '24) | DeepSeek | Fireworks | 46 | 128,000 | $1.31 | 111.3 | 0.65s |
DeepSeek V3 (Dec '24) | DeepSeek | Deepinfra | 46 | 64,000 | $0.51 | 34.3 | 0.29s |
DeepSeek V3 (Dec '24) | DeepSeek | Novita (Turbo) | 46 | 64,000 | $0.63 | 30.7 | 1.04s |
DeepSeek V3 (Dec '24) | DeepSeek | Novita | 46 | 64,000 | $0.89 | 29.9 | 0.79s |
DeepSeek V3 (Dec '24) | DeepSeek | Together.ai | 46 | 128,000 | $1.25 | 90.9 | 0.61s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Lambda | 53 | 164,000 | $0.47 | 33.5 | 0.55s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | DeepSeek | 53 | 64,000 | $0.48 | 28.7 | 2.91s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Replicate | 53 | 128,000 | $1.45 | 89.0 | 0.73s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Hyperbolic | 53 | 128,000 | $1.25 | 35.3 | 1.20s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Nebius (Fast) | 53 | 128,000 | $3.00 | 94.6 | 0.71s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Nebius | 53 | 128,000 | $0.75 | 27.5 | 0.63s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | CentML | 53 | 164,000 | $0.00 | 85.8 | 0.56s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Microsoft Azure | 53 | 128,000 | $2.00 | 66.4 | 0.53s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Fireworks | 53 | 160,000 | $0.90 | 276.3 | 0.47s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Deepinfra | 53 | 164,000 | $0.43 | 21.5 | 0.35s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Novita | 53 | 128,000 | $0.57 | 29.1 | 1.00s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | GMI | 53 | 131,000 | $0.78 | 164.0 | 0.57s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | SambaNova | 53 | 33,000 | $3.38 | 166.1 | 1.77s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | Together.ai | 53 | 128,000 | $1.25 | 110.5 | 0.63s |
DeepSeek V3 0324 (Mar '25) | DeepSeek | kluster.ai | 53 | 164,000 | $0.88 | 37.1 | 0.62s |
Devstral | Mistral | Mistral | 34 | 256,000 | $0.15 | 140.5 | 0.32s |
Gemini 1.5 Flash (Sep) | Google | Google (Vertex) | 39 | 1,000,000 | $0.13 | 189.2 | 0.22s |
Gemini 1.5 Flash (Sep) | Google | Google (AI Studio) | 39 | 1,000,000 | $0.13 | 193.0 | 0.32s |
Gemini 1.5 Flash-8B | Google | Google AI Studio | 31 | 1,000,000 | $0.07 | 238.6 | 0.23s |
Gemini 1.5 Pro | Google | | 67 | 1,000,000 | $7.00 | 102.5 | 1.25s |
Gemini 1.5 Pro (Sep) | Google | Google (Vertex) | 45 | 2,000,000 | $2.19 | 92.5 | 0.80s |
Gemini 1.5 Pro (Sep) | Google | Google (AI Studio) | 45 | 2,000,000 | $2.19 | 93.6 | 2.25s |
Gemini 2.0 Flash | Google | Google Vertex | 48 | 1,000,000 | $0.26 | 213.3 | 0.33s |
Gemini 2.0 Flash | Google | Google (AI Studio) | 48 | 1,000,000 | $0.17 | 242.7 | 0.38s |
Gemini 2.0 Flash (exp) | Google | Google (AI Studio) | 46 | 1,000,000 | $0.00 | 212.9 | 0.29s |
Gemini 2.0 Flash-Lite (Feb '25) | Google | Google (AI Studio) | 41 | 1,000,000 | $0.13 | 217.9 | 0.28s |
Gemini 2.0 Pro Experimental | Google | Google (AI Studio) | 49 | 2,000,000 | $0.00 | 44.2 | 18.04s |
Gemini 2.5 Flash | Google | Google (AI Studio) | 53 | 1,000,000 | $0.26 | 284.3 | 0.35s |
Gemini 2.5 Flash | Google | Google (Vertex) | 53 | 1,000,000 | $0.26 | 201.8 | 0.31s |
Gemini 2.5 Flash (April '25) | Google | Google (AI Studio) | 49 | 1,000,000 | $0.26 | 317.7 | 0.37s |
Gemini 2.5 Flash (April '25) (Reasoning) | Google | Google (AI Studio) | 60 | 1,000,000 | $0.99 | 421.9 | 6.58s |
Gemini 2.5 Flash (Reasoning) | Google | Google (AI Studio) | 65 | 1,000,000 | $0.99 | 356.0 | 9.39s |
Gemini 2.5 Flash (Reasoning) | Google | Google (Vertex) | 65 | 1,000,000 | $0.99 | 274.5 | 18.05s |
Gemini 2.5 Flash-Lite | Google | Google (AI Studio) | 46 | 1,000,000 | $0.17 | 472.5 | 0.23s |
Gemini 2.5 Flash-Lite (Reasoning) | Google | Google (AI Studio) | 55 | 1,000,000 | $0.17 | 697.1 | 5.99s |
Gemini 2.5 Pro | Google | Google (AI Studio) | 70 | 1,000,000 | $3.44 | 144.9 | 36.80s |
Gemini 2.5 Pro (Mar '25) | Google | | 69 | 1,000,000 | $3.44 | 144.5 | 34.53s |
Gemini 2.5 Pro (May '25) | Google | Google (AI Studio) | 68 | 1,000,000 | $3.44 | 144.5 | 34.08s |
Gemini 2.5 Pro (May '25) | Google | Google Vertex | 68 | 1,000,000 | $3.44 | 153.7 | 30.80s |
Gemma 2 27B | Google | Together.ai | 32 | 8,000 | $0.80 | 92.0 | 0.27s |
Gemma 2 9B | Google | Nebius (Fast) | 22 | 8,000 | $0.04 | 194.7 | 0.46s |
Gemma 2 9B | Google | Groq | 22 | 8,000 | $0.20 | 725.7 | 0.20s |
Gemma 3 12B | Google | Deepinfra | 34 | 128,000 | $0.06 | 63.7 | 0.53s |
Gemma 3 27B | Google | Parasail | 38 | 131,000 | $0.29 | 80.3 | 0.46s |
Gemma 3 27B | Google | Google (AI Studio) | 38 | 128,000 | $0.00 | 23.3 | 0.69s |
Gemma 3 27B | Google | Deepinfra | 38 | 128,000 | $0.12 | 28.6 | 0.52s |
Gemma 3 4B | Google | Deepinfra | 25 | 128,000 | $0.03 | 134.3 | 0.23s |
Gemma 3n E4B | Google | Together.ai | 28 | 33,000 | $0.03 | 51.0 | 0.29s |
GPT-4.1 | OpenAI | OpenAI | 53 | 1,000,000 | $3.50 | 133.6 | 0.50s |
GPT-4.1 | OpenAI | Microsoft Azure | 53 | 1,000,000 | $3.50 | 209.3 | 1.08s |
GPT-4.1 mini | OpenAI | OpenAI | 53 | 1,000,000 | $0.70 | 73.1 | 0.48s |
GPT-4.1 mini | OpenAI | Microsoft Azure | 53 | 1,000,000 | $0.70 | 217.8 | 0.57s |
GPT-4.1 nano | OpenAI | OpenAI | 41 | 1,000,000 | $0.17 | 190.2 | 0.31s |
GPT-4.1 nano | OpenAI | Microsoft Azure | 41 | 1,000,000 | $0.17 | 225.2 | 0.69s |
GPT-4o (March 2025) | OpenAI | OpenAI | 50 | 128,000 | $7.50 | 196.8 | 0.44s |
GPT-4o (May '24) | OpenAI | OpenAI | 41 | 128,000 | $7.50 | 115.7 | 0.52s |
GPT-4o (May '24) | OpenAI | Microsoft Azure | 41 | 128,000 | $7.50 | 153.3 | 0.64s |
GPT-4o (Nov '24) | OpenAI | OpenAI | 41 | 128,000 | $4.38 | 199.9 | 0.38s |
GPT-4o (Nov '24) | OpenAI | Microsoft Azure | 41 | 128,000 | $4.38 | 133.8 | 1.02s |
GPT-4o mini | OpenAI | OpenAI | 36 | 128,000 | $0.26 | 95.3 | 0.48s |
GPT-4o mini | OpenAI | Microsoft Azure | 36 | 128,000 | $0.26 | 120.0 | 0.78s |
Grok 3 | xAI | xAI | 51 | 131,000 | $6.00 | 83.1 | 0.62s |
Grok 3 | xAI | xAI (Fast) | 51 | 131,000 | $10.00 | 88.0 | 0.62s |
Grok 3 mini Reasoning (high) | xAI | xAI | 67 | 131,000 | $0.35 | 210.7 | 0.60s |
Grok 3 mini Reasoning (high) | xAI | xAI (Fast) | 67 | 131,000 | $1.45 | 210.6 | 0.65s |
Grok 4 | xAI | xAI | 73 | 256,000 | $6.00 | 76.1 | 5.69s |
Jamba 1.5 Large | AI21 Labs | Microsoft Azure | 29 | 256,000 | $3.50 | 50.6 | 0.69s |
Jamba 1.5 Mini | AI21 Labs | Microsoft Azure | 18 | 256,000 | $0.25 | 82.4 | 0.48s |
Jamba 1.6 Large | AI21 Labs | AI21 Labs | 29 | 256,000 | $3.50 | 59.7 | 0.70s |
Jamba 1.6 Mini | AI21 Labs | AI21 Labs | 18 | 256,000 | $0.25 | 165.2 | 0.60s |
LFM 40B | Liquid AI | Lambda | 22 | 32,000 | $0.15 | 161.1 | 0.16s |
Llama 2 Chat 7B | Meta | Replicate | 8 | 4,000 | $0.10 | 132.0 | 0.42s |
Llama 3 70B | Meta | Replicate | 27 | 8,000 | $1.18 | 43.6 | 0.42s |
Llama 3 70B | Meta | Hyperbolic | 27 | 8,000 | $0.40 | 18.9 | 1.59s |
Llama 3 70B | Meta | Amazon Bedrock | 27 | 8,000 | $2.86 | 47.4 | 0.41s |
Llama 3 70B | Meta | Microsoft Azure | 27 | 8,000 | $2.90 | 18.9 | 0.76s |
Llama 3 70B | Meta | Deepinfra | 27 | 8,000 | $0.33 | 43.9 | 0.31s |
Llama 3 70B | Meta | Novita | 27 | 8,000 | $0.57 | 18.9 | 1.29s |
Llama 3 70B | Meta | Groq | 27 | 8,000 | $0.64 | 485.3 | 0.20s |
Llama 3 70B | Meta | Together.ai (Reference, FP16) | 27 | 8,000 | $0.88 | 114.4 | 0.34s |
Llama 3 70B | Meta | Together.ai (Turbo, FP8) | 27 | 8,000 | $0.88 | 114.6 | 0.36s |
Llama 3 8B | Meta | Replicate | 21 | 8,000 | $0.10 | 80.6 | 0.39s |
Llama 3 8B | Meta | Amazon Bedrock | 21 | 8,000 | $0.38 | 103.9 | 0.31s |
Llama 3 8B | Meta | Microsoft Azure | 21 | 8,000 | $0.38 | 73.7 | 0.37s |
Llama 3 8B | Meta | Deepinfra | 21 | 8,000 | $0.04 | 125.7 | 0.50s |
Llama 3 8B | Meta | Novita | 21 | 8,000 | $0.04 | 72.7 | 0.84s |
Llama 3 8B | Meta | Groq | 21 | 8,000 | $0.06 | 1228.9 | 0.27s |
Llama 3.1 405B | Meta | Lambda (FP8) | 40 | 128,000 | $0.80 | 32.6 | 0.34s |
Llama 3.1 405B | Meta | Replicate | 40 | 128,000 | $9.50 | 19.3 | 1.00s |
Llama 3.1 405B | Meta | Hyperbolic | 40 | 128,000 | $4.00 | 92.2 | 0.97s |
Llama 3.1 405B | Meta | Amazon Bedrock (Standard) | 40 | 128,000 | $2.40 | 29.7 | 1.85s |
Llama 3.1 405B | Meta | Amazon Bedrock (Latency Optimized) | 40 | 128,000 | $3.00 | 89.4 | 0.45s |
Llama 3.1 405B | Meta | Nebius (Base) | 40 | 128,000 | $1.50 | 32.9 | 0.67s |
Llama 3.1 405B | Meta | Google Vertex | 40 | 128,000 | $7.75 | 29.7 | 0.42s |
Llama 3.1 405B | Meta | Microsoft Azure | 40 | 128,000 | $8.00 | 31.1 | 0.47s |
Llama 3.1 405B | Meta | Fireworks | 40 | 128,000 | $3.00 | 99.4 | 0.45s |
Llama 3.1 405B | Meta | Deepinfra | 40 | 33,000 | $0.80 | 27.7 | 0.71s |
Llama 3.1 405B | Meta | SambaNova | 40 | 16,000 | $6.25 | 168.2 | 0.70s |
Llama 3.1 405B | Meta | Databricks | 40 | 128,000 | $7.50 | 38.9 | 0.94s |
Llama 3.1 405B | Meta | Together.ai (Turbo) | 40 | 128,000 | $3.50 | 91.9 | 0.42s |
Llama 3.1 70B | Meta | Lambda (FP8) | 35 | 128,000 | $0.17 | 51.0 | 0.22s |
Llama 3.1 70B | Meta | Hyperbolic | 35 | 128,000 | $0.40 | 130.4 | 0.89s |
Llama 3.1 70B | Meta | Amazon Bedrock (Standard) | 35 | 128,000 | $0.72 | 31.7 | 0.65s |
Llama 3.1 70B | Meta | Amazon Bedrock (Latency Optimized) | 35 | 128,000 | $0.90 | 141.3 | 0.31s |
Llama 3.1 70B | Meta | Nebius (Base) | 35 | 128,000 | $0.20 | 29.9 | 0.65s |
Llama 3.1 70B | Meta | Google Vertex | 35 | 128,000 | $0.00 | 72.9 | 0.27s |
Llama 3.1 70B | Meta | Microsoft Azure | 35 | 128,000 | $2.90 | 64.1 | 0.43s |
Llama 3.1 70B | Meta | Fireworks | 35 | 128,000 | $0.90 | 158.6 | 0.34s |
Llama 3.1 70B | Meta | Deepinfra (Turbo, FP8) | 35 | 128,000 | $0.14 | 37.3 | 0.26s |
Llama 3.1 70B | Meta | Deepinfra | 35 | 128,000 | $0.27 | 34.9 | 0.30s |
Llama 3.1 70B | Meta | Together.ai (Turbo) | 35 | 128,000 | $0.88 | 107.5 | 0.39s |
Llama 3.1 70B | Meta | Simplismart | 35 | 128,000 | $0.90 | 125.4 | 0.50s |
Llama 3.1 8B | Meta | Lambda | 24 | 128,000 | $0.03 | 141.2 | 0.21s |
Llama 3.1 8B | Meta | Cerebras | 24 | 33,000 | $0.10 | 2269.2 | 0.25s |
Llama 3.1 8B | Meta | Hyperbolic | 24 | 128,000 | $0.10 | 414.2 | 0.70s |
Llama 3.1 8B | Meta | Amazon Bedrock | 24 | 128,000 | $0.22 | 229.3 | 0.28s |
Llama 3.1 8B | Meta | Nebius (Fast) | 24 | 128,000 | $0.04 | 182.5 | 0.47s |
Llama 3.1 8B | Meta | Nebius (Base) | 24 | 128,000 | $0.03 | 67.0 | 0.53s |
Llama 3.1 8B | Meta | Google Vertex | 24 | 128,000 | $0.00 | 119.1 | 0.18s |
Llama 3.1 8B | Meta | Microsoft Azure | 24 | 128,000 | $0.38 | 226.2 | 0.31s |
Llama 3.1 8B | Meta | Fireworks | 24 | 128,000 | $0.20 | 306.9 | 0.25s |
Llama 3.1 8B | Meta | Deepinfra | 24 | 128,000 | $0.04 | 55.3 | 0.27s |
Llama 3.1 8B | Meta | FriendliAI | 24 | 128,000 | $0.10 | 469.8 | 0.27s |
Llama 3.1 8B | Meta | Novita | 24 | 16,000 | $0.03 | 74.1 | 0.86s |
Llama 3.1 8B | Meta | Groq | 24 | 128,000 | $0.06 | 629.1 | 0.16s |
Llama 3.1 8B | Meta | SambaNova | 24 | 16,000 | $0.13 | 1191.1 | 0.26s |
Llama 3.1 8B | Meta | Together.ai (Turbo) | 24 | 128,000 | $0.18 | 159.8 | 0.25s |
Llama 3.1 8B | Meta | Simplismart | 24 | 128,000 | $0.15 | 473.6 | 1.65s |
Llama 3.1 8B | Meta | kluster.ai | 24 | 128,000 | $0.18 | 119.0 | 0.24s |
Llama 3.1 Nemotron 70B | NVIDIA | Lambda (FP8) | 37 | 128,000 | $0.17 | 50.5 | 0.23s |
Llama 3.1 Nemotron 70B | NVIDIA | Deepinfra | 37 | 128,000 | $0.17 | 39.6 | 0.29s |
Llama 3.2 1B | Meta | Amazon Bedrock | 10 | 128,000 | $0.10 | 129.2 | 0.46s |
Llama 3.2 1B | Meta | Deepinfra | 10 | 128,000 | $0.01 | 285.5 | 0.27s |
Llama 3.2 3B | Meta | Lambda (FP8) | 20 | 128,000 | $0.02 | 216.0 | 0.20s |
Llama 3.2 3B | Meta | Hyperbolic | 20 | 128,000 | $0.10 | 56.2 | 1.16s |
Llama 3.2 3B | Meta | Amazon Bedrock | 20 | 128,000 | $0.15 | 71.1 | 0.47s |
Llama 3.2 3B | Meta | Deepinfra | 20 | 128,000 | $0.00 | 151.4 | 0.43s |
Llama 3.2 3B | Meta | Novita | 20 | 32,000 | $0.04 | 67.3 | 0.73s |
Llama 3.2 3B | Meta | Together.ai (Turbo) | 20 | 128,000 | $0.06 | 156.5 | 0.31s |
Llama 3.2 90B (Vision) | Meta | Amazon Bedrock | 33 | 128,000 | $0.72 | 60.4 | 0.52s |
Llama 3.2 90B (Vision) | Meta | Google Vertex | 33 | 128,000 | $0.00 | 32.5 | 0.20s |
Llama 3.2 90B (Vision) | Meta | Deepinfra | 33 | 33,000 | $0.36 | 35.6 | 0.31s |
Llama 3.3 70B | Meta | Lambda (FP8) | 41 | 128,000 | $0.17 | 56.5 | 0.29s |
Llama 3.3 70B | Meta | Parasail (FP8) | 41 | 131,000 | $0.28 | 116.0 | 0.46s |
Llama 3.3 70B | Meta | Cerebras | 41 | 128,000 | $0.94 | 2455.4 | 0.25s |
Llama 3.3 70B | Meta | Hyperbolic | 41 | 128,000 | $0.40 | 41.3 | 1.09s |
Llama 3.3 70B | Meta | Amazon Bedrock | 41 | 128,000 | $0.71 | 244.4 | 0.53s |
Llama 3.3 70B | Meta | Nebius (Fast) | 41 | 128,000 | $0.38 | 191.0 | 0.55s |
Llama 3.3 70B | Meta | Nebius (Base) | 41 | 128,000 | $0.20 | 37.5 | 0.63s |
Llama 3.3 70B | Meta | Google Vertex | 41 | 128,000 | $0.72 | 85.7 | 0.20s |
Llama 3.3 70B | Meta | Snowflake | 41 | 8,000 | $0.58 | 84.7 | 0.31s |
Llama 3.3 70B | Meta | CentML | 41 | 128,000 | $0.00 | 152.8 | 0.53s |
Llama 3.3 70B | Meta | Microsoft Azure | 41 | 128,000 | $0.71 | 55.9 | 0.45s |
Llama 3.3 70B | Meta | Fireworks | 41 | 128,000 | $0.90 | 121.5 | 0.41s |
Llama 3.3 70B | Meta | Deepinfra (Turbo, FP8) | 41 | 128,000 | $0.08 | 36.6 | 0.25s |
Llama 3.3 70B | Meta | Deepinfra | 41 | 128,000 | $0.27 | 36.4 | 0.34s |
Llama 3.3 70B | Meta | FriendliAI | 41 | 128,000 | $0.60 | 156.9 | 0.41s |
Llama 3.3 70B | Meta | Novita | 41 | 128,000 | $0.20 | 44.9 | 0.58s |
Llama 3.3 70B | Meta | Groq | 41 | 128,000 | $0.64 | 442.5 | 0.22s |
Llama 3.3 70B | Meta | SambaNova | 41 | 128,000 | $0.75 | 446.4 | 0.29s |
Llama 3.3 70B | Meta | Together.ai (Turbo) | 41 | 128,000 | $0.88 | 108.0 | 0.34s |
Llama 3.3 70B | Meta | kluster.ai | 41 | 128,000 | $0.70 | 32.4 | 0.39s |
Llama 4 Maverick | Meta | Lambda (FP8) | 51 | 1,000,000 | $0.28 | 171.1 | 0.23s |
Llama 4 Maverick | Meta | Parasail (FP8) | 51 | 1,000,000 | $0.35 | 183.7 | 0.36s |
Llama 4 Maverick | Meta | Amazon Bedrock | 51 | 128,000 | $0.42 | 317.9 | 0.59s |
Llama 4 Maverick | Meta | Google Vertex | 51 | 524,000 | $0.55 | 125.0 | 0.36s |
Llama 4 Maverick | Meta | CentML (FP8) | 51 | 1,000,000 | $0.00 | 131.9 | 0.32s |
Llama 4 Maverick | Meta | Microsoft Azure (FP8) | 51 | 128,000 | $0.61 | 171.8 | 0.32s |
Llama 4 Maverick | Meta | Fireworks (Base) | 51 | 1,000,000 | $0.39 | 175.2 | 0.44s |
Llama 4 Maverick | Meta | Deepinfra (FP8) | 51 | 131,000 | $0.26 | 105.3 | 0.23s |
Llama 4 Maverick | Meta | Deepinfra (Turbo, FP8) | 51 | 8,000 | $0.50 | 979.1 | 0.18s |
Llama 4 Maverick | Meta | Novita (FP8) | 51 | 1,000,000 | $0.34 | 108.2 | 0.51s |
Llama 4 Maverick | Meta | GMI (FP8) | 51 | 1,000,000 | $0.39 | 159.2 | 0.46s |
Llama 4 Maverick | Meta | Groq | 51 | 128,000 | $0.30 | 559.8 | 0.11s |
Llama 4 Maverick | Meta | SambaNova | 51 | 131,000 | $0.92 | 801.8 | 0.36s |
Llama 4 Maverick | Meta | Together.ai | 51 | 524,000 | $0.41 | 120.1 | 0.35s |
Llama 4 Maverick | Meta | kluster.ai (FP8) | 51 | 1,000,000 | $0.35 | 168.9 | 0.40s |
Llama 4 Scout | Meta | Lambda | 43 | 1,000,000 | $0.14 | 118.0 | 0.24s |
Llama 4 Scout | Meta | Parasail (FP8) | 43 | 158,000 | $0.19 | 114.7 | 0.36s |
Llama 4 Scout | Meta | Cerebras | 43 | 32,000 | $0.70 | 2808.5 | 0.26s |
Llama 4 Scout | Meta | Amazon Bedrock | 43 | 128,000 | $0.29 | 169.0 | 0.51s |
Llama 4 Scout | Meta | Google Vertex | 43 | 1,000,000 | $0.36 | 134.6 | 0.37s |
Llama 4 Scout | Meta | CentML | 43 | 1,000,000 | $0.00 | 119.1 | 0.34s |
Llama 4 Scout | Meta | Microsoft Azure | 43 | 128,000 | $0.34 | 113.2 | 0.34s |
Llama 4 Scout | Meta | Fireworks (Base) | 43 | 1,000,000 | $0.26 | 167.9 | 0.48s |
Llama 4 Scout | Meta | Deepinfra | 43 | 131,000 | $0.14 | 36.4 | 0.33s |
Llama 4 Scout | Meta | Novita | 43 | 131,000 | $0.20 | 115.1 | 0.49s |
Llama 4 Scout | Meta | GMI | 43 | 1,000,000 | $0.18 | 130.3 | 0.43s |
Llama 4 Scout | Meta | Groq | 43 | 131,000 | $0.17 | 599.5 | 0.19s |
Llama 4 Scout | Meta | Together.ai | 43 | 328,000 | $0.28 | 123.8 | 0.20s |
Llama 4 Scout | Meta | kluster.ai | 43 | 128,000 | $0.71 | 90.8 | 0.45s |
Llama Nemotron Ultra Reasoning | NVIDIA | Nebius (Base) | 61 | 131,000 | $0.90 | 41.8 | 0.64s |
Magistral Medium | Mistral | Mistral | 56 | 41,000 | $2.75 | 124.5 | 0.43s |
Magistral Small | Mistral | Mistral | 55 | 40,000 | $0.75 | 196.6 | 0.31s |
MiniMax M1 40k | MiniMax | MiniMax | 61 | 1,000,000 | $0.82 | 19.4 | 1.31s |
MiniMax M1 80k | MiniMax | MiniMax | 63 | 1,000,000 | $0.82 | 20.2 | 1.46s |
MiniMax-Text-01 | MiniMax | MiniMax | 40 | 1,000,000 | $0.42 | 30.4 | 1.15s |
Ministral 3B | Mistral | Mistral | 20 | 128,000 | $0.04 | 270.1 | 0.29s |
Ministral 8B | Mistral | Mistral | 22 | 128,000 | $0.10 | 197.3 | 0.33s |
Mistral 7B | Mistral | Mistral | 10 | 8,000 | $0.25 | 125.1 | 0.30s |
Mistral 7B | Mistral | Amazon Bedrock | 10 | 8,000 | $0.16 | 93.8 | 0.32s |
Mistral 7B | Mistral | Deepinfra | 10 | 8,000 | $0.04 | 102.6 | 0.20s |
Mistral 7B | Mistral | Novita | 10 | 32,000 | $0.04 | 117.1 | 0.86s |
Mistral 7B | Mistral | Together.ai | 10 | 8,000 | $0.20 | 157.7 | 0.20s |
Mistral Large (Feb '24) | Mistral | Amazon Bedrock | 26 | 33,000 | $6.00 | 45.1 | 0.40s |
Mistral Large 2 (Jul '24) | Mistral | Mistral | 37 | 128,000 | $3.00 | 95.6 | 0.42s |
Mistral Large 2 (Jul '24) | Mistral | Amazon Bedrock | 37 | 128,000 | $3.00 | 33.6 | 0.45s |
Mistral Large 2 (Nov '24) | Mistral | Mistral | 38 | 128,000 | $3.00 | 32.9 | 0.51s |
Mistral Large 2 (Nov '24) | Mistral | Microsoft Azure | 38 | 128,000 | $3.00 | 23.8 | 0.49s |
Mistral Medium | Mistral | Mistral | 23 | 33,000 | $4.09 | 53.0 | 0.45s |
Mistral Medium 3 | Mistral | Mistral | 49 | 128,000 | $0.80 | 57.7 | 0.42s |
Mistral Medium 3 | Mistral | Microsoft Azure | 49 | 128,000 | $0.80 | 55.0 | 0.42s |
Mistral NeMo | Mistral | Mistral | 20 | 128,000 | $0.15 | 176.6 | 0.30s |
Mistral NeMo | Mistral | Parasail (FP8) | 20 | 131,000 | $0.11 | 151.3 | 0.36s |
Mistral NeMo | Mistral | Nebius (Base) | 20 | 128,000 | $0.06 | 51.5 | 0.54s |
Mistral NeMo | Mistral | Deepinfra | 20 | 128,000 | $0.01 | 50.5 | 0.23s |
Mistral Small (Feb '24) | Mistral | Mistral | 23 | 33,000 | $1.50 | 208.7 | 0.30s |
Mistral Small (Feb '24) | Mistral | Microsoft Azure | 23 | 33,000 | $1.50 | 87.9 | 0.41s |
Mistral Small (Sep '24) | Mistral | Mistral | 27 | 33,000 | $0.30 | 124.7 | 0.31s |
Mistral Small 3 | Mistral | Mistral | 35 | 32,000 | $0.15 | 110.1 | 0.33s |
Mistral Small 3 | Mistral | Deepinfra | 35 | 32,000 | $0.06 | 85.1 | 0.20s |
Mistral Small 3 | Mistral | Together.ai | 35 | 32,000 | $0.80 | 95.9 | 0.18s |
Mistral Small 3.1 | Mistral | Mistral | 35 | 128,000 | $0.15 | 179.1 | 0.28s |
Mistral Small 3.1 | Mistral | Parasail | 35 | 128,000 | $0.15 | 61.5 | 0.41s |
Mistral Small 3.1 | Mistral | Google Vertex | 35 | 128,000 | $0.15 | 34.4 | 0.22s |
Mistral Small 3.2 | Mistral | Mistral | 42 | 33,000 | $0.15 | 204.1 | 0.29s |
Mistral Small 3.2 | Mistral | Deepinfra (FP8) | 42 | 128,000 | $0.06 | 39.1 | 0.38s |
Mixtral 8x22B | Mistral | Mistral | 26 | 65,000 | $3.00 | 55.5 | 0.37s |
Mixtral 8x22B | Mistral | Fireworks | 26 | 65,000 | $1.20 | 75.3 | 0.34s |
Mixtral 8x7B | Mistral | Mistral | 17 | 33,000 | $0.70 | 78.6 | 0.36s |
Mixtral 8x7B | Mistral | Amazon Bedrock | 17 | 33,000 | $0.51 | 93.3 | 0.33s |
Mixtral 8x7B | Mistral | Deepinfra | 17 | 33,000 | $0.12 | 85.5 | 0.25s |
Mixtral 8x7B Mixtral 8x7B | Mistral | Together.ai | 17 | 33,000 | $0.60 | 46.6 | 0.94s |
Nova Lite Nova Lite | Amazon | Amazon Bedrock | 33 | 300,000 | $0.10 | 229.2 | 0.46s |
Nova Micro Nova Micro | Amazon | Amazon Bedrock | 28 | 130,000 | $0.06 | 371.5 | 0.43s |
Nova Premier Nova Premier | Amazon | Amazon Bedrock | 43 | 1,000,000 | $5.00 | 85.5 | 0.97s |
Nova Pro Nova Pro | Amazon | Amazon Bedrock | 37 | 300,000 | $1.40 | 115.8 | 0.53s |
o1 o1 | OpenAI | OpenAI | 62 | 200,000 | $26.25 | 192.0 | 14.52s |
o1 o1 | OpenAI | Microsoft Azure | 62 | 200,000 | $26.25 | 114.0 | 26.41s |
o1-mini o1-mini | OpenAI | OpenAI | 54 | 128,000 | $1.93 | 244.2 | 9.05s |
o1-mini o1-mini | OpenAI | Microsoft Azure | 54 | 128,000 | $1.93 | 270.4 | 8.87s |
o3 o3 | OpenAI | OpenAI | 70 | 128,000 | $3.50 | 199.5 | 12.85s |
o3 o3 | OpenAI | Microsoft Azure | 70 | 128,000 | $3.50 | 99.1 | 31.89s |
o3-mini o3-mini | OpenAI | OpenAI | 63 | 200,000 | $1.93 | 189.1 | 11.71s |
o3-mini o3-mini | OpenAI | Microsoft Azure | 63 | 200,000 | $1.93 | 218.3 | 11.66s |
o3-mini (high) o3-mini (high) | OpenAI | OpenAI | 66 | 200,000 | $1.93 | 194.2 | 35.49s |
o3-mini (high) o3-mini (high) | OpenAI | Microsoft Azure | 66 | 200,000 | $1.93 | 213.2 | 32.91s |
o4-mini (high) o4-mini (high) | OpenAI | OpenAI | 70 | 200,000 | $1.93 | 135.7 | 37.56s |
o4-mini (high) o4-mini (high) | OpenAI | Microsoft Azure | 70 | 200,000 | $1.93 | 152.2 | 30.12s |
Phi-3 Medium 14B Phi-3 Medium 14B | Microsoft Azure | Microsoft Azure | 25 | 128,000 | $0.30 | 52.9 | 0.43s |
Phi-4 Phi-4 | Microsoft Azure | Nebius | 40 | 16,000 | $0.15 | 106.2 | 0.48s |
Phi-4 Phi-4 | Microsoft Azure | Microsoft Azure | 40 | 16,000 | $0.22 | 22.2 | 0.47s |
Phi-4 Phi-4 | Microsoft Azure | Deepinfra | 40 | 16,000 | $0.09 | 39.1 | 0.26s |
Phi-4 Mini Phi-4 Mini | Microsoft Azure | Microsoft Azure | 26 | 128,000 | $0.00 | 31.7 | 0.38s |
Phi-4 Multimodal Phi-4 Multimodal | Microsoft Azure | Microsoft Azure | 27 | 128,000 | $0.00 | 22.0 | 0.33s |
Pixtral 12B Pixtral 12B | Mistral | Mistral | 23 | 128,000 | $0.15 | 103.3 | 0.31s |
Pixtral 12B Pixtral 12B | Mistral | Hyperbolic | 23 | 128,000 | $0.10 | 104.7 | 0.44s |
Pixtral Large Pixtral Large | Mistral | Mistral | 37 | 128,000 | $3.00 | 95.4 | 0.41s |
Qwen2 72B Qwen2 72B | Alibaba | Together.ai | 33 | 33,000 | $0.90 | 39.0 | 0.43s |
Qwen2 72B Qwen2 72B | Alibaba | Alibaba Cloud | 33 | 131,000 | $0.00 | 31.0 | 1.33s |
Qwen2.5 72B Qwen2.5 72B | Alibaba | Hyperbolic | 40 | 131,000 | $0.40 | 31.9 | 1.29s |
Qwen2.5 72B Qwen2.5 72B | Alibaba | Nebius | 40 | 131,000 | $0.20 | 35.5 | 0.63s |
Qwen2.5 72B Fast Qwen2.5 72B | Alibaba | Nebius (Fast) | 40 | 131,000 | $0.38 | 65.2 | 0.52s |
Qwen2.5 72B Qwen2.5 72B | Alibaba | Fireworks | 40 | 131,000 | $0.90 | 75.2 | 0.32s |
Qwen2.5 72B Qwen2.5 72B | Alibaba | Deepinfra | 40 | 33,000 | $0.19 | 42.0 | 0.54s |
Qwen2.5 72B Turbo Qwen2.5 72B | Alibaba | Together.ai (Turbo) | 40 | 131,000 | $1.20 | 114.8 | 0.28s |
Qwen2.5 72B Qwen2.5 72B | Alibaba | Alibaba Cloud | 40 | 131,000 | $0.00 | 58.2 | 1.28s |
Qwen2.5 Coder 32B Qwen2.5 Coder 32B | Alibaba | Lambda | 36 | 33,000 | $0.09 | 42.9 | 0.32s |
Qwen2.5 Coder 32B Qwen2.5 Coder 32B | Alibaba | Hyperbolic | 36 | 131,000 | $0.20 | 51.8 | 1.13s |
Qwen2.5 Coder 32B Qwen2.5 Coder 32B | Alibaba | Deepinfra | 36 | 33,000 | $0.08 | 50.2 | 0.24s |
Qwen2.5 Coder 32B Qwen2.5 Coder 32B | Alibaba | Together.ai | 36 | 131,000 | $0.80 | 90.3 | 0.21s |
Qwen2.5 Instruct 32B Fast Qwen2.5 Instruct 32B | Alibaba | Nebius (Fast) | 37 | 128,000 | $0.20 | 88.5 | 0.52s |
Qwen2.5 Instruct 32B Base Qwen2.5 Instruct 32B | Alibaba | Nebius (Base) | 37 | 128,000 | $0.10 | 58.7 | 0.55s |
Qwen2.5 Max Qwen2.5 Max | Alibaba | Alibaba Cloud | 45 | 32,000 | $2.80 | 40.2 | 1.41s |
Qwen2.5 Turbo Qwen2.5 Turbo | Alibaba | Alibaba Cloud | 34 | 1,000,000 | $0.09 | 48.9 | 1.07s |
Qwen3 0.6B Qwen3 0.6B | Alibaba | Alibaba Cloud | 17 | 33,000 | $0.19 | 232.8 | 0.91s |
Qwen3 0.6B (Reasoning) Qwen3 0.6B (Reasoning) | Alibaba | Alibaba Cloud | 23 | 33,000 | $0.40 | 229.3 | 0.94s |
Qwen3 1.7B Qwen3 1.7B | Alibaba | Alibaba Cloud | 25 | 33,000 | $0.19 | 141.2 | 0.92s |
Qwen3 1.7B (Reasoning) Qwen3 1.7B (Reasoning) | Alibaba | Alibaba Cloud | 38 | 33,000 | $0.40 | 138.8 | 0.94s |
Qwen3 14B Qwen3 14B | Alibaba | Alibaba Cloud | 41 | 131,000 | $0.61 | 66.5 | 1.17s |
Qwen3 14B (Reasoning) Base Qwen3 14B (Reasoning) | Alibaba | Nebius (Base) | 56 | 33,000 | $0.12 | 85.8 | 0.52s |
Qwen3 14B (Reasoning) (FP8) Qwen3 14B (Reasoning) | Alibaba | Deepinfra (FP8) | 56 | 128,000 | $0.12 | 75.0 | 0.51s |
Qwen3 14B (Reasoning) Qwen3 14B (Reasoning) | Alibaba | Alibaba Cloud | 56 | 131,000 | $1.31 | 65.6 | 1.06s |
Qwen3 235B Qwen3 235B | Alibaba | GMI | 47 | 41,000 | $0.40 | 64.4 | 0.62s |
Qwen3 235B Qwen3 235B | Alibaba | Alibaba Cloud | 47 | 131,000 | $1.23 | 42.4 | 1.28s |
Qwen3 235B (Reasoning) (FP8) Qwen3 235B (Reasoning) | Alibaba | Parasail (FP8) | 62 | 41,000 | $0.35 | 57.3 | 0.45s |
Qwen3 235B (Reasoning) Base Qwen3 235B (Reasoning) | Alibaba | Nebius (Base) | 62 | 33,000 | $0.30 | 50.2 | 0.56s |
Qwen3 235B (Reasoning) Qwen3 235B (Reasoning) | Alibaba | Fireworks | 62 | 128,000 | $0.10 | 79.0 | 0.84s |
Qwen3 235B (Reasoning) (FP8) Qwen3 235B (Reasoning) | Alibaba | Deepinfra (FP8) | 62 | 41,000 | $0.30 | 15.2 | 0.44s |
Qwen3 235B (Reasoning) (FP8) Qwen3 235B (Reasoning) | Alibaba | Novita (FP8) | 62 | 128,000 | $0.35 | 17.3 | 0.66s |
Qwen3 235B (Reasoning) Qwen3 235B (Reasoning) | Alibaba | GMI | 62 | 41,000 | $0.40 | 60.7 | 0.60s |
Qwen3 235B (Reasoning) (FP8) Qwen3 235B (Reasoning) | Alibaba | Together.ai (FP8) | 62 | 41,000 | $0.30 | 38.6 | 0.37s |
Qwen3 235B (Reasoning) (FP8) Qwen3 235B (Reasoning) | Alibaba | kluster.ai (FP8) | 62 | 41,000 | $0.61 | 41.5 | 0.49s |
Qwen3 235B (Reasoning) Qwen3 235B (Reasoning) | Alibaba | Alibaba Cloud | 62 | 131,000 | $2.63 | 42.4 | 1.14s |
Qwen3 30B A3B Qwen3 30B A3B | Alibaba | Alibaba Cloud | 43 | 131,000 | $0.35 | 49.2 | 1.12s |
Qwen3 30B A3B (Reasoning) (FP8) Qwen3 30B A3B (Reasoning) | Alibaba | Parasail (FP8) | 56 | 41,000 | $0.20 | 158.7 | 0.39s |
Qwen3 30B A3B (Reasoning) Fast Qwen3 30B A3B (Reasoning) | Alibaba | Nebius (Fast) | 56 | 33,000 | $0.45 | 139.2 | 0.51s |
Qwen3 30B A3B (Reasoning) Base Qwen3 30B A3B (Reasoning) | Alibaba | Nebius (Base) | 56 | 33,000 | $0.15 | 124.9 | 0.49s |
Qwen3 30B A3B (Reasoning) Qwen3 30B A3B (Reasoning) | Alibaba | Fireworks | 56 | 131,000 | $0.90 | 167.9 | 0.38s |
Qwen3 30B A3B (Reasoning) (FP8) Qwen3 30B A3B (Reasoning) | Alibaba | Deepinfra (FP8) | 56 | 41,000 | $0.15 | 112.1 | 0.19s |
Qwen3 30B A3B (Reasoning) (FP8) Qwen3 30B A3B (Reasoning) | Alibaba | Novita (FP8) | 56 | 128,000 | $0.19 | 162.4 | 0.71s |
Qwen3 30B A3B (Reasoning) Qwen3 30B A3B (Reasoning) | Alibaba | Alibaba Cloud | 56 | 131,000 | $0.75 | 48.5 | 1.21s |
Qwen3 32B (FP8) Qwen3 32B | Alibaba | Parasail (FP8) | 44 | 41,000 | $0.20 | 53.9 | 0.45s |
Qwen3 32B Qwen3 32B | Alibaba | Cerebras | 44 | 128,000 | $0.50 | 2359.6 | 0.24s |
Qwen3 32B Base Qwen3 32B | Alibaba | Nebius (Base) | 44 | 33,000 | $0.15 | 46.5 | 0.57s |
Qwen3 32B Fast Qwen3 32B | Alibaba | Nebius (Fast) | 44 | 33,000 | $0.30 | 199.0 | 0.52s |
Qwen3 32B (FP8) Qwen3 32B | Alibaba | Novita (FP8) | 44 | 41,000 | $0.19 | 45.5 | 1.08s |
Qwen3 32B (FP8) Qwen3 32B | Alibaba | GMI (FP8) | 44 | 33,000 | $0.23 | 53.8 | 0.81s |
Qwen3 32B Qwen3 32B | Alibaba | Groq | 44 | 131,000 | $0.36 | 616.0 | 0.14s |
Qwen3 32B Qwen3 32B | Alibaba | SambaNova | 44 | 33,000 | $0.50 | 344.8 | 0.37s |
Qwen3 32B Qwen3 32B | Alibaba | Alibaba Cloud | 44 | 131,000 | $1.23 | 62.2 | 1.10s |
Qwen3 32B (Reasoning) (FP8) Qwen3 32B (Reasoning) | Alibaba | Parasail (FP8) | 59 | 41,000 | $0.20 | 53.5 | 0.48s |
Qwen3 32B (Reasoning) Qwen3 32B (Reasoning) | Alibaba | Cerebras | 59 | 41,000 | $0.50 | 2496.1 | 0.24s |
Qwen3 32B (Reasoning) Base Qwen3 32B (Reasoning) | Alibaba | Nebius (Base) | 59 | 33,000 | $0.15 | 46.1 | 0.57s |
Qwen3 32B (Reasoning) (FP8) Qwen3 32B (Reasoning) | Alibaba | Deepinfra (FP8) | 59 | 41,000 | $0.15 | 42.0 | 0.57s |
Qwen3 32B (Reasoning) (FP8) Qwen3 32B (Reasoning) | Alibaba | Novita (FP8) | 59 | 128,000 | $0.19 | 44.0 | 1.04s |
Qwen3 32B (Reasoning) (FP8) Qwen3 32B (Reasoning) | Alibaba | GMI (FP8) | 59 | 33,000 | $0.23 | 53.4 | 0.71s |
Qwen3 32B (Reasoning) Qwen3 32B (Reasoning) | Alibaba | Groq | 59 | 131,000 | $0.36 | 627.3 | 0.14s |
Qwen3 32B (Reasoning) Qwen3 32B (Reasoning) | Alibaba | SambaNova | 59 | 33,000 | $0.50 | 265.6 | 0.43s |
Qwen3 32B (Reasoning) Qwen3 32B (Reasoning) | Alibaba | Alibaba Cloud | 59 | 131,000 | $2.63 | 60.6 | 1.06s |
Qwen3 4B Qwen3 4B | Alibaba | Alibaba Cloud | 35 | 131,000 | $0.19 | 106.4 | 0.99s |
Qwen3 4B (Reasoning) Fast Qwen3 4B (Reasoning) | Alibaba | Nebius (Fast) | 47 | 33,000 | $0.12 | 157.6 | 0.47s |
Qwen3 4B (Reasoning) (FP8) Qwen3 4B (Reasoning) | Alibaba | Novita (FP8) | 47 | 128,000 | $0.00 | 71.4 | 0.72s |
Qwen3 4B (Reasoning) Qwen3 4B (Reasoning) | Alibaba | Alibaba Cloud | 47 | 131,000 | $0.40 | 104.8 | 1.09s |
Qwen3 8B Qwen3 8B | Alibaba | Alibaba Cloud | 37 | 131,000 | $0.31 | 100.2 | 0.98s |
Qwen3 8B (Reasoning) (FP8) Qwen3 8B (Reasoning) | Alibaba | Novita (FP8) | 51 | 128,000 | $0.06 | 55.2 | 0.80s |
Qwen3 8B (Reasoning) Qwen3 8B (Reasoning) | Alibaba | Alibaba Cloud | 51 | 131,000 | $0.66 | 98.8 | 0.99s |
QwQ 32B-Preview QwQ 32B-Preview | Alibaba | Deepinfra | 43 | 33,000 | $0.14 | 32.7 | 0.35s |
QwQ 32B-Preview QwQ 32B-Preview | Alibaba | Together.ai | 43 | 33,000 | $1.20 | 63.1 | 0.70s |
QwQ-32B QwQ-32B | Alibaba | Hyperbolic | 58 | 131,000 | $0.20 | 142.8 | 1.07s |
QwQ-32B Fast QwQ-32B | Alibaba | Nebius (Fast) | 58 | 131,000 | $0.75 | 85.7 | 0.51s |
QwQ-32B Base QwQ-32B | Alibaba | Nebius (Base) | 58 | 131,000 | $0.23 | 52.3 | 0.55s |
QwQ-32B QwQ-32B | Alibaba | Fireworks | 58 | 131,000 | $0.90 | 179.5 | 0.45s |
QwQ-32B QwQ-32B | Alibaba | Deepinfra | 58 | 131,000 | $0.09 | 47.9 | 0.26s |
QwQ-32B QwQ-32B | Alibaba | Groq | 58 | 131,000 | $0.32 | 411.6 | 0.23s |
QwQ-32B QwQ-32B | Alibaba | Together.ai | 58 | 131,000 | $1.20 | 62.9 | 0.52s |
Reka Flash 3 Reka Flash 3 | Reka AI | Reka AI | 47 | 128,000 | $0.35 | 55.9 | 1.25s |
Solar Pro 2 Solar Pro 2 | Upstage | Upstage | 45 | 64,000 | $0.00 | 112.8 | 1.82s |
Solar Pro 2 (Reasoning) Solar Pro 2 (Reasoning) | Upstage | Upstage | 51 | 64,000 | $0.00 | 104.7 | 1.67s |
Sonar Sonar | Perplexity | Perplexity | 43 | 127,000 | $1.00 | 172.4 | 1.83s |
Sonar Pro Sonar Pro | Perplexity | Perplexity | 43 | 200,000 | $6.00 | 154.3 | 1.89s |
This interactive tool compares LLM providers and models on the metrics shown in the table: quality index, context window, price per million tokens, output speed, and latency.
Data is sourced from artificialanalysis.ai and is updated regularly.
Use the filters and chart configuration options to customize your view and find the LLM that best fits your needs.
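
The quality and price filters can also be approximated offline against rows copied from the table. The following is a minimal sketch, not the tool's own implementation: the names `ModelRow`, `parse_row`, and `filter_rows` are hypothetical, and it assumes each row keeps the eight pipe-separated columns shown in the table header.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    # Field names are illustrative; they mirror the table's eight columns.
    name: str
    creator: str
    provider: str
    quality: int
    context_window: int
    price_per_1m: float
    speed_tps: float
    latency_s: float


def parse_row(line: str) -> ModelRow:
    """Parse one pipe-delimited table row into a ModelRow."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    name, creator, provider, quality, ctx, price, speed, latency = cells
    return ModelRow(
        name=name,
        creator=creator,
        provider=provider,
        quality=int(quality),
        context_window=int(ctx.replace(",", "")),  # "128,000" -> 128000
        price_per_1m=float(price.lstrip("$")),     # "$0.75"   -> 0.75
        speed_tps=float(speed),
        latency_s=float(latency.rstrip("s")),      # "0.31s"   -> 0.31
    )


def filter_rows(rows, min_quality=0, max_price=float("inf")):
    """Keep rows at or above min_quality and at or below max_price."""
    return [
        r for r in rows
        if r.quality >= min_quality and r.price_per_1m <= max_price
    ]


# Four rows copied verbatim from the table above.
SAMPLE = """\
Magistral Small Magistral Small | Mistral | Mistral | 55 | 40,000 | $0.75 | 196.6 | 0.31s |
o3 o3 | OpenAI | OpenAI | 70 | 128,000 | $3.50 | 199.5 | 12.85s |
Qwen3 32B (Reasoning) Qwen3 32B (Reasoning) | Alibaba | Groq | 59 | 131,000 | $0.36 | 627.3 | 0.14s |
Mistral 7B Mistral 7B | Mistral | Deepinfra | 10 | 8,000 | $0.04 | 102.6 | 0.20s |
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

# Quality index of at least 50 and at most $1.00 per million tokens,
# listed from lowest to highest latency.
picked = filter_rows(rows, min_quality=50, max_price=1.00)
for r in sorted(picked, key=lambda r: r.latency_s):
    print(f"{r.provider}: quality {r.quality}, ${r.price_per_1m:.2f}/1M, {r.latency_s}s")
```

Sorting the filtered rows by latency is one arbitrary choice; sorting by `price_per_1m` or `speed_tps` works the same way when cost or throughput matters more.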