October 2025

Alibaba Cloud

🧠
Qwen3 235B flagship MoE
235B MoE flagship, hybrid reasoning, 22B active
235B (22B active)
Unknown*
300-500ms
Free (open-source)
v3.0
2025-08-20
🔀
Qwen3 30B MoE
30B MoE, 3.3B active, hybrid reasoning
30B (3.3B active)
Unknown*
150-250ms
Free (open-source)
v3.0
2025-08-20
⚖️
Qwen3 32B
Balanced dense model, hybrid reasoning
32B
Unknown*
180-300ms
Free (open-source)
v3.0
2025-08-20
⚡️
Qwen3 14B
Fast dense model, hybrid reasoning
14B
Unknown*
100-180ms
Free (open-source)
v3.0
2025-08-20
⚡️
Qwen3 8B
Efficient dense model, hybrid reasoning
8B
Unknown*
80-140ms
Free (open-source)
v3.0
2025-08-20
⚡️
Qwen3 4B
Lightweight dense model, hybrid reasoning
4B
Unknown*
60-120ms
Free (open-source)
v3.0
2025-08-20
⚡️
Qwen3 1.7B
Ultra-lightweight, hybrid reasoning
1.7B
Unknown*
40-80ms
Free (open-source)
v3.0
2025-08-20
⚡️
Qwen3 0.6B
Smallest model, hybrid reasoning
0.6B
Unknown*
20-50ms
Free (open-source)
v3.0
2025-08-20

Amazon

💡
Nova Pro multimodal
Highly capable multimodal model
Unknown*
Unknown*
300-500ms
$0.80/$3.20 per 1M tokens
v1
2024-12-03
⚡️
Nova Lite
Fast and cost-effective
Unknown*
Unknown*
100-200ms
$0.06/$0.24 per 1M tokens
v1
2024-12-03
⚡️
Nova Micro
Ultra-fast text-only model
Unknown*
Unknown*
50-100ms
$0.035/$0.14 per 1M tokens
v1
2024-12-03

Anthropic

🧠
Claude 3.5 Sonnet BEST
Top performance, computer use, 200k context
Unknown*
Unknown*
250-400ms
$3/$15 per 1M tokens
v3.5
2024-10-22
⚡️
Claude 3.5 Haiku
Fast, affordable, improved performance
Unknown*
Unknown*
80-150ms
$0.80/$4 per 1M tokens
v3.5
2024-11-04
🧠
Claude 4.1 Opus flagship
Most advanced Claude, 74.5% SWE-bench, 200k context
Unknown*
Unknown*
300-500ms
$15/$75 per 1M tokens
v4.1
2025-08-05
⚖️
Claude 4.5 Sonnet LATEST
30+ hour autonomous coding, state-of-the-art SWE-bench
Unknown*
Unknown*
200-350ms
$3/$15 per 1M tokens
v4.5
2025-09-29
⚖️
Claude 4 Sonnet
Balanced performance, high volume applications
Unknown*
Unknown*
200-350ms
$3/$15 per 1M tokens
v4.0
2025-07-15
⚡️
Claude 4 Haiku
Lightweight, fast and efficient
Unknown*
Unknown*
60-120ms
$0.50/$2.50 per 1M tokens
v4.0
2025-07-15
🎨
Claude 3 Opus
Most capable Claude 3, 200k context
Unknown*
Unknown*
300-500ms
$15/$75 per 1M tokens
v3
2024-03-04

Cohere

🧠
Command A LATEST
Most advanced Cohere model, 256k context, 150% faster
Unknown*
Unknown*
150-250ms
API pricing
vA
2025-03-15
💎
Command A Reasoning
Cohere's first reasoning model, thinks before producing output
Unknown*
Unknown*
Variable (reasoning)
API pricing
vA
2025-08-15
👁️
Command A Vision
Cohere's first vision model, analyzes charts & tables
Unknown*
Unknown*
200-400ms
API pricing
vA
2025-07-15
🌍
Command A Translate
State-of-the-art translation, 23 languages
Unknown*
Unknown*
100-200ms
API pricing
vA
2025-08-15
🔓
Aya 23 open source
Open-source multilingual, 8B & 35B variants
8B/35B
Unknown*
100-300ms
Free (open-source)
v23
2025-05-15
🤝
Command R+ RAG
Enterprise RAG, 128k context
104B
Unknown*
220-350ms
$2.50/$10 per 1M tokens
v08-2024
2024-08-01
⚖️
Command R
Balanced RAG model
35B
Unknown*
180-280ms
$0.15/$0.60 per 1M tokens
v08-2024
2024-08-01

DeepSeek

🧠
DeepSeek-V3 MoE
671B MoE flagship, 37B active
671B (37B active)
Unknown*
300-500ms
$0.27/$1.10 per 1M tokens
v3
2024-12-26
💻
DeepSeek-Coder-V2 coding
236B MoE for coding, 21B active
236B (21B active)
Unknown*
200-350ms
$0.14/$0.28 per 1M tokens
v2
2024-06-17
💎
DeepSeek-R1 reasoning
Advanced reasoning with RL
671B (37B active)
Unknown*
Variable (reasoning)
$0.55/$2.19 per 1M tokens
v1
2025-01-20

Google

🧠
Gemini 2.5 Pro LATEST
Most capable Gemini, 1M context, enhanced reasoning
Unknown*
Unknown*
250-400ms
$1.25/$5 per 1M tokens
v2.5
2025-09-15
⚡️
Gemini 2.5 Flash
Fast multimodal, 1M context, 20-30% fewer tokens
Unknown*
Unknown*
120-250ms
$0.075/$0.30 per 1M tokens
v2.5
2025-09-15
🎨
Gemini 2.5 Flash Image (Nano Banana)
Advanced image generation & editing model
Unknown*
Unknown*
200-400ms
API pricing
v2.5
2025-08-15
🧠
Gemini 1.5 Pro 2M context
Flagship model with 2 million token context
Unknown*
Unknown*
300-500ms
$1.25/$5 per 1M tokens
v1.5
2024-02-15
⚡️
Gemini 1.5 Flash
Fast multimodal, 1M context
Unknown*
Unknown*
150-300ms
$0.075/$0.30 per 1M tokens
v1.5
2024-05-14
⚡️
Gemini 1.5 Flash-8B
Ultra-fast, cost-effective, 1M context
8B
Unknown*
80-150ms
$0.0375/$0.15 per 1M tokens
v1.5
2024-10-03

Meta

🧠
Llama 3.1 405B open source
Largest open model, 128k context, multilingual
405B
Unknown*
400-700ms
Free (open-source)
v3.1
2024-07-23
⚖️
Llama 3.1 70B
Balanced open model, 128k context
70B
Unknown*
200-350ms
Free (open-source)
v3.1
2024-07-23
⚡️
Llama 3.1 8B
Efficient open model, 128k context
8B
Unknown*
80-150ms
Free (open-source)
v3.1
2024-07-23
👁️
Llama 3.2 90B Vision
Multimodal with vision capabilities
90B
Unknown*
250-400ms
Free (open-source)
v3.2
2024-09-25
👁️
Llama 3.2 11B Vision
Lightweight multimodal model
11B
Unknown*
100-180ms
Free (open-source)
v3.2
2024-09-25
⚡️
Llama 3.2 3B
Ultra-lightweight for edge devices
3B
Unknown*
50-100ms
Free (open-source)
v3.2
2024-09-25
⚡️
Llama 3.2 1B
Smallest model for mobile/edge
1B
Unknown*
30-80ms
Free (open-source)
v3.2
2024-09-25

Microsoft

🧠
MAI-1-preview LATEST
Microsoft's new proprietary text model series
Unknown*
Unknown*
200-400ms
Azure pricing
v1-preview
2025-09-15
🎵
MAI-Voice-1 audio
Generates 1 minute of audio in under 1 second on a single GPU
Unknown*
Unknown*
Ultra-fast
Azure pricing
v1
2025-09-15
🧠
Phi-4 small LLM
14B model rivaling larger models
14B
Unknown*
60-120ms
Azure pricing
v4
2024-12-12

Mistral AI

🧠
Mistral Medium 3.1 LATEST
Advanced reasoning, coding & multimodal, 128k context
Unknown*
Unknown*
200-350ms
API pricing
v3.1
2025-08-15
⚡️
Mistral Small 3.2 fast
Improved instruction following, 128k context
24B
Unknown*
100-200ms
API pricing
v3.2
2025-06-04
⚡️
Mistral Small 3.1
Multimodal processing, 150 tokens/sec
24B
Unknown*
120-220ms
API pricing
v3.1
2025-03-17
👁️
Pixtral Large vision
Multimodal flagship with 128k context
124B
Unknown*
300-450ms
$2/$6 per 1M tokens
v1
2024-11-13
⚡️
Pixtral 12B
Multimodal model, open-source
12B
Unknown*
120-200ms
Free (open-source)
v1
2024-09-17

OpenAI

🧠
GPT-5 LATEST
Most advanced GPT model, next-generation capabilities
Unknown*
Unknown*
200-350ms
$5/$20 per 1M tokens
v5.0
2025-08-07
🧠
GPT-4.1 1M context
1M context window, 8x larger than GPT-4o
Unknown*
Unknown*
300-500ms
$3/$12 per 1M tokens
v4.1
2025-04-23
⚡️
GPT-4.5
Enhanced performance and accuracy
Unknown*
Unknown*
200-400ms
$2.50/$10 per 1M tokens
v4.5
2025-02-27
💎
o3 reasoning
Most powerful reasoning model, multimodal
Unknown*
Unknown*
Variable (reasoning)
$20/$80 per 1M tokens
v3.0
2025-01-31
⚡️
o4-mini
Fast reasoning model, cost-effective
Unknown*
Unknown*
Fast (reasoning)
$5/$20 per 1M tokens
v4.0
2025-01-31
🧠
GPT-4o multimodal
Flagship multimodal model, text, vision, audio
Unknown*
Unknown*
250-400ms
$2.50/$10 per 1M tokens
v4
2024-05-13
⚡️
GPT-4o mini
Fast and affordable multimodal model
Unknown*
Unknown*
100-200ms
$0.15/$0.60 per 1M tokens
v4
2024-07-18
💎
o1 reasoning
Advanced reasoning with chain-of-thought
Unknown*
Unknown*
Variable (reasoning)
$15/$60 per 1M tokens
v1
2024-09-12
⚡️
o1-mini
Fast reasoning model, 80% cheaper than o1
Unknown*
Unknown*
Fast (reasoning)
$3/$12 per 1M tokens
v1
2024-09-12
🔓
GPT-4o Realtime voice
Real-time voice conversation model
Unknown*
Unknown*
Real-time
$5/$20 per 1M tokens
v4
2024-10-01
🔓
GPT-OSS-120B open source
120B MoE open-source, Apache 2.0 license
120B (MoE, 4 active experts per token)
Unknown*
200-350ms
Free (open-source)
v1.0
2025-08-05
🔓
GPT-OSS-20B open source
20B MoE open-source, runs on 16GB RAM
20B (MoE, 4 active experts per token)
Unknown*
100-200ms
Free (open-source)
v1.0
2025-08-05

xAI

🚀
Grok 4 LATEST
Most advanced Grok, 256k context, multimodal
Unknown*
Unknown*
300-500ms
$3/$15 per 1M tokens
v4
2025-07-09
⚡️
Grok 4 Fast
Cost-efficient, 2M context, 40% fewer tokens
Unknown*
Unknown*
150-300ms
$0.20/$1.50 per 1M tokens
v4
2025-09-19
💻
Grok Code Fast 1 coding
Agentic coding, 70.8% SWE-bench, fast inference
Unknown*
Unknown*
100-200ms
$0.20/$1.50 per 1M tokens
v1
2025-08-28
👁️
Grok Vision
Multimodal vision model, image analysis
Unknown*
Unknown*
200-400ms
API pricing
v1
2025-04-24
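
The per-1M-token prices listed above can be turned into a rough per-request estimate. The following is a minimal sketch, not a billing tool. It assumes the first figure in each price string is the input price and the second is the output price, which the listing itself does not state. `parse_price` and `estimate_cost` are hypothetical helper names.

```python
import re


def parse_price(text):
    """Parse a string like '$3/$15 per 1M tokens' into (input_usd, output_usd).

    Assumption: the first figure is the input price and the second the
    output price, both per 1M tokens. Returns None for entries without
    numeric prices, such as 'API pricing' or 'Free (open-source)'.
    """
    match = re.fullmatch(r"\$([\d.]+)/\$([\d.]+) per 1M tokens", text.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def estimate_cost(price_text, input_tokens, output_tokens):
    """Estimate the USD cost of one request, or None if no price is listed."""
    prices = parse_price(price_text)
    if prices is None:
        return None
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: 2,000 input tokens and 500 output tokens at "$3/$15 per 1M tokens".
cost = estimate_cost("$3/$15 per 1M tokens", input_tokens=2_000, output_tokens=500)
print(f"${cost:.4f}")
```

Entries such as "API pricing" or "Free (open-source)" return `None`, so a caller can tell "no listed price" apart from a cost of zero.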

Here to simplify complexity.

Genotix

Summarization

Tests how well each model distills information.

Summarize the following article in 3 bullet points: […article text…]

Creative Writing

Tests creativity, coherence, and tone.

Write a short story about a child who befriends a robot on Mars, in the style of a bedtime fairy tale.

Information Q&A

Tests factual accuracy and explanatory clarity.

What are the main causes of climate change, and how do they impact ocean levels?

Code Generation

Tests coding ability and correctness.

Write a Python function that takes a list of numbers and returns the list sorted without using built-in sort.
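
When grading responses to this prompt, it can help to have one known-good answer to compare against. The sketch below uses merge sort, which is only one of several acceptable algorithms. The name `merge_sort` is illustrative and not part of the prompt.

```python
def merge_sort(numbers):
    """Return a new sorted list without calling sorted() or list.sort()."""
    if len(numbers) <= 1:
        return list(numbers)

    # Split the input in half and sort each half recursively.
    mid = len(numbers) // 2
    left = merge_sort(numbers[:mid])
    right = merge_sort(numbers[mid:])

    # Merge the two sorted halves by repeatedly taking the smaller head.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One half is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

A model's answer can then be checked by running both functions on the same inputs, including edge cases such as an empty list and repeated values.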

Code Debugging

Tests ability to reason about and fix code.

Here is a snippet of code and the error it produces. How can I fix it? […code snippet and error message…]

Customer Support Email

Tests tone control, empathy, and professionalism.

You are a customer service agent. Respond to this customer complaint in a polite tone: "I bought your product and it broke in two days. I'm very upset."

Translation

Tests multilingual capabilities and preservation of meaning/tone.

Translate this English paragraph into French and Chinese: […English paragraph…]

Idea Brainstorming

Tests creativity and practicality of suggestions.

I'm launching a new coffee shop. Give me 5 creative marketing ideas to attract college students.

Explanation (Tutoring)

Tests simplification skills and clarity.

Explain the concept of blockchain to a 12-year-old in a few sentences.

Roleplay/Conversation

Tests conversational ability, persuasiveness, and persona maintenance.

Act as a personal fitness coach. I haven't exercised in months; encourage me with a motivational plan in a friendly tone.

Structured Output

Tests ability to follow specific output format instructions.

Analyze this customer feedback and categorize the issues. Format your response as a JSON object with the following structure: {"positive_points": ["point1", "point2"], "negative_points": ["point1", "point2"], "suggestions": ["suggestion1", "suggestion2"]}
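
Whether a response actually follows the requested structure can be checked mechanically. The following is a minimal sketch using only the standard library. `validate_feedback_json` is a hypothetical helper, and rejecting extra keys is a strictness choice made here, not something the prompt requires.

```python
import json

EXPECTED_KEYS = ("positive_points", "negative_points", "suggestions")


def validate_feedback_json(raw):
    """Return (ok, reason) for a response expected to match the prompt's structure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc.msg}"

    if not isinstance(data, dict):
        return False, "top level is not a JSON object"

    # Each expected key must be present and hold a list of strings.
    for key in EXPECTED_KEYS:
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            return False, f"'{key}' must be a list of strings"

    # Assumption: keys beyond the three requested ones count as a format failure.
    extra = set(data) - set(EXPECTED_KEYS)
    if extra:
        return False, f"unexpected keys: {sorted(extra)}"

    return True, "ok"


sample = '{"positive_points": ["fast delivery"], "negative_points": [], "suggestions": ["add tracking"]}'
print(validate_feedback_json(sample))
```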

Step-by-Step Reasoning

Tests logical reasoning and problem-solving capabilities.

Solve this math problem step by step, explaining your reasoning at each stage: A store is offering a 25% discount on an item that originally costs $120. If there is also an 8% sales tax applied after the discount, what is the final price?
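
For checking model answers to this problem, the arithmetic can be worked through directly. The sketch below uses `Decimal` to avoid floating-point rounding noise. The variable names are illustrative.

```python
from decimal import Decimal

original_price = Decimal("120.00")
discount_rate = Decimal("0.25")
tax_rate = Decimal("0.08")

# Step 1: apply the 25% discount to the original price.
discounted_price = original_price * (1 - discount_rate)

# Step 2: apply the 8% sales tax to the discounted price.
final_price = (discounted_price * (1 + tax_rate)).quantize(Decimal("0.01"))

print(f"Discounted price: ${discounted_price.quantize(Decimal('0.01'))}")
print(f"Final price: ${final_price}")
```

The discount brings the price to $90.00, and the tax on that amount gives a final price of $97.20.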

System Prompt Engineering

Tests ability to follow system-level instructions and constraints.

You are an expert programming tutor specializing in Python. Your responses should: 1. Explain concepts clearly with simple examples, 2. Identify and correct errors in student code, 3. Follow educational best practices by guiding rather than solving, 4. Include explanatory comments in all code examples, 5. Reference Python 3.12 standards. Now help me understand how to implement a binary search algorithm.
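
A reference implementation is useful when judging whether a model's tutoring answer to this prompt is correct. The following is one conventional iterative version, offered as a sketch rather than as the expected answer. It assumes the input list is already sorted in ascending order.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1

    while low <= high:
        # Look at the middle element of the current search window.
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            # Target can only be in the upper half.
            low = mid + 1
        else:
            # Target can only be in the lower half.
            high = mid - 1

    # The window is empty, so the target is not in the list.
    return -1


print(binary_search([1, 3, 5, 7, 9], 7))
```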

Context-Aware Response

Tests ability to incorporate provided context into responses.

Context: I'm a high school physics teacher preparing materials for students who struggle with mathematical concepts. Many of my students have math anxiety but are interested in practical applications. Request: Create an explanation of Newton's Second Law (F=ma) that uses minimal mathematical notation while still conveying the core concept accurately.

Multimodal Reasoning

Tests ability to reason about and describe visual content.

Look at this image of a data visualization chart and explain what trends it shows. What conclusions can be drawn from this data? What might be missing or misleading about this presentation?

Chain of Thought

Tests ability to break down complex problems into logical steps.

Let's think through this problem step by step: A train leaves Station A at 3:00 PM traveling toward Station B at 60 mph. Another train leaves Station B at 4:30 PM traveling at 75 mph toward Station A. If the stations are 300 miles apart, at what time will the trains meet?
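
The expected answer can be worked out with a short calculation, which gives a baseline for checking each model's reasoning. This sketch assumes the two trains travel toward each other on the same line. The calendar date is an arbitrary placeholder, since only the time of day matters.

```python
from datetime import datetime, timedelta
from fractions import Fraction

distance_miles = 300
speed_a_mph = 60
speed_b_mph = 75

# Placeholder date; only the times of day come from the problem.
depart_a = datetime(2025, 1, 1, 15, 0)
depart_b = datetime(2025, 1, 1, 16, 30)

# Step 1: distance the first train covers before the second one departs.
head_start_hours = Fraction((depart_b - depart_a).seconds, 3600)
head_start_miles = speed_a_mph * head_start_hours

# Step 2: remaining gap, closed at the sum of the two speeds.
remaining_miles = distance_miles - head_start_miles
hours_to_meet = remaining_miles / (speed_a_mph + speed_b_mph)

# Step 3: add the closing time to the second train's departure.
meeting_time = depart_b + timedelta(seconds=float(hours_to_meet * 3600))
print(meeting_time.strftime("%H:%M:%S"))
```

The first train covers 90 miles in its 1.5-hour head start, leaving 210 miles to close at a combined 135 mph. That takes 1 hour 33 minutes 20 seconds, so the trains meet at about 6:03 PM.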

Iterative Refinement

Tests ability to improve outputs based on feedback.

Write a short product description for a new smartphone. After I review it, I'll provide feedback, and I want you to refine the description based on my comments.

Ethical Reasoning

Tests ability to navigate complex ethical scenarios with nuance.

Consider this ethical dilemma in AI development: A healthcare algorithm must allocate limited medical resources. What ethical frameworks should guide its design? Present multiple perspectives and discuss the tradeoffs involved.

Zero-Shot Prompting

Getting results without providing examples, relying on the model's pre-training.

Classify the sentiment of this text as positive, negative, or neutral: 'The product exceeded my expectations in every way.'

Few-Shot Prompting

Providing a few examples to guide the model's response pattern.

Convert these sentences to past tense:
Example: 'I walk to school' → 'I walked to school'
Example: 'She eats lunch' → 'She ate lunch'
Now convert: 'They play soccer'
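
A few-shot prompt like the one above is usually assembled from a list of example pairs instead of being written by hand. The following is a minimal sketch. `build_few_shot_prompt` is a hypothetical helper and does not call any model API.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and the new input into one prompt."""
    lines = [instruction]
    for source, target in examples:
        lines.append(f"Example: '{source}' → '{target}'")
    lines.append(f"Now convert: '{query}'")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Convert these sentences to past tense:",
    [("I walk to school", "I walked to school"), ("She eats lunch", "She ate lunch")],
    "They play soccer",
)
print(prompt)
```

Keeping the examples as data makes it easy to vary their number or order and see how the model's output changes.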

Role Prompting

Assigning a specific role or persona to shape the model's responses.

You are a senior software architect with 15 years of experience in distributed systems. Review this API design and provide recommendations for scalability and maintainability.

Constraints & Guardrails

Setting clear boundaries on what the model should and shouldn't do.

Explain quantum computing to a beginner. Constraints: Use only simple analogies, avoid mathematical formulas, keep it under 150 words, and don't use jargon without explaining it first.

Meta-Prompting

Having the model help design better prompts or reflect on its own process.

I want to ask an AI to help me plan a wedding. What information should I include in my prompt to get the most helpful response?

Prompt Chaining

Breaking complex tasks into sequential prompts where each builds on the previous.

Step 1: Identify the main themes in this article. Step 2: For each theme, find supporting evidence. Step 3: Create an outline for a response article addressing these themes.
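
In code, a prompt chain is a loop that passes each step's output into the next step's prompt. The sketch below stays independent of any provider: `call_model` is a placeholder for whatever client function is in use, and `fake_model` is a stand-in so the example runs without an API key.

```python
def run_chain(step_templates, call_model, initial_input):
    """Run prompts in order, feeding each step's output into the next step."""
    previous = initial_input
    outputs = []
    for template in step_templates:
        # Each template refers to the prior result through the {previous} slot.
        prompt = template.format(previous=previous)
        previous = call_model(prompt)
        outputs.append(previous)
    return outputs


def fake_model(prompt):
    """Stand-in for a real model call; it only echoes part of the prompt."""
    return f"[response to: {prompt[:30]}]"


steps = [
    "Identify the main themes in this article: {previous}",
    "For each theme, find supporting evidence: {previous}",
    "Create an outline for a response article addressing: {previous}",
]

for number, output in enumerate(run_chain(steps, fake_model, "…article text…"), start=1):
    print(f"Step {number}: {output}")
```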

Tree of Thoughts

Exploring multiple reasoning paths simultaneously before selecting the best one.

Consider three different approaches to solve this optimization problem. For each approach, outline the steps, identify potential issues, and estimate success probability. Then recommend the best approach.

Reflection & Self-Critique

Having the model review and improve its own outputs.

First, write a product description. Then, critique your description for clarity, persuasiveness, and completeness. Finally, write an improved version based on your critique.

Retrieval-Augmented Generation (RAG)

Providing relevant context or documents for the model to reference.

Context: [Insert relevant documentation here]. Based on this context, answer: How do we implement authentication in our system?
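
The retrieval step that fills in the context can be sketched with a toy scorer. The example below ranks documents by shared words with the question, which is a deliberate simplification. Real systems typically use embeddings or a search index. The documents and function names are invented for illustration.

```python
def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question (toy scoring)."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question, documents, top_k=2):
    """Insert the best-matching documents into the prompt as context."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Context: {context}\nBased on this context, answer: {question}"


# Hypothetical documentation snippets.
docs = [
    "Authentication uses OAuth tokens issued by the gateway.",
    "The cafeteria opens at nine.",
    "Deployments run every Tuesday.",
]
question = "How do we implement authentication in our system?"
print(build_rag_prompt(question, docs, top_k=1))
```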

Output Format Control

Specifying exact output format, often using JSON, XML, or markdown.

Extract key information and format as JSON with these fields: {"name": string, "date": ISO-8601, "priority": "high"|"medium"|"low", "tags": string[]}
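
A format specification like this is most useful when the response is checked against it. The following is a minimal sketch of such a check on an already-parsed record. `check_record` is a hypothetical helper, and `datetime.fromisoformat` is used as an approximate ISO-8601 test, since the set of ISO-8601 variants it accepts depends on the Python version.

```python
from datetime import datetime

PRIORITIES = {"high", "medium", "low"}


def check_record(record):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []

    if not isinstance(record.get("name"), str):
        problems.append("name must be a string")

    # Approximate ISO-8601 check; fromisoformat rejects malformed or non-string dates.
    try:
        datetime.fromisoformat(record.get("date", ""))
    except (TypeError, ValueError):
        problems.append("date must be ISO-8601")

    priority = record.get("priority")
    if not isinstance(priority, str) or priority not in PRIORITIES:
        problems.append("priority must be high, medium, or low")

    tags = record.get("tags")
    if not isinstance(tags, list) or not all(isinstance(tag, str) for tag in tags):
        problems.append("tags must be a list of strings")

    return problems


good = {"name": "Launch review", "date": "2025-10-01", "priority": "high", "tags": ["release"]}
print(check_record(good))
```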

Temperature & Creativity Control

Using prompt language to guide creativity vs. consistency.

For creative output: 'Brainstorm 10 wild and unconventional ideas for...' For consistent output: 'Provide the standard, industry-accepted method for...'

Negative Prompting

Explicitly stating what NOT to include in the response.

Explain blockchain technology. Do NOT use technical jargon, do NOT assume prior knowledge of cryptography, and do NOT include code examples.

Model Performance Levels

Fast & Efficient

Lightweight models optimized for speed and cost-effectiveness

Balanced

General-purpose models offering good performance for most tasks

High Performance

Advanced models with enhanced capabilities and accuracy

Flagship

Top-tier models with maximum performance and latest features

Specialized Capabilities

Reasoning Focused

Models with extended thinking time for complex problem-solving

Code Specialized

Models optimized for programming, debugging, and code generation

Vision & Analysis

Multimodal models for image understanding and data analysis

Search & RAG

Models optimized for retrieval-augmented generation and web search

GitHub Copilot

Pair programmer that helps you write better code

Cursor

AI-first code editor with pair programming capabilities

Windsurf

AI IDE by Codeium with agentic flows and contextual awareness

Replit Agent

AI that builds complete apps from natural language descriptions

Aider

AI pair programming in your terminal with git integration

Augment Code

AI coding assistant that learns your codebase

Ollama

Run large language models locally

v0.dev

Generative UI: generate interfaces from simple text prompts

n8n

Workflow automation platform to connect different services

Tabnine

AI code completion assistant for developers

Amazon CodeWhisperer

AI coding companion by AWS, provides code suggestions

Snyk

Developer security platform for finding and fixing vulnerabilities

Sourcery

AI-powered coding assistant for refactoring and improving code quality

Mintlify

AI-powered platform for creating beautiful and effective documentation

Pieces.app

AI-enabled productivity tool for developers to save, enrich, and reuse code snippets

Codeium

Free AI-powered toolkit for developers, with code completion and chat features

Qodo

AI code assistant with test generation and PR review capabilities

Bolt

AI-powered app builder with no-code capabilities

Lovable

AI-powered coding assistant for web development

Perplexity

AI-powered search engine with real-time information

Deep Research

AI research assistant for comprehensive information gathering

NotebookLM

AI-powered note-taking and research tool by Google

Canva Magic Studio

AI-powered design tools for creating professional graphics

Notion AI Q&A

AI-powered knowledge management and question answering

Gamma

AI-powered presentation creation tool

ElevenLabs

AI voice generation with natural-sounding results

Suno

AI music generation platform

Tidio AI

AI-powered chatbot for customer service

Continue

Open-source autopilot for VS Code and JetBrains

Manus AI

Autonomous AI agent that works across applications

Claude Artifacts

Interactive code and content generation by Anthropic

Poe

Access multiple AI models (GPT-4, Claude, etc.) in one place

ChatGPT Canvas

Collaborative workspace for writing and coding with GPT

Copilot Workspace

AI-native development environment by GitHub

OpenHands

Open-source AI software engineer (formerly OpenDevin)

Sweep AI

AI junior developer that writes code from GitHub issues

ChatGPT Search

Real-time web search with AI-powered answers

Claude Computer Use

AI that can control your computer to complete tasks

Midjourney

AI image generation from text prompts

RunwayML

AI video generation and editing tools

Mem0

Personalized AI memory layer for applications

LangChain

Framework for building LLM applications

LlamaIndex

Data framework for LLM applications

Vercel AI SDK

Build AI-powered products with streaming interfaces
