MiniMax: MiniMax M2.5

minimax-m2.5

chat, MiniMax

Quick Reference

Input: Text
Output: Text

Context: 200K
Max Output: 131K

Input Price: $0.31/M
Output Price: $1.24/M

Author: MiniMax
Version: main
Open Source: No

Overview

MiniMax M2.5 is built for agentic workloads, which MiniMax calls "Agent 2.0". It extends coding into real-world workspaces, entertainment, and personal assistance. MiniMax describes it as a state-of-the-art (SOTA) open-source coding and agent model and reports the following: SWE-bench Pro and SWE-bench Verified scores that surpass Opus 4.6; SOTA results on Excel, search and research, and document summarization tasks; generation at more than 100 tokens per second (TPS), about 3x the speed of Opus, with more efficient thinking; and cost-performance aimed at always-on agents.

Input modalities

Text

Output modalities

Text

Capabilities

chat, reasoning

Features

Function Calling

Structured Output

Caching

Pricing

Prices for MiniMax M2.5 by token type, quoted per million tokens.

Token Type	Price (USD)	Unit
Input	$0.31	per million tokens
Output	$1.24	per million tokens
Cache Read	$0.03	per million tokens
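
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. It assumes that cached prompt tokens are billed at the Cache Read rate instead of the Input rate, which this page does not state; `estimate_cost` and its parameters are hypothetical names, not part of any API.

```python
from decimal import Decimal

# USD per million tokens, copied from the pricing table.
PRICE_PER_MILLION = {
    "input": Decimal("0.31"),
    "output": Decimal("1.24"),
    "cache_read": Decimal("0.03"),
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from the
    cache and billed at the cache-read rate rather than the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total_per_million = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total_per_million / 1_000_000


# 100K prompt tokens (half of them cached) plus 20K generated tokens.
print(estimate_cost(100_000, 20_000, cached_tokens=50_000))
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift when summed over many calls.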

Specifications

Context Window

200K tokens

Max Input

69K tokens

Max Output

131K tokens
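
The three limits appear to be related: 200K minus 131K leaves 69K, which suggests that the maximum input is whatever remains of the context window once the full output allowance is reserved. The sketch below checks a request against these limits before it is sent. It assumes "K" means 1,000 tokens (the exact limits may differ) and that the prompt size has already been counted; `clamp_max_tokens` is a hypothetical helper.

```python
# Limits from the Specifications section. Assumption: "K" is read as 1,000.
CONTEXT_WINDOW = 200_000
MAX_INPUT = 69_000
MAX_OUTPUT = 131_000


def clamp_max_tokens(prompt_tokens: int, requested_max_tokens: int) -> int:
    """Return a max_tokens value that respects the listed limits.

    Raises ValueError if the prompt alone exceeds the maximum input.
    """
    if prompt_tokens < 0 or requested_max_tokens <= 0:
        raise ValueError("prompt_tokens must be >= 0 and requested_max_tokens > 0")
    if prompt_tokens > MAX_INPUT:
        raise ValueError(f"prompt has {prompt_tokens} tokens, limit is {MAX_INPUT}")

    # With these numbers the remaining context never drops below MAX_OUTPUT,
    # but the check is kept in case the limits change.
    remaining_context = CONTEXT_WINDOW - prompt_tokens
    return min(requested_max_tokens, MAX_OUTPUT, remaining_context)


print(clamp_max_tokens(60_000, 150_000))  # capped at the output limit
```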

API Reference

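The model is served through an OpenAI-compatible endpoint at https://api.inferoute.ai/v1. The Python example below uses the openai client library and reads the API key from the INFEROUTE_API_KEY environment variable.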

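import os
from openai import OpenAI, OpenAIError

# Point the standard OpenAI client at the OpenAI-compatible endpoint.
# os.environ[...] raises KeyError right away if the key is not set,
# instead of passing None to the client.
client = OpenAI(
    base_url="https://api.inferoute.ai/v1",
    api_key=os.environ["INFEROUTE_API_KEY"],
)

try:
    response = client.chat.completions.create(
        model="minimax-m2.5",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write a haiku about recursion."},
        ],
        max_tokens=512,   # upper bound on generated tokens
        temperature=0.7,  # sampling temperature
    )

    # The reply text is in the first choice's message.
    print(response.choices[0].message.content)
except OpenAIError as e:
    # Base class for errors raised by the openai client (network, auth, API).
    print(f"Error: {e}")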
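
The Function Calling feature listed under Features can presumably be used through the same request shape. The following is a minimal sketch, assuming the endpoint accepts the standard OpenAI `tools` parameter and returns tool calls in the usual format; this page does not confirm either. `get_weather`, `build_request`, and `dispatch_tool_call` are hypothetical names introduced for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Tool description in the OpenAI function-calling schema (JSON Schema parameters).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
HANDLERS = {"get_weather": get_weather}


def build_request(user_message: str) -> dict:
    """Build keyword arguments for client.chat.completions.create."""
    return {
        "model": "minimax-m2.5",
        "messages": [{"role": "user", "content": user_message}],
        "tools": TOOLS,
        "max_tokens": 512,
    }


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the local handler for a tool call whose arguments arrive as a JSON string."""
    if name not in HANDLERS:
        raise ValueError(f"unknown tool: {name}")
    return HANDLERS[name](**json.loads(arguments_json))


print(dispatch_tool_call("get_weather", '{"city": "Oslo"}'))
```

With the client from the example above, the request would be sent as `client.chat.completions.create(**build_request("What is the weather in Oslo?"))`. If the reply contains `response.choices[0].message.tool_calls`, each entry's `function.name` and `function.arguments` can be passed to `dispatch_tool_call`.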