Qwen: Qwen3.6 Flash
qwen3.6-flash
chat, Qwen
Quick Reference
- Input: Text, Image
- Output: Text
- Context: 1M
- Max Output: 65.5K
- Input Price: $0.18/M
- Output Price: $1.06/M
- Author: Alibaba
- Version: main
- Open Source: Yes
Overview
Qwen3.6 Flash is a native vision-language model with significantly improved performance over Qwen3.5-Flash. It focuses on agentic coding, where it substantially surpasses previous generations on multiple code-agent benchmarks, along with mathematical reasoning and code reasoning. On the vision side, spatial intelligence is markedly stronger, with especially notable gains in object localization and detection.
Input modalities
Text, Image
Output modalities
Text
Capabilities
chat, reasoning, vision
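Since the model accepts image input, a request needs a message that carries both text and an image. The sketch below builds such a message in the content-parts format used by OpenAI-style chat APIs. It assumes this endpoint accepts `image_url` parts with base64 data URLs, which this page does not confirm; `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user message containing a text part and an inline image part.

    Assumption: the endpoint accepts OpenAI-style content parts and
    base64 data URLs for images.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

The returned dict can be placed in the `messages` list of a chat completion request in place of a plain-text user message.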
Features
Function Calling
Structured Output
Caching
Prefix Completion
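As a rough illustration of the first two features, the sketch below builds request payloads for function calling and for JSON output using the parameter names from the OpenAI chat completions format (`tools`, `tool_choice`, `response_format`). Whether this endpoint accepts exactly these parameters is an assumption, and the `get_weather` tool and both helper functions are hypothetical.

```python
import json

# Hypothetical tool definition in the OpenAI-style "tools" schema.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(user_prompt: str) -> dict:
    """Payload that lets the model decide whether to call get_weather."""
    return {
        "model": "qwen3.6-flash",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [GET_WEATHER_TOOL],
        "tool_choice": "auto",
    }


def build_json_request(user_prompt: str) -> dict:
    """Payload that asks for a JSON object as the reply (assumed JSON mode)."""
    return {
        "model": "qwen3.6-flash",
        "messages": [
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": user_prompt},
        ],
        "response_format": {"type": "json_object"},
    }


if __name__ == "__main__":
    print(json.dumps(build_tool_request("Weather in Hangzhou?"), indent=2))
```

With the OpenAI Python client, either payload could be sent as `client.chat.completions.create(**payload)`.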
Pricing
Per-token prices for Qwen: Qwen3.6 Flash.
Tier: input <= 128K tokens
| Token Type | Price (USD) | Unit |
|---|---|---|
| Input | $0.18 | per million tokens |
| Output | $1.06 | per million tokens |
| Cache Read | $0.02 | per million tokens |
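To make the per-million pricing concrete, here is a minimal cost estimator using the three listed rates. It covers only the tier shown on this page (input up to 128K tokens). It assumes cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate; `estimate_cost` is a hypothetical helper.

```python
# USD per million tokens, copied from the pricing table on this page.
PRICE_PER_MILLION = {"input": 0.18, "output": 1.06, "cache_read": 0.02}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache and is charged at the cache-read rate rather than the input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input tokens and 2K output tokens, nothing cached.
    print(f"${estimate_cost(100_000, 2_000):.5f}")
```

For example, 100,000 input tokens and 2,000 output tokens come to 100,000 × $0.18/M + 2,000 × $1.06/M = $0.018 + $0.00212 = $0.02012.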
Specifications
Context Window
1M tokens
Max Input
934K tokens
Max Output
65.5K tokens
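The input and output limits add up to roughly the context window (934K + 65.5K = 999.5K). A client can check a request against these limits before sending it, as in the sketch below. The exact token counts are assumptions, because the page lists only rounded figures, and `clamp_max_tokens` is a hypothetical helper.

```python
# Assumed exact limits: the page gives only the rounded values 934K and 65.5K.
MAX_INPUT_TOKENS = 934_000
MAX_OUTPUT_TOKENS = 65_500


def clamp_max_tokens(prompt_tokens: int, requested_output: int) -> int:
    """Return a max_tokens value that respects the model's output limit.

    Raises ValueError if the prompt alone exceeds the input limit.
    """
    if prompt_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"prompt has {prompt_tokens} tokens; limit is {MAX_INPUT_TOKENS}"
        )
    return max(0, min(requested_output, MAX_OUTPUT_TOKENS))
```

The result can be passed as `max_tokens` in a chat completion request. Counting `prompt_tokens` accurately requires the model's own tokenizer, which is outside the scope of this sketch.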
API Reference
OpenAI-compatible endpoint at https://api.inferoute.ai/v1.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.inferoute.ai/v1",
    api_key=os.environ.get("INFEROUTE_API_KEY"),
)

try:
    response = client.chat.completions.create(
        model="qwen3.6-flash",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write a haiku about recursion."},
        ],
        max_tokens=512,
        temperature=0.7,
    )
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Error: {e}")