Qwen: Qwen3.6 Flash

qwen3.6-flash
chat, Qwen

Quick Reference

Input: Text, Image
Output: Text
Context: 1M
Max Output: 65.5K
Input Price: $0.18/M
Output Price: $1.06/M
Author: Alibaba
Version: main
Open Source: Yes

Overview

Qwen3.6 Flash is a native vision-language model with significantly improved performance over 3.5-Flash. It focuses on agentic coding (substantially surpassing previous generations on multiple code-agent benchmarks), mathematical reasoning, and code reasoning. On the vision side, spatial intelligence is markedly strengthened, with especially notable gains in object localization and detection.

Input modalities

Text, Image
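Because the model accepts both text and images, a single user message can carry both. The sketch below builds such a message, assuming the endpoint accepts OpenAI-style content parts with a base64 data URL. That format is an assumption drawn from the endpoint being described as OpenAI-compatible, not something this page confirms. `build_image_message` is a hypothetical helper name.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message that carries a text prompt plus an inline image.

    Assumption: the endpoint accepts OpenAI-style "image_url" content parts
    with a base64 data URL. Verify this against the provider's documentation.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

The returned dictionary can be placed in the `messages` list of a chat completion request in place of a plain-text user message.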

Output modalities

Text

Capabilities

chat, reasoning, vision

Features

Function Calling
Structured Output
Caching
Prefix Completion
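As an illustration of the Function Calling feature, the sketch below defines one tool in the OpenAI-style `tools` schema and a small dispatcher that executes a tool call. It assumes the endpoint accepts the standard `tools` parameter, which this page does not spell out. `get_weather`, `TOOLS`, `REGISTRY`, and `run_tool_call` are hypothetical names used only for this example.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical tool for illustration; returns a canned forecast."""
    return {"city": city, "forecast": "sunny"}


# Tool description in the OpenAI-style function-calling schema (assumed to be
# accepted by this endpoint because it is described as OpenAI-compatible).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return a forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one requested tool call and return its result as a JSON string."""
    if name not in REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return json.dumps(REGISTRY[name](**arguments))
```

With the OpenAI Python SDK, `TOOLS` would be passed as `tools=TOOLS` to `client.chat.completions.create`. Each entry in `response.choices[0].message.tool_calls` then exposes `function.name` and `function.arguments`, which can be handed to `run_tool_call`, and the result is sent back in a message with role `tool`.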

Pricing

Per-token prices for Qwen: Qwen3.6 Flash.

Input <= 128K

Token Type    Price      Unit
Input         $0.18/M    per million tokens
Output        $1.06/M    per million tokens
Cache Read    $0.02/M    per million tokens
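The listed prices can be turned into a rough per-request cost estimate. The sketch below makes three assumptions the page does not state: cached tokens are a subset of input tokens and are billed at the cache read rate instead of the input rate, the "128K" tier boundary means 128,000 tokens, and requests above that boundary are rejected because no price is listed for them here. `estimate_cost_usd` is a hypothetical helper name.

```python
# Prices in USD per million tokens, copied from the pricing table above.
PRICE_PER_MILLION_USD = {"input": 0.18, "output": 1.06, "cache_read": 0.02}

# Assumption: the "Input <= 128K" tier boundary is 128,000 tokens.
TIER_MAX_INPUT_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in the listed pricing tier.

    Assumption: cached_tokens is the part of input_tokens served from cache,
    billed at the cache read rate instead of the regular input rate.
    """
    if input_tokens > TIER_MAX_INPUT_TOKENS:
        raise ValueError("No price is listed here for inputs above the 128K tier")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_PER_MILLION_USD["input"]
        + cached_tokens * PRICE_PER_MILLION_USD["cache_read"]
        + output_tokens * PRICE_PER_MILLION_USD["output"]
    )
    return total / 1_000_000
```

For example, a request with 100,000 input tokens and 10,000 output tokens and no cache hits comes to about $0.0286 under these assumptions.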

Specifications

Context Window

1M tokens

Max Input

934K tokens

Max Output

65.5K tokens
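These limits can be checked on the client side before a request is sent. The sketch below treats the rounded figures above as exact token counts (1,000,000, 934,000, and 65,500), which is an assumption; the provider's precise limits may differ slightly. `clamp_max_tokens` is a hypothetical helper name.

```python
# Assumption: the rounded figures from the specifications, used as exact values.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_INPUT_TOKENS = 934_000
MAX_OUTPUT_TOKENS = 65_500


def clamp_max_tokens(input_tokens: int, requested_output_tokens: int) -> int:
    """Return a max_tokens value that respects the listed limits.

    Raises ValueError when the prompt itself exceeds the maximum input size.
    """
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"Prompt has {input_tokens} tokens; the listed maximum is {MAX_INPUT_TOKENS}"
        )
    # Space left in the context window once the prompt is accounted for.
    remaining = CONTEXT_WINDOW_TOKENS - input_tokens
    return max(0, min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining))
```

The returned value can be passed as `max_tokens` in a chat completion request. Counting `input_tokens` accurately requires the model's tokenizer, which is outside the scope of this sketch.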

API Reference

OpenAI-compatible endpoint at https://api.inferoute.ai/v1.

import os

from openai import OpenAI, OpenAIError

# Read the API key from the environment; a missing variable raises KeyError here.
client = OpenAI(
    base_url="https://api.inferoute.ai/v1",
    api_key=os.environ["INFEROUTE_API_KEY"],
)

try:
    response = client.chat.completions.create(
        model="qwen3.6-flash",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write a haiku about recursion."},
        ],
        max_tokens=512,
        temperature=0.7,
    )

    # The generated text is in the first choice's message.
    print(response.choices[0].message.content)
except OpenAIError as e:
    # Base class for errors raised by the OpenAI SDK (connection, auth, API status).
    print(f"Error: {e}")