Alibaba Launches Qwen3-Max-Thinking Reasoning Model with Top HLE Score

Published 2026-01-27 | Foundation Models | Medium

Summary

Alibaba released Qwen3-Max-Thinking, a flagship API-only reasoning model with over 1 trillion parameters, trained with reinforcement learning at scale. The model scored 58.3 on Humanity's Last Exam (HLE) with tool use, leading GPT-5.2 and Gemini 3 Pro by nearly 13 points, and scored 100% on the AIME25 mathematical reasoning benchmark. It reasons over multiple rounds, checks its own work, and calls tools autonomously when needed. Pricing is competitive: $1.20 per 1M input tokens and $6.00 per 1M output tokens, below
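
To make the quoted rates concrete, here is a minimal sketch of the per-request cost arithmetic. It assumes flat per-token billing at the two prices stated in the summary, with no tiering, caching discounts, or separate charge for reasoning tokens; the function name and the example token counts are hypothetical and not taken from Alibaba's documentation.

```python
from decimal import Decimal

# List prices quoted in the summary (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("1.20")
OUTPUT_PRICE_PER_M = Decimal("6.00")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request, assuming flat per-token rates."""
    input_cost = INPUT_PRICE_PER_M * input_tokens / TOKENS_PER_M
    output_cost = OUTPUT_PRICE_PER_M * output_tokens / TOKENS_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 10,000 prompt tokens, 2,000 generated tokens.
    print(f"${request_cost(10_000, 2_000)} per request")
```

Because output tokens cost five times as much as input tokens at these rates, the length of the generated text tends to dominate the bill for long responses.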

Alignment: Reinforces current position

alibaba, qwen3, reasoning, foundation-models, china, api-pricing, hle-benchmark, enterprise-ai, model-competition, cost-deflation