Concept illustration of GLM-5.1 in a Beijing AI control room, highlighting open-weight AI, long-horizon reasoning, advanced compute infrastructure, and benchmark leadership.
Beijing-based Z.ai released GLM-5.1 on April 7 under the MIT license, and within days the 754-billion-parameter model had claimed the top spot on SWE-Bench Pro with a score of 58.4 – surpassing OpenAI’s GPT-5.4 at 57.7, Anthropic’s Claude Opus 4.6 at 57.3, and Google’s Gemini 3.1 Pro at 54.2. It is the first open-weight model to lead the benchmark against all closed-source frontier competitors. The model can execute autonomously for up to eight hours on a single task, completing more than 1,700 steps in plan-execute-test-fix cycles – a capability that did not exist at this scale six months ago.
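The plan-execute-test-fix cycle described above can be sketched as a simple agent loop. Z.ai has not published its agent scaffolding, so everything below — the `Model` and `Sandbox` interfaces, method names, and stopping conditions — is an illustrative assumption; only the 1,700-step budget comes from the article.

```python
# Hypothetical sketch of a plan-execute-test-fix agent loop of the kind
# the article describes. The Model/Sandbox interfaces and every name
# below are illustrative assumptions, not Z.ai's published API.
from dataclasses import dataclass

MAX_STEPS = 1700  # step count reported for GLM-5.1's longest runs


@dataclass
class Report:
    all_passed: bool
    failures: list


def run_task(model, sandbox, task):
    """Plan once, then loop: act, test, fold failures back into the plan."""
    plan = model.plan(task)
    for _ in range(MAX_STEPS):
        action = model.next_action(plan)   # choose the next edit/command
        sandbox.execute(action)            # apply it in an isolated env
        report = sandbox.run_tests()       # run the project's test suite
        if report.all_passed:
            return plan                    # solved early, under budget
        plan = model.revise(plan, report)  # incorporate the failures
    return None                            # budget exhausted, unsolved
```

The key structural point is that the test report, not a human, closes the loop on each iteration — which is what makes multi-hour unattended runs possible, and what makes the step budget the main safety brake.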
Why It Matters
GLM-5.1’s benchmark result is significant not just for its score but for how it was built. The entire model was trained on Huawei Ascend 910B chips using the MindSpore framework, with no NVIDIA or AMD hardware involved at any stage. That makes it the first frontier-class model developed entirely outside the Western GPU ecosystem – a direct challenge to the premise that US export controls can contain Chinese AI progress. At $1.40 per million input tokens and $4.40 per million output tokens, the model is priced well below proprietary alternatives, which will drive enterprise adoption across Asia and beyond. Z.ai went public in Hong Kong in January 2026, raising approximately $558 million at a market cap around $52.8 billion; FY2025 revenue reached $104.8 million, up 131% year-over-year.
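At those published rates, per-workload costs are straightforward to estimate. The request volume and token counts below are invented for illustration; only the two per-million-token prices come from the article.

```python
# Back-of-the-envelope API cost at GLM-5.1's published rates.
# The workload (request volume, average token counts) is an invented
# example, not data from the article.
INPUT_RATE = 1.40 / 1_000_000   # USD per input token
OUTPUT_RATE = 4.40 / 1_000_000  # USD per output token


def monthly_cost(requests, in_tokens, out_tokens):
    """Total USD for `requests` calls of the given average sizes."""
    return requests * (in_tokens * INPUT_RATE + out_tokens * OUTPUT_RATE)


# e.g. 100k requests/month, 2k input + 500 output tokens each:
print(f"${monthly_cost(100_000, 2_000, 500):,.2f}")  # → $500.00
```

For comparison shopping, the same function works for any provider by swapping in its two rates — which is why simple per-token price gaps translate so directly into migration pressure.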
What’s Next
GLM-5.1’s Mixture-of-Experts architecture – with only 40 billion parameters active per token – gives Z.ai a cost structure that proprietary labs cannot easily replicate at similar performance levels. If the model holds its benchmark lead as independent evaluations expand, it will accelerate the migration of code-intensive workloads away from OpenAI and Anthropic APIs, particularly in markets where Huawei hardware is the dominant compute provider. The competitive pressure will force US labs to respond either with open-weight releases of their own or with aggressive pricing changes. With GLM-5.1 already available on Hugging Face under a permissive license, the diffusion window is open now.
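The cost advantage follows directly from the active-parameter ratio. A rough sketch, using the common approximation of ~2 FLOPs per active parameter per token for transformer inference (an assumption, not a figure from the article):

```python
# Rough per-token inference compute for a Mixture-of-Experts model
# versus a hypothetical dense model of the same total size, using the
# ~2 FLOPs per active parameter rule of thumb. Parameter counts are
# the figures reported in the article.
TOTAL_PARAMS = 754e9   # GLM-5.1 total parameters
ACTIVE_PARAMS = 40e9   # parameters active per token (MoE routing)

flops_moe = 2 * ACTIVE_PARAMS    # per-token forward-pass FLOPs (MoE)
flops_dense = 2 * TOTAL_PARAMS   # same-size dense model, for contrast

print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
print(f"compute saving vs. dense: {flops_dense / flops_moe:.1f}x")
```

Only about 5% of the parameters do work on any given token, so serving cost scales with the 40-billion active slice rather than the 754-billion total — the structural reason a dense proprietary model at similar quality struggles to match the price.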
The eight-hour autonomous execution capability also raises immediate questions for enterprise security teams. A model capable of operating 1,700-plus steps without human oversight introduces new vectors for both legitimate automation and misuse – a policy debate that regulators in Beijing and Brussels will need to address before the technology sees widespread deployment.
Sources: VentureBeat · Dataconomy · MarkTechPost
