Alibaba's Qwen3.7-Max AI model ran autonomously for 35 hours to optimize code for its own custom chip, achieving a 10x speedup
THE DECODER · May 23, 2026
AI Summary
• Qwen3.7-Max, a proprietary model from Alibaba's Qwen team designed for agent-based tasks, spent 35 hours straight on a fully autonomous kernel optimization. Over that period, the model ran 432 kernel tests and made 1,158 tool calls in total, compiling and revising code to optimize a hardware attention kernel for the T-Head-ZW-M890 accelerator (Alibaba's AI chip). The result was an average 10x speedup over the reference implementation.
• The model was trained using a three-part split: the task itself, the tool environment, and a validator (a verification system). These three parts can be mixed and matched across different setups. This approach forces the model to learn strategies that work across different agent frameworks (OpenClaw, Claude Code, and Hermes) rather than shortcuts tied to one specific setup.
• On standardized benchmarks, the Qwen team claims that Qwen3.7-Max produces accelerated kernels 96 percent of the time on KernelBench L3 and scores 80.4 on SWE-Verified. In a simulated one-year startup scenario (YC-Bench), the model pulled in $2.08 million in total revenue and wrapped up 237 tasks, roughly double the $1.05 million in revenue of its predecessor, Qwen3.6-Plus.
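
The three-part training split described above can be pictured as a cross product of interchangeable components. The Python sketch below is purely illustrative and is not Alibaba's training code: the `TrainingConfig` class, the `build_configs` helper, and the example task and validator names are all hypothetical. Only the three framework names come from the article.

```python
from dataclasses import dataclass
from itertools import product


# Hypothetical illustration: none of these names come from Alibaba's code.
@dataclass(frozen=True)
class TrainingConfig:
    task: str         # what the agent has to accomplish
    environment: str  # the agent framework / tool harness it runs in
    validator: str    # how the result gets checked


def build_configs(tasks, environments, validators):
    """Return every task/environment/validator combination."""
    return [
        TrainingConfig(task, env, val)
        for task, env, val in product(tasks, environments, validators)
    ]


tasks = ["optimize_attention_kernel", "fix_failing_test"]  # assumed examples
environments = ["OpenClaw", "Claude Code", "Hermes"]       # named in the article
validators = ["unit_tests", "benchmark_speedup"]           # assumed examples

configs = build_configs(tasks, environments, validators)
print(len(configs))  # 2 tasks * 3 environments * 2 validators = 12
```

Because each task shows up under every framework in this sketch, a strategy that only works in one harness would fail in the other combinations, which is the pressure toward transferable behavior that the summary describes.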