Researchers at Apple and Johns Hopkins University have developed a framework to control the computational cost of reasoning in large language models while maintaining a target accuracy level. The method uses two stopping thresholds—one for confident predictions and one to abandon unsolvable problems early—optimized via distribution-free risk control. Empirical testing shows the approach reduces computation while meeting specified error-rate targets across various reasoning tasks.
What happened
Researchers at Apple and Johns Hopkins University introduced a framework that optimizes how much computation a reasoning model spends answering each question. The approach stops reasoning at either of two thresholds: an upper one, reached when the model is confident, and a lower one, which preemptively halts unsolvable instances. Both stopping mechanisms are set with distribution-free risk control, using validation data and a user-specified target risk level.
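To make the two-threshold idea concrete, here is a minimal sketch of such a stopping rule. It assumes each reasoning step yields a scalar confidence score in [0, 1]; the summary does not say which signal the researchers use, so the function name `run_with_thresholds`, the score, and the threshold values are all hypothetical and are not taken from the released code.

```python
def run_with_thresholds(confidences, upper, lower):
    """Apply a two-threshold stopping rule to one question.

    confidences: per-step confidence scores (hypothetical signal).
    upper: stop and answer once confidence reaches this value.
    lower: stop and abandon once confidence falls to this value.
    Returns (steps_used, decision).
    """
    for step, conf in enumerate(confidences, start=1):
        if conf >= upper:
            return step, "answer"    # confident enough: stop early
        if conf <= lower:
            return step, "abandon"   # looks unsolvable: stop spending tokens
    # Budget exhausted without crossing either threshold: answer anyway.
    return len(confidences), "answer"


print(run_with_thresholds([0.4, 0.6, 0.95], upper=0.9, lower=0.1))
print(run_with_thresholds([0.3, 0.05, 0.9], upper=0.9, lower=0.1))
```

In the second call the rule abandons at step 2, before the later high-confidence step is ever computed. That is the intended trade: the lower threshold saves computation at the price of occasionally giving up on a question that could have been solved.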
Why it matters
Reasoning models can improve accuracy by spending more computation, measured in generated tokens, but deciding how much to spend involves a trade-off between accuracy and cost. This framework lets developers specify an acceptable error rate upfront and then minimize computational expense while meeting that target, which could help businesses deploy reasoning models more cost-effectively.
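One way to picture "specify the error rate first, then minimize cost" is threshold calibration on held-out data. The sketch below is a simplified illustration, not the researchers' procedure: it tests candidate confidence thresholds from most to least conservative, certifies each with a Hoeffding upper bound on the error rate, and keeps the lowest (cheapest) threshold that still passes. The data layout, the function names, and the choice of Hoeffding's inequality are all assumptions made for the example.

```python
import math


def stop_index(confidences, upper):
    """Index of the first step whose confidence reaches `upper`, else the last step."""
    for i, conf in enumerate(confidences):
        if conf >= upper:
            return i
    return len(confidences) - 1


def calibrate_upper(val, candidates, alpha, delta):
    """Return the lowest certified threshold, or None if none can be certified.

    val: list of (confidences, correct) pairs, one per validation question,
         where correct[i] says whether the answer available at step i is right.
    alpha: target error rate. delta: allowed failure probability of the guarantee.
    """
    n = len(val)
    margin = math.sqrt(math.log(1.0 / delta) / (2.0 * n))  # Hoeffding slack
    chosen = None
    # Fixed-sequence testing: walk from the safest threshold downward and
    # stop at the first candidate whose risk bound exceeds alpha.
    for t in sorted(candidates, reverse=True):
        errors = sum(0 if correct[stop_index(conf, t)] else 1 for conf, correct in val)
        if errors / n + margin > alpha:
            break
        chosen = t
    return chosen


# Toy validation set: the step-1 answer is wrong, the step-2 answer is right.
val = [([0.5, 0.9], [False, True])] * 200
print(calibrate_upper(val, [0.95, 0.8, 0.4], alpha=0.1, delta=0.1))
```

On this toy data the thresholds 0.95 and 0.8 both wait for the correct second step, while 0.4 would stop on the wrong first answer, so the procedure settles on 0.8. With very few validation questions the Hoeffding slack alone can exceed the target, and the function returns `None` rather than certify anything.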
What to watch
The researchers demonstrated the approach across diverse reasoning tasks and models, showing that both the lower threshold and ensemble stopping mechanisms deliver computational efficiency gains while adhering to user-specified risk targets. Code is available at https://github.com/xidulu/reasoning_risk_control/.