
Hedge fund Bridgewater and AI startup Thinking Machines Lab say a fine-tuned open-weight model outperformed GPT and Claude on financial document evaluation tasks while costing nearly 14 times less to operate. The work highlights that proprietary corporate data and expert human judgment—which large AI labs cannot easily access—remain a significant source of improvement for specialized tasks, making open-model fine-tuning an attractive alternative for businesses that want to keep sensitive data private.
What happened
Bridgewater and Thinking Machines Lab tested AI models on six finance-focused tasks—like flagging relevant articles and spotting central bank signals. Frontier models (Gemini, Claude, GPT variants) hit only about 50% accuracy with basic prompts; expert instructions lifted them to the mid-70s, still below an 80% deployment threshold. A fine-tuned open model reached 84.7% accuracy versus 78.2% for the best frontier model tested, while costing nearly 14 times less to run.
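To put the reported figures side by side, here is a minimal sketch that checks each accuracy against the 80% deployment threshold and computes accuracy per unit of cost. It uses only the numbers quoted above. The cost values are an assumption: the summary says "nearly 14 times" cheaper, so the frontier model is set to 14.0 relative units and the fine-tuned model to 1.0. The names `models` and `summarize` are illustrative, not taken from the companies' evaluation.

```python
# Figures as reported in the summary; costs are relative units, not dollars.
# The 14.0 ratio approximates "nearly 14 times" and is an assumption.
DEPLOYMENT_THRESHOLD = 80.0  # percent

models = {
    "fine-tuned open model": {"accuracy": 84.7, "relative_cost": 1.0},
    "best frontier model": {"accuracy": 78.2, "relative_cost": 14.0},
}


def summarize(models, threshold):
    """Return margin over threshold, deployability, and accuracy per cost unit."""
    rows = {}
    for name, m in models.items():
        rows[name] = {
            "margin": round(m["accuracy"] - threshold, 1),
            "deployable": m["accuracy"] >= threshold,
            "accuracy_per_cost": round(m["accuracy"] / m["relative_cost"], 2),
        }
    return rows


summary = summarize(models, DEPLOYMENT_THRESHOLD)
for name, row in summary.items():
    print(name, row)
```

Under these assumptions, only the fine-tuned model clears the threshold, and its accuracy per unit of cost is roughly fifteen times that of the best frontier model tested.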
Why it matters
The result suggests large AI labs have not absorbed all valuable training data. Proprietary corporate data and human expertise locked inside companies have not made it into frontier models' training sets, a gap that open-model fine-tuning can close. For businesses worried about sending sensitive data to OpenAI or Anthropic, fine-tuning with tools like Thinking Machines' Tinker platform offers a way to keep weights, data, and compute infrastructure in-house while matching or exceeding frontier-model performance.
What to watch
The evaluation comes from the two companies involved, so it is not independent; both have a commercial interest in the result. The finding is part of a broader pattern: newer frontier models show only marginal accuracy gains per dollar (GPT 5.4 costs 43% more than 5.2 but is only marginally more accurate), suggesting diminishing returns in throwing more compute at public benchmarks.