Large Language Models · Open-Source AI

New framework tackles harder automated theorem proving by requiring AI systems to discover answers independently before formal proof construction

arXiv cs.AI · April 20, 2026

AI Summary

• Researchers introduce "Hard Mode" ATP benchmarks, MiniF2F-Hard and FIMO-Hard, which remove the answer hints from formal statements so that evaluation is more realistic and more challenging.
• The DAP (Discover And Prove) framework uses LLM natural-language reasoning with self-reflection to discover the answer independently, then converts the problem back into a format that existing provers can solve.
• DAP reports state-of-the-art results: it solves 10 problems on CombiBench (the previous best was 7) and is the first system to solve problems on PutnamBench.
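
To make the "answer hint" distinction concrete, here is a minimal Lean 4 sketch. It is not taken from MiniF2F-Hard, FIMO-Hard, or the paper. The toy problem, the names `easy_mode`, `answer`, and `hard_mode`, and the use of a `sorry` placeholder for the unknown answer are all assumptions made for illustration, and the benchmarks may encode Hard Mode differently. The sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Toy problem (hypothetical, not from any benchmark):
-- "Find the natural number x with 2 * x + 3 = 7."

-- Answer-hinted statement: the value 2 already appears in the goal,
-- so the prover only has to verify it.
theorem easy_mode (x : Nat) (h : 2 * x + 3 = 7) : x = 2 := by
  omega

-- Hint-free statement (assumed encoding): the answer is a placeholder
-- that the system must first discover and fill in.
def answer : Nat := sorry

theorem hard_mode (x : Nat) (h : 2 * x + 3 = 7) : x = answer := by
  sorry
```

Under this reading, a discover-then-prove pipeline would first propose a value for `answer` (here, 2) through natural-language reasoning. Substituting that value turns `hard_mode` back into a statement of the same shape as `easy_mode`, which an existing prover can then attempt.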
