New parallel monitoring system cuts LLM agent reasoning failures by up to 62% while reducing computational overhead
arXiv cs.AI · April 16, 2026
AI Summary
• LLM agents fail on multi-step tasks up to 30% of the time due to reasoning degradation, repetitive looping, and stalling
• Cognitive Companion introduces two parallel monitoring approaches: an LLM-based monitor that reduces repetition by 52-62% at roughly 11% overhead, and a Probe-based monitor with effectively zero overhead
• The Probe-based Companion, trained on layer-28 hidden states, achieves 0.840 AUROC with no measurable inference cost on Gemma 4 E4B, Qwen 2.5 1.5B, and Llama 3.2 1B
• Existing solutions rely on hard step limits (causing abrupt failures) or LLM-as-judge monitoring that adds 10-15% per-step overhead
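The probe idea in the summary — classifying an agent's internal state from intermediate-layer activations rather than running a second judge model — can be sketched as a linear probe. The sketch below is illustrative only: it uses synthetic vectors standing in for layer-28 hidden states (the paper trains on real activations from the listed models), and the dimensionality, separation strength, and training setup are all assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n = 64, 1200  # assumed hidden-state dimensionality and dataset size (illustrative)

# Synthetic stand-in for layer-28 hidden states: "failure" states (label 1)
# are shifted along one latent direction relative to "healthy" states.
w_true = rng.normal(size=D)
w_true /= np.linalg.norm(w_true)
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, D)) + np.outer(y * 2.0, w_true)

Xtr, ytr, Xte, yte = X[:900], y[:900], X[900:], y[900:]

# The probe: logistic regression fit by gradient descent on frozen
# activations. At inference it is one dot product per step, so it adds
# no extra forward passes -- the source of the "zero overhead" claim.
w, b = np.zeros(D), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Xtr @ w + b)))
    w -= 0.5 * (Xtr.T @ (p - ytr) / len(ytr))
    b -= 0.5 * (p - ytr).mean()

# AUROC via the rank-based (Mann-Whitney) formula on held-out scores.
scores = Xte @ w + b
ranks = scores.argsort().argsort() + 1
n_pos = (yte == 1).sum()
n_neg = (yte == 0).sum()
auroc = (ranks[yte == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
print(f"held-out AUROC: {auroc:.3f}")
```

On this synthetic data the probe recovers the failure direction and scores well above chance; the paper's reported 0.840 AUROC comes from real agent traces, not from this toy setup.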