Study finds multi-agent AI systems more vulnerable to attacks than single agents across three environments
arXiv cs.MA (Multi-Agent) · April 28, 2026
AI summary
•Researchers evaluated 13 architectural configurations of multi-agent systems (networks of two or more autonomous AI agents) across browser, desktop, and code environments. Stagewise evaluations measured four outcomes: planning refusal, execution-stage interception, partial harmful execution, and successful attack completion.
•Multi-agent architectures were more vulnerable than standalone agents in most configurations, with attack success rates varying by up to 3.8x at comparable or higher benign accuracy. Three design choices shaped the security tradeoff: agent roles (how authority and responsibility are allocated), communication topology (how and when agents interact), and memory (the context and state visible to each agent).
•No single multi-agent design was universally safer, which indicates that the architectural decisions governing agent coordination create attack surfaces that had not been systematically characterized before this study.
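
To make the stagewise measurement concrete, here is a minimal Python sketch of how per-stage rates and an attack-success ratio between two configurations might be tallied. It is an illustration only, not the paper's evaluation harness: the `Outcome` enum, the function names, and the outcome lists are all hypothetical, and the counts are invented placeholders rather than results from the study.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    """The four stagewise outcomes named in the summary (hypothetical labels)."""
    PLANNING_REFUSAL = "planning_refusal"
    EXECUTION_INTERCEPTION = "execution_interception"
    PARTIAL_HARMFUL_EXECUTION = "partial_harmful_execution"
    ATTACK_COMPLETED = "attack_completed"


def stage_rates(outcomes):
    """Return the fraction of attack attempts that ended in each outcome."""
    if not outcomes:
        raise ValueError("need at least one recorded outcome")
    counts = Counter(outcomes)
    total = len(outcomes)
    return {outcome: counts[outcome] / total for outcome in Outcome}


def attack_success_rate(outcomes):
    """Fraction of attempts in which the attack ran to completion."""
    return stage_rates(outcomes)[Outcome.ATTACK_COMPLETED]


# Invented placeholder data: 8 attack attempts per configuration.
single_agent = (
    [Outcome.PLANNING_REFUSAL] * 4
    + [Outcome.EXECUTION_INTERCEPTION] * 2
    + [Outcome.PARTIAL_HARMFUL_EXECUTION] * 1
    + [Outcome.ATTACK_COMPLETED] * 1
)
multi_agent = (
    [Outcome.PLANNING_REFUSAL] * 2
    + [Outcome.EXECUTION_INTERCEPTION] * 1
    + [Outcome.PARTIAL_HARMFUL_EXECUTION] * 2
    + [Outcome.ATTACK_COMPLETED] * 3
)

# Ratio of attack success rates between the two toy configurations.
ratio = attack_success_rate(multi_agent) / attack_success_rate(single_agent)

if __name__ == "__main__":
    for name, runs in [("single", single_agent), ("multi", multi_agent)]:
        rates = stage_rates(runs)
        print(name, {o.value: round(r, 3) for o, r in rates.items()})
    print(f"attack success ratio (multi / single): {ratio:.1f}x")
```

A ratio like this is only meaningful when the configurations being compared have similar benign accuracy, which is why the summary reports the 3.8x spread "at comparable or higher benign accuracy."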