From threat to trust: assessing security risks of agentic AI systems
Abstract
Agentic artificial intelligence (AI) systems are expected to have transformative impacts across sectors, including critical areas like finance and healthcare. Their architectural complexity, autonomous decision-making abilities, adaptive behaviors, and capacity to interact with the environment using tools are also likely to introduce new and poorly understood security risks. Existing security research and risk management frameworks are still in the early stages of development and are insufficient for addressing the complex vulnerabilities unique to agentic AI. This paper addresses this gap by presenting a comprehensive, layered risk assessment methodology that combines traditional threat modeling with an assessment of impacts on trustworthiness. The methodology recognizes that a threat to an agentic AI system not only compromises conventional security properties but also affects the trustworthiness of the system itself. Our method systematically maps and evaluates threats across the architectural layers of agentic AI and assesses how violations of trustworthiness impact the system, which helps prioritize risk reduction efforts. We illustrate the method through a detailed case study of “RoboPMS,” a multi-agent autonomous portfolio management system in the financial sector. The analysis shows how risks can be more effectively evaluated, understood, and prioritized to inform targeted mitigation strategies. By offering a structured, actionable framework for identifying, classifying, and managing risks in agentic AI systems, this work advances both academic knowledge and practical governance. The methodology can be adapted to various domains and is designed to help practitioners, risk managers, and policymakers ensure the secure and trustworthy integration of agentic AI into organizational information systems.