Skip to content

Jamie-Cui/awesome-ai-agents-security

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Awesome AI Agents Security

Awesome License: CC0-1.0 PRs Welcome Maintenance

Forked from ProjectRecon/awesome-ai-agents-security with additional research papers on LLM security and privacy.

A curated list of open-source tools, frameworks, and resources for securing autonomous AI agents.

This list is organized by the security lifecycle of an autonomous agent, covering red teaming, runtime protection, sandboxing, and governance.

Scope: This repo focuses on the security of AI agent systems (tool use, orchestration, sandboxing, identity, etc.), not on attacks/defenses targeting the underlying LLMs themselves (e.g., adversarial examples, model robustness). The Surveys section is an exception, included as background reading.

Similar Projects:

Table of Contents

Agent Firewalls & Gateways (Runtime Protection)

Tools that sit between the agent and the world to filter traffic, prevent unauthorized tool access, and block prompt injections.

  • AgentGateway - A Linux Foundation project providing an AI-native proxy for secure connectivity (A2A & MCP protocols). It adds RBAC, observability, and policy enforcement to agent-tool interactions.
  • Envoy AI Gateway - An Envoy-based gateway that manages request traffic to GenAI services, providing a control point for rate limiting and policy enforcement.

Red Teaming & Vulnerability Scanners

Offensive tools to test agents for security flaws, loop conditions, and unauthorized actions.

  • Strix - An autonomous AI agent designed for penetration testing. It runs inside a docker sandbox to actively probe applications and generate verified exploit capabilities.
  • PyRIT - Microsoft’s open-source red teaming framework for generative AI. It automates multi-turn adversarial attacks to test if an agent can be coerced into harmful behavior.
  • Agentic Security - A dedicated vulnerability scanner for agent workflows and LLMs capable of running multi-step jailbreaks and fuzzing attacks against agent logic.
  • Garak - The "Nmap for LLMs." A vulnerability scanner that probes models for hallucination, data leakage, and prompt injection susceptibilities.
  • A2A Scanner - A scanner by Cisco designed to inspect "Agent-to-Agent" communication protocols for threats, validating agent identities and ensuring compliance with communication specs.
  • Cybersecurity AI (CAI) - A framework for building specialized security agents for offensive and defensive operations, often used in CTF (Capture The Flag) scenarios.

Static Analysis & Linters

Tools to analyze agent configuration and logic code before deployment.

  • Agentic Radar - A static analysis tool that visualizes agent workflows (LangGraph, CrewAI, AutoGen). It detects risky tool usage, permission loops, and maps them to known vulnerabilities.
  • Agent Bound - A design-time analysis tool that calculates "Agentic Entropy"—a metric to quantify the unpredictability and risk of infinite loops or unconstrained actions in agent architectures.
  • Checkov - While primarily for IaC, Checkov includes policies for scanning AI infrastructure and configurations to prevent misconfigurations in deployment.

Sandboxing & Isolation Environments

Secure runtimes to prevent agents from damaging the host system during code execution.

  • SandboxAI - An open-source runtime for executing AI-generated code (Python/Shell) in isolated containers with granular permission controls.
  • Kubernetes Agent Sandbox - A Kubernetes Native project providing a Sandbox Custom Resource Definition (CRD) to manage isolated, stateful workloads for AI agents.
  • Agent-Infra Sandbox - An "All-In-One" sandbox combining Browser, Shell, VSCode, and File System access in a single Docker container, optimized for agentic tasks.
  • OpenHands - Formerly OpenDevin, this platform includes a secure runtime environment for autonomous coding agents to operate without accessing the host machine's sensitive files.

Guardrails & Compliance

Middleware to enforce business logic and safety policies on inputs and outputs.

  • NeMo Guardrails - NVIDIA’s toolkit for adding programmable rails to LLM-based apps. It ensures agents stay on topic, avoid jailbreaks, and adhere to defined safety policies.
  • Guardrails - A Python framework for validating LLM outputs against structural and semantic rules (e.g., "must return valid JSON," "must not contain PII").
  • LiteLLM Guardrails - While known for model proxying, LiteLLM includes built-in guardrail features to filter requests and responses across multiple LLM providers.

Benchmarks & Datasets

Resources to evaluate agent security performance.

  • CVE Bench - A benchmark for evaluating an AI agent's ability to exploit real-world web application vulnerabilities (useful for testing defensive agents).

Identity & Authentication

Tools to manage agent identity (non-human identities).

  • WSO2 - An identity management solution that treats AI agents as first-class identities, enabling secure authentication and authorization for agent actions.

Surveys & Systematizations

Academic surveys covering LLM security, privacy threats, and defenses.

General Security & Privacy

Privacy-Focused

Contributing

Contributions are welcome! Please read the contribution guidelines first.

  1. Fork the project.
  2. Create your feature branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

About

A living map of the AI agent security ecosystem.

Topics

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors