AgentGuard vs Semgrep vs CodeQL — AI Agent Security Benchmark

The benchmark consists of 39 samples covering 17 detection rules. Run date: 2026-07-05.

Tool	Detected	Detection rate
AgentGuard v0.6.4	39/39	100%
Semgrep	0/39	0%
CodeQL	0/39	0%

Why Semgrep and CodeQL Detect Nothing

Semgrep and CodeQL are general-purpose SAST tools. Their standard rule sets target traditional web application vulnerabilities (SQL injection, XSS, path traversal) and include no rules written for AI agent attack patterns, so none of the 39 benchmark samples trigger a finding.

AgentGuard is purpose-built for the AI agent attack surface. Every rule targets a specific OWASP ASI Top 10 vulnerability or a novel attack vector unique to autonomous AI systems.

Detection Coverage by OWASP ASI Category

Category	Attack	Samples	AgentGuard	Semgrep	CodeQL
ASI01	Prompt Injection	16	100%	0%	0%
ASI02	Tool Abuse	5	100%	0%	0%
ASI03	Data Exfiltration	4	100%	0%	0%
ASI06	Output Handling	2	100%	0%	0%
ASI07	Credential Leakage	6	100%	0%	0%
ASI09	Resource Exhaustion	2	100%	0%	0%
ASI10	Isolation Bypass	5	100%	0%	0%

Beyond OWASP: Novel Attack Vectors

Novel Rule	Attack Vector	AgentGuard	Semgrep	CodeQL
ASI-MEMORY-POISON	Persistent vector store poisoning	Detects	No support	No support
ASI-TOOL-TRUST	Blind trust in tool outputs	Detects	No support	No support
ASI-CHAIN-AMPLIFY	Destructive amplification loops	Detects	No support	No support
ASI-AGENT-COLLUSION	Multi-agent conspiracy patterns	Detects	No support	No support
ASI01-INTERPROCEDURAL	Cross-function taint tracking	Detects	No support	No support
ASI01-CROSS-FILE	Cross-file import resolution	Detects	No support	No support
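
To make "cross-function taint tracking" concrete, here is a minimal sketch of the idea using Python's `ast` module. It is not AgentGuard's implementation. The source (`request.get`), the sink (`llm_call`), and the sample code are hypothetical names chosen for illustration, and the analysis only handles straight-line function bodies.

```python
import ast

SOURCE_METHODS = {"get"}   # assumption: x.get(...) returns untrusted input
SINK = "llm_call"          # assumption: hypothetical function that sends a prompt

SAMPLE = '''
def build_prompt(text):
    return "Summarize: " + text

def handle(request):
    user_text = request.get("q")
    prompt = build_prompt(user_text)
    llm_call(prompt)
'''

def functions(tree):
    return [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]

def summarize(tree):
    """Map each function name to the parameter indexes that reach its return value."""
    summaries = {}
    for fn in functions(tree):
        params = [a.arg for a in fn.args.args]
        flows = set()
        for node in ast.walk(fn):
            if isinstance(node, ast.Return) and node.value is not None:
                used = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
                flows |= {i for i, p in enumerate(params) if p in used}
        summaries[fn.name] = flows
    return summaries

def is_tainted(expr, tainted, summaries):
    """Decide whether an expression may carry untrusted data."""
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    if isinstance(expr, ast.BinOp):
        return (is_tainted(expr.left, tainted, summaries)
                or is_tainted(expr.right, tainted, summaries))
    if isinstance(expr, ast.Call):
        func = expr.func
        if isinstance(func, ast.Attribute) and func.attr in SOURCE_METHODS:
            return True
        if isinstance(func, ast.Name) and func.id in summaries:
            # Interprocedural step: use the callee's summary instead of its body.
            return any(i < len(expr.args)
                       and is_tainted(expr.args[i], tainted, summaries)
                       for i in summaries[func.id])
    return False

def find_flows(source):
    """Return (function name, line number) for each tainted call to the sink."""
    tree = ast.parse(source)
    summaries = summarize(tree)
    findings = []
    for fn in functions(tree):
        tainted = set()
        for stmt in fn.body:
            if isinstance(stmt, ast.Assign):
                if is_tainted(stmt.value, tainted, summaries):
                    tainted |= {t.id for t in stmt.targets if isinstance(t, ast.Name)}
            elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
                call = stmt.value
                if (isinstance(call.func, ast.Name) and call.func.id == SINK
                        and any(is_tainted(a, tainted, summaries) for a in call.args)):
                    findings.append((fn.name, stmt.lineno))
    return findings

print(find_flows(SAMPLE))
```

The finding depends on the summary for `build_prompt`: an analysis that looks at `handle` alone cannot tell that the helper passes its argument through to the prompt.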

Why This Matters

Real-World Validation

AgentGuard v0.6.1 scanned Microsoft AutoGen (59K stars) and LlamaIndex (50K stars), detecting 332 critical vulnerabilities across 3,500+ files. The findings were reported as GitHub Issues #7917, #7918, and llama_index#22245.

Zero False Positives

AgentGuard maintains a 0% false positive rate on its entire benchmark suite while achieving 100% detection. Clean code samples pass without findings: generic variable names, commented-out code, and sanitized inputs are not flagged.
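
For reference, the two rates quoted above can be computed as follows. This is a small sketch of the standard definitions. The 39 detected samples come from the benchmark, but the number of clean samples is not stated on this page, so the value used below is a placeholder assumption.

```python
def detection_rate(true_positives: int, false_negatives: int) -> float:
    """Share of vulnerable samples that were flagged."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no vulnerable samples")
    return true_positives / total

def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Share of clean samples that were flagged by mistake."""
    total = false_positives + true_negatives
    if total == 0:
        raise ValueError("no clean samples")
    return false_positives / total

# 39 of 39 vulnerable samples flagged (from the benchmark summary).
recall = detection_rate(true_positives=39, false_negatives=0)

# ASSUMPTION: 10 clean samples is a placeholder, not a figure from the benchmark.
fpr = false_positive_rate(false_positives=0, true_negatives=10)

print(f"detection rate: {recall:.0%}, false positive rate: {fpr:.0%}")
```

Both rates are needed to judge a scanner: a tool that flags every file also reaches 100% detection, but its false positive rate would be 100% as well.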

Technical Differentiation

Capability	AgentGuard	Semgrep	CodeQL
OWASP ASI Top 10	10/10	0/10	0/10
Interprocedural taint	Yes	No	Limited
Cross-file analysis	Yes	No	Yes (QL)
JS/TS support	Yes	Yes	Yes
Memory poison detection	Yes	No	No
Agent collusion detection	Yes	No	No
GitHub Action	Marketplace	Marketplace	Native
MCP Server mode	Yes	No	No
Open source license	MIT	LGPL-2.1	MIT (queries; CLI is proprietary)
SARIF output	Yes	Yes	Yes
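
Because all three tools emit SARIF, their results can be tallied with the same script. The sketch below counts results per `ruleId` in a SARIF 2.1.0 log, following the `runs` / `results` layout defined by the SARIF specification. The log itself is a hand-written, hypothetical example, not output from any of the tools.

```python
from collections import Counter

def detected_rules(sarif: dict) -> Counter:
    """Count results per ruleId across all runs of a SARIF 2.1.0 log."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts

# Hypothetical log for illustration only; a real one would be read with json.load().
example_log = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-scanner"}},
        "results": [
            {"ruleId": "ASI-TOOL-TRUST", "message": {"text": "..."}},
            {"ruleId": "ASI-MEMORY-POISON", "message": {"text": "..."}},
            {"ruleId": "ASI-TOOL-TRUST", "message": {"text": "..."}},
        ],
    }],
}

counts = detected_rules(example_log)
print(dict(counts))
```

A log with no results yields an empty counter, which is what a 0/39 row corresponds to.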
