Agentic Coding Security: Securing Autonomous AI Agents

What’s actually new about agentic coding

A coding “assistant” suggests; an “agent” decides. The shift matters because the security controls we built for the assistant era — the human reading every diff line, the lint pass before commit, the careful PR description — were predicated on the human being in the loop on every change. Agents push that loop to the boundary of the session, not the boundary of each edit.

Three properties of agentic workflows shift the threat model:

Diff size per session. A single agent run can touch 20+ files. Reviewers who used to read every change approve in bulk because line-by-line review at that volume is impractical, so security-relevant edits ride along with formatting changes.
Side effects extend past the source tree. Agents install packages, modify CI pipelines, edit .env files, run shell commands, and push migrations. Each of those is a potential security event that wouldn’t have happened in a chat-only assistant flow.
Loops widen scope. Agents iterate to make tests pass. The natural failure mode is “disable the failing check” rather than “fix the underlying defect” — and a disabled CORS check or a broadened file permission persists in the final commit with no commentary.

None of this means agents are unsafe to use. It means the controls that worked for chat-style assistance need to be re-pointed at the new failure surface.

The agentic threat model in one table

Threat	Where it shows up	Mitigation that actually scales
Insecure code generation (SQLi, missing authn, path traversal)	Inside the agent’s commits, often in scaffolding code that gets approved fast	Static + runtime scanning gate; treat agent PRs as untrusted contributors
Supply-chain pulls (typosquats, slopsquats, malicious post-install)	Agent installs a dep mid-session to “solve” a problem	Require dep additions to go through human review; pin lockfiles; package hallucination scanner
Credential exfiltration via the agent itself	Agent reads `.env`, logs include API keys, secrets get written into config files	Run agents in a sandboxed env with redacted secrets; never run with prod creds
Scope creep during the loop	Agent disables a failing security check to make tests pass	Branch protection + required status checks the agent can’t bypass; review the diff, not just the goal
Prompt injection from data	Agent processes external data (issue, README, scraped page) that contains hidden instructions	Treat retrieved data as untrusted; constrain tool permissions per task
Approval fatigue	Reviewer hits “approve all” on a 30-file diff	Tier files by sensitivity; require explicit re-approval if tier-one files change
Pipeline / CI tampering	Agent edits `.github/workflows/*.yml` to “unblock” itself	CODEOWNERS on workflow files; branch protection on `main` requires a human + CI

Most of these are not new vulnerabilities — they’re old ones with a new arrival rate.
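The "tier files by sensitivity" mitigation in the table can be sketched as a small path classifier that splits an agent's changed files into review tiers. The glob lists and the names tier_of and review_plan are assumptions for illustration; the tier-one globs reflect the kind of paths this guide treats as sensitive, and the tier-two globs are a guess at what a team might add.

```python
from fnmatch import fnmatchcase

# Globs are matched against repo-relative paths and are anchored at the
# repo root. Note that fnmatch's "*" also matches "/" characters, so
# "auth/*" covers nested files such as "auth/session/token.py".
TIERS = {
    1: ["auth/*", "migrations/*", ".github/workflows/*", "infra/*",
        "scripts/release/*"],
    2: ["package.json", "*.lock", "requirements*.txt", "Dockerfile", ".env*"],
}
DEFAULT_TIER = 3


def tier_of(path: str) -> int:
    """Return the most sensitive (lowest-numbered) tier whose glob matches."""
    for tier in sorted(TIERS):
        if any(fnmatchcase(path, glob) for glob in TIERS[tier]):
            return tier
    return DEFAULT_TIER


def review_plan(changed_files: list[str]) -> dict[int, list[str]]:
    """Group changed files by tier so tier-one changes can be reviewed first."""
    plan: dict[int, list[str]] = {1: [], 2: [], 3: []}
    for path in changed_files:
        plan[tier_of(path)].append(path)
    return plan


def needs_explicit_reapproval(plan: dict[int, list[str]]) -> bool:
    """True when any tier-one file changed."""
    return bool(plan[1])
```

Posting the plan as a PR comment puts the handful of tier-one files in front of the reviewer before the bulk of the diff, which is the point of tiering: attention goes where a mistake costs the most.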

Read these first

Foundations

Security Risks in Agentic AI Coding — the full risk catalog: autonomous generation, lack of oversight, supply-chain risk, credential exposure, scope creep in agent loops. Start here if you’re new to the topic.
How to Review Code from AI Agents — diff strategies for multi-file changes, the tier-one / tier-two / tier-three review pattern, and the explicit security checklist to apply to every agent PR.

Agent-specific guides

Cursor Composer Security — multi-file edits, agent mode, rapid-fire scaffolding. What Composer tends to leave out and how to catch it.
Claude Code Security — terminal access, file-system operations, the agentic loop. Permissions, hooks, and the boundaries you can configure.
Devin Security Practices — fully autonomous dev, cloud execution, PR-based workflow. Sandboxing posture and how Devin’s environment isolation actually works.

Minimum viable guardrails

The smallest set of controls that materially change the security posture of an agentic workflow:

Branch protection on main. No direct commits from the agent: every change goes through a PR with required status checks. The agent gets its own identity (a separate account or role), not the human’s account.
CODEOWNERS on sensitive paths. Files in auth/, migrations/, .github/workflows/, infra/, and scripts/release/* require a named human approver. The agent can edit them; merging requires a human.
Dependency review in CI. A workflow step that runs on every PR, lists added/removed deps, and fails the build if a new package is from a low-trust source (recently registered, low downloads, no GitHub repo, or matches a slopsquat heuristic). Pair with package hallucination checking.
Secrets boundary. The agent’s runtime never sees production secrets. If the agent needs an API key for testing, it gets a sandbox key from a separate vault. Use .env.local.template with placeholder values and refuse to start the agent if real values are present.
Scoped tool permissions. If the agent has shell access, restrict it to a project-local directory. Deny curl, wget, npm publish, git push, and aws * by default, and opt in per task.
Required runtime scan on PR. Once a PR is opened, run a runtime security probe (e.g. Vibe Code Scanner) against the preview deployment. Block merge on critical findings.
Audit log for agent actions. Every command, file write, and tool call the agent makes goes to a structured log. When something breaks in production, you need to be able to answer “what did the agent do, when, in response to what prompt.”
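The scoped tool permissions guardrail can be sketched as a pre-execution filter that an agent harness calls before running a shell command. This is a minimal sketch under assumptions: the function name is_allowed, the rule format, and the choice to reject every command containing shell operators are illustrative, not the API of any particular agent.

```python
import os
import shlex

# Deny rules from the guardrail above. A one-element rule blocks every
# invocation of that program; a two-element rule blocks that subcommand.
DENIED = [("curl",), ("wget",), ("npm", "publish"), ("git", "push"), ("aws",)]

# Conservative assumption: refuse anything that could chain or substitute
# commands, even inside quotes, rather than try to parse shell grammar.
SHELL_OPERATORS = set(";|&`$<>\n")


def is_allowed(command: str, opted_in: frozenset = frozenset()) -> bool:
    """Return True if the command passes the deny-list for this task."""
    if any(ch in SHELL_OPERATORS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return False
    if not argv:
        return False
    program = os.path.basename(argv[0])  # "/usr/bin/curl" -> "curl"
    for rule in DENIED:
        if rule in opted_in or program != rule[0]:
            continue
        # Over-blocks on purpose: the subcommand may appear anywhere in the
        # arguments, so "git -C repo push" is caught as well.
        if len(rule) == 1 or rule[1] in argv[1:]:
            return False
    return True
```

A task that legitimately needs to push would pass opted_in=frozenset({("git", "push")}). A deny-list like this is only a first filter: wrappers such as env or sh -c get around it, so it complements an OS-level sandbox rather than replacing one.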

Half of these are off-the-shelf GitHub features. The rest are small CI additions. None of them require buying anything.
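One of those small CI additions, the dependency review step, could look roughly like the sketch below. It compares the dependency names before and after the PR and flags additions that look low-trust. The PackageInfo shape, both thresholds, and the function names are assumptions; fetching the metadata from a registry is left out, and so is the slopsquat heuristic, which needs a dedicated checker.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PackageInfo:
    """Hypothetical metadata shape; populate it from your registry."""
    name: str
    first_published: date
    weekly_downloads: int
    has_source_repo: bool


# Assumed thresholds; pick values that fit your ecosystem.
MIN_AGE_DAYS = 90
MIN_WEEKLY_DOWNLOADS = 1000


def low_trust_reasons(pkg: PackageInfo, today: date) -> list[str]:
    """List every low-trust signal that applies to one package."""
    reasons = []
    if (today - pkg.first_published).days < MIN_AGE_DAYS:
        reasons.append("recently registered")
    if pkg.weekly_downloads < MIN_WEEKLY_DOWNLOADS:
        reasons.append("low downloads")
    if not pkg.has_source_repo:
        reasons.append("no source repository")
    return reasons


def review(before: set[str], after: set[str],
           metadata: dict[str, PackageInfo], today: date) -> dict[str, list[str]]:
    """Map each newly added low-trust package to the reasons it was flagged."""
    failures = {}
    for name in sorted(after - before):
        info = metadata.get(name)
        if info is None:
            # A name the registry does not know is itself a red flag.
            reasons = ["no registry metadata found"]
        else:
            reasons = low_trust_reasons(info, today)
        if reasons:
            failures[name] = reasons
    return failures
```

The CI job would fail the build whenever review returns a non-empty result and print the reasons, so the human approving the dependency sees why it was flagged.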

How to choose between agents (security lens)

Property	Cursor Composer	Claude Code	Devin
Default execution environment	Local IDE	Local terminal	Remote sandbox
Filesystem scope	Workspace folder	Configured working dir	Isolated VM
Network access	Through user’s machine	Through user’s machine + configured tools	Sandboxed; explicit egress
Long-running tool execution	No (interactive)	Yes (background tasks)	Yes (full session)
PR-by-default workflow	No	No (configurable)	Yes
Native permission model	OS-level	Hooks + permission modes	Sandbox-level
Where credentials live	Local env / 1Password CLI	Local env / `claude login`	Devin’s secret manager

The “best” agent for security depends on what blast radius you’re willing to accept if the agent does the wrong thing. Devin’s sandbox limits damage at the cost of slower iteration; Cursor and Claude Code are faster but trust the local machine. Match the agent to the sensitivity of the workspace.

Common mistakes (and what to do instead)

Running the agent with your daily-driver shell environment. It inherits every credential exported in that shell. Use a fresh shell, a project-scoped .env, or a container.
Auto-approving every tool call. Blanket approval early in a session carries over to every later call, including ones you would have refused on sight. Approve by category of action (for example, file reads versus shell commands) rather than approving everything.
Letting the agent edit its own permission rules. Anything in .claude/, .cursor/, or equivalent should be CODEOWNERS-protected. The agent fixing its own restrictions is the AI-era version of chmod 777.
Treating “tests pass” as “code is correct”. Agents optimise for green CI. A passing test suite after an agent run means nothing about whether the new code preserves security invariants — those are usually not what the test suite checks.
No log retention for agent sessions. When you discover a regression three weeks after the agent introduced it, you’ll want the prompt history, the tool calls, and the file diffs. Plan for retention before you need it.
Trusting the agent’s commit message. The message describes what the agent thinks it did. Read the diff.
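The first mistake in the list, running the agent inside a daily-driver shell, can be avoided by launching the agent with an explicitly constructed environment. The sketch below assumes a small allow-list of variables and a project-scoped env file of simple KEY=value lines; the names ALLOWED_ENV, scrubbed_env, parse_env_file, and run_agent are hypothetical.

```python
import os
import subprocess

# Assumed allow-list: the variables an agent process plausibly needs.
ALLOWED_ENV = {"PATH", "HOME", "LANG", "TERM"}


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; skip blanks, comments, malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def scrubbed_env(parent_env, project_env: dict[str, str]) -> dict[str, str]:
    """Keep only allow-listed parent variables, then add project-scoped ones."""
    env = {k: v for k, v in parent_env.items() if k in ALLOWED_ENV}
    env.update(project_env)
    return env


def run_agent(argv: list[str], project_env_path: str) -> subprocess.CompletedProcess:
    """Start the agent command with the scrubbed environment only."""
    with open(project_env_path, encoding="utf-8") as fh:
        project_env = parse_env_file(fh.read())
    env = scrubbed_env(os.environ, project_env)
    return subprocess.run(argv, env=env, check=False)
```

Because the child process receives only what scrubbed_env returns, tokens exported in the parent shell never reach the agent. Credential files on disk are still readable, so this narrows exposure without substituting for a container or sandbox.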

Related resources

AI Pentest Hub — the offensive view: how attackers approach apps the agents shipped.
AI Security Risks: Cursor — Cursor-specific risk profile, paired with the Composer guide above.
Vibe Coding Security Risks — broader frame, includes non-agentic AI coding tools.
Free Security Self-Audit — 30-minute manual pass to run on any app an agent helped build.
Vibe Code Scanner — runtime probe that pairs well with the agent-output review checklist above.
