AGENTIC CODING SECURITY
Autonomous code agents change the security model. Less human review, larger diffs, longer chains of side effects. This hub covers what's actually different — the new threat surface, how to review agent output, and tool-specific gotchas for the agents people are actually using.
What’s actually new about agentic coding
A coding “assistant” suggests; an “agent” decides. The shift matters because the security controls we built for the assistant era — the human reading every diff line, the lint pass before commit, the careful PR description — were predicated on the human being in the loop on every change. Agents push that loop to the boundary of the session, not the boundary of each edit.
Three properties of agentic workflows shift the threat model:
- Diff size per session. A single agent run can touch 20+ files. Tier-one reviewers (the humans who used to look at every change) approve in bulk because line-by-line review at that volume is impractical. Security-relevant edits ride along with formatting changes.
- Side effects extend past the source tree. Agents install packages, modify CI pipelines, edit .env files, run shell commands, and push migrations. Each of those is a potential security event that wouldn’t have happened in a chat-only assistant flow.
- Loops widen scope. Agents iterate to make tests pass. The natural failure mode is “disable the failing check” rather than “fix the underlying defect”, and a disabled CORS check or a broadened file permission persists in the final commit with no commentary.
None of this means agents are unsafe to use. It means the controls that worked for chat-style assistance need to be re-pointed at the new failure surface.
The agentic threat model in one table
| Threat | Where it shows up | Mitigation that actually scales |
|---|---|---|
| Insecure code generation (SQLi, missing authn, path traversal) | Inside the agent’s commits, often in scaffolding code that gets approved fast | Static + runtime scanning gate; treat agent PRs as untrusted contributions |
| Supply-chain pulls (typosquats, slopsquats, malicious post-install) | Agent installs a dep mid-session to “solve” a problem | Require dep additions to go through human review; pin lockfiles; package hallucination scanner |
| Credential exfiltration via the agent itself | Agent reads .env, logs include API keys, secrets get written into config files | Run agents in a sandboxed env with redacted secrets; never run with prod creds |
| Scope creep during the loop | Agent disables a failing security check to make tests pass | Branch protection + required status checks the agent can’t bypass; review the diff, not just the goal |
| Prompt injection from data | Agent processes external data (issue, README, scraped page) that contains hidden instructions | Treat retrieved data as untrusted; constrain tool permissions per task |
| Approval fatigue | Reviewer hits “approve all” on a 30-file diff | Tier files by sensitivity; require explicit re-approval if tier-one files change |
| Pipeline / CI tampering | Agent edits .github/workflows/*.yml to “unblock” itself | CODEOWNERS on workflow files; branch protection on main requires a human + CI |
Most of these are not new vulnerabilities — they’re old ones with a new arrival rate.
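The "tier files by sensitivity" mitigation in the table can be sketched as a small classifier over the changed paths in a PR. This is an illustration under assumptions: the glob lists, the tier numbering, and the function names are hypothetical and would need to match your own repository layout.

```python
from fnmatch import fnmatchcase

# Hypothetical tier map. fnmatch's "*" also matches "/", so "auth/*" covers nested paths.
TIER_ONE = ["auth/*", "migrations/*", ".github/workflows/*", "infra/*", "*.env*"]
TIER_TWO = ["src/*", "api/*"]


def tier_for(path: str) -> int:
    """Return 1 for the most sensitive paths, 2 for application code, 3 for the rest."""
    if any(fnmatchcase(path, glob) for glob in TIER_ONE):
        return 1
    if any(fnmatchcase(path, glob) for glob in TIER_TWO):
        return 2
    return 3


def review_plan(changed_files: list[str]) -> dict[int, list[str]]:
    """Group changed files by tier so reviewers can read tier one first."""
    plan: dict[int, list[str]] = {1: [], 2: [], 3: []}
    for path in changed_files:
        plan[tier_for(path)].append(path)
    return plan


def needs_explicit_reapproval(changed_files: list[str]) -> bool:
    """True when any tier-one file changed, so a bulk approval is not enough."""
    return bool(review_plan(changed_files)[1])
```

Sorting a 30-file diff this way lets the reviewer spend attention on the handful of tier-one files instead of approving everything in one click.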
Read these first
Foundations
- Security Risks in Agentic AI Coding — the full risk catalog: autonomous generation, lack of oversight, supply-chain risk, credential exposure, scope creep in agent loops. Start here if you’re new to the topic.
- How to Review Code from AI Agents — diff strategies for multi-file changes, the tier-one / tier-two / tier-three review pattern, and the explicit security checklist to apply to every agent PR.
Agent-specific guides
- Cursor Composer Security — multi-file edits, agent mode, rapid-fire scaffolding. What Composer tends to leave out and how to catch it.
- Claude Code Security — terminal access, file-system operations, the agentic loop. Permissions, hooks, and the boundaries you can configure.
- Devin Security Practices — fully autonomous dev, cloud execution, PR-based workflow. Sandboxing posture and how Devin’s environment isolation actually works.
Minimum viable guardrails
The smallest set of controls that materially change the security posture of an agentic workflow:
- Branch protection on main. No direct agent commits. Every change goes through a PR with required status checks. The agent’s identity is a separate role, not the human’s account.
- CODEOWNERS on sensitive paths. Files in auth/, migrations/, .github/workflows/, infra/, and scripts/release/* require a named human approver. The agent can edit them; merging requires a human.
- Dependency review in CI. A workflow step that runs on every PR, lists added/removed deps, and fails the build if a new package is from a low-trust source (recently registered, low downloads, no GitHub repo, or matches a slopsquat heuristic). Pair with package hallucination checking.
- Secrets boundary. The agent’s runtime never sees production secrets. If the agent needs an API key for testing, it gets a sandbox key from a separate vault. Use .env.local.template with placeholder values and refuse to start the agent if real values are present.
- Scoped tool permissions. If the agent has shell access, restrict it to a project-local directory. Deny curl, wget, npm publish, git push, and aws * by default; opt in per task.
- Required runtime scan on PR. Once a PR is opened, run a runtime security probe (e.g. vibe code scanner on the preview deployment). Block merge on critical findings.
- Audit log for agent actions. Every command, file write, and tool call the agent makes goes to a structured log. When something breaks in production, you need to be able to answer “what did the agent do, when, in response to what prompt.”
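The scoped tool permissions guardrail above can be sketched as a deny-by-default check that runs before the agent's shell command is executed. This is a rough illustration, not a sandbox: the function name `is_allowed`, the `task_allowlist` parameter, and the decision to reject any command containing shell control characters are assumptions made for the example.

```python
import os
import shlex

# Denied command prefixes from the guardrail list; each entry is a tuple of leading tokens.
DENIED_PREFIXES = [("curl",), ("wget",), ("npm", "publish"), ("git", "push"), ("aws",)]

# Chaining or substitution ("ls; curl ...") would slip past a first-token check,
# so this sketch refuses such commands outright and leaves them to a human.
SHELL_CONTROL_CHARS = set(";|&`$()<>")


def is_allowed(command: str, task_allowlist: frozenset = frozenset()) -> bool:
    """Return True if the command may run; denied prefixes need a per-task opt-in."""
    if any(ch in SHELL_CONTROL_CHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return False
    # Compare on the program name so "/usr/bin/curl" is treated like "curl".
    tokens[0] = os.path.basename(tokens[0])
    for prefix in DENIED_PREFIXES:
        if tuple(tokens[: len(prefix)]) == prefix:
            return prefix in task_allowlist
    return True
```

A string check like this is easy to evade (for example through a wrapper script), so it works best layered on top of OS-level or container-level restrictions rather than in place of them.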
Half of these are off-the-shelf GitHub features. The rest are small CI additions. None of them require buying anything.
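The secrets boundary guardrail ("refuse to start the agent if real values are present") can be approximated with a pre-flight check on the env file's contents. The sketch below assumes a placeholder convention (empty values, `changeme`, `placeholder`, `<...>`, or a `sandbox_` prefix) and a key-name heuristic; both are hypothetical and should be replaced with whatever convention your team actually uses.

```python
import re

# Assumed convention: these values count as placeholders or sandbox keys.
PLACEHOLDER_VALUE = re.compile(r"|changeme|placeholder|<[^>]*>|sandbox[-_].*", re.IGNORECASE)

# Assumed heuristic: only variables whose names look secret-bearing are checked.
SECRET_KEY_NAME = re.compile(r"KEY|SECRET|TOKEN|PASSWORD", re.IGNORECASE)


def real_looking_secrets(env_text: str) -> list[str]:
    """Return names of secret-like variables whose values are not placeholders."""
    offenders = []
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        name = name.strip()
        value = value.strip().strip("\"'")
        if SECRET_KEY_NAME.search(name) and not PLACEHOLDER_VALUE.fullmatch(value):
            offenders.append(name)
    return offenders


def assert_safe_to_start(env_text: str) -> None:
    """Raise before the agent launches if the env file appears to hold real secrets."""
    offenders = real_looking_secrets(env_text)
    if offenders:
        raise RuntimeError(f"refusing to start agent; real-looking values for: {offenders}")
```

Only variable names are reported, never values, so the check itself does not leak a secret into logs.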
How to choose between agents (security lens)
| Property | Cursor Composer | Claude Code | Devin |
|---|---|---|---|
| Default execution environment | Local IDE | Local terminal | Remote sandbox |
| Filesystem scope | Workspace folder | Configured working dir | Isolated VM |
| Network access | Through user’s machine | Through user’s machine + configured tools | Sandboxed; explicit egress |
| Long-running tool execution | No (interactive) | Yes (background tasks) | Yes (full session) |
| PR-by-default workflow | No | No (configurable) | Yes |
| Native permission model | OS-level | Hooks + permission modes | Sandbox-level |
| Where credentials live | Local env / 1Password CLI | Local env / claude login | Devin’s secret manager |
The “best” agent for security depends on what blast radius you’re willing to accept if the agent does the wrong thing. Devin’s sandbox limits damage at the cost of slower iteration; Cursor and Claude Code are faster but trust the local machine. Match the agent to the sensitivity of the workspace.
Common mistakes (and what to do instead)
- Running the agent with your daily-driver shell environment. It inherits every credential you’ve ever exported. Use a fresh shell, a project-scoped .env, or a container.
- Auto-approving every tool call. Sets a precedent the agent’s later tool calls inherit. Approve per category, not per call.
- Letting the agent edit its own permission rules. Anything in .claude/, .cursor/, or equivalent should be CODEOWNERS-protected. The agent fixing its own restrictions is the AI-era version of chmod 777.
- Treating “tests pass” as “code is correct”. Agents optimise for green CI. A passing test suite after an agent run means nothing about whether the new code preserves security invariants; those are usually not what the test suite checks.
- No log retention for agent sessions. When you discover a regression three weeks after the agent introduced it, you’ll want the prompt history, the tool calls, and the file diffs. Plan for retention before you need it.
- Trusting the agent’s commit message. The message describes what the agent thinks it did. Read the diff.
Related resources
- AI Pentest Hub — the offensive view: how attackers approach apps the agents shipped.
- AI Security Risks: Cursor — Cursor-specific risk profile, paired with the Composer guide above.
- Vibe Coding Security Risks — broader frame, includes non-agentic AI coding tools.
- Free Security Self-Audit — 30-minute manual pass to run on any app an agent helped build.
- Vibe Code Scanner — runtime probe that pairs well with the agent-output review checklist above.