Your AI Coding Agent Is a Junior Dev With Publish Rights

Liam McCarthy

May 13, 2026

5 min read

Six concrete controls for repos where Claude Code, Cursor, and friends ship dependencies for you.

Most supply-chain advice is bureaucratic. It assumes a security team, a release engineer, and a codebase that moves slowly enough for humans to read the release notes. None of that describes how I work right now, and probably not how you do either.

What I actually have is a Claude Code window, a repo, and an agent that just edited package.json to add a dependency I never asked about. The agent will run install if I let it. The install will execute arbitrary shell from a maintainer I've never heard of, on my laptop, with access to everything in my keychain.

The agent is doing what a junior developer would do, except faster, with more reach, and with zero hesitation about adding left-pad-3000 because "it might be useful for the helper function." It's also a junior dev who'll cheerfully edit your CI workflow, your Dockerfile, and your lockfile in the same PR, and who's read enough Stack Overflow to think @main is a fine version pin.

You're going to ship code this way regardless — the productivity gains are real — so you need controls aimed at this particular failure mode. Not "supply chain security" as a category. The narrower problem: the LLM in my IDE has commit rights and no instinct for danger.

What the playbooks miss

Open any vendor's supply-chain hardening guide and you'll find the same content. Enable Dependabot. Install our scanner. Write a CODEOWNERS file. Eventually adopt SBOMs. Eventually reach maturity level three of some framework. Phase one through phase eight. Quarterly review. Stakeholders. Adoption metrics.

That kind of content exists to make the buyer feel covered. It isn't written for the developer who has thirty minutes between meetings and one repo that needs to not get owned this week.

There's also a more substantive miss. These playbooks assume the threat is a malicious external actor introducing a bad dependency. That's not wrong, just incomplete. The threat model that's growing is your own tooling introducing dependencies you'd have refused if it had asked. Agents don't vet a package and turn it down; they install and keep going. The same controls work either way, but the framing matters, because if you don't believe your own pipeline can be the problem, you'll keep deferring the controls that actually bite.

Six things, no phases

In rough order of how much each one helps:

The big one is turning off install scripts. npm install will execute arbitrary shell from any dependency that ships a postinstall hook, and so will pnpm install on versions before 10 (pnpm 10 and later skip dependency build scripts unless you approve them). That hook is one of the most common malware delivery paths. Setting ignore-scripts=true in your user-level .npmrc, plus pnpm approve-builds to opt into the small handful of packages that genuinely need to compile something (esbuild, sharp, better-sqlite3, you'll know the list), is about thirty seconds of work and probably the single largest reduction in attack surface you can ship today.
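A minimal sketch of that change in shell, assuming an npm-style .npmrc. The helper name disable_install_scripts is hypothetical and is not taken from the bootstrap script; it only covers the simple case of appending the setting to a file you point it at.

```shell
# Hypothetical helper: make sure an .npmrc contains ignore-scripts=true.
# Idempotent: it appends only when that exact line is missing.
# It does not rewrite an existing ignore-scripts=false line; check by hand.
disable_install_scripts() {
  npmrc="$1"
  touch "$npmrc"
  if ! grep -qx 'ignore-scripts=true' "$npmrc"; then
    # Add a newline first if the file does not already end with one.
    if [ -n "$(tail -c 1 "$npmrc")" ]; then
      printf '\n' >> "$npmrc"
    fi
    printf '%s\n' 'ignore-scripts=true' >> "$npmrc"
  fi
}

# Usage (user-level config, i.e. "globally"):
#   disable_install_scripts "$HOME/.npmrc"
# Then, inside each pnpm project, review which packages may build:
#   pnpm approve-builds
```

Running it twice leaves a single ignore-scripts=true line, so it is safe to keep in a dotfiles setup script.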

After that, wait seven days on new releases. The malware lifecycle on npm is on the order of a few days from publish to public advisory, so a 7-day release-age cooldown (minimumReleaseAge in pnpm-workspace.yaml, where the value is in minutes, and the same-named field in Renovate) usually means you never install the compromised version. Vulnerability fixes bypass the cooldown so real security work isn't slowed.
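The units are the easy thing to get wrong here, so a small sketch may help. It assumes pnpm reads minimumReleaseAge as a number of minutes and Renovate reads it as a duration string; cooldown_config is a hypothetical helper name, and it only prints the two fragments rather than editing your files.

```shell
# Hypothetical helper: print the cooldown settings for a number of days.
# Assumption: pnpm takes minutes, Renovate takes a string such as "7 days".
cooldown_config() {
  days="$1"
  minutes=$((days * 24 * 60))
  printf '# pnpm-workspace.yaml\nminimumReleaseAge: %s\n\n' "$minutes"
  printf '# renovate.json (inside the top-level object)\n"minimumReleaseAge": "%s days"\n' "$days"
}

# Usage:
#   cooldown_config 7
```

For seven days the pnpm value works out to 7 × 24 × 60 = 10080 minutes.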

Freeze your lockfile in CI. npm ci. pnpm install --frozen-lockfile. The agent edits package.json, the agent has to commit the lockfile too, and now the lockfile change is something a human can see in the diff. No silent transitive drift.
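As a rough sketch of how a CI step might pick the right command, assuming the repo has either a pnpm-lock.yaml or a package-lock.json at its root. The function name frozen_install_cmd is made up for illustration.

```shell
# Hypothetical helper: print the frozen-lockfile install command that
# matches whichever lockfile the repo has. Fails if there is none.
frozen_install_cmd() {
  repo="${1:-.}"
  if [ -f "$repo/pnpm-lock.yaml" ]; then
    echo 'pnpm install --frozen-lockfile'
  elif [ -f "$repo/package-lock.json" ]; then
    echo 'npm ci'
  else
    echo 'no lockfile found; commit one before enabling this step' >&2
    return 1
  fi
}

# Usage in a CI step:
#   cmd="$(frozen_install_cmd .)" && $cmd
```

Both commands refuse to proceed when package.json and the lockfile disagree, which is what forces the lockfile change into the diff.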

Pin GitHub Actions to commit SHAs. Tags move; SHAs don't. The tj-actions/changed-files incident in March 2025 hit workflows that referenced the action by tag, because the attacker repointed the tags at a malicious commit. That's the kind of thing that happens once a year and ruins a lot of weekends. Every third-party action reference should be a 40-character hex SHA with the human-readable tag in a comment after it. pinact and frizbee rewrite existing workflows for you. Renovate keeps the pins fresh.
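Before reaching for a rewriting tool, it can help to see how many references are still floating. The sketch below is a plain grep audit, not a YAML parser: unpinned_actions is a hypothetical name, and it assumes each uses: reference sits on its own line.

```shell
# Hypothetical helper: list "uses:" references that are not pinned to a
# 40-character commit SHA. Local actions (./path) and docker:// images
# are skipped. Exits non-zero when nothing is unpinned.
unpinned_actions() {
  dir="${1:-.github/workflows}"
  grep -rnE '^[[:space:]-]*uses:[[:space:]]' "$dir" \
    | grep -vE 'uses:[[:space:]]+(\./|docker://)' \
    | grep -vE '@[0-9a-f]{40}([^0-9a-f]|$)'
}

# Usage as a CI gate:
#   if unpinned_actions; then echo 'pin the actions above' >&2; exit 1; fi
```

A reference written as owner/action@<40 hex characters> # v1.2.3 passes; owner/action@v4 or owner/action@main is printed with its file name and line number.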

CODEOWNERS on the supply-chain files. Workflows, package.json, lockfiles, .npmrc, Dockerfiles. This one's the whole point, honestly — it's what lets you keep using the agent at the speed it's capable of without merging an unreviewed dependency change. The agent gets the application code; you keep the package manifest.
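A sketch of what those entries could look like, generated from shell so the owner is a parameter. The function name supply_chain_codeowners and the @your-handle placeholder are illustrative, and the path list is a starting point rather than a complete one.

```shell
# Hypothetical helper: print CODEOWNERS entries for the supply-chain
# files, all owned by the handle or team passed as $1.
supply_chain_codeowners() {
  owner="$1"
  for pattern in \
    '/.github/workflows/' \
    '/.github/CODEOWNERS' \
    'package.json' \
    'package-lock.json' \
    'pnpm-lock.yaml' \
    'pnpm-workspace.yaml' \
    '.npmrc' \
    'Dockerfile*'
  do
    printf '%-24s %s\n' "$pattern" "$owner"
  done
}

# Usage:
#   supply_chain_codeowners @your-handle >> .github/CODEOWNERS
```

The file only blocks merges when the branch protection rule that requires review from code owners is switched on; without it, CODEOWNERS just requests a review.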

Last, egress monitoring on the jobs that publish or deploy. Run step-security/harden-runner in audit mode for a week to learn the normal pattern, then switch to block mode with an explicit allowlist on anything holding cloud credentials or publish rights. The first five controls reduce the odds of running something bad. This one is what bounds the damage when something gets through anyway.
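The two phases differ by a couple of lines in the workflow step. The sketch below prints that step for either phase; harden_runner_step is a hypothetical helper, the SHA is left as a placeholder on purpose, and the endpoints in the usage line are examples, not a recommended allowlist.

```shell
# Hypothetical helper: print a harden-runner workflow step.
#   $1   egress policy: "audit" or "block"
#   $2+  allowed host:port endpoints (only used with "block")
harden_runner_step() {
  policy="$1"; shift
  printf '%s\n' \
    '- name: Harden runner' \
    '  uses: step-security/harden-runner@<40-char-commit-sha> # replace with a pinned SHA' \
    '  with:' \
    "    egress-policy: $policy"
  if [ "$policy" = block ]; then
    printf '%s\n' '    allowed-endpoints: >'
    for endpoint in "$@"; do
      printf '      %s\n' "$endpoint"
    done
  fi
}

# Usage:
#   harden_runner_step audit
#   harden_runner_step block registry.npmjs.org:443 github.com:443
```

Build the real allowlist from what the audit week actually recorded for that job, since every publish or deploy pipeline talks to a different set of hosts.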

None of these are clever. All of them are well-documented. The reason most repos don't have them isn't that they're hard — it's that the playbooks bury them under five other things and the developer bounces before getting there.

The bash script

I keep a bootstrap script in a public repo. One command drops the six controls into a target repo. It's intentionally boring: it writes config files you can read, pins Action references to specific SHAs, won't overwrite anything unless you pass --force, and has a --check mode that audits without writing.

It's not a product. There's nothing to install but bash. No telemetry, no API key, no dashboard. The output is plain files in your repo, which you should diff before you commit.

Repo: github.com/aireality-io/supply-chain-hardening. Use it, fork it, ignore it and write your own. The point isn't the script; it's that this stuff is genuinely small.

What this isn't

It's not a comprehensive program. It won't catch typosquats — Socket does that, and you should run Socket on top of this. It won't catch a patient compromised maintainer who waits past your cooldown. It doesn't replace reading the diff.

It's also not for you if you have a real security team and hundreds of repos. You need org-wide policy, an incident response function, and the boring playbook. Different scale, different game.

This is for the developer who has agents shipping code and a healthy fear that one of those agents is going to add the wrong dependency on the wrong Tuesday.

Theater versus signal

Most supply-chain effort is symbolic. Vendor questionnaires are symbolic. Frameworks are symbolic. A scanner whose output nobody reads is symbolic.

The six things above are testable. You can grep for them. Each one breaks a specific, documented attack path the moment the agent tries to take it.
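To make "you can grep for them" concrete, here is a sketch of an audit for three of the controls that live in plain config files. It is not the --check mode of the bootstrap script: the function names and output format are illustrative, and it only looks at repo-level files, so a user-level .npmrc or a CODEOWNERS file kept outside .github/ will show up as missing.

```shell
# check_file <label> <extended-regex> <file>
# Prints "ok" or "MISSING" for one control; returns 1 when missing.
check_file() {
  if [ -f "$3" ] && grep -qE "$2" "$3"; then
    printf 'ok       %s\n' "$1"
  else
    printf 'MISSING  %s\n' "$1"
    return 1
  fi
}

# Hypothetical audit: returns non-zero if any checked control is absent.
audit_controls() {
  repo="${1:-.}"
  failed=0
  check_file 'install scripts disabled' '^ignore-scripts=true$' \
    "$repo/.npmrc" || failed=1
  check_file 'release-age cooldown' '^minimumReleaseAge:' \
    "$repo/pnpm-workspace.yaml" || failed=1
  check_file 'CODEOWNERS covers workflows' '^/?\.github/workflows' \
    "$repo/.github/CODEOWNERS" || failed=1
  return "$failed"
}

# Usage:
#   audit_controls . || echo 'at least one control is missing' >&2
```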

Ship one this week, and make it ignore-scripts=true. Ship two, and add the cooldown. Ship all six and you've cut roughly 90% of the cheap shots, which is roughly 90% of what actually goes wrong in practice.

Everything past that is real engineering work: worth doing, and not theater, but not something you'll find in an eight-phase playbook.
