
Why Code Scanning Alone Isn't Enough for AI-Generated Code


Scanning finds what's wrong with the code. It doesn't answer how the code got there.


The Assumption That No Longer Holds

Every code scanning tool — Semgrep, Snyk, CodeQL, GitHub Advanced Security — was built on the same assumption:

Code enters the repository through a trusted process (developer → git commit → push → review → merge), and the scanner's job is to find bugs in that code.

For decades, this assumption held. Developers wrote code on company machines, committed through company Git, and reviewers checked it before merging.

AI coding tools broke this assumption.


What Changed

1. Code Now Originates Outside the Network

When a developer uses Claude Code or Gemini CLI, the code is generated on a machine with internet access — often outside the corporate network entirely. It then needs to be transferred into the internal network.

This transfer step has no standard protocol. Developers use:

  • Email attachments

  • Personal cloud storage

  • Chat messages

  • USB drives (where allowed)

  • Copy-paste from personal devices

None of these have signature verification, integrity checks, or audit trails.

2. Volume Exceeds Human Review Capacity

AI tools can generate hundreds of files in a single session. A developer might push an entire module — authentication, database layer, API endpoints — all generated in one afternoon.

Traditional code review was designed for human-written diffs of 50-200 lines. It doesn't scale to AI-generated codebases.

3. Identity Is No Longer Implicit

In a traditional workflow, the git author is the person who sat at the keyboard. It's not cryptographically verified, but it's generally trustworthy because the developer is on a managed machine with SSO.

With AI-generated code transferred from external environments, the git author name is just a string someone typed. There's no mathematical proof of who actually produced or sent the code.

4. AI Agents Can Act Autonomously

Modern AI coding tools can execute shell commands. An AI agent could theoretically:

  • Generate code

  • Run leeh push (or any transfer mechanism)

  • Repeat indefinitely

Without a human-in-the-loop gate, there's no guarantee a person reviewed what was sent.
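
One common way to build that gate is to make the transfer step refuse to run without an interactive, explicit human confirmation. The Python sketch below is illustrative only; the function name and prompt are assumptions, not leeh's actual mechanism:

  import sys

  def require_human_confirmation(summary: str) -> None:
      """Illustrative human-in-the-loop gate: proceed only if a person at an
      interactive terminal explicitly approves the transfer."""
      if not sys.stdin.isatty():
          # An autonomous agent running headless or piping input fails here.
          raise RuntimeError("transfer requires an interactive human confirmation")
      print(summary)
      if input("Approve this transfer? Type 'yes' to continue: ").strip().lower() != "yes":
          raise RuntimeError("transfer not approved by a human")

  # A scripted agent calling this without a terminal is blocked before anything is sent.
  require_human_confirmation("12 files, 3,400 lines, destination: internal Git, module: auth")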


What Scanning Catches — and What It Misses

Scanning catches:

  • SQL injection patterns

  • Hardcoded secrets (API keys, passwords)

  • Known vulnerable dependencies

  • Insecure cryptographic usage

  • XSS and CSRF patterns

Scanning misses:

  • Who sent this code? — No scanner verifies the sender's identity

  • Was it modified in transit? — No scanner checks integrity between origin and repository

  • Was it quarantined? — Scanners run after code is already in the repo

  • Did a human approve the transfer? — No scanner enforces human attestation

  • What was the intent? — Pattern matching finds known-bad code, but a novel exfiltration technique written by an AI might not match any existing rule


The Missing Layer: Inbound Verification

Code scanning is post-entry security. It analyzes what's already inside.

What's missing is pre-entry security — a controlled checkpoint that code must pass through before it reaches the repository.

Pre-entry (missing in most enterprises):
  Identity    → Who sent this? (cryptographic proof)
  Integrity   → Was it tampered with? (hash verification)
  Quarantine  → Is it isolated until verified? (3-state)
  Attestation → Did a human approve it? (optional gate)

Post-entry (already solved):
  SAST        → Semgrep, CodeQL
  Secrets     → gitleaks, TruffleHog
  Dependencies → Snyk, Dependabot
  Review      → Pull request review

You need both layers. Most enterprises only have the second one.
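
To make the pre-entry layer concrete, here is a minimal sketch of the identity and integrity checks using Ed25519 signatures and SHA-256 hashes (Python's cryptography and hashlib libraries). The payload shape and function names are assumptions for illustration, not leeh's wire protocol:

  import hashlib
  from cryptography.exceptions import InvalidSignature
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  # Sender side (outside the network): declare a hash, sign the payload.
  sender_key = Ed25519PrivateKey.generate()            # in practice a long-lived developer key
  payload = b"...archive of AI-generated files..."
  declared_hash = hashlib.sha256(payload).hexdigest()  # integrity claim
  signature = sender_key.sign(payload)                 # identity claim, non-repudiable

  # Gateway side (pre-entry checkpoint): verify both claims before anything is committed.
  def verify_inbound(payload: bytes, declared_hash: str, signature: bytes, sender_public_key) -> bool:
      # Integrity: does the payload still match the hash the sender declared?
      if hashlib.sha256(payload).hexdigest() != declared_hash:
          return False
      # Identity: was the payload signed by the key registered to this developer?
      try:
          sender_public_key.verify(signature, payload)
      except InvalidSignature:
          return False
      return True

  assert verify_inbound(payload, declared_hash, signature, sender_key.public_key())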


A Practical Example

Without inbound verification:

Developer generates auth module with Claude Code
  → Emails zip file to work address
  → Extracts on work laptop
  → git add, commit, push
  → CI runs Semgrep → no findings
  → Merged to main
  
Who sent it? Unknown (email is not identity verification)
Was it modified? Unknown (no integrity check)
Was it quarantined? No
Was it approved? Only the code review, not the transfer

With inbound verification (leeh):

Developer generates auth module with Claude Code
  → leeh push (Ed25519 signed, SHA-256 hashed)
  → Gateway verifies signature → valid
  → Gateway verifies hash → intact
  → Semgrep + gitleaks scan → clean
  → Spec Contract check → paths allowed
  → Quarantine → accepted
  → Committed to internal Git
  
Who sent it? alice (Ed25519 signature, non-repudiable)
Was it modified? No (SHA-256 verified)
Was it quarantined? Yes (pending → accepted)
Was it approved? Yes (scan passed, optionally human-approved)
Full audit trail? Yes (every step logged)
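
The quarantine step above behaves like a small state machine: a submission starts as pending and moves to accepted only after every check passes, and each transition is logged. A minimal sketch, assuming a third rejected state and an invented record shape (neither is taken from leeh):

  from dataclasses import dataclass, field
  from datetime import datetime, timezone
  from enum import Enum

  class QuarantineState(Enum):
      PENDING = "pending"    # received, nothing trusted yet
      ACCEPTED = "accepted"  # every check passed; safe to commit to internal Git
      REJECTED = "rejected"  # assumed third state: some check failed

  @dataclass
  class Submission:
      sender: str
      sha256: str
      state: QuarantineState = QuarantineState.PENDING
      audit_log: list = field(default_factory=list)

      def record(self, event: str) -> None:
          # Append-only trail: every step is timestamped, nothing is overwritten.
          self.audit_log.append(f"{datetime.now(timezone.utc).isoformat()} {event}")

      def apply_checks(self, checks: dict) -> None:
          """checks maps a check name (signature, hash, scan, ...) to its result."""
          for name, passed in checks.items():
              self.record(f"{name}: {'pass' if passed else 'fail'}")
              if not passed:
                  self.state = QuarantineState.REJECTED
                  return
          self.state = QuarantineState.ACCEPTED

  sub = Submission(sender="alice", sha256="<sha-256 of the archive>")
  sub.apply_checks({"ed25519 signature": True, "sha-256 hash": True,
                    "semgrep + gitleaks": True, "spec contract": True})
  assert sub.state is QuarantineState.ACCEPTED  # only now may it touch internal Git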

The Argument for Both

This isn't about replacing your scanner. It's about acknowledging that scanning alone is incomplete when code originates outside your network.

Security Layer          What It Answers                          Tools
Inbound verification    Should this code be allowed in?          leeh
Static analysis         Does this code have vulnerabilities?     Semgrep, CodeQL
Secret detection        Does this code contain secrets?          gitleaks, TruffleHog
Dependency scanning     Are the dependencies safe?               Snyk, Dependabot
Code review             Does a human approve the logic?          Pull requests

Each layer answers a different question. Skip one, and you have a gap.


What To Do Next

  1. Keep your scanners. They're essential. Don't remove them.

  2. Add an inbound layer. Verify identity, integrity, and intent before code enters your repository.

  3. Require signatures. Git author names are theater. Ed25519 signatures are math.

  4. Quarantine first. Don't let code touch internal Git until all checks pass.

  5. Log everything. When an auditor asks "how did this code get into your system?", have an answer.
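
That answer is easiest to give when every inbound transfer leaves a structured, append-only record. One possible shape, written as JSON Lines; the field names are illustrative, not a prescribed leeh format:

  import json
  from datetime import datetime, timezone

  # One event per line, appended to a log that is never rewritten.
  event = {
      "timestamp": datetime.now(timezone.utc).isoformat(),
      "sender": "alice",  # taken from the verified signature, not the git author string
      "payload_sha256": "<sha-256 of the archive>",
      "checks": {"signature": "pass", "hash": "pass", "scan": "pass"},
      "quarantine": "pending -> accepted",
      "committed_as": "<internal commit id>",
  }
  with open("inbound_audit.jsonl", "a", encoding="utf-8") as log:
      log.write(json.dumps(event) + "\n")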


leeh — LLM Escrow & Entry Hub. One-way secure inbound pipeline for AI-generated code.

GitHub · leeh.io
