Trust but Verify: Making AI Code Generation Auditable


The problem with trusting AI-generated code blindly. How GuardSpine creates an audit trail for every AI-assisted change with model attribution and prompt provenance.

A developer on your team just merged a PR with 400 lines of new code. They wrote maybe 30 of those lines. The rest came from an AI assistant. Six months from now, when that code causes an incident, can you tell the auditor which lines were AI-generated, which model produced them, and whether anyone reviewed the AI’s output before it shipped?

If the answer is no, you have an auditability gap that grows with every AI-assisted commit.

The Provenance Problem

AI code generation has a provenance problem. Git records who committed the code. It does not record who — or what — wrote the code. A commit authored by a human developer that contains 90% AI-generated code looks identical in the git log to a commit where the developer wrote every line by hand.

This matters for three reasons.

Liability. When AI-generated code causes a security vulnerability, liability questions follow. Did the developer review the AI output? Was the review meaningful or perfunctory? Did the organization have a process for reviewing AI-generated code? Without provenance records, these questions are unanswerable.

Intellectual property. AI models are trained on code with various licenses. If a model reproduces copyrighted code verbatim, you need to know which parts of your codebase carry that risk. Without model attribution, you cannot audit for license contamination.

Quality assurance. AI-generated code has different failure modes than human-written code. It tends toward plausible-looking patterns that may be subtly wrong. It confidently implements deprecated APIs. It copies security anti-patterns from training data. Knowing which code is AI-generated tells your review process where to look harder.

What an Audit Trail Needs

A complete audit trail for AI-assisted changes requires four things that standard git tooling does not provide.

Model Attribution

Which AI model generated or assisted with each change? Not just “an AI was involved” but specifically: Claude 3.5 Sonnet, GPT-4, Gemini 1.5 Pro, Codex, a local Ollama model. The model identity matters because different models have different training data cutoffs, different license compliance profiles, and different known failure modes.

GuardSpine captures model attribution in the evidence bundle. When the AI council reviews a PR, each council member’s identity is recorded: model name, version, and the API endpoint used. When the PR itself was authored with AI assistance, the developer can annotate which model was used via a structured commit message or PR label.
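As a concrete sketch of what a model attribution record could look like, here is a minimal Python structure. The field names (`model_name`, `api_endpoint`, `authoring_model`, and so on) are illustrative assumptions, not GuardSpine's actual evidence bundle schema:

```python
from dataclasses import dataclass, asdict
from typing import List, Optional
import json

@dataclass
class CouncilMember:
    """One AI reviewer's identity, recorded per review.
    Field names are illustrative, not GuardSpine's real schema."""
    model_name: str      # e.g. "claude-3-5-sonnet"
    model_version: str   # the provider's version string
    api_endpoint: str    # endpoint the review call went through

@dataclass
class AttributionRecord:
    pr_number: int
    authoring_model: Optional[str]  # from a commit trailer or PR label, if any
    council: List[CouncilMember]

record = AttributionRecord(
    pr_number=1427,
    authoring_model="claude-3-5-sonnet-20241022",
    council=[
        CouncilMember("claude-3-5-sonnet", "20241022", "api.anthropic.com"),
        CouncilMember("gpt-4", "0613", "api.openai.com"),
    ],
)
# asdict() recurses into nested dataclasses, giving a JSON-ready dict
print(json.dumps(asdict(record), indent=2))
```

The point of a structured record rather than free text is that it can be queried later: "show me every change where model X was the author or a reviewer."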

Prompt Provenance

What prompt produced the AI-generated code? This is harder to capture than model identity because prompts happen in developer IDEs, chat interfaces, and terminal sessions that are not part of the code review pipeline.

GuardSpine does not try to capture every prompt — that would require invasive monitoring that developers would rightly reject. Instead, it captures provenance at the review level: what the AI council was shown (the diff), what each reviewer produced (the analysis), and what the approval chain looked like (human and AI decisions).

For teams that want deeper provenance, GuardSpine supports an optional ai_context field in the evidence bundle where developers can record the prompt or conversation that produced the AI-assisted code. This is opt-in and developer-controlled. No surveillance. Just documentation when the developer chooses to provide it.
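One lightweight, developer-controlled way to carry this annotation is a commit message trailer, in the style of `Signed-off-by:`. The trailer names below (`AI-Model`, `AI-Context`) are a team convention I am assuming for illustration, not a git or GuardSpine standard:

```python
def parse_ai_trailers(commit_message: str) -> dict:
    """Extract AI-provenance trailers from a commit message.
    Trailer names (AI-Model, AI-Context) are a team convention,
    not a git or GuardSpine standard."""
    trailers = {}
    for line in commit_message.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("AI-Model", "AI-Context"):
            trailers[key.strip()] = value.strip()
    return trailers

msg = """Add retry logic to webhook dispatcher

AI-Model: claude-3-5-sonnet-20241022
AI-Context: Prompted to add exponential backoff with jitter
"""
print(parse_ai_trailers(msg))
```

Because trailers live in the commit itself, they survive rebases and travel with the code, and tooling can aggregate them into the evidence bundle without any surveillance of the developer's IDE session.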

Review Evidence

Proof that someone — human or AI — actually reviewed the AI-generated code. Not a checkmark. Not a “LGTM.” Specific evidence that the code was analyzed for correctness, security, and compliance.

The evidence bundle’s council vote items serve this purpose. Each AI reviewer produces an analysis with line-number references, risk assessments, and specific findings. For L3-L4 changes, the human reviewer’s approval is recorded with the escalation context they were shown and any comments they added.
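A policy check can make "substantive review" mechanically enforceable. The sketch below, with an assumed vote shape (`findings`, `risk_assessment`), rejects votes that lack line-level findings:

```python
def is_substantive_review(vote: dict) -> bool:
    """Heuristic gate: a vote counts as review evidence only if it
    carries line-referenced findings and a risk assessment.
    The vote keys here are illustrative, not GuardSpine's schema."""
    findings = vote.get("findings", [])
    has_line_refs = any("line" in f for f in findings)
    has_risk = vote.get("risk_assessment") in {"low", "medium", "high"}
    return bool(findings) and has_line_refs and has_risk

lgtm = {"decision": "approve", "findings": []}
real = {
    "decision": "approve",
    "risk_assessment": "medium",
    "findings": [{"line": 118, "note": "retry loop lacks a cap"}],
}
assert not is_substantive_review(lgtm)  # bare approval fails the gate
assert is_substantive_review(real)      # line-referenced analysis passes
```

A gate like this does not prove the reviewer thought carefully, but it does make a 30-second rubber stamp structurally impossible to record as evidence.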

This is the gap I see in most AI code generation workflows. The developer generates code with an AI, commits it, gets an approval from a teammate who glances at the diff for 30 seconds, and merges. The audit trail says “approved.” It does not say whether the approval was informed or reflexive.

Temporal Integrity

Proof that the review happened before the merge, not after. Proof that the evidence was not created or modified retroactively.

The evidence bundle’s timestamp is signed as part of the hash chain. The root hash commits to the sequence and timing of events. If someone tries to create an evidence bundle after the fact and backdate it, the signature timestamp from the signing service will not match, and the bundle’s root hash will not appear in any contemporaneous reference.
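The hash-chain mechanics can be shown in a few lines. This is a generic fold over ordered events, not GuardSpine's implementation, but it demonstrates the property the text relies on: changing any event's content, timestamp, or position changes the root.

```python
import hashlib
import json

def chain_root(events: list) -> str:
    """Fold ordered events into a single root hash. Each step hashes
    the previous digest together with the canonicalized event, so the
    root commits to both content and sequence."""
    digest = hashlib.sha256(b"genesis").hexdigest()
    for event in events:
        payload = json.dumps(event, sort_keys=True).encode()
        digest = hashlib.sha256(digest.encode() + payload).hexdigest()
    return digest

events = [
    {"type": "council_vote", "ts": "2025-01-10T14:02:11Z", "decision": "approve"},
    {"type": "merge", "ts": "2025-01-10T14:30:55Z", "sha": "a1b2c3"},
]
root = chain_root(events)

# Backdating the vote after the fact produces a different root,
# so the tampered bundle cannot match any earlier published reference.
backdated = [dict(events[0], ts="2025-01-09T09:00:00Z"), events[1]]
assert chain_root(backdated) != root
```

Publishing `root` anywhere append-only (a signing service log, or an RFC 3161 timestamp as the article notes next) is what turns this from tamper-evidence into third-party proof.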

For stronger temporal integrity, GuardSpine supports publishing the root hash to an external timestamping service (RFC 3161). This provides third-party proof that the evidence bundle existed at a specific time.

The AI-Written, AI-Reviewed Paradox

Here is the uncomfortable question: if AI wrote the code and AI reviewed the code, what exactly was governed?

This is not a hypothetical. A developer uses Copilot to generate a function. They commit it. The AI council reviews it. All three AI reviewers approve. The evidence bundle is generated, signed, and sealed. Everything looks governed.

But no human was involved at any point. The code was generated by a model and approved by other models. If all the models share a blind spot — and for certain bug classes, they will — the governance process provided zero protection.

GuardSpine’s risk tier system addresses this directly. Changes to high-risk paths (auth, payment, crypto, PII handling) always escalate to human review regardless of AI council consensus. The system’s design acknowledges that AI reviewing AI-generated code is valuable but not sufficient for high-risk changes.

For L0-L2 changes, AI-reviewing-AI is appropriate. The risk is low, the review coverage is broad (three models catch more than one), and the evidence trail documents what happened. For L3-L4 changes, a human must be in the loop. This is not a technology limitation. It is a design principle about where human judgment is non-negotiable.
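The escalation rule described above is simple enough to sketch directly. The path prefixes are illustrative stand-ins; a real policy would be repo-specific:

```python
# Illustrative high-risk areas; a real policy would be repo-specific.
HIGH_RISK_PREFIXES = ("src/auth/", "src/payments/", "src/crypto/", "src/pii/")

def requires_human_review(changed_paths: list, risk_tier: int) -> bool:
    """Escalate to a human when the tier is L3-L4, or when any changed
    path touches a high-risk area, regardless of AI council consensus."""
    touches_high_risk = any(
        path.startswith(HIGH_RISK_PREFIXES) for path in changed_paths
    )
    return risk_tier >= 3 or touches_high_risk

assert requires_human_review(["src/auth/session.py"], risk_tier=1)   # path-based
assert requires_human_review(["src/utils/strings.py"], risk_tier=4)  # tier-based
assert not requires_human_review(["docs/readme.md"], risk_tier=0)    # AI-only OK
```

Note that the two conditions are OR'd: a low-tier change to an auth path still escalates, which is exactly the defense against the AI-written, AI-reviewed blind spot.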

What Auditors Are Starting to Ask

I have had conversations with compliance teams at three organizations in the past two months where the auditor specifically asked about AI code generation practices. The questions are getting more specific:

  • “What percentage of your code changes involved AI generation tools?”
  • “Do you have a policy for reviewing AI-generated code?”
  • “Can you demonstrate that AI-generated code goes through the same review process as human-written code?”
  • “How do you detect if AI-generated code introduced a security vulnerability?”

Today, these questions are informational. Within two years, they will be control objectives. The organizations that can answer them with evidence will pass their audits. The organizations that answer with “we trust our developers to use AI responsibly” will get findings.

GuardSpine does not solve the AI code generation problem. It does not prevent developers from using AI tools, and it does not try to. What it does is ensure that every change — regardless of how it was authored — goes through a documented, risk-classified, evidence-generating review process. The provenance trail is a natural output of that process.

Building the Trail Today

You do not need GuardSpine to start building an AI code audit trail. Three practices will get you started today:

Label AI-assisted PRs. Add a label or tag to any PR that used AI code generation. This is manual and imperfect, but it creates a searchable record. When the auditor asks “which changes involved AI?”, you can pull the list.
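Once PRs carry a label, the auditor's first question ("what percentage of changes involved AI?") becomes a one-liner. The PR records below stand in for whatever your forge's API returns; the label name `ai-assisted` is an assumed convention:

```python
def ai_assisted_share(prs: list) -> float:
    """Fraction of merged PRs carrying an 'ai-assisted' label.
    PR dicts are stand-ins for a forge API response."""
    merged = [p for p in prs if p["merged"]]
    if not merged:
        return 0.0
    flagged = [p for p in merged if "ai-assisted" in p["labels"]]
    return len(flagged) / len(merged)

prs = [
    {"number": 101, "merged": True, "labels": ["ai-assisted"]},
    {"number": 102, "merged": True, "labels": []},
    {"number": 103, "merged": False, "labels": ["ai-assisted"]},  # not merged
]
assert ai_assisted_share(prs) == 0.5
```

The number will undercount, since labeling is voluntary, but an undercounted answer with a methodology beats no answer.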

Require substantive review comments. A one-word approval on an AI-generated PR is not evidence of review. Require reviewers to reference specific lines and explain what they checked. This is annoying. It is also the difference between a defensible review and a rubber stamp.

Track model versions. If your team uses AI coding assistants, record which models and versions are in use. When a model is found to have a training data contamination issue or a systematic code generation flaw, you need to know which parts of your codebase were generated by that model.
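When a model version is later flagged, the question becomes "which files did it touch?" If commits carry model annotations (for example via the trailer convention above), answering is an index lookup. The record shape here is illustrative:

```python
from collections import defaultdict

def affected_files(commits: list, flagged_model: str) -> set:
    """Given commit records annotated with the assisting model,
    return every file touched by a flagged model version.
    The commit dict shape is illustrative."""
    index = defaultdict(set)
    for commit in commits:
        if commit.get("ai_model"):
            index[commit["ai_model"]].update(commit["files"])
    return index.get(flagged_model, set())

commits = [
    {"sha": "a1", "ai_model": "gpt-4-0613", "files": ["billing.py"]},
    {"sha": "b2", "ai_model": None, "files": ["readme.md"]},  # human-only
    {"sha": "c3", "ai_model": "gpt-4-0613", "files": ["auth.py"]},
]
assert affected_files(commits, "gpt-4-0613") == {"billing.py", "auth.py"}
```

This is the audit query you want to be able to run the day a vendor discloses a training data contamination issue, not the query you start building that day.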

These practices are manual, imperfect, and better than nothing. GuardSpine automates them.


Want to see the AI provenance trail on a real PR? Book a walkthrough and I will show you model attribution and review evidence on a live repository.