If you’ve used AI coding assistants at scale, you’ve experienced the paradox: they’re fast enough to ship features in hours, yet prolific enough that you need new verification methods to confirm they actually built what you specified.

There’s a deeper problem: AI models tend to satisfice—a decision-making strategy where they aim for “good enough” rather than optimal. In practice, this means LLMs often report compliance, adherence, and completeness when they’ve actually taken shortcuts. These gaps are frequently visible only in the model’s internal reasoning (the “thinking” logs that most platforms hide from users), leaving you with code that appears correct but subtly violates your architectural decisions.

Traditional code review catches syntax. Security scans catch vulnerabilities. But who verifies that the AI understood and implemented your architectural decisions correctly?

I’ve been using Architecture Decision Records (ADRs) as an audit framework for AI-assisted development for the past six months. The approach is simple: capture requirements as ADRs before implementation, then audit the codebase for compliance after. It’s a third validation layer—independent of code review and security scanning—that catches design drift before it becomes technical debt.

## Why ADRs Work for AI Code Audits

ADRs document the “why” behind decisions. They capture context, constraints, and consequences in a structured format. This makes them perfect for auditing because:

Explicit contracts: ADRs state what should exist in code, not just what shouldn’t. “Use JWT for authentication” is verifiable. “Avoid security issues” is not.

Machine-readable structure: The MADR (Markdown Any Decision Records) format uses consistent headings that AI can parse reliably. Decision, Context, Consequences—each section maps to specific audit queries.

Historical record: Unlike inline comments that get edited or deleted, ADRs persist. You can audit today’s code against decisions made months ago.

Team alignment: When AI implements a feature, ADRs provide the shared language for discussing whether it got the architecture right.

The key insight: ADRs function as acceptance criteria for AI-generated implementations. They’re the spec you audit against.
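
For concreteness, here’s a minimal skeleton in the spirit of MADR, simplified to the three sections named above (the decision content is illustrative):

```markdown
# Use PostgreSQL for All Relational Data

status: accepted
date: YYYY-MM-DD

## Context

We need a single relational store, and the team has deep PostgreSQL experience.

## Decision

Use PostgreSQL for all relational data. No ORM; use raw SQL with prepared statements.

## Consequences

Consistent data access patterns. Contributors must know SQL. Compliance is verifiable by static search.
```

The `status:` line is what the discovery step in the audit process filters on.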

## The Audit Process

I run ADR audits as a separate pass after code review and security scanning. The workflow:

  1. Requirements phase: Document architectural decisions as ADRs (status: accepted)
  2. Implementation phase: AI assistant builds features
  3. Review phase: Standard code review + security scanning
  4. Audit phase: Verify implementation matches ADR decisions
  5. Remediation phase: Address gaps and violations

The audit phase is where ADRs prove their value. You’re not checking for bugs—you’re verifying design fidelity.

### Phase 1: Discovery

Start by identifying active ADRs. Not all decisions need auditing:

```bash
# Find all ADR files
fd -e md . docs/adrs/

# Filter by status (accepted, active, approved)
grep -lE "status: (accepted|active|approved)" docs/adrs/*.md
```

For each active ADR, extract:

  • The decision: What was chosen
  • Constraints: Requirements and boundaries
  • Technologies: Mandated or prohibited tools
  • Patterns: Preferred approaches

Example: ADR-005 might decide “Use PostgreSQL for all relational data” with a constraint of “No ORM, use raw SQL with prepared statements.” That’s two audit points.
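
If you want to script this discovery step, a minimal sketch (assuming each ADR carries a `status:` line, as the grep above expects) could look like:

```python
from pathlib import Path

# Statuses considered auditable (mirrors the grep filter above)
ACTIVE_STATUSES = {"accepted", "active", "approved"}

def active_adrs(adr_dir: str = "docs/adrs") -> list[Path]:
    """Return ADR files whose status line marks them as active."""
    results = []
    for path in sorted(Path(adr_dir).glob("*.md")):
        for line in path.read_text().splitlines():
            if line.lower().startswith("status:"):
                if line.split(":", 1)[1].strip().lower() in ACTIVE_STATUSES:
                    results.append(path)
                break  # only the first status line counts
    return results

if __name__ == "__main__":
    for adr in active_adrs():
        print(adr)
```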

### Phase 2: Codebase Analysis

This is where AI assistants shine as auditors. Give them focused search tasks:

Evidence of compliance:

```bash
# Check for PostgreSQL usage
rg "import.*psycopg2|import.*asyncpg" --type py

# Verify prepared statements pattern
rg "execute\(.*%s" --type py
```

Evidence of violations:

```bash
# Look for prohibited ORMs
rg "from sqlalchemy import|import peewee" --type py

# Check for string interpolation (SQL injection risk)
rg "execute\(f\"" --type py
```

Gaps (where decision should apply but doesn’t):

```bash
# Find database queries that might not use prepared statements
rg "\.execute\(" --type py | grep -v "%s"
```

Examine source files, configuration, API definitions, dependencies, and infrastructure-as-code. The goal: comprehensive coverage of the decision’s scope.
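
To make those search patterns concrete, here is roughly what they distinguish in a psycopg2 codebase (the table, column, and function names are illustrative):

```python
import psycopg2

conn = psycopg2.connect("dbname=app")  # connection details illustrative

def find_user_compliant(email: str):
    # Parameterized query: the driver binds the value separately from
    # the SQL text, satisfying the "prepared statements" constraint.
    with conn.cursor() as cur:
        cur.execute("SELECT id, name FROM users WHERE email = %s", (email,))
        return cur.fetchone()

def find_user_violation(email: str):
    # f-string interpolation: exactly what the audit flags as a
    # violation (and a SQL injection risk).
    with conn.cursor() as cur:
        cur.execute(f"SELECT id, name FROM users WHERE email = '{email}'")
        return cur.fetchone()
```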

### Phase 3: Document Findings

Append an `## Audit` section to each ADR. This creates a historical record of compliance over time:

```markdown
## Audit

### 2026-01-04

**Status:** Compliant

**Findings:**

| Finding | Files | Lines | Assessment |
|---------|-------|-------|------------|
| PostgreSQL connection pooling implemented | `src/db/pool.py` | L12-45 | ✅ compliant |
| Prepared statements used consistently | `src/api/*.py` | multiple | ✅ compliant |
| Connection strings in environment vars | `.env.example`, `config.py` | L8, L34 | ✅ compliant |

**Summary:** All database access follows the ADR. No violations found. Connection pooling implemented with configurable limits.

**Action Required:** None
```

For violations, be specific:

```markdown
### 2026-01-04

**Status:** Violated

**Findings:**

| Finding | Files | Lines | Assessment |
|---------|-------|-------|------------|
| SQLAlchemy ORM used in new feature | `src/features/reports.py` | L15-89 | ❌ violation |
| String formatting in query | `src/api/search.py` | L45 | ❌ violation |

**Summary:** New reports feature bypasses ADR-005 decision by using SQLAlchemy. Search endpoint uses string formatting instead of prepared statements (SQL injection risk).

**Action Required:**
- Rewrite reports feature to use psycopg2 with prepared statements
- Refactor search.py L45 to use parameterized query
- Add linter rule to prevent SQLAlchemy imports
```
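
The last action item is straightforward to automate. Here’s a minimal checker that a pre-commit hook could run over staged Python files (the banned-module list is an assumption drawn from ADR-005):

```python
import ast
import sys

BANNED = {"sqlalchemy", "peewee"}  # ORMs prohibited by ADR-005

def banned_imports(source: str, filename: str) -> list[str]:
    """Return 'file:line module' entries for prohibited imports."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        findings += [
            f"{filename}:{node.lineno} {name}"
            for name in names
            if name.split(".")[0] in BANNED
        ]
    return findings

if __name__ == "__main__":
    problems = []
    for path in sys.argv[1:]:
        with open(path) as f:
            problems += banned_imports(f.read(), path)
    print("\n".join(problems))
    sys.exit(1 if problems else 0)
```

Failing the commit on any finding keeps new violations from ever reaching review.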

### Phase 4: Generate Summary Report

Create `docs/adrs/README.md` with an overview:

```markdown
# ADR Compliance Summary

**Audit Date:** 2026-01-04
**Audited By:** Claude Code
**Codebase:** zircote/project-name

## Overview

| # | ADR | Status | Health | Action Required |
|---|-----|--------|--------|-----------------|
| 001 | [Use JWT for Authentication](001-jwt-authentication.md) | Accepted | ✅ Compliant | None |
| 002 | [API Versioning Strategy](002-api-versioning.md) | Accepted | ⚠️ Partial | Review needed |
| 003 | [Error Handling Approach](003-error-handling.md) | Accepted | ✅ Compliant | None |
| 005 | [Database Technology Choice](005-postgresql.md) | Accepted | ❌ Violated | Remediation required |

## Critical Findings

**ADR-005: Database Technology Choice** - New reports feature (src/features/reports.py) violates the decision by using SQLAlchemy ORM instead of raw SQL with prepared statements. This introduces an abstraction layer explicitly prohibited by the ADR. **Impact:** Inconsistent data access patterns, potential N+1 query issues.

**Recommended Action:** Rewrite reports.py to match pattern in src/api/users.py (compliant example). Estimated effort: 4-6 hours.

## Recommendations

1. **Add pre-commit hook**: Check for SQLAlchemy imports to prevent future violations
2. **Consider new ADR**: Document approved logging strategy (inconsistent across modules)
3. **Update ADR-002**: API versioning partial compliance suggests decision may need refinement
```

## Health Criteria

Use consistent health indicators:

  • ✅ Compliant: All examined code follows the decision; no violations found
  • ⚠️ Partial: Decision followed in most places; minor violations or gaps exist
  • ❌ Violated: Significant code contradicts the decision; remediation required
  • ❓ Unverifiable: Decision cannot be validated through static analysis

Partial compliance isn’t failure—it’s a signal to investigate. Maybe the decision needs updating. Maybe the implementation has legitimate exceptions. The audit surfaces these discussions.

## Practical Execution Tips

Prioritize depth over breadth: Thoroughly analyze representative samples rather than running superficial scans across everything. For large codebases, focus on:

  • API boundaries (where decisions are most visible)
  • Core modules (where violations cause the most damage)
  • Recent commits (where AI-generated code appears; a sketch for this slice follows the list)
  • Areas explicitly mentioned in ADRs
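
For the recent-commits slice, a small helper can list the files an audit should prioritize (a sketch; the 30-day window is an arbitrary default):

```python
import subprocess

def recently_changed(days: int = 30) -> list[str]:
    """Return files touched by commits within the last `days` days."""
    out = subprocess.run(
        ["git", "log", f"--since={days} days ago", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    # De-duplicate while preserving order; skip blank separator lines.
    return list(dict.fromkeys(line for line in out.splitlines() if line))
```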

Preserve audit history: Don’t overwrite previous audit sections. Append new audits below. This creates a compliance timeline showing whether adherence improves or degrades.
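
If you script the append, the key detail is adding a new dated entry rather than rewriting the section. A sketch:

```python
from datetime import date
from pathlib import Path

def append_audit(adr_path: str, body: str) -> None:
    """Append a dated audit entry, preserving earlier audits."""
    path = Path(adr_path)
    text = path.read_text()
    # Create the top-level Audit section once; afterwards only add entries.
    if "## Audit" not in text:
        text += "\n## Audit\n"
    text += f"\n### {date.today().isoformat()}\n\n{body}\n"
    path.write_text(text)
```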

Note vague ADRs: If a decision is too ambiguous to audit, that’s valuable feedback. Flag it: “ADR-008 states ‘use appropriate caching’ but doesn’t specify technology or criteria. Cannot verify compliance. Suggest clarifying decision.”

Focus on architectural scope: ADRs describe high-level decisions. Don’t audit implementation minutiae like variable names or formatting—that’s what linters handle.

## Industry Context

ADR audits align with established practices:

Compliance-as-Code: The infrastructure-as-code movement taught us to version and validate configuration. ADR audits extend this to architecture—version your decisions, validate your implementation.

Continuous Compliance: SOC 2, ISO 27001, and similar frameworks require demonstrating adherence to documented policies. ADRs provide the documentation; audits provide the evidence.

AI Observability: As AI systems generate more code, observability becomes critical. ADR audits are architectural observability—visibility into whether AI actually implemented what you decided.

## Prompt for AI-Assisted Audits

I use Claude Code to perform the audit itself. Here’s the prompt:

````markdown
Audit the codebase for compliance with all active Architecture Decision Records (ADRs).

## Phase 1: Discovery

1. Locate all ADR documents (typically in `docs/adrs/`, `docs/adr/`, or `adr/`)
2. Identify ADRs with status: accepted, active, or approved (skip deprecated, superseded, rejected)
3. For each active ADR, extract:
   - The decision made
   - Implementation requirements or constraints
   - Technologies, patterns, or practices mandated or prohibited

## Phase 2: Codebase Analysis

For each active ADR, search the codebase for:
- Evidence of compliance (implementations following the decision)
- Evidence of violations (code contradicting the decision)
- Gaps (areas where the decision should apply but isn't implemented)

Examine: source files, configuration, API definitions, dependencies, infrastructure-as-code.

## Phase 3: Document Findings

Append an `## Audit` section to each ADR file with this structure:

```markdown
## Audit

### YYYY-MM-DD

**Status:** Compliant | Partial | Violated | Unverifiable

**Findings:**

| Finding | Files | Lines | Assessment |
|---------|-------|-------|------------|
| [specific observation] | `path/to/file.py` | L42-58 | compliant/violation/gap |
| ... | ... | ... | ... |

**Summary:** [1-2 sentence assessment of overall adherence]

**Action Required:** [None | List specific remediation needed]
```

## Phase 4: Generate Summary Report

Create `docs/adrs/README.md` with:

```markdown
# ADR Compliance Summary

**Audit Date:** YYYY-MM-DD
**Audited By:** Claude Code
**Codebase:** [repository name or path]

## Overview

| # | ADR | Status | Health | Action Required |
|---|-----|--------|--------|-----------------|
| 1 | [ADR-001: Title](001-title.md) | Accepted | ✅ Compliant | None |
| 2 | [ADR-002: Title](002-title.md) | Accepted | ⚠️ Partial | Review needed |

## Critical Findings

[List any ADRs with Violated status, summarizing conflicts and recommended actions]

## Recommendations

[Any patterns observed, suggested new ADRs, or ADRs that should be reconsidered]
```

## Health Criteria

- **✅ Compliant**: All examined code follows the decision; no violations found
- **⚠️ Partial**: Decision is followed in most places; minor violations or gaps exist
- **❌ Violated**: Significant code contradicts the decision; remediation required
- **❓ Unverifiable**: Decision cannot be validated through static analysis

## Execution Notes

- Prioritize depth over breadth—thoroughly analyze a representative sample
- Focus on: API boundaries, core modules, recent commits, areas explicitly mentioned in ADRs
- If ADR is too vague to audit, note this and suggest clarifying the decision criteria
- Preserve existing Audit sections as historical record; append new audit below previous ones
````

Copy this into a new Claude Code session, and it will perform the complete audit workflow.

## Automating with GitHub Actions

The real power comes from running audits automatically. Create `.github/workflows/adr-audit.yml`:

```yaml
name: ADR Compliance Audit

on:
  schedule:
    - cron: '0 0 * * 1'  # Weekly on Monday
  workflow_dispatch:      # Manual trigger

jobs:
  audit:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write

    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11  # v4.1.1

      - name: Run ADR Audit
        uses: anthropics/claude-code-action@a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0  # v1 - Replace with actual SHA
        with:
          api_key: ${{ secrets.ANTHROPIC_API_KEY }}  # assumes a repo secret with this name
          prompt_file: .github/prompts/adr-audit.md

      - name: Create PR if violations found
        if: failure()
        uses: peter-evans/create-pull-request@5e914681df9dc83aa4e4905692ca88beb2f9e91f  # v7.0.5
        with:
          title: "ADR Compliance Violations Detected"
          body: "Automated audit found ADR violations. Review docs/adrs/README.md for details."
          branch: adr-audit-${{ github.run_id }}  # unique branch per run
          labels: architecture, audit
```

This runs weekly audits and automatically creates PRs when violations are detected. You can also trigger a run manually before releases (for example, `gh workflow run adr-audit.yml` with the GitHub CLI).

## Real-World Impact

After six months of ADR audits on AI-assisted projects:

Architectural drift caught early: Three times, audits revealed AI-generated features that subtly violated design decisions, caught at audit time rather than in production.

Documentation quality improved: Writing audit-focused ADRs forces clarity. If you can’t audit it, your decision was too vague.

Team confidence increased: I trust AI-assisted code more when I know it will be audited against documented decisions.

Faster reviews: Code reviewers focus on implementation quality. Architectural alignment is verified separately.

The time investment is minimal—15-30 minutes per audit for a medium-sized project. The return is significant: architectural consistency even as AI generates thousands of lines of code.

## Getting Started

  1. Adopt ADRs: Use MADR format for consistency
  2. Choose your storage: Track ADRs in Git notes with zircote/git-adr or use file-based ADRs in docs/adrs/
  3. Create audit prompt: Adapt the prompt above to your project structure
  4. Run first audit: Manually audit existing ADRs to establish baseline
  5. Automate: Set up GitHub Action for continuous compliance
  6. Iterate: Refine ADRs based on audit findings

The goal isn’t bureaucracy—it’s confidence. When AI can generate entire features in minutes, architectural audits ensure speed doesn’t come at the cost of coherence.


The ADR audit approach builds on the MADR specification and complements existing quality gates. For tracking ADRs in Git, see git-adr. The audit prompt can be integrated with Claude Code, GitHub Copilot, or any AI coding assistant that supports file analysis and structured output.