How does AI read my codebase to write compliance policies?
How the AI Analysis Works
Step 1: Connect (Read-Only Access)
The AI tool connects to your GitHub repositories and cloud accounts with read-only permissions. It can see your code and configurations but can't modify anything.
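Before connecting, you can sanity-check the read-only claim by reviewing the permissions the integration requests. The sketch below assumes those permissions arrive as a mapping of scope name to access level, in the style of GitHub App permissions (`read`, `write`, `admin`). The function name and the sample request are hypothetical.

```python
def non_read_only(permissions: dict[str, str]) -> list[str]:
    """Return the permission names that grant more than read access."""
    return sorted(name for name, level in permissions.items() if level != "read")


# Hypothetical permission request, for illustration only.
requested = {"contents": "read", "metadata": "read", "pull_requests": "write"}

for name in non_read_only(requested):
    print(f"Not read-only: {name}")
```

An empty result means every requested scope is read-only. Anything else is worth questioning before you grant access.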
Step 2: Analyze Code Patterns
The AI looks for specific patterns that map to SOC 2 controls:
| What It Finds | SOC 2 Mapping | Policy Output |
|---|---|---|
| NextAuth or Clerk configuration | CC6.1 (Access Controls) | "Users authenticate via [provider] with [MFA method]" |
| GitHub Actions workflow files | CC8.1 (Change Management) | "CI runs [test suite] on every PR before merge" |
| AWS IAM policies | CC6.1 (Logical Access) | "Cloud access restricted via IAM roles with [specific policies]" |
| Database encryption settings | CC6.1, CC6.7 (Data Protection) | "Data at rest encrypted using [method] on [provider]" |
| Sentry/Datadog configuration | CC7.2 (Monitoring) | "Application errors tracked via [tool] with [alerting rules]" |
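The detection step in the table can be pictured as a set of rules run over the repository's file paths. This is a minimal sketch, not how any particular tool works: the rule list, regexes, and `detect` function are illustrative assumptions based on common file-naming conventions, and real tools likely combine many more signals (dependency manifests, file contents, cloud APIs).

```python
import re

# Illustrative (finding, control, path regex) rules; assumptions, not a vendor's rule set.
RULES = [
    ("NextAuth configuration", "CC6.1", re.compile(r"\[\.\.\.nextauth\]")),
    ("GitHub Actions workflow", "CC8.1", re.compile(r"^\.github/workflows/[^/]+\.ya?ml$")),
    ("Sentry configuration", "CC7.2", re.compile(r"(^|/)sentry\.[a-z]+\.config\.(js|ts)$")),
]


def detect(paths: list[str]) -> list[tuple[str, str]]:
    """Return (finding, control) pairs for every rule matched by at least one path."""
    return [
        (name, control)
        for name, control, pattern in RULES
        if any(pattern.search(path) for path in paths)
    ]


repo_paths = [".github/workflows/ci.yml", "pages/api/auth/[...nextauth].ts", "src/index.ts"]
for name, control in detect(repo_paths):
    print(f"{control}: {name}")
```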
Step 3: Map to Controls
The AI maps discovered patterns to Trust Services Criteria, identifying which controls your systems already satisfy and which have gaps.
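As a rough illustration of this mapping, the sketch below groups findings by criterion and separates criteria that have evidence from those that have none. The in-scope list is a subset chosen for the example, and the function and variable names are hypothetical.

```python
# Subset of criteria used for illustration; a real assessment covers many more.
IN_SCOPE = ["CC6.1", "CC7.2", "CC8.1"]


def map_to_controls(findings: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (evidence, control) findings by control, keeping empty entries for gaps."""
    coverage: dict[str, list[str]] = {control: [] for control in IN_SCOPE}
    for evidence, control in findings:
        if control in coverage:
            coverage[control].append(evidence)
    return coverage


findings = [
    ("Clerk configuration", "CC6.1"),
    ("AWS IAM policies", "CC6.1"),
    ("GitHub Actions workflow files", "CC8.1"),
]
coverage = map_to_controls(findings)
with_evidence = [control for control, evidence in coverage.items() if evidence]
gaps = [control for control, evidence in coverage.items() if not evidence]
print("Evidence found:", with_evidence)
print("No evidence found:", gaps)
```

Finding evidence for a criterion is not the same as satisfying it. Whether the control is adequate remains the auditor's judgment.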
Step 4: Generate Policies
For each control area, the AI writes policy statements that reference your actual systems. Instead of "the company uses encryption," it writes "customer data at rest is encrypted using AES-256 in Supabase PostgreSQL."
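The idea of filling a policy statement with discovered facts can be sketched with plain string templates. This is a simplified stand-in for the AI's text generation: the templates reuse the bracketed placeholders from the table, while `render_policy` and the sample values are hypothetical.

```python
from string import Formatter

# Templates adapted from the table above; keys are the mapped criteria.
TEMPLATES = {
    "CC6.1": "Users authenticate via {provider} with {mfa_method}.",
    "CC8.1": "CI runs {test_suite} on every PR before merge.",
}


def render_policy(control: str, facts: dict[str, str]) -> str:
    """Fill a control's template, refusing to emit a statement with missing facts."""
    template = TEMPLATES[control]
    needed = {field for _, field, _, _ in Formatter().parse(template) if field}
    missing = needed - set(facts)
    if missing:
        raise ValueError(f"missing facts for {control}: {sorted(missing)}")
    return template.format(**facts)


# Sample facts are made up for the example.
print(render_policy("CC6.1", {"provider": "Clerk", "mfa_method": "TOTP-based MFA"}))
```

Raising an error on a missing fact keeps the output from falling back to a generic statement that references nothing in your systems.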
Step 5: Identify Gaps
The AI flags areas where it can't find evidence of a control, such as missing MFA enforcement, no branch protection, or no logging configuration. These become your remediation priorities.
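Gap identification can be sketched as a checklist evaluated against what the scan observed. The check keys, remediation wording, and `find_gaps` function below are assumptions for illustration. Anything the scan could not observe is treated as a gap.

```python
# Hypothetical checks paired with the remediation each one implies.
CHECKS = {
    "mfa_enforced": "Enforce MFA for all users",
    "branch_protection": "Enable branch protection on the default branch",
    "logging_configured": "Configure application and audit logging",
}


def find_gaps(observed: dict[str, bool]) -> list[str]:
    """Return remediation actions for checks that failed or were never observed."""
    return [action for key, action in CHECKS.items() if not observed.get(key, False)]


# Sample scan result: MFA found, branch protection absent, logging not observed at all.
scan = {"mfa_enforced": True, "branch_protection": False}
for action in find_gaps(scan):
    print("TODO:", action)
```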
What the AI Doesn't Do
- Doesn't analyze your proprietary business logic or trade secrets (it focuses on infrastructure and security patterns)
- Doesn't modify your code
- Doesn't store your source code
- Doesn't replace the auditor
Privacy Consideration
Codebase-aware tools typically analyze patterns and configurations, not business logic. They look at how you authenticate users, not what your application does with their data. Check each tool's data handling policy before connecting your repos.
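One way a tool could limit itself to configuration rather than business logic is a path allowlist applied before any file is read. This sketch is an assumption about how such scoping might work, not a description of any specific product. The patterns and the `in_scope` function are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist of configuration-style paths.
# In fnmatch patterns, "*" also matches "/" characters.
ALLOWED_PATTERNS = [".github/workflows/*", "*.tf", "*/auth/*", "Dockerfile"]


def in_scope(path: str) -> bool:
    """Return True if the path matches at least one allowlisted pattern."""
    return any(fnmatchcase(path, pattern) for pattern in ALLOWED_PATTERNS)


paths = [".github/workflows/ci.yml", "infra/main.tf", "src/auth/config.ts", "src/billing/pricing.ts"]
print([path for path in paths if in_scope(path)])
```

If a vendor's data handling policy describes a scoping mechanism like this, ask which paths and file types it actually includes.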