SOC 2 application testing evidence: the complete guide

    Vanta and Drata automate infrastructure compliance. The 20% application testing gap is still manual screenshots. Here’s how to close it.

    Why this is hard to test

    • Compliance platforms (Vanta, Drata) automate infrastructure evidence but can’t see inside your application — they check “is MFA enabled?” not “does MFA actually work?”
    • CI logs expire, test dashboards aren’t formatted for auditors, and there’s no standard bridge between “test passed” and “auditable proof of control effectiveness”
    • Evidence collection requires 2 engineers for ~2 weeks per audit cycle — manually mapping Jira tickets to test runs to screenshots to Confluence pages
    • SOC 2 Type II assesses whether controls operated effectively across the whole observation period, so it needs continuous evidence rather than point-in-time snapshots. Yet most teams collect evidence once a quarter in a scramble

    Approach 1: Manual evidence collection

    1. Identify which SOC 2 controls require application testing evidence (CC8.1 Change Management, CC6.1 Access Control, CC7.2 Monitoring)
    2. For each control, identify which test cases demonstrate compliance (e.g., CC8.1: “tests run on every PR before merge”)
    3. After each test run, capture screenshots of pass/fail results, timestamp them, and link to the git commit
    4. Organize evidence by control ID in a shared document (Confluence, Notion, or Google Docs)
    5. Before audit: compile evidence into a PDF per control, include test names, dates, results, and commit SHAs
    6. Repeat every quarter or per audit window; plan for 2 engineers spending 60–80 hours total
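
    Steps 4 and 5 amount to grouping logged results by control ID and rendering one summary per control. The sketch below shows that grouping in Python. It assumes each piece of evidence was logged as a small record; the `EvidenceRecord` fields, the helper names, and the sample values are hypothetical placeholders, not a required format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """One manually captured result (hypothetical shape, for illustration)."""
    control_id: str   # e.g. "CC6.1"
    test_name: str
    result: str       # "pass" or "fail"
    run_date: str     # ISO 8601 date of the test run
    commit_sha: str   # commit the tests ran against
    screenshot: str   # path or link to the stored screenshot


def group_by_control(records):
    """Group records by control ID, oldest run first within each control."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.control_id].append(record)
    return {
        control: sorted(items, key=lambda r: r.run_date)
        for control, items in sorted(grouped.items())
    }


def render_summary(control_id, records):
    """Render the plain-text body of a per-control evidence summary."""
    lines = [f"Control {control_id}: {len(records)} evidence item(s)"]
    for r in records:
        lines.append(
            f"- {r.run_date} | {r.test_name} | {r.result} | "
            f"commit {r.commit_sha} | {r.screenshot}"
        )
    return "\n".join(lines)


# Placeholder data; the SHAs and paths are made up.
records = [
    EvidenceRecord("CC6.1", "Login with valid credentials", "pass",
                   "2024-02-01", "a1b2c3d", "evidence/login-0201.png"),
    EvidenceRecord("CC6.1", "Login with valid credentials", "pass",
                   "2024-01-15", "9f8e7d6", "evidence/login-0115.png"),
]
for control, items in group_by_control(records).items():
    print(render_summary(control, items))
```

    Even with a helper like this, someone still has to capture every screenshot and log every record by hand, which is where the hours go.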

    Approach 2: Zerocheck (automatic evidence generation)

    1. Tag tests with SOC 2 control IDs in your test configuration (CC7.2, CC6.1, CC8.1)
    2. Every test run automatically generates a timestamped, commit-bound evidence artifact per control
    3. Evidence includes: test name, control ID, pass/fail, git commit SHA, timestamp, step screenshots
    4. Export as PDF or JSON per control in one click; audit prep becomes a 2-hour review, not a 2-week sprint
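
    The fields listed in step 3 suggest a record shape along the following lines. This is a sketch under stated assumptions, not Zerocheck's actual output format: the `EvidenceArtifact` class, its field names, and the `artifacts_for_run` helper are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EvidenceArtifact:
    """Hypothetical per-control artifact carrying the fields listed above."""
    test_name: str
    control_id: str
    passed: bool
    commit_sha: str
    timestamp: str                                   # ISO 8601, UTC
    step_screenshots: tuple = field(default_factory=tuple)

    def to_json(self) -> str:
        # sort_keys keeps the field order stable across runs
        return json.dumps(asdict(self), sort_keys=True, indent=2)


def artifacts_for_run(test_name, control_ids, passed, commit_sha,
                      screenshots=(), now=None):
    """Create one artifact per tagged control for a single test run."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return [
        EvidenceArtifact(test_name, control_id, passed, commit_sha,
                         stamp, tuple(screenshots))
        for control_id in control_ids
    ]


run = artifacts_for_run(
    "Checkout flow processes payment end-to-end",
    ["CC7.2", "CC8.1"],
    passed=True,
    commit_sha="a1b2c3d",                            # placeholder SHA
    screenshots=["step-1.png", "step-2.png"],
    now=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
)
print(run[0].to_json())
```

    A test tagged with two controls yields two artifacts that share the same commit SHA and timestamp, so each control's evidence folder is complete on its own.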

    Which SOC 2 controls need application testing evidence

    SOC 2’s Trust Services Criteria define dozens of controls, but only a handful require evidence that your application actually works as intended. The three that matter most for engineering teams are CC8.1 (Change Management), CC6.1 (Access Control), and CC7.2 (Monitoring).

    CC8.1 (Change Management) requires you to prove that code changes are tested before they reach production. The control asks: do you have a systematic process for validating changes? For infrastructure, Vanta checks that branch protection rules exist and PRs require approvals. But branch protection doesn’t prove the code was tested. It proves the PR was approved. The auditor wants to see that tests actually ran against the change and passed. That means test execution evidence tied to specific commits.

    CC6.1 (Access Control) requires proof that your application enforces authentication and authorization. Vanta can verify that Okta SSO is configured or that AWS IAM policies restrict access. But it can’t prove that your login page actually works, that RBAC rules are enforced in the UI, or that a deprovisioned user is truly locked out at the application layer. Those require functional tests that exercise the access control behavior.

    CC7.2 (Monitoring) requires evidence that you actively monitor system health and detect anomalies. Infrastructure monitoring (Datadog alerts, PagerDuty incidents) covers part of this. But auditors increasingly ask: do you verify that your application’s core functionality is working? An API returning 200 OK doesn’t mean the checkout flow actually processes payments. Application-level health verification (real user flows running continuously) fills this gap.

    The common thread: infrastructure compliance tools cover configuration-level evidence. They confirm that controls are set up. But they can’t confirm that controls work at the application layer. “The checkout flow works on this commit” is a statement about application behavior, not infrastructure configuration. That’s the evidence gap.
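
    To illustrate what a functional CC6.1 test exercises, the sketch below checks behavior rather than configuration. The `App` class is a toy in-memory stand-in invented for this example; a real test would drive your application's login and admin routes instead.

```python
class App:
    """Toy in-memory application (hypothetical) with users, roles, sessions."""

    def __init__(self):
        self.users = {}        # name -> {"password", "role", "active"}
        self.sessions = set()  # names with a live session

    def add_user(self, name, password, role="member"):
        self.users[name] = {"password": password, "role": role, "active": True}

    def deprovision(self, name):
        # Deactivate the account and revoke any live session.
        self.users[name]["active"] = False
        self.sessions.discard(name)

    def login(self, name, password):
        user = self.users.get(name)
        if user and user["active"] and user["password"] == password:
            self.sessions.add(name)
            return True
        return False

    def get_admin_panel(self, name):
        """Return an HTTP-style status code for the admin page."""
        if name not in self.sessions:
            return 401  # not authenticated
        if self.users[name]["role"] != "admin":
            return 403  # authenticated but not authorized
        return 200


def test_rbac_rejects_non_admin():
    app = App()
    app.add_user("alice", "pw-a", role="admin")
    app.add_user("bob", "pw-b")
    assert app.login("alice", "pw-a") and app.login("bob", "pw-b")
    assert app.get_admin_panel("alice") == 200
    assert app.get_admin_panel("bob") == 403


def test_deprovisioned_user_is_locked_out():
    app = App()
    app.add_user("carol", "pw-c")
    assert app.login("carol", "pw-c")
    app.deprovision("carol")
    assert app.get_admin_panel("carol") == 401  # session was revoked
    assert not app.login("carol", "pw-c")       # cannot sign back in


test_rbac_rejects_non_admin()
test_deprovisioned_user_is_locked_out()
```

    Both tests assert on what the application does for a given user, which is the part a configuration check cannot observe.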

    What auditors actually accept as evidence

    The evidence format matters more than most engineering teams realize. Auditors are not engineers; they’re evaluating whether your controls are effective based on documentation they can independently verify. The gold standard is evidence that is timestamped, immutable, and traceable to a specific system state.

    Screenshots have been the traditional format for application testing evidence. A screenshot showing a passed test suite with a visible timestamp and commit SHA is universally accepted. The problem is collection: someone has to take the screenshot, label it with the control ID, link it to the commit, and store it in the evidence repository. Multiply that by dozens of controls and quarterly audit windows, and you’re looking at days of manual labor.

    Structured artifacts (JSON or PDF documents containing test name, control ID, pass/fail result, git commit SHA, timestamp, and step-by-step screenshots) are increasingly preferred by auditors. They’re harder to fabricate than a single screenshot, they’re machine-readable for audit platforms, and they provide a complete chain of evidence from code change to test execution to result. Several Big Four firms have confirmed they accept structured test artifacts as primary evidence.

    CI logs are not sufficient. They expire (GitHub Actions logs are deleted after 90 days by default), they’re not formatted for audit review, and they don’t map to control IDs. An auditor handed a raw CI log has to manually search for relevant test results, match them to controls, and verify timestamps, which is work they shouldn’t have to do and often won’t.

    The key principle: evidence must be immutable and traceable. Once generated, it shouldn’t change. It should link directly to the code state (commit SHA) and the control it demonstrates (control ID). If an auditor can follow the chain from control requirement to test result to code change without asking you for help, your evidence is good.
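
    One common way to make an artifact tamper-evident is to record a content digest when it is generated. The sketch below assumes the artifact is a JSON-serializable dict; the field names and values are placeholders, and the helper names are hypothetical. A digest only reveals that content changed, so it still has to be stored somewhere the artifact's author cannot rewrite.

```python
import hashlib
import json


def artifact_digest(artifact: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the artifact."""
    # sort_keys and fixed separators make the encoding deterministic
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(artifact: dict, expected_digest: str) -> bool:
    """True if the artifact still matches the digest recorded at creation."""
    return artifact_digest(artifact) == expected_digest


artifact = {
    "control_id": "CC6.1",
    "test_name": "Login with valid credentials",
    "result": "pass",
    "commit_sha": "a1b2c3d",                     # placeholder SHA
    "timestamp": "2024-01-15T12:00:00+00:00",
}
recorded = artifact_digest(artifact)

tampered = dict(artifact, result="fail")         # simulate a later edit
print(verify(artifact, recorded))                # True
print(verify(tampered, recorded))                # False
```

    Because the commit SHA and control ID are part of the hashed content, the digest covers the traceability links as well as the result.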

    The Vanta gap: infrastructure vs. application evidence

    Vanta, Drata, and Sprinto have transformed SOC 2 compliance by automating roughly 80% of evidence collection. They pull infrastructure configurations from AWS, GCP, Azure, Okta, GitHub, and dozens of other integrations to prove that security controls are in place. IAM policies are correct. Branch protection is enabled. Vulnerability scans are running. Access reviews are completed. This is genuinely valuable work that used to consume weeks of manual effort.

    But these platforms have a fundamental limitation: they operate at the infrastructure and configuration layer. They can verify that MFA is enabled in Okta. They cannot verify that MFA actually works when a user logs into your application. They can confirm that GitHub branch protection requires PR reviews. They cannot confirm that your test suite actually runs and passes on those PRs. They can check that Datadog alerts are configured. They cannot check that your checkout flow processes payments correctly.

    This creates what we call the Vanta gap: the 20% of SOC 2 evidence that requires testing your application’s actual behavior. And ironically, this 20% is where teams spend the most manual hours. Infrastructure evidence is automated. Application evidence is still screenshots in a Google Doc. As Thomas Ptacek of Fly.io observed: “The guts of a SOC 2 audit are a giant spreadsheet questionnaire and a battery of screenshots.” Those screenshots are overwhelmingly application-level evidence: proof that the thing works, not just that it’s configured.

    The teams that struggle most with SOC 2 aren’t the ones missing Vanta. They’re the ones who have Vanta and assume compliance is handled. When the auditor asks for proof that access controls work at the application layer, or that code changes are tested before deployment, Vanta’s dashboard shows green checkmarks for the infrastructure side. The application side is a blank. That blank is where 2 engineers spend 2 weeks taking screenshots, mapping them to controls, and assembling evidence packets, every single audit cycle.

    Mapping tests to control IDs

    The most effective way to automate evidence generation is to make the mapping between tests and SOC 2 controls explicit in your test configuration. Rather than retroactively searching through test results to find evidence for each control, you tag tests with control IDs upfront. When tests run, evidence is automatically organized by control.

    The mapping is straightforward. A login test that verifies authentication works maps to CC6.1 (Access Control). A test that confirms CI runs on every PR before merge maps to CC8.1 (Change Management). An uptime or health check test that verifies core functionality maps to CC7.2 (Monitoring). A role-based access test that verifies admin-only pages reject regular users also maps to CC6.1.

    The key is making this mapping explicit in test metadata rather than inferring it after the fact. When the mapping lives in the test configuration, every test run automatically produces evidence organized by control ID. There’s no manual curation step, no post-hoc searching through results, and no risk of missing evidence for a control because someone forgot to take a screenshot.

    This approach also makes it easy to identify coverage gaps. If no tests are tagged with CC7.2, you know your monitoring control lacks evidence. If CC6.1 only has one test, you know your access control evidence is thin. The mapping doubles as a coverage report for your compliance posture.

    # zerocheck.config.yml — test-to-control mapping
    tests:
      - name: "Login with valid credentials"
        controls: ["CC6.1"]
        description: "Verifies authentication enforces valid credentials"
    
      - name: "RBAC: non-admin rejected from admin panel"
        controls: ["CC6.1"]
        description: "Verifies role-based access control enforcement"
    
      - name: "PR triggers CI and tests pass before merge"
        controls: ["CC8.1"]
        description: "Verifies change management testing requirement"
    
      - name: "Checkout flow processes payment end-to-end"
        controls: ["CC7.2", "CC8.1"]
        description: "Verifies critical flow and active monitoring"
    
      - name: "Health check verifies core app functionality"
        controls: ["CC7.2"]
        description: "Verifies application-level monitoring"
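
    The coverage check described above can be sketched as a count of tagged tests per control. The `TESTS` list restates part of the mapping as plain Python data; the `coverage_report` helper, the `REQUIRED_CONTROLS` tuple, and the `min_tests` threshold are assumptions made for illustration, not part of any tool.

```python
from collections import Counter

REQUIRED_CONTROLS = ("CC6.1", "CC7.2", "CC8.1")

# A subset of the mapping above, as plain Python data.
TESTS = [
    {"name": "Login with valid credentials", "controls": ["CC6.1"]},
    {"name": "RBAC: non-admin rejected from admin panel", "controls": ["CC6.1"]},
    {"name": "Checkout flow processes payment end-to-end",
     "controls": ["CC7.2", "CC8.1"]},
]


def coverage_report(tests, required=REQUIRED_CONTROLS, min_tests=2):
    """Classify each required control as 'missing', 'thin', or 'ok'."""
    counts = Counter(c for test in tests for c in test["controls"])
    report = {}
    for control in required:
        n = counts[control]
        if n == 0:
            status = "missing"   # no evidence will be generated at all
        elif n < min_tests:
            status = "thin"      # evidence exists but rests on few tests
        else:
            status = "ok"
        report[control] = (n, status)
    return report


for control, (n, status) in coverage_report(TESTS).items():
    print(f"{control}: {n} test(s), {status}")
```

    Running a check like this in CI would surface a control that has lost its last tagged test before the audit window does.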

    Common pitfalls

    • Don’t collect evidence retroactively — set up continuous generation from day one so evidence accumulates automatically
    • Don’t rely on CI logs as evidence — they expire, they’re not auditor-friendly, and they don’t map to controls
    • Don’t assume your auditor will reject AI-generated evidence — the evidence is factual (timestamps, screenshots, pass/fail) regardless of how the tests were authored
    • Don’t skip application testing evidence because Vanta “handles compliance” — Vanta covers infrastructure, not your application’s behavior

    FAQ

    Will my auditor accept AI-generated evidence?

    The evidence is timestamped test results with screenshots and step traces, the same thing you’d collect manually. The AI writes the tests; the evidence is factual output. In substance it’s the same information a CI run produces, just formatted for auditors and mapped to control IDs.

    We already use Vanta. Why do we need this?

    Vanta automates infrastructure evidence: access reviews, config checks, vulnerability scans. It can’t prove your login flow works, your checkout processes payments correctly, or your access controls enforce RBAC at the application layer. That’s the 20% gap.

    How do you map tests to SOC 2 controls?

    Tag tests with control IDs (CC7.2, CC6.1, CC8.1) in your test config. Every run produces an artifact per control. We provide a mapping guide for common SaaS controls.

    We have zero tests. Can we still generate evidence?

    Yes. Zerocheck generates tests automatically by scanning your app. Tests run, evidence generates. Zero to evidence in under an hour.
