A study on vibe coding security found that 53% of AI-generated code contains security vulnerabilities. But the second finding is wilder: after 5 rounds of asking GPT-4o to fix the vulnerabilities, the code had 37% MORE vulnerabilities than it started with. The model confidently introduces new problems while "fixing" old ones.

~41% of committed code is now AI-generated (GitHub data). Everyone's talking about productivity gains. Far fewer people talk about verification.

When we're generating code 2-3x faster, the test coverage gap grows fast, because the capacity to write and review tests doesn't speed up automatically along with it. We're shipping faster than we can verify. And "write tests for your code" doesn't really scale when code volume triples overnight lol
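To put rough numbers on that gap, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, the team size is hypothetical, and it pretends test capacity stays flat while generation speeds up. Real teams won't be this linear.

```python
def coverage(weekly_output, weekly_test_capacity, speedup):
    """Fraction of newly generated code that gets tested each week.

    Assumes (hypothetically) that test capacity is measured in the same
    units as output (e.g. lines) and does not change with the speedup.
    """
    generated = weekly_output * speedup
    return min(1.0, weekly_test_capacity / generated)


def untested_backlog(weekly_output, weekly_test_capacity, speedup, weeks):
    """Cumulative amount of code shipped without tests after `weeks` weeks."""
    generated = weekly_output * speedup
    gap_per_week = max(0.0, generated - weekly_test_capacity)
    return gap_per_week * weeks


if __name__ == "__main__":
    # Hypothetical team: 1,000 lines/week, test capacity sized for that pace.
    for speedup in (1, 2, 3):
        share = coverage(1000, 1000, speedup)
        backlog = untested_backlog(1000, 1000, speedup, weeks=12)
        print(f"{speedup}x: {share:.0%} of new code tested, "
              f"{backlog:,.0f} untested lines after 12 weeks")
```

Under those assumptions, a team that was keeping up at 1x is testing only about a third of what it ships at 3x, and the untested pile keeps growing every week until testing capacity catches up.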

Every $1 spent on testing saves $3-5 in downstream fixes. But teams rarely budget for testing because it is invisible until something breaks in production.

"Works on my machine" is not the same as "works in production with real users and real payment providers." We're accelerating creation without accelerating verification, and I don't think we've figured out the answer yet.

What is your team doing about this? Scaling testing alongside AI code, or just shipping faster and hoping?

