Every "pass locally, fail in CI" cause


Pulled from engineering blogs, HN threads, and GitHub issues. Figured I'd put it all in one place since this comes up constantly.

Timing / race conditions (~40% of cases): Hardcoded waits that are long enough on your M2 but not on a 2-core CI runner. Animations that finish at different speeds. Network requests that return faster locally than on CI with shared infra. Database seeds that run async, so a test can start before its data exists.
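A rough sketch of the usual fix for hardcoded waits: poll for the condition you actually care about, with a timeout, instead of sleeping for a fixed time. `waitUntil` and the simulated `ready` flag are made up for illustration. Playwright's own `expect(locator)` assertions already retry like this, so in a real suite you would normally lean on those.

```typescript
// Hypothetical helper: poll a condition until it holds or a deadline passes,
// instead of sleeping for a fixed amount of time and hoping it was enough.
async function waitUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) return;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Short pause between checks so the loop does not spin the CPU.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Simulated slow runner: the "page" becomes ready after 120 ms.
let ready = false;
setTimeout(() => {
  ready = true;
}, 120);

// A fixed 50 ms sleep would be too short here. Polling waits as long as
// needed, up to the 2 s timeout, and returns as soon as the flag flips.
const waited = waitUntil(() => ready, 2000, 10);
```

The timeout still has to be generous enough for the slowest runner, but a fast machine no longer pays for it.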

Environment differences (~30%): Different browser versions. Viewport sizes (local 1440px vs CI headless 800px). Timezone (local vs UTC), which is insidious for anything date-related. Missing system fonts causing layout shifts. File system permissions.
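To make the timezone point concrete, here is a small sketch showing the same instant landing on different calendar dates depending on the zone. `calendarDate` is a hypothetical helper and the instant and zones are picked for illustration. It only uses the standard `Intl.DateTimeFormat` API.

```typescript
// Hypothetical helper: the calendar date (YYYY-MM-DD) of an instant as seen
// in a given IANA time zone, independent of the machine's own zone.
function calendarDate(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  const pick = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${pick("year")}-${pick("month")}-${pick("day")}`;
}

// One instant, two machines.
const instant = new Date("2024-01-15T02:00:00Z");
const onCi = calendarDate(instant, "UTC"); // "2024-01-15"
const onLaptop = calendarDate(instant, "America/New_York"); // "2024-01-14"
```

Any assertion on "today" that passes on the laptop can fail on a UTC runner for a few hours each day. Pinning the zone explicitly in both places (for browser tests, Playwright has a `timezoneId` option) removes the difference.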

Resource contention (~15%): Shared test databases between parallel workers. Port conflicts. CI runner memory pressure causing timeouts. Docker-in-Docker overhead.
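One way to keep parallel workers off each other's databases and ports is to derive both from the worker's index. A minimal sketch, assuming the test runner hands you a small non-negative index per worker (in Playwright, `testInfo.parallelIndex` is one candidate). `resourcesForWorker`, the `app_test` prefix, and base port 4000 are made up for illustration.

```typescript
interface WorkerResources {
  databaseName: string;
  port: number;
}

// Hypothetical helper: give each parallel worker its own database name and
// port so two workers never write to the same tables or bind the same socket.
function resourcesForWorker(
  workerIndex: number,
  basePort = 4000,
  prefix = "app_test",
): WorkerResources {
  if (!Number.isInteger(workerIndex) || workerIndex < 0) {
    throw new Error(`invalid worker index: ${workerIndex}`);
  }
  return {
    databaseName: `${prefix}_w${workerIndex}`,
    port: basePort + workerIndex,
  };
}

// Three workers, three disjoint sets of resources.
const workers = [0, 1, 2].map((i) => resourcesForWorker(i));
```

Creating and seeding each database is still up to your setup code. The sketch only covers the naming, which is what stops the collisions.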

Test isolation (~15%): Cookies/localStorage not cleared between tests. Execution order dependency. A test that relies on state left behind by an earlier test, and only passes locally because you happened to run that one first.
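A toy model of the order-dependency problem, as a sketch only. The `Store` map stands in for cookies/localStorage, and `login`, `dashboard`, and `runSuite` are invented for the example. With shared state the dashboard test passes or fails depending on order. With a fresh store per test it fails every time, which is the honest result.

```typescript
type Store = Map<string, string>;
type NamedTest = [name: string, run: (store: Store) => void];

// "login" leaves a token behind; "dashboard" silently depends on it.
const login: NamedTest = [
  "login",
  (store) => {
    store.set("token", "abc");
  },
];
const dashboard: NamedTest = [
  "dashboard",
  (store) => {
    if (!store.has("token")) throw new Error("not logged in");
  },
];

// Runs tests in the given order, either against one shared store or against
// a fresh store per test. Returns pass/fail per test name.
function runSuite(tests: NamedTest[], isolate: boolean): Record<string, boolean> {
  const shared: Store = new Map();
  const results: Record<string, boolean> = {};
  for (const [name, run] of tests) {
    const store: Store = isolate ? new Map<string, string>() : shared;
    try {
      run(store);
      results[name] = true;
    } catch {
      results[name] = false;
    }
  }
  return results;
}

const sharedInOrder = runSuite([login, dashboard], false); // dashboard passes via leaked token
const sharedReversed = runSuite([dashboard, login], false); // dashboard fails
const isolated = runSuite([login, dashboard], true); // dashboard fails regardless of order
```

Once the failure is consistent, the fix is obvious: the dashboard test has to do its own login in setup.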

One HN thread claimed roughly 85% of these failures trace to race conditions + environment, not test logic. That's higher than the ~70% the rough numbers above add up to, but the direction sounds about right.

Things that actually help: Playwright's --retries with screenshot-on-failure to identify flakes fast (a test that fails and then passes on retry gets reported as flaky). Dedicated CI browser containers that match your local setup. Isolated test databases per worker. Trace files on failure, not just screenshots.
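For the retries, screenshots, and traces part, a sketch of what the Playwright settings might look like. It is written as a plain object so it runs without Playwright installed. `buildConfig` is a hypothetical wrapper and the retry count of 2 is an assumption. `retries`, `use.screenshot`, and `use.trace` are Playwright config options.

```typescript
// Sketch of the relevant part of a playwright.config.ts.
// In a real project you would export this as the default export,
// e.g. buildConfig(Boolean(process.env.CI)).
function buildConfig(isCI: boolean) {
  return {
    // Retry only on CI so local failures stay loud; 2 is an arbitrary choice.
    retries: isCI ? 2 : 0,
    use: {
      // Keep a screenshot only for failing tests.
      screenshot: "only-on-failure" as const,
      // Keep the full trace (DOM snapshots, network, console) for failing tests.
      trace: "retain-on-failure" as const,
    },
  };
}

const ciConfig = buildConfig(true);
const localConfig = buildConfig(false);
```

The same retry count can also be passed on the command line with `--retries`. The trace is the piece that usually explains a CI-only failure, since it shows what the page looked like at each step.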

What's your weirdest "pass locally fail in CI" root cause? Everyone has at least one absurd story :)
