How to fix flaky Cypress tests in CI

    Cypress hides timing issues behind automatic retries. When those retries are not enough, tests fail randomly in CI and nobody can reproduce them locally. Here are the specific causes, code fixes, and alternatives.

    Why this is hard to test

    • Cypress runs inside a single browser tab, which means multi-tab flows, popup windows, and cross-origin redirects either require workarounds or simply cannot be tested reliably
    • cy.wait() with arbitrary millisecond values is the most common Cypress anti-pattern, masking real timing issues behind static delays that work locally but fail on slower CI runners
    • SPA frameworks like React and Vue re-render components after state changes, causing Cypress to grab a DOM element reference that becomes detached before the next command executes
    • Long test suites accumulate memory pressure because Cypress keeps DOM snapshots in the browser for time-travel debugging, leading to browser crashes and OOM kills in CI containers
    • Cypress Cloud's built-in flaky test detection and analytics require a paid tier ($67+/month), leaving teams on the free plan with no visibility into flake patterns

    Approach 1: Debug and fix Cypress-specific flakes

    1. Enable screenshots on failure (screenshotOnRunFailure: true) and video recording (video: true) in cypress.config.ts to capture exactly what the browser showed when a test failed
    2. Use Cypress Cloud (formerly Cypress Dashboard) to track which tests flake most frequently and identify patterns (timing, environment, specific specs)
    3. Replace all cy.wait(ms) calls with cy.intercept() and cy.wait('@alias') to wait for specific network requests instead of guessing at delays
    4. Add .should('be.visible') or .should('exist') guards before interacting with elements that may be re-rendered by your SPA framework
    5. Set numTestsKeptInMemory: 0 in cypress.config.ts to reduce memory pressure in large suites running in CI
    6. Split large spec files into smaller ones and use --spec glob patterns to run subsets in parallel across CI containers
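
    The configuration-level steps above (1 and 5) can be sketched in cypress.config.ts; a minimal sketch, with values as starting points rather than universal recommendations:

    ```typescript
    // cypress.config.ts - sketch covering steps 1 and 5 above
    import { defineConfig } from 'cypress';

    export default defineConfig({
      e2e: {
        screenshotOnRunFailure: true, // step 1: capture the failing state
        video: true,                  // step 1: record the run for CI triage
        numTestsKeptInMemory: 0,      // step 5: drop snapshots of finished tests
      },
    });
    ```

    If memory is the bottleneck in your CI containers, video: false is the safer default there, as discussed in the memory section of this article.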

    Approach 2: Replace selector-dependent tests with Zerocheck

    1. Write tests in plain English: "User logs in, navigates to settings, and updates their profile" instead of chaining cy.get() selectors that break on every refactor
    2. Zerocheck interacts with your UI visually, so detached DOM elements, re-rendered components, and changed selectors do not cause flakes
    3. Cross-origin flows (OAuth, Stripe, popup windows) work without cy.origin() workarounds or chromeWebSecurity hacks
    4. No memory pressure from DOM snapshot accumulation because Zerocheck does not run inside the browser
    5. Tests run on every PR in your existing CI pipeline via standard GitHub Actions integration

    The cy.wait() trap

    The single most common cause of flaky Cypress tests is cy.wait() with a hardcoded millisecond value. It shows up in almost every Cypress codebase, and it is almost always wrong. The pattern looks like this: a test clicks a button that triggers an API call, then waits an arbitrary number of milliseconds before asserting on the result. cy.wait(3000) works on your MacBook because the API responds in 200ms and the UI updates within a second. In CI, the same API call takes 1.5 seconds because the runner is sharing resources with 20 other jobs. The UI update takes another second. Your 3-second wait now has a 500ms margin instead of 2.8 seconds, and on a bad day, it is not enough.

    The deeper problem is that cy.wait(ms) does not actually wait for anything meaningful. It is a timer, not a condition. It does not know whether the API call finished, whether the DOM updated, or whether the element you need is ready. It just pauses execution for a fixed duration and hopes for the best.

    Cypress has built-in retry-ability that most developers underuse. Commands like .should(), .contains(), and .find() automatically retry for up to 4 seconds (configurable via defaultCommandTimeout) until the assertion passes. This is genuinely useful and handles many timing issues without any cy.wait() at all. If you chain cy.get('.result').should('contain', 'Success'), Cypress retries the assertion until either the text appears or the timeout expires.

    But retry-ability has limits. It applies to queries and assertions, not to actions. If you write cy.get('.button').click().get('.result').should('contain', 'Success'), the click() is not retried; only the trailing .get() and .should() are. And retry-ability does not help when the issue is a network request that has not completed yet, because Cypress retries DOM queries, not network conditions.

    The correct fix for network-dependent timing is cy.intercept() combined with cy.wait('@alias'). You register an intercept for the specific API endpoint your action triggers, give it an alias, perform the action, then wait for the aliased request to complete. This ties your test to a real condition (the API responded) rather than an arbitrary timer.

    // BAD: arbitrary wait that works locally, flakes in CI
    cy.get('[data-testid="submit"]').click();
    cy.wait(3000); // hoping the API responds in time
    cy.get('[data-testid="success-message"]')
      .should('contain', 'Order placed');
    
    // GOOD: wait for the actual network request to complete
    cy.intercept('POST', '/api/orders').as('createOrder');
    cy.get('[data-testid="submit"]').click();
    cy.wait('@createOrder'); // waits until the POST /api/orders responds
    cy.get('[data-testid="success-message"]')
      .should('contain', 'Order placed');
    
    // ALSO GOOD: assert on the response to catch API-level bugs
    cy.intercept('POST', '/api/orders').as('createOrder');
    cy.get('[data-testid="submit"]').click();
    cy.wait('@createOrder').then((interception) => {
      expect(interception.response.statusCode).to.equal(201);
      expect(interception.response.body).to.have.property('orderId');
    });
    cy.get('[data-testid="success-message"]')
      .should('contain', 'Order placed');

    Detached DOM elements

    The "detached from the DOM" error is one of the most confusing Cypress failures, and it happens specifically because of how modern SPA frameworks interact with Cypress's command queue.

    Here is what happens. Cypress runs cy.get('.user-row') and finds the element in the DOM. It stores a reference to that specific DOM node. Before the next command in the chain executes, React (or Vue, or Svelte) re-renders the component. The old DOM node is removed and a new one is inserted. Cypress tries to act on the stored reference and throws: "cy.click() failed because this element is detached from the DOM."

    This is not a bug in Cypress or in your framework. It is a fundamental timing conflict. Cypress grabs elements eagerly, while SPA frameworks replace elements asynchronously. The gap between "element found" and "element acted upon" is where detachment happens. The most common triggers are:

    • State updates that cause a parent component to re-render (the child element is destroyed and recreated)
    • Route transitions where the old page's DOM is still present when Cypress queries but gets torn down before interaction
    • List components where adding or removing an item re-renders the entire list

    The fix is to ensure the element is stable before interacting with it. Adding .should('be.visible') before .click() forces Cypress to retry the entire chain until the element exists, is attached, and is visible. If the element gets detached during retries, Cypress re-queries from the start of the chain and finds the new instance.

    For more complex cases where you need to read data from an element that may re-render, use .should('exist').then() to break the command chain. Inside the .then() callback, the element is guaranteed to be attached at the moment the callback executes. If you need to interact with the element, do it inside the callback.

    // ERROR: element detached between cy.get() and .click()
    // CypressError: cy.click() failed because this element
    // is detached from the DOM
    cy.get('.user-row').first().click();
    
    // FIX 1: add .should('be.visible') to force retry until stable
    cy.get('.user-row')
      .first()
      .should('be.visible')
      .click();
    
    // FIX 2: for elements that re-render after data loads,
    // wait for the data to be present first. Register the intercept
    // BEFORE the action that triggers the request, then perform it.
    cy.intercept('GET', '/api/users').as('getUsers');
    cy.visit('/users'); // illustrative route that fires GET /api/users
    cy.wait('@getUsers');
    cy.get('.user-row')
      .should('have.length.greaterThan', 0)
      .first()
      .should('be.visible')
      .click();
    
    // FIX 3: use .then() for complex interactions where
    // you need the element reference to be guaranteed attached
    cy.get('[data-testid="editable-cell"]')
      .should('exist')
      .then(($el) => {
        // $el is guaranteed to be attached right now
        cy.wrap($el).click();
        cy.wrap($el).type('new value{enter}');
      });

    Single-tab and cross-origin limitations

    Cypress runs inside the browser alongside your application. This is what makes its developer experience great: time-travel debugging, automatic waiting, and real DOM access. But it also creates hard architectural limits that cause flakiness in specific categories of tests.

    The single-tab limitation means Cypress cannot natively test flows that open new tabs or windows. When your app calls window.open() for an OAuth callback, a help page, or a print preview, Cypress cannot follow. The test hangs waiting for an interaction in a tab it cannot see. Teams work around this by stubbing window.open() to redirect to the same tab, but this changes the behavior being tested. Your production users get a popup; your test gets an in-tab redirect. Bugs that only appear in the popup flow go undetected.

    Cross-origin restrictions are the bigger source of flakiness. Before Cypress 12, visiting a different origin mid-test was essentially impossible without setting chromeWebSecurity: false, which disables the browser's same-origin policy entirely. This works, but it means your tests run in a security context that no real user ever experiences. Cypress 12 introduced cy.origin(), which lets you execute commands in a different origin's context. This handles OAuth flows where you need to interact with a Google or GitHub login page, then return to your app. However, cy.origin() has real limitations that cause intermittent failures. It creates a new Cypress command context for the foreign origin, which means local variables, aliases, and hooks from the parent context are not available. Timing between the origin switch and command execution can vary, causing tests to fail when the foreign page loads slowly in CI.

    Stripe Elements iframes are another common source. Stripe renders payment fields in a cross-origin iframe on js.stripe.com. Cypress cannot interact with cross-origin iframes using standard commands. The cypress-iframe plugin helps but relies on disabling web security, and iframe loading time in CI is unpredictable.

    Popup-based authentication flows (like "Sign in with Google" buttons that open a popup window) are essentially untestable in Cypress. The popup is a separate browser context that Cypress cannot control. Most teams work around this by using programmatic login via API calls and injecting session tokens, skipping the popup flow entirely in tests. This is pragmatic but means the actual OAuth popup flow is never tested in CI.

    // Cypress 12+ cy.origin() for OAuth flows
    cy.visit('/login');
    cy.get('[data-testid="google-login"]').click();
    
    // Switch to Google's origin to interact with their login page
    cy.origin('https://accounts.google.com', () => {
      cy.get('input[type="email"]')
        .should('be.visible')
        .type('[email protected]');
      cy.get('#identifierNext').click();
      cy.get('input[type="password"]')
        .should('be.visible')
        .type('testpassword123');
      cy.get('#passwordNext').click();
    });
    
    // Back on your origin after redirect
    cy.url().should('include', '/dashboard');
    
    // LIMITATION: cy.origin() does not share aliases or state
    // This will NOT work:
    cy.intercept('/api/user').as('getUser');
    cy.origin('https://accounts.google.com', () => {
      // cy.wait('@getUser') - ERROR: alias not available here
    });
    
    // WORKAROUND for Stripe iframes:
    // Requires chromeWebSecurity: false in cypress.config.ts
    cy.get('iframe[src*="js.stripe.com"]')
      .its('0.contentDocument.body')
      .should('not.be.empty')
      .then(cy.wrap)
      .find('input[name="cardnumber"]')
      .type('4242424242424242');

    Memory pressure in large suites

    Cypress keeps DOM snapshots in browser memory for its time-travel debugging feature. Every command in every test stores a snapshot of the DOM at that moment so you can step backwards through the test in the Cypress runner and see what the page looked like at each step. This is one of Cypress's best features during local development. In CI, it is a memory leak.

    A single test with 50 commands stores 50 DOM snapshots. If your DOM is moderately complex (a dashboard with tables, charts, and navigation), each snapshot can be 2-5MB. Fifty commands means 100-250MB per test. Run 100 tests in the same browser session, and you are looking at 10-25GB of accumulated DOM snapshots. CI containers typically have 4-8GB of RAM. The math does not work.

    The symptoms are inconsistent. Sometimes the browser process gets OOM-killed and Cypress reports "The browser unexpectedly closed." Sometimes Chrome becomes extremely slow as it swaps to disk, and tests fail with timeout errors that look like application issues rather than resource issues. Sometimes the 95th test in a suite fails while the same test passes when run in isolation. This last pattern is the clearest signal of memory pressure.

    The fix has multiple parts. First, set numTestsKeptInMemory: 0 in your cypress.config.ts for CI runs. This tells Cypress to discard DOM snapshots from completed tests, freeing memory as the suite progresses. You lose time-travel debugging for past tests in CI, but you were never going to use it there anyway. Second, disable video recording in CI unless you actively review the recordings. Each test's video consumes memory during recording and disk space on the runner. Set video: false in your CI configuration. Third, split large spec files. If you have a spec with 30 tests, Cypress loads all 30 into the same browser session. Split it into 3 files of 10 tests each. Each file gets a fresh browser session with clean memory. Combined with Cypress's built-in spec parallelization or a CI matrix strategy, this can dramatically reduce both memory usage and total wall time.
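
    The back-of-envelope arithmetic above can be made explicit. The per-snapshot sizes are illustrative assumptions from this article, not measured Cypress numbers:

    ```javascript
    // Rough estimate of accumulated DOM-snapshot memory in one browser session.
    // snapshotMB is an assumed average snapshot size (2-5 MB for a complex DOM).
    function estimateSnapshotMemoryMB(tests, commandsPerTest, snapshotMB) {
      return tests * commandsPerTest * snapshotMB;
    }

    // 100 tests x 50 commands x 2-5 MB per snapshot = 10,000-25,000 MB
    const low = estimateSnapshotMemoryMB(100, 50, 2);  // 10000 MB (~10 GB)
    const high = estimateSnapshotMemoryMB(100, 50, 5); // 25000 MB (~25 GB)
    console.log(`Estimated snapshot memory: ${low}-${high} MB`);
    ```

    Against a typical 4-8GB CI container, even the low estimate explains the OOM kills.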

    // cypress.config.ts - memory-optimized CI configuration
    import { defineConfig } from 'cypress';
    
    export default defineConfig({
      e2e: {
        // Discard DOM snapshots from completed tests in CI
        // Saves 100-250MB per test in memory
        numTestsKeptInMemory: 0,
    
        // Disable video recording to reduce memory and disk usage
        video: false,
    
        // Increase timeout for slower CI runners
        defaultCommandTimeout: 10000,
        requestTimeout: 15000,
    
        // Use spec pattern to organize smaller files
        specPattern: 'cypress/e2e/**/*.cy.{js,ts}',
      },
    });
    
    # .github/workflows/cypress.yml - parallel spec splitting
    # Split specs across multiple CI containers to limit
    # memory per container and reduce total run time
    jobs:
      cypress:
        strategy:
          matrix:
            spec-group:
              - 'cypress/e2e/auth/**/*.cy.ts'
              - 'cypress/e2e/checkout/**/*.cy.ts'
              - 'cypress/e2e/dashboard/**/*.cy.ts'
        steps:
          - uses: cypress-io/github-action@v6
            with:
              spec: ${{ matrix.spec-group }}
              config: 'numTestsKeptInMemory=0,video=false'

    Cypress Cloud vs DIY flake detection

    Cypress Cloud (formerly Cypress Dashboard) offers built-in flaky test analytics starting at $67/month for the Team plan, scaling to $300+/month for Business features. It automatically tags tests that fail on the first attempt but pass on retry, tracks flake rate over time, and surfaces your worst offenders. For teams that can afford it, Cypress Cloud provides real visibility into flake patterns. It is a good product.

    But many teams cannot justify $67-$300+/month for test analytics alone, especially early-stage teams already paying for CI runners, staging infrastructure, and other dev tools. If that is your situation, you can build basic flake detection yourself. The approach is straightforward. Run your Cypress suite with --retries 1, which tells Cypress to retry each failed test once. Cypress logs which tests failed on the first attempt and which passed on retry. A test that fails then passes on retry is, by definition, flaky. A test that fails on both attempts is likely a real regression.

    To track this over time, parse Cypress's JSON reporter output after each CI run. The mochawesome reporter produces structured JSON that includes each test's attempts. A simple script can extract tests with multiple attempts, calculate the flake rate per test (failures / total runs), and output a summary. Store the results in a JSON file committed to your repo, a database, or a simple Google Sheet. Over 30 days, you will see which tests flake most and can prioritize fixes.

    The DIY approach lacks Cypress Cloud's polish. You do not get a dashboard, automatic tagging, or trend visualization. But for teams spending $0 on flake detection today, a 30-line script that runs after each CI build gives you 80% of the value at zero additional cost.

    One important note: retries themselves mask the problem. A suite with --retries 2 will have a higher pass rate, but the underlying flakiness is unchanged. Use retries for detection and reporting, not as a permanent fix. If a test flakes more than 10% of the time, fix the root cause or delete the test.

    // scripts/flake-report.js
    // Run after: npx cypress run --retries 1 --reporter mochawesome
    const fs = require('fs');
    const results = JSON.parse(
      fs.readFileSync('mochawesome-report/mochawesome.json', 'utf8')
    );
    
    const flakyTests = [];
    
    function walkSuites(suites) {
      for (const suite of suites) {
        for (const test of suite.tests || []) {
          // A test with >1 attempt that ultimately passed is flaky.
          // Depending on the reporter and version, attempts may be an
          // array of attempt objects or a plain count; normalize first.
          const attempts = Array.isArray(test.attempts)
            ? test.attempts.length
            : test.attempts || 0;
          if (test.pass && attempts > 1) {
            flakyTests.push({
              title: test.fullTitle,
              file: suite.file,
              attempts,
              duration: test.duration,
            });
          }
        }
        if (suite.suites) walkSuites(suite.suites);
      }
    }
    
    walkSuites(results.results);
    
    console.log(`\n=== Flaky Test Report ===`);
    console.log(`Total flaky tests: ${flakyTests.length}`);
    console.log(`Flake rate: ${(
      (flakyTests.length / results.stats.tests) * 100
    ).toFixed(1)}%\n`);
    
    for (const t of flakyTests) {
      console.log(`  FLAKY: ${t.title}`);
      console.log(`    File: ${t.file}`);
      console.log(`    Attempts: ${t.attempts}\n`);
    }
    
    // Append to historical tracking file
    const history = fs.existsSync('flake-history.json')
      ? JSON.parse(fs.readFileSync('flake-history.json', 'utf8'))
      : [];
    
    history.push({
      date: new Date().toISOString(),
      totalTests: results.stats.tests,
      flakyCount: flakyTests.length,
      flakyTests: flakyTests.map((t) => t.title),
    });
    
    fs.writeFileSync('flake-history.json', JSON.stringify(history, null, 2));
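
    On top of flake-history.json, per-test flake rates over the stored runs can be computed with a small helper. This is a hypothetical addition, assuming the entry shape written by the script above:

    ```javascript
    // Hypothetical helper: fraction of recorded runs in which each test
    // flaked, from the entries appended to flake-history.json above.
    function perTestFlakeRate(history) {
      const counts = new Map();
      for (const run of history) {
        for (const title of run.flakyTests || []) {
          counts.set(title, (counts.get(title) || 0) + 1);
        }
      }
      return [...counts.entries()]
        .map(([title, flakes]) => ({ title, rate: flakes / history.length }))
        .sort((a, b) => b.rate - a.rate); // worst offenders first
    }
    ```

    Anything above the 10% threshold mentioned earlier is a candidate for a root-cause fix or deletion.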

    Common pitfalls

    • Do not add cy.wait(ms) as the first fix for a flaky test. It hides the timing issue and makes it someone else's problem next month. Use cy.intercept() and cy.wait('@alias') instead
    • Do not set chromeWebSecurity: false globally unless you understand the tradeoff. It disables same-origin policy for all tests, meaning your tests run in a security context no real user ever has
    • Do not use --retries as a permanent solution. Retries mask flakiness and increase CI costs by 15-30%. Use retries for detection and reporting, then fix the root causes
    • Do not ignore "detached from DOM" errors by adding blanket cy.wait() before every interaction. Fix the root cause by waiting for the specific condition that signals the DOM is stable
    • Do not blame Cypress for all flakiness. In most codebases, the majority of flakes come from test design (shared state, non-deterministic data, missing explicit waits), not from the framework itself
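
    The retries-for-detection pattern from the pitfalls above can be expressed in configuration; a minimal sketch, assuming the Cypress 10+ config format:

    ```typescript
    // cypress.config.ts - retries as a detection tool, not a fix
    import { defineConfig } from 'cypress';

    export default defineConfig({
      e2e: {
        retries: {
          runMode: 1,  // retry once in CI so flaky tests get logged as flaky
          openMode: 0, // no retries locally, so flakes surface in development
        },
      },
    });
    ```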

    FAQ

    Why are my Cypress tests flaky in CI but pass locally?

    CI runners are slower than your local machine, have less memory, and share resources with other jobs. Timing assumptions that hold locally (API responds in 200ms, DOM updates in 50ms) break under CI resource pressure. The fix is to replace time-based waits with condition-based waits: cy.intercept() for network requests, .should('be.visible') for DOM elements, and explicit assertions rather than arbitrary delays.

    Should I use cy.wait() to fix flaky tests?

    Never use cy.wait(milliseconds). It is the most common anti-pattern in Cypress codebases and the #1 source of CI flakiness. Use cy.intercept() to register network aliases, then cy.wait('@alias') to wait for specific requests to complete. For DOM-related timing, use .should('be.visible') or .should('exist') assertions, which leverage Cypress's built-in retry-ability.

    How do I handle detached DOM errors in Cypress?

    Detached DOM errors happen when React, Vue, or other SPA frameworks re-render a component between the time Cypress finds an element and the time it acts on it. Add .should('be.visible') before .click() to force Cypress to re-query until the element is stable. For elements that re-render after data loads, wait for the data first using cy.intercept() and cy.wait('@alias'), then query the element.

    Is Cypress Cloud worth it for flaky test detection?

    Cypress Cloud ($67-$300+/month) provides real flaky test analytics, automatic tagging, parallelization, and trend tracking. It is a good product if the budget fits. For teams that cannot justify the cost, build DIY detection: run with --retries 1, parse the mochawesome JSON output, and track flake rate per test over time. You get 80% of the visibility at zero additional cost.

    Should I migrate from Cypress to Playwright?

    Cypress has real strengths: excellent developer experience, time-travel debugging, and a large ecosystem. But if you are constantly fighting single-tab limitations, cross-origin restrictions, or memory pressure in large suites, those are architectural constraints that no configuration change will solve. Playwright handles multi-tab, cross-origin, and memory management natively. A gradual migration, running both frameworks on different spec subsets, is lower risk than a full rewrite. Alternatively, intent-based tools like Zerocheck sidestep both frameworks' selector maintenance entirely.

    Skip the setup. Zerocheck handles it in plain English.

    See it run on your app