How to test OAuth and SSO flows end-to-end in CI

OAuth redirects, SAML assertions, and SSO callbacks break standard automation. Here is how to test them properly, with code examples for Playwright.

Why this is hard to test

• Cross-origin redirects during OAuth need special handling in Cypress (each extra origin has to be wrapped in cy.origin()) and careful handling even in Playwright - the browser navigates away from your app to a third-party domain and back
• Social login providers (Google, GitHub, Microsoft) render consent screens in popup windows that exist outside the main browser context
• SAML SSO involves XML assertion parsing, certificate validation, and redirect chains between your app, the IdP, and sometimes a service provider proxy
• Identity provider sandboxes have limitations - Google test accounts require specific configuration, Okta dev tenants have rate limits, and Azure AD test environments behave differently from production
• Token expiry and refresh edge cases (expired access tokens, rotated refresh tokens, concurrent session invalidation) create timing-dependent bugs that only surface under specific conditions

Approach 1: Playwright + test IdP accounts

1. Create dedicated test accounts on each OAuth provider (Google Workspace test user, GitHub machine user, Okta dev tenant) - never use personal accounts for CI
2. Use Playwright's storageState to cache authenticated sessions - authenticate once in a setup project, then reuse the session across all tests that need auth
3. Handle OAuth redirects with page.waitForURL() to track the redirect chain: your app -> provider login -> consent screen -> your app callback
4. For popup-based flows (Google One Tap, GitHub OAuth popup), use page.waitForEvent('popup') to capture the popup window and interact with it
5. Store test credentials in CI secrets (GitHub Actions secrets, CircleCI contexts) - never commit OAuth tokens or passwords to your repo
6. Run auth-dependent tests sequentially, not in parallel, to avoid rate limiting from identity providers
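
The setup-project and sequential-run steps above could be wired together in a Playwright config along these lines. This is a minimal sketch, not a prescribed layout: the project names, the `.auth/google-user.json` path, and the `*.setup.ts` file naming are assumptions chosen for illustration.

```typescript
// playwright.config.ts - a sketch; names and paths are assumptions.
const AUTH_FILE = ".auth/google-user.json";

const config = {
  // Step 6: run auth-dependent tests one at a time to avoid IdP rate limits.
  workers: 1,
  fullyParallel: false,
  projects: [
    // Step 2: the setup project runs the real login once and saves the session.
    { name: "setup", testMatch: /.*\.setup\.ts$/ },
    {
      name: "authenticated",
      // Keep the login file out of this project so it only runs in "setup".
      testIgnore: /.*\.setup\.ts$/,
      // Wait for "setup" to finish, then start every test from the saved session.
      dependencies: ["setup"],
      use: { storageState: AUTH_FILE },
    },
  ],
};

export default config;
```

With this shape, the login test lives in a file such as `auth.setup.ts` (a hypothetical name), and every other test file starts already signed in.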

Approach 2: Describe the auth flow in plain English with Zerocheck

1. Write the test as a user would describe it: "Log in with Google, authorize the app, and verify the dashboard loads with the user's name"
2. Zerocheck handles cross-origin redirects and popup windows visually - no special configuration for iframe boundaries or popup contexts
3. Session state is managed automatically between test steps - no manual storageState caching or cookie manipulation
4. Tests run on every PR in CI - catches regressions when your auth flow changes or when the provider updates their consent screen UI

OAuth with social login providers (Google, GitHub)

The core challenge with OAuth testing is the cross-origin redirect. When a user clicks "Sign in with Google," the browser navigates from your domain (app.example.com) to accounts.google.com, then back to your domain's callback URL. During this redirect, Playwright's page context changes origin, which means any locators targeting your app's DOM are invalid until the browser returns.

For Google OAuth, you need a dedicated test account in a Google Workspace you control. Using a personal Gmail account in CI is fragile: Google may trigger CAPTCHA challenges, require phone verification, or flag the account for suspicious activity when it logs in from a new IP (your CI runner) every few minutes. Create a Workspace test user, disable 2FA for that user (or use an app password), and add the CI runner's IP range to your Workspace's trusted networks.

GitHub OAuth is simpler because GitHub machine users are designed for automation. Create a machine user, generate a personal access token with the scopes your app requires, and pre-authorize your OAuth app for that user so the consent screen does not appear during tests. Skipping the consent screen reduces test flakiness from UI changes on GitHub's side.

The Playwright approach for a redirect-based OAuth flow uses page.waitForURL() to track the redirect chain. After clicking the login button, wait for the URL to match the provider's domain, interact with the provider's login form, then wait for the redirect back to your app. The code below shows this pattern for Google OAuth.

Session caching with storageState is critical for test suite performance. Without it, every test that requires authentication runs the full OAuth flow, which adds 5-15 seconds per test and hammers the identity provider with login requests. Playwright's storageState serializes cookies and localStorage to a JSON file after the first authentication, then restores that state before subsequent tests. Run the OAuth flow once in a global setup project, save the state, and reuse it across your entire suite.

import { test, expect } from "@playwright/test";
import path from "path";

const AUTH_FILE = path.join(
  __dirname, ".auth", "google-user.json"
);

// Global setup: authenticate once, save session
test("authenticate with Google OAuth", async ({ page }) => {
  // Navigate to your app's login page
  await page.goto("https://staging.example.com/login");

  // Click the Google login button
  await page.getByRole("button", {
    name: "Sign in with Google"
  }).click();

  // Wait for redirect to Google's login page
  await page.waitForURL("**/accounts.google.com/**");

  // Fill Google credentials
  await page.getByLabel("Email or phone").fill(
    process.env.GOOGLE_TEST_EMAIL!
  );
  await page.getByRole("button", { name: "Next" }).click();

  await page.getByLabel("Enter your password").fill(
    process.env.GOOGLE_TEST_PASSWORD!
  );
  await page.getByRole("button", { name: "Next" }).click();

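  // Handle consent screen if it appears
  // (pre-authorized apps skip this)
  const consentButton = page.getByRole("button", {
    name: "Allow"
  });
  // isVisible() returns immediately without waiting,
  // so wait explicitly and treat a timeout as "not shown"
  const consentShown = await consentButton
    .waitFor({ state: "visible", timeout: 3000 })
    .then(() => true)
    .catch(() => false);
  if (consentShown) {
    await consentButton.click();
  }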

  // Wait for redirect back to your app
  await page.waitForURL(
    "**/staging.example.com/dashboard**"
  );

  // Verify authentication succeeded
  await expect(
    page.getByText("Welcome")
  ).toBeVisible();

  // Save session state for reuse in other tests
  await page.context().storageState({ path: AUTH_FILE });
});

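// Subsequent tests reuse the saved session.
// test.use() is scoped to this describe block: at file level
// it would also apply to the login test above, which runs
// before the state file exists. In a larger suite, keep
// these tests in a separate file from the login test.
test.describe("with saved session", () => {
  test.use({ storageState: AUTH_FILE });

  test("authenticated user can access settings",
    async ({ page }) => {
      // No login needed - session restored from file
      await page.goto(
        "https://staging.example.com/settings"
      );
      await expect(
        page.getByRole("heading", { name: "Settings" })
      ).toBeVisible();
    }
  );
});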
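
A saved session is only useful while its cookies are still valid, so it can help to check the state file before deciding whether to rerun the login. The sketch below does that against the parsed JSON that storageState writes, where each cookie carries an `expires` value in Unix seconds (`-1` for session cookies). The function name, the cookie name you pass in, and the five-minute safety margin are assumptions for illustration.

```typescript
// Shape of the storageState JSON, reduced to the fields this check needs.
interface SavedCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 marks a session cookie
}

interface SavedState {
  cookies: SavedCookie[];
}

// Returns true when the named cookie exists and will still be valid
// `marginSeconds` from now. Which cookie holds the session is app-specific.
function isSessionFresh(
  state: SavedState,
  cookieName: string,
  nowSeconds: number,
  marginSeconds = 300
): boolean {
  const cookie = state.cookies.find((c) => c.name === cookieName);
  if (!cookie) return false;
  // Session cookies have no stored expiry, so there is nothing to compare.
  if (cookie.expires === -1) return true;
  return cookie.expires > nowSeconds + marginSeconds;
}
```

In a setup test, you might parse the state file, call `isSessionFresh(state, "session", Date.now() / 1000)`, and skip the OAuth flow when it returns true. That keeps login traffic to the provider low between CI runs that share a cache.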

SAML SSO with enterprise IdPs

SAML (Security Assertion Markup Language) is the authentication protocol behind enterprise SSO with providers like Okta, Azure AD (now Entra ID), and OneLogin. Testing SAML flows is harder than OAuth for two reasons: the protocol involves XML assertions that are opaque to most developers, and enterprise IdPs have sandbox environments with limitations that differ from production.

The SAML flow works like this: your user clicks "Sign in with SSO" on your app. Your app (the Service Provider, or SP) generates a SAML AuthnRequest and redirects the browser to the Identity Provider (IdP). The IdP authenticates the user (via its own login form), generates a signed SAML Response containing assertions about the user (email, name, group memberships), and POST-redirects the browser back to your app's Assertion Consumer Service (ACS) URL. Your app validates the XML signature, extracts the user attributes, and creates a session.

For testing, you need a sandbox IdP. Okta offers free developer tenants at developer.okta.com with up to 100 monthly active users. Azure AD (Entra ID) provides free test tenants through the Microsoft 365 Developer Program. Auth0 has a free tier that supports SAML. Create a test user in your sandbox IdP, configure your app as a SAML application in the IdP, and point your staging environment's SSO configuration at the sandbox.

The Playwright approach follows the same redirect-tracking pattern as OAuth, but with an important difference: SAML uses POST-based redirects (the browser submits a hidden form rather than following a 302 redirect). Playwright handles this transparently since it tracks navigation regardless of the mechanism. Wait for the IdP's login page, fill credentials, submit, and wait for the redirect back to your app.

Many teams skip SSO testing entirely because of the setup complexity. This is a mistake. SSO is the authentication method your highest-value customers (enterprise accounts) use. If SSO breaks after a deploy, your enterprise customers cannot access your product, and they are the ones with SLAs and the loudest voices. The Playwright code below shows a complete SAML SSO test flow with Okta as the IdP.

import { test, expect } from "@playwright/test";

test("SAML SSO login via Okta", async ({ page }) => {
  // Navigate to your app's SSO login page
  await page.goto(
    "https://staging.example.com/sso/login"
  );

  // Enter the SSO domain/email to trigger IdP lookup
  // (placeholder address - use your sandbox test user)
  await page.getByLabel("Work email").fill(
    "test-user@example.com"
  );
  await page.getByRole("button", {
    name: "Continue with SSO"
  }).click();

  // Your app redirects to Okta (SAML AuthnRequest)
  // Wait for Okta's login page to load
  await page.waitForURL("**/*.okta.com/**");

  // Authenticate on the Okta side
  await page.getByLabel("Username").fill(
    process.env.OKTA_TEST_USERNAME!
  );
  await page.getByLabel("Password").fill(
    process.env.OKTA_TEST_PASSWORD!
  );
  await page.getByRole("button", {
    name: "Sign In"
  }).click();

  // Okta may show an MFA prompt in sandbox
  // For test accounts, configure Okta to skip MFA
  // or use the Okta test MFA bypass

  // Okta POSTs the SAML Response to your ACS URL
  // Playwright follows the POST redirect automatically
  await page.waitForURL(
    "**/staging.example.com/dashboard**",
    { timeout: 15000 }
  );

  // Verify the SAML attributes were parsed correctly
  // (same placeholder address as entered above)
  await expect(
    page.getByText("test-user@example.com")
  ).toBeVisible();

  // Verify SSO-specific state
  // (e.g., user belongs to the correct org)
  await page.goto(
    "https://staging.example.com/settings/account"
  );
  await expect(
    page.getByText("TestCorp")
  ).toBeVisible();
  await expect(
    page.getByText("SSO Managed")
  ).toBeVisible();
});
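
When a SAML test fails, it helps to look at what the IdP actually sent. In the POST-based flow described above, the response reaches the ACS URL as a base64-encoded `SAMLResponse` form field. The sketch below decodes such a field and pulls out the NameID for debugging. It assumes you have already captured and URL-decoded the field value (for example from a request log), and both function names are hypothetical. The regex extraction is a debugging shortcut only: it performs no signature validation and is no substitute for your app's SAML library.

```typescript
// Decode the base64 SAMLResponse form field into XML text.
function decodeSamlResponse(samlResponse: string): string {
  return Buffer.from(samlResponse, "base64").toString("utf8");
}

// Pull the NameID value out of decoded XML, tolerating a namespace prefix
// (e.g. <saml:NameID ...>) and attributes on the opening tag.
function extractNameId(xml: string): string | null {
  const match = xml.match(
    /<(?:\w+:)?NameID(?:\s[^>]*)?>([^<]+)<\/(?:\w+:)?NameID>/
  );
  return match ? match[1].trim() : null;
}
```

Comparing the extracted NameID with the email your app displays after login is a quick way to tell an attribute-mapping problem apart from an IdP configuration problem.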

Token refresh and session edge cases

The authentication flows most teams do not test are the ones that break in production at 2 AM: expired access tokens that should silently refresh, refresh token rotation that invalidates old tokens, and concurrent tab behavior when one tab logs out while another is mid-action.

Access token expiry is the most common untested scenario. Your app gets a short-lived access token (typically 15-60 minutes) and a longer-lived refresh token (days to weeks). When the access token expires, your app should use the refresh token to get a new access token without interrupting the user. If this refresh logic has a bug - a race condition, a missing error handler, a redirect loop - the user gets silently logged out or sees a cryptic error. This happens to real users every day, but most test suites never exercise it because tests run in under 60 minutes and the access token never expires.

To test token expiry in Playwright, you manipulate the browser's cookies or localStorage directly. If your app stores the access token in a cookie, use page.context().addCookies() to set an expired token. If it uses localStorage, use page.evaluate() to modify the stored token. Then trigger an API call (navigate to a page that fetches data) and assert that the app refreshes the token and loads successfully instead of showing an error or redirecting to login.

Refresh token rotation is a security feature where each use of a refresh token invalidates the old one and issues a new one. If your app sends the same refresh token twice (a race condition between two concurrent API calls), the second request gets rejected and the user loses their session. Test this by making two API calls in rapid succession after the access token expires.

Concurrent tab behavior is another gap. Open your app in two tabs, log out in one, then perform an action in the other. Does the second tab gracefully redirect to login, or does it show a broken state? Open your app in two tabs, let the access token expire, and perform actions in both simultaneously. Does each tab refresh the token independently, or does the refresh token rotation cause one tab to invalidate the other's session?

The "remember me" feature deserves its own test. When a user checks "remember me," what token lifetime does your app set? Does closing and reopening the browser maintain the session? Does it survive a browser cookie clear? These are the edge cases that enterprise security reviews will ask about, and having a test that validates them is better than guessing.
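
The refresh token rotation race described above can be reproduced without a browser. The sketch below pairs a fake token endpoint that accepts each refresh token exactly once with a "single-flight" wrapper that makes concurrent callers share one refresh request. Single-flight is one common client-side mitigation, not necessarily how your app handles it; the class names and token values are hypothetical.

```typescript
type TokenPair = { accessToken: string; refreshToken: string };

// Fake token endpoint with rotation: each refresh token works exactly once.
class RotatingTokenServer {
  private current = "rt-0";
  private counter = 0;
  calls = 0;

  async refresh(refreshToken: string): Promise<TokenPair> {
    this.calls += 1;
    if (refreshToken !== this.current) {
      throw new Error("refresh token already used");
    }
    this.counter += 1;
    this.current = `rt-${this.counter}`;
    return { accessToken: `at-${this.counter}`, refreshToken: this.current };
  }
}

// Single-flight wrapper: callers that arrive while a refresh is in flight
// receive the same promise instead of replaying the old refresh token.
class TokenRefresher {
  private inFlight: Promise<TokenPair> | null = null;
  private server: RotatingTokenServer;
  private refreshToken: string;

  constructor(server: RotatingTokenServer, initialRefreshToken = "rt-0") {
    this.server = server;
    this.refreshToken = initialRefreshToken;
  }

  refresh(): Promise<TokenPair> {
    if (!this.inFlight) {
      this.inFlight = this.server
        .refresh(this.refreshToken)
        .then((pair) => {
          this.refreshToken = pair.refreshToken; // keep the rotated token
          return pair;
        })
        .finally(() => {
          this.inFlight = null; // allow the next refresh cycle
        });
    }
    return this.inFlight;
  }
}
```

Calling `refresh()` twice in a row on one `TokenRefresher` reaches the server once. Calling `server.refresh("rt-0")` twice directly makes the second call reject, which is the session-losing behavior the browser test should guard against.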

import { test, expect } from "@playwright/test";

// Test: expired access token triggers silent refresh
test("handles expired access token", async ({ page }) => {
  // First, log in normally to establish a session
  await page.goto("https://staging.example.com/login");
  // Placeholder address for the seeded test user
  await page.getByLabel("Email")
    .fill("test-user@example.com");
  await page.getByLabel("Password").fill("testpass123");
  await page.getByRole("button", { name: "Log in" })
    .click();
  await page.waitForURL("**/dashboard");

  // Expire the access token by manipulating storage
  // Adjust based on your app's token storage method
  await page.evaluate(() => {
    // If using localStorage
    const authData = JSON.parse(
      localStorage.getItem("auth") || "{}"
    );
    // Set expiry to the past
    authData.accessTokenExpiry = Date.now() - 60000;
    // Alternatively, corrupt the access token
    authData.accessToken = "expired_token_value";
    localStorage.setItem("auth", JSON.stringify(authData));
  });

  // Navigate to a page that requires an API call
  // This should trigger the refresh flow
  await page.goto(
    "https://staging.example.com/dashboard/projects"
  );

  // Assert the page loads successfully
  // (token was refreshed in the background)
  await expect(
    page.getByRole("heading", { name: "Projects" })
  ).toBeVisible({ timeout: 10000 });

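  // Verify no error messages or login redirects
  // (toHaveURL compares a string as a full URL, not a glob,
  // so use a RegExp to match the login path)
  await expect(page).not.toHaveURL(/\/login/);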
});

// Test: concurrent tabs handle session correctly
test("concurrent tabs after logout",
  async ({ browser }) => {
    const context = await browser.newContext();

    // Open two tabs with authenticated sessions
    const tab1 = await context.newPage();
    const tab2 = await context.newPage();

    // Log in on tab1
    await tab1.goto(
      "https://staging.example.com/login"
    );
    // Placeholder address for the seeded test user
    await tab1.getByLabel("Email")
      .fill("test-user@example.com");
    await tab1.getByLabel("Password")
      .fill("testpass123");
    await tab1.getByRole("button", { name: "Log in" })
      .click();
    await tab1.waitForURL("**/dashboard");

    // Tab2 shares the session (same browser context)
    await tab2.goto(
      "https://staging.example.com/dashboard"
    );
    await expect(
      tab2.getByRole("heading", { name: "Dashboard" })
    ).toBeVisible();

    // Log out on tab1
    await tab1.getByRole("button", { name: "Log out" })
      .click();
    await tab1.waitForURL("**/login");

    // Try to perform an action on tab2
    await tab2.getByRole("link", { name: "Settings" })
      .click();

    // Tab2 should redirect to login, not show an error
    await expect(tab2).toHaveURL(/\/login/,
      { timeout: 10000 }
    );

    await context.close();
  }
);
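
The expired-token test above overwrites the stored token with a placeholder string. If your access token is a JWT and your client reads its `exp` claim before calling the API (both are assumptions about your app), a structurally valid token that is already expired exercises that code path more faithfully. The sketch below builds one with `alg: "none"` and an empty signature; a server that verifies signatures will reject it, which is the point. The helper names are hypothetical.

```typescript
// Encode a JSON object as a base64url JWT segment.
function base64url(value: object): string {
  return Buffer.from(JSON.stringify(value)).toString("base64url");
}

// Build an unsigned JWT whose exp claim lies `secondsAgo` in the past.
function makeExpiredJwt(
  subject: string,
  nowSeconds: number,
  secondsAgo = 60
): string {
  const header = { alg: "none", typ: "JWT" };
  const payload = {
    sub: subject,
    iat: nowSeconds - 3600,
    exp: nowSeconds - secondsAgo,
  };
  // Unsecured JWTs keep the trailing dot with an empty signature part.
  return `${base64url(header)}.${base64url(payload)}.`;
}

// Client-style expiry check: reads exp without verifying the signature.
function isJwtExpired(token: string, nowSeconds: number): boolean {
  const payloadPart = token.split(".")[1];
  const payload = JSON.parse(
    Buffer.from(payloadPart, "base64url").toString("utf8")
  );
  return typeof payload.exp !== "number" || payload.exp <= nowSeconds;
}
```

Inside the test, the generated string can be passed to `page.evaluate()` and written to the same storage key the existing example modifies.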

Mocking vs real IdP: trade-offs

There are two schools of thought on testing OAuth and SSO in CI, and the right answer depends on what you are optimizing for.

Option A: Mock the IdP response. Instead of hitting Google or Okta during tests, intercept the authentication redirect and return a fake token or SAML assertion directly. In Playwright, use page.route() to intercept the request to your OAuth callback URL and respond with a pre-crafted authentication response. Your app's auth handler processes the mocked response as if it came from the real IdP.

The advantages of mocking are significant for CI. Tests run in 1-2 seconds instead of 5-15 seconds per auth flow. No dependency on external services, so your tests never fail because Google is rate-limiting your CI IP or Okta's sandbox is down. Fully deterministic: the same mock always produces the same result. No test credentials to manage or rotate.

The disadvantage is the gap between your mock and reality. If Google changes their OAuth response format, your mock does not reflect that change. If your app has a bug in how it parses the real SAML assertion (wrong attribute mapping, certificate validation error), your mock will not catch it because the mock skips the parts that are hard. You are testing your app's auth handler in isolation, not the full integration.

Option B: Use real test accounts on sandbox IdPs. Create test users on Google Workspace, Okta dev tenant, or Azure AD test tenant, and run the full OAuth or SAML flow in your tests. The browser navigates to the real IdP, fills real credentials, and returns with a real token.

The advantages: you are testing the actual integration end-to-end. If the IdP changes their UI, your test catches it. If your SAML certificate expires, your test catches it. If your OAuth redirect URI is misconfigured after a deploy, your test catches it. These are real bugs that hit production, and mocks would not surface any of them.

The disadvantages: tests are slower (5-15 seconds per auth flow), flakier (IdP login pages can be slow, rate-limited, or temporarily unavailable), and require credential management in CI. You need to rotate test account passwords, handle MFA enrollment, and deal with IdP-side rate limiting if you run tests frequently.

The recommended approach for most teams: mock the IdP for CI (fast, reliable, runs on every PR) and run real IdP tests on a nightly or staging schedule (catches integration issues before they reach production). This gives you fast feedback on every pull request while still validating the full authentication integration at least once per day.

In practice, the mocked CI tests catch 90% of auth regressions because most auth bugs are in your app's code (wrong redirect URI, broken session handling, missing error handling), not in the IdP integration. The nightly real-IdP tests catch the remaining 10%: certificate expirations, IdP UI changes, SAML attribute mapping issues, and token format changes. Running both provides comprehensive coverage without slowing down your development workflow.
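
One way to build the pre-crafted response for Option A is to derive it from the authorization request your app already sends. In the OAuth authorization code flow, the provider redirects to the request's `redirect_uri` with a `code` and echoes back the `state` parameter. The sketch below computes that callback URL. The function name and the `mock-auth-code` value are hypothetical, and your backend's token exchange would still need to be pointed at a stub, since the code is not real.

```typescript
// Given the authorization request URL the app sends to the IdP, build the
// callback URL a mock IdP would send the browser back to.
function buildMockCallback(
  authorizeUrl: string,
  code = "mock-auth-code"
): string {
  const request = new URL(authorizeUrl);
  const redirectUri = request.searchParams.get("redirect_uri");
  if (!redirectUri) {
    throw new Error("authorize request has no redirect_uri");
  }

  const callback = new URL(redirectUri);
  callback.searchParams.set("code", code);

  // Echo state unchanged so the app's CSRF check still passes.
  const state = request.searchParams.get("state");
  if (state) callback.searchParams.set("state", state);

  return callback.toString();
}
```

A `page.route()` handler registered for the provider's authorization URL could send the browser to this address instead of letting the request reach the real IdP. Echoing `state` matters: apps that validate it will reject a mocked callback that omits or changes it.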

Common pitfalls

• Do not disable CORS or security headers in your test environment to make auth tests easier - you will miss real security issues that only surface with proper headers enabled
• Do not hardcode OAuth tokens or SAML assertions in your test files - tokens expire, and hardcoded values become stale without warning. Use CI secrets and test account credentials
• Do not skip token refresh testing - expired access tokens that fail to refresh silently are one of the most common auth bugs in production, and they only surface after the token lifetime elapses
• Do not skip SSO testing because it is complex to set up - SSO is how your highest-value enterprise customers authenticate, and a broken SSO flow after a deploy is a severity-1 incident

FAQ

How do I test OAuth login in Playwright?

Use page.waitForURL() to track the redirect chain from your app to the OAuth provider and back. Create dedicated test accounts on the provider (Google Workspace test user, GitHub machine user). Cache the authenticated session using Playwright's storageState so you only run the full OAuth flow once per test suite, not once per test.

Can I test SSO without a real IdP?

Yes, for CI you can mock the IdP response using Playwright's page.route() to intercept the callback URL and return a crafted SAML assertion or OAuth token. This gives you fast, deterministic tests. But run real IdP tests on a nightly schedule to catch integration issues like certificate expiry, UI changes, or attribute mapping problems that mocks would hide.

How do I handle OAuth popups in automated tests?

Use Playwright's page.waitForEvent('popup') to capture the popup window when it opens. This returns a new Page object that you interact with the same way as the main page. Fill credentials in the popup, submit the form, and the popup closes automatically after the OAuth callback. The main page then has the authenticated session.

Should I mock OAuth in CI?

Yes for PR-level CI, where speed and reliability matter most. Mock the IdP response so auth tests run in 1-2 seconds instead of 5-15 seconds and never fail due to external service issues. Run real IdP tests on a nightly or staging schedule to catch the integration issues that mocks miss. This two-tier approach gives you fast PR feedback and comprehensive integration coverage.

How do I test token refresh flows?

Manipulate the token storage directly in the browser. Use page.evaluate() to set the access token expiry to the past in localStorage, or use page.context().addCookies() to set an expired token cookie. Then navigate to a page that triggers an API call. Assert that the page loads successfully (token refreshed silently) and that the user is not redirected to login.
