debugging/flaky-tests.md
# Debugging and Managing Flaky Tests

## Table of Contents

1. [Understanding Flakiness Types](#understanding-flakiness-types)
2. [Detection and Reproduction](#detection-and-reproduction)
3. [Root Cause Analysis](#root-cause-analysis)
4. [Fixing Strategies by Type](#fixing-strategies-by-type)
5. [CI-Specific Flakiness](#ci-specific-flakiness)
6. [Quarantine and Management](#quarantine-and-management)
7. [Prevention Strategies](#prevention-strategies)

## Understanding Flakiness Types

### Categories of Flakiness

Most flaky tests fall into one of four categories, each requiring different remediation:

| Category                    | Symptoms                        | Common Causes                                          |
| --------------------------- | ------------------------------- | ------------------------------------------------------ |
| **UI-driven**               | Element not found, click missed | Missing waits, animations, dynamic rendering           |
| **Environment-driven**      | CI-only failures                | Slower CPU, memory limits, cold browser starts         |
| **Data/parallelism-driven** | Fails with multiple workers     | Shared backend data, reused accounts, state collisions |
| **Test-suite-driven**       | Fails when run with other tests | Leaked state, shared fixtures, order dependencies      |

### Flakiness Decision Tree

```
Test fails intermittently
├─ Fails locally too?
│  ├─ YES → Timing/async issue → Check waits and assertions
│  └─ NO → CI-specific → Check environment differences
│
├─ Fails only with multiple workers?
│  └─ YES → Parallelism issue → Check data isolation
│
├─ Fails only when run after specific tests?
│  └─ YES → State leak → Check fixtures and cleanup
│
└─ Fails randomly regardless of conditions?
   └─ External dependency → Check network/API stability
```

## Detection and Reproduction

### Confirming Flakiness

```bash
# Run test multiple times to confirm instability
npx playwright test tests/checkout.spec.ts --repeat-each=20

# Run with single worker to isolate parallelism issues
npx playwright test --workers=1

# Run in CI-like conditions locally
CI=true npx playwright test --repeat-each=10
```

### Reproduction Strategies

```typescript
// playwright.config.ts - Enable artifacts for flaky test investigation
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  use: {
    trace: "on-first-retry", // Capture trace on retry
    video: "retain-on-failure",
    screenshot: "only-on-failure",
  },
});
```

### Identify Flaky Tests Programmatically

```typescript
import { test } from "@playwright/test";

// Track test results across runs
test.afterEach(async ({}, testInfo) => {
  // A test that passes only on a retry attempt is flaky by definition
  if (testInfo.retry > 0 && testInfo.status === "passed") {
    console.warn(`FLAKY: ${testInfo.title} passed on retry ${testInfo.retry}`);
    // Log to your tracking system
  }
});
```

## Root Cause Analysis

### Event Logging for Race Conditions

Log browser console output, page errors, and failed requests to expose timing issues:

```typescript
test.beforeEach(async ({ page }) => {
  page.on("console", (msg) =>
    console.log(`CONSOLE [${msg.type()}]:`, msg.text()),
  );
  page.on("pageerror", (err) => console.error("PAGE ERROR:", err.message));
  page.on("requestfailed", (req) =>
    console.error(`REQUEST FAILED: ${req.url()}`),
  );
});
```

> **For comprehensive console error handling** (fail on errors, allowed patterns, fixtures), see [console-errors.md](console-errors.md).

### Network Timing Analysis

```typescript
// Capture slow or failed requests
test.beforeEach(async ({ page }) => {
  page.on("requestfinished", (request) => {
    const timing = request.timing();
    // Both values are milliseconds relative to the request's startTime
    const duration = timing.responseEnd - timing.requestStart;
    if (duration > 2000) {
      // Report immediately so the entry appears in the test output
      console.warn(`SLOW: ${request.url()} took ${duration}ms`);
    }
  });

  page.on("requestfailed", (request) => {
    console.error(`Failed: ${request.url()} - ${request.failure()?.errorText}`);
  });
});
```

### Trace Analysis

```bash
# View trace from failed CI run
npx playwright show-trace path/to/trace.zip

# Generate trace for specific test
npx playwright test tests/flaky.spec.ts --trace on
```

## Fixing Strategies by Type

### UI-Driven Flakiness

**Problem: Element not ready when action executes**

```typescript
// ❌ BAD: No wait for element state
await page.click("#submit");
await page.fill("#username", "test"); // Element may not be ready

// ✅ GOOD: Actions + assertions pattern (auto-waiting built-in)
await page.getByRole("button", { name: "Submit" }).click();
await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
```

**Problem: Animations or transitions interfere**

```typescript
// ❌ BAD: Click during animation
await page.click(".menu-item");

// ✅ GOOD: Wait for animation to complete
await page.getByRole("menuitem", { name: "Settings" }).click();
await expect(page.getByRole("dialog")).toBeVisible();

// Or request reduced motion before interacting. This only disables
// animations if the app honors the prefers-reduced-motion media query.
await page.emulateMedia({ reducedMotion: "reduce" });
```

**Problem: Brittle selectors**

```typescript
// ❌ BAD: Fragile CSS chain
await page.click("div.container > div:nth-child(2) > button.btn-primary");

// ✅ GOOD: Semantic selectors
await page.getByRole("button", { name: "Continue" }).click();
await page.getByTestId("checkout-button").click();
await page.getByLabel("Email address").fill("user@example.com");
```

### Async/Timing Flakiness

**Problem: Race between test and application**

```typescript
// ❌ BAD: Arbitrary sleep
await page.click("#load-data");
await page.waitForTimeout(3000); // Hope data loads in 3s

// ✅ GOOD: Wait for specific condition
await page.click("#load-data");
await expect(page.locator(".data-row")).toHaveCount(10, { timeout: 10000 });

// ✅ BETTER: Wait for network response, then assert
// Start waiting before the click so the response cannot be missed
const responsePromise = page.waitForResponse(
  (r) =>
    r.url().includes("/api/data") &&
    r.request().method() === "GET" &&
    r.ok(),
);
await page.click("#load-data");
await responsePromise;
await expect(page.locator(".data-row")).toHaveCount(10);
```

> **For comprehensive waiting strategies** (navigation, element state, network, polling with `toPass()`), see [assertions-waiting.md](../core/assertions-waiting.md#waiting-strategies).

**Problem: Complex async state**

```typescript
// Custom wait for application-specific conditions
await page.waitForFunction(() => {
  const app = (window as any).__APP_STATE__;
  return app?.isReady && !app?.isLoading;
});

// Wait for multiple conditions
await Promise.all([
  page.waitForResponse("**/api/user"),
  page.waitForResponse("**/api/settings"),
  page.getByRole("button", { name: "Load" }).click(),
]);
```

### Data/Parallelism-Driven Flakiness

**Problem: Tests share backend data**

```typescript
// ❌ BAD: All workers use same user
const testUser = { email: "user@example.com", password: "pass123" };

// ✅ GOOD: Unique data per worker
import { test as base } from "@playwright/test";

export const test = base.extend<
  {},
  { testUser: { email: string; id: string } }
>({
  testUser: [
    async ({}, use, workerInfo) => {
      const email = `test-${workerInfo.workerIndex}-${Date.now()}@example.com`;
      // createTestUser / deleteTestUser are project-specific helpers
      const user = await createTestUser(email);
      await use(user);
      await deleteTestUser(user.id);
    },
    { scope: "worker" },
  ],
});
```

**Problem: Shared storageState across workers**

```typescript
// ❌ BAD: All workers share same auth state
use: {
  storageState: '.auth/user.json',
}

// ✅ GOOD: Per-worker auth state
import fs from "fs";
import { test as base } from "@playwright/test";

export const test = base.extend<{}, { workerStorageState: string }>({
  // Point every test in this worker at the worker's own state file
  storageState: ({ workerStorageState }, use) => use(workerStorageState),

  workerStorageState: [
    async ({ browser }, use, workerInfo) => {
      const id = workerInfo.workerIndex;
      const fileName = `.auth/user-${id}.json`;

      if (!fs.existsSync(fileName)) {
        const page = await browser.newPage({ storageState: undefined });
        // authenticateUser is a project-specific helper
        await authenticateUser(page, `worker${id}@test.com`);
        await page.context().storageState({ path: fileName });
        await page.close();
      }

      await use(fileName);
    },
    { scope: "worker" },
  ],
});
```

### Test-Suite-Driven Flakiness (State Leaks)

**Problem: Tests affect each other**

```typescript
import { test, type Page } from "@playwright/test";

// ❌ BAD: Module-level state persists across tests
let sharedPage: Page;

test.beforeAll(async ({ browser }) => {
  sharedPage = await browser.newPage(); // Shared across tests!
});

// ✅ GOOD: Use Playwright's default isolation (fresh context per test)
test("first test", async ({ page }) => {
  // Fresh page for this test
});

test("second test", async ({ page }) => {
  // Fresh page for this test
});
```

**Problem: Fixture cleanup not happening**

```typescript
import fs from "fs";
import { test as base } from "@playwright/test";

// ✅ GOOD: Proper fixture with cleanup
export const test = base.extend<{ tempFile: string }>({
  tempFile: async ({}, use) => {
    const file = `/tmp/test-${Date.now()}.json`;
    fs.writeFileSync(file, "{}");

    await use(file);

    // Teardown after use() runs even when the test fails
    if (fs.existsSync(file)) {
      fs.unlinkSync(file);
    }
  },
});
```

## CI-Specific Flakiness

### Why Tests Fail Only in CI

| CI Condition       | Impact                                | Solution                                             |
| ------------------ | ------------------------------------- | ---------------------------------------------------- |
| Slower CPU         | Actions complete later than expected  | Use auto-waiting, not timeouts                       |
| Cold browser start | No cached assets, slower initial load | Add explicit waits for first navigation              |
| Headless mode      | Different rendering behavior          | Test locally in headless mode                        |
| Shared runners     | Resource contention                   | Reduce parallelism or use dedicated runners          |
| Network latency    | API calls slower                      | Mock external APIs, increase timeouts for real calls |

### Simulating CI Locally

```bash
# Run headless with CI environment variable
CI=true npx playwright test

# Limit CPU (Linux/Mac, requires cpulimit to be installed)
cpulimit -l 50 -- npx playwright test

# Run in Docker matching CI environment
# (use the image tag that matches your installed Playwright version)
docker run -it --rm \
  -v "$(pwd)":/work \
  -w /work \
  mcr.microsoft.com/playwright:v1.40.0-jammy \
  npx playwright test
```

### Consistent Viewport and Scale

```typescript
// playwright.config.ts - Match CI rendering exactly
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    viewport: { width: 1280, height: 720 },
    deviceScaleFactor: 1,
  },
});
```

### Network Stubbing for External APIs

```typescript
// Eliminate external API flakiness
test.beforeEach(async ({ page }) => {
  // Stub unstable third-party APIs
  await page.route("**/api.analytics.com/**", (route) =>
    route.fulfill({ body: "" }),
  );
  await page.route("**/api.payment-provider.com/**", (route) =>
    route.fulfill({ json: { status: "ok" } }),
  );
});

// Test-specific stub
test("checkout with payment", async ({ page }) => {
  await page.route("**/api/payment", (route) =>
    route.fulfill({ json: { success: true, transactionId: "test-123" } }),
  );
  // Test proceeds with deterministic response
});
```

## Quarantine and Management

### Quarantine Pattern

```typescript
// playwright.config.ts - Separate flaky tests
import { defineConfig } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "stable",
      testIgnore: ["**/*.flaky.spec.ts"],
    },
    {
      name: "quarantine",
      testMatch: ["**/*.flaky.spec.ts"],
      retries: 3,
    },
  ],
});
```

### Annotation-Based Quarantine

```typescript
// Mark flaky tests with annotations
test("intermittent checkout issue", async ({ page }, testInfo) => {
  testInfo.annotations.push({
    type: "flaky",
    description: "Investigating payment API timing - JIRA-1234",
  });

  // Test implementation
});

// Skip flaky test conditionally
test("known CI flaky", async ({ page }) => {
  test.skip(!!process.env.CI, "Flaky in CI - investigating JIRA-5678");
  // Test implementation
});
```

## Prevention Strategies

### Test Burn-In

```bash
# Run new tests many times before merging
npx playwright test tests/new-feature.spec.ts --repeat-each=50

# Run in parallel to expose race conditions
npx playwright test tests/new-feature.spec.ts --repeat-each=20 --workers=4
```

### Isolation Checklist

```typescript
// ✅ Each test should be self-contained
test.describe("User profile", () => {
  test("can update name", async ({ page, testUser }) => {
    // Uses unique testUser fixture
    // No dependency on other tests
    // Cleanup handled by fixture
  });

  test("can update email", async ({ page, testUser }) => {
    // Independent of "can update name"
    // Own testUser, own state
  });
});
```

### Defensive Assertions

```typescript
// ❌ BAD: Single point of failure
await expect(page.locator(".items")).toHaveCount(5);

// ✅ GOOD: Progressive assertions that help diagnose
await expect(page.locator(".items-container")).toBeVisible();
await expect(page.locator(".loading")).not.toBeVisible();
await expect(page.locator(".items")).toHaveCount(5);
```

### Retry Budget

```typescript
// playwright.config.ts - Limit retries to avoid masking issues
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // Only retry in CI
  expect: {
    timeout: 10000, // Reasonable assertion timeout
  },
  timeout: 60000, // Test timeout
});
```

## Anti-Patterns to Avoid

| Anti-Pattern                              | Problem                             | Solution                                       |
| ----------------------------------------- | ----------------------------------- | ---------------------------------------------- |
| `waitForTimeout()` as primary wait        | Arbitrary, hides real timing issues | Use auto-waiting assertions                    |
| Increasing global timeout to "fix" flakes | Masks root cause, slows all tests   | Find and fix actual timing issue               |
| Retrying until pass                       | Hides systemic problems             | Fix root cause, use retries for diagnosis only |
| Shared test data across workers           | Race conditions, collisions         | Isolate data per worker                        |
| Testing real external APIs                | Network variability                 | Mock external dependencies                     |
| Module-level mutable state                | Leaks between tests                 | Use fixtures with proper cleanup               |
| Ignoring flaky tests                      | Problem compounds over time         | Quarantine and track for fixing                |

## Related References

- **Debugging**: See [debugging.md](debugging.md) for trace viewer and inspector
- **Fixtures**: See [fixtures-hooks.md](../core/fixtures-hooks.md) for worker-scoped isolation
- **Performance**: See [performance.md](../infrastructure-ci-cd/performance.md) for parallel execution patterns
- **Assertions**: See [assertions-waiting.md](../core/assertions-waiting.md) for auto-waiting patterns
- **Global Setup**: See [global-setup.md](../core/global-setup.md) for setup vs fixtures decision
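
## Example: Summarizing Flaky Results

The `afterEach` hook under "Identify Flaky Tests Programmatically" leaves the tracking step open, and the anti-patterns table recommends quarantining and tracking flaky tests rather than ignoring them. The sketch below shows one possible way to turn collected results into a per-test flake rate. `TestRecord`, `FlakeSummary`, `summarizeFlakes`, and the 0.25 threshold are all hypothetical names and values chosen for illustration; none of them are part of Playwright. It assumes one record per test per run, taken from the final attempt.

```typescript
// Hypothetical record shape for illustration; not a Playwright type.
// One record per test per run, taken from the final attempt.
interface TestRecord {
  title: string;
  retry: number; // mirrors testInfo.retry on the final attempt
  status: "passed" | "failed" | "timedOut" | "skipped" | "interrupted";
}

interface FlakeSummary {
  title: string;
  runs: number;
  flakyRuns: number;
  flakeRate: number; // flakyRuns / runs, between 0 and 1
}

// Group records by title and count the runs that passed only after a retry.
function summarizeFlakes(records: TestRecord[]): FlakeSummary[] {
  // Null-prototype object so titles like "constructor" cannot collide
  // with inherited properties.
  const byTitle: Record<string, { runs: number; flakyRuns: number }> =
    Object.create(null);

  for (const record of records) {
    const entry = byTitle[record.title] ?? { runs: 0, flakyRuns: 0 };
    entry.runs += 1;
    if (record.status === "passed" && record.retry > 0) {
      entry.flakyRuns += 1;
    }
    byTitle[record.title] = entry;
  }

  return Object.keys(byTitle)
    .map((title) => ({
      title,
      runs: byTitle[title].runs,
      flakyRuns: byTitle[title].flakyRuns,
      flakeRate: byTitle[title].flakyRuns / byTitle[title].runs,
    }))
    .filter((summary) => summary.flakyRuns > 0)
    .sort((a, b) => b.flakeRate - a.flakeRate); // worst offenders first
}

// Usage: pick quarantine candidates above an assumed threshold of 25%.
const history: TestRecord[] = [
  { title: "checkout", retry: 0, status: "passed" },
  { title: "checkout", retry: 1, status: "passed" },
  { title: "login", retry: 0, status: "passed" },
  { title: "search", retry: 2, status: "failed" },
];
const candidates = summarizeFlakes(history).filter((s) => s.flakeRate >= 0.25);
console.log(candidates.map((s) => `${s.title}: ${s.flakeRate}`));
```

Tests that fail on every attempt are deliberately excluded here: they are consistently broken rather than flaky, and belong in a bug tracker instead of a quarantine project.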