Playwright Best Practices

Comprehensive Playwright testing guide covering E2E, component, API, visual, accessibility, and security tests.
Source: currents-dev (GitHub)
debugging/flaky-tests.md

# Debugging and Managing Flaky Tests

## Table of Contents

1. [Understanding Flakiness Types](#understanding-flakiness-types)
2. [Detection and Reproduction](#detection-and-reproduction)
3. [Root Cause Analysis](#root-cause-analysis)
4. [Fixing Strategies by Type](#fixing-strategies-by-type)
5. [CI-Specific Flakiness](#ci-specific-flakiness)
6. [Quarantine and Management](#quarantine-and-management)
7. [Prevention Strategies](#prevention-strategies)
8. [Anti-Patterns to Avoid](#anti-patterns-to-avoid)
9. [Related References](#related-references)

## Understanding Flakiness Types

### Categories of Flakiness

Most flaky tests fall into distinct categories requiring different remediation:

| Category                    | Symptoms                        | Common Causes                                          |
| --------------------------- | ------------------------------- | ------------------------------------------------------ |
| **UI-driven**               | Element not found, click missed | Missing waits, animations, dynamic rendering           |
| **Environment-driven**      | CI-only failures                | Slower CPU, memory limits, cold browser starts         |
| **Data/parallelism-driven** | Fails with multiple workers     | Shared backend data, reused accounts, state collisions |
| **Test-suite-driven**       | Fails when run with other tests | Leaked state, shared fixtures, order dependencies      |

### Flakiness Decision Tree

```
Test fails intermittently
├─ Fails locally too?
│  ├─ YES → Timing/async issue → Check waits and assertions
│  └─ NO → CI-specific → Check environment differences
│
├─ Fails only with multiple workers?
│  └─ YES → Parallelism issue → Check data isolation
│
├─ Fails only when run after specific tests?
│  └─ YES → State leak → Check fixtures and cleanup
│
└─ Fails randomly regardless of conditions?
   └─ External dependency → Check network/API stability
```
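The decision tree can also be written as a small triage helper, which makes the order of the questions explicit. This is only a sketch: the `FlakeObservations` shape, the `classifyFlake` name, the `failedRequestsSeen` flag, and the choice to test the narrowest conditions first are assumptions made for illustration and are not part of Playwright.

```typescript
// Hypothetical triage helper mirroring the decision tree above.
type FlakeCategory =
  | "timing"
  | "ci-environment"
  | "parallelism"
  | "state-leak"
  | "external-dependency";

// What you observed while reproducing the failure (assumed shape).
interface FlakeObservations {
  failsLocally: boolean;
  onlyWithMultipleWorkers: boolean;
  onlyAfterSpecificTests: boolean;
  failedRequestsSeen: boolean;
}

// Next step for each category, taken from the tree.
const NEXT_STEP: Record<FlakeCategory, string> = {
  "timing": "Check waits and assertions",
  "ci-environment": "Check environment differences",
  "parallelism": "Check data isolation",
  "state-leak": "Check fixtures and cleanup",
  "external-dependency": "Check network/API stability",
};

function classifyFlake(o: FlakeObservations): FlakeCategory {
  // Narrow conditions first: they also "fail locally", so order matters.
  if (o.onlyWithMultipleWorkers) return "parallelism";
  if (o.onlyAfterSpecificTests) return "state-leak";
  if (!o.failsLocally) return "ci-environment";
  if (o.failedRequestsSeen) return "external-dependency";
  return "timing";
}
```

For example, `NEXT_STEP[classifyFlake(observations)]` gives the first thing to check for a given set of observations.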

## Detection and Reproduction

### Confirming Flakiness

```bash
# Run test multiple times to confirm instability
npx playwright test tests/checkout.spec.ts --repeat-each=20

# Run with single worker to isolate parallelism issues
npx playwright test --workers=1

# Run in CI-like conditions locally
CI=true npx playwright test --repeat-each=10
```

### Reproduction Strategies

```typescript
// playwright.config.ts - Enable artifacts for flaky test investigation
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  use: {
    trace: "on-first-retry", // Capture trace on retry
    video: "retain-on-failure",
    screenshot: "only-on-failure",
  },
});
```

### Identify Flaky Tests Programmatically

```typescript
import { test } from "@playwright/test";

// Flag tests that only passed after being retried
test.afterEach(async ({}, testInfo) => {
  if (testInfo.retry > 0 && testInfo.status === "passed") {
    console.warn(`FLAKY: ${testInfo.title} passed on retry ${testInfo.retry}`);
    // Log to your tracking system
  }
});
```
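A single warning line is hard to act on; what usually matters is how often a test needs a retry across many runs. The sketch below assumes the tracking system stores one record per test per run, describing the final attempt. The `RunRecord` and `FlakeStats` shapes and the `flakeRates` function are hypothetical names for illustration, not Playwright types.

```typescript
// Hypothetical record: the final attempt of one test in one run.
interface RunRecord {
  title: string; // e.g. testInfo.title
  retry: number; // retry index of the final attempt (0 = first try)
  finalStatus: "passed" | "failed";
}

interface FlakeStats {
  runs: number; // how many runs included this test
  flaky: number; // runs where it passed only after a retry
  rate: number; // flaky / runs
}

// Aggregate per-test flake rates from stored run records.
function flakeRates(records: RunRecord[]): Map<string, FlakeStats> {
  const stats = new Map<string, FlakeStats>();
  for (const r of records) {
    const s = stats.get(r.title) ?? { runs: 0, flaky: 0, rate: 0 };
    s.runs += 1;
    // Same condition as the afterEach hook: passed, but not on the first try.
    if (r.finalStatus === "passed" && r.retry > 0) s.flaky += 1;
    s.rate = s.flaky / s.runs;
    stats.set(r.title, s);
  }
  return stats;
}
```

Sorting the resulting entries by `rate` gives a rough priority list for which tests to investigate first.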

## Root Cause Analysis

### Event Logging for Race Conditions

Add comprehensive event logging to expose timing issues:

```typescript
test.beforeEach(async ({ page }) => {
  page.on("console", (msg) =>
    console.log(`CONSOLE [${msg.type()}]:`, msg.text()),
  );
  page.on("pageerror", (err) => console.error("PAGE ERROR:", err.message));
  page.on("requestfailed", (req) =>
    console.error(`REQUEST FAILED: ${req.url()}`),
  );
});
```

> **For comprehensive console error handling** (fail on errors, allowed patterns, fixtures), see [console-errors.md](console-errors.md).

### Network Timing Analysis

```typescript
// Log slow or failed requests as they happen
test.beforeEach(async ({ page }) => {
  page.on("requestfinished", (request) => {
    const timing = request.timing();
    const duration = timing.responseEnd - timing.requestStart;
    if (duration > 2000) {
      console.warn(`SLOW: ${request.url()} took ${Math.round(duration)}ms`);
    }
  });

  page.on("requestfailed", (request) => {
    console.error(`Failed: ${request.url()} - ${request.failure()?.errorText}`);
  });
});
```
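Playwright documents the fields returned by `request.timing()` as milliseconds relative to `startTime`, with `-1` meaning "not available", so subtracting two fields blindly can produce misleading durations. The sketch below shows one way to guard against that and rank the slowest requests. `TimingLike`, `RequestSample`, `requestDurationMs`, and `findSlowRequests` are hypothetical names; `TimingLike` is a local stand-in for the two timing fields used above.

```typescript
// Local stand-in for the subset of request.timing() used here.
interface TimingLike {
  requestStart: number; // ms relative to startTime, -1 if not available
  responseEnd: number; // ms relative to startTime, -1 if not available
}

interface RequestSample {
  url: string;
  timing: TimingLike;
}

// Duration in ms, or null when either timestamp is missing.
function requestDurationMs(t: TimingLike): number | null {
  if (t.requestStart < 0 || t.responseEnd < 0) return null;
  return t.responseEnd - t.requestStart;
}

// Requests slower than the threshold, slowest first.
function findSlowRequests(samples: RequestSample[], thresholdMs = 2000) {
  return samples
    .map((s) => ({ url: s.url, durationMs: requestDurationMs(s.timing) }))
    .filter(
      (r): r is { url: string; durationMs: number } =>
        r.durationMs !== null && r.durationMs > thresholdMs,
    )
    .sort((a, b) => b.durationMs - a.durationMs);
}
```

Samples could be collected in a `requestfinished` listener and passed to `findSlowRequests` once the test finishes.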

### Trace Analysis

```bash
# View trace from failed CI run
npx playwright show-trace path/to/trace.zip

# Generate trace for specific test
npx playwright test tests/flaky.spec.ts --trace on
```

## Fixing Strategies by Type

### UI-Driven Flakiness

**Problem: Element not ready when action executes**

```typescript
// ❌ BAD: No wait for element state
await page.click("#submit");
await page.fill("#username", "test"); // Element may not be ready

// ✅ GOOD: Actions + assertions pattern (auto-waiting built-in)
await page.getByRole("button", { name: "Submit" }).click();
await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
```

**Problem: Animations or transitions interfere**

```typescript
// ❌ BAD: Click during animation
await page.click(".menu-item");

// ✅ GOOD: Wait for animation to complete
await page.getByRole("menuitem", { name: "Settings" }).click();
await expect(page.getByRole("dialog")).toBeVisible();
// Or disable animations entirely, before triggering them
// (only helps if the app honors prefers-reduced-motion)
await page.emulateMedia({ reducedMotion: "reduce" });
```

**Problem: Brittle selectors**

```typescript
// ❌ BAD: Fragile CSS chain
await page.click("div.container > div:nth-child(2) > button.btn-primary");

// ✅ GOOD: Semantic selectors
await page.getByRole("button", { name: "Continue" }).click();
await page.getByTestId("checkout-button").click();
await page.getByLabel("Email address").fill("user@example.com");
```

### Async/Timing Flakiness

**Problem: Race between test and application**

```typescript
// ❌ BAD: Arbitrary sleep
await page.click("#load-data");
await page.waitForTimeout(3000); // Hope data loads in 3s

// ✅ GOOD: Wait for specific condition
await page.click("#load-data");
await expect(page.locator(".data-row")).toHaveCount(10, { timeout: 10000 });

// ✅ BETTER: Wait for network response, then assert
// Start waiting before the click so the response cannot be missed
const responsePromise = page.waitForResponse(
  (r) =>
    r.url().includes("/api/data") &&
    r.request().method() === "GET" &&
    r.ok(),
);
await page.click("#load-data");
await responsePromise;
await expect(page.locator(".data-row")).toHaveCount(10);
```

> **For comprehensive waiting strategies** (navigation, element state, network, polling with `toPass()`), see [assertions-waiting.md](../core/assertions-waiting.md#waiting-strategies).

**Problem: Complex async state**

```typescript
// Custom wait for application-specific conditions
await page.waitForFunction(() => {
  const app = (window as any).__APP_STATE__;
  return app?.isReady && !app?.isLoading;
});

// Wait for multiple conditions
// (the click goes last so both waiters are registered first)
await Promise.all([
  page.waitForResponse("**/api/user"),
  page.waitForResponse("**/api/settings"),
  page.getByRole("button", { name: "Load" }).click(),
]);
```
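All of the waits in this section share one idea: re-check a condition until it holds or a deadline passes, rather than sleeping for a fixed time. Playwright's web-first assertions already retry in this way, so the helper below is not something you need in a test; it is a plain TypeScript sketch of the principle. The `pollUntil` name, its options, and its defaults are hypothetical.

```typescript
interface PollOptions {
  timeoutMs: number; // give up after this long
  intervalMs: number; // pause between checks
}

// Re-run `probe` until `isReady` accepts its value, or fail at the deadline.
async function pollUntil<T>(
  probe: () => T | Promise<T>,
  isReady: (value: T) => boolean,
  { timeoutMs, intervalMs }: PollOptions = { timeoutMs: 5000, intervalMs: 100 },
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (isReady(value)) return value; // returns as soon as the condition holds
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Unlike a fixed sleep, this finishes as soon as the condition is true and fails with a clear error when it never becomes true, which is why condition-based waits are both faster and easier to diagnose.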

### Data/Parallelism-Driven Flakiness

**Problem: Tests share backend data**

```typescript
// ❌ BAD: All workers use same user
const testUser = { email: "test@example.com", password: "pass123" };

// ✅ GOOD: Unique data per worker
import { test as base } from "@playwright/test";

export const test = base.extend<
  {},
  { testUser: { email: string; id: string } }
>({
  testUser: [
    async ({}, use, workerInfo) => {
      const email = `test-${workerInfo.workerIndex}-${Date.now()}@example.com`;
      // createTestUser/deleteTestUser are app-specific helpers (not shown)
      const user = await createTestUser(email);
      await use(user);
      await deleteTestUser(user.id);
    },
    { scope: "worker" },
  ],
});
```

**Problem: Shared storageState across workers**

```typescript
// ❌ BAD: All workers share same auth state
use: {
  storageState: '.auth/user.json',
}

// ✅ GOOD: Per-worker auth state
import fs from "fs";
import { test as base } from "@playwright/test";

export const test = base.extend<{}, { workerStorageState: string }>({
  // Point every test in this worker at the worker's own auth file
  storageState: ({ workerStorageState }, use) => use(workerStorageState),

  workerStorageState: [
    async ({ browser }, use, workerInfo) => {
      const id = workerInfo.workerIndex;
      const fileName = `.auth/user-${id}.json`;

      if (!fs.existsSync(fileName)) {
        const page = await browser.newPage({ storageState: undefined });
        // authenticateUser is an app-specific login helper (not shown)
        await authenticateUser(page, `worker${id}@test.com`);
        await page.context().storageState({ path: fileName });
        await page.close();
      }

      await use(fileName);
    },
    { scope: "worker" },
  ],
});
```

### Test-Suite-Driven Flakiness (State Leaks)

**Problem: Tests affect each other**

```typescript
// ❌ BAD: Module-level state persists across tests
let sharedPage: Page;

test.beforeAll(async ({ browser }) => {
  sharedPage = await browser.newPage(); // Shared across tests!
});

// ✅ GOOD: Use Playwright's default isolation (fresh context per test)
test("first test", async ({ page }) => {
  // Fresh page for this test
});

test("second test", async ({ page }) => {
  // Fresh page for this test
});
```

**Problem: Fixture cleanup not happening**

```typescript
// ✅ GOOD: Proper fixture with cleanup
import fs from "fs";
import { test as base } from "@playwright/test";

export const test = base.extend<{ tempFile: string }>({
  tempFile: async ({}, use) => {
    const file = `/tmp/test-${Date.now()}.json`;
    fs.writeFileSync(file, "{}");

    await use(file);

    // Teardown after use() runs even when the test fails
    if (fs.existsSync(file)) {
      fs.unlinkSync(file);
    }
  },
});
```

## CI-Specific Flakiness

### Why Tests Fail Only in CI

| CI Condition       | Impact                                | Solution                                             |
| ------------------ | ------------------------------------- | ---------------------------------------------------- |
| Slower CPU         | Actions complete later than expected  | Use auto-waiting, not fixed sleeps                   |
| Cold browser start | No cached assets, slower initial load | Add explicit waits for first navigation              |
| Headless mode      | Different rendering behavior          | Test locally in headless mode                        |
| Shared runners     | Resource contention                   | Reduce parallelism or use dedicated runners          |
| Network latency    | API calls slower                      | Mock external APIs, increase timeouts for real calls |

### Simulating CI Locally

```bash
# Run headless with CI environment variable
CI=true npx playwright test

# Limit CPU (Linux/Mac, requires cpulimit to be installed)
cpulimit -l 50 -- npx playwright test

# Run in Docker matching CI environment
# (use the image tag that matches your installed Playwright version)
docker run -it --rm \
  -v "$(pwd)":/work \
  -w /work \
  mcr.microsoft.com/playwright:v1.40.0-jammy \
  npx playwright test
```

### Consistent Viewport and Scale

```typescript
// playwright.config.ts - Match CI rendering exactly
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    viewport: { width: 1280, height: 720 },
    deviceScaleFactor: 1,
  },
});
```

### Network Stubbing for External APIs

```typescript
// Eliminate external API flakiness
test.beforeEach(async ({ page }) => {
  // Stub unstable third-party APIs
  await page.route("**/api.analytics.com/**", (route) =>
    route.fulfill({ body: "" }),
  );
  await page.route("**/api.payment-provider.com/**", (route) =>
    route.fulfill({ json: { status: "ok" } }),
  );
});

// Test-specific stub
test("checkout with payment", async ({ page }) => {
  await page.route("**/api/payment", (route) =>
    route.fulfill({ json: { success: true, transactionId: "test-123" } }),
  );
  // Test proceeds with deterministic response
});
```

## Quarantine and Management

### Quarantine Pattern

```typescript
// playwright.config.ts - Separate flaky tests
import { defineConfig } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "stable",
      testIgnore: ["**/*.flaky.spec.ts"],
    },
    {
      name: "quarantine",
      testMatch: ["**/*.flaky.spec.ts"],
      retries: 3,
    },
  ],
});
```

### Annotation-Based Quarantine

```typescript
// Mark flaky tests with annotations
test("intermittent checkout issue", async ({ page }, testInfo) => {
  testInfo.annotations.push({
    type: "flaky",
    description: "Investigating payment API timing - JIRA-1234",
  });

  // Test implementation
});

// Skip flaky test conditionally
test("known CI flaky", async ({ page }) => {
  test.skip(!!process.env.CI, "Flaky in CI - investigating JIRA-5678");
  // Test implementation
});
```

## Prevention Strategies

### Test Burn-In

```bash
# Run new tests many times before merging
npx playwright test tests/new-feature.spec.ts --repeat-each=50

# Run in parallel to expose race conditions
npx playwright test tests/new-feature.spec.ts --repeat-each=20 --workers=4
```

### Isolation Checklist

```typescript
// ✅ Each test should be self-contained
test.describe("User profile", () => {
  test("can update name", async ({ page, testUser }) => {
    // Uses unique testUser fixture
    // No dependency on other tests
    // Cleanup handled by fixture
  });

  test("can update email", async ({ page, testUser }) => {
    // Independent of "can update name"
    // Relies only on its fixtures, not on state left by other tests
  });
});
```

### Defensive Assertions

```typescript
// ❌ BAD: Single point of failure
await expect(page.locator(".items")).toHaveCount(5);

// ✅ GOOD: Progressive assertions that help diagnose
await expect(page.locator(".items-container")).toBeVisible();
await expect(page.locator(".loading")).not.toBeVisible();
await expect(page.locator(".items")).toHaveCount(5);
```

### Retry Budget

```typescript
// playwright.config.ts - Limit retries to avoid masking issues
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // Only retry in CI
  expect: {
    timeout: 10000, // Reasonable assertion timeout
  },
  timeout: 60000, // Test timeout
});
```

## Anti-Patterns to Avoid

| Anti-Pattern                              | Problem                             | Solution                                       |
| ----------------------------------------- | ----------------------------------- | ---------------------------------------------- |
| `waitForTimeout()` as primary wait        | Arbitrary, hides real timing issues | Use auto-waiting assertions                    |
| Increasing global timeout to "fix" flakes | Masks root cause, slows all tests   | Find and fix actual timing issue               |
| Retrying until pass                       | Hides systemic problems             | Fix root cause, use retries for diagnosis only |
| Shared test data across workers           | Race conditions, collisions         | Isolate data per worker                        |
| Testing real external APIs                | Network variability                 | Mock external dependencies                     |
| Module-level mutable state                | Leaks between tests                 | Use fixtures with proper cleanup               |
| Ignoring flaky tests                      | Problem compounds over time         | Quarantine and track for fixing                |

## Related References

- **Debugging**: See [debugging.md](debugging.md) for trace viewer and inspector
- **Fixtures**: See [fixtures-hooks.md](../core/fixtures-hooks.md) for worker-scoped isolation
- **Performance**: See [performance.md](../infrastructure-ci-cd/performance.md) for parallel execution patterns
- **Assertions**: See [assertions-waiting.md](../core/assertions-waiting.md) for auto-waiting patterns
- **Global Setup**: See [global-setup.md](../core/global-setup.md) for setup vs fixtures decision