Browser Automation with playwright-cli

Automate browser interactions for web testing, form filling, screenshots, and data extraction using the playwright-cli tool.
references/spec-driven-testing.md

# Spec-driven testing (plan → generate → heal)

End-to-end workflow for authoring and maintaining Playwright tests using `playwright-cli`. The three sections below can be used independently:

- **Planning** — explore the app, produce a spec file describing what to test.
- **Generate** — turn a spec into Playwright test files. Update the spec if it's vague or stale.
- **Heal** — diagnose failing tests, fix the code, reconcile the spec with reality.

All three lean on the same mechanic: run `npx playwright test --debug=cli` in the background, then `playwright-cli attach tw-XXXX` to drive the paused page interactively. See [playwright-tests.md](playwright-tests.md) for the debug/attach mechanics and [test-generation.md](test-generation.md) for how every `playwright-cli` action emits Playwright TypeScript.

---

## 1. Planning

Goal: produce a spec file (e.g. `specs/<feature>.plan.md`) that enumerates the scenarios to test. **Always** write the spec to a file.

### 1.1 Prerequisite: workspace

Check the workspace has Playwright installed before anything else:

```bash
# Either of these confirms a workspace:
test -f playwright.config.ts || test -f playwright.config.js
npx --no-install playwright --version
```

If there is no Playwright install, bootstrap one and let the user pick the defaults:

```bash
npm init playwright@latest
```

### 1.2 Prerequisite: seed test

A **seed test** is a minimal test that lands the page in the state every scenario starts from: navigation to the app, any required login, feature flags, etc. Scenarios assume a fresh start *after* the seed. `--debug=cli` pauses *inside* this test, so the seed is where every planning and generation session begins.

Minimum viable seed:

```ts
// tests/seed.spec.ts
import { test } from '@playwright/test';

test('seed', async ({ page }) => {
  await page.goto('https://example.com/');
});
```

Preferred — push navigation into a fixture so scenario tests reuse it:

```ts
// tests/fixtures.ts
import { test as baseTest } from '@playwright/test';
export { expect } from '@playwright/test';

export const test = baseTest.extend({
  page: async ({ page }, use) => {
    await page.goto('https://example.com/');
    await use(page);
  },
});
```

```ts
// tests/seed.spec.ts
import { test } from './fixtures';

test('seed', async ({ page }) => {
  // Fixture already navigates. This empty body tells agents where to start.
});
```

If no seed exists, create one that at least navigates to the app.

### 1.3 Explore the app

Launch the app via the seed in the background and attach:

```bash
PLAYWRIGHT_HTML_OPEN=never npx playwright test tests/seed.spec.ts --debug=cli
# wait for "Debugging Instructions" and the session name tw-XXXX
playwright-cli attach tw-XXXX
```
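When the background run is scripted rather than watched by hand, the session name has to be pulled out of the captured output before attaching. The following is a minimal sketch, not part of `playwright-cli`: `extractSessionName` is a hypothetical helper, and it assumes the name is a `tw-` token made of letters or digits, since the exact format behind `tw-XXXX` is not specified here. The sample output is made up for illustration.

```ts
// Hypothetical helper: pull the debug session name out of captured test output.
// Assumption: the name is "tw-" followed by letters or digits, as in tw-XXXX.
function extractSessionName(output: string): string | null {
  const match = /\btw-[A-Za-z0-9]+\b/.exec(output);
  return match ? (match[0] ?? null) : null;
}

// Made-up output for illustration; the real wording may differ.
const sampleOutput = [
  'Running 1 test using 1 worker',
  'Debugging Instructions',
  '  playwright-cli attach tw-ab12',
].join('\n');

console.log(extractSessionName(sampleOutput)); // prints tw-ab12
```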

Resume so the seed runs, then probe the app:

```bash
playwright-cli resume                   # resume so the seed test runs fully
playwright-cli snapshot                 # inventory of interactive elements
playwright-cli click e5                 # follow a flow
playwright-cli eval "location.href"     # read URL / state
playwright-cli show --annotate          # ask the user to point at something
```

Map out:

- Interactive surfaces (forms, buttons, lists, filters, modals).
- Primary user journeys end-to-end.
- Edge cases: empty states, validation errors, very long input, boundary values.
- Persistence: reload, local/session storage, URL fragments.
- Navigation: which controls change the URL, back/forward behaviour.

**Important**: Do not just open the app URL with `playwright-cli`; always go through the test so that any custom setup done there is captured.

**Important**: Stop the background test when done exploring.

### 1.4 Write the spec file

Save under `specs/<feature>.plan.md`. Use this structure:

```markdown
# <Feature> Test Plan

## Application Overview

<One paragraph describing what the feature does and why it matters.>

## Test Scenarios

### 1. <Group Name>

**Seed:** `tests/seed.spec.ts`

#### 1.1. <kebab-case-scenario-name>

**File:** `tests/<group>/<kebab-case-scenario-name>.spec.ts`

**Steps:**
  1. <Concrete user step>
    - expect: <observable outcome>
    - expect: <another observable outcome>
  2. <Next step>
    - expect: <outcome>

#### 1.2. <next-scenario>
...

### 2. <Next Group>

**Seed:** `tests/seed.spec.ts`
...
```

Guidelines:

- Each scenario is independent and starts from the seed's fresh state — never chain scenarios.
- Scenario names are kebab-case and match the test file name (`should-add-single-todo` → `should-add-single-todo.spec.ts`).
- Cover happy path, edge cases, validation, negative flows, persistence.
- Write steps at the user level ("Type 'Buy milk' into the input"), not the API level ("call `fill`").
- Put observable outcomes in `- expect:` bullets; each becomes an assertion during generation.
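Because the plan layout is regular, it can be read back mechanically. The sketch below is one possible reader, not a tool shipped with `playwright-cli`: `parsePlan`, `PlanScenario`, and `PlanStep` are hypothetical names, the sample group and file path are invented, and the regular expressions assume the headings, `**Seed:**` / `**File:**` lines, numbered steps, and `- expect:` bullets are written exactly as in the structure above.

```ts
// Hypothetical types and parser for the plan layout shown above.
interface PlanStep {
  text: string;
  expects: string[];
}

interface PlanScenario {
  id: string;     // e.g. "1.1"
  name: string;   // kebab-case scenario name
  group: string;  // group heading without its ordinal
  seed: string;   // from the group's **Seed:** line
  file: string;   // from the scenario's **File:** line
  steps: PlanStep[];
}

function parsePlan(markdown: string): PlanScenario[] {
  const scenarios: PlanScenario[] = [];
  let group = '';
  let seed = '';
  let scenario: PlanScenario | null = null;
  let step: PlanStep | null = null;

  for (const line of markdown.split('\n')) {
    let m: RegExpExecArray | null;
    if ((m = /^###\s+\d+\.\s+(.*)$/.exec(line))) {
      // "### 1. <Group Name>" starts a new group.
      group = (m[1] ?? '').trim();
      seed = '';
      scenario = null;
      step = null;
    } else if ((m = /^\*\*Seed:\*\*\s*`([^`]+)`/.exec(line))) {
      seed = m[1] ?? '';
    } else if ((m = /^####\s+(\d+\.\d+)\.?\s+(.*)$/.exec(line))) {
      // "#### 1.1. <scenario-name>" starts a new scenario in the current group.
      scenario = { id: m[1] ?? '', name: (m[2] ?? '').trim(), group, seed, file: '', steps: [] };
      scenarios.push(scenario);
      step = null;
    } else if (scenario && (m = /^\*\*File:\*\*\s*`([^`]+)`/.exec(line))) {
      scenario.file = m[1] ?? '';
    } else if (scenario && (m = /^\s*\d+\.\s+(.*)$/.exec(line))) {
      // "  1. <step>" adds a step to the current scenario.
      step = { text: (m[1] ?? '').trim(), expects: [] };
      scenario.steps.push(step);
    } else if (step && (m = /^\s*-\s*expect:\s*(.*)$/.exec(line))) {
      // "    - expect: <outcome>" attaches to the most recent step.
      step.expects.push((m[1] ?? '').trim());
    }
  }
  return scenarios;
}

// Made-up plan fragment for illustration.
const samplePlan = [
  '### 1. Adding Todos',
  '',
  '**Seed:** `tests/seed.spec.ts`',
  '',
  '#### 1.1. should-add-single-todo',
  '',
  '**File:** `tests/adding-todos/should-add-single-todo.spec.ts`',
  '',
  '**Steps:**',
  "  1. Type 'Buy milk' into the input",
  '    - expect: the input shows the text',
  '  2. Press Enter',
  '    - expect: the list shows one item',
].join('\n');

const parsedPlan = parsePlan(samplePlan);
console.log(parsedPlan.length); // prints 1
```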

---

## 2. Generate

Goal: take a spec file and produce Playwright test files. Optionally update the spec if it has drifted.

### 2.1 Inputs

- **Spec file**, e.g. `specs/basic-operations.plan.md`.
- **Target**: a single scenario (e.g. `1.2`), a whole group (e.g. `1`), or all scenarios.
- **Seed file**, read from the `**Seed:**` line of the scenario's group.
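The three target forms can be resolved against the scenario ids found in the spec. This is a small sketch under stated assumptions: `selectTargets` is a hypothetical helper, and the literal string `all` is only an assumed spelling for "all scenarios".

```ts
// Hypothetical helper: resolve a target ("1.2", "1", or "all") to scenario ids.
function selectTargets(scenarioIds: string[], target: string): string[] {
  if (target === 'all') {
    return scenarioIds.slice();
  }
  if (target.indexOf('.') === -1) {
    // A bare group number selects every scenario id that starts with "<group>.".
    const prefix = target + '.';
    return scenarioIds.filter((id) => id.indexOf(prefix) === 0);
  }
  // Otherwise the target names exactly one scenario.
  return scenarioIds.filter((id) => id === target);
}

const scenarioIds = ['1.1', '1.2', '2.1', '10.1'];
console.log(selectTargets(scenarioIds, '1').join(', '));   // prints 1.1, 1.2
console.log(selectTargets(scenarioIds, '1.2').join(', ')); // prints 1.2
console.log(selectTargets(scenarioIds, 'all').length);     // prints 4
```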

### 2.2 Generate one scenario

For each target scenario, in sequence (never in parallel — scenarios share the seed session):

```bash
PLAYWRIGHT_HTML_OPEN=never npx playwright test <seed-file> --debug=cli   # background
playwright-cli attach tw-XXXX
playwright-cli resume                   # let the seed test run fully
```

**Do not** just open the app URL with `playwright-cli`; always go through the test so that any custom setup done there is captured.

Walk the scenario's `Steps:` one by one with `playwright-cli`, treating the spec as the plan and the live app as the source of truth. If a step is vague ("click the button" — which button?), references an element that no longer exists, or contradicts the app's actual behaviour, use your judgement: update the spec to match what the app really does, then keep going. Editing the spec mid-generation is expected.

Every action prints the equivalent Playwright TypeScript (see [test-generation.md](test-generation.md)):

```bash
playwright-cli snapshot                         # find refs
playwright-cli fill e3 "John Doe"               # -> page.getByRole('textbox', {...}).fill(...)
playwright-cli press Enter
playwright-cli click e7
```

For each `- expect:` bullet, add an explicit assertion. See [test-generation.md](test-generation.md) for details.

Collect the generated code and write the test file at the path given in the spec:

```ts
// spec: specs/basic-operations.plan.md
// seed: tests/seed.spec.ts
import { test, expect } from './fixtures';   // or '@playwright/test' if no fixtures file

test.describe('Signing in and out', () => {
  test('should-sign-in', async ({ page }) => {
    // 1. Navigate to the application
    // (handled by the seed fixture)

    // 2. Type 'John Doe' into the username field
    await page.getByRole('textbox', { name: 'username' }).fill('John Doe');

    // 3. Type password
    await page.getByRole('textbox', { name: 'password' }).fill('TestPassword');

    // 4. Press Enter to submit
    await page.getByRole('textbox', { name: 'password' }).press('Enter');

    await expect(page.getByRole('heading')).toContainText('Welcome, John Doe!');
  });
});
```

Rules:

- **One test per file.** File path, describe name, and test name come verbatim from the spec (minus the ordinal).
- Prefix each numbered step with a `// N. <step text>` comment before its actions.
- Use the describe group name verbatim from the spec (no `1.` ordinal).
- Import from the fixtures file (`./fixtures`, with the relative path adjusted to the test's folder) if the project has one; otherwise import from `@playwright/test`.
- **Important**: close the CLI session and stop the background test before moving to the next scenario.
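The rules above are mechanical enough to template. The sketch below renders only the file skeleton (header comments, import, describe/test wrappers, and numbered step comments), leaving the actions and assertions to be pasted from the `playwright-cli` output. It is an illustration under assumptions: `renderTestSkeleton`, `quoteTs`, and the `ScenarioToRender` shape are hypothetical names, not part of any tool.

```ts
// Hypothetical input shape; field names are illustrative only.
interface ScenarioToRender {
  specPath: string;    // e.g. specs/basic-operations.plan.md
  seedPath: string;    // e.g. tests/seed.spec.ts
  importFrom: string;  // fixtures path or '@playwright/test'
  group: string;       // describe name, without the ordinal
  name: string;        // test name, without the ordinal
  steps: string[];     // step text in spec order
}

// Escape backslashes and single quotes for a single-quoted TypeScript string.
function quoteTs(text: string): string {
  return "'" + text.replace(/\\/g, '\\\\').replace(/'/g, "\\'") + "'";
}

function renderTestSkeleton(s: ScenarioToRender): string {
  const lines: string[] = [
    '// spec: ' + s.specPath,
    '// seed: ' + s.seedPath,
    'import { test, expect } from ' + quoteTs(s.importFrom) + ';',
    '',
    'test.describe(' + quoteTs(s.group) + ', () => {',
    '  test(' + quoteTs(s.name) + ', async ({ page }) => {',
  ];
  s.steps.forEach((stepText, index) => {
    if (index > 0) {
      lines.push('');
    }
    // Each numbered step gets a "// N. <step text>" comment before its actions.
    lines.push('    // ' + (index + 1) + '. ' + stepText);
    lines.push('    // TODO: paste the code printed by playwright-cli for this step');
  });
  lines.push('  });');
  lines.push('});');
  return lines.join('\n') + '\n';
}

const skeleton = renderTestSkeleton({
  specPath: 'specs/basic-operations.plan.md',
  seedPath: 'tests/seed.spec.ts',
  importFrom: './fixtures',
  group: 'Signing in and out',
  name: 'should-sign-in',
  steps: ['Navigate to the application', "Type 'John Doe' into the username field"],
});
console.log(skeleton);
```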

### 2.3 Generate multiple scenarios

Loop 2.2 over the targeted scenarios one at a time, restarting the seed between each so every test starts from a clean page. This is safe to parallelise because each run gets a unique generated session name; just make sure each test run is stopped.

### 2.4 Run generated tests

After generation, run the new tests once:

```bash
PLAYWRIGHT_HTML_OPEN=never npx playwright test tests/<group>/<scenario>.spec.ts
```

Any failure goes to Section 3.

---

## 3. Heal

Goal: fix failing tests, and update the spec if the app's intended behaviour changed.

### 3.1 Find failing tests

```bash
PLAYWRIGHT_HTML_OPEN=never npx playwright test
```

Record the list of failing `<file>:<line>` entries and process them one at a time. Do not attempt parallel fixes — shared state and the single CLI session make that fragile.

### 3.2 Debug one failure

Run the single failing test in debug mode in the background, then attach:

```bash
PLAYWRIGHT_HTML_OPEN=never npx playwright test tests/<group>/<scenario>.spec.ts:<line> --debug=cli
# wait for "Debugging Instructions" and the tw-XXXX session name
playwright-cli attach tw-XXXX
```
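The failing entries recorded in 3.1 can be turned into an ordered queue with one debug command per failure. This is a minimal sketch, assuming each entry has exactly the `<file>:<line>` form (no column suffix). `parseFailingEntry`, `buildQueue`, and `toDebugCommand` are hypothetical helpers, and the file paths are made up.

```ts
interface FailingEntry {
  file: string;
  line: number;
}

// Parse "<file>:<line>", e.g. "tests/auth/should-sign-in.spec.ts:5".
function parseFailingEntry(entry: string): FailingEntry | null {
  const m = /^(.+):(\d+)$/.exec(entry.trim());
  if (!m) {
    return null;
  }
  return { file: m[1] ?? '', line: Number(m[2] ?? '0') };
}

// Build the single-test debug command in the form shown above.
function toDebugCommand(entry: FailingEntry): string {
  return 'PLAYWRIGHT_HTML_OPEN=never npx playwright test ' + entry.file + ':' + entry.line + ' --debug=cli';
}

// Keep the original order and drop duplicates so each failure is handled once.
function buildQueue(entries: string[]): FailingEntry[] {
  const seen: { [key: string]: boolean } = {};
  const queue: FailingEntry[] = [];
  for (const raw of entries) {
    const parsed = parseFailingEntry(raw);
    if (parsed === null) {
      continue;
    }
    const key = parsed.file + ':' + parsed.line;
    if (seen[key]) {
      continue;
    }
    seen[key] = true;
    queue.push(parsed);
  }
  return queue;
}

// Made-up entries for illustration.
const healQueue = buildQueue([
  'tests/auth/should-sign-in.spec.ts:5',
  'tests/auth/should-sign-in.spec.ts:5',
  'tests/todos/should-add-single-todo.spec.ts:7',
]);
for (const entry of healQueue) {
  console.log(toDebugCommand(entry));
}
```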

The test is paused at the start. Step forward, or run to just before the failing action or assertion, then diagnose:

```bash
playwright-cli snapshot                # did the element change / move / rename?
playwright-cli console                 # app-side errors?
playwright-cli network                 # failed request? wrong payload?
playwright-cli show --annotate         # ask the user to point somewhere
```

Common causes: selector drift, new wrapper element, label/ARIA rename, timing (transition, async load), assertion text updated in the app, test data leaking between runs.

Rehearse the corrected interaction with `playwright-cli` — the generated code in the output is what you paste back into the test.

### 3.3 Apply the fix

Edit the test file: update the locator, assertion, step order, or inputs to match the corrected behaviour. Stop the background debug run. Rerun the single test to confirm green.

Never skip hooks or add sleeps as a fix. Never use `networkidle`.

### 3.4 Reconcile with the spec

Open the spec referenced by the `// spec:` header in the test file and locate the scenario that matches the test.

- **Fix was purely technical** (locator drift, better assertion shape) and the spec's user-level behaviour still matches the app → leave the spec alone.
- **Fix changed user-visible steps, inputs, order, or expected outcomes** that the spec describes → update the spec to match reality. Keep the scenario id and file path stable; only the step / expect lines change.
- **Unclear whether the app change is intentional** (spec is stale) **or a regression** (test was right, app is wrong) → **stop and ask the user**. Provide:
  - the scenario id (e.g. `2.3`),
  - the spec lines that no longer match,
  - the observed app behaviour (quote a snapshot excerpt or a concrete outcome).

Only after the user answers, either update the spec (intentional change) or file/flag the test as covering a bug (regression).
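The three cases above amount to a small decision table plus a report for the "ask the user" branch. The sketch below only encodes that table. Classifying a fix still takes judgement, and `FixKind`, `SpecAction`, `specActionFor`, and `formatQuestion` are hypothetical names with made-up sample data.

```ts
type FixKind = 'technical' | 'behaviour-changed' | 'unclear';
type SpecAction = 'leave-spec' | 'update-spec' | 'ask-user';

// Map the kind of fix to what should happen to the spec.
function specActionFor(kind: FixKind): SpecAction {
  if (kind === 'technical') {
    return 'leave-spec';
  }
  if (kind === 'behaviour-changed') {
    return 'update-spec';
  }
  return 'ask-user';
}

// The three items to hand to the user when the cause is unclear.
interface SpecMismatch {
  scenarioId: string;   // e.g. "2.3"
  specLines: string[];  // spec lines that no longer match
  observed: string;     // observed app behaviour
}

function formatQuestion(m: SpecMismatch): string {
  const header = ['Scenario ' + m.scenarioId + ' no longer matches the app.', 'Spec lines:'];
  const quoted = m.specLines.map((specLine) => '  ' + specLine);
  const footer = ['Observed: ' + m.observed, 'Is this an intentional change or a regression?'];
  return header.concat(quoted, footer).join('\n');
}

console.log(specActionFor('unclear')); // prints ask-user
console.log(
  formatQuestion({
    scenarioId: '2.3',
    specLines: ['- expect: the list shows one item'],
    observed: 'the list stays empty after pressing Enter',
  }),
);
```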

### 3.5 Iteration and giving up

- Fix failures one at a time; rerun after each.
- If after thorough investigation you are confident the test is correct but the app is wrong *and* the user has confirmed it's a bug: mark the test `test.fixme(...)` with a comment pointing at the user's decision or issue link. Never silently skip.

---

## Cross-references

| For... | See |
|---|---|
| `--debug=cli` / attach mechanics | [playwright-tests.md](playwright-tests.md) |
| How `playwright-cli` actions become TS | [test-generation.md](test-generation.md) |
| Mocking requests during exploration/generation | [request-mocking.md](request-mocking.md) |
| Managing the CLI browser session | [session-management.md](session-management.md) |