Fetch any URL via Chrome CDP and convert the rendered page to clean markdown with YouTube transcript support.
references/quality-gate.md
# Quality Gate & Recovery

Headless Chrome can silently return low-quality content (layout shells, login walls, or framework payloads) without the CLI returning a non-zero exit code. Read this after every headless run so you can catch and recover from those cases.

## Checks the Agent Must Run

1. Confirm the markdown title matches the target page, not a generic site shell
2. Confirm the body contains the expected article/page content, not just navigation, footer, or a generic error
3. Watch for obvious failure signs:
   - `Application error`
   - `This page could not be found`
   - Login, signup, subscribe, or verification shells
   - Extremely short markdown for a page that should be long-form
   - Raw framework payloads or mostly boilerplate content
4. Do NOT accept a run as successful just because the CLI exited `0`

**Tip**: run with `--format json` to get structured signals including `status`, `login.state`, and `interaction`. `"status": "needs_interaction"` means the page requires manual interaction.

## Recovery Workflow

1. Start headless (default) unless there is already a clear reason to use interaction mode
2. Review markdown quality immediately after the run
3. If the content is low quality or indicates login/CAPTCHA:
   - `--wait-for interaction` for auto-detected gates (login, CAPTCHA, Cloudflare)
   - `--wait-for force` when the page needs manual browsing, scroll loading, or complex interaction
4. If `--wait-for` is used, tell the user exactly what to do:
   - Login required → sign in in the browser
   - CAPTCHA visible → solve it
   - Slow loading → wait until content is visible
   - `--wait-for force` → press Enter when ready
5. If JSON output shows `"status": "needs_interaction"`, switch to `--wait-for interaction` automatically

## Capture Modes

| Mode | Behavior | Use When |
|------|----------|----------|
| Default | Headless Chrome, auto-extract on network idle | Public pages, static content |
| `--headless` | Explicit headless (same as default) | Clarify intent |
| `--wait-for interaction` | Opens visible Chrome, auto-detects login/CAPTCHA gates, waits for them to clear, then continues | Login-required, CAPTCHA-protected |
| `--wait-for force` | Opens visible Chrome, auto-detects OR accepts Enter keypress to continue | Complex flows, lazy loading, paywalls |

**Interaction gate auto-detection**: Cloudflare Turnstile / "just a moment" pages, Google reCAPTCHA, hCaptcha, custom challenge / verification screens.
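
## Example: Scripting the Checks

The checks and recovery workflow above can be approximated in a small script. The following is a minimal sketch, not part of the CLI: the function names `quality_problems` and `next_wait_mode`, the gate keyword pattern, and the 500-character threshold are all hypothetical choices for illustration. It works only on the markdown text and an already-parsed `--format json` result, and reads just the `status` field described above.

```python
import re

# Failure phrases taken from the checklist above.
FAILURE_PHRASES = ("application error", "this page could not be found")
# Hypothetical gate keywords and length threshold; tune them for your pages.
GATE_PATTERN = re.compile(
    r"\b(log ?in|sign ?in|sign ?up|subscribe|captcha|verify)\b", re.IGNORECASE
)
MIN_LONGFORM_CHARS = 500


def quality_problems(markdown, expected_title_words, expect_longform=True):
    """Return reasons the capture looks low quality (empty list = looks fine)."""
    problems = []
    lines = markdown.strip().splitlines()
    # Use the first markdown heading as the page title.
    title = next((ln.lstrip("# ").strip() for ln in lines if ln.startswith("#")), "")
    if not any(word.lower() in title.lower() for word in expected_title_words):
        problems.append("title does not match target page")
    lowered = markdown.lower()
    for phrase in FAILURE_PHRASES:
        if phrase in lowered:
            problems.append(f"failure phrase: {phrase}")
    if expect_longform and len(markdown.strip()) < MIN_LONGFORM_CHARS:
        problems.append("markdown too short for a long-form page")
        # Gate keywords are only treated as a signal when the body is short,
        # since real articles may mention "subscribe" or "sign in" in passing.
        if GATE_PATTERN.search(markdown):
            problems.append("looks like a login/verification shell")
    return problems


def next_wait_mode(problems, json_result=None):
    """Return the --wait-for value to retry with, or None to accept the run."""
    if json_result and json_result.get("status") == "needs_interaction":
        return "interaction"
    if not problems:
        return None
    if any("shell" in p for p in problems):
        return "interaction"
    # Assumption: any other low-quality result falls back to manual browsing.
    return "force"
```

A run would be accepted only when `next_wait_mode(...)` returns `None`; otherwise the returned value is the `--wait-for` mode to retry with. Keyword heuristics like these miss cases a human review would catch, so they supplement the manual checks rather than replace them.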