Source from repo

URL to Markdown

Fetch any URL via Chrome CDP and convert the rendered page to clean markdown with YouTube transcript support.

jimliuGitHub jimliuSource repo Original GitHub link Publisher page

Files

Skill

n/a

Size

271.9 KB

Entrypoint

SKILL.md

Format

git-repo

Open file

references/adapters.md

Syntax-highlighted preview of this file as included in the skill package.

Rendered Source

markdown78 linesFree

references/adapters.md

1# Adapters & Media
2 
3Read when choosing an adapter, handling media, or answering adapter-specific questions.
4 
5## Built-in Adapters
6 
7| Adapter | URLs | Key Features |
8|---------|------|-------------|
9| `x` | x.com, twitter.com | Tweets, threads, X Articles, media, login detection |
10| `youtube` | youtube.com, youtu.be | Transcript/captions, chapters, cover image, metadata |
11| `hn` | news.ycombinator.com | Threaded comments, story metadata, nested replies |
12| `generic` | Any URL (fallback) | Defuddle extraction, Readability fallback, auto-scroll, network idle detection |
13 
14Adapter is auto-selected based on URL. Override with `--adapter <name>`.
15 
16### YouTube
17 
18- Extracts transcripts/captions when available
19- Transcript format: `[MM:SS] Text segment` with chapter headings
20- Availability depends on YouTube exposing a caption track; videos with captions disabled or restricted playback may produce description-only output
21- Use `--wait-for force` if the page needs time to finish loading player metadata
22 
23### X/Twitter
24 
25- Extracts single tweets, threads, and X Articles
26- Auto-detects login state; if logged out and content requires auth, JSON output shows `"status": "needs_interaction"`
27- Use `--wait-for interaction` for login-protected content
28 
29### Hacker News
30 
31- Parses threaded comments with proper nesting and reply hierarchy
32- Includes story metadata (title, URL, author, score, comment count)
33- Shows comment deletion/dead status
34 
35## Media Download Workflow
36 
37Driven by `download_media` in EXTEND.md:
38 
39| Setting | Behavior |
40|---------|----------|
41| `1` (always) | Run CLI with `--download-media --output <path>` |
42| `0` (never) | Run CLI with `--output <path>` (no media download) |
43| `ask` (default) | Follow the ask-each-time flow below |
44 
45### Ask-Each-Time Flow
46 
471. Run the CLI **without** `--download-media` with `--output <path>` → markdown saved
482. Check the saved markdown for remote media URLs (`https://` in image/video links)
493. **If no remote media found** → done, no prompt needed
504. **If remote media found** → ask via `AskUserQuestion`:
51   - header: "Media", question: "Download N images/videos to local files?"
52   - "Yes" — Download to local directories
53   - "No" — Keep remote URLs
545. If the user confirms → run the CLI **again** with `--download-media --output <same-path>` (overwrites markdown with localized links)
55 
56### Media Layout
57 
58When `--download-media` is enabled:
59 
60- Images → `imgs/` next to the output file (or `--media-dir`)
61- Videos → `videos/` next to the output file (or `--media-dir`)
62- Markdown media links are rewritten to local relative paths
63 
64## Output Format
65 
66Markdown to stdout (or file with `--output`).
67 
68JSON output (`--format json`) returns structured data:
69 
70- `adapter` — which adapter handled the URL
71- `status` — `"ok"` or `"needs_interaction"`
72- `login` — login state detection (`logged_in`, `logged_out`, `unknown`)
73- `interaction` — interaction gate details (kind, provider, prompt)
74- `document` — structured content (url, title, author, publishedAt, content blocks, metadata)
75- `media` — collected media assets with url, kind, role
76- `markdown` — converted markdown text
77- `downloads` — media download results (when `--download-media` used)
78

Marketplace

Source from repo

URL to Markdown

Fetch any URL via Chrome CDP and convert the rendered page to clean markdown with YouTube transcript support.

jimliuGitHub jimliuSource repo Original GitHub link Publisher page

Files

Skill

n/a

Size

271.9 KB

Entrypoint

SKILL.md

Format

git-repo

Open file

references/adapters.md

Syntax-highlighted preview of this file as included in the skill package.

Rendered Source

markdown78 linesFree

references/adapters.md

1# Adapters & Media
2 
3Read when choosing an adapter, handling media, or answering adapter-specific questions.
4 
5## Built-in Adapters
6 
7| Adapter | URLs | Key Features |
8|---------|------|-------------|
9| `x` | x.com, twitter.com | Tweets, threads, X Articles, media, login detection |
10| `youtube` | youtube.com, youtu.be | Transcript/captions, chapters, cover image, metadata |
11| `hn` | news.ycombinator.com | Threaded comments, story metadata, nested replies |
12| `generic` | Any URL (fallback) | Defuddle extraction, Readability fallback, auto-scroll, network idle detection |
13 
14Adapter is auto-selected based on URL. Override with `--adapter <name>`.
15 
16### YouTube
17 
18- Extracts transcripts/captions when available
19- Transcript format: `[MM:SS] Text segment` with chapter headings
20- Availability depends on YouTube exposing a caption track; videos with captions disabled or restricted playback may produce description-only output
21- Use `--wait-for force` if the page needs time to finish loading player metadata
22 
23### X/Twitter
24 
25- Extracts single tweets, threads, and X Articles
26- Auto-detects login state; if logged out and content requires auth, JSON output shows `"status": "needs_interaction"`
27- Use `--wait-for interaction` for login-protected content
28 
29### Hacker News
30 
31- Parses threaded comments with proper nesting and reply hierarchy
32- Includes story metadata (title, URL, author, score, comment count)
33- Shows comment deletion/dead status
34 
35## Media Download Workflow
36 
37Driven by `download_media` in EXTEND.md:
38 
39| Setting | Behavior |
40|---------|----------|
41| `1` (always) | Run CLI with `--download-media --output <path>` |
42| `0` (never) | Run CLI with `--output <path>` (no media download) |
43| `ask` (default) | Follow the ask-each-time flow below |
44 
45### Ask-Each-Time Flow
46 
471. Run the CLI **without** `--download-media` with `--output <path>` → markdown saved
482. Check the saved markdown for remote media URLs (`https://` in image/video links)
493. **If no remote media found** → done, no prompt needed
504. **If remote media found** → ask via `AskUserQuestion`:
51   - header: "Media", question: "Download N images/videos to local files?"
52   - "Yes" — Download to local directories
53   - "No" — Keep remote URLs
545. If the user confirms → run the CLI **again** with `--download-media --output <same-path>` (overwrites markdown with localized links)
55 
56### Media Layout
57 
58When `--download-media` is enabled:
59 
60- Images → `imgs/` next to the output file (or `--media-dir`)
61- Videos → `videos/` next to the output file (or `--media-dir`)
62- Markdown media links are rewritten to local relative paths
63 
64## Output Format
65 
66Markdown to stdout (or file with `--output`).
67 
68JSON output (`--format json`) returns structured data:
69 
70- `adapter` — which adapter handled the URL
71- `status` — `"ok"` or `"needs_interaction"`
72- `login` — login state detection (`logged_in`, `logged_out`, `unknown`)
73- `interaction` — interaction gate details (kind, provider, prompt)
74- `document` — structured content (url, title, author, publishedAt, content blocks, metadata)
75- `media` — collected media assets with url, kind, role
76- `markdown` — converted markdown text
77- `downloads` — media download results (when `--download-media` used)
78

URL to Markdown

references/adapters.md

Preparing the source view

URL to Markdown

references/adapters.md