shirenchuang

Web Content Fetcher: Web Page Main-Content Extraction

Extract clean Markdown content from any URL using a three-tier strategy: Jina Reader, Scrapling, or web_fetch.

Price

Free

Rating

0.0

About

Fetches a web page's main content and converts it to clean Markdown, preserving headings, links, images, and code blocks. It uses a three-tier fallback: Jina Reader (fast, 200 free requests per day), Scrapling with html2text (no request limit; handles WeChat and anti-bot sites such as Substack and Medium), and direct web_fetch (for static pages). Domain-based routing shortcuts skip Jina for known anti-scraping platforms, and a two-failure stop rule prevents infinite retries.
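The fallback, routing, and stop rule described above could be sketched roughly as follows. This is an illustration only, not the skill's actual code: the tier functions are passed in as plain callables (no real Jina Reader or Scrapling calls are made), the `SKIP_JINA_DOMAINS` table and the names `skips_jina` and `fetch_markdown` are hypothetical, and the stop rule is read here as "give up once two tiers have failed for a URL", which is one possible interpretation.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import urlparse

# Hypothetical routing table: domains assumed to be poorly served by Jina Reader.
SKIP_JINA_DOMAINS = {"mp.weixin.qq.com", "substack.com", "medium.com"}

# A tier takes a URL and returns Markdown; it raises an exception on failure.
Fetcher = Callable[[str], str]


def skips_jina(url: str) -> bool:
    """Return True if the host is a listed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in SKIP_JINA_DOMAINS)


def fetch_markdown(
    url: str, tiers: Dict[str, Fetcher], max_failures: int = 2
) -> Tuple[str, str]:
    """Try tiers in order; return (tier_name, markdown) from the first success.

    Stops and raises RuntimeError once max_failures tiers have failed
    (assumed reading of the two-failure stop rule).
    """
    order: List[str] = ["jina", "scrapling", "web_fetch"]
    if skips_jina(url):
        order.remove("jina")  # routing shortcut: go straight to Scrapling

    failures: List[str] = []
    for name in order:
        if len(failures) >= max_failures:
            break  # stop rule: do not keep retrying
        try:
            markdown = tiers[name](url)
        except Exception as exc:  # any tier error counts as one failure
            failures.append(f"{name}: {exc}")
            continue
        if markdown.strip():
            return name, markdown
        failures.append(f"{name}: empty result")

    raise RuntimeError("giving up after failures: " + "; ".join(failures))
```

Under this reading, a URL on an ordinary domain tries Jina first and then Scrapling, while a Substack or Medium URL starts at Scrapling. Passing the tiers in as callables keeps the routing logic testable without network access.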

By shirenchuang

Identity GitHub shirenchuang

What the agent sees

name

skills-sh-shirenchuang-web-content-fetcher-web-content-fetcher

description

Extract clean Markdown content from any URL using a three-tier strategy: Jina Reader, Scrapling, or web_fetch.