Agent Skills for Context Engineering

A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.

Source repo: muratcankoylan (GitHub)

examples/llm-as-judge-skills/tools/research/read-url.md

# Read URL Tool

## Purpose

Extract and parse content from a given URL. Returns structured text content with metadata about the source.

## Tool Definition

```typescript
import { tool } from "ai";
import { z } from "zod";

export const readUrl = tool({
  description: `Read and extract content from a URL.
Returns the main text content, stripped of navigation and ads.
Use after webSearch to get full content from relevant results.`,

  parameters: z.object({
    url: z.string().url()
      .describe("The URL to read"),

    contentType: z.enum(["auto", "article", "documentation", "paper", "code"]).default("auto")
      .describe("Hint for content type to optimize extraction"),

    maxLength: z.number().min(1000).max(50000).default(10000)
      .describe("Maximum characters to return"),

    extractSections: z.boolean().default(true)
      .describe("Whether to identify and label sections"),

    includeMetadata: z.boolean().default(true)
      .describe("Include author, date, and other metadata")
  }),

  // extractUrlContent is the extraction helper; it is not defined in this file.
  execute: async (input) => {
    return extractUrlContent(input);
  }
});
```

## Input Schema

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| url | string | Yes | URL to read (must be a valid URL) |
| contentType | enum | No | Content type hint: `auto`, `article`, `documentation`, `paper`, or `code` (default: `auto`) |
| maxLength | number | No | Max chars to return, 1000 to 50000 (default: 10000) |
| extractSections | boolean | No | Identify and label sections (default: true) |
| includeMetadata | boolean | No | Include author, date, and other metadata (default: true) |

## Output Schema

```typescript
interface ReadUrlResult {
  success: boolean;

  url: string;
  title: string;

  content: {
    full: string;
    sections?: {
      heading: string;
      level: number;  // h1=1, h2=2, etc.
      content: string;
    }[];
  };

  metadata?: {
    author?: string;
    publishedDate?: string;
    lastModified?: string;
    description?: string;
    keywords?: string[];
    source: string;
  };

  stats: {
    totalCharacters: number;
    truncated: boolean;
    sectionsFound: number;
  };

  error?: {
    code: string;
    message: string;
  };
}
```

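The `content` and `stats` fields above could be filled in along the following lines. This is a minimal sketch and not the repository's implementation: `stripTags`, `extractSections`, and `buildContent` are hypothetical helpers, the regex-based HTML handling is a simplification (a real extractor would more likely use an HTML parser and decode entities), and it assumes `totalCharacters` counts the characters actually returned.

```typescript
// Hypothetical helpers for illustration only.
interface Section {
  heading: string;
  level: number; // h1=1, h2=2, etc.
  content: string;
}

// Replace tags with spaces and collapse whitespace so only readable text remains.
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Split HTML on <h1>..<h6>; each section's content runs up to the next heading.
function extractSections(html: string): Section[] {
  const headingPattern = /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/gi;
  const found: { level: number; heading: string; start: number; end: number }[] = [];
  let m: RegExpExecArray | null;
  while ((m = headingPattern.exec(html)) !== null) {
    found.push({
      level: Number(m[1]),
      heading: stripTags(m[2]),
      start: m.index,
      end: m.index + m[0].length,
    });
  }
  return found.map((h, i) => {
    const next = i + 1 < found.length ? found[i + 1].start : html.length;
    return { heading: h.heading, level: h.level, content: stripTags(html.slice(h.end, next)) };
  });
}

// Cap the text at maxLength and derive the stats fields of ReadUrlResult.
function buildContent(html: string, maxLength: number) {
  const sections = extractSections(html);
  const text = stripTags(html);
  const truncated = text.length > maxLength;
  const full = truncated ? text.slice(0, maxLength) : text;
  return {
    content: { full, sections },
    stats: { totalCharacters: full.length, truncated, sectionsFound: sections.length },
  };
}
```

Cutting at a fixed character count can split a word or sentence; trimming back to the last sentence boundary would be a reasonable refinement.
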
## Usage Example

```typescript
const content = await readUrl.execute({
  url: "https://eugeneyan.com/writing/llm-evaluators/",
  contentType: "article",
  maxLength: 15000,
  extractSections: true,
  includeMetadata: true
});

// Result:
// {
//   success: true,
//   url: "https://eugeneyan.com/writing/llm-evaluators/",
//   title: "Evaluating the Effectiveness of LLM-Evaluators",
//   content: {
//     full: "LLM-evaluators, also known as LLM-as-a-Judge...",
//     sections: [
//       {
//         heading: "Key considerations before adopting an LLM-evaluator",
//         level: 2,
//         content: "Before reviewing the literature..."
//       },
//       ...
//     ]
//   },
//   metadata: {
//     author: "Eugene Yan",
//     publishedDate: "2024-06-15",
//     source: "eugeneyan.com"
//   },
//   stats: {
//     totalCharacters: 15000,
//     truncated: true,
//     sectionsFound: 8
//   }
// }
```

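A caller will usually branch on `success` before using the content. The sketch below shows one way to turn a result into text for a model's context; `formatForContext` and the trimmed `ReadUrlResultLike` type are hypothetical names introduced for this example, and the output layout is an assumption.

```typescript
// Trimmed copy of the ReadUrlResult fields this example needs.
interface ReadUrlResultLike {
  success: boolean;
  url: string;
  title: string;
  content: { full: string };
  stats: { totalCharacters: number; truncated: boolean; sectionsFound: number };
  error?: { code: string; message: string };
}

// Hypothetical helper: render a result as plain text, reporting failures explicitly.
function formatForContext(result: ReadUrlResultLike): string {
  if (!result.success) {
    const code = result.error?.code ?? "UNKNOWN";
    const message = result.error?.message ?? "no error details";
    return `Could not read ${result.url}: ${code} (${message})`;
  }
  // Flag truncation so downstream readers know the text is incomplete.
  const note = result.stats.truncated
    ? ` [truncated to ${result.stats.totalCharacters} characters]`
    : "";
  return `# ${result.title}\nSource: ${result.url}${note}\n\n${result.content.full}`;
}
```
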
## Content Type Handling

| Type | Optimization |
|------|--------------|
| article | Prioritize main content, skip sidebars |
| documentation | Preserve code blocks, keep structure |
| paper | Extract abstract, sections, references |
| code | Preserve formatting, syntax highlighting |
| auto | Detect type from content |

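
The document does not say how `auto` picks a type. One possible approach is a few cheap heuristics over the URL and the fetched HTML, sketched below. `detectContentType` is a hypothetical function, and every rule in it (the host patterns, the path patterns, the threshold of three `<pre>` blocks) is an assumption chosen for illustration.

```typescript
type ResolvedContentType = "article" | "documentation" | "paper" | "code";

// Hypothetical heuristic for contentType "auto"; all rules here are illustrative guesses.
function detectContentType(url: string, html: string): ResolvedContentType {
  const lowerUrl = url.toLowerCase();

  // Preprint pages are treated as papers.
  if (/^https?:\/\/arxiv\.org\/(abs|pdf)\//.test(lowerUrl)) return "paper";

  // Raw file hosts are treated as code.
  if (/^https?:\/\/(raw\.githubusercontent\.com|gist\.github\.com)\//.test(lowerUrl)) return "code";

  // A docs subdomain or a /docs path segment suggests documentation.
  if (/^https?:\/\/docs\./.test(lowerUrl) || /\/(docs|documentation)(\/|$)/.test(lowerUrl)) {
    return "documentation";
  }

  // Content signal: several <pre> blocks also suggest documentation.
  const preBlocks = (html.match(/<pre[\s>]/gi) ?? []).length;
  if (preBlocks >= 3) return "documentation";

  // Fall back to the most general extraction mode.
  return "article";
}
```

Falling back to `article` keeps the default behavior conservative: main content is kept and sidebars are skipped.
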
## Error Handling

```typescript
const errorCodes = {
  "URL_NOT_FOUND": "Page does not exist (404)",
  "ACCESS_DENIED": "Page requires authentication (401/403)",
  "TIMEOUT": "Request timed out",
  "BLOCKED": "Access blocked by robots.txt or rate limit",
  "INVALID_CONTENT": "Content could not be parsed",
  "UNSUPPORTED_TYPE": "Content type not supported (e.g., binary)"
};
```

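As a sketch of how these codes might be assigned, the function below classifies an HTTP response by status and `Content-Type` header. `classifyResponse` is a hypothetical helper. The 404 and 401/403 mappings come from the table above; treating 429 as `BLOCKED`, 408/504 as `TIMEOUT`, other non-2xx statuses as `INVALID_CONTENT`, and the list of accepted MIME types are all assumptions.

```typescript
type ReadUrlErrorCode =
  | "URL_NOT_FOUND"
  | "ACCESS_DENIED"
  | "TIMEOUT"
  | "BLOCKED"
  | "INVALID_CONTENT"
  | "UNSUPPORTED_TYPE";

// Hypothetical helper: returns an error code, or null when the response looks readable.
function classifyResponse(status: number, contentTypeHeader: string): ReadUrlErrorCode | null {
  if (status === 404) return "URL_NOT_FOUND";
  if (status === 401 || status === 403) return "ACCESS_DENIED";
  if (status === 429) return "BLOCKED"; // assumption: rate limiting maps to BLOCKED
  if (status === 408 || status === 504) return "TIMEOUT"; // assumption
  if (status < 200 || status >= 300) return "INVALID_CONTENT"; // assumption: catch-all

  // Drop parameters such as "; charset=utf-8" before comparing MIME types.
  const mime = contentTypeHeader.split(";")[0].trim().toLowerCase();
  const textual =
    mime.startsWith("text/") ||
    mime === "application/xhtml+xml" ||
    mime === "application/json";
  return textual ? null : "UNSUPPORTED_TYPE";
}
```

Client-side timeouts and robots.txt denials never produce a response, so they would need to be mapped to `TIMEOUT` and `BLOCKED` separately.
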
## Implementation Notes

1. **Respect robots.txt**: Check and honor robots.txt directives
2. **Rate Limiting**: Don't hammer the same domain; space out repeated requests
3. **User Agent**: Use an appropriate user agent string
4. **Timeouts**: Set reasonable timeouts (10-30s)
5. **JavaScript Rendering**: Consider a headless browser for JS-heavy sites
6. **Caching**: Cache content for repeated reads

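
Notes 2 and 6 can be sketched as a small piece of in-memory state that tracks the last request per host and caches page bodies with a time-to-live. `createPoliteState` and `hostOf` are hypothetical names, the interval and TTL values are up to the caller, and the clock is injected only so the behavior can be tested without waiting.

```typescript
// Extract the host from an absolute URL; falls back to the input if it does not parse.
function hostOf(url: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^/?#@]*@)?([^/?#:]+)/i.exec(url);
  return match ? match[1].toLowerCase() : url;
}

// Hypothetical per-host throttle plus TTL cache, kept in memory.
function createPoliteState(
  minIntervalMs: number,
  cacheTtlMs: number,
  now: () => number = () => Date.now(),
) {
  const lastRequestAt = new Map<string, number>();
  const cache = new Map<string, { body: string; storedAt: number }>();

  return {
    // Milliseconds to wait before the next request to this URL's host (0 if none).
    delayFor(url: string): number {
      const last = lastRequestAt.get(hostOf(url));
      if (last === undefined) return 0;
      return Math.max(0, last + minIntervalMs - now());
    },

    // Call right before sending a request so later calls are spaced out.
    recordRequest(url: string): void {
      lastRequestAt.set(hostOf(url), now());
    },

    // Return the cached body, or undefined if missing or older than the TTL.
    getCached(url: string): string | undefined {
      const entry = cache.get(url);
      if (!entry) return undefined;
      if (now() - entry.storedAt > cacheTtlMs) {
        cache.delete(url);
        return undefined;
      }
      return entry.body;
    },

    store(url: string, body: string): void {
      cache.set(url, { body, storedAt: now() });
    },
  };
}
```

A fetch wrapper would check `getCached` first, then wait for `delayFor(url)` milliseconds, call `recordRequest`, and finally `store` the body on success.
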