---
name: vercel-tool-reduction
description: Vercel's case study on removing 80% of their agent's specialized tools and replacing them with a single file system agent tool, resulting in 100% success rate and improved performance.
doc_type: blog
source_url: https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools
---

# We removed 80% of our agent's tools

Andrew Qu, Chief of Software, Vercel
Dec 22, 2025 · 4 min read

It got better.

We spent months building a sophisticated internal text-to-SQL agent, d0, with specialized tools, heavy prompt engineering, and careful context management. It worked… kind of. But it was fragile, slow, and required constant maintenance.

So we tried something different. We deleted most of it and stripped the agent down to a single tool: execute arbitrary bash commands. We call this a file system agent. Claude gets direct access to your files and figures things out using grep, cat, and ls.

The agent got simpler and better at the same time. 100% success rate instead of 80%. Fewer steps, fewer tokens, faster responses. All by doing less.

## What is d0

If v0 is our AI for building UI, d0 is our AI for understanding data.

*d0 enables anyone to make data-driven decisions by asking it questions in Slack*

d0 translates natural language questions into SQL queries against our analytics infrastructure, letting anyone on the team get answers without writing code or waiting on the data team.

When d0 works well, it democratizes data access across the company. When it breaks, people lose trust and go back to pinging analysts in Slack. We need d0 to be fast, accurate, and reliable.

## Getting out of the model's way

Looking back, we were solving problems the model could handle on its own. We assumed it would get lost in complex schemas, make bad joins, or hallucinate table names.
So we built guardrails. We pre-filtered context, constrained its options, and wrapped every interaction in validation logic. We were doing the model's thinking for it:

- Built multiple specialized tools (schema lookup, query validation, error recovery, etc.)
- Added heavy prompt engineering to constrain reasoning
- Utilized careful context management to avoid overwhelming the model
- Wrote hand-coded retrieval to surface "relevant" schema information and dimensional attributes

Every edge case meant another patch, and every model update meant re-calibrating our constraints. We were spending more time maintaining the scaffolding than improving the agent.

[email protected] ToolLoopAgent

```ts
import { ToolLoopAgent } from 'ai';
import { GetEntityJoins, LoadCatalog, /*...*/ } from '@/lib/tools';

const agent = new ToolLoopAgent({
  model: "anthropic/claude-opus-4.5",
  instructions: "",
  tools: {
    GetEntityJoins, LoadCatalog, RecallContext, LoadEntityDetails,
    SearchCatalog, ClarifyIntent, SearchSchema, GenerateAnalysisPlan,
    FinalizeQueryPlan, FinalizeNoData, JoinPathFinder, SyntaxValidator,
    FinalizeBuild, ExecuteSQL, FormatResults, VisualizeData, ExplainResults
  },
});
```

## A new idea: what if we just… stopped?

We realized we were fighting gravity. Constraining the model's reasoning. Summarizing information that it could read on its own. Building tools to protect it from complexity that it could handle.

So we stopped. The hypothesis was: what if we just give Claude access to the raw Cube DSL files and let it cook? What if bash is all you need?
Models are getting smarter and context windows are getting larger, so maybe the best agent architecture is almost no architecture at all.

## v2: The file system is the agent

The new stack:

- Model: Claude Opus 4.5 via the AI SDK
- Execution: Vercel Sandbox for context exploration
- Routing: Vercel Gateway for request handling and observability
- Server: Next.js API route using Vercel Slack Bolt
- Data layer: Cube semantic layer as a directory of YAML, Markdown, and JSON files

The file system agent now browses our semantic layer the way a human analyst would. It reads files, greps for patterns, builds mental models, and writes SQL using standard Unix tools like grep, cat, find, and ls.

This works because the semantic layer is already great documentation. The files contain dimension definitions, measure calculations, and join relationships. We were building tools to summarize what was already legible. Claude just needed access to read it directly.

[email protected] ToolLoopAgent

```ts
import { Sandbox } from "@vercel/sandbox";
import { files } from './semantic-catalog';
import { tool, ToolLoopAgent } from "ai";
import { ExecuteSQL } from "@/lib/tools";

const sandbox = await Sandbox.create();
await sandbox.writeFiles(files);

function executeCommandTool(sandbox: Sandbox) {
  return tool({
    /* ... */
    execute: async ({ command }) => {
      const result = await sandbox.exec(command);
      return { /* */ };
    }
  });
}

const agent = new ToolLoopAgent({
  model: "anthropic/claude-opus-4.5",
  instructions: "",
  tools: {
    ExecuteCommand: executeCommandTool(sandbox),
    ExecuteSQL,
  },
});
```

## 3.5x faster, 37% fewer tokens, 100% success rate

We benchmarked the old architecture against the new file system approach across 5 representative queries.

| Metric | Advanced (old) | File system (new) | Change |
| --- | --- | --- | --- |
| Avg execution time | 274.8s | 77.4s | 3.5x faster |
| Success rate | 4/5 (80%) | 5/5 (100%) | +20% |
| Avg token usage | ~102k tokens | ~61k tokens | 37% fewer tokens |
| Avg steps | ~12 steps | ~7 steps | 42% fewer steps |

The file system agent won every comparison. The old architecture's worst case was Query 2, which took 724 seconds, 100 steps, and 145,463 tokens before failing. The file system agent completed the same query in 141 seconds with 19 steps and 67,483 tokens, and it actually succeeded.

The qualitative shift matters just as much. The agent catches edge cases we never anticipated and explains its reasoning in ways we can follow.

## Lessons learned

Don't fight gravity. File systems are an incredibly powerful abstraction. Grep is 50 years old and still does exactly what we need. We were building custom tools for what Unix already solves.

We were constraining reasoning because we didn't trust the model to reason. With Opus 4.5, that constraint became a liability. The model makes better choices when we stop making choices for it.

This only worked because our semantic layer was already good documentation. The YAML files are well-structured, consistently named, and contain clear definitions. If your data layer is a mess of legacy naming conventions and undocumented joins, giving Claude raw file access won't save you. You'll just get faster bad queries.

Addition by subtraction is real.
The best agents might be the ones with the fewest tools. Every tool is a choice you're making for the model. Sometimes the model makes better choices.

## What this means for agent builders

The temptation is always to account for every possibility. Resist it. Start with the simplest possible architecture. Model + file system + goal. Add complexity only when you've proven it's necessary.

But simple architecture isn't enough on its own. The model needs good context to work with. Invest in documentation, clear naming, and well-structured data. That foundation matters more than clever tooling.

Models are improving faster than your tooling can keep up. Build for the model that you'll have in six months, not for the one that you have today.

If you're building agents, we'd love to hear what you're learning.
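For a sense of how small the single-tool surface is, here is a rough standalone sketch of a shell-execution tool body, using Node's built-in `child_process` in place of the Vercel Sandbox the post uses. The function name `runShell`, the timeout, and the output cap are illustrative assumptions, not details from the post:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// The single tool: run a shell command and return what the model needs
// to keep exploring -- stdout, stderr, and the exit code.
export async function runShell(command: string): Promise<{
  stdout: string;
  stderr: string;
  exitCode: number;
}> {
  try {
    const { stdout, stderr } = await execFileAsync("bash", ["-c", command], {
      timeout: 30_000,         // keep a runaway command from stalling the loop
      maxBuffer: 1024 * 1024,  // cap output so one command can't flood context
    });
    return { stdout, stderr, exitCode: 0 };
  } catch (err) {
    // execFile rejects on a nonzero exit; stdout/stderr are attached to the error,
    // and the model still gets them as observations rather than a hard failure.
    const e = err as { stdout?: string; stderr?: string; code?: number };
    return { stdout: e.stdout ?? "", stderr: e.stderr ?? "", exitCode: e.code ?? 1 };
  }
}
```

Returning failures as ordinary results instead of throwing matters for the loop: a bad grep pattern or missing file becomes something the model reads and corrects, not an aborted run. In production you would run this inside a sandbox, as the post does, since the model is composing arbitrary commands.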