---
name: vercel-tool-reduction
description: Vercel's case study on removing 80% of their agent's specialized tools and replacing them with a single file system agent tool, resulting in 100% success rate and improved performance.
doc_type: blog
source_url: https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools
---

# We removed 80% of our agent's tools

Andrew Qu, Chief of Software, Vercel
Dec 22, 2025 · 4 min read

It got better.

We spent months building a sophisticated internal text-to-SQL agent, d0, with specialized tools, heavy prompt engineering, and careful context management. It worked… kind of. But it was fragile, slow, and required constant maintenance.

So we tried something different. We deleted most of it and stripped the agent down to a single tool: execute arbitrary bash commands. We call this a file system agent. Claude gets direct access to your files and figures things out using grep, cat, and ls.

The agent got simpler and better at the same time. 100% success rate instead of 80%. Fewer steps, fewer tokens, faster responses. All by doing less.

## What is d0

If v0 is our AI for building UI, d0 is our AI for understanding data.

*d0 enables anyone to make data-driven decisions by asking it questions in Slack*

d0 translates natural language questions into SQL queries against our analytics infrastructure, letting anyone on the team get answers without writing code or waiting on the data team.

When d0 works well, it democratizes data access across the company. When it breaks, people lose trust and go back to pinging analysts in Slack. We need d0 to be fast, accurate, and reliable.

## Getting out of the model's way

Looking back, we were solving problems the model could handle on its own. We assumed it would get lost in complex schemas, make bad joins, or hallucinate table names.
So we built guardrails. We pre-filtered context, constrained its options, and wrapped every interaction in validation logic. We were doing the model's thinking for it:

- Built multiple specialized tools (schema lookup, query validation, error recovery, etc.)
- Added heavy prompt engineering to constrain reasoning
- Utilized careful context management to avoid overwhelming the model
- Wrote hand-coded retrieval to surface "relevant" schema information and dimensional attributes

Every edge case meant another patch, and every model update meant re-calibrating our constraints. We were spending more time maintaining the scaffolding than improving the agent.

[email protected] ToolLoopAgent

```ts
import { ToolLoopAgent } from 'ai';
import { GetEntityJoins, LoadCatalog, /*...*/ } from '@/lib/tools';

const agent = new ToolLoopAgent({
  model: "anthropic/claude-opus-4.5",
  instructions: "",
  tools: {
    GetEntityJoins, LoadCatalog, RecallContext, LoadEntityDetails,
    SearchCatalog, ClarifyIntent, SearchSchema, GenerateAnalysisPlan,
    FinalizeQueryPlan, FinalizeNoData, JoinPathFinder, SyntaxValidator,
    FinalizeBuild, ExecuteSQL, FormatResults, VisualizeData, ExplainResults
  },
});
```

## A new idea: what if we just… stopped?

We realized we were fighting gravity. Constraining the model's reasoning. Summarizing information that it could read on its own. Building tools to protect it from complexity that it could handle.

So we stopped. The hypothesis was: what if we just give Claude access to the raw Cube DSL files and let it cook? What if bash is all you need?
Models are getting smarter and context windows are getting larger, so maybe the best agent architecture is almost no architecture at all.

## v2: The file system is the agent

The new stack:

- Model: Claude Opus 4.5 via the AI SDK
- Execution: Vercel Sandbox for context exploration
- Routing: Vercel Gateway for request handling and observability
- Server: Next.js API route using Vercel Slack Bolt
- Data layer: Cube semantic layer as a directory of YAML, Markdown, and JSON files

The file system agent now browses our semantic layer the way a human analyst would. It reads files, greps for patterns, builds mental models, and writes SQL using standard Unix tools like grep, cat, find, and ls.

This works because the semantic layer is already great documentation. The files contain dimension definitions, measure calculations, and join relationships. We were building tools to summarize what was already legible. Claude just needed access to read it directly.

[email protected] ToolLoopAgent

```ts
import { Sandbox } from "@vercel/sandbox";
import { files } from './semantic-catalog';
import { tool, ToolLoopAgent } from "ai";
import { ExecuteSQL } from "@/lib/tools";

const sandbox = await Sandbox.create();
await sandbox.writeFiles(files);

function executeCommandTool(sandbox: Sandbox) {
  return tool({
    /* ... */
    execute: async ({ command }) => {
      const result = await sandbox.exec(command);
      return { /* */ };
    }
  });
}

const agent = new ToolLoopAgent({
  model: "anthropic/claude-opus-4.5",
  instructions: "",
  tools: {
    ExecuteCommand: executeCommandTool(sandbox),
    ExecuteSQL,
  },
});
```

## 3.5x faster, 37% fewer tokens, 100% success rate

We benchmarked the old architecture against the new file system approach across 5 representative queries.

| Metric | Advanced (old) | File system (new) | Change |
| --- | --- | --- | --- |
| Avg execution time | 274.8s | 77.4s | 3.5x faster |
| Success rate | 4/5 (80%) | 5/5 (100%) | +20% |
| Avg token usage | ~102k tokens | ~61k tokens | 37% fewer tokens |
| Avg steps | ~12 steps | ~7 steps | 42% fewer steps |

The file system agent won every comparison. The old architecture's worst case was Query 2, which took 724 seconds, 100 steps, and 145,463 tokens before failing. The file system agent completed the same query in 141 seconds with 19 steps and 67,483 tokens, and it actually succeeded.

The qualitative shift matters just as much. The agent catches edge cases we never anticipated and explains its reasoning in ways we can follow.

## Lessons learned

Don't fight gravity. File systems are an incredibly powerful abstraction. Grep is 50 years old and still does exactly what we need. We were building custom tools for what Unix already solves.

We were constraining reasoning because we didn't trust the model to reason. With Opus 4.5, that constraint became a liability. The model makes better choices when we stop making choices for it.

This only worked because our semantic layer was already good documentation. The YAML files are well-structured, consistently named, and contain clear definitions. If your data layer is a mess of legacy naming conventions and undocumented joins, giving Claude raw file access won't save you. You'll just get faster bad queries.

Addition by subtraction is real.
The best agents might be the ones with the fewest tools. Every tool is a choice you're making for the model. Sometimes the model makes better choices.

## What this means for agent builders

The temptation is always to account for every possibility. Resist it. Start with the simplest possible architecture. Model + file system + goal. Add complexity only when you've proven it's necessary.

But simple architecture isn't enough on its own. The model needs good context to work with. Invest in documentation, clear naming, and well-structured data. That foundation matters more than clever tooling.

Models are improving faster than your tooling can keep up. Build for the model that you'll have in six months, not for the one that you have today.

If you're building agents, we'd love to hear what you're learning.
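For a sense of how small the single-tool surface is, here is a rough standalone sketch of a shell-execution tool body, using Node's built-in `child_process` in place of the Vercel Sandbox the post uses. The function name `runShell`, the timeout, and the output cap are illustrative assumptions, not details from the post:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// The single tool: run a shell command and return what the model needs
// to keep exploring -- stdout, stderr, and the exit code.
export async function runShell(command: string): Promise<{
  stdout: string;
  stderr: string;
  exitCode: number;
}> {
  try {
    const { stdout, stderr } = await execFileAsync("bash", ["-c", command], {
      timeout: 30_000,         // keep a runaway command from stalling the loop
      maxBuffer: 1024 * 1024,  // cap output so one command can't flood context
    });
    return { stdout, stderr, exitCode: 0 };
  } catch (err) {
    // execFile rejects on a nonzero exit; stdout/stderr are attached to the error,
    // and the model still gets them as observations rather than a hard failure.
    const e = err as { stdout?: string; stderr?: string; code?: number };
    return { stdout: e.stdout ?? "", stderr: e.stderr ?? "", exitCode: e.code ?? 1 };
  }
}
```

Returning failures as ordinary results instead of throwing matters for the loop: a bad grep pattern or missing file becomes something the model reads and corrects, not an aborted run. In production you would run this inside a sandbox, as the post does, since the model is composing arbitrary commands.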