LLM Format Output Reference
Overview
The --format llm output is a compact, token-optimized hybrid XML/text format designed specifically for AI agent consumption. It provides structured audit data in a format that balances machine readability with token efficiency.
Key Characteristics
- 40-70% smaller than verbose XML format
- 1-space indentation for minimal token usage
- Hybrid structure: XML tags + text prefixes for metadata
- Inline attributes: Metadata stored as XML attributes, not nested elements
- Comma-separated lists: Pages and arrays formatted inline
- Flattened hierarchy: Reduced nesting depth compared to verbose XML
Format Structure
1. Document Header
<?xml version="1.0" encoding="UTF-8"?>
<audit version="0.0.24">2. Site Information
<site url="https://example.com" crawled="51" date="2025-01-18T10:30:00Z"/>Attributes:
url- Base URL auditedcrawled- Number of pages crawleddate- ISO 8601 timestamp
3. Health Score
<score overall="85" grade="B">
<cat name="Core SEO" score="92"/>
<cat name="Technical SEO" score="88"/>
<cat name="Content Quality" score="76"/>
</score>Attributes:
overall- 0-100 health scoregrade- Letter grade (A-F)- Categories with individual scores
4. Summary
<summary passed="120" warnings="15" failed="8"/>Attributes:
passed- Number of passed checkswarnings- Number of warningsfailed- Number of failed checks
5. Issues Section
Issues are grouped by category with compact inline metadata:
<issues>
<category name="Core SEO" errors="2" warnings="3">
<rule id="core/meta-title" severity="error" status="fail">
Missing or empty meta title tags
Desc: Every page should have a unique meta title
Fix: Add descriptive <title> tags to each page
Pages (2): https://example.com/about, https://example.com/contact
Items (2):
- https://example.com/about
- https://example.com/contact
</rule>
<rule id="core/meta-description" severity="warning" status="warn">
Desc: Pages should have meta descriptions
Fix: Add <meta name="description"> tags
Pages (5): https://example.com/page1, https://example.com/page2, ...
</rule>
</category>
<category name="Performance" errors="0" warnings="1">
...
</category>
</issues>#### Rule Structure
Each <rule> element contains:
Attributes:
id- Rule identifier (e.g.,core/meta-title)severity-error,warning, orinfostatus-pass,warn, orfail
Text Content (in order):
- Message (optional) - Human-readable issue summary
- Desc: - Rule description (what's being checked)
- Fix: - Recommended solution (how to fix)
- Pages (n): - Comma-separated list of affected URLs
- Items (n): - Dash-prefixed list of specific items with metadata
6. Items Format
Items provide detailed context about affected elements:
Items (3):
- https://example.com/missing-title (title: "")
- https://example.com/duplicate-title (title: "Home Page") (from: https://example.com/other)
- /broken-link [status: 404, type: internal] (from: https://example.com/contact)Item format:
- <id>- Primary identifier (URL, selector, etc.)(<label>)- Optional human-readable label if different from id[key: value, ...]- Metadata in square brackets(from: <sources>)- Source pages where item appears
Diff Output (LLM Format)
When using squirrel report --diff or --regression-since with --format llm, the output is a compact XML diff format:
<?xml version="1.0" encoding="UTF-8"?>
<diff version="0.0.24">
<baseline id="a7b3c2d1" url="https://example.com" date="2026-01-17T10:30:00Z" pages="42" score="87" grade="B"/>
<current id="b9c4e1f2" url="https://example.com" date="2026-01-18T10:30:00Z" pages="44" score="84" grade="B"/>
<summary added="3" removed="1" changed="2" regressions="1" improvements="1"/>
<added>
<issue fp="abc123" rule="core/meta-title" severity="error" status="fail" check="meta-title" category="core" weight="8">
Missing page title
Target: page /about
</issue>
</added>
<removed>
...
</removed>
<changed>
<change type="regression" fp="def456" rule="links/broken-links" severity="warning" status="fail" check="broken-links">
warn→fail: Broken link count increased
Before:
<issue ...> ... </issue>
After:
<issue ...> ... </issue>
</change>
</changed>
</diff>Key fields:
fp: deterministic fingerprint for the issue instancerule,check,severity,status: rule and check metadataTarget:: item/page/check targetchange type:regression,improvement, orchange
Example Output
<?xml version="1.0" encoding="UTF-8"?>
<audit version="0.0.24">
<site url="https://example.com" crawled="51" date="2025-01-18T10:30:00Z"/>
<score overall="78" grade="C">
<cat name="Core SEO" score="85"/>
<cat name="Technical SEO" score="92"/>
<cat name="Performance" score="65"/>
</score>
<summary passed="98" warnings="12" failed="5"/>
<issues>
<category name="Core SEO" errors="2" warnings="1">
<rule id="core/meta-title" severity="error" status="fail">
Missing meta title on 2 pages
Desc: Every page should have a unique meta title
Fix: Add descriptive <title> tags to each page
Pages (2): https://example.com/about, https://example.com/contact
Items (2):
- https://example.com/about
- https://example.com/contact
</rule>
</category>
</issues>
</audit>Usage
The LLM format is available via both audit and report commands:
# Direct LLM output (single step)
squirrel audit https://example.com --format llm
# Or two-step workflow
squirrel audit https://example.com
squirrel report <audit-id> --format llm
# Pipe directly to AI agent
squirrel audit https://example.com --format llm | claudeThe audit command supports --format llm directly for convenience. Use the two-step workflow when you need to generate reports in multiple formats from a single audit.
Comparison with Other Formats
| Format | Size | Structure | Best For |
|---|---|---|---|
xml | 209KB | Verbose, 2-space indent, fully nested | Enterprise integration, archival |
llm | 125KB | Compact, 1-space indent, hybrid | AI agents, token-limited contexts |
json | 180KB | Structured data | Programmatic processing |
text | 45KB | Plain text, no structure | Simple piping, grep |
Token Efficiency
The LLM format achieves 40-70% size reduction compared to verbose XML through:
- 1-space indentation instead of 2-4 spaces
- Inline attributes instead of nested elements
- Text prefixes (Desc:, Fix:) instead of XML tags
- Comma-separated lists instead of multiple elements
- Flattened hierarchy - fewer nesting levels
XML Character Escaping
Special characters are properly escaped:
&→&<→<>→>"→"'→'
Design Philosophy
The LLM format is optimized for:
- Token efficiency - Critical for API cost and context limits
- Easy parsing - XML structure for reliable extraction
- Human readability - AI agents can explain issues naturally
- Progressive disclosure - Summary → Categories → Rules → Items
- Actionable insights - Fix recommendations included inline
Implementation Notes
- Generated by
generateLlmReport()inapp/src/reports/output/llm.ts - Empty issues section renders as self-closing:
<issues/> - All text content is XML-escaped for safety
- Indentation uses spaces only (no tabs)
- Line endings are Unix-style (
\n)