# PRD: X-to-Book Multi-Agent System
## Overview
A multi-agent system that monitors target X (Twitter) accounts daily, synthesizes their content, and generates structured books from accumulated insights. The system uses context engineering principles to handle high-volume social data while maintaining coherent long-form output.
## Problem Statement
Manual curation of insights from X accounts is time-consuming and inconsistent. Existing tools dump raw data without synthesis. We need a system that:
- Continuously monitors specified X accounts
- Extracts meaningful patterns and insights across time
- Produces structured, coherent daily book outputs
- Maintains temporal awareness of how narratives evolve
## Architecture
### Multi-Agent Pattern Selection: Supervisor/Orchestrator
Based on the context engineering patterns, we use a supervisor architecture because:
- Book production has clear sequential phases (scrape, analyze, synthesize, write, edit)
- Quality gates require central coordination
- Human oversight points are well-defined
- Context isolation per phase prevents attention saturation
```
User Config -> Orchestrator -> [Scraper, Analyzer, Synthesizer, Writer, Editor] -> Daily Book
```
### Agent Definitions
#### 1. Orchestrator Agent
Purpose: Central coordinator that manages workflow, maintains state, and routes to specialists.
Context Budget: Reserved for task decomposition, quality gates, and synthesis coordination. Does not carry raw tweet data.
Responsibilities:
- Decompose daily book task into subtasks
- Route to appropriate specialist agents
- Implement checkpoint/resume for long-running operations
- Aggregate results without paraphrasing (avoid telephone game problem)
```python
from typing import Any, Dict, List, TypedDict

class OrchestratorState(TypedDict):
    target_accounts: List[str]
    current_phase: str
    phase_outputs: Dict[str, Any]
    quality_scores: Dict[str, float]
    book_outline: str
    checkpoints: List[Dict]
```
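A minimal checkpoint/resume sketch over this state; the file layout and helper names are illustrative assumptions, and it presumes phase outputs are JSON-serializable:

```python
import json
from pathlib import Path
from typing import Optional

CHECKPOINT_DIR = Path("checkpoints")  # illustrative location

def save_checkpoint(state: OrchestratorState) -> Path:
    """Persist orchestrator state after each completed phase."""
    CHECKPOINT_DIR.mkdir(exist_ok=True)
    path = CHECKPOINT_DIR / f"{state['current_phase']}.json"
    path.write_text(json.dumps(state))
    return path

def resume_latest() -> Optional[OrchestratorState]:
    """Resume from the most recently written checkpoint, if any."""
    if not CHECKPOINT_DIR.exists():
        return None
    candidates = sorted(CHECKPOINT_DIR.glob("*.json"),
                        key=lambda p: p.stat().st_mtime)
    if not candidates:
        return None
    return json.loads(candidates[-1].read_text())
```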
#### 2. Scraper Agent
Purpose: Fetch and normalize content from target X accounts.
Context Budget: Minimal. Operates on one account at a time, outputs to the file system.
Tools:
- `fetch_timeline(account_id, since_date, until_date)` - Retrieve tweets in a date range
- `fetch_thread(tweet_id)` - Expand full thread context
- `fetch_engagement_metrics(tweet_ids)` - Get likes/retweets/replies
- `write_to_store(account_id, data)` - Persist to file system
Output: Structured JSON per account, written to file system (not passed through context).
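One normalized record might look like this (field names are illustrative; the PRD does not fix a schema):

```python
# Illustrative normalized tweet record, persisted to
# raw_data/{account}/{date}.json per the Data Flow section.
record = {
    "tweet_id": "1790000000000000000",   # hypothetical ID
    "account": "@account1",
    "content": "Full tweet text...",
    "timestamp": "2025-01-15T14:03:00Z",
    "thread_id": "1789999999999999999",  # None if not part of a thread
    "engagement": {"likes": 120, "retweets": 34, "replies": 9},
}
```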
#### 3. Analyzer Agent
Purpose: Extract patterns, themes, and insights from raw content.
Context Budget: Moderate. Processes one account's data at a time via file system reads.
Responsibilities:
- Topic extraction and clustering
- Sentiment analysis over time
- Key insight identification
- Thread narrative extraction
- Controversy/debate identification
Output: Structured analysis per account (see the schema sketch after this list) with:
- Top themes (ranked by frequency and engagement)
- Notable quotes (with context)
- Narrative arcs (multi-tweet threads)
- Temporal patterns (time-of-day, response patterns)
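A sketch of that output as typed structures; field names are assumptions, not a fixed schema:

```python
from typing import List, TypedDict

class ThemeSummary(TypedDict):
    name: str
    frequency: int
    engagement_score: float

class AccountAnalysis(TypedDict):
    account: str
    top_themes: List[ThemeSummary]  # ranked by frequency and engagement
    notable_quotes: List[dict]      # quote text plus surrounding context
    narrative_arcs: List[dict]      # multi-tweet thread summaries
    temporal_patterns: dict         # time-of-day and response patterns
```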
#### 4. Synthesizer Agent
Purpose: Cross-account pattern recognition and theme consolidation.
Context Budget: High. Receives summaries from all analyzed accounts.
Responsibilities:
- Identify cross-account themes
- Detect agreement/disagreement patterns
- Build narrative connections
- Generate book outline with chapter structure
Output: Book outline with:
- Chapter structure
- Theme assignments per chapter
- Source attribution map
- Suggested narrative flow
#### 5. Writer Agent
Purpose: Generate book content from outline and source material.
Context Budget: Per-chapter allocation. Works on one chapter at a time.
Responsibilities:
- Draft chapter content following outline
- Integrate quotes with proper attribution
- Maintain consistent voice and style
- Handle transitions between themes
Output: Draft chapters in markdown format.
#### 6. Editor Agent
Purpose: Quality assurance and refinement.
Context Budget: Per-chapter. Reviews one chapter at a time.
Responsibilities:
- Fact-check against source material
- Verify quote accuracy
- Check narrative coherence
- Flag potential issues for human review
Output: Edited chapters with revision notes.
## Memory System Design
### Architecture: Temporal Knowledge Graph
Based on the memory-systems skill, we need a temporal knowledge graph because:
- Facts about accounts change over time (opinions shift, topics evolve)
- We need time-travel queries ("What was @account's position on X in January?")
- Cross-account relationships require graph traversal
- Simple vector stores lose relationship structure
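A minimal illustration of the temporal layer, assuming validity intervals on each fact (the actual graph backend is chosen in the Technical Stack section):

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class TemporalFact:
    subject: str                        # e.g. "@account1"
    predicate: str                      # e.g. "DISCUSSES"
    obj: str                            # e.g. "Theme:AI"
    valid_from: date
    valid_until: Optional[date] = None  # None = still valid

def as_of(facts: List[TemporalFact], day: date) -> List[TemporalFact]:
    """Time-travel query: facts that were valid on a given day."""
    return [f for f in facts
            if f.valid_from <= day
            and (f.valid_until is None or day < f.valid_until)]
```

For example, `as_of(facts, date(2025, 1, 31))` answers "What was @account's position on X in January?"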
### Entity Types
```python
entities = {
    "Account": {
        "properties": ["handle", "display_name", "bio", "follower_count", "following_count"]
    },
    "Tweet": {
        "properties": ["content", "timestamp", "engagement_score", "thread_id"]
    },
    "Theme": {
        "properties": ["name", "description", "first_seen", "last_seen"]
    },
    "Book": {
        "properties": ["date", "title", "chapter_count", "word_count"]
    },
    "Chapter": {
        "properties": ["title", "theme", "word_count", "source_accounts"]
    }
}
```
### Relationship Types
```python
relationships = {
    "POSTED": {
        "from": "Account",
        "to": "Tweet",
        "temporal": True
    },
    "DISCUSSES": {
        "from": "Tweet",
        "to": "Theme",
        "temporal": True,
        "properties": ["sentiment", "stance"]
    },
    "RESPONDS_TO": {
        "from": "Tweet",
        "to": "Tweet"
    },
    "AGREES_WITH": {
        "from": "Account",
        "to": "Account",
        "temporal": True,
        "properties": ["on_theme"]
    },
    "DISAGREES_WITH": {
        "from": "Account",
        "to": "Account",
        "temporal": True,
        "properties": ["on_theme"]
    },
    "CONTAINS": {
        "from": "Book",
        "to": "Chapter"
    },
    "SOURCES": {
        "from": "Chapter",
        "to": "Tweet"
    }
}
```
### Memory Retrieval Patterns
```python
# What has @account said about AI in the last 30 days?
query_account_theme_temporal(account_id, theme="AI", days=30)

# Which accounts disagree on crypto?
query_disagreement_network(theme="crypto")

# What quotes should be in today's book about regulation?
query_quotable_content(theme="regulation", min_engagement=100)
```
## Context Optimization Strategy
### Challenge
X data is high-volume: 10 target accounts at 20 tweets/day each yields 200 tweets/day. Each tweet with thread context averages 500 tokens, so the daily raw context is roughly 100k tokens before analysis.
### Optimization Techniques
#### 1. Observation Masking
Raw tweet data is processed by the Scraper, written to the file system, and never passed through the Orchestrator's context.
```python
# Instead of passing raw tweets through context,
# the Scraper writes to the file system...
scraper.write_to_store(account_id, raw_tweets)

# ...and the Analyzer reads from the file system.
raw_data = analyzer.read_from_store(account_id)
```
#### 2. Compaction Triggers
```python
COMPACTION_THRESHOLD = 0.7  # 70% context utilization

if context_utilization > COMPACTION_THRESHOLD:
    # Summarize older phase outputs
    phase_outputs = compact_phase_outputs(phase_outputs)
```
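A sketch of `compact_phase_outputs`, assuming a hypothetical `summarize` LLM helper and keeping the most recent phase verbatim:

```python
def compact_phase_outputs(phase_outputs: dict, keep_latest: int = 1) -> dict:
    """Summarize older phase outputs; keep the most recent ones verbatim."""
    phases = list(phase_outputs)  # insertion order = phase order
    compacted = {}
    for i, phase in enumerate(phases):
        if i < len(phases) - keep_latest:
            # summarize() is a hypothetical LLM call returning a short digest
            compacted[phase] = summarize(phase_outputs[phase])
        else:
            compacted[phase] = phase_outputs[phase]
    return compacted
```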
#### 3. Progressive Disclosure
The book outline loads first (lightweight). Full chapter content loads only when the Writer is working on that chapter.
```python
# Level 1: Outline only
book_outline = {
    "chapters": [
        {"title": "Chapter 1", "themes": ["AI", "Regulation"], "word_count_target": 2000}
    ]
}

# Level 2: Full chapter context (only when writing)
chapter_context = load_chapter_context(chapter_id)
```
#### 4. KV-Cache Optimization
The system prompt and tool definitions are stable across runs. Structure the context for cache hits:
```python
context_order = [
    system_prompt,     # Stable, cacheable
    tool_definitions,  # Stable, cacheable
    account_config,    # Semi-stable
    daily_outline,     # Changes daily
    current_task,      # Changes per call
]
```
## Tool Design
### Consolidation Principle Applied
Instead of many narrow tools, we implement one comprehensive tool per domain:
#### X Data Tool (Consolidated)
```python
from typing import Dict, Literal, Optional

def x_data_tool(
    action: Literal["fetch_timeline", "fetch_thread", "fetch_engagement", "search"],
    account_id: Optional[str] = None,
    tweet_id: Optional[str] = None,
    query: Optional[str] = None,
    since_date: Optional[str] = None,
    until_date: Optional[str] = None,
    format: Literal["concise", "detailed"] = "concise",
) -> Dict:
    """
    Unified X data retrieval tool.

    Use when:
    - Fetching timeline for target account monitoring
    - Expanding thread context for full conversation
    - Getting engagement metrics for content prioritization
    - Searching for specific topics across accounts

    Actions:
    - fetch_timeline: Get tweets from account in date range
    - fetch_thread: Expand full thread from single tweet
    - fetch_engagement: Get likes/retweets/replies
    - search: Search across accounts for query

    Returns:
    - concise: tweet_id, content_preview, timestamp, engagement_score
    - detailed: full content, thread context, all engagement metrics, reply preview

    Errors:
    - RATE_LIMITED: Wait {retry_after} seconds
    - ACCOUNT_PRIVATE: Cannot access private account
    - NOT_FOUND: Tweet/account does not exist
    """
```
#### Memory Tool (Consolidated)
```python
def memory_tool(
    action: Literal["store", "query", "update_validity", "consolidate"],
    entity_type: Optional[str] = None,
    entity_id: Optional[str] = None,
    relationship_type: Optional[str] = None,
    query_params: Optional[Dict] = None,
    as_of_date: Optional[str] = None,
) -> Dict:
    """
    Unified memory system tool.

    Use when:
    - Storing new facts discovered from X data
    - Querying historical information about accounts/themes
    - Updating validity periods when facts change
    - Running consolidation to merge duplicate facts

    Actions:
    - store: Add new entity or relationship
    - query: Retrieve entities/relationships matching params
    - update_validity: Mark fact as expired with valid_until
    - consolidate: Merge duplicates and cleanup

    Returns entity/relationship data or query results.
    """
```
#### Writing Tool (Consolidated)
```python
def writing_tool(
    action: Literal["draft", "edit", "format", "export"],
    content: Optional[str] = None,
    chapter_id: Optional[str] = None,
    style_guide: Optional[str] = None,
    output_format: Literal["markdown", "html", "pdf"] = "markdown",
) -> Dict:
    """
    Unified book writing tool.

    Use when:
    - Drafting new chapter content
    - Editing existing content for quality
    - Formatting content for output
    - Exporting final book

    Actions:
    - draft: Create initial chapter draft
    - edit: Apply revisions to existing content
    - format: Apply styling and formatting
    - export: Generate final output file
    """
```
## Evaluation Framework
### Multi-Dimensional Rubric
Based on the evaluation skill, we define quality dimensions:
| Dimension | Weight | Excellent | Acceptable | Failed |
|---|---|---|---|---|
| Source Accuracy | 30% | All quotes verified, proper attribution | Minor attribution errors | Fabricated quotes |
| Thematic Coherence | 25% | Clear narrative thread, logical flow | Some disconnected sections | No coherent narrative |
| Completeness | 20% | Covers all major themes from sources | Misses some themes | Major gaps |
| Insight Quality | 15% | Novel synthesis across sources | Restates obvious points | No synthesis |
| Readability | 10% | Engaging, well-structured prose | Adequate but dry | Unreadable |
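The table's weights map onto the `DIMENSION_WEIGHTS` constant used by the pipeline below; a minimal sketch, including the `weighted_average` helper that code assumes:

```python
DIMENSION_WEIGHTS = {
    "source_accuracy": 0.30,
    "thematic_coherence": 0.25,
    "completeness": 0.20,
    "insight_quality": 0.15,
    "readability": 0.10,
}

def weighted_average(scores: dict, weights: dict) -> float:
    """Combine per-dimension scores (0-1) into an overall score."""
    return sum(scores[d] * weights[d] for d in weights)
```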
### Automated Evaluation Pipeline
```python
def evaluate_daily_book(book: Book, source_data: Dict) -> EvaluationResult:
    scores = {}

    # Source accuracy: verify quotes against original tweets
    scores["source_accuracy"] = verify_quotes(book.chapters, source_data)

    # Thematic coherence: LLM-as-judge for narrative flow
    scores["thematic_coherence"] = judge_coherence(book)

    # Completeness: check theme coverage
    scores["completeness"] = calculate_theme_coverage(book, source_data)

    # Insight quality: LLM-as-judge for synthesis
    scores["insight_quality"] = judge_insights(book, source_data)

    # Readability: automated metrics + LLM judge
    scores["readability"] = assess_readability(book)

    overall = weighted_average(scores, DIMENSION_WEIGHTS)
    return EvaluationResult(
        passed=overall >= 0.7,
        scores=scores,
        overall=overall,
        flagged_issues=identify_issues(scores),
    )
```
### Human Review Triggers
- Overall score < 0.7
- Source accuracy < 0.8
- Any fabricated quote detected
- New account added (first book needs review)
- Controversial topic detected
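A minimal sketch of the gate combining these triggers; the flag inputs and the `fabricated_quote` issue label are assumptions:

```python
def needs_human_review(result, is_new_account: bool, controversial_topic: bool) -> bool:
    """Exception-based review gate; thresholds mirror quality_thresholds in config."""
    return (
        result.overall < 0.7
        or result.scores["source_accuracy"] < 0.8
        or "fabricated_quote" in result.flagged_issues  # assumed issue label
        or is_new_account         # first book for a new account
        or controversial_topic    # flagged upstream by the Analyzer
    )
```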
## Data Flow
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ DAILY PIPELINE │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 1. SCRAPE PHASE │
│ Scraper Agent → X API → File System (raw_data/{account}/{date}.json) │
│ Context: Minimal (tool calls only) │
│ Output: Raw tweet data persisted to file system │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 2. ANALYZE PHASE │
│ Analyzer Agent → File System → Memory Store │
│ Context: One account at a time │
│ Output: Structured analysis per account + Knowledge Graph updates │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 3. SYNTHESIZE PHASE │
│ Synthesizer Agent → Analysis Summaries → Book Outline │
│ Context: Summaries from all accounts (compacted) │
│ Output: Book outline with chapter structure │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 4. WRITE PHASE │
│ Writer Agent → Outline + Relevant Sources → Draft Chapters │
│ Context: One chapter at a time (progressive disclosure) │
│ Output: Draft markdown chapters │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 5. EDIT PHASE │
│ Editor Agent → Draft + Sources → Final Chapters │
│ Context: One chapter at a time │
│ Output: Edited chapters with revision notes │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 6. EVALUATE PHASE │
│ Evaluation Pipeline → Final Book → Quality Report │
│ Output: Pass/fail with scores, flagged issues │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 7. PUBLISH (if passed) or HUMAN REVIEW (if flagged) │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Failure Modes and Mitigations
### Failure: Orchestrator Context Saturation
Symptom: The Orchestrator accumulates phase outputs, degrading routing decisions.
Mitigation: Store phase outputs in the file system so the Orchestrator receives only summaries, and checkpoint state so long-running operations can resume.
### Failure: X API Rate Limiting
Symptom: The Scraper hits rate limits, leaving incomplete data.
Mitigation:
- Implement a circuit breaker with exponential backoff (sketched after this list)
- Checkpoint partial scrapes for resume
- Schedule scraping across time windows
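A sketch of the backoff half of that mitigation, assuming the X client raises a `RateLimited` exception carrying the server's `retry_after` hint (mirroring the RATE_LIMITED error in the X data tool contract):

```python
import time

class RateLimited(Exception):
    def __init__(self, retry_after: float):
        self.retry_after = retry_after

def with_backoff(fetch, *args, max_retries: int = 5, base_delay: float = 2.0):
    """Retry a fetch with exponential backoff, honoring server hints."""
    for attempt in range(max_retries):
        try:
            return fetch(*args)
        except RateLimited as e:
            # Prefer the server-provided hint, else back off exponentially.
            time.sleep(max(e.retry_after, base_delay * 2 ** attempt))
    raise RuntimeError("rate limit retries exhausted; checkpoint and resume later")
```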
### Failure: Quote Hallucination
Symptom: The Writer generates quotes that do not appear in the source material.
Mitigation:
- Strict source attribution in writing prompt
- Editor agent verifies all quotes against source
- Automated quote verification in evaluation (see the sketch below)
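A sketch of that verification step, assuming exact-substring matching against scraped tweet text and a `quotes` attribute on chapters (both illustrative):

```python
def verify_quotes(chapters, source_data: dict) -> float:
    """Fraction of quoted strings found verbatim in the scraped source tweets."""
    corpus = " ".join(t["content"]
                      for tweets in source_data.values()
                      for t in tweets)
    quotes = [q for ch in chapters for q in ch.quotes]  # ch.quotes is assumed
    if not quotes:
        return 1.0
    return sum(q in corpus for q in quotes) / len(quotes)
```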
### Failure: Theme Drift
Symptom: Book themes diverge from the actual source content.
Mitigation:
- Synthesizer receives grounded summaries only
- Writer tool includes source verification step
- Evaluation checks theme-source alignment
### Failure: Coordination Overhead
Symptom: Agent communication latency exceeds the value of the content produced.
Mitigation:
- Batch phase outputs
- Use file system for inter-agent data (no context passing for large payloads)
- Parallelize where possible; the Scraper can run per-account in parallel (sketched below)
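A sketch of per-account parallel scraping with a thread pool (`scrape_account` is an assumed callable wrapping the Scraper agent):

```python
from concurrent.futures import ThreadPoolExecutor

def scrape_all(accounts: list, scrape_account) -> dict:
    """Run one scrape per account in parallel; results keyed by handle."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {handle: pool.submit(scrape_account, handle)
                   for handle in accounts}
        return {handle: f.result() for handle, f in futures.items()}
```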
## Configuration
```yaml
# config.yaml
target_accounts:
  - handle: "@account1"
    priority: high
    themes_of_interest: ["AI", "startups"]
  - handle: "@account2"
    priority: medium
    themes_of_interest: ["regulation", "policy"]

schedule:
  scrape_time: "06:00"  # UTC
  publish_time: "08:00"
  timezone: "UTC"

book_settings:
  target_word_count: 5000
  min_chapters: 3
  max_chapters: 7
  style: "analytical"  # analytical | narrative | summary

quality_thresholds:
  min_overall_score: 0.7
  min_source_accuracy: 0.8
  require_human_review_below: 0.75

memory:
  retention_days: 90
  consolidation_frequency: "weekly"

context_limits:
  orchestrator: 50000
  scraper: 20000
  analyzer: 80000
  synthesizer: 100000
  writer: 80000
  editor: 60000
```
## Implementation Phases
### Phase 1: Core Pipeline (Weeks 1-2)
- Orchestrator with basic routing
- Scraper with X API integration
- File system storage
- Basic Writer producing markdown output
### Phase 2: Analysis Layer (Weeks 3-4)
- Analyzer agent with theme extraction
- Synthesizer with cross-account patterns
- Book outline generation
### Phase 3: Memory System (Weeks 5-6)
- Temporal knowledge graph implementation
- Entity and relationship storage
- Temporal queries for historical context
### Phase 4: Quality Layer (Weeks 7-8)
- Editor agent
- Evaluation pipeline
- Human review interface
### Phase 5: Production Hardening (Weeks 9-10)
- Checkpoint/resume
- Circuit breakers
- Monitoring and alerting
- Consolidation jobs
## Technical Stack (Recommended)
| Component | Technology | Rationale |
|---|---|---|
| Agent Framework | LangGraph | Graph-based state machines with explicit nodes/edges |
| Knowledge Graph | Neo4j or Memgraph | Native temporal queries, relationship traversal |
| Vector Store | Weaviate or Pinecone | Hybrid search (semantic + metadata filtering) |
| X API | Official API or Scraping fallback | Rate limits require careful management |
| Storage | PostgreSQL + S3 | Structured data + blob storage for content |
| Orchestration | Temporal.io | Durable workflows with checkpoint/resume |
## Open Questions
- X API Access: Official API vs scraping? Rate limits on official API are restrictive. Scraping has legal/TOS considerations.
- Book Format: Pure prose vs mixed media (including original tweet embeds)?
- Attribution Model: How prominent should account attribution be? Full quotes with handles vs paraphrased insights?
- Monetization: If books are sold, what are the IP implications of synthesizing public tweets?
- Human-in-the-Loop: How much editorial control? Full review of every book vs exception-based review?
## References
- Agent Skills for Context Engineering - Context engineering patterns
- Multi-agent patterns skill - Supervisor architecture selection
- Memory systems skill - Temporal knowledge graph design
- Context optimization skill - Observation masking and compaction strategies
- Tool design skill - Consolidation principle for tools
- Evaluation skill - Multi-dimensional rubrics