examples/x-to-book-system/PRD.md
# PRD: X-to-Book Multi-Agent System

## Overview

A multi-agent system that monitors target X (Twitter) accounts daily, synthesizes their content, and generates structured books from accumulated insights. The system uses context engineering principles to handle high-volume social data while maintaining coherent long-form output.

## Problem Statement

Manual curation of insights from X accounts is time-consuming and inconsistent. Existing tools dump raw data without synthesis. We need a system that:
- Continuously monitors specified X accounts
- Extracts meaningful patterns and insights across time
- Produces structured, coherent daily book outputs
- Maintains temporal awareness of how narratives evolve

## Architecture

### Multi-Agent Pattern Selection: Supervisor/Orchestrator

Based on the context engineering patterns, we use a **supervisor architecture** because:
1. Book production has clear sequential phases (scrape, analyze, synthesize, write, edit)
2. Quality gates require central coordination
3. Human oversight points are well-defined
4. Context isolation per phase prevents attention saturation

```
User Config -> Orchestrator -> [Scraper, Analyzer, Synthesizer, Writer, Editor] -> Daily Book
```

### Agent Definitions

#### 1. Orchestrator Agent
**Purpose**: Central coordinator that manages the workflow, maintains state, and routes to specialists.

**Context Budget**: Reserved for task decomposition, quality gates, and synthesis coordination. Does not carry raw tweet data.

**Responsibilities**:
- Decompose the daily book task into subtasks
- Route to appropriate specialist agents
- Implement checkpoint/resume for long-running operations
- Aggregate results without paraphrasing (avoids the telephone-game problem)

```python
from typing import Any, Dict, List, TypedDict

class OrchestratorState(TypedDict):
    target_accounts: List[str]
    current_phase: str
    phase_outputs: Dict[str, Any]
    quality_scores: Dict[str, float]
    book_outline: str
    checkpoints: List[Dict]
```
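The checkpoint/resume responsibility can be sketched minimally as below; the file layout and helper names (`save_checkpoint`, `load_checkpoint`, `CHECKPOINT_DIR`) are illustrative assumptions, not part of the PRD:

```python
import json
from pathlib import Path
from typing import Any, Dict

CHECKPOINT_DIR = Path("checkpoints")  # hypothetical location for persisted state

def save_checkpoint(state: Dict[str, Any], run_id: str) -> Path:
    """Persist orchestrator state after each phase so a crashed run can resume."""
    CHECKPOINT_DIR.mkdir(exist_ok=True)
    path = CHECKPOINT_DIR / f"{run_id}.json"
    path.write_text(json.dumps(state))
    return path

def load_checkpoint(run_id: str) -> Dict[str, Any]:
    """Reload the last saved state; the caller resumes at state['current_phase']."""
    return json.loads((CHECKPOINT_DIR / f"{run_id}.json").read_text())

state = {"current_phase": "analyze", "phase_outputs": {"scrape": "raw_data/2024-01-01"}}
save_checkpoint(state, "2024-01-01")
resumed = load_checkpoint("2024-01-01")
```

Saving after every phase boundary keeps the resume granularity aligned with the pipeline's sequential structure.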
#### 2. Scraper Agent
**Purpose**: Fetch and normalize content from target X accounts.

**Context Budget**: Minimal. Operates on one account at a time, outputs to the file system.

**Tools**:
- `fetch_timeline(account_id, since_date, until_date)` - Retrieve tweets in a date range
- `fetch_thread(tweet_id)` - Expand full thread context
- `fetch_engagement_metrics(tweet_ids)` - Get likes/retweets/replies
- `write_to_store(account_id, data)` - Persist to the file system

**Output**: Structured JSON per account, written to the file system (not passed through context).

#### 3. Analyzer Agent
**Purpose**: Extract patterns, themes, and insights from raw content.

**Context Budget**: Moderate. Processes one account's data at a time via file system reads.

**Responsibilities**:
- Topic extraction and clustering
- Sentiment analysis over time
- Key insight identification
- Thread narrative extraction
- Controversy/debate identification

**Output**: Structured analysis per account with:
- Top themes (ranked by frequency and engagement)
- Notable quotes (with context)
- Narrative arcs (multi-tweet threads)
- Temporal patterns (time-of-day, response patterns)

#### 4. Synthesizer Agent
**Purpose**: Cross-account pattern recognition and theme consolidation.

**Context Budget**: High. Receives summaries from all analyzed accounts.

**Responsibilities**:
- Identify cross-account themes
- Detect agreement/disagreement patterns
- Build narrative connections
- Generate a book outline with chapter structure

**Output**: Book outline with:
- Chapter structure
- Theme assignments per chapter
- Source attribution map
- Suggested narrative flow
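As a sketch only (field names are assumptions; the PRD specifies just the four components above), the outline might carry a source attribution map that lets downstream agents verify every chapter theme is grounded:

```python
# Hypothetical shape of the Synthesizer's output; field names are assumptions.
outline = {
    "chapters": [
        {
            "title": "The AI Regulation Debate",
            "themes": ["AI", "regulation"],
            "source_accounts": ["@account1", "@account2"],
        }
    ],
    # Source attribution map: theme -> ids of tweets that ground it
    "attribution": {"AI": ["tw_101", "tw_204"], "regulation": ["tw_305"]},
}

# Grounding check: every theme assigned to a chapter must be attributable
chapter_themes = {t for ch in outline["chapters"] for t in ch["themes"]}
ungrounded = chapter_themes - set(outline["attribution"])
```

A non-empty `ungrounded` set would be an early signal of the theme-drift failure mode discussed later.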
#### 5. Writer Agent
**Purpose**: Generate book content from outline and source material.

**Context Budget**: Per-chapter allocation. Works on one chapter at a time.

**Responsibilities**:
- Draft chapter content following the outline
- Integrate quotes with proper attribution
- Maintain consistent voice and style
- Handle transitions between themes

**Output**: Draft chapters in markdown format.

#### 6. Editor Agent
**Purpose**: Quality assurance and refinement.

**Context Budget**: Per-chapter. Reviews one chapter at a time.

**Responsibilities**:
- Fact-check against source material
- Verify quote accuracy
- Check narrative coherence
- Flag potential issues for human review

**Output**: Edited chapters with revision notes.

---

## Memory System Design

### Architecture: Temporal Knowledge Graph

Based on the memory-systems skill, we need a **temporal knowledge graph** because:
- Facts about accounts change over time (opinions shift, topics evolve)
- We need time-travel queries ("What was @account's position on X in January?")
- Cross-account relationships require graph traversal
- Simple vector stores lose relationship structure

### Entity Types

```python
entities = {
    "Account": {
        "properties": ["handle", "display_name", "bio", "follower_count", "following_count"]
    },
    "Tweet": {
        "properties": ["content", "timestamp", "engagement_score", "thread_id"]
    },
    "Theme": {
        "properties": ["name", "description", "first_seen", "last_seen"]
    },
    "Book": {
        "properties": ["date", "title", "chapter_count", "word_count"]
    },
    "Chapter": {
        "properties": ["title", "theme", "word_count", "source_accounts"]
    }
}
```

### Relationship Types

```python
relationships = {
    "POSTED": {
        "from": "Account",
        "to": "Tweet",
        "temporal": True
    },
    "DISCUSSES": {
        "from": "Tweet",
        "to": "Theme",
        "temporal": True,
        "properties": ["sentiment", "stance"]
    },
    "RESPONDS_TO": {
        "from": "Tweet",
        "to": "Tweet"
    },
    "AGREES_WITH": {
        "from": "Account",
        "to": "Account",
        "temporal": True,
        "properties": ["on_theme"]
    },
    "DISAGREES_WITH": {
        "from": "Account",
        "to": "Account",
        "temporal": True,
        "properties": ["on_theme"]
    },
    "CONTAINS": {
        "from": "Book",
        "to": "Chapter"
    },
    "SOURCES": {
        "from": "Chapter",
        "to": "Tweet"
    }
}
```

### Memory Retrieval Patterns

```python
# What has @account said about AI in the last 30 days?
query_account_theme_temporal(account_id, theme="AI", days=30)

# Which accounts disagree on crypto?
query_disagreement_network(theme="crypto")

# What quotes should be in today's book about regulation?
query_quotable_content(theme="regulation", min_engagement=100)
```

---

## Context Optimization Strategy

### Challenge

X data is high-volume. Ten target accounts at 20 tweets/day each is 200 tweets/day. Each tweet with thread context averages 500 tokens, so daily raw context is roughly 100k tokens before any analysis.

### Optimization Techniques

#### 1. Observation Masking
Raw tweet data is processed by the Scraper, written to the file system, and never passed through the Orchestrator's context.

```python
# Instead of passing raw tweets through context,
# the Scraper writes to the file system
scraper.write_to_store(account_id, raw_tweets)

# and the Analyzer reads from the file system
raw_data = analyzer.read_from_store(account_id)
```

#### 2. Compaction Triggers

```python
COMPACTION_THRESHOLD = 0.7  # 70% context utilization

if context_utilization > COMPACTION_THRESHOLD:
    # Summarize older phase outputs
    phase_outputs = compact_phase_outputs(phase_outputs)
```
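A minimal end-to-end sketch of this trigger (the summarizer stub and the character-based budget are assumptions for illustration; in practice the summary would come from an LLM call and utilization would be measured in tokens):

```python
from typing import Dict

COMPACTION_THRESHOLD = 0.7
CONTEXT_BUDGET = 1_200  # assumed budget, in characters, for this toy example

def summarize(text: str, max_chars: int = 80) -> str:
    """Stand-in for an LLM summarization call."""
    return text[:max_chars]

def compact_phase_outputs(outputs: Dict[str, str], keep_latest: str) -> Dict[str, str]:
    """Replace older phase outputs with summaries; keep the newest verbatim."""
    return {
        phase: (text if phase == keep_latest else summarize(text))
        for phase, text in outputs.items()
    }

phase_outputs = {"scrape": "x" * 500, "analyze": "y" * 500, "synthesize": "z" * 200}
utilization = sum(map(len, phase_outputs.values())) / CONTEXT_BUDGET
if utilization > COMPACTION_THRESHOLD:
    phase_outputs = compact_phase_outputs(phase_outputs, keep_latest="synthesize")
```

Keeping the newest phase verbatim matters because it is the input to the next routing decision; older phases only need enough detail for provenance.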
#### 3. Progressive Disclosure

The book outline loads first (lightweight); full chapter content loads only when the Writer is working on that chapter.

```python
# Level 1: Outline only
book_outline = {
    "chapters": [
        {"title": "Chapter 1", "themes": ["AI", "Regulation"], "word_count_target": 2000}
    ]
}

# Level 2: Full chapter context (only when writing)
chapter_context = load_chapter_context(chapter_id)
```

#### 4. KV-Cache Optimization

The system prompt and tool definitions are stable across runs. Structure the context for cache hits:

```python
context_order = [
    system_prompt,      # Stable, cacheable
    tool_definitions,   # Stable, cacheable
    account_config,     # Semi-stable
    daily_outline,      # Changes daily
    current_task        # Changes per call
]
```

---

## Tool Design

### Consolidation Principle Applied

Instead of multiple narrow tools, we implement one comprehensive tool per domain:

#### X Data Tool (Consolidated)

```python
from typing import Dict, Literal, Optional

def x_data_tool(
    action: Literal["fetch_timeline", "fetch_thread", "fetch_engagement", "search"],
    account_id: Optional[str] = None,
    tweet_id: Optional[str] = None,
    query: Optional[str] = None,
    since_date: Optional[str] = None,
    until_date: Optional[str] = None,
    format: Literal["concise", "detailed"] = "concise"
) -> Dict:
    """
    Unified X data retrieval tool.

    Use when:
    - Fetching a timeline for target account monitoring
    - Expanding thread context for a full conversation
    - Getting engagement metrics for content prioritization
    - Searching for specific topics across accounts

    Actions:
    - fetch_timeline: Get tweets from an account in a date range
    - fetch_thread: Expand the full thread from a single tweet
    - fetch_engagement: Get likes/retweets/replies
    - search: Search across accounts for a query

    Returns:
    - concise: tweet_id, content_preview, timestamp, engagement_score
    - detailed: full content, thread context, all engagement metrics, reply preview

    Errors:
    - RATE_LIMITED: Wait {retry_after} seconds
    - ACCOUNT_PRIVATE: Cannot access private account
    - NOT_FOUND: Tweet/account does not exist
    """
```

#### Memory Tool (Consolidated)

```python
def memory_tool(
    action: Literal["store", "query", "update_validity", "consolidate"],
    entity_type: Optional[str] = None,
    entity_id: Optional[str] = None,
    relationship_type: Optional[str] = None,
    query_params: Optional[Dict] = None,
    as_of_date: Optional[str] = None
) -> Dict:
    """
    Unified memory system tool.

    Use when:
    - Storing new facts discovered from X data
    - Querying historical information about accounts/themes
    - Updating validity periods when facts change
    - Running consolidation to merge duplicate facts

    Actions:
    - store: Add a new entity or relationship
    - query: Retrieve entities/relationships matching params
    - update_validity: Mark a fact as expired with valid_until
    - consolidate: Merge duplicates and clean up

    Returns entity/relationship data or query results.
    """
```

#### Writing Tool (Consolidated)

```python
def writing_tool(
    action: Literal["draft", "edit", "format", "export"],
    content: Optional[str] = None,
    chapter_id: Optional[str] = None,
    style_guide: Optional[str] = None,
    output_format: Literal["markdown", "html", "pdf"] = "markdown"
) -> Dict:
    """
    Unified book writing tool.

    Use when:
    - Drafting new chapter content
    - Editing existing content for quality
    - Formatting content for output
    - Exporting the final book

    Actions:
    - draft: Create an initial chapter draft
    - edit: Apply revisions to existing content
    - format: Apply styling and formatting
    - export: Generate the final output file
    """
```

---

## Evaluation Framework

### Multi-Dimensional Rubric

Based on the evaluation skill, we define quality dimensions:

| Dimension | Weight | Excellent | Acceptable | Failed |
|-----------|--------|-----------|------------|--------|
| Source Accuracy | 30% | All quotes verified, proper attribution | Minor attribution errors | Fabricated quotes |
| Thematic Coherence | 25% | Clear narrative thread, logical flow | Some disconnected sections | No coherent narrative |
| Completeness | 20% | Covers all major themes from sources | Misses some themes | Major gaps |
| Insight Quality | 15% | Novel synthesis across sources | Restates obvious points | No synthesis |
| Readability | 10% | Engaging, well-structured prose | Adequate but dry | Unreadable |

### Automated Evaluation Pipeline

```python
def evaluate_daily_book(book: Book, source_data: Dict) -> EvaluationResult:
    scores = {}

    # Source accuracy: verify quotes against original tweets
    scores["source_accuracy"] = verify_quotes(book.chapters, source_data)

    # Thematic coherence: LLM-as-judge for narrative flow
    scores["thematic_coherence"] = judge_coherence(book)

    # Completeness: check theme coverage
    scores["completeness"] = calculate_theme_coverage(book, source_data)

    # Insight quality: LLM-as-judge for synthesis
    scores["insight_quality"] = judge_insights(book, source_data)

    # Readability: automated metrics + LLM judge
    scores["readability"] = assess_readability(book)

    overall = weighted_average(scores, DIMENSION_WEIGHTS)

    return EvaluationResult(
        passed=overall >= 0.7,
        scores=scores,
        overall=overall,
        flagged_issues=identify_issues(scores)
    )
```

### Human Review Triggers

- Overall score < 0.7
- Source accuracy < 0.8
- Any fabricated quote detected
- New account added (first book needs review)
- Controversial topic detected

---

## Data Flow

```
                              DAILY PIPELINE
                                    │
                                    ▼
1. SCRAPE PHASE
   Scraper Agent → X API → File System (raw_data/{account}/{date}.json)
   Context: Minimal (tool calls only)
   Output: Raw tweet data persisted to the file system
                                    │
                                    ▼
2. ANALYZE PHASE
   Analyzer Agent → File System → Memory Store
   Context: One account at a time
   Output: Structured analysis per account + knowledge graph updates
                                    │
                                    ▼
3. SYNTHESIZE PHASE
   Synthesizer Agent → Analysis Summaries → Book Outline
   Context: Summaries from all accounts (compacted)
   Output: Book outline with chapter structure
                                    │
                                    ▼
4. WRITE PHASE
   Writer Agent → Outline + Relevant Sources → Draft Chapters
   Context: One chapter at a time (progressive disclosure)
   Output: Draft markdown chapters
                                    │
                                    ▼
5. EDIT PHASE
   Editor Agent → Draft + Sources → Final Chapters
   Context: One chapter at a time
   Output: Edited chapters with revision notes
                                    │
                                    ▼
6. EVALUATE PHASE
   Evaluation Pipeline → Final Book → Quality Report
   Output: Pass/fail with scores, flagged issues
                                    │
                                    ▼
7. PUBLISH (if passed) or HUMAN REVIEW (if flagged)
```

---

## Failure Modes and Mitigations

### Failure: Orchestrator Context Saturation
**Symptom**: The Orchestrator accumulates phase outputs, degrading routing decisions.
**Mitigation**: Phase outputs are stored in the file system; the Orchestrator receives only summaries. Implement checkpointing to persist state.

### Failure: X API Rate Limiting
**Symptom**: The Scraper hits rate limits, producing incomplete data.
**Mitigation**:
- Implement a circuit breaker with exponential backoff
- Checkpoint partial scrapes for resume
- Schedule scraping across time windows

### Failure: Quote Hallucination
**Symptom**: The Writer generates quotes not present in source material.
**Mitigation**:
- Strict source attribution in the writing prompt
- Editor agent verifies all quotes against sources
- Automated quote verification in evaluation

### Failure: Theme Drift
**Symptom**: Book themes diverge from actual source content.
**Mitigation**:
- Synthesizer receives grounded summaries only
- Writer tool includes a source verification step
- Evaluation checks theme-source alignment

### Failure: Coordination Overhead
**Symptom**: Agent communication latency exceeds content value.
**Mitigation**:
- Batch phase outputs
- Use the file system for inter-agent data (no context passing for large payloads)
- Parallelize where possible (the Scraper can run per-account in parallel)

---
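The rate-limit mitigation can be sketched as exponential backoff with a retry cap (the exception type and the fetch stub are assumptions for illustration; a full circuit breaker would also track an open/closed state across calls):

```python
import time

class RateLimitedError(Exception):
    """Assumed error raised when the X API returns a rate-limit response."""

def fetch_with_backoff(fetch, max_retries: int = 5, base_delay: float = 1.0):
    """Retry `fetch`, doubling the delay after each rate-limited attempt."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except RateLimitedError:
            if attempt == max_retries - 1:
                raise  # give up; checkpoint the partial scrape and resume later
            time.sleep(base_delay * 2 ** attempt)

# Toy usage: a fetch that is rate-limited twice before succeeding
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitedError()
    return "timeline-data"

result = fetch_with_backoff(flaky_fetch, base_delay=0.01)
```

Re-raising on the final attempt lets the Orchestrator's checkpointing take over rather than losing the partial scrape.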
## Configuration

```yaml
# config.yaml
target_accounts:
  - handle: "@account1"
    priority: high
    themes_of_interest: ["AI", "startups"]
  - handle: "@account2"
    priority: medium
    themes_of_interest: ["regulation", "policy"]

schedule:
  scrape_time: "06:00"  # UTC
  publish_time: "08:00"
  timezone: "UTC"

book_settings:
  target_word_count: 5000
  min_chapters: 3
  max_chapters: 7
  style: "analytical"  # analytical | narrative | summary

quality_thresholds:
  min_overall_score: 0.7
  min_source_accuracy: 0.8
  require_human_review_below: 0.75

memory:
  retention_days: 90
  consolidation_frequency: "weekly"

context_limits:
  orchestrator: 50000
  scraper: 20000
  analyzer: 80000
  synthesizer: 100000
  writer: 80000
  editor: 60000
```

---

## Implementation Phases

### Phase 1: Core Pipeline (Weeks 1-2)
- Orchestrator with basic routing
- Scraper with X API integration
- File system storage
- Basic Writer producing markdown output

### Phase 2: Analysis Layer (Weeks 3-4)
- Analyzer agent with theme extraction
- Synthesizer with cross-account patterns
- Book outline generation

### Phase 3: Memory System (Weeks 5-6)
- Temporal knowledge graph implementation
- Entity and relationship storage
- Temporal queries for historical context

### Phase 4: Quality Layer (Weeks 7-8)
- Editor agent
- Evaluation pipeline
- Human review interface

### Phase 5: Production Hardening (Weeks 9-10)
- Checkpoint/resume
- Circuit breakers
- Monitoring and alerting
- Consolidation jobs

---

## Technical Stack (Recommended)

| Component | Technology | Rationale |
|-----------|------------|-----------|
| Agent Framework | LangGraph | Graph-based state machines with explicit nodes/edges |
| Knowledge Graph | Neo4j or Memgraph | Native temporal queries, relationship traversal |
| Vector Store | Weaviate or Pinecone | Hybrid search (semantic + metadata filtering) |
| X API | Official API or scraping fallback | Rate limits require careful management |
| Storage | PostgreSQL + S3 | Structured data + blob storage for content |
| Orchestration | Temporal.io | Durable workflows with checkpoint/resume |

---

## Open Questions

1. **X API Access**: Official API or scraping? Rate limits on the official API are restrictive; scraping has legal/TOS considerations.

2. **Book Format**: Pure prose, or mixed media (including original tweet embeds)?

3. **Attribution Model**: How prominent should account attribution be? Full quotes with handles, or paraphrased insights?

4. **Monetization**: If books are sold, what are the IP implications of synthesizing public tweets?

5. **Human-in-the-Loop**: How much editorial control? Full review of every book, or exception-based review?

---

## References

- [Agent Skills for Context Engineering](https://github.com/muratcankoylan/Agent-Skills-for-Context-Engineering) - Context engineering patterns
- Multi-agent patterns skill - Supervisor architecture selection
- Memory systems skill - Temporal knowledge graph design
- Context optimization skill - Observation masking and compaction strategies
- Tool design skill - Consolidation principle for tools
- Evaluation skill - Multi-dimensional rubrics