Source from repo
A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.
researcher/source-registry.md
# Source Registry

Use this registry to decide what an autonomous researcher should monitor, what evidence is admissible, and what should be rejected before spending evaluation tokens.

## Priority Sources

| Tier | Source Class | Examples | Use |
| --- | --- | --- | --- |
| 1 | Peer-reviewed papers and major preprints | arXiv, OpenReview, conference proceedings | New mechanisms, benchmark results, ablations |
| 1 | AI lab engineering and research posts | OpenAI, Anthropic, DeepMind, Google Research, Meta, Microsoft, Cohere, Mistral, xAI | Production patterns, model behavior, agent architecture |
| 1 | Reproducible public code and benchmarks | GitHub repos, benchmark harnesses, leaderboards with logs | Harness design, validation methodology, implementation patterns |
| 2 | Infrastructure and agent product teams | Cursor, Vercel, LangChain, Cognition, Ramp, Prime Intellect, Modal, Browserbase | Operational lessons and system design patterns |
| 2 | Recognized practitioner deep dives | Maintainers, researchers, benchmark authors with public track record | Field reports and failure modes |
| 3 | Newsletters, summaries, podcasts, videos | Technical summaries with source links | Discovery leads only; evaluate primary sources before accepting |

## Exclusion Rules

Reject or defer sources that match any of these patterns:

1. Anonymous or unverifiable author with no primary evidence.
2. Vendor marketing with no mechanism, artifact, metric, or reproducible claim.
3. Basic tutorials that restate prompt engineering or RAG fundamentals.
4. Claims based only on screenshots, demos, or private anecdotes without enough detail to implement.
5. Content whose main insight is already covered in the repo without new evidence, failure modes, or implementation detail.

## Monitoring Queries

Use these query families when running web or paper discovery:

- `context engineering agent systems tool design evaluation memory compression`
- `harness engineering AI agents eval harness agent loop scratchpad`
- `autonomous research agent self improving agents experiment loop`
- `LLM agent evaluation rubric source quality citation accuracy`
- `agent memory durable scratchpad file system state`
- `AlphaEvolve FunSearch autoresearch autonomous experimentation`
- `OpenAI Anthropic Cohere DeepMind agent engineering blog`

## Source Metadata

Every candidate source must record:

```yaml
url: ""
title: ""
author_or_org: ""
published_at: ""
source_type: "paper | engineering_blog | documentation | benchmark | code | talk | other"
retrieval_status: "retrieved | partial | failed"
primary_or_secondary: "primary | secondary"
candidate_reason: ""
```

## Refresh Cadence

- Weekly: lab blogs, arXiv/OpenReview, public benchmark repos, and active engineering blogs.
- Monthly: older source revalidation for volatile claims, especially model-specific thresholds and benchmark numbers.
- Before PR: re-fetch every cited source and confirm the evidence still supports the proposed skill change.

## Acceptance Biases To Avoid

1. Do not accept a weak artifact because the organization is famous.
2. Do not reject negative or failed experiments if they reveal a practical failure mode.
3. Do not overvalue long reports. The target is implementable mechanism density.
4. Do not accept benchmark claims without checking evaluation setup, baselines, and limitations.
5. Do not treat secondary summaries as sources of truth when primary sources are available.
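
The Source Metadata template earlier in this registry lends itself to a mechanical completeness check before a candidate reaches evaluation. The following is a minimal sketch, not part of the skill package: the function name `validate_candidate`, the constants, and the example record are all hypothetical, and treating an empty string as "not recorded" is an assumption. Only the field names and allowed values come from the template.

```python
# Hypothetical helper, not shipped with the skill. Field names and allowed
# values are copied from the Source Metadata template.
REQUIRED_FIELDS = (
    "url", "title", "author_or_org", "published_at",
    "source_type", "retrieval_status", "primary_or_secondary",
    "candidate_reason",
)

ALLOWED_VALUES = {
    "source_type": {
        "paper", "engineering_blog", "documentation",
        "benchmark", "code", "talk", "other",
    },
    "retrieval_status": {"retrieved", "partial", "failed"},
    "primary_or_secondary": {"primary", "secondary"},
}


def validate_candidate(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is complete."""
    problems = []
    # Assumption: a field counts as recorded only if it is a non-blank string.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    # Enumerated fields must hold exactly one of the template's options.
    for field, allowed in ALLOWED_VALUES.items():
        value = record.get(field)
        if isinstance(value, str) and value.strip() and value not in allowed:
            problems.append(f"invalid {field}: {value!r}")
    return problems


# Placeholder record for illustration only; it does not describe a real source.
example = {
    "url": "https://example.com/post",
    "title": "Placeholder title",
    "author_or_org": "Example Org",
    "published_at": "2025-01-01",
    "source_type": "engineering_blog",
    "retrieval_status": "retrieved",
    "primary_or_secondary": "primary",
    "candidate_reason": "",
}
print(validate_candidate(example))
# prints ['missing or empty field: candidate_reason']
```

Returning a list of problems rather than raising on the first one lets a researcher loop log every gap in a single pass and defer the source instead of aborting the run.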