researcher/runbooks/autonomous-research-loop.md
# Autonomous Research Loop

This runbook defines how an agent should operate when asked to find research and turn it into repo changes.

## Setup

1. Create a run ID with `python researcher/scripts/research_loop.py init --title "..." --url "..."`.
2. Read `../source-registry.md` and select source classes for the mission.
3. Read `../mechanisms/registry.jsonl` to understand accepted mechanisms before claiming novelty.
4. Read the relevant rubrics before evaluating anything.
5. Declare locked surfaces: rubrics, manifests, mechanism registry, and merge policy are not editable during scoring.
6. Declare editable surfaces: evaluations, proposals, drafts, run-local mechanism proposals, and append-only logs.

## Loop

Repeat until source queue is empty or the human stops the run:

1. Discover candidates from the source registry.
2. Fetch primary sources whenever available and record them with `research_loop.py retrieve`.
3. Record retrieval status before evaluating.
4. Apply `../rubrics/content-curation.md`.
5. Reject failed gates immediately and log why.
6. For approved or reviewed sources, extract mechanisms and artifacts into the proposal.
7. Apply `../rubrics/skill-change.md` or `../rubrics/harness-change.md`.
8. Draft a proposal with `../templates/skill-proposal.md` and any mechanism proposals with `../templates/mechanism-proposal.jsonl`.
9. Run `python researcher/scripts/research_loop.py novelty --run-dir <run>` before changing published skills; registry overlap is the primary duplicate signal.
10. If multiple drafts compete, apply `../rubrics/pairwise-skill-revision.md` and run `compare_skill_revisions.py`.
11. If the proposal passes, prepare repo changes in normal repo structure.
12. Run deterministic repo and run-readiness validation and record results.
13. Prepare PR summary and test plan, but do not merge.
14. Close the run with `accepted`, `rejected`, `reference-only`, or `abandoned` rationale.

## Novelty And Refresh Rules

- Before drafting a new skill, compare against accepted mechanisms and existing skill boundaries.
- Use `novelty_check.py` as a fast mechanism-overlap gate, then apply human or LLM judgment for semantic novelty.
- For long-running runs, refresh upstream sources before finalizing a proposal.
- Preserve rejected ideas so future agents do not rediscover the same failed path.
- Require a pruning pass when a proposal adds multiple rules or concepts. Remove any piece that does not change behavior.
- Store raw source exports under the run's `sources/evidence/raw/` directory, never at the repository root.
- Promote accepted or candidate mechanisms only through `research_loop.py promote-mechanisms` after run readiness and recorded human review.

## Failure Handling

| Failure | Action |
| --- | --- |
| Source fetch fails | Retry once with an alternate URL, then record `partial` or `failed` |
| JSON evaluation invalid | Save raw output and route to human review |
| Evidence weak but relevant | Route to human review, do not publish automatically |
| Skill draft exceeds 500 lines | Move detail to references or reject the draft |
| Manifest sync uncertain | Stop and request human review before PR |
| Conflicting sources | Record both claims and prefer no published change until resolved |

## PR Preparation Policy

Agents may prepare PRs only after:

1. Content and skill or harness rubrics pass.
2. Deterministic checks pass.
3. Every source cited in the change was retrieved.
4. The PR body includes unresolved risks.
5. The PR states that merge requires human approval.

The user rule remains binding: do not push anything to GitHub without explicit approval.

## Handover

Before context compaction, interruption, or model handoff, update the run thread with:

- Current best candidate.
- Evaluations completed and their file paths.
- Rejected candidates and reasons.
- Open risks.
- Next action.
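
## Example: PR Preparation Gate

The five preconditions in the PR Preparation Policy can be sketched as a single check that reports which conditions are still unmet. This is a minimal illustration, not part of `research_loop.py`: the `PrReadiness` class, its field names, and `unmet_conditions` are hypothetical, and the sketch assumes each precondition has already been evaluated elsewhere and recorded as a boolean or a set of source identifiers.

```python
from dataclasses import dataclass


@dataclass
class PrReadiness:
    """Hypothetical record of the five PR preconditions (not a repo type)."""

    rubrics_passed: bool                    # content and skill or harness rubrics
    deterministic_checks_passed: bool       # deterministic validation results
    cited_sources: set[str]                 # sources cited in the change
    retrieved_sources: set[str]             # sources recorded as retrieved
    body_lists_unresolved_risks: bool       # PR body includes unresolved risks
    body_states_human_merge_approval: bool  # PR says merge needs human approval


def unmet_conditions(r: PrReadiness) -> list[str]:
    """Return a description of every precondition that is not yet satisfied."""
    unmet: list[str] = []
    if not r.rubrics_passed:
        unmet.append("content and skill or harness rubrics must pass")
    if not r.deterministic_checks_passed:
        unmet.append("deterministic checks must pass")
    # Every cited source must appear among the retrieved sources.
    missing = sorted(r.cited_sources - r.retrieved_sources)
    if missing:
        unmet.append("cited sources not retrieved: " + ", ".join(missing))
    if not r.body_lists_unresolved_risks:
        unmet.append("PR body must include unresolved risks")
    if not r.body_states_human_merge_approval:
        unmet.append("PR must state that merge requires human approval")
    return unmet


if __name__ == "__main__":
    # Illustrative values only; "src-a" and "src-b" are made-up identifiers.
    state = PrReadiness(
        rubrics_passed=True,
        deterministic_checks_passed=True,
        cited_sources={"src-a", "src-b"},
        retrieved_sources={"src-a"},
        body_lists_unresolved_risks=True,
        body_states_human_merge_approval=False,
    )
    for problem in unmet_conditions(state):
        print(problem)
```

An empty result means only that a PR may be prepared. It does not authorize a push, which still requires explicit approval under the user rule.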