Source from repo
A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.
researcher/rubrics/skill-change.md
# Skill Change Rubric

Use this rubric after a source passes content curation. It decides whether the extracted mechanism should change the published skill corpus.

## Hard Gates

| Gate | Pass | Fail |
| --- | --- | --- |
| S1 Distinct Activation | The change has a clear trigger or improves an existing trigger | No clear activation scenario |
| S2 Implementable Guidance | The change tells an agent what to do, when to do it, and what to avoid | Adds only background knowledge |
| S3 Corpus Fit | The change belongs in an existing skill or justifies a new skill boundary | Duplicates existing content without improvement |
| S4 Evidence Traceability | Every non-obvious claim maps to a retrieved source or internal example | Unsupported or uncited claim |
| S5 Maintainer Burden | The claim is stable enough for `SKILL.md` or isolated in references if volatile | Adds brittle numbers or vendor-specific churn to core instructions |

Any failed gate routes to `HUMAN_REVIEW` or `REJECT`; do not publish automatically.

## Scoring

| Dimension | Weight | What To Check |
| --- | --- | --- |
| Actionability | 30% | Can a future agent apply this without additional research? |
| Relevance | 25% | Does it improve context engineering, harness engineering, evaluation, memory, tools, or agent architecture? |
| Non-Duplication | 20% | Does it add a new mechanism, failure mode, or sharper operating rule? |
| Evidence | 15% | Is the claim backed by reproducible artifacts, benchmarks, or credible production experience? |
| Skill Ergonomics | 10% | Does it keep discovery, line count, and progressive disclosure clean? |

Score each dimension 0, 1, or 2. Approve only when weighted total is at least 1.4 and no hard gate fails.

## New Skill vs Existing Skill

Update an existing skill when:

- The mechanism shares the same activation scenario.
- The current skill already owns the concept.
- The update is a sharper guideline, gotcha, example, or reference.

Create a new skill only when:

- The activation scenario is distinct and likely to be recognized by future agents.
- The workflow has its own operating sequence.
- Combining it with an existing skill would blur boundaries or exceed the 500-line budget.

Keep as reference-only when:

- The source is credible but volatile.
- The mechanism is interesting but not yet an operating rule.
- Evidence is useful for background but not enough for published instructions.

## Required Proposal Fields

Every proposed skill change must include:

```yaml
target: "new skill | existing skill | reference only"
target_path: ""
activation_trigger: ""
mechanism: ""
evidence:
  - source_url: ""
    retrieved: true
    supports: ""
proposed_delta:
  section: ""
  change_type: "add | update | remove"
  summary: ""
risks:
  - ""
review_decision: "approve | human_review | reject"
```

## Failure Modes

1. **Encyclopedia bloat**: Adding every interesting paper turns skills into literature reviews. Only publish mechanisms that change agent behavior.
2. **Claim rot**: Model-specific numbers age quickly. Put volatile evidence in dated references, not timeless guidance.
3. **Trigger collision**: Similar descriptions cause agents to activate the wrong skill. Keep skill boundaries sharper than taxonomy labels.
4. **Reference laundering**: Secondary summaries can point to primary sources but should not carry technical claims alone.
5. **One-source overfit**: A single credible source can justify human review, but broad guidance should have either reproduced evidence or multiple converging sources.
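
## Worked Example: Approval Check

The approval rule in the Scoring section (five weighted dimensions scored 0, 1, or 2, a threshold of 1.4, and no failed hard gate) can be sketched in Python as below. This is a minimal illustration, not part of the skill package: the function names and dictionary keys are hypothetical, and the weights are held as integer percentages (an assumption made here) so the comparison at the 1.4 boundary does not depend on floating-point rounding.

```python
# Weights from the Scoring table, as integer percentages.
# The dictionary keys are hypothetical identifiers, not names from the rubric.
WEIGHTS = {
    "actionability": 30,
    "relevance": 25,
    "non_duplication": 20,
    "evidence": 15,
    "skill_ergonomics": 10,
}
HARD_GATES = ("S1", "S2", "S3", "S4", "S5")
APPROVE_THRESHOLD_POINTS = 140  # 1.4 expressed in hundredths


def weighted_points(scores):
    """Return the weighted total in hundredths (e.g. 155 means 1.55)."""
    for name in WEIGHTS:
        if scores.get(name) not in (0, 1, 2):
            raise ValueError(f"{name} must be scored 0, 1, or 2")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def can_auto_approve(scores, gates):
    """True only if every hard gate passes and the total is at least 1.4."""
    all_gates_pass = all(gates.get(gate) is True for gate in HARD_GATES)
    return all_gates_pass and weighted_points(scores) >= APPROVE_THRESHOLD_POINTS


# Illustrative scores, invented for this example.
example_scores = {
    "actionability": 2,
    "relevance": 2,
    "non_duplication": 1,
    "evidence": 1,
    "skill_ergonomics": 1,
}
example_gates = {gate: True for gate in HARD_GATES}

print(weighted_points(example_scores) / 100)  # prints 1.55
print(can_auto_approve(example_scores, example_gates))  # prints True
```

The sketch only answers whether automatic approval is allowed. When it returns `False`, choosing between `human_review` and `reject` is left to the caller, since the rubric does not state a rule for that choice.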