Skill Change Rubric

Use this rubric after a source passes content curation. It decides whether the extracted mechanism should change the published skill corpus.

Hard Gates

Gate	Pass	Fail
S1 Distinct Activation	The change has a clear trigger or improves an existing trigger	No clear activation scenario
S2 Implementable Guidance	The change tells an agent what to do, when to do it, and what to avoid	Adds only background knowledge
S3 Corpus Fit	The change belongs in an existing skill or justifies a new skill boundary	Duplicates existing content without improvement
S4 Evidence Traceability	Every non-obvious claim maps to a retrieved source or internal example	Unsupported or uncited claim
S5 Maintainer Burden	The claim is stable enough for `SKILL.md` or isolated in references if volatile	Adds brittle numbers or vendor-specific churn to core instructions

Any failed gate routes the change to HUMAN_REVIEW or REJECT; never publish it automatically.

Scoring

Dimension	Weight	What To Check
Actionability	30%	Can a future agent apply this without additional research?
Relevance	25%	Does it improve context engineering, harness engineering, evaluation, memory, tools, or agent architecture?
Non-Duplication	20%	Does it add a new mechanism, failure mode, or sharper operating rule?
Evidence	15%	Is the claim backed by reproducible artifacts, benchmarks, or credible production experience?
Skill Ergonomics	10%	Does it keep discovery, line count, and progressive disclosure clean?

Score each dimension 0, 1, or 2, multiply each score by its weight, and sum the results (maximum 2.0). Approve only when the weighted total is at least 1.4 and no hard gate fails.
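
The approval rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the weighted total is the sum of each dimension's score times its weight. The names `WEIGHT_PCT`, `GATES`, `weighted_total`, and `can_approve` are hypothetical, not part of the rubric. Weights are held as integer percentages so that a total of exactly 1.4 is not lost to floating-point rounding.

```python
from typing import Dict

# Weights from the Scoring table, as integer percentages (they sum to 100).
WEIGHT_PCT: Dict[str, int] = {
    "actionability": 30,
    "relevance": 25,
    "non_duplication": 20,
    "evidence": 15,
    "skill_ergonomics": 10,
}
GATES = ("S1", "S2", "S3", "S4", "S5")
THRESHOLD_POINTS = 140  # 1.4 expressed in hundredths


def weighted_points(scores: Dict[str, int]) -> int:
    """Return the weighted total in hundredths (0..200)."""
    if set(scores) != set(WEIGHT_PCT):
        raise ValueError("scores must cover exactly the five rubric dimensions")
    for name, score in scores.items():
        if score not in (0, 1, 2):
            raise ValueError(f"{name}: score must be 0, 1, or 2, got {score!r}")
    return sum(WEIGHT_PCT[name] * scores[name] for name in WEIGHT_PCT)


def weighted_total(scores: Dict[str, int]) -> float:
    """Return the weighted total on the rubric's 0.0..2.0 scale."""
    return weighted_points(scores) / 100


def can_approve(scores: Dict[str, int], gates_passed: Dict[str, bool]) -> bool:
    """Approve only if every hard gate passed and the total is at least 1.4."""
    all_gates_ok = all(gates_passed.get(gate, False) for gate in GATES)
    return all_gates_ok and weighted_points(scores) >= THRESHOLD_POINTS
```

A missing gate entry is treated as a failure here, which is an assumption on the conservative side; a `False` result only means the change cannot be auto-approved, and the choice between HUMAN_REVIEW and REJECT is left to the reviewer.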

New Skill vs Existing Skill

Update an existing skill when:

The mechanism shares the same activation scenario.
The current skill already owns the concept.
The update is a sharper guideline, gotcha, example, or reference.

Create a new skill only when:

The activation scenario is distinct and likely to be recognized by future agents.
The workflow has its own operating sequence.
Combining it with an existing skill would blur boundaries or exceed the 500-line budget.

Keep as reference-only when:

The source is credible but volatile.
The mechanism is interesting but not yet an operating rule.
Evidence is useful for background but not enough for published instructions.

Required Proposal Fields

Every proposed skill change must include:

target: "new skill | existing skill | reference only"
target_path: ""
activation_trigger: ""
mechanism: ""
evidence:
  - source_url: ""
    retrieved: true
    supports: ""
proposed_delta:
  section: ""
  change_type: "add | update | remove"
  summary: ""
risks:
  - ""
review_decision: "approve | human_review | reject"
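
A proposal in this shape can be checked mechanically before review. The sketch below assumes the proposal has already been parsed into a plain Python dict, that every field must be non-empty, and that `retrieved` must be `true` for each evidence entry; those strictness choices are assumptions, not rules stated by the rubric. The function name `validate_proposal` and the example values (including the path and URL) are hypothetical.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = (
    "target", "target_path", "activation_trigger", "mechanism",
    "evidence", "proposed_delta", "risks", "review_decision",
)
# Enumerated values taken from the template above.
ALLOWED_VALUES = {
    "target": {"new skill", "existing skill", "reference only"},
    "review_decision": {"approve", "human_review", "reject"},
}
ALLOWED_CHANGE_TYPES = {"add", "update", "remove"}


def validate_proposal(proposal: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape is valid."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        if not proposal.get(field):
            problems.append(f"missing or empty field: {field}")
    for field, allowed in ALLOWED_VALUES.items():
        value = proposal.get(field)
        if value and value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    for i, item in enumerate(proposal.get("evidence") or []):
        if not item.get("source_url") or not item.get("supports"):
            problems.append(f"evidence[{i}]: source_url and supports are required")
        if item.get("retrieved") is not True:  # assumption: unretrieved sources do not count
            problems.append(f"evidence[{i}]: source was not retrieved")
    delta = proposal.get("proposed_delta") or {}
    if delta:
        if not delta.get("section") or not delta.get("summary"):
            problems.append("proposed_delta: section and summary are required")
        if delta.get("change_type") not in ALLOWED_CHANGE_TYPES:
            problems.append("proposed_delta: change_type must be add, update, or remove")
    if proposal.get("risks") and not any(proposal["risks"]):
        problems.append("risks: at least one non-empty risk is required")
    return problems


# Hypothetical example values, for illustration only.
example_proposal = {
    "target": "existing skill",
    "target_path": "skills/example/SKILL.md",
    "activation_trigger": "Agent is about to publish a skill change",
    "mechanism": "Gate the change on traceable evidence",
    "evidence": [{"source_url": "https://example.com/source", "retrieved": True,
                  "supports": "Mechanism description"}],
    "proposed_delta": {"section": "Gotchas", "change_type": "add",
                       "summary": "Add an evidence check"},
    "risks": ["May overlap with an existing guideline"],
    "review_decision": "human_review",
}
```

Calling `validate_proposal(example_proposal)` returns an empty list, while a proposal with a blank `mechanism` or an unknown `change_type` returns one message per problem.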

Failure Modes

Encyclopedia bloat: Adding every interesting paper turns skills into literature reviews. Only publish mechanisms that change agent behavior.
Claim rot: Model-specific numbers age quickly. Put volatile evidence in dated references, not timeless guidance.
Trigger collision: Similar descriptions cause agents to activate the wrong skill. Keep skill boundaries sharper than taxonomy labels.
Reference laundering: Secondary summaries can point to primary sources but should not be the sole support for a technical claim.
One-source overfit: A single credible source can justify human review, but broad guidance should have either reproduced evidence or multiple converging sources.
