Agent Skills for Context Engineering

A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.

Source repo: muratcankoylan (GitHub)

researcher/fixtures/skill-proposals/harness-engineering-proposal.md

# Skill Proposal: Harness Engineering From Autoresearch

## Source

- URL: https://github.com/karpathy/autoresearch/blob/master/program.md
- Title: Karpathy autoresearch program
- Author or organization: Andrej Karpathy
- Source type: code
- Retrieval status: retrieved
- Evaluation file: `researcher/fixtures/source-evaluations/approved-harness-source.json`
- Decision: HUMAN_REVIEW

## Mechanism

The source demonstrates a constrained autonomous experiment harness: the agent can edit one surface, the evaluator remains locked, each run receives fixed feedback, results are logged durably, and git rollback discards non-improving attempts. The transferable mechanism is not the nanoGPT task itself, but the boundary between editable and locked surfaces.

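The loop described above can be sketched in a few lines of Python. This is an illustrative toy, not the autoresearch implementation: `run_harness`, `locked_metric`, and the random `propose` step are hypothetical names, the editable surface is reduced to a single float, and dropping a candidate stands in for the git rollback.

```python
import random
from typing import Callable, Dict, List, Tuple


def run_harness(
    evaluate: Callable[[float], float],   # locked surface: the loop never modifies it
    propose: Callable[[float], float],    # the agent's edit to the single editable surface
    initial: float,
    attempts: int,
) -> Tuple[float, float, List[Dict[str, object]]]:
    """Keep a candidate only when the locked metric improves (lower is better here)."""
    best = initial
    best_score = evaluate(best)
    log: List[Dict[str, object]] = []
    for attempt in range(attempts):
        candidate = propose(best)
        score = evaluate(candidate)
        keep = score < best_score
        # Every attempt is logged, including the discarded ones.
        log.append({"attempt": attempt, "score": score, "status": "keep" if keep else "discard"})
        if keep:
            best, best_score = candidate, score
        # A discarded candidate is simply dropped, standing in for a git reset.
    return best, best_score, log


def locked_metric(x: float) -> float:
    """Hypothetical fixed metric: squared distance from 3.0."""
    return (x - 3.0) ** 2


rng = random.Random(0)  # seeded so the toy run is repeatable
best, best_score, log = run_harness(
    evaluate=locked_metric,
    propose=lambda current: current + rng.uniform(-1.0, 1.0),
    initial=0.0,
    attempts=50,
)
kept_scores = [entry["score"] for entry in log if entry["status"] == "keep"]
```

Because `evaluate` is passed in and never reassigned inside the loop, the only thing the proposal step can influence is the candidate. That is the editable/locked boundary in miniature.
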
## Skill Target

- Target type: existing skill
- Target path: `skills/harness-engineering/SKILL.md`
- Activation scenario: designing an autonomous loop where editable artifacts are scored by locked evaluation surfaces
- Related skills: evaluation, filesystem-context, project-development
- Proposed location: SKILL.md

## Novelty Check

- Command: `python researcher/scripts/novelty_check.py --file researcher/fixtures/skill-proposals/harness-engineering-proposal.md --json`
- Verdict: pass
- Max mechanism overlap: 0.0866
- Top mechanism overlaps: `locked-editable-surfaces`, `structured-novelty-gate`
- Human-review rationale: Overlap is expected because the fixture seeded the harness pattern; the proposal predates the published mechanism registry and remains useful as a known-good example.

## Evidence

| Claim | Evidence | Source |
| --- | --- | --- |
| Autonomous loops need locked metrics | `prepare.py` owns evaluation while `train.py` is editable | Karpathy autoresearch |
| Durable result logs prevent repeated failures | `results.tsv` records commit, metric, memory, status, and description | Karpathy autoresearch |
| Rollback keeps failed attempts from polluting the frontier | Non-improving commits are reset | Karpathy autoresearch |

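As a rough illustration of the durable log in the second row, the sketch below appends runs to a tab-separated file with the five fields the table names. The exact column spellings, the helper names `append_result` and `read_results`, and the sample rows are assumptions made up for this example; they are not taken from the autoresearch repository.

```python
import csv
import os
import tempfile
from typing import Dict, List

# Assumed column names, based on the fields listed in the evidence table.
FIELDS = ["commit", "metric", "memory", "status", "description"]


def append_result(path: str, row: Dict[str, str]) -> None:
    """Append one run to the TSV log, writing the header if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t")
        if needs_header:
            writer.writeheader()
        writer.writerow(row)


def read_results(path: str) -> List[Dict[str, str]]:
    """Load every logged run as a list of dictionaries keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Made-up rows, written to a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "results.tsv")
    append_result(log_path, {"commit": "abc1234", "metric": "1.25", "memory": "40.1",
                             "status": "keep", "description": "baseline"})
    append_result(log_path, {"commit": "def5678", "metric": "1.31", "memory": "40.3",
                             "status": "discard", "description": "wider hidden layer"})
    rows = read_results(log_path)

# Descriptions of past attempts, so a later run can avoid repeating a failed idea.
tried = {row["description"] for row in rows}
```

Appending rather than rewriting means the record of a discarded attempt survives even after the corresponding commit is reset.
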
## Proposed Delta

```yaml
changes:
  - path: "skills/harness-engineering/SKILL.md"
    section: "Core Concepts"
    change_type: "add"
    summary: "Explain locked vs editable surfaces and fixed feedback loops."
```

## Quality Checks

- [x] Fits an existing activation scenario or justifies a new one.
- [x] Adds an operating rule, workflow, artifact, gotcha, or reference.
- [x] Records novelty-check verdict and top mechanism overlaps.
- [x] Avoids duplicating existing skill guidance or accepted mechanisms.
- [x] Keeps `SKILL.md` under 500 lines.
- [x] Uses progressive disclosure for detailed or volatile evidence.
- [x] Uses platform-agnostic wording.
- [x] Updates README, root `SKILL.md`, and manifests if publishing a new skill.

## Risks And Gaps

- Evidence limitations: one benchmark environment; should not imply every research task has one scalar metric.
- Possible duplication: overlaps with evaluation, but focuses on the control loop around evaluation.
- Volatile claims: avoid embedding star counts or time-sensitive popularity metrics.
- Required human review: O3 applies because evidence rigor is useful but narrow.

## Recommendation

`update-existing-skill`

Use the source as a core example for `harness-engineering`, with general wording that applies beyond nanoGPT.