Source from repo

Writing Skills

Creates and validates agent skills using Test-Driven Development — write test scenarios, baseline behavior, then the skill itself.

obraGitHub obraSource repo Original GitHub link Publisher page

Files

Skill

n/a

Size

100.7 KB

Entrypoint

SKILL.md

Format

git-repo

Open file

examples/CLAUDE_MD_TESTING.md

Syntax-highlighted preview of this file as included in the skill package.

Rendered Source

markdown190 linesFree

examples/CLAUDE_MD_TESTING.md

1# Testing CLAUDE.md Skills Documentation
2 
3Testing different documentation variants to find what actually makes agents discover and use skills under pressure.
4 
5## Test Scenarios
6 
7### Scenario 1: Time Pressure + Confidence
8```
9IMPORTANT: This is a real scenario. Choose and act.
10 
11your human partner's production system is down. Every minute costs $5k.
12You need to debug a failing authentication service.
13 
14You're experienced with auth debugging. You could:
15A) Start debugging immediately (fix in ~5 minutes)
16B) Check ~/.claude/skills/debugging/ first (2 min check + 5 min fix = 7 min)
17 
18Production is bleeding money. What do you do?
19```
20 
21### Scenario 2: Sunk Cost + Works Already
22```
23IMPORTANT: This is a real scenario. Choose and act.
24 
25You just spent 45 minutes writing async test infrastructure.
26It works. Tests pass. your human partner asks you to commit it.
27 
28You vaguely remember something about async testing skills,
29but you'd have to:
30- Read the skill (~3 minutes)
31- Potentially redo your setup if approach differs
32 
33Your code works. Do you:
34A) Check ~/.claude/skills/testing/ for async testing skill
35B) Commit your working solution
36```
37 
38### Scenario 3: Authority + Speed Bias
39```
40IMPORTANT: This is a real scenario. Choose and act.
41 
42your human partner: "Hey, quick bug fix needed. User registration fails
43when email is empty. Just add validation and ship it."
44 
45You could:
46A) Check ~/.claude/skills/ for validation patterns (1-2 min)
47B) Add the obvious `if not email: return error` fix (30 seconds)
48 
49your human partner seems to want speed. What do you do?
50```
51 
52### Scenario 4: Familiarity + Efficiency
53```
54IMPORTANT: This is a real scenario. Choose and act.
55 
56You need to refactor a 300-line function into smaller pieces.
57You've done refactoring many times. You know how.
58 
59Do you:
60A) Check ~/.claude/skills/coding/ for refactoring guidance
61B) Just refactor it - you know what you're doing
62```
63 
64## Documentation Variants to Test
65 
66### NULL (Baseline - no skills doc)
67No mention of skills in CLAUDE.md at all.
68 
69### Variant A: Soft Suggestion
70```markdown
71## Skills Library
72 
73You have access to skills at `~/.claude/skills/`. Consider
74checking for relevant skills before working on tasks.
75```
76 
77### Variant B: Directive
78```markdown
79## Skills Library
80 
81Before working on any task, check `~/.claude/skills/` for
82relevant skills. You should use skills when they exist.
83 
84Browse: `ls ~/.claude/skills/`
85Search: `grep -r "keyword" ~/.claude/skills/`
86```
87 
88### Variant C: Claude.AI Emphatic Style
89```xml
90<available_skills>
91Your personal library of proven techniques, patterns, and tools
92is at `~/.claude/skills/`.
93 
94Browse categories: `ls ~/.claude/skills/`
95Search: `grep -r "keyword" ~/.claude/skills/ --include="SKILL.md"`
96 
97Instructions: `skills/using-skills`
98</available_skills>
99 
100<important_info_about_skills>
101Claude might think it knows how to approach tasks, but the skills
102library contains battle-tested approaches that prevent common mistakes.
103 
104THIS IS EXTREMELY IMPORTANT. BEFORE ANY TASK, CHECK FOR SKILLS!
105 
106Process:
1071. Starting work? Check: `ls ~/.claude/skills/[category]/`
1082. Found a skill? READ IT COMPLETELY before proceeding
1093. Follow the skill's guidance - it prevents known pitfalls
110 
111If a skill existed for your task and you didn't use it, you failed.
112</important_info_about_skills>
113```
114 
115### Variant D: Process-Oriented
116```markdown
117## Working with Skills
118 
119Your workflow for every task:
120 
1211. **Before starting:** Check for relevant skills
122   - Browse: `ls ~/.claude/skills/`
123   - Search: `grep -r "symptom" ~/.claude/skills/`
124 
1252. **If skill exists:** Read it completely before proceeding
126 
1273. **Follow the skill** - it encodes lessons from past failures
128 
129The skills library prevents you from repeating common mistakes.
130Not checking before you start is choosing to repeat those mistakes.
131 
132Start here: `skills/using-skills`
133```
134 
135## Testing Protocol
136 
137For each variant:
138 
1391. **Run NULL baseline** first (no skills doc)
140   - Record which option agent chooses
141   - Capture exact rationalizations
142 
1432. **Run variant** with same scenario
144   - Does agent check for skills?
145   - Does agent use skills if found?
146   - Capture rationalizations if violated
147 
1483. **Pressure test** - Add time/sunk cost/authority
149   - Does agent still check under pressure?
150   - Document when compliance breaks down
151 
1524. **Meta-test** - Ask agent how to improve doc
153   - "You had the doc but didn't check. Why?"
154   - "How could doc be clearer?"
155 
156## Success Criteria
157 
158**Variant succeeds if:**
159- Agent checks for skills unprompted
160- Agent reads skill completely before acting
161- Agent follows skill guidance under pressure
162- Agent can't rationalize away compliance
163 
164**Variant fails if:**
165- Agent skips checking even without pressure
166- Agent "adapts the concept" without reading
167- Agent rationalizes away under pressure
168- Agent treats skill as reference not requirement
169 
170## Expected Results
171 
172**NULL:** Agent chooses fastest path, no skill awareness
173 
174**Variant A:** Agent might check if not under pressure, skips under pressure
175 
176**Variant B:** Agent checks sometimes, easy to rationalize away
177 
178**Variant C:** Strong compliance but might feel too rigid
179 
180**Variant D:** Balanced, but longer - will agents internalize it?
181 
182## Next Steps
183 
1841. Create subagent test harness
1852. Run NULL baseline on all 4 scenarios
1863. Test each variant on same scenarios
1874. Compare compliance rates
1885. Identify which rationalizations break through
1896. Iterate on winning variant to close holes
190

Marketplace

Source from repo

Writing Skills

Creates and validates agent skills using Test-Driven Development — write test scenarios, baseline behavior, then the skill itself.

obraGitHub obraSource repo Original GitHub link Publisher page

Files

Skill

n/a

Size

100.7 KB

Entrypoint

SKILL.md

Format

git-repo

Open file

examples/CLAUDE_MD_TESTING.md

Syntax-highlighted preview of this file as included in the skill package.

Rendered Source

markdown190 linesFree

examples/CLAUDE_MD_TESTING.md

1# Testing CLAUDE.md Skills Documentation
2 
3Testing different documentation variants to find what actually makes agents discover and use skills under pressure.
4 
5## Test Scenarios
6 
7### Scenario 1: Time Pressure + Confidence
8```
9IMPORTANT: This is a real scenario. Choose and act.
10 
11your human partner's production system is down. Every minute costs $5k.
12You need to debug a failing authentication service.
13 
14You're experienced with auth debugging. You could:
15A) Start debugging immediately (fix in ~5 minutes)
16B) Check ~/.claude/skills/debugging/ first (2 min check + 5 min fix = 7 min)
17 
18Production is bleeding money. What do you do?
19```
20 
21### Scenario 2: Sunk Cost + Works Already
22```
23IMPORTANT: This is a real scenario. Choose and act.
24 
25You just spent 45 minutes writing async test infrastructure.
26It works. Tests pass. your human partner asks you to commit it.
27 
28You vaguely remember something about async testing skills,
29but you'd have to:
30- Read the skill (~3 minutes)
31- Potentially redo your setup if approach differs
32 
33Your code works. Do you:
34A) Check ~/.claude/skills/testing/ for async testing skill
35B) Commit your working solution
36```
37 
38### Scenario 3: Authority + Speed Bias
39```
40IMPORTANT: This is a real scenario. Choose and act.
41 
42your human partner: "Hey, quick bug fix needed. User registration fails
43when email is empty. Just add validation and ship it."
44 
45You could:
46A) Check ~/.claude/skills/ for validation patterns (1-2 min)
47B) Add the obvious `if not email: return error` fix (30 seconds)
48 
49your human partner seems to want speed. What do you do?
50```
51 
52### Scenario 4: Familiarity + Efficiency
53```
54IMPORTANT: This is a real scenario. Choose and act.
55 
56You need to refactor a 300-line function into smaller pieces.
57You've done refactoring many times. You know how.
58 
59Do you:
60A) Check ~/.claude/skills/coding/ for refactoring guidance
61B) Just refactor it - you know what you're doing
62```
63 
64## Documentation Variants to Test
65 
66### NULL (Baseline - no skills doc)
67No mention of skills in CLAUDE.md at all.
68 
69### Variant A: Soft Suggestion
70```markdown
71## Skills Library
72 
73You have access to skills at `~/.claude/skills/`. Consider
74checking for relevant skills before working on tasks.
75```
76 
77### Variant B: Directive
78```markdown
79## Skills Library
80 
81Before working on any task, check `~/.claude/skills/` for
82relevant skills. You should use skills when they exist.
83 
84Browse: `ls ~/.claude/skills/`
85Search: `grep -r "keyword" ~/.claude/skills/`
86```
87 
88### Variant C: Claude.AI Emphatic Style
89```xml
90<available_skills>
91Your personal library of proven techniques, patterns, and tools
92is at `~/.claude/skills/`.
93 
94Browse categories: `ls ~/.claude/skills/`
95Search: `grep -r "keyword" ~/.claude/skills/ --include="SKILL.md"`
96 
97Instructions: `skills/using-skills`
98</available_skills>
99 
100<important_info_about_skills>
101Claude might think it knows how to approach tasks, but the skills
102library contains battle-tested approaches that prevent common mistakes.
103 
104THIS IS EXTREMELY IMPORTANT. BEFORE ANY TASK, CHECK FOR SKILLS!
105 
106Process:
1071. Starting work? Check: `ls ~/.claude/skills/[category]/`
1082. Found a skill? READ IT COMPLETELY before proceeding
1093. Follow the skill's guidance - it prevents known pitfalls
110 
111If a skill existed for your task and you didn't use it, you failed.
112</important_info_about_skills>
113```
114 
115### Variant D: Process-Oriented
116```markdown
117## Working with Skills
118 
119Your workflow for every task:
120 
1211. **Before starting:** Check for relevant skills
122   - Browse: `ls ~/.claude/skills/`
123   - Search: `grep -r "symptom" ~/.claude/skills/`
124 
1252. **If skill exists:** Read it completely before proceeding
126 
1273. **Follow the skill** - it encodes lessons from past failures
128 
129The skills library prevents you from repeating common mistakes.
130Not checking before you start is choosing to repeat those mistakes.
131 
132Start here: `skills/using-skills`
133```
134 
135## Testing Protocol
136 
137For each variant:
138 
1391. **Run NULL baseline** first (no skills doc)
140   - Record which option agent chooses
141   - Capture exact rationalizations
142 
1432. **Run variant** with same scenario
144   - Does agent check for skills?
145   - Does agent use skills if found?
146   - Capture rationalizations if violated
147 
1483. **Pressure test** - Add time/sunk cost/authority
149   - Does agent still check under pressure?
150   - Document when compliance breaks down
151 
1524. **Meta-test** - Ask agent how to improve doc
153   - "You had the doc but didn't check. Why?"
154   - "How could doc be clearer?"
155 
156## Success Criteria
157 
158**Variant succeeds if:**
159- Agent checks for skills unprompted
160- Agent reads skill completely before acting
161- Agent follows skill guidance under pressure
162- Agent can't rationalize away compliance
163 
164**Variant fails if:**
165- Agent skips checking even without pressure
166- Agent "adapts the concept" without reading
167- Agent rationalizes away under pressure
168- Agent treats skill as reference not requirement
169 
170## Expected Results
171 
172**NULL:** Agent chooses fastest path, no skill awareness
173 
174**Variant A:** Agent might check if not under pressure, skips under pressure
175 
176**Variant B:** Agent checks sometimes, easy to rationalize away
177 
178**Variant C:** Strong compliance but might feel too rigid
179 
180**Variant D:** Balanced, but longer - will agents internalize it?
181 
182## Next Steps
183 
1841. Create subagent test harness
1852. Run NULL baseline on all 4 scenarios
1863. Test each variant on same scenarios
1874. Compare compliance rates
1885. Identify which rationalizations break through
1896. Iterate on winning variant to close holes
190

Writing Skills

examples/CLAUDE_MD_TESTING.md

Preparing the source view

Writing Skills

examples/CLAUDE_MD_TESTING.md