examples/book-sft-pipeline/examples/gertrude-stein/README.md
# Example: Gertrude Stein Style Transfer

A complete case study of training Qwen3-8B-Base to write in Gertrude Stein's style using her 1909 novel "Three Lives."

## Project Summary

| Metric | Value |
|--------|-------|
| Source Book | Three Lives (1909) |
| Book Word Count | ~86,000 |
| Training Examples | 592 |
| Test Examples | 50 |
| Model | Qwen/Qwen3-8B-Base |
| LoRA Rank | 32 |
| Adapter Size | 352 MB |
| Training Time | ~15 minutes |
| Total Cost | ~$2 |

## Training Metrics

### Loss Trajectory

| Step | Train Loss | Test Loss |
|------|-----------|-----------|
| 5 | 7579.67 | - |
| 40 | 4032.35 | 7584.85 |
| 100 | 2392.17 | 2127.44 |
| 140 | 2863.17 | 752.95 |
| 180 | 1327.41 | 229.88 |
| 200 | 1044.49 | 213.36 |
| Final | ~800 | ~200 |

Loss reduction: 97% (test loss 7584.85 to 213.36)

## Style Markers Learned

The model successfully learned these Gertrude Stein characteristics:

1. **Repetitive sentence structures**: "She was a good woman. She was always a good woman."
2. **Simple vocabulary**: Common words, no ornate language
3. **Comma-separated adjectives**: "a dark, sweet, little, pretty girl"
4. **Present continuous tense**: "She was always doing", "He was thinking"
5. **Character focus**: Deep psychological observation over plot
6. **Rhythmic, hypnotic quality**: Almost musical prose

## Sample Outputs

### Modern Scenario: Real Estate Office

**Prompt**: Write in Gertrude Stein's style about a real estate clerk coming home tired.

**Output**:
> It was a very busy day for the clerk in the real estate office. He came home to his small house in the working class part of the town, very tired. He went into the kitchen where his wife was cooking.
>
> "Hi, Sis, home for dinner?" he said to her.
>
> She looked at him and saw that he was very tired. She looked at him and then looked away into the fire. She said nothing in answer to him.
>
> He took off his hat and coat and then sat down on the steps just inside the door to take a rest. He was very tired today, he sat there and just sort of dozed a little.

**Analysis**: Original content verified: "real estate office", "working class", and "Hi, Sis" do not appear in the training data.

### Modern Scenario: Text Messages

**Prompt**: Write about two lovers communicating through text messages.

**Output**:
> "Hey, Herman, I just got a minute to be gone from my patient. Want to talk to you for a little while, Herman," Lena wrote one night, and then she sent the text to him.
>
> Herman read it and then he wrote her back, "I was just getting ready to go to bed, Lena. Want me to come and see you tomorrow, Lena. Want me to bring something to you, Lena. Want me to bring you some ice cream, Lena. Want me to bring you some of those little donuts..."

**Analysis**: Classic Stein anaphora (repetition at sentence starts) applied to a modern SMS format.

## AI Detector Results

Tested with the Pangram AI detector: **100% Human Written**

Multiple samples were tested; all scored as human-written prose.

## Validation Method

### Modern Scenario Testing

We tested the model on scenarios that couldn't exist in 1909:
- Barista making lattes
- Social media scrolling
- Video calls
- Food delivery drivers
- Climate change anxiety

When style markers appeared in these modern contexts, it showed the model had learned **style** rather than **content**.

### Originality Verification

Searched the training data for output phrases:

```bash
grep "real estate office" dataset.jsonl  # No matches
grep "working class" dataset.jsonl       # No matches
grep "Hi, Sis" dataset.jsonl             # No matches
grep "text messages" dataset.jsonl       # No matches
```

## Known Limitations

### Character Name Leakage (~30% of outputs)

The model sometimes uses original character names (Melanctha, Mrs. Lehntman, Anna) even in modern scenarios. This is because, with 592 examples drawn from a single book, these names appear hundreds of times in the training data.

**Mitigation**: Train on multiple books by the same author, or add synthetic examples with different names.

### Success Rate Distribution

- Perfect style transfer: ~50%
- Style with name leakage: ~30%
- Partial style: ~15%
- Failed: ~5%

The 50% perfect rate is realistic for an 8B model trained on one book.

## Configuration Used

### Dataset Generation

```python
CONFIG = {
    "min_words": 150,
    "max_words": 400,
    "overlap": True,  # Last paragraph carried to next chunk
    "variants_per_chunk": 2,
    "prompt_templates": 15,
    "system_prompts": 5,
    "instruction_model": "gemini-2.0-flash-lite",
}
```

### Training

```python
CONFIG = {
    "model_name": "Qwen/Qwen3-8B-Base",
    "lora_rank": 32,
    "learning_rate": 5e-4,
    "batch_size": 4,
    "epochs": 3,
    "eval_every": 20,
    "save_every": 50,
}
```

## Key Learnings

1. **Smaller chunks work better**: 150-400 words produced more examples and better style transfer than 250-650

2. **Prompt diversity is critical**: 15 templates × 5 system prompts = 75 variations prevented memorization

3. **Base models over instruct**: Qwen3-8B-Base was more malleable than the instruct versions

4. **Modern scenario testing proves transfer**: If the style applies to modern contexts, the model learned patterns, not content

5. **~$2 is enough**: LLM calls for instruction generation (~$0.50) plus Tinker training (~$1.50)

## Files

- `sample_outputs.md` - Full model outputs with analysis
- `training_config.json` - Exact configuration used
- `dataset_sample.jsonl` - Sample training examples
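As an aside, the chunking behavior implied by the Dataset Generation config (`min_words`, `max_words`, `overlap`) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the pipeline's actual code; `chunk_book` and its signature are hypothetical:

```python
def chunk_book(paragraphs, min_words=150, max_words=400, overlap=True):
    """Pack consecutive paragraphs into chunks of roughly
    min_words..max_words words. With overlap=True, the last
    paragraph of each chunk is carried into the next chunk,
    mirroring the "overlap" option in the dataset CONFIG."""
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if current and count + n > max_words:
            # Close the current chunk before it exceeds max_words;
            # discard fragments shorter than min_words.
            if count >= min_words:
                chunks.append("\n\n".join(current))
            # Optionally repeat the last paragraph at the start
            # of the next chunk.
            current = [current[-1]] if overlap else []
            count = len(current[0].split()) if current else 0
        current.append(para)
        count += n
    if count >= min_words:
        chunks.append("\n\n".join(current))
    return chunks
```

Each resulting chunk would then be paired with generated instructions (the `variants_per_chunk`, `prompt_templates`, and `system_prompts` settings) to form the 592 training examples.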