
Agent Skills for Context Engineering

A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.

muratcankoylan (GitHub)

Files: 241
Skill: n/a
Size: 2.6 MB
Entrypoint: SKILL.md
Format: git-repo


examples/llm-as-judge-skills/examples/basic-evaluation.ts


/**
 * Basic Evaluation Example
 *
 * Demonstrates how to use the EvaluatorAgent to score responses.
 *
 * Run: npx tsx examples/basic-evaluation.ts
 */

import 'dotenv/config';
import { EvaluatorAgent } from '../src/agents/evaluator.js';
import { validateConfig } from '../src/config/index.js';

async function main() {
  // Validate API key is configured
  validateConfig();

  const agent = new EvaluatorAgent();

  console.log('=== Direct Scoring Example ===\n');

  const response = `
    Machine learning is a subset of artificial intelligence that enables systems
    to learn and improve from experience without being explicitly programmed.
    It focuses on developing algorithms that can access data and use it to learn for themselves.

    There are three main types of machine learning:
    1. Supervised learning - learns from labeled data
    2. Unsupervised learning - finds patterns in unlabeled data
    3. Reinforcement learning - learns through trial and error
  `;

  const result = await agent.score({
    response,
    prompt: 'Explain what machine learning is to a beginner',
    criteria: [
      {
        name: 'Accuracy',
        description: 'Factual correctness of the explanation',
        weight: 0.4
      },
      {
        name: 'Clarity',
        description: 'Easy to understand for a beginner',
        weight: 0.3
      },
      {
        name: 'Completeness',
        description: 'Covers the key concepts adequately',
        weight: 0.3
      }
    ],
    rubric: {
      scale: '1-5',
      levelDescriptions: {
        '1': 'Poor - Major issues',
        '2': 'Below Average - Several issues',
        '3': 'Average - Some issues',
        '4': 'Good - Minor issues only',
        '5': 'Excellent - No issues'
      }
    }
  });

  if (result.success) {
    console.log('Evaluation Results:');
    console.log('-------------------');

    result.scores.forEach(score => {
      console.log(`\n${score.criterion}: ${score.score}/${score.maxScore}`);
      console.log(`Justification: ${score.justification}`);
      console.log(`Improvement: ${score.improvement}`);
    });

    console.log('\n-------------------');
    console.log(`Overall Score: ${result.overallScore}`);
    console.log(`Weighted Score: ${result.weightedScore}`);
    console.log(`\nAssessment: ${result.summary.assessment}`);
    console.log(`\nStrengths:`);
    result.summary.strengths.forEach(s => console.log(`  - ${s}`));
    console.log(`\nWeaknesses:`);
    result.summary.weaknesses.forEach(w => console.log(`  - ${w}`));
    console.log(`\nEvaluation Time: ${result.metadata.evaluationTimeMs}ms`);
  } else {
    console.error('Evaluation failed:', result.summary.assessment);
  }
}

main().catch(console.error);
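The example prints both an overall score and a weighted score, but this file does not show how the weighted figure is derived. One plausible reading is a weighted mean of the per-criterion scores using the `weight` values passed to `agent.score`. The sketch below illustrates that arithmetic only. The `Criterion` and `CriterionScore` interfaces, the `weightedScore` helper, and the sample scores are all hypothetical and are not taken from the package; the real `EvaluatorAgent` may aggregate differently.

```typescript
// Hypothetical types: they mirror the fields used in the example above,
// not the package's actual exported types.
interface Criterion {
  name: string;
  weight: number;
}

interface CriterionScore {
  criterion: string;
  score: number;
}

// Weighted mean of per-criterion scores, normalised by the total weight
// of the criteria that actually received a score.
function weightedScore(criteria: Criterion[], scores: CriterionScore[]): number {
  const byName = new Map(scores.map(s => [s.criterion, s.score] as const));
  let total = 0;
  let weightSum = 0;
  for (const c of criteria) {
    const score = byName.get(c.name);
    if (score === undefined) continue; // skip criteria without a score
    total += score * c.weight;
    weightSum += c.weight;
  }
  if (weightSum === 0) {
    throw new Error('No scored criteria with non-zero weight');
  }
  return total / weightSum;
}

// Same names and weights as the criteria in the example.
const criteria: Criterion[] = [
  { name: 'Accuracy', weight: 0.4 },
  { name: 'Clarity', weight: 0.3 },
  { name: 'Completeness', weight: 0.3 },
];

// Made-up scores on the 1-5 rubric, for illustration only.
const scores: CriterionScore[] = [
  { criterion: 'Accuracy', score: 5 },
  { criterion: 'Clarity', score: 4 },
  { criterion: 'Completeness', score: 3 },
];

const weighted = weightedScore(criteria, scores);
console.log(weighted.toFixed(2));
```

With these sample scores the calculation is 5 × 0.4 + 4 × 0.3 + 3 × 0.3 = 4.1, whereas an unweighted mean of the same scores would be 4.0. Dividing by the summed weight keeps the result on the rubric's 1-5 scale even when the weights do not add up to exactly 1.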
