
Agent Skills for Context Engineering

A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems.

Author: muratcankoylan (GitHub)

Files: 339
Skill: n/a
Size: 4.3 MB
Entrypoint: SKILL.md
Format: git-repo


researcher/benchmarks/effectiveness/tasks/001-filesystem-context-offload/verify.sh


#!/usr/bin/env bash
# Verifier for task 001-filesystem-context-offload.
# Runs inside the temp workspace built by the SDK runner. Exit 0 = task passed.
set -uo pipefail

EXPECTED_VALUE="8475"
EXPECTED_LINE="API_RATE_LIMIT=${EXPECTED_VALUE}"

# Check 1: the agent actually located the right value in its final response.
# The runner writes the agent's final assistant text to .runner/final.txt before invoking verify.
if [ ! -f .runner/final.txt ]; then
    echo "verify: missing .runner/final.txt (runner did not stage final response)" >&2
    exit 11
fi

# -F matches the expected line as a literal string rather than as a regex.
if ! grep -qF -- "${EXPECTED_LINE}" .runner/final.txt; then
    echo "verify: final response does not contain ${EXPECTED_LINE}" >&2
    exit 12
fi

# Check 2 (skill-behavior signal): scratch directory exists.
# A missing scratch/ is recorded in the notes file but does not fail the task.
if [ ! -d scratch ]; then
    echo "verify: no scratch/ directory; agent did not offload (still counts as task pass on response, but logged)" >&2
    echo "scratch_dir_missing" > .runner/notes.txt
    exit 0
fi

# Check 3 (skill-behavior signal): some file in scratch/ mentions API_RATE_LIMIT,
# which suggests it holds lines copied from tool_output.txt.
# An empty scratch/ takes the else branch, so a note is always written.
shopt -s nullglob
scratch_files=(scratch/*)
if [ "${#scratch_files[@]}" -gt 0 ] && grep -qF "API_RATE_LIMIT" "${scratch_files[@]}" 2>/dev/null; then
    echo "scratch_used" > .runner/notes.txt
else
    echo "scratch_empty_or_unrelated" > .runner/notes.txt
fi

exit 0
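The verifier reports its result through two channels: the exit status (0, 11, or 12) and an optional note in .runner/notes.txt. As a rough sketch of how a caller might fold both into one summary line, the helper below maps them to a label. The function name `summarize_verify` and the label strings are hypothetical and do not come from the repository; the real SDK runner may consume these signals differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): turn the verifier's exit status
# and the optional .runner/notes.txt into a single summary line.
set -uo pipefail

summarize_verify() {
    local status="$1" workspace="$2" note="none"
    # The verifier writes a note only on the scratch-related checks.
    if [ -f "${workspace}/.runner/notes.txt" ]; then
        note="$(cat "${workspace}/.runner/notes.txt")"
    fi
    case "${status}" in
        0)  echo "pass note=${note}" ;;
        11) echo "fail reason=missing_final_response" ;;
        12) echo "fail reason=value_not_in_response" ;;
        *)  echo "fail reason=unexpected_status_${status}" ;;
    esac
}

# Demo in a throwaway workspace that mimics the layout the verifier expects.
demo_ws="$(mktemp -d)"
mkdir -p "${demo_ws}/.runner"
echo "scratch_used" > "${demo_ws}/.runner/notes.txt"
summarize_verify 0 "${demo_ws}"     # prints: pass note=scratch_used
summarize_verify 12 "${demo_ws}"    # prints: fail reason=value_not_in_response
rm -rf "${demo_ws}"
```

In practice the status would come from `$?` right after running the verifier inside the workspace. Keeping the pass/fail decision on the exit status and the skill-behavior signal in the note lets a caller score task success and offloading behavior separately.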
