run

by @alirezarezvaniv
Rating: 4.3 (20 ratings)

Executes a single experiment iteration: review the history, decide on a change, edit the code, commit, and evaluate the result. Each run follows the same steps, which keeps the experimental process systematic.

experiment-tracking, scientific-method, research-methodology, automation, iterative-testing, GitHub
Installation
npx skills add alirezarezvani/claude-skills --skill run
Before / After Comparison

Before

Experiment history is recorded by hand, and each cycle of analyzing results, designing the next experiment, modifying code, and running verification is done manually. The process is disorganized, so the effect of individual variables is hard to track and results are not reproducible.

After

Experiment history and context are managed automatically, and each iteration runs through the same systematic steps. Every change is recorded and evaluated, so results are traceable and reproducible.

SKILL.md

run

/ar:run — Single Experiment Iteration

Run exactly ONE experiment iteration: review history, decide a change, edit, commit, evaluate.

Usage

/ar:run engineering/api-speed              # Run one iteration
/ar:run                                     # List experiments, let user pick

What It Does

Step 1: Resolve experiment

If no experiment is specified, run python {skill_path}/scripts/setup_experiment.py --list and ask the user to pick one.

Step 2: Load context

# Read experiment config
cat .autoresearch/{domain}/{name}/config.cfg

# Read strategy and constraints
cat .autoresearch/{domain}/{name}/program.md

# Read experiment history
cat .autoresearch/{domain}/{name}/results.tsv

# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}
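The context-loading step above could also be sketched in Python. This is a minimal illustration, not the skill's own implementation: the function name `load_context` is hypothetical, `config.cfg` is read as raw text because its format is not described here, and `results.tsv` is assumed to be tab-separated with a header row.

```python
import csv
from pathlib import Path


def load_context(root, domain, name):
    """Read the three context files for one experiment.

    Assumes the .autoresearch/{domain}/{name}/ layout shown above and that
    results.tsv is tab-separated with a header row (an assumption).
    """
    exp_dir = Path(root) / ".autoresearch" / domain / name

    # Config and strategy are kept as plain text; no format is assumed.
    config_text = (exp_dir / "config.cfg").read_text(encoding="utf-8")
    program_text = (exp_dir / "program.md").read_text(encoding="utf-8")

    # A brand-new experiment may not have any history yet.
    results_path = exp_dir / "results.tsv"
    history = []
    if results_path.exists():
        with results_path.open(newline="", encoding="utf-8") as fh:
            history = list(csv.DictReader(fh, delimiter="\t"))

    return {"config": config_text, "program": program_text, "history": history}
```

Returning the history as a list of dictionaries keeps the later review step independent of whatever columns the file actually contains.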

Step 3: Decide what to try

Review results.tsv:

  • What changes were kept? What pattern do they share?

  • What was discarded? Avoid repeating those approaches.

  • What crashed? Understand why.

  • How many runs so far? (Escalate strategy accordingly)
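The review questions above amount to grouping past runs by outcome. A rough sketch follows, assuming each history row carries a `status` column holding keep, discard, or crash. That column name is hypothetical; the real header of results.tsv is not documented here.

```python
from collections import Counter


def summarize_history(rows, status_key="status"):
    """Count past runs by outcome.

    `rows` is a list of dicts (one per run). `status_key` is a hypothetical
    column name; adjust it to match the actual results.tsv header.
    """
    counts = Counter((row.get(status_key) or "unknown").lower() for row in rows)
    return {
        "total_runs": len(rows),
        "kept": counts.get("keep", 0),
        "discarded": counts.get("discard", 0),
        "crashed": counts.get("crash", 0),
    }
```

The total feeds the strategy decision, while the per-outcome counts show how often recent ideas are paying off.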

Strategy escalation:

  • Runs 1-5: Low-hanging fruit (obvious improvements)

  • Runs 6-15: Systematic exploration (vary one parameter)

  • Runs 16-30: Structural changes (algorithm swaps)

  • Runs 31+: Radical experiments (completely different approaches)
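The escalation schedule can be expressed as a small lookup. In this sketch the function name `strategy_for_run` is hypothetical, `run_number` is taken to be the number of the run about to start (runs so far plus one), and run 30 is assigned to the structural tier.

```python
def strategy_for_run(run_number):
    """Map the upcoming run number to a strategy tier from the list above."""
    if run_number < 1:
        raise ValueError("run_number must be at least 1")
    if run_number <= 5:
        return "low-hanging fruit"
    if run_number <= 15:
        return "systematic exploration"
    if run_number <= 30:
        return "structural changes"
    return "radical experiments"
```

For example, with 15 rows already in the history, the next run is number 16 and falls into the structural tier.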

Step 4: Make ONE change

Edit only the target file specified in config.cfg. Change one thing. Keep it simple.
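One way to check the "target file only" constraint before committing is to compare the list of changed paths against the target. The helper below is a hypothetical sketch; the changed paths could come from `git diff --name-only`, and the target path is whatever config.cfg names.

```python
import posixpath


def unexpected_changes(changed_paths, target):
    """Return every changed path that is not the experiment's target file.

    Paths are normalized so "./src/app.py" and "src/app.py" compare equal.
    An empty result means only the target (or nothing) was modified.
    """
    target_norm = posixpath.normpath(target)
    return [p for p in changed_paths if posixpath.normpath(p) != target_norm]
```

A non-empty result, especially one containing evaluate.py, is a signal to undo the stray edits before committing.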

Step 5: Commit and evaluate

git add {target}
git commit -m "experiment: {short description of what changed}"

python {skill_path}/scripts/run_experiment.py \
  --experiment {domain}/{name} --single

Step 6: Report result

Read the script output. Tell the user:

  • KEEP: "Improvement! {metric}: {value} ({delta} from previous best)"

  • DISCARD: "No improvement. {metric}: {value} vs best {best}. Reverted."

  • CRASH: "Evaluation failed: {reason}. Reverted."
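The three report messages can be produced from one helper. This is an illustrative sketch: the function name and keyword arguments are hypothetical, and only the message templates come from the list above.

```python
def format_report(status, metric=None, value=None, delta=None, best=None, reason=None):
    """Build the user-facing message for a KEEP, DISCARD, or CRASH outcome."""
    status = status.upper()
    if status == "KEEP":
        return f"Improvement! {metric}: {value} ({delta} from previous best)"
    if status == "DISCARD":
        return f"No improvement. {metric}: {value} vs best {best}. Reverted."
    if status == "CRASH":
        return f"Evaluation failed: {reason}. Reverted."
    raise ValueError(f"unknown status: {status}")
```

Raising on an unknown status keeps an unexpected script output from being reported as a success.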

Step 7: Self-improvement check

After every 10th experiment (check results.tsv line count), update the Strategy section of program.md with patterns learned.
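The line-count check could look like the sketch below. It assumes results.tsv has one header row followed by one row per run, which is not confirmed here, so `has_header` is exposed as a parameter. The function name is hypothetical.

```python
def strategy_update_due(results_text, interval=10, has_header=True):
    """Return True when the number of recorded runs is a positive multiple of `interval`.

    Blank lines are ignored. `has_header` reflects the assumption that the
    first non-blank line of results.tsv is a column header.
    """
    lines = [ln for ln in results_text.splitlines() if ln.strip()]
    runs = len(lines) - 1 if (has_header and lines) else len(lines)
    return runs > 0 and runs % interval == 0
```

If the file turns out to have no header, passing `has_header=False` avoids an off-by-one that would delay each update by one run.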

Rules

  • ONE change per iteration. Don't change 5 things at once.

  • NEVER modify the evaluator (evaluate.py). It's ground truth.

  • Simplicity wins. Equal performance with simpler code is an improvement.

  • No new dependencies.
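The "no new dependencies" rule can be partly checked by comparing imports before and after the edit. The sketch below assumes the target file is Python, which the skill does not require, and both function names are hypothetical. A new import is only a hint: it may be a standard-library module rather than a new dependency.

```python
import ast


def imported_modules(source):
    """Collect top-level module names imported anywhere in Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) stay inside the project, so skip them.
            modules.add(node.module.split(".")[0])
    return modules


def new_imports(before_source, after_source):
    """Return modules imported after the edit that were not imported before."""
    return imported_modules(after_source) - imported_modules(before_source)
```

An empty set means the edit introduced no new imports; anything else deserves a look before the change is committed.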

Weekly Installs: 448
Repository: alirezarezvani/…e-skills
GitHub Stars: 10.1K
First Seen: Mar 13, 2026
Security Audits: Gen Agent Trust Hub (Warn), Socket (Pass), Snyk (Pass)
Installed on: opencode (423), codex (422), cursor (422), gemini-cli (421), github-copilot (421), amp (420)

Statistics

Installs: 1.2K
Rating: 4.3 / 5.0
Version: not listed
Updated: May 19, 2026
Comparisons: 1

User Rating

4.3 (20 ratings)
5 stars: 85%
4 stars: 15%
3 stars: 0%
2 stars: 0%
1 star: 0%

Compatible Platforms

🔧Claude Code

Timeline

Created: April 9, 2026
Last Updated: May 19, 2026