ai-paper-reproduction
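Reproduces the code repository of an AI paper: automatically analyzes the README, identifies training scripts, performs a minimal trustworthy run, and outputs a standardized reproduction report.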
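npx skills add lllllllama/ai-paper-reproduction-skill --skill ai-paper-reproduction
Before / After comparison
Before: Manually reading the paper and code, understanding the experimental setup, configuring parameters, running training, and collating results takes 1-2 days per paper, and repeatability is hard to guarantee.
After: Automatically analyzing the code structure, identifying key commands, performing a minimal trustworthy run, and generating a structured report takes only 2-4 hours per paper, and the result can be verified repeatedly.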
Use when
- The user wants Codex to reproduce an AI paper repository.
- The target is a code repository with a README, scripts, configs, or documented commands.
- The goal is a minimal trustworthy run, not unlimited experimentation.
- The user needs standardized outputs that another human or model can audit quickly.
- The task spans more than one stage, such as intake plus setup, or setup plus execution plus reporting.
Do not use when
- The task is a general literature review or paper summary.
- The task is to design a new model, benchmark suite, or training pipeline from scratch.
- The repository is not centered on AI or does not expose a documented reproduction path.
- The user primarily wants a deep code refactor rather than README-first reproduction.
- The user is explicitly asking for only one narrow phase that a sub-skill already covers cleanly.
Success criteria
- README is treated as the primary source of reproduction intent.
- A minimum trustworthy target is selected and justified.
- Documented inference is preferred over evaluation, and evaluation is preferred over training.
- Any repo edits remain conservative, explicit, and auditable.
- repro_outputs/ is generated with consistent structure and stable machine-readable fields.
- Final user-facing explanation is short and follows the user's language when practical.
Interaction and usability policy
- Keep the workflow simple enough for a new user to understand quickly.
- Prefer short, concrete plans over exhaustive research.
- Expose commands, assumptions, blockers, and evidence.
- Avoid turning the skill into an opaque automation layer.
- Preserve a low learning cost for both humans and downstream agents.
Language policy
- Human-readable Markdown outputs should follow the user's language when it is clear.
- If the user's language is unclear, default to concise English.
- Machine-readable fields, filenames, keys, and enum values stay in stable English.
- Paths, package names, CLI commands, config keys, and code identifiers remain unchanged.
See references/language-policy.md.
Reproduction policy
Core priority order:
- documented inference
- documented evaluation
- documented training startup or partial verification
- full training only when the user explicitly asks later
Rules:
- README-first: use repository files to clarify, not casually override, the README.
- Aim for minimal trustworthy reproduction rather than maximum task coverage.
- Treat smoke tests, startup verification, and early-step checks as valid training evidence when full training is not appropriate.
- Record unresolved gaps rather than fabricating confidence.
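The core priority order can be illustrated with a small selection routine. This is a minimal sketch, not the skill's actual implementation: the `DocumentedCommand` type, the `kind` labels, and the `select_target` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Priority order from the policy: a lower rank is preferred.
# The string labels are hypothetical; the skill does not define them here.
PRIORITY = {
    "inference": 0,
    "evaluation": 1,
    "training_startup": 2,
    "full_training": 3,
}


@dataclass(frozen=True)
class DocumentedCommand:
    """A command documented in the README (hypothetical structure)."""
    kind: str
    command: str


def select_target(
    commands: Sequence[DocumentedCommand],
    allow_full_training: bool = False,
) -> Optional[DocumentedCommand]:
    """Return the highest-priority documented command, or None if none qualifies."""
    candidates = [
        c for c in commands
        if c.kind in PRIORITY
        and (c.kind != "full_training" or allow_full_training)
    ]
    # min() returns the first minimal element, so README order breaks ties.
    return min(candidates, key=lambda c: PRIORITY[c.kind], default=None)
```

Full training is excluded unless `allow_full_training` is set, which mirrors the rule that it runs only when the user explicitly asks for it. Returning `None` rather than guessing keeps an unresolved gap visible so it can be recorded.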
Patch policy
- Prefer no code changes.
- Prefer safer adjustments first:
  - command-line arguments
  - environment variables
  - path fixes
  - dependency version fixes
  - dependency file fixes such as requirements.txt or environment.yml
- Avoid changing:
  - model architecture
  - core inference semantics
  - core training logic
  - loss functions
  - experiment meaning
- If repository files must change:
  - create a patch branch first using repro/YYYY-MM-DD-short-task
  - apply low-risk changes before medium-risk changes
  - avoid high-risk changes by default
  - commit only verified groups of changes
  - keep verified patch commits sparse, usually 0-2
  - use commit messages in the form repro: <scope> for documented <command>
See references/patch-policy.md.
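The branch and commit-message formats named in the patch policy can be generated mechanically. The sketch below is illustrative only: the helper names `patch_branch_name` and `patch_commit_message` are hypothetical, and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) is an assumption, since the policy only gives the `repro/YYYY-MM-DD-short-task` pattern.

```python
import re
from datetime import date
from typing import Optional


def patch_branch_name(short_task: str, today: Optional[date] = None) -> str:
    """Build a branch name of the form repro/YYYY-MM-DD-short-task."""
    day = today or date.today()
    # Assumed slug rule: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", short_task.lower()).strip("-")
    if not slug:
        raise ValueError("short_task must contain at least one letter or digit")
    return f"repro/{day.isoformat()}-{slug}"


def patch_commit_message(scope: str, command: str) -> str:
    """Build a message of the form 'repro: <scope> for documented <command>'."""
    return f"repro: {scope} for documented {command}"
```

For example, `patch_branch_name("Fix data path", date(2025, 1, 2))` returns `repro/2025-01-02-fix-data-path`, and `patch_commit_message("path fix", "inference")` returns `repro: path fix for documented inference`.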
Workflow
- Read README and repo signals.
- Call repo-intake-and-plan to scan the repository and extract documented commands.
- Select the smallest trustworthy reproduction target.
- Call env-and-assets-bootstrap to prepare environment assumptions and asset paths.
- Run a conservative smoke check or documented command with minimal-run-and-audit.
- Use paper-context-resolver only if README and repo files leave a narrow reproduction-critical gap that blocks the current target.
- Write the standardized outputs.
- Give the user a short final note in the user's language.
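The ordering of the workflow, including the conditional use of paper-context-resolver, can be summarized in a short sketch. The function `plan_stages` and its `has_blocking_gap` flag are hypothetical; only the sub-skill names come from the workflow itself, and how the skill actually invokes them is not shown here.

```python
from typing import List


def plan_stages(has_blocking_gap: bool) -> List[str]:
    """Return the ordered workflow stages as plain labels.

    has_blocking_gap is a hypothetical flag: True when README and repo files
    leave a narrow reproduction-critical gap that blocks the current target.
    """
    stages = [
        "read README and repo signals",
        "repo-intake-and-plan",
        "select smallest trustworthy target",
        "env-and-assets-bootstrap",
        "minimal-run-and-audit",
    ]
    if has_blocking_gap:
        # The resolver is optional and only used to close a blocking gap.
        stages.append("paper-context-resolver")
    stages += ["write standardized outputs", "short final note to user"]
    return stages
```

Keeping the resolver behind an explicit condition reflects the README-first rule: paper context is consulted only when the repository alone cannot unblock the chosen target.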
Required outputs
Always target:
repro_outputs/
  SUMMARY.md
  COMMANDS.md
  LOG.md
  status.json
  PATCHES.md   # only if patches were applied
Use the templates under assets/ and the field rules in references/output-spec.md.
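A quick completeness check for the output directory might look like the following. It is a sketch under stated assumptions: the function `missing_outputs` and its `patches_applied` parameter are hypothetical, and it checks only that the listed files exist, not the field rules in references/output-spec.md.

```python
from pathlib import Path
from typing import List

# File names taken from the required outputs list.
REQUIRED = ("SUMMARY.md", "COMMANDS.md", "LOG.md", "status.json")
PATCH_REPORT = "PATCHES.md"  # expected only if patches were applied


def missing_outputs(root: Path, patches_applied: bool) -> List[str]:
    """List expected files that are absent from <root>/repro_outputs/."""
    out_dir = root / "repro_outputs"
    expected = list(REQUIRED)
    if patches_applied:
        expected.append(PATCH_REPORT)
    return [name for name in expected if not (out_dir / name).is_file()]
```

An empty return value means every expected file is present; any names returned can be recorded as unresolved gaps.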
Reporting policy
- Put the shortest high-value summary in SUMMARY.md.
- Put copyable commands in COMMANDS.md.
- Put process evidence, assumptions, failures, and decisions in LOG.md.
- Put durable machine-readable state in status.json.
- Put branch, commit, validation, and README-fidelity impact in PATCHES.md when needed.
- Distinguish verified facts from inferred guesses.
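One way to keep verified facts separate from inferred guesses in a report is to label each entry explicitly. The sketch below is only an illustration: the `Finding` type, the `[verified]` / `[inferred]` labels, and `render_findings` are hypothetical and are not taken from the skill's templates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Finding:
    """One report entry (hypothetical structure)."""
    text: str
    verified: bool  # True only when backed by an executed command or a file that was read


def render_findings(findings: Iterable[Finding]) -> str:
    """Render findings as a bullet list with an explicit evidence label."""
    lines = []
    for finding in findings:
        label = "verified" if finding.verified else "inferred"
        lines.append(f"- [{label}] {finding.text}")
    return "\n".join(lines)
```

Labels of this kind would stay in English under the language policy, since they act as stable enum-like values, while the finding text can follow the user's language.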
Maintainability notes
- Keep this skill narrow: README-first AI repo reproduction only.
- Push specialized logic into sub-skills or helper scripts.
- Prefer stable templates and simple schemas over ad hoc prose.
- Keep machine-readable outputs backward compatible when possible.
- Add new evidence sources only when they improve auditability without raising learning cost.