
agentation-self-driving

by @benjitaylor, v1.0.0

Autonomous design critique mode using the Agentation annotation toolbar. Use when the user asks to "critique this page," "add design annotations," "review the UI," "self-driving mode," "auto-annotate," or wants an AI agent to autonomously add design feedback annotations to a web page via the browser

Installation
npx skills add benjitaylor/agentation --skill agentation-self-driving

name: agentation-self-driving
description: Autonomous design critique mode using the Agentation annotation toolbar. Use when the user asks to "critique this page," "add design annotations," "review the UI," "self-driving mode," "auto-annotate," or wants an AI agent to autonomously add design feedback annotations to a web page via the browser. Requires the Agentation toolbar to be installed on the target page and agent-browser skill to be available.
allowed-tools: Bash(agent-browser:*)

Agentation Self-Driving Mode

Autonomously critique a web page by adding design annotations via the Agentation toolbar — in a visible headed browser so the user can watch the agent work in real time, like watching a self-driving car navigate.

Launch — Always Headed

The browser MUST be visible. Never run headless. The user watches you scan, hover, click, and annotate.

Preflight: Verify agent-browser is available before anything else:

command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }

Launch: Try opening directly first. Only close an existing session if the open command fails with a stale session error — this avoids killing a browser someone else is using:

# Try to open. If it fails (stale session), close first then retry.
agent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }
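
The one-liner above closes the session on any failure of the open command. A stricter sketch, assuming a stale session reports the "Browser not launched" message quoted under Troubleshooting, retries only in that case. The function name `open_headed` and the `AGENT_BROWSER` override variable are hypothetical, introduced here only so the logic can be shown and stubbed:

```shell
# Hypothetical helper: open a URL in a headed browser, closing the old
# session only when the failure looks like a stale session.
# AGENT_BROWSER is an assumed override so the CLI can be stubbed in tests.
open_headed() {
  url=$1
  ab=${AGENT_BROWSER:-agent-browser}
  if out=$("$ab" --headed open "$url" 2>&1); then
    printf '%s\n' "$out"
    return 0
  fi
  case $out in
    *"Browser not launched"*)
      # Stale session: close it and retry once.
      "$ab" close 2>/dev/null
      "$ab" --headed open "$url"
      ;;
    *)
      # Any other failure: leave existing sessions alone and report it.
      printf '%s\n' "$out" >&2
      return 1
      ;;
  esac
}
```

Usage would be `open_headed "http://localhost:3000"`, where the URL is a placeholder for the page under review.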

Then verify the Agentation toolbar is present and expand it:

# 1. Check toolbar exists on the page (data-feedback-toolbar is the root marker)
agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"
# If "NOT FOUND": Agentation is not installed on this page — stop and tell the user

# 2. Expand ONLY if collapsed (clicking when already expanded collapses it)
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"

# 3. Verify: take a snapshot and look for toolbar controls
agent-browser snapshot -i
# If expanded: you'll see "Block page interactions" checkbox, color buttons (Purple, Blue, etc.)
# If collapsed: you'll only see the small toggle button — retry step 2
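
Step 1 can be wrapped so that a missing toolbar stops the run instead of being overlooked in the output. This is a minimal sketch; `require_toolbar` and the `AGENT_BROWSER` override are hypothetical names, and the eval expression is the one from step 1:

```shell
# Hypothetical helper: fail fast when the Agentation toolbar is missing.
# AGENT_BROWSER is an assumed override so the CLI can be stubbed in tests.
require_toolbar() {
  ab=${AGENT_BROWSER:-agent-browser}
  result=$("$ab" eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'")
  case $result in
    *"NOT FOUND"*)
      echo "Agentation toolbar not found on this page; stopping." >&2
      return 1
      ;;
    *"toolbar found"*)
      return 0
      ;;
    *)
      # Anything else means the eval itself did not behave as expected.
      echo "Unexpected eval output: $result" >&2
      return 1
      ;;
  esac
}
```

A caller might run `require_toolbar || exit 1` before attempting to expand the toolbar.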

"Block page interactions" must be checked (default: on).

eval quoting rule: Always use [class*=toggleContent] (no quotes around the attribute value) in eval strings. Do not use a double-bang (!!) in eval: bash can treat it as history expansion (enabled by default in interactive shells) and rewrite the command before it runs. Do not use backslash-escaped inner quotes either, as they break unpredictably across shells.
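
To illustrate the quoting rule, the sketch below builds an existence check without a double-bang and without inner double quotes. The helper name `js_exists` is hypothetical. It only prints the JavaScript expression, and it assumes the selector contains no single quotes:

```shell
# Hypothetical helper: print a JavaScript existence check for a CSS selector.
# Uses "!== null" rather than a double-bang, and single quotes inside the
# JavaScript so the whole expression can sit in a double-quoted shell string.
js_exists() {
  printf "document.querySelector('%s') !== null\n" "$1"
}

# Attribute value left unquoted, as the rule recommends.
js_exists "[class*=toggleContent]"
```

The printed expression can then be passed along, for example `agent-browser eval "$(js_exists '[data-feedback-toolbar]')"`.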

Critical: How to Create Annotations

Standard element clicks (click @ref) do NOT trigger annotation dialogs. The Agentation overlay intercepts pointer events at the coordinate level. Use coordinate-based mouse events — this also makes the interaction visible in the browser as the cursor moves across the page.

@ref compatibility: Only click, fill, type, hover, focus, check, select, drag support @ref syntax. The commands scrollintoview, get box, and eval do NOT — they expect CSS selectors. Use eval with querySelector for scrolling and position lookup.

# 1. Take interactive snapshot — identify target element and build a CSS selector
agent-browser snapshot -i
# Example: snapshot shows  heading "Point at bugs." [ref=e10]
# Derive a CSS selector: 'h1', or more specific: 'h1:first-of-type'

# 2. Scroll the element into view via eval (NOT scrollintoview @ref — that breaks)
agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"

# 3. Get its bounding box via eval (NOT get box @ref — that also breaks)
agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"
# Returns: "383,245,200,40"  (parse these as x,y,width,height)

# 4. Move cursor to element center, then click
#    centerX = x + width/2,  centerY = y + height/2
agent-browser mouse move <centerX> <centerY>
agent-browser mouse down left
agent-browser mouse up left

# 5. Get the annotation dialog refs — read the FULL snapshot output
#    Dialog refs appear at the BOTTOM of the list, don't truncate with head/tail
agent-browser snapshot -i
# Look for: textbox "What should change?" and "Cancel" / "Add" buttons

# 6. Type critique — fill and click DO support @ref
agent-browser fill @<textboxRef> "Your critique here"

# 7. Submit (Add button enables after text is filled)
agent-browser click @<addRef>
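
The center arithmetic in step 4 can be sketched as a small helper. `bbox_center` is a hypothetical name. Rounding to whole pixels is an assumption, made because getBoundingClientRect can return fractional values. The helper also assumes the element is inside the viewport after scrolling, so the coordinates are non-negative:

```shell
# Hypothetical helper: turn "x,y,width,height" into "centerX centerY".
# Strips the double quotes that eval output may carry, then rounds
# each center coordinate to the nearest whole pixel.
bbox_center() {
  printf '%s\n' "$1" | tr -d '"' |
    awk -F, '{ printf "%d %d\n", int($1 + $3 / 2 + 0.5), int($2 + $4 / 2 + 0.5) }'
}

bbox_center '"383,245,200,40"'   # prints: 483 265
```

The two numbers are what steps 4's `mouse move <centerX> <centerY>` expects.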

If no dialog appears after clicking, the toolbar may have collapsed. Re-expand (only if collapsed) and retry:

agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'ok' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanded')"

Building CSS selectors from snapshots

The snapshot shows element roles, names, and refs. Map them to CSS selectors:

| Snapshot line | CSS selector |
|--------------|-------------|
| heading "Point at bugs." [ref=e10] | h1 or h1:first-of-type |
| button "npm install agentation Copy" [ref=e15] | button:has(code) or by text content via eval |
| link "Star on GitHub" [ref=e28] | a[href*=github] |
| paragraph (long text...) [ref=e20] | Target by section: section:nth-of-type(2) p |

When in doubt, use a broader selector and verify with eval:

agent-browser eval "document.querySelector('h2').textContent"
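
Because querySelector returns only the first match, a broad selector may silently pick the wrong element. The sketch below builds an expression that reports how many elements match and shows the start of the first one's text. `js_probe` is a hypothetical helper, and the 60-character cut-off is an arbitrary choice:

```shell
# Hypothetical helper: print a JavaScript expression that reports how many
# elements match a selector and the first 60 characters of the first match.
# Assumes the selector contains no single quotes.
js_probe() {
  printf "((els) => els.length + ' match(es); first: ' + (els[0] ? els[0].textContent.trim().slice(0, 60) : 'none'))(document.querySelectorAll('%s'))\n" "$1"
}

js_probe h2
```

Passing the result to eval, as in `agent-browser eval "$(js_probe h2)"`, shows whether the selector needs to be narrowed before using it for scrolling or position lookup.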

The Loop

Work top-to-bottom through the page. For each annotation:

  1. Scroll to the target area via eval (scrollIntoView)
  2. Pick a specific element — heading, paragraph, button, section container
  3. Get its bounding box via eval (getBoundingClientRect)
  4. Execute the coordinate-click sequence (mouse move, mouse down, mouse up)
  5. Read the full snapshot output to find dialog refs at the bottom
  6. Write the critique (fill @ref) and submit (click @ref)
  7. Verify the annotation was added (see below)
  8. Move to the next area
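
Steps 1 to 6 can be sketched as two helpers, split at the point where the snapshot has to be read to find the dialog refs. `click_center`, `submit_critique`, and the `AGENT_BROWSER` override are hypothetical names. The agent-browser subcommands are the ones used elsewhere in this document, and the selector is assumed to contain no single quotes:

```shell
# Hypothetical helpers sketching steps 1-6 of the loop for one element.
# AGENT_BROWSER is an assumed override so the CLI can be stubbed in tests.

# Steps 1-4: scroll to the element, find its center, coordinate-click it.
click_center() {
  sel=$1
  ab=${AGENT_BROWSER:-agent-browser}
  "$ab" eval "document.querySelector('$sel').scrollIntoView({block:'center'})" >/dev/null
  box=$("$ab" eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('$sel').getBoundingClientRect())")
  # Split "x,y,width,height" and compute the center in whole pixels.
  set -- $(printf '%s\n' "$box" | tr -d '"' |
    awk -F, '{ printf "%d %d", int($1 + $3 / 2 + 0.5), int($2 + $4 / 2 + 0.5) }')
  "$ab" mouse move "$1" "$2"
  "$ab" mouse down left
  "$ab" mouse up left
}

# Steps 5-6: after reading the full snapshot for the dialog refs,
# fill the textbox and press Add. Refs are passed without the leading "@".
submit_critique() {
  ab=${AGENT_BROWSER:-agent-browser}
  "$ab" fill "@$1" "$3"
  "$ab" click "@$2"
}
```

A run for one element would be `click_center h1`, then `agent-browser snapshot -i` to read the refs, then `submit_critique <textboxRef> <addRef> "Your critique here"`.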

Verifying annotations

After submitting each annotation, confirm the count increased:

agent-browser eval "document.querySelectorAll('[data-annotation-marker]').length"
# Should return the expected count (1 after first, 2 after second, etc.)

If the count didn't increase, the submission failed silently — re-snapshot and check if the dialog is still open.
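
The count check can be turned into a pass/fail step. This is a minimal sketch; `verify_annotation_count` and the `AGENT_BROWSER` override are hypothetical names, and the eval expression is the one shown above:

```shell
# Hypothetical helper: succeed only when the number of annotation markers
# on the page equals the expected count.
# AGENT_BROWSER is an assumed override so the CLI can be stubbed in tests.
verify_annotation_count() {
  expected=$1
  ab=${AGENT_BROWSER:-agent-browser}
  # Strip quotes and whitespace in case the CLI decorates the number.
  actual=$("$ab" eval "document.querySelectorAll('[data-annotation-marker]').length" | tr -d '"[:space:]')
  if [ "$actual" = "$expected" ]; then
    return 0
  fi
  echo "Expected $expected annotation(s), found ${actual:-none}; re-snapshot and check the dialog." >&2
  return 1
}
```

After the third submission, for example, `verify_annotation_count 3` would return non-zero if the dialog was still open and nothing was added.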

Aim for 5-8 annotations per page unless told otherwise.

What to Critique

| Area | What to look for |
|------|-----------------|
| Hero / above the fold | Headline hierarchy, CTA placement, visual grouping |
| Navigation | Label styling, category grouping, visual weight |
| Demo / illustrations | Clarity, depth, animation readability |
| Content sections | Spacing rhythm, callout treatments, typography hierarchy |
| Key taglines | Whether resonant lines get enough visual emphasis |
| CTAs and footer | Conversion weight, visual separation, final actions |

Critique Style

2-3 sentences max per annotation:

  • Specific and actionable: "Stack the install command below the subheading at 16px" not "fix the layout"
  • 1-2 concrete alternatives: Reference CSS values, layout patterns, or design systems
  • Name the principle: Visual hierarchy, Gestalt grouping, whitespace, emphasis, conversion design
  • Reference comparable products: "Like how Stripe/Linear/Vercel handles this"

Bad: "This section needs work"
Good: "This bullet list reads like docs, not a showcase. Use a 3-column card grid with icons, similar to Stripe's guidelines pattern. Creates visual rhythm and scannability."

Install

The skill must be symlinked into ~/.claude/skills/ for Claude Code to discover it:

ln -s "$(pwd)/skills/agentation-self-driving" ~/.claude/skills/agentation-self-driving

Restart Claude Code after installing. Verify with /agentation-self-driving — if it loads the skill instructions, the symlink is working.

Troubleshooting

  • "Browser not launched. Call launch first.": Stale session from a previous run — run agent-browser close 2>/dev/null then retry the --headed open command
  • Toolbar not found on page: Agentation isn't installed — run /agentation to set it up first
  • No dialog after clicking: Toolbar collapsed — re-expand with the state-aware eval (check [class*=expanded] first), retry
  • Wrong element targeted: Click Cancel, scroll to intended element, retry with correct coordinates
  • Add button stays disabled: Text wasn't filled — re-snapshot and fill the textbox
  • Page navigated: "Block page interactions" is off — enable via toolbar settings
  • Annotation count didn't increase: Submission failed — dialog may still be open, re-snapshot and check
  • Interrupted mid-run (Ctrl+C): The browser stays open with whatever state it was in. Run agent-browser close to clean up before starting a new session

agent-browser Pitfalls

These will silently break the workflow if you're not aware of them:

| Pitfall | What happens | Fix |
|---------|-------------|-----|
| scrollintoview @ref | Crashes: "Unsupported token @ref while parsing css selector" | Use eval "document.querySelector('sel').scrollIntoView({block:'center'})" |
| get box @ref | Same crash: get box parses refs as CSS selectors | Use eval "((r)=>r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('sel').getBoundingClientRect())" |
| eval with double-bang | Bash expands double-bang as history substitution before the command runs | Use expr !== null or expr ? true : false instead |
| eval with backslash-escaped quotes | Escaped inner quotes break across shells | Drop the quotes: [class*=toggleContent] works for simple values without spaces |
| snapshot -i \| head -50 | Annotation dialog refs (textbox "What should change?", Add, Cancel) appear at the BOTTOM of the snapshot | Always read the full snapshot output; never truncate |
| click @ref on overlay elements | The click goes through to the real DOM, bypassing the Agentation overlay | Use mouse move, mouse down left, mouse up left for coordinate-based clicks that the overlay intercepts |
| --headed open fails with "Browser not launched" | Stale sessions from previous runs block new launches | Run agent-browser close 2>/dev/null then retry the open command |

Rule of thumb: @ref works for interaction commands (click, fill, type, hover). For everything else (eval, get, scrollintoview), use CSS selectors via querySelector in an eval.

Two-Session Workflow (Full Self-Driving)

With MCP connected (toolbar shows "MCP Connected"), annotations auto-send to any listening agent. This enables:

  • Session 1 (this skill): Watches the page, adds critique annotations in the visible browser
  • Session 2: Runs agentation_watch_annotations in a loop, receives annotations, edits code to address each one

The user watches Session 1 drive through the page in the browser while Session 2 fixes issues in the codebase — fully autonomous design review and implementation.

Version: 1.0.0
Last updated: March 16, 2026
Compatible platform: Claude Code

Created: March 16, 2026