Browser Skill for Agents

Give your AI agents the ability to control the browser

Overview

Agentastic's built-in browser can be controlled programmatically by AI agents via dev browser commands. To teach your agent how to use this, add a browser skill file to your project.

A skill file is a markdown document that gets loaded into the agent's context, teaching it what commands are available and how to use them. When the skill is active, the agent can open pages, snapshot the DOM, click elements, fill forms, handle dialogs, manage cookies, and verify UI state — all from the terminal.

Adding the Skill

Option A: Download from URL (recommended)

Point your agent to the hosted skill file — it will read the URL and learn all available commands:

https://agentastic.dev/skills/browser.md

For Claude Code, you can also pre-approve the dev browser commands in your project's .claude/settings.json, so the agent does not ask for permission on every call:

    {
      "permissions": {
        "allow": ["Bash(dev browser *)"]
      }
    }

Then tell your agent: "Read https://agentastic.dev/skills/browser.md to learn how to use the browser, then check the UI at localhost:3000"

Option B: Add to your project

Create the file .claude/skills/browser/SKILL.md in your project root:

    mkdir -p .claude/skills/browser
    curl -o .claude/skills/browser/SKILL.md https://agentastic.dev/skills/browser.md

This gives the agent auto-discovery — it loads the skill automatically when you ask it to interact with a browser.

How It Works

When the skill file is in your project, agents automatically discover it:

  1. Auto-invocation: The description field in the skill file's frontmatter tells the agent when to activate the skill. When you ask it to "check the UI" or "test the login page", the agent loads the skill and knows how to use dev browser.
  2. Pre-authorized tools: The allowed-tools: Bash(dev browser *) line in the same frontmatter lets the agent run any dev browser command without asking for permission each time.
  3. No setup required: The AGENTASTIC_SOCKET_PATH environment variable is injected automatically into every Agentastic terminal, so the agent just runs the commands.
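Because the commands depend on AGENTASTIC_SOCKET_PATH being present, a script that calls dev browser can check for it up front. The following is a minimal sketch; the function name and messages are hypothetical, and only the variable name comes from this page.

```shell
# Sketch: guard for scripts that call `dev browser`.
# AGENTASTIC_SOCKET_PATH is the variable named in the docs; the function
# name and the messages are illustrative assumptions.
require_agentastic_terminal() {
  if [ -z "${AGENTASTIC_SOCKET_PATH:-}" ]; then
    echo "AGENTASTIC_SOCKET_PATH is not set; run this inside an Agentastic terminal" >&2
    return 1
  fi
  echo "Agentastic socket: $AGENTASTIC_SOCKET_PATH"
}
```

Calling require_agentastic_terminal at the top of a script makes a missing variable fail early with a readable message instead of a confusing command error later.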

What the Agent Can Do

The browser skill teaches agents 70+ commands across these categories:

| Category | Commands |
| --- | --- |
| Navigate | navigate, back, forward, reload |
| Inspect | snapshot, screenshot, highlight |
| Interact | click, dblclick, fill, type, press, keydown, keyup, hover, focus, check, uncheck, select, scroll |
| Wait | wait selector, wait text, wait url, wait load |
| Query | get url/title/text/html/value/attr/count/box/styles, is visible/enabled/checked |
| Find | find role/text/label/placeholder/testid/alt/title/first/last/nth |
| Console | console list, console errors, console clear, errors list |
| Dialogs | dialog accept, dialog dismiss |
| Cookies | cookies get, cookies set, cookies clear |
| Storage | storage get, storage set, storage clear |
| State | state save, state load |
| Frames | frame select, frame main |
| Tabs | list, open, close, focus_tab, tab new/list/switch/close |
| Injection | addinitscript, addscript, addstyle |
| Downloads | download wait |

See the full Browser CLI Reference for detailed documentation on every command.

Usage Examples

Once the skill is added, you can tell your agent things like:

Verify a UI change

"Open localhost:3000 in the browser and check if the new header shows up"

The agent will:

    dev browser navigate "http://localhost:3000"
    dev browser wait load
    dev browser snapshot
    dev browser is visible ".new-header"
    dev browser get text ".new-header h1"

Test a form

"Test the signup form at localhost:3000/signup with a fake user"

The agent will:

    dev browser navigate "http://localhost:3000/signup"
    dev browser wait load
    dev browser snapshot
    dev browser fill "@e2" "Jane Doe"
    dev browser fill "@e3" "jane@example.com"
    dev browser fill "@e4" "Password123!"
    dev browser click "@e5"
    dev browser wait url "/welcome" --timeout 5000
    dev browser get title
    dev browser console errors

Handle a confirmation dialog

"Click the delete button and accept the confirmation"

The agent will:

    # Pre-configure the dialog response before triggering it
    dev browser dialog accept
    dev browser click "@e7"
    dev browser wait text "Deleted successfully"

Check cookies and storage

"Check what cookies and local storage the site is setting"

The agent will:

    dev browser navigate "http://localhost:3000"
    dev browser wait load
    dev browser cookies get
    dev browser storage get local
    dev browser storage get session

Debug a visual bug

"The button on the homepage looks broken, check what CSS is applied"

The agent will:

    dev browser navigate "http://localhost:3000"
    dev browser wait load
    dev browser snapshot
    dev browser get styles "@e5" "display,padding,margin,background-color,border"
    dev browser get box "@e5"
    dev browser is visible "@e5"

Interact with an iframe

"Fill in the embedded form inside the iframe"

The agent will:

    dev browser snapshot
    dev browser frame select "iframe#payment"
    dev browser snapshot
    dev browser fill "@e2" "4242424242424242"
    dev browser fill "@e3" "12/28"
    dev browser click "@e4"
    dev browser frame main

Monitor for errors

"Reload the page and check if there are any console errors"

The agent will:

    dev browser reload
    dev browser wait load
    dev browser console errors
    dev browser errors list

Customizing the Skill

You can tailor the skill to your project by adding project-specific instructions at the end of the file:

    ## Project-Specific Notes

    - Dev server runs on port 5173 (Vite)
    - Login credentials for testing: user@test.com / test1234
    - The app uses React Router, so always `wait url` after navigation
    - Use `data-testid` attributes for targeting; run `find testid "submit-btn"` instead of CSS selectors
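One way to add such notes without opening an editor is to append them from the shell. This is a sketch only: the helper name is hypothetical, the default path assumes the project-level location described earlier, and the note text is a shortened version of the example above.

```shell
# Sketch: append project notes to an existing skill file.
# `append_project_notes` is a hypothetical helper, not part of Agentastic.
append_project_notes() {
  skill_file="${1:-.claude/skills/browser/SKILL.md}"
  # Quoted 'EOF' keeps the backticks literal instead of running them.
  cat >> "$skill_file" <<'EOF'

## Project-Specific Notes

- Dev server runs on port 5173 (Vite)
- Use `data-testid` attributes for targeting
EOF
}
```

Running append_project_notes with no argument targets the project-level file; pass another path to target a different copy. Note that calling it twice appends the section twice.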

Scope Options

| Location | Who gets it | Best for |
| --- | --- | --- |
| .claude/skills/browser/SKILL.md | Any agent in this project | Project-specific browser testing |
| ~/.claude/skills/browser/SKILL.md | Any agent on your machine | All your projects |
| URL: https://agentastic.dev/skills/browser.md | Any agent you share it with | Quick setup, always up-to-date |

For most users, the project-level skill (.claude/skills/browser/) is the right choice — it's checked into git and shared with the team. Use the URL when you want to quickly give an agent browser capabilities without committing files.
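The two file-based scopes differ only in where the skill file lives, so installation can be sketched as one function with a scope argument. The paths and URL are the ones listed above; the function names are hypothetical.

```shell
# Sketch: install the skill at project or user scope.
# Function names are illustrative; paths and URL come from the table above.
skill_path() {
  case "$1" in
    project) echo ".claude/skills/browser/SKILL.md" ;;
    user)    echo "$HOME/.claude/skills/browser/SKILL.md" ;;
    *)       echo "usage: skill_path project|user" >&2; return 1 ;;
  esac
}

install_browser_skill() {
  dest="$(skill_path "$1")" || return 1
  # -f makes curl fail on HTTP errors instead of saving an error page.
  mkdir -p "$(dirname "$dest")" &&
    curl -fsSL -o "$dest" "https://agentastic.dev/skills/browser.md"
}
```

For example, install_browser_skill user would place the file under your home directory so every project on the machine can pick it up.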

Compatibility

The browser skill works with any agent that supports the skill file format:

  • Claude Code: Full support (auto-discovery, auto-invocation)
  • Other agents: Can read the skill file as context (via the URL), but may need to be told explicitly to run dev browser commands

The dev browser commands themselves work from any terminal session inside Agentastic, regardless of which agent is running.
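Since the commands run from any Agentastic terminal, they can also be combined in a small script without an agent. The sketch below chains commands that appear on this page; the function name is hypothetical, and it assumes (unconfirmed here) that dev browser exits with a non-zero status when a command fails.

```shell
# Sketch: run a basic page check from an Agentastic terminal.
# Assumption: `dev browser` returns a non-zero exit status on failure.
browser_smoke_check() {
  url="${1:-http://localhost:3000}"
  # Each step runs only if the previous one succeeded.
  dev browser navigate "$url" &&
    dev browser wait load &&
    dev browser get title &&
    dev browser console errors
}
```

Calling browser_smoke_check "http://localhost:5173" runs the four commands in order and stops at the first one that fails.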