Browser Skill for Agents

Give your AI agents the ability to control the browser

Overview

Agentastic's built-in browser can be controlled programmatically by AI agents via dev browser commands. To teach your agent how to use these commands, add a browser skill file to your project.

A skill file is a markdown document that gets loaded into the agent's context, teaching it what commands are available and how to use them. When the skill is active, the agent can open pages, snapshot the DOM, click elements, fill forms, handle dialogs, manage cookies, and verify UI state — all from the terminal.

Adding the Skill

Option A: Download from URL (recommended)

Point your agent to the hosted skill file — it will read the URL and learn all available commands:

https://agentastic.dev/skills/browser.md

For Claude Code, you can also pre-approve the dev browser commands in your project's .claude/settings.json, so the agent is not prompted for permission on each call:

{
  "permissions": {
    "allow": ["Bash(dev browser *)"]
  }
}

Then tell your agent: "Read https://agentastic.dev/skills/browser.md to learn how to use the browser, then check the UI at localhost:3000"

Option B: Add to your project

Create the file .claude/skills/browser/SKILL.md in your project root:

mkdir -p .claude/skills/browser
# -f fails on HTTP errors instead of saving an error page; -L follows redirects
curl -fsSL -o .claude/skills/browser/SKILL.md https://agentastic.dev/skills/browser.md

This gives the agent auto-discovery — it loads the skill automatically when you ask it to interact with a browser.

How It Works

When the skill file is in your project, agents automatically discover it:

Auto-invocation: The description field in the skill file's frontmatter tells the agent when to activate the skill. When you ask it to "check the UI" or "test the login page", the agent loads the skill and knows how to use dev browser.
Pre-authorized tools: The allowed-tools: Bash(dev browser *) line in the same frontmatter lets the agent run any dev browser command without asking for permission each time.
No setup required: The AGENTASTIC_SOCKET_PATH environment variable is injected automatically into every Agentastic terminal, so the agent can run the commands directly.
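
To picture the two frontmatter fields named above, here is a minimal sketch. The contents of the hosted skill file are not reproduced in this page, so the name and description values below are illustrative assumptions; only the allowed-tools line is taken from the text above. The snippet writes a stand-in file to a scratch directory, so it does not touch a real skill in your project.

```shell
# Write an illustrative SKILL.md to a scratch directory (not the real hosted file).
skill_dir="$(mktemp -d)/.claude/skills/browser"
mkdir -p "$skill_dir"

# name and description are assumed example values;
# allowed-tools is the line quoted in the list above.
cat > "$skill_dir/SKILL.md" <<'EOF'
---
name: browser
description: Control the Agentastic browser with dev browser commands. Use when asked to check the UI, test a page, or debug the frontend.
allowed-tools: Bash(dev browser *)
---

# Browser skill
(command reference goes here)
EOF

# Show the two fields that drive auto-invocation and pre-authorization.
grep -E '^(description|allowed-tools):' "$skill_dir/SKILL.md"
```

To check what the real file declares, read the top of .claude/skills/browser/SKILL.md after downloading it.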

What the Agent Can Do

The browser skill teaches agents 70+ commands across these categories:

Category	Commands
Navigate	`navigate`, `back`, `forward`, `reload`
Inspect	`snapshot`, `screenshot`, `highlight`
Interact	`click`, `dblclick`, `fill`, `type`, `press`, `keydown`, `keyup`, `hover`, `focus`, `check`, `uncheck`, `select`, `scroll`
Wait	`wait selector`, `wait text`, `wait url`, `wait load`
Query	`get url/title/text/html/value/attr/count/box/styles`, `is visible/enabled/checked`
Find	`find role/text/label/placeholder/testid/alt/title/first/last/nth`
Console	`console list`, `console errors`, `console clear`, `errors list`
Dialogs	`dialog accept`, `dialog dismiss`
Cookies	`cookies get`, `cookies set`, `cookies clear`
Storage	`storage get`, `storage set`, `storage clear`
State	`state save`, `state load`
Frames	`frame select`, `frame main`
Tabs	`list`, `open`, `close`, `focus_tab`, `tab new/list/switch/close`
Injection	`addinitscript`, `addscript`, `addstyle`
Downloads	`download wait`

See the full Browser CLI Reference for detailed documentation on every command.

Usage Examples

Once the skill is added, you can tell your agent things like:

Verify a UI change

"Open localhost:3000 in the browser and check if the new header shows up"

The agent will:

dev browser navigate "http://localhost:3000"
dev browser wait load
dev browser snapshot
dev browser is visible ".new-header"
dev browser get text ".new-header h1"

Test a form

"Test the signup form at localhost:3000/signup with a fake user"

The agent will:

dev browser navigate "http://localhost:3000/signup"
dev browser wait load
dev browser snapshot
dev browser fill "@e2" "Jane Doe"
dev browser fill "@e3" "jane@example.com"
dev browser fill "@e4" "Password123!"
dev browser click "@e5"
dev browser wait url "/welcome" --timeout 5000
dev browser get title
dev browser console errors

Handle a confirmation dialog

"Click the delete button and accept the confirmation"

The agent will:

# Pre-configure the dialog response before triggering it
dev browser dialog accept
dev browser click "@e7"
dev browser wait text "Deleted successfully"

Check cookies and storage

"Check what cookies and local storage the site is setting"

The agent will:

dev browser navigate "http://localhost:3000"
dev browser wait load
dev browser cookies get
dev browser storage get local
dev browser storage get session

Debug a visual bug

"The button on the homepage looks broken, check what CSS is applied"

The agent will:

dev browser navigate "http://localhost:3000"
dev browser wait load
dev browser snapshot
dev browser get styles "@e5" "display,padding,margin,background-color,border"
dev browser get box "@e5"
dev browser is visible "@e5"

Interact with an iframe

"Fill in the embedded form inside the iframe"

The agent will:

dev browser snapshot
dev browser frame select "iframe#payment"
dev browser snapshot
dev browser fill "@e2" "4242424242424242"
dev browser fill "@e3" "12/28"
dev browser click "@e4"
dev browser frame main

Monitor for errors

"Reload the page and check if there are any console errors"

The agent will:

dev browser reload
dev browser wait load
dev browser console errors
dev browser errors list
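
If you run the same check often, the steps can be wrapped in a small shell function. This is only a sketch: the function name browser_smoke_check is hypothetical, and it assumes that each dev browser command exits with a non-zero status when it fails, which this page does not confirm.

```shell
# Run a basic page check and stop at the first failing step.
# Assumption: each dev browser command exits non-zero on failure.
browser_smoke_check() {
  url="$1"
  dev browser navigate "$url" || return 1
  dev browser wait load       || return 1
  # Print both kinds of errors so the caller (or the agent) can inspect them.
  dev browser console errors  || return 1
  dev browser errors list     || return 1
}

# Usage: browser_smoke_check "http://localhost:3000"
```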

Customizing the Skill

You can tailor the skill to your project by adding project-specific instructions at the end of the file:

## Project-Specific Notes

- Dev server runs on port 5173 (Vite)
- Login credentials for testing: user@test.com / test1234
- The app uses React Router, so always `wait url` after navigation
- Use `data-testid` attributes for targeting — run `find testid "submit-btn"` instead of CSS selectors
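
One way to add such notes is to append them with a heredoc. A minimal sketch, assuming the skill file lives at the project-level path from Option B; the function name append_project_notes is hypothetical, and the notes are shortened from the example above.

```shell
# Append a project notes section to a skill file, at most once.
append_project_notes() {
  skill_file="$1"
  mkdir -p "$(dirname "$skill_file")"
  # Skip if the section already exists, so re-running does not duplicate it.
  if grep -q '^## Project-Specific Notes$' "$skill_file" 2>/dev/null; then
    return 0
  fi
  cat >> "$skill_file" <<'EOF'

## Project-Specific Notes

- Dev server runs on port 5173 (Vite)
- The app uses React Router, so always `wait url` after navigation
EOF
}

# Usage: append_project_notes .claude/skills/browser/SKILL.md
```

Re-running the curl command from Option B overwrites the file, so the notes need to be appended again after each fresh download.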

Scope Options

Location	Who gets it	Best for
`.claude/skills/browser/SKILL.md`	Any agent in this project	Project-specific browser testing
`~/.claude/skills/browser/SKILL.md`	Any agent on your machine	All your projects
URL: `https://agentastic.dev/skills/browser.md`	Any agent you share it with	Quick setup, always up-to-date

For most users, the project-level skill (.claude/skills/browser/) is the right choice — it's checked into git and shared with the team. Use the URL when you want to quickly give an agent browser capabilities without committing files.
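
The two file-based scopes differ only in where the file is placed. A small sketch of that mapping, where the helper name skill_dir_for_scope and the scope labels project and user are hypothetical:

```shell
# Resolve the install directory for a scope name from the table above.
skill_dir_for_scope() {
  case "$1" in
    project) printf '%s\n' ".claude/skills/browser" ;;
    user)    printf '%s\n' "$HOME/.claude/skills/browser" ;;
    *)       echo "unknown scope: $1 (expected project or user)" >&2; return 1 ;;
  esac
}

# Usage: download into the chosen scope (same URL as Option B).
# dir="$(skill_dir_for_scope user)" && mkdir -p "$dir" \
#   && curl -o "$dir/SKILL.md" https://agentastic.dev/skills/browser.md
```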

Compatibility

The browser skill works with any agent that supports the skill file format:

Claude Code: Full support (auto-discovery, auto-invocation)
Other agents: Can read the skill file as context (via the URL), but may not discover or invoke it automatically, so you may need to prompt them to use dev browser commands

The dev browser commands themselves work from any terminal session inside Agentastic, regardless of which agent is running.
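
A script can check up front whether it is running inside an Agentastic terminal. A minimal sketch, assuming that a non-empty AGENTASTIC_SOCKET_PATH is a sufficient signal; the function name in_agentastic_terminal is hypothetical.

```shell
# Return success only when the Agentastic socket variable is present.
# Assumption: a non-empty AGENTASTIC_SOCKET_PATH means the terminal is managed by Agentastic.
in_agentastic_terminal() {
  [ -n "${AGENTASTIC_SOCKET_PATH:-}" ]
}

if in_agentastic_terminal; then
  echo "dev browser commands should be available here"
else
  echo "AGENTASTIC_SOCKET_PATH is not set; open an Agentastic terminal first" >&2
fi
```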