Browser Skill for Agents
Give your AI agents the ability to control the browser
Overview
Agentastic's built-in browser can be controlled programmatically by AI agents via dev browser commands. To teach your agent how to use this, add a browser skill file to your project.
A skill file is a markdown document that gets loaded into the agent's context, teaching it what commands are available and how to use them. When the skill is active, the agent can open pages, snapshot the DOM, click elements, fill forms, handle dialogs, manage cookies, and verify UI state — all from the terminal.
Adding the Skill
Option A: Download from URL (recommended)
Point your agent to the hosted skill file — it will read the URL and learn all available commands:
https://agentastic.dev/skills/browser.md
For Claude Code, you can also pre-authorize the dev browser commands in your project's .claude/settings.json, so the agent does not ask for permission on every call:
{ "permissions": { "allow": ["Bash(dev browser *)"] } }
Then tell your agent: "Read https://agentastic.dev/skills/browser.md to learn how to use the browser, then check the UI at localhost:3000"
Option B: Add to your project
Create the file .claude/skills/browser/SKILL.md in your project root:
    mkdir -p .claude/skills/browser
    curl -o .claude/skills/browser/SKILL.md https://agentastic.dev/skills/browser.md
This gives the agent auto-discovery — it loads the skill automatically when you ask it to interact with a browser.
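If you want the download step to fail loudly instead of silently saving an error page as SKILL.md, a slightly more defensive variant might look like the sketch below. It assumes curl is installed; the function name install_browser_skill is hypothetical and not part of Agentastic.

```shell
# Hypothetical helper (not part of Agentastic): download the hosted skill file.
# Usage: install_browser_skill [dest_dir] [url]
install_browser_skill() {
  skill_dir="${1:-.claude/skills/browser}"
  skill_url="${2:-https://agentastic.dev/skills/browser.md}"

  mkdir -p "$skill_dir" || return 1

  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  curl -fsSL -o "$skill_dir/SKILL.md" "$skill_url" || return 1

  # Succeed only if the downloaded file exists and is non-empty
  [ -s "$skill_dir/SKILL.md" ]
}
```

Calling install_browser_skill with no arguments writes to the project-level path used above. Passing a different directory as the first argument installs the file elsewhere.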
How It Works
When the skill file is in your project, agents automatically discover it:
- Auto-invocation: The description field tells the agent when to activate the skill. When you ask it to "check the UI" or "test the login page", the agent loads the skill and knows how to use dev browser.
- Pre-authorized tools: The allowed-tools: Bash(dev browser *) line lets the agent run any dev browser command without asking for permission each time.
- No setup required: The AGENTASTIC_SOCKET_PATH environment variable is injected automatically into every Agentastic terminal. The agent just runs the commands.
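If dev browser commands fail unexpectedly, first confirm that the shell really is an Agentastic terminal. The sketch below only checks that AGENTASTIC_SOCKET_PATH is set. The function name check_agentastic_env is hypothetical, and nothing is assumed about what the variable's value looks like.

```shell
# Hypothetical pre-flight check (not part of Agentastic).
# Returns 0 and prints the variable when it is set, 1 otherwise.
check_agentastic_env() {
  if [ -z "${AGENTASTIC_SOCKET_PATH:-}" ]; then
    echo "AGENTASTIC_SOCKET_PATH is not set; open an Agentastic terminal" >&2
    return 1
  fi
  echo "AGENTASTIC_SOCKET_PATH=$AGENTASTIC_SOCKET_PATH"
}
```

Run check_agentastic_env before a longer sequence of browser commands to get one clear message if the variable is missing.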
What the Agent Can Do
The browser skill teaches agents 70+ commands across these categories:
| Category | Commands |
|---|---|
| Navigate | navigate, back, forward, reload |
| Inspect | snapshot, screenshot, highlight |
| Interact | click, dblclick, fill, type, press, keydown, keyup, hover, focus, check, uncheck, select, scroll |
| Wait | wait selector, wait text, wait url, wait load |
| Query | get url/title/text/html/value/attr/count/box/styles, is visible/enabled/checked |
| Find | find role/text/label/placeholder/testid/alt/title/first/last/nth |
| Console | console list, console errors, console clear, errors list |
| Dialogs | dialog accept, dialog dismiss |
| Cookies | cookies get, cookies set, cookies clear |
| Storage | storage get, storage set, storage clear |
| State | state save, state load |
| Frames | frame select, frame main |
| Tabs | list, open, close, focus_tab, tab new/list/switch/close |
| Injection | addinitscript, addscript, addstyle |
| Downloads | download wait |
See the full Browser CLI Reference for detailed documentation on every command.
Usage Examples
Once the skill is added, you can tell your agent things like:
Verify a UI change
"Open localhost:3000 in the browser and check if the new header shows up"
The agent will:
    dev browser navigate "http://localhost:3000"
    dev browser wait load
    dev browser snapshot
    dev browser is visible ".new-header"
    dev browser get text ".new-header h1"
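To rerun the same check after every change, you could wrap the sequence in a small function. This sketch assumes that dev browser commands exit with a non-zero status when they fail, which this page does not state. If is visible instead prints a true/false value, compare its output rather than relying on the exit status. The function name verify_header is hypothetical.

```shell
# Hypothetical wrapper around the commands shown above.
# ASSUMPTION: each `dev browser` command exits non-zero on failure,
# so the && chain stops at the first step that does not succeed.
verify_header() {
  base_url="${1:-http://localhost:3000}"

  dev browser navigate "$base_url" &&
    dev browser wait load &&
    dev browser snapshot &&
    dev browser is visible ".new-header" &&
    dev browser get text ".new-header h1"
}
```

Calling verify_header with no argument checks localhost:3000. Pass another base URL as the first argument to check a different server.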
Test a form
"Test the signup form at localhost:3000/signup with a fake user"
The agent will:
    dev browser navigate "http://localhost:3000/signup"
    dev browser wait load
    dev browser snapshot
    dev browser fill "@e2" "Jane Doe"
    dev browser fill "@e3" "jane@example.com"
    dev browser fill "@e4" "Password123!"
    dev browser click "@e5"
    dev browser wait url "/welcome" --timeout 5000
    dev browser get title
    dev browser console errors
Handle a confirmation dialog
"Click the delete button and accept the confirmation"
The agent will:
    # Pre-configure the dialog response before triggering it
    dev browser dialog accept
    dev browser click "@e7"
    dev browser wait text "Deleted successfully"
Check cookies and storage
"Check what cookies and local storage the site is setting"
The agent will:
    dev browser navigate "http://localhost:3000"
    dev browser wait load
    dev browser cookies get
    dev browser storage get local
    dev browser storage get session
Debug a visual bug
"The button on the homepage looks broken, check what CSS is applied"
The agent will:
    dev browser navigate "http://localhost:3000"
    dev browser wait load
    dev browser snapshot
    dev browser get styles "@e5" "display,padding,margin,background-color,border"
    dev browser get box "@e5"
    dev browser is visible "@e5"
Interact with an iframe
"Fill in the embedded form inside the iframe"
The agent will:
    dev browser snapshot
    dev browser frame select "iframe#payment"
    dev browser snapshot
    dev browser fill "@e2" "4242424242424242"
    dev browser fill "@e3" "12/28"
    dev browser click "@e4"
    dev browser frame main
Monitor for errors
"Reload the page and check if there are any console errors"
The agent will:
    dev browser reload
    dev browser wait load
    dev browser console errors
    dev browser errors list
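To keep the results for later review, you can redirect the output of both error commands into a file. The sketch below makes no assumption about the output format; it saves whatever the commands print. The function name capture_browser_errors and the default log file name are hypothetical.

```shell
# Hypothetical helper: reload the page, then save the error output to a log file.
# Usage: capture_browser_errors [log_file]
capture_browser_errors() {
  log_file="${1:-browser-errors.log}"

  dev browser reload || return 1
  dev browser wait load || return 1

  # Group the commands so their combined output lands in one file
  {
    echo "== console errors =="
    dev browser console errors
    echo "== errors list =="
    dev browser errors list
  } > "$log_file"
}
```

Each run overwrites the log file, so pass a different file name per run if you want to compare results before and after a change.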
Customizing the Skill
You can tailor the skill to your project by adding project-specific instructions at the end of the file:
    ## Project-Specific Notes
    - Dev server runs on port 5173 (Vite)
    - Login credentials for testing: user@test.com / test1234
    - The app uses React Router, so always `wait url` after navigation
    - Use `data-testid` attributes for targeting: run `find testid "submit-btn"` instead of CSS selectors
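One way to add such notes from the terminal is to append them with a heredoc. The sketch below is illustrative: the function name append_project_notes is hypothetical, and the note text is a shortened copy of the example above that you would replace with your own.

```shell
# Hypothetical helper: append project notes to an existing skill file.
# Usage: append_project_notes [skill_file]
append_project_notes() {
  skill_file="${1:-.claude/skills/browser/SKILL.md}"

  [ -f "$skill_file" ] || { echo "no skill file at $skill_file" >&2; return 1; }

  # Skip if the section is already there, so re-running does not duplicate it
  if grep -q '^## Project-Specific Notes' "$skill_file"; then
    return 0
  fi

  # The quoted 'EOF' keeps backticks and $ characters literal
  cat >> "$skill_file" <<'EOF'

## Project-Specific Notes
- Dev server runs on port 5173 (Vite)
- The app uses React Router, so always `wait url` after navigation
EOF
}
```

Keep in mind that downloading the skill file again with curl -o replaces the whole file, including any notes appended this way.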
Scope Options
| Location | Who gets it | Best for |
|---|---|---|
| .claude/skills/browser/SKILL.md | Any agent in this project | Project-specific browser testing |
| ~/.claude/skills/browser/SKILL.md | Any agent on your machine | All your projects |
| URL: https://agentastic.dev/skills/browser.md | Any agent you share it with | Quick setup, always up-to-date |
For most users, the project-level skill (.claude/skills/browser/) is the right choice — it's checked into git and shared with the team. Use the URL when you want to quickly give an agent browser capabilities without committing files.
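To see which of the two file-based locations is currently populated, a quick check like the following sketch may help. The function name list_browser_skills is hypothetical. It only reports whether each file exists and does not say which one an agent would prefer when both are present.

```shell
# Hypothetical helper: report which skill file locations exist.
# Run it from the project root so the relative path resolves correctly.
list_browser_skills() {
  for skill_path in \
    ".claude/skills/browser/SKILL.md" \
    "$HOME/.claude/skills/browser/SKILL.md"
  do
    if [ -f "$skill_path" ]; then
      echo "found:   $skill_path"
    else
      echo "missing: $skill_path"
    fi
  done
}
```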
Compatibility
The browser skill works with any agent that supports the skill file format:
- Claude Code: Full support (auto-discovery, auto-invocation)
- Other agents: Can read the skill file as context (via the URL), but may need to be told explicitly to use the dev browser commands
The dev browser commands themselves work from any terminal session inside Agentastic, regardless of which agent is running.