Building an Autonomous Agent That Completes a Task with /v1/runs
Many automation tools rely on brittle selectors or a fixed API surface. They break when layouts change. They cannot interact with local apps, browsers, or terminals the same way a human does. The /v1/runs endpoint lets you spin up an autonomous computer use agent that sees the screen, decides next actions, and executes them on a real machine. You define a task, choose a machine, set a deadline, and stream events as the agent works to completion. You pay only for the actual steps the agent takes at $0.05 per step.
How it works
The flow starts with a POST to /v1/runs. You provide a machine_id that targets a cloud VM with a desktop, browser, or terminal, and a task that states the high-level goal. The cua_version field selects the agent version: "v3" is the default, and "v4" enables autonomous mode with a pass/fail verifier. Optional instructions are appended to the base prompt to guide the agent. The server returns a run_id and an initial state of queued. As the agent progresses, you stream events from GET /v1/runs/{id}/events; the stream reports the actions taken and status changes such as running, awaiting_human, succeeded, failed, cancelled, or timed_out. You can cancel a run with POST /v1/runs/{id}/cancel or resume a paused run with POST /v1/runs/{id}/resume. The agent continues until it reaches a terminal state or the deadline (deadline_seconds) expires.
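The state handling and control endpoints described above can be sketched with a few small helpers. This is a minimal illustration rather than an official client: it assumes that succeeded, failed, cancelled, and timed_out are the terminal states, that status events carry a "state" key, and that the cancel and resume endpoints return JSON (or an empty body). The helper names are hypothetical, and the sketch uses only the standard library.

```python
import json
import os
import urllib.request

BASE = "https://coasty.ai/v1"  # base URL used throughout this article

# Assumption: these four states are final. The article lists them next to
# "running" and "awaiting_human" but does not label them explicitly.
TERMINAL_STATES = {"succeeded", "failed", "cancelled", "timed_out"}


def is_terminal(state: str) -> bool:
    """Return True once a run can no longer change state."""
    return state in TERMINAL_STATES


def final_state(events):
    """Consume event dicts until one reports a terminal state.

    Assumption: status-change events expose the state under a "state" key.
    Returns the terminal state, or None if the stream ends without one.
    """
    for event in events:
        state = event.get("state")
        if state is not None and is_terminal(state):
            return state
    return None


def control_url(run_id: str, action: str) -> str:
    """Build the URL for POST /v1/runs/{id}/cancel or /v1/runs/{id}/resume."""
    if action not in ("cancel", "resume"):
        raise ValueError(f"unsupported action: {action!r}")
    return f"{BASE}/runs/{run_id}/{action}"


def send_control(run_id: str, action: str) -> dict:
    """POST a cancel or resume request and return the parsed response."""
    req = urllib.request.Request(
        control_url(run_id, action),
        method="POST",
        headers={"X-API-Key": os.environ["COASTY_API_KEY"]},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = resp.read()
    # Assumption: the body is JSON when present; treat an empty body as {}.
    return json.loads(body) if body else {}
```

In practice, final_state can wrap any iterable of parsed events, and send_control(run_id, "cancel") would be called when a run should stop before its deadline.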
    # Minimal client: create a run, then stream its events.
    import json
    import os

    import httpx

    API_KEY = os.getenv('COASTY_API_KEY')
    BASE = 'https://coasty.ai/v1'


    def create_run(machine_id, task, cua_version='v3', instructions=None,
                   max_steps=50, deadline_seconds=600, webhook_url=None):
        """Create a run and return the JSON response (run_id and state)."""
        url = f'{BASE}/runs'
        payload = {
            'machine_id': machine_id,
            'task': task,
            'cua_version': cua_version,
            'max_steps': max_steps,
            'deadline_seconds': deadline_seconds,
        }
        # Optional fields are sent only when provided.
        if instructions:
            payload['instructions'] = instructions
        if webhook_url:
            payload['webhook_url'] = webhook_url
        resp = httpx.post(url, json=payload, headers={'X-API-Key': API_KEY})
        resp.raise_for_status()
        return resp.json()


    def stream_events(run_id):
        """Yield each event from the run's event stream as a parsed dict."""
        url = f'{BASE}/runs/{run_id}/events'
        headers = {'X-API-Key': API_KEY}
        with httpx.stream('GET', url, headers=headers, timeout=300) as r:
            r.raise_for_status()
            for line in r.iter_lines():
                # Event payloads arrive on lines prefixed with 'data: '.
                if line.startswith('data: '):
                    yield json.loads(line[6:])


    if __name__ == '__main__':
        run = create_run(
            machine_id='your-machine-id',
            task='Open Chrome, navigate to https://coasty.ai/docs, and take a screenshot of the page',
            cua_version='v3',
            max_steps=100,
            deadline_seconds=600,
        )
        print('run_id:', run['run_id'], 'state:', run['state'])
        for event in stream_events(run['run_id']):
            print(event)

Key request fields
- machine_id: required. The ID of a provisioned cloud VM from /v1/machines.
- task: required. The high-level goal for the agent to complete.
- cua_version: optional. Default 'v3'. Set to 'v4' for autonomous mode with a pass/fail verifier.
- instructions: optional. Additional guidance appended to the base prompt.
- max_steps: optional. Maximum number of steps the agent may take before stopping.
- deadline_seconds: optional. Time limit for the run in seconds.
- webhook_url: optional. URL to receive an HMAC-signed webhook when the run finishes.
- on_awaiting_human: optional. Behavior when the agent needs approval (pause, fail, or cancel).
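To show how these fields fit together, here is a small, hypothetical payload builder that validates input before anything is sent. It assumes that the accepted on_awaiting_human values are the literal strings "pause", "fail", and "cancel", and that the API accepts a request in which unset optional fields are simply omitted; neither detail is confirmed beyond the list above.

```python
from typing import Any, Dict, Optional

# Assumption: the API expects these exact strings for on_awaiting_human.
ON_AWAITING_HUMAN_CHOICES = ("pause", "fail", "cancel")


def build_run_payload(
    machine_id: str,
    task: str,
    *,
    cua_version: str = "v3",
    instructions: Optional[str] = None,
    max_steps: Optional[int] = None,
    deadline_seconds: Optional[int] = None,
    webhook_url: Optional[str] = None,
    on_awaiting_human: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body for POST /v1/runs (hypothetical helper)."""
    if not machine_id:
        raise ValueError("machine_id is required")
    if not task:
        raise ValueError("task is required")
    if (on_awaiting_human is not None
            and on_awaiting_human not in ON_AWAITING_HUMAN_CHOICES):
        raise ValueError(
            f"on_awaiting_human must be one of {ON_AWAITING_HUMAN_CHOICES}"
        )
    # Limits must be positive when given.
    for name, value in (("max_steps", max_steps),
                        ("deadline_seconds", deadline_seconds)):
        if value is not None and value <= 0:
            raise ValueError(f"{name} must be positive")

    payload: Dict[str, Any] = {
        "machine_id": machine_id,
        "task": task,
        "cua_version": cua_version,
    }
    optional = {
        "instructions": instructions,
        "max_steps": max_steps,
        "deadline_seconds": deadline_seconds,
        "webhook_url": webhook_url,
        "on_awaiting_human": on_awaiting_human,
    }
    # Include only the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Validating locally means a typo in on_awaiting_human or a non-positive limit fails fast on the client rather than after a round trip to the server.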
POST /v1/runs creates an autonomous task run; you stream its events with GET /v1/runs/{id}/events and are billed $0.05 per agent step.
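Because billing is per step, max_steps also works as a rough spending cap. The arithmetic below is a sketch under two assumptions: that every billable step counts toward max_steps, and that the $0.05 step price is the only charge (machine time or other fees, if any, are not covered here). The function names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_STEP = Decimal("0.05")  # USD per agent step, as quoted in this article


def cost_for_steps(steps: int) -> Decimal:
    """Cost in USD for a given number of agent steps."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return PRICE_PER_STEP * steps


def max_steps_for_budget(budget_usd: str) -> int:
    """Largest max_steps value that keeps step charges within the budget.

    The budget is passed as a string (e.g. "2.00") to avoid float rounding.
    """
    return int(Decimal(budget_usd) // PRICE_PER_STEP)


print(cost_for_steps(100))           # 5.00
print(max_steps_for_budget("2.00"))  # 40
```

With max_steps=100, as in the example run above, step charges would top out at $5.00 under these assumptions.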
Where this beats brittle automation
Traditional automation relies on hardcoded selectors, XPath expressions, or API endpoints that rarely cover every UI element. When a layout shifts or the copy changes, scripts break. The computer use agent built on /v1/runs sees the actual screen using vision models. It interprets text, buttons, and layout context much as a human would. It can click, type, scroll, and inspect DOM elements without brittle selectors. It works whether the interface is a desktop app, a browser tab, or a terminal shell. This makes it robust to interface changes and capable of handling tasks that require reasoning across multiple applications.
The /v1/runs endpoint gives you a fully autonomous computer use agent that drives real desktops, browsers, and terminals. You define a task, target a machine, and stream results without managing state yourself. Start building agents that see and act on the screen, not just against a fixed API surface. Get your API key at https://coasty.ai/developers and try your first run today.