Tutorial

Building an Autonomous Agent That Completes a Task with /v1/runs

James Liu | 6 min

Many automation tools rely on brittle selectors or a fixed API surface, so they break when layouts change and cannot interact with local apps, browsers, or terminals the way a human does. The /v1/runs endpoint lets you spin up an autonomous computer use agent that sees the screen, decides on the next action, and executes it on a real machine. You define a task, choose a machine, set a deadline, and stream events as the agent works to completion. You pay only for the steps the agent actually takes, at $0.05 per step.

How it works

The flow starts with a POST to /v1/runs. You provide a machine_id to target a cloud VM with a desktop, browser, or terminal, and a task field that holds the high-level goal. The cua_version field selects the agent version: "v3" is the default, and "v4" runs in autonomous mode with a pass/fail verifier. You can also pass instructions, which are appended to the base prompt to guide the agent. The server returns a run_id and an initial state of queued.

As the agent progresses, you stream events from GET /v1/runs/{id}/events. The stream contains the actions taken and status changes such as running, awaiting_human, succeeded, failed, cancelled, or timed_out. You can cancel a run with POST /v1/runs/{id}/cancel or resume a paused run with POST /v1/runs/{id}/resume. The agent continues until the run reaches a terminal state or the deadline set by deadline_seconds expires.

python
import json
import os

import httpx

API_KEY = os.getenv('COASTY_API_KEY')
BASE = 'https://coasty.ai/v1'

def create_run(machine_id, task, cua_version='v3', instructions=None, max_steps=50, deadline_seconds=600, webhook_url=None):
    url = f'{BASE}/runs'
    payload = {
        'machine_id': machine_id,
        'task': task,
        'cua_version': cua_version,
        'max_steps': max_steps,
        'deadline_seconds': deadline_seconds,
    }
    if instructions:
        payload['instructions'] = instructions
    if webhook_url:
        payload['webhook_url'] = webhook_url
    resp = httpx.post(url, json=payload, headers={'X-API-Key': API_KEY})
    resp.raise_for_status()
    return resp.json()

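def stream_events(run_id):
    """Yield each event from the run's event stream as a parsed dict."""
    url = f'{BASE}/runs/{run_id}/events'
    headers = {'X-API-Key': API_KEY}
    # Keep the connection open while the agent works; events arrive as 'data: <json>' lines.
    with httpx.stream('GET', url, headers=headers, timeout=300) as r:
        r.raise_for_status()
        for line in r.iter_lines():
            if line.startswith('data: '):
                yield json.loads(line[len('data: '):])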

if __name__ == '__main__':
    run = create_run(
        machine_id='your-machine-id',
        task='Open Chrome, navigate to https://coasty.ai/docs, and take a screenshot of the page',
        cua_version='v3',
        max_steps=100,
        deadline_seconds=600
    )
    print('run_id:', run['run_id'], 'state:', run['state'])
    for event in stream_events(run['run_id']):
        print(event)
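
The script above prints events indefinitely and never uses the cancel or resume endpoints. The following is a minimal sketch of how a client might stop at a terminal state and build the control URLs. It assumes that status-change events carry a 'state' key and that succeeded, failed, cancelled, and timed_out are the terminal states; neither the event shape nor that grouping is spelled out here, and follow_run and run_action_url are hypothetical helper names, so adjust them to the events you actually receive.

```python
BASE = 'https://coasty.ai/v1'

# ASSUMPTION: these four states end a run; 'running' and 'awaiting_human' do not.
TERMINAL_STATES = frozenset({'succeeded', 'failed', 'cancelled', 'timed_out'})
RUN_ACTIONS = frozenset({'cancel', 'resume'})


def run_action_url(run_id, action):
    """Build the URL for POST /v1/runs/{id}/cancel or /v1/runs/{id}/resume."""
    if action not in RUN_ACTIONS:
        raise ValueError(f'unsupported action: {action!r}')
    return f'{BASE}/runs/{run_id}/{action}'


def follow_run(events, on_awaiting_human=None):
    """Consume events until a terminal state appears; return the last state seen."""
    last_state = None
    for event in events:
        # ASSUMPTION: status-change events expose the new status under 'state'.
        state = event.get('state')
        if state is None:
            continue  # e.g. an action event that carries no status change
        last_state = state
        if state == 'awaiting_human' and on_awaiting_human is not None:
            on_awaiting_human(event)
        if state in TERMINAL_STATES:
            break
    return last_state
```

With the functions from the script, follow_run(stream_events(run['run_id'])) would return the final state, and httpx.post(run_action_url(run_id, 'cancel'), headers={'X-API-Key': API_KEY}) would request a cancellation.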

Key request fields

  • machine_id: required. The ID of a provisioned cloud VM from /v1/machines.
  • task: required. The high-level goal for the agent to complete.
  • cua_version: optional. Default 'v3'. Set to 'v4' for autonomous mode with a pass/fail verifier.
  • instructions: optional. Additional guidance appended to the base prompt.
  • max_steps: optional. Maximum number of steps the agent may take before stopping.
  • deadline_seconds: optional. Time limit for the run in seconds.
  • webhook_url: optional. URL to receive an HMAC-signed webhook when the run finishes.
  • on_awaiting_human: optional. Behavior when the agent needs approval (pause, fail, or cancel).
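
The fields above can be assembled and checked before sending the request. This is a sketch rather than the official client: build_run_payload is a hypothetical helper, and the validation rules are inferred only from the list above (two required fields, six optional ones, and three on_awaiting_human values).

```python
OPTIONAL_FIELDS = frozenset({
    'cua_version', 'instructions', 'max_steps',
    'deadline_seconds', 'webhook_url', 'on_awaiting_human',
})
ON_AWAITING_HUMAN_VALUES = frozenset({'pause', 'fail', 'cancel'})


def build_run_payload(machine_id, task, **options):
    """Return a JSON-ready body for POST /v1/runs, rejecting obvious mistakes early."""
    if not machine_id or not task:
        raise ValueError('machine_id and task are required')

    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f'unknown fields: {sorted(unknown)}')

    policy = options.get('on_awaiting_human')
    if policy is not None and policy not in ON_AWAITING_HUMAN_VALUES:
        raise ValueError(f'on_awaiting_human must be one of {sorted(ON_AWAITING_HUMAN_VALUES)}')

    payload = {'machine_id': machine_id, 'task': task}
    # Leave out unset options so the server applies its own defaults.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```

The result can be passed as json= to httpx.post, for example build_run_payload('your-machine-id', 'Open Chrome', cua_version='v4', on_awaiting_human='pause').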

POST /v1/runs creates an autonomous task run, GET /v1/runs/{id}/events streams its events, and each agent step is billed at $0.05.
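
Because billing is per step, max_steps also bounds what a run can cost. The arithmetic can be sketched as follows, assuming that steps are the only billed unit and that max_steps is a hard cap; worst_case_cost is a hypothetical helper, not part of the API.

```python
from decimal import Decimal

PRICE_PER_STEP_USD = Decimal('0.05')  # per-step price quoted in this tutorial


def worst_case_cost(max_steps: int) -> Decimal:
    """Upper bound on a run's cost in USD if the agent uses every allowed step."""
    if max_steps < 0:
        raise ValueError('max_steps must be non-negative')
    return PRICE_PER_STEP_USD * max_steps


print(worst_case_cost(100))  # 5.00
```

A run that finishes in fewer steps costs proportionally less, so the cap is a budget ceiling rather than an estimate.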

Where this beats brittle automation

Traditional automation relies on hardcoded selectors, XPath expressions, or API endpoints, which rarely cover every UI element. When a layout shifts or the copy changes, scripts break. The computer use agent built on /v1/runs sees the actual screen using vision models. It interprets text, buttons, and layout context much as a human would. It can click, type, scroll, and inspect DOM elements without brittle selectors. It works whether the interface is a desktop app, a browser tab, or a terminal shell. This makes it robust to UI changes and capable of handling tasks that require reasoning across multiple applications.

The /v1/runs endpoint gives you a fully autonomous computer use agent that drives real desktops, browsers, and terminals. You define a task, target a machine, and stream results without managing state yourself. Start building agents that see and act on the screen, not just against a fixed API surface. Get your API key at https://coasty.ai/developers and try your first run today.
