
v3 vs v4: Choosing a Computer Use Model on the API

Sarah Chen | 6 min read

You need a bot that clicks, types, and navigates like a human. The Coasty computer use API has two model families: v3 for guided, stateful tasks and v4 for fully autonomous runs with a pass/fail verifier. This post shows you when to use each and how to call the real endpoints.

v3: Guided, stateful trajectory memory

Use v3 when you want the agent to act within a persistent session and you want feedback after every step. You start a session, then make predict calls that each include a base64 screenshot, an instruction, and cua_version:"v3". The server returns a list of actions and a status. You repeat the cycle of capture, predict, and act until the status is "done". Each predict call costs $0.04; if you run v3 inside a task run instead, each agent step costs $0.05. This model is ideal for workflows that need human-in-the-loop assertions or where you want to inspect intermediate actions.

bash
curl -X POST https://coasty.ai/v1/sessions \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cua_version": "v3"}'
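
Once the session exists, the capture, predict, act cycle can be sketched in a few shell functions. Treat this as an illustration under assumptions, not the documented API: the predict path (/v1/sessions/<session_id>/predict), the "screenshot" and "instruction" field names, and the capture_screen and perform_actions helpers are all hypothetical placeholders you would replace with the real endpoint and your own tooling.

```bash
#!/usr/bin/env bash
# Sketch of the v3 capture -> predict -> act loop.
# ASSUMPTIONS: the predict URL and the JSON field names "screenshot" and
# "instruction" are hypothetical; capture_screen and perform_actions are
# helpers you must supply yourself.

API_BASE="https://coasty.ai/v1"

# Build the JSON body for one predict call.
build_predict_payload() {
  local screenshot_b64="$1" instruction="$2"
  # Escape backslashes and double quotes (newlines are not handled here).
  instruction=$(printf '%s' "$instruction" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"cua_version": "v3", "screenshot": "%s", "instruction": "%s"}' \
    "$screenshot_b64" "$instruction"
}

# Succeeds when the response text contains status "done".
is_done() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"done"'
}

# Repeat capture -> predict -> act until "done" or a local step cap is hit.
predict_loop() {
  local session_id="$1" instruction="$2" max_steps="${3:-20}"
  local step=1 shot response
  while [ "$step" -le "$max_steps" ]; do
    shot=$(capture_screen | base64 | tr -d '\n')   # hypothetical helper: writes a PNG to stdout
    response=$(build_predict_payload "$shot" "$instruction" |
      curl -sS --fail -X POST "$API_BASE/sessions/$session_id/predict" \
        -H "X-API-Key: $COASTY_API_KEY" \
        -H "Content-Type: application/json" \
        --data-binary @-) || return 1
    if is_done "$response"; then
      return 0
    fi
    perform_actions "$response"                    # hypothetical helper: applies the returned actions
    step=$((step + 1))
  done
  return 2  # step cap reached before the server reported "done"
}
```

The local step cap is a design choice rather than an API feature: since every predict call is billed, bounding the loop keeps a stuck task from running up cost.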

v4: Fully autonomous runs with verifier

Use v4 when you want the server to drive the agent to completion without any predict calls from you. You POST to /v1/runs with cua_version:"v4" and a task description. The server runs an autonomous agent and applies a pass/fail verifier to the result. A run is always in one of these states: queued, running, awaiting_human, succeeded, failed, cancelled, or timed_out. Each agent step costs $0.05, and each task step in a workflow also costs $0.05. This model is best for end‑to‑end tasks where you care only about success or failure.

bash
curl -X POST https://coasty.ai/v1/runs \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "machine_id": "m1",
    "task": "open chrome and navigate to https://coasty.ai",
    "cua_version": "v4",
    "max_steps": 20,
    "deadline_seconds": 300
  }'
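
After the run is created, your side of the work is mostly waiting for a final state. The following is a minimal polling sketch. It assumes a hypothetical GET /v1/runs/<run_id> endpoint and a top-level "status" field in the response, and it treats succeeded, failed, cancelled, and timed_out as final; confirm all three against the API reference before relying on it.

```bash
#!/usr/bin/env bash
# Sketch of polling a v4 run until it reaches a final state.
# ASSUMPTIONS: the GET /v1/runs/<run_id> path and the "status" field name
# are hypothetical; the state names come from the list above.

API_BASE="https://coasty.ai/v1"

# Extract the value of a "status" field with a naive text match
# (good enough for a flat response; use a real JSON parser otherwise).
extract_status() {
  printf '%s\n' "$1" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Map a run state to how the poller should react to it.
classify_state() {
  case "$1" in
    succeeded|failed|cancelled|timed_out) echo terminal ;;
    awaiting_human) echo needs_human ;;
    queued|running) echo pending ;;
    *) echo unknown ;;
  esac
}

# Poll until the run is final. Prints the final state and returns 0 only
# when that state is "succeeded".
wait_for_run() {
  local run_id="$1" interval="${2:-5}" body state
  while :; do
    body=$(curl -sS --fail "$API_BASE/runs/$run_id" \
      -H "X-API-Key: $COASTY_API_KEY") || return 1
    state=$(extract_status "$body")
    case "$(classify_state "$state")" in
      terminal) echo "$state"; [ "$state" = succeeded ]; return ;;
      needs_human) echo "run $run_id is awaiting human input" >&2 ;;
      unknown) echo "unexpected state: $state" >&2; return 1 ;;
    esac
    sleep "$interval"
  done
}
```

Keeping awaiting_human separate from the final states lets the poller surface a prompt to an operator and keep waiting instead of giving up.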

Key differences at a glance

  • v3: requires a session and predict calls; each predict costs $0.04; suited for guided, stateful tasks with human feedback.
  • v4: single POST to /v1/runs; no predict calls; includes a pass/fail verifier; billed $0.05 per agent step.
  • v3: states include queued, running, done; v4: states include queued, running, awaiting_human, succeeded, failed, cancelled, timed_out.
  • Both versions can be used inside workflows with task, assert, if, loop, parallel, human_approval, retry, succeed, and fail steps.

Choose v3 for guided, stateful tasks and v4 for fully autonomous runs with built‑in verification.
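
The per-unit prices make budgeting simple arithmetic. Here is a small sketch that works in whole cents to avoid floating point. It assumes predict calls and agent steps are the only billable units and that max_steps caps the number of billed steps, neither of which is confirmed here.

```bash
#!/usr/bin/env bash
# Prices quoted above, in cents: $0.04 per predict call, $0.05 per agent step.
# ASSUMPTION: no other charges apply.
PREDICT_CENTS=4
STEP_CENTS=5

# Print the dollar cost of <count> billable units at <unit_cents> each.
cost_usd() {
  local count="$1" unit_cents="$2" total
  total=$((count * unit_cents))
  printf '$%d.%02d\n' $((total / 100)) $((total % 100))
}
```

For example, cost_usd 20 "$STEP_CENTS" prints $1.00, the ceiling for a v4 run with max_steps set to 20, while cost_usd 20 "$PREDICT_CENTS" prints $0.80 for a v3 loop that takes 20 predict calls.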

Where this beats brittle automation

Traditional automation relies on brittle selectors and hardcoded APIs. The Coasty computer use agent sees the screen, interprets context, and acts like a human. It is designed to handle layout changes, dynamic content, and unexpected UI states without breaking. This makes it a good fit for CLI tools, web scraping, and desktop tasks where a reliable outcome matters more than scripting every individual click.

Start building with the computer use API. Get your key at https://coasty.ai/developers and pick the model that fits your workflow.
