v3 vs v4: Choosing a Computer Use Model on the API

David Park | 8 min
You want a computer use agent (CUA) that understands your screen, clicks buttons, and fills forms like a human. Coasty offers two CUA versions: v3 for low-cost, stateless loops and v4 for autonomous tasks with built-in verification. This post covers the differences, the request fields, and the pricing so you can pick the right model for your automation.

How the two models differ

  • v3: stateless. You run the loop yourself: capture the screen, call predict, execute the returned actions, and repeat until the status is done. No trajectory memory. Good for single-shot actions or short scripts.
  • v4: autonomous with a pass/fail verifier. Runs a full agent to completion using the same predict loop but with automatic outcome checks.
  • Both versions use the same predict endpoint and billing, but v4 is the default when you start a Task Run; v3 is the default for stateless predict calls.
  • Use v3 when you control the loop and need to keep the per-step cost low ($0.05 per predict call). Use v4 when you want a ready-made agent that handles the full task lifecycle.
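
To make the per-step pricing concrete, here is a small sketch that estimates the cost of a self-managed loop from the $0.05 per-call price quoted above. The function name is illustrative, and the sketch assumes exactly one predict call per step; your actual usage may differ.

```python
from decimal import Decimal

# Price per predict call, as quoted in this post.
PRICE_PER_PREDICT_CALL = Decimal("0.05")

def estimate_loop_cost(steps: int) -> Decimal:
    """Estimate the cost of a loop, assuming one predict call per step."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return PRICE_PER_PREDICT_CALL * steps

if __name__ == "__main__":
    for steps in (1, 20, 100):
        print(f"{steps:>3} steps -> ${estimate_loop_cost(steps)}")
```

Decimal is used instead of float so that repeated five-cent increments do not accumulate rounding error.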

Task Run request with CUA version

  • POST /v1/runs provisions a machine and starts a Task Run that drives the agent to completion.
  • Set cua_version to v3 for a stateless loop or v4 for autonomous verification.
  • Options include max_steps, deadline_seconds, and on_awaiting_human (pause, fail, or cancel).
  • The server returns a run ID that you can poll or stream events from.
bash
# Example: start a v4 Task Run with automatic verification

# 1. Start the run (curl)
curl -X POST https://coasty.ai/v1/runs \
  -H "X-API-Key: ${COASTY_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "machine_id": "mach-123",
    "task": "Open Chrome and navigate to https://example.com",
    "cua_version": "v4",
    "on_awaiting_human": "pause",
    "max_steps": 100,
    "deadline_seconds": 300
  }'

# Response:
# {
#   "id": "run-abc123",
#   "status": "queued",
#   ...
# }

# 2. Stream events from the run (curl)
curl -X GET https://coasty.ai/v1/runs/run-abc123/events \
  -H "X-API-Key: ${COASTY_API_KEY}" \
  -N

# Example stream lines:
# data: {"type":"step","action":"click","x":120,"y":83}
# data: {"type":"status","status":"running"}
# data: {"type":"status","status":"succeeded"}
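
If you consume the event stream from Python rather than curl, each data: line can be decoded as JSON. The sketch below parses lines shaped like the sample stream above and returns the first terminal status it sees. The type and status fields come from those sample lines. The helper names are hypothetical, and treating "failed" and "cancelled" as terminal statuses is an assumption; only "succeeded" appears in the sample.

```python
import json
from typing import Iterable, Iterator, Optional

# "succeeded" appears in the sample stream; "failed" and "cancelled" are
# assumed terminal states and may be named differently by the real API.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def parse_event_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded event per 'data: {...}' line, skipping other lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, or other stream fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)

def final_status(lines: Iterable[str]) -> Optional[str]:
    """Return the first terminal status seen in the stream, or None."""
    for event in parse_event_lines(lines):
        if event.get("type") == "status" and event.get("status") in TERMINAL_STATUSES:
            return event["status"]
    return None

if __name__ == "__main__":
    sample = [
        'data: {"type":"step","action":"click","x":120,"y":83}',
        'data: {"type":"status","status":"running"}',
        'data: {"type":"status","status":"succeeded"}',
    ]
    print(final_status(sample))
```

Both helpers accept any iterable of text lines, so they can be fed from the decoded lines of a streaming HTTP response or from a saved log.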

Stateless predict loop (v3)

  • POST /v1/predict ($0.05 per call) takes a base64 screenshot, an instruction, and cua_version.
  • The response includes actions and a status. Loop until the status is done.
  • Use this when you want to control the predict loop yourself and keep each step billed at $0.05.
  • No session or trajectory memory is required, making it cheap for short or one-off tasks.
python
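# Example: v3 stateless predict loop in Python

import os
import requests

API_URL = "https://coasty.ai/v1/predict"
MAX_STEPS = 50  # local safety cap so a stuck task cannot loop (and bill) forever

def predict_v3(screenshot_base64: str, instruction: str) -> dict:
    """Send one screenshot and instruction to the predict endpoint."""
    key = os.environ["COASTY_API_KEY"]
    headers = {"X-API-Key": key, "Content-Type": "application/json"}
    payload = {
        "screenshot": screenshot_base64,
        "instruction": instruction,
        "cua_version": "v3"
    }
    resp = requests.post(API_URL, json=payload, headers=headers, timeout=60)
    resp.raise_for_status()
    return resp.json()

def execute_action(action: dict) -> None:
    """Placeholder: replace with your own mouse/keyboard driver."""
    print("would execute:", action)

# Loop until done (or until the local step cap is reached)
status = "running"
steps = 0
while status != "done" and steps < MAX_STEPS:
    # 1. Capture screen, encode as base64
    # screenshot_b64 = capture()
    screenshot_b64 = "<base64-encoded-screenshot>"
    # 2. Predict actions
    result = predict_v3(screenshot_b64, "Click the login button")
    actions = result.get("actions", [])
    status = result.get("status", "running")
    steps += 1
    # 3. Execute actions
    for act in actions:
        execute_action(act)

if status == "done":
    print("Task complete.")
else:
    print(f"Stopped after {steps} steps without reaching done.")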
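
The loop above leaves screenshot capture as a placeholder. As a minimal sketch, once you have the screenshot as image bytes from whatever capture tool you use, the standard library can produce the base64 string. This post does not say whether the API expects a bare base64 string or a data URI; the sketch assumes a bare string, and encode_screenshot is a hypothetical helper name.

```python
import base64

def encode_screenshot(png_bytes: bytes) -> str:
    """Return the ASCII base64 encoding of raw image bytes."""
    if not png_bytes:
        raise ValueError("screenshot is empty")
    return base64.b64encode(png_bytes).decode("ascii")

if __name__ == "__main__":
    # The 8-byte PNG signature stands in for a real screenshot here.
    fake_png = b"\x89PNG\r\n\x1a\n"
    encoded = encode_screenshot(fake_png)
    # Round-trip check: decoding must give back the original bytes.
    assert base64.b64decode(encoded) == fake_png
    print(encoded)
```

Rejecting empty input up front avoids spending a billed predict call on a failed capture.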

Use v3 for cheap, stateless loops that you control. Use v4 for autonomous tasks with built-in pass/fail verification.

Where computer use beats brittle automation

  • Coasty agents see the real UI, not just static selectors or fragile class names.
  • They adapt to layout changes and to missing elements or classes without extra configuration.
  • v4’s pass/fail verifier evaluates outcomes automatically, reducing false positives.
  • You can use the same CUA version across browsers, desktop apps, and terminals with consistent behavior.

Pick v3 for low-cost, stateless predict loops or v4 for autonomous tasks with verification. Start building reliable computer use agents that understand your screen, not just your selectors. Get your API key at https://coasty.ai/developers.
