Coasty Computer Use API Pricing: Every Endpoint, Every Cent

Lisa Chen | 8 min read

Most automation APIs bill by the minute, by the hour, or by a generic token count. Coasty bills by what matters: each time the agent looks at a screen and takes an action. That is the computer use API pricing model. You pay only for real agent steps, which keeps the bill predictable. Below is every endpoint, every field, and every cent you will be charged.

How it works

The computer use API lets you build agents that see a desktop or browser and act like a human. You send a screenshot, an instruction, and a CUA version. The model returns actions. You capture again and repeat until status is done. For stateful long-running tasks you create a session, which retains trajectory memory between predictions. Grounding maps a screenshot and element description to x,y coordinates. Parse turns raw pyautogui code into structured actions.

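bash
# Example: POST /v1/predict (vision, $0.05 per request)
# One stateless prediction: ask for the actions that click a button.
export COASTY_API_KEY=$(cat ~/.coasty_key)

# Base64-encode a screenshot onto a single line (replace with your file).
# Reading from stdin and stripping newlines works with GNU and macOS base64.
SCREENSHOT=$(base64 < screenshot.png | tr -d '\n')

# Send the JSON body on stdin (-d @-) so a large screenshot cannot
# exceed the shell's argument-length limit.
curl -X POST https://coasty.ai/v1/predict \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "screenshot": "$SCREENSHOT",
  "instruction": "Click the submit button.",
  "cua_version": "v3"
}
EOF

# The response includes the actions to perform and a status.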
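
The call above is a single iteration. A minimal sketch of the full capture, predict, act loop might look like the following. The three hook functions (capture_screen, predict_status, apply_actions) are hypothetical placeholders rather than part of the Coasty API, and the sketch assumes that a status of done means nothing is left to execute. It also caps the number of steps and tallies the cost at the $0.05 stateless rate.

```shell
#!/usr/bin/env bash
# Sketch of the capture -> predict -> act loop around POST /v1/predict.
# The three hooks below are hypothetical placeholders, not Coasty API calls.

capture_screen() { :; }            # placeholder: take a fresh screenshot
predict_status() { echo "done"; }  # placeholder: call /v1/predict, print its status
apply_actions()  { :; }            # placeholder: execute the returned actions

# run_loop MAX_STEPS: iterate until status is "done" or the step cap is hit.
run_loop() {
  local max_steps=$1 step=0 status=""
  while [ "$step" -lt "$max_steps" ]; do
    capture_screen
    status=$(predict_status "$step")   # the hook receives the step index
    step=$((step + 1))
    if [ "$status" = "done" ]; then
      break
    fi
    apply_actions
  done
  # Each stateless predict call costs 5 credits ($0.05).
  echo "steps=$step status=$status credits=$((step * 5))"
}

run_loop 20
```

The step cap is the important design choice: because every prediction is billed, a hard limit on iterations is also a hard limit on spend.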

Vision endpoints

  • POST /v1/predict (stateless): $0.05 per request. Takes base64 screenshot, instruction, cua_version. Returns actions and status. Loop capture, predict, act until status is done.
  • POST /v1/sessions (stateful): $0.10 per session. Creates a session with trajectory memory. Then POST /v1/sessions/{id}/predict ($0.04) for each subsequent step while the session is active.
  • POST /v1/ground ($0.03): Maps a screenshot + element description to x,y coordinates. Useful for precise clicks anywhere on the screen.
  • POST /v1/parse (free): Turns pyautogui code into structured actions without a screen. Great for converting existing scripts before you attach a computer use agent.

One credit equals $0.01. All vision calls are billed in credits. Check your wallet balance before you start.
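
To make the credit arithmetic concrete, here is a rough comparison of stateless and session pricing for the same number of steps. The prices come from the list above; the function names are illustrative, and the sketch assumes that creating a session does not itself include a prediction, so every step in a session is billed at $0.04 on top of the $0.10 creation fee.

```shell
#!/usr/bin/env bash
# Amounts are in credits (1 credit = $0.01), using the prices listed above.

# N stateless calls to POST /v1/predict at 5 credits each.
stateless_credits() { echo $(( $1 * 5 )); }

# One session (10 credits) plus N session predicts at 4 credits each.
# Assumption: creating the session does not itself include a prediction.
session_credits() { echo $(( 10 + $1 * 4 )); }

# Format a credit count as dollars.
credits_to_usd() { printf '$%d.%02d\n' $(( $1 / 100 )) $(( $1 % 100 )); }

for n in 5 10 20; do
  echo "$n steps: stateless $(credits_to_usd "$(stateless_credits "$n")"), session $(credits_to_usd "$(session_credits "$n")")"
done
```

Under that assumption the two modes break even at 10 steps: shorter tasks are cheaper stateless, longer ones are cheaper in a session.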

Task runs (agent lifecycle)

  • POST /v1/runs: Starts a task run. Required fields: machine_id, task, cua_version ("v3" default, "v4" autonomous with pass/fail verifier). Optional: instructions, system_prompt, max_steps, deadline_seconds, on_awaiting_human ("pause" / "fail" / "cancel"), webhook_url. Billed $0.05 per agent step.
  • GET /v1/runs and GET /v1/runs/{id}: Retrieve run status and details.
  • POST /v1/runs/{id}/cancel: Cancel an in-progress run.
  • POST /v1/runs/{id}/resume: Resume a run that was paused (on_awaiting_human = "pause").
  • GET /v1/runs/{id}/events: Streams Server-Sent Events with run progress. Reconnect with the Last-Event-ID header to resume the stream where it left off.
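
Reconnecting to the event stream can be sketched as follows. This assumes the stream uses standard SSE id: fields and accepts the same X-API-Key header as the predict example; the helper names and the sample event payloads are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: follow a run's event stream and resume after a dropped connection.

# Print the id of the most recent event in an SSE stream read from stdin.
last_event_id() {
  awk '/^id:/ { sub(/^id: ?/, ""); sub(/\r$/, ""); id = $0 }
       END    { if (id != "") print id }'
}

# stream_events RUN_ID [LAST_ID]: open the stream, resuming after LAST_ID if given.
# Assumption: same base URL and auth header as the /v1/predict example.
stream_events() {
  if [ -n "${2:-}" ]; then
    curl -sN -H "X-API-Key: $COASTY_API_KEY" -H "Last-Event-ID: $2" \
      "https://coasty.ai/v1/runs/$1/events"
  else
    curl -sN -H "X-API-Key: $COASTY_API_KEY" \
      "https://coasty.ai/v1/runs/$1/events"
  fi
}

# Offline demo with a made-up stream (the payloads are illustrative only).
printf 'id: 41\ndata: {"step": 1}\n\nid: 42\ndata: {"step": 2}\n\n' | last_event_id
```

In practice you might append the stream to a log with stream_events "$RUN_ID" | tee -a events.log and, after a disconnect, call stream_events "$RUN_ID" "$(last_event_id < events.log)" to pick up from the last event received. RUN_ID and events.log are placeholder names.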

Where this beats brittle automation

Traditional automation relies on brittle selectors, XPath, or hardcoded IDs. When the UI changes, your scripts break. With the computer use API the agent sees the screen, understands context, and chooses the right action. It keeps working when IDs shift or layouts change, whether the target is a browser, a terminal, or a native desktop app. That human-like perception means less maintenance and higher reliability for complex workflows.

You now have a complete picture of Coasty computer use API pricing and every endpoint. Start building agents that see and act. Get your API key at https://coasty.ai/developers and test the free parse endpoint first.
