Tutorial

Automate Any Desktop App with the Coasty Computer Use API

Marcus Sterling | 8 min

Most automation tools rely on brittle selectors, API docs, or fragile UI trees, so your scripts break whenever the UI changes. The Coasty computer use API lets your agent operate a machine the way a real user does. It takes screenshots as input, reads the pixels, and emits actions that interact with desktops, browsers, and terminals just like a human would. You get a single, stateful loop that sees the screen and acts until the task is done.

How it works

The computer use agent runs in a loop. Each iteration captures a screenshot, sends it to the vision model with an instruction, and receives a structured list of actions to execute. The server handles state, retries, and safety, and you repeat the loop until the returned status becomes "done". Alternatively, you can use a task run that manages the entire lifecycle, or a session for persistent memory across calls.

python
import base64
import os

import requests

API_KEY = os.environ["COASTY_API_KEY"]  # fail fast if the key is not set
BASE_URL = "https://coasty.ai/v1"
MAX_STEPS = 20  # safety cap so a stuck task cannot loop (and bill) forever


def predict(image_path, instruction):
    """Send one screenshot to /predict and return (actions, status)."""
    # Read the latest screenshot from disk and encode it as Base64
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    resp = requests.post(
        f"{BASE_URL}/predict",
        headers={"X-API-Key": API_KEY},
        json={"image": image_b64, "instruction": instruction, "cua_version": "v3"},
        timeout=60,
    )
    resp.raise_for_status()
    data = resp.json()  # parse the response body once
    return data["actions"], data["status"]


instruction = "Click the sign-in button and type your email in the email field."
for step in range(MAX_STEPS):
    # screenshot.png must be refreshed by your own capture tool before each call
    actions, status = predict("screenshot.png", instruction)
    print("Actions:", actions)
    print("Status:", status)
    if status == "done":
        print("Task completed.")
        break
    # Execute the returned actions on the desktop here, then take a new screenshot
    instruction = "Continue clicking and typing as needed."
else:
    print(f"Stopped after {MAX_STEPS} steps without reaching done.")
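The loop above prints the returned actions but does not execute them. The following is a minimal sketch of one way to dispatch them. The action shape used here (a dict with a "type" key plus fields such as "x", "y", and "text") is an assumption made for illustration, not the documented Coasty schema, and run_actions is a hypothetical helper. Inspect a real /predict response and adjust the keys before relying on it.

```python
from typing import Any, Callable, Dict, List

Action = Dict[str, Any]
Handler = Callable[[Action], None]


def run_actions(actions: List[Action], handlers: Dict[str, Handler]) -> int:
    """Dispatch each action to the handler registered for its "type".

    Returns the number of actions executed. Raises ValueError on an unknown
    type so the loop stops instead of silently skipping a step.
    """
    executed = 0
    for action in actions:
        kind = action.get("type")  # assumed key, not confirmed by the API docs
        handler = handlers.get(kind)
        if handler is None:
            raise ValueError(f"no handler for action type {kind!r}")
        handler(action)
        executed += 1
    return executed


# Recording handlers for a dry run; swap in real mouse/keyboard calls later.
log: List[str] = []
handlers: Dict[str, Handler] = {
    "click": lambda a: log.append(f"click at ({a['x']}, {a['y']})"),
    "type": lambda a: log.append(f"type {a['text']!r}"),
}

# Hypothetical sample in the assumed shape
sample = [
    {"type": "click", "x": 120, "y": 340},
    {"type": "type", "text": "me@example.com"},
]
count = run_actions(sample, handlers)
print(count, log)
```

Keeping the handlers in a plain dict means you can dry-run a task with recording handlers first, then replace them with real input calls once the action format is confirmed.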

Key fields and pricing

  • POST /v1/predict costs $0.05 per call and requires image, instruction, and cua_version.
  • You receive an actions array and a status field that you loop on until done.
  • POST /v1/sessions creates a stateful session for trajectory memory. It costs $0.10 to create, then $0.04 per predict call.
  • POST /v1/ground costs $0.03 per call and maps a screenshot plus an element description to x,y coordinates.
  • POST /v1/parse is free and converts pyautogui code into structured actions.

Loop capture, predict, and act until status is done.
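To compare the prices listed above, here is a small sketch that computes the cost of a task with a given number of predict calls, with and without a session. It assumes one session is created per task and that every call in the session is billed at the session rate. The function names are illustrative.

```python
from decimal import Decimal

# Prices taken from the list above (USD)
PREDICT_PRICE = Decimal("0.05")
SESSION_CREATE_PRICE = Decimal("0.10")
SESSION_PREDICT_PRICE = Decimal("0.04")


def stateless_cost(calls: int) -> Decimal:
    """Cost of `calls` standalone /predict calls."""
    return PREDICT_PRICE * calls


def session_cost(calls: int) -> Decimal:
    """Cost of creating one session and making `calls` predict calls in it."""
    return SESSION_CREATE_PRICE + SESSION_PREDICT_PRICE * calls


for n in (5, 10, 20):
    print(n, stateless_cost(n), session_cost(n))
```

Under these assumptions the two options cost the same at 10 calls ($0.50), and a session is cheaper for anything longer.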

Where this beats brittle automation

Traditional automation tools depend on stable selectors or exhaustive API docs, which break when UIs evolve. The Coasty computer use agent sees the screen and understands context. It can click a button that moves, type into a dynamic input field, and scroll through lists without brittle selectors. It works on browsers, native desktop apps, and terminals alike, all from the same loop.

Beyond one-shot automation

  • Use POST /v1/runs to queue a whole workflow and let the server manage retries, timeouts, and human approval.
  • Define stateful workflows with POST /v1/workflows that include task, assert, if, loop, and parallel steps.
  • Provision real desktops with POST /v1/machines and let the agent drive VMs you own or rent.
  • Task runs are billed at $0.05 per agent step, and workflows at $0.05 per task step.
  • Manage credit consumption via a prepaid USD wallet where 1 credit equals $0.01.
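As a quick budgeting aid, the sketch below converts the per-step price into wallet credits using the figures above (1 credit = $0.01, $0.05 per step). It works in whole cents to avoid floating-point rounding. The helper names are illustrative, and it assumes steps are the only charge against the wallet.

```python
STEP_PRICE_CENTS = 5   # $0.05 per agent step or task step
CENTS_PER_CREDIT = 1   # 1 credit equals $0.01


def credits_for_steps(steps: int) -> int:
    """Credits consumed by a run or workflow that takes `steps` steps."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * STEP_PRICE_CENTS // CENTS_PER_CREDIT


def steps_affordable(balance_credits: int) -> int:
    """Whole steps a wallet balance can pay for."""
    if balance_credits < 0:
        raise ValueError("balance must be non-negative")
    return balance_credits * CENTS_PER_CREDIT // STEP_PRICE_CENTS


print(credits_for_steps(40))    # a 40-step run
print(steps_affordable(1000))   # a $10.00 wallet
```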

You now have a clear path to automate any desktop app with the Coasty computer use API. Start by capturing a screenshot, calling predict in a loop, and acting on the results until the status is done. Then layer in sessions, workflows, and real machines to scale across teams and environments. Grab your API key at https://coasty.ai/developers and start building agents that see and act like humans.
