Tutorial

Automate Form Filling and Checkout Flows with the Computer Use API

Marcus Sterling||7 min
Esc

Checkout flows are full of text, dates, dropdowns, and CAPTCHAs that break brittle selectors. You can build a computer use agent that sees the screen, reads instructions, and clicks exactly where users would. This tutorial shows how to automate form filling and checkout using the Coasty Computer Use API.

How it works

Coasty runs an agent on a real desktop or browser. You describe what to do in natural language. The agent captures screenshots, predicts clicks and keystrokes, and acts. The API returns actions and status. You loop capture → predict → act until status is "done". For stateful runs, use sessions to keep memory between steps.

bash
#!/bin/bash
# Example: fill a form and buy an item with the Computer Use API
# Replace with your own machine_id or let Coasty provision one.

COASTY_API_KEY="${COASTY_API_KEY}"
machine_id="${MACHINE_ID:-"my-machine"}"

curl "https://coasty.ai/v1/runs" \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "machine_id": "'$machine_id'",
    "task": "Fill the form with name John Doe, email [email protected], password secret, select size M, and buy the red T-shirt. Click Checkout and complete payment on the final screen. Do not dismiss any popups."
  }' | jq .

Key fields and pricing

  • POST /v1/runs provisions a task and starts an agent. You send machine_id, task, and optional instructions. Billed $0.05 per agent step.
  • GET /v1/runs returns a list of runs. GET /v1/runs/{id} gives the current run status and events.
  • GET /v1/runs/{id}/events streams Server-Sent Events with status updates and actions. Reconnect with Last-Event-ID.
  • Optional fields: instructions (appended to the base prompt), system_prompt, max_steps, deadline_seconds, on_awaiting_human (pause/fail/cancel), webhook_url.
  • Stateful runs: POST /v1/sessions creates a session, then POST /v1/sessions/{id}/predict for each step. /v1/sessions costs $0.10 to start, /v1/sessions/{id}/predict costs $0.04 per step.
  • Vision actions: POST /v1/predict takes a base64 screenshot + instruction + cua_version. Returns actions and status. Free.
  • Grounding: POST /v1/ground maps a screenshot + element description to x,y coordinates. Costs $0.03.
  • Parsing: POST /v1/parse turns pyautogui code into structured actions. Free.
  • Billing: Prepaid USD wallet. 1 credit = $0.01. Webhooks are HMAC signed (header Coasty-Signature: t=unix,v1=hex). Idempotency-Key makes writes safe to retry.

The Computer Use API bills $0.05 per agent step for task runs.

Where this beats brittle automation

Traditional tools rely on CSS selectors, XPath, or API stubs. When a layout changes or a class name shifts, the automation breaks. With a computer use agent, you describe the intent. The agent sees the screen, understands the context, and clicks or types where a human would. It adapts to dynamic forms, CAPTCHAs, and login flows without extra glue code.

Start automating form filling and checkout flows with the Computer Use API. Get a key at https://coasty.ai/developers and try a task run with POST /v1/runs. Build agents that handle complex UIs reliably.

Want to see this in action?

View Case Studies
Try Coasty Free