Automating Form Filling and Checkout Flows Over the Computer Use API
Form filling and checkout flows on the web are messy. Layouts change, element IDs shift, and CAPTCHAs block selector-based automation. The Coasty computer use API addresses this by letting an agent watch the screen, follow instructions, and click like a human. You define what you want; the agent handles the details.
How it works
Each automation is a task run driven over the API. You POST to /v1/runs with a task description, optional instructions, and a cua_version. The server provisions a machine, launches the agent, and streams events until the run ends in success, failure, cancellation, or timeout. Billing is $0.05 per agent step, plus charges for any optional vision endpoint actions. For multi-step flows, a workflow DSL sequences tasks, asserts, loops, and parallel steps, with variables and guards such as budget_cents and deadline_seconds.
curl https://coasty.ai/v1/runs \
  -H "Authorization: Bearer $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "machine_id": "machine_123",
    "task": "Fill out the checkout form on https://example.com/cart and complete the purchase",
    "cua_version": "v4",
    "max_steps": 100,
    "deadline_seconds": 600,
    "on_awaiting_human": "pause",
    "webhook_url": "https://your-server.com/webhook"
  }'
Key fields and behavior
- machine_id: identifier of the provisioned machine.
- task: natural language description of the goal (e.g., fill out the checkout form).
- cua_version: "v3" for guided mode, "v4" for autonomous mode with a pass/fail verifier.
- max_steps: maximum number of agent steps before forced termination.
- deadline_seconds: timeout in seconds (e.g., 600).
- on_awaiting_human: what to do when the agent needs human input: "pause", "fail", or "cancel".
- webhook_url: endpoint that receives event notifications about the run.
- Billing: $0.05 per agent step.
- Workflow DSL: supports task, assert, if, loop, parallel, human_approval, retry, succeed, fail, and variables like {{inputs.x}} or stepId.field.
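Because billing is per step, max_steps also acts as a spending cap on step charges. The following is a minimal sketch of that arithmetic, assuming the $0.05 rate quoted above and ignoring vision endpoint charges, whose pricing is not given here. The function names are hypothetical and not part of the Coasty API; the budget_cents parameter only borrows the name of the DSL guard for illustration.

```python
from decimal import Decimal

# Per-step price quoted in this article. Vision endpoint actions are billed
# separately and are NOT modeled here.
PRICE_PER_STEP_USD = Decimal("0.05")


def worst_case_cost_usd(max_steps: int) -> Decimal:
    """Upper bound on step charges for a run capped at max_steps."""
    if max_steps < 0:
        raise ValueError("max_steps must be non-negative")
    return PRICE_PER_STEP_USD * max_steps


def steps_within_budget(budget_cents: int) -> int:
    """Largest max_steps whose step charges fit inside budget_cents."""
    if budget_cents < 0:
        raise ValueError("budget_cents must be non-negative")
    price_cents = int(PRICE_PER_STEP_USD * 100)  # 5 cents per step
    return budget_cents // price_cents


print(worst_case_cost_usd(100))  # 5.00
print(steps_within_budget(250))  # 50
```

With the max_steps of 100 used in the curl example, step charges for a single run cannot exceed $5.00.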
POST /v1/runs with cua_version "v4" drives an autonomous computer use agent over real desktops, browsers, and terminals.
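The same request can be assembled in Python. This is a sketch using only the standard library, not an official client: the helper build_run_request is hypothetical, the allowed values are the ones listed above, and the default values simply mirror the curl example. Which fields the server treats as required is an assumption.

```python
import json
import urllib.request

RUNS_URL = "https://coasty.ai/v1/runs"  # endpoint taken from the curl example
CUA_VERSIONS = {"v3", "v4"}             # guided / autonomous, per the field list
HUMAN_POLICIES = {"pause", "fail", "cancel"}


def build_run_request(api_key, machine_id, task, *, cua_version="v4",
                      max_steps=100, deadline_seconds=600,
                      on_awaiting_human="pause", webhook_url=None):
    """Validate the fields locally and return an unsent POST request."""
    if cua_version not in CUA_VERSIONS:
        raise ValueError(f"cua_version must be one of {sorted(CUA_VERSIONS)}")
    if on_awaiting_human not in HUMAN_POLICIES:
        raise ValueError(f"on_awaiting_human must be one of {sorted(HUMAN_POLICIES)}")
    if max_steps <= 0 or deadline_seconds <= 0:
        raise ValueError("max_steps and deadline_seconds must be positive")

    payload = {
        "machine_id": machine_id,
        "task": task,
        "cua_version": cua_version,
        "max_steps": max_steps,
        "deadline_seconds": deadline_seconds,
        "on_awaiting_human": on_awaiting_human,
    }
    if webhook_url:
        payload["webhook_url"] = webhook_url

    return urllib.request.Request(
        RUNS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Build the request without sending it; "sk-demo" is a placeholder key.
req = build_run_request(
    "sk-demo",
    "machine_123",
    "Fill out the checkout form on https://example.com/cart and complete the purchase",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

To send it, pass req to urllib.request.urlopen with a real key. The response format is not described in this article, so the sketch stops before parsing it. Validating cua_version and on_awaiting_human locally catches typos before a run is created.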
Where this beats brittle automation
Traditional tools rely on selectors and fixed X/Y coordinates that break when layouts change. The computer use API lets the agent see the screen, reason about context, and adapt its actions. It can handle dynamic forms, CAPTCHAs, and UI changes without you updating selectors. It also works across browsers, terminals, and desktop apps, not just a single page.
Start building reliable form filling and checkout automation with the Coasty computer use API. Get a key at https://coasty.ai/developers and try the workflow DSL for sequencing complex flows.