Automating Form Filling and Checkout Flows with the Computer Use API
Checkout flows are brittle to automate. CAPTCHAs, layout shifts, and dynamic fields break scripts built on CSS selectors. The Coasty computer use agent instead drives a real browser or desktop, reads the screen state from screenshots, and clicks what is actually rendered. This guide shows how to automate a checkout flow with the computer use API.
How it works
The computer use agent runs on a cloud virtual machine provisioned through the /v1/machines endpoint. You submit a task description; the agent then takes screenshots, interprets them, and acts on them. It calls the /v1/predict endpoint to generate each action, such as click, type, or scroll, and each prediction consumes 0.05 credits. The agent continues until the task completes or it reaches the max_steps limit. For autonomous runs, set cua_version to v4 to enable a pass/fail verifier.
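The stop conditions described above can be modeled as a simple loop. The following is a conceptual sketch only, not Coasty's implementation: the `Action` type, `run_agent_loop`, and the scripted `predict` stand-in are hypothetical names, and it assumes that the final "done" step is counted like any other step at the 0.05 per-step figure quoted in this guide.

```python
from dataclasses import dataclass
from typing import Callable, List

COST_PER_STEP = 0.05  # per-step figure quoted in this guide


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "scroll", or "done" (hypothetical names)
    detail: str = ""


def run_agent_loop(predict: Callable[[int], Action], max_steps: int) -> dict:
    """Call predict() once per step until it returns "done" or max_steps is reached."""
    history: List[Action] = []
    status = "max_steps_reached"
    for step in range(max_steps):
        # Stand-in for: capture a screenshot and ask the model for the next action.
        action = predict(step)
        history.append(action)
        if action.kind == "done":
            status = "completed"
            break
    return {
        "status": status,
        "steps": len(history),
        "cost": round(len(history) * COST_PER_STEP, 2),
        "actions": [a.kind for a in history],
    }


def scripted_predict(step: int) -> Action:
    """Scripted replacement for the model so the loop can run offline."""
    script = [
        Action("click", "Add to cart"),
        Action("type", "John Doe"),
        Action("scroll"),
        Action("click", "Place order"),
        Action("done"),
    ]
    return script[step] if step < len(script) else Action("scroll")
```

Calling `run_agent_loop(scripted_predict, max_steps=50)` finishes after five steps, while `max_steps=3` stops early with the status `max_steps_reached`. This is why max_steps acts as both a safety limit and a cost ceiling.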
The following request starts a run on an existing machine:
curl -X POST https://coasty.ai/v1/runs \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "machine_id": "vm-123",
    "task": "Open https://example.com/store, add item to cart, fill checkout form with name=John Doe [email protected] address=123 Main St zip=90210, and place the order. Stop when the order is confirmed or an error occurs.",
    "cua_version": "v4",
    "max_steps": 50,
    "deadline_seconds": 600,
    "on_awaiting_human": "pause"
  }'
Key parameters and billing
- machine_id: Cloud VM ID from /v1/machines.
- task: Plain language description of the checkout flow.
- cua_version: v3 for guided runs, v4 for autonomous with pass/fail verifier.
- max_steps: Hard limit on agent steps. Each step costs $0.05.
- deadline_seconds: Optional timeout. Billed per step regardless of timeout.
- on_awaiting_human: pause, fail, or cancel when the agent asks for approval.
- POST /v1/runs initiates the task run.
- GET /v1/runs returns run status and metadata.
- GET /v1/runs/{id}/events streams events as Server-Sent Events.
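Each agent step costs $0.05, so a run capped at a max_steps of 50 costs at most $2.50. The server drives the agent to completion; the client only submits the run and reads its status or events.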
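The parameters and endpoints listed above can be wired together in a small client. This is a minimal sketch under several assumptions: the helper names (`build_run_payload`, `create_run`, `parse_sse`, `stream_run_events`) are hypothetical, the response body of POST /v1/runs is not documented here and is returned as raw JSON, the events endpoint is assumed to accept the same X-API-Key header as the POST, and the event names and data format are unknown, so the parser only handles generic Server-Sent Events framing.

```python
import json
import urllib.request
from typing import Dict, Iterable, Iterator

BASE_URL = "https://coasty.ai/v1"
ON_AWAITING_HUMAN = {"pause", "fail", "cancel"}  # values listed in this guide
CUA_VERSIONS = {"v3", "v4"}                      # values listed in this guide


def build_run_payload(machine_id: str, task: str, cua_version: str = "v4",
                      max_steps: int = 50, deadline_seconds: int = 600,
                      on_awaiting_human: str = "pause") -> dict:
    """Validate the documented parameters locally and build the JSON body."""
    if cua_version not in CUA_VERSIONS:
        raise ValueError(f"unsupported cua_version: {cua_version}")
    if on_awaiting_human not in ON_AWAITING_HUMAN:
        raise ValueError(f"unsupported on_awaiting_human: {on_awaiting_human}")
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    return {
        "machine_id": machine_id,
        "task": task,
        "cua_version": cua_version,
        "max_steps": max_steps,
        "deadline_seconds": deadline_seconds,
        "on_awaiting_human": on_awaiting_human,
    }


def create_run(payload: dict, api_key: str) -> dict:
    """POST the payload to /v1/runs; the response shape is an assumption."""
    request = urllib.request.Request(
        f"{BASE_URL}/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group "field: value" lines into events; a blank line ends an event."""
    event: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            if event:
                yield event
                event = {}
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data" and "data" in event:
            event["data"] += "\n" + value  # multi-line data is joined with newlines
        else:
            event[field] = value
    # An event not terminated by a blank line is dropped, as in the SSE spec.


def stream_run_events(run_id: str, api_key: str) -> Iterator[Dict[str, str]]:
    """Read /v1/runs/{id}/events line by line (auth header is an assumption)."""
    request = urllib.request.Request(
        f"{BASE_URL}/runs/{run_id}/events",
        headers={"X-API-Key": api_key, "Accept": "text/event-stream"},
    )
    with urllib.request.urlopen(request) as response:
        yield from parse_sse(line.decode("utf-8") for line in response)
```

A typical flow would read the key from the COASTY_API_KEY environment variable, call `create_run(build_run_payload("vm-123", "..."), api_key)`, and then iterate over `stream_run_events` with the run ID taken from the response. Validating parameters before sending keeps obviously invalid requests from reaching the API.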
Where this beats brittle automation
Traditional automation relies on CSS selectors, XPath, or API-only integrations. Layout changes break selectors, and CAPTCHAs cause failures. The computer use agent sees the actual screen, interprets buttons and fields, and clicks precisely what it sees. It handles dynamic forms, hidden elements, and visual CAPTCHAs without brittle selectors. This approach works across browsers, desktops, and terminals.
Start building checkout bots that see the screen and act like a human. Provision a machine, submit a task, and let the computer use agent handle the rest. Get a key at https://coasty.ai/developers.