Automating Form Filling and Checkout Flows via the Computer Use API
Web forms and checkout pages are fragile. They change layouts, reorder fields, or inject new inputs without notice. Traditional automation tools rely on brittle selectors like CSS classes or IDs. When a site updates its markup, your script breaks. The Coasty Computer Use API lets you build a computer use agent that sees the page layout and clicks elements exactly where they appear. This guide walks through automating a multi-step form and checkout flow using the real endpoints and fields from the API.
How it works
The Computer Use API drives desktop browsers and web apps by capturing a screenshot, interpreting the visual context, and returning actions such as mouse clicks, text inputs, and navigation. Two endpoints are involved. The core loop uses /v1/predict: you capture the screen and send a base64 screenshot, a text instruction, and the cua_version. The response includes actions and a status. You perform the actions and repeat the loop until the status is done. For stateful tasks you can also use /v1/sessions with a session_id to keep a trajectory in memory across steps, which makes multi-step flows such as form filling and checkout more reliable.
# The single-quoted JSON is closed around $(...) so the shell expands the base64 command.
curl -X POST https://coasty.ai/v1/predict \
-H "X-API-Key: $COASTY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"screenshot": "'"$(base64 -i screenshot.png | tr -d '\n')"'",
"instruction": "Find the email input, type your email, and press Enter",
"cua_version": "v3"
}'
Typical steps in a checkout flow
- Navigate to a product page and add to cart
- Open the cart page and review items
- Fill in shipping information with a form
- Enter billing details on a separate page
- Complete payment on a third-party gateway
- Confirm the order and navigate to the confirmation page
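One way to organize the steps above is as an ordered list of instructions, each driven to completion before the next begins. The sketch below is illustrative only: the instruction wording is made up for this example, and run_step is a hypothetical callable (for instance, a /v1/predict loop) that returns True when a step finishes.

```python
from typing import Callable, Iterable, List

# Instruction wording is illustrative; adapt it to the site being automated.
CHECKOUT_STEPS = [
    "Add the product on this page to the cart",
    "Open the cart page and review the items",
    "Fill in the shipping information form",
    "Enter the billing details",
    "Complete payment on the payment gateway",
    "Confirm the order and wait for the confirmation page",
]


def run_flow(steps: Iterable[str], run_step: Callable[[str], bool]) -> List[str]:
    """Run steps in order and stop at the first one that does not finish.

    run_step is a hypothetical callable that drives one instruction to
    completion and returns True on success. Returns the instructions
    that completed.
    """
    completed = []
    for instruction in steps:
        if not run_step(instruction):
            break
        completed.append(instruction)
    return completed
```

Stopping at the first unfinished step keeps a half-completed checkout from moving on to payment, and the returned list shows how far the flow got.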
For each step, capture a screenshot, send it to /v1/predict with the step's instruction, perform the returned actions, and repeat until the status is done.
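The loop can be sketched in Python roughly as follows. The endpoint, header, and request fields come from the curl example. The response keys "actions" and "status" and the literal value "done" are assumptions based on the description above. capture and execute are hypothetical callables you would supply (a screenshot grabber and an input driver), and max_steps is an added safety limit rather than part of the API.

```python
import base64
import json
import urllib.request

PREDICT_URL = "https://coasty.ai/v1/predict"  # endpoint from the curl example


def build_predict_payload(png_bytes, instruction, cua_version="v3"):
    """Build the JSON body: base64 screenshot, text instruction, cua_version."""
    return {
        "screenshot": base64.b64encode(png_bytes).decode("ascii"),
        "instruction": instruction,
        "cua_version": cua_version,
    }


def post_predict(payload, api_key):
    """POST the payload to /v1/predict and return the parsed JSON response."""
    request = urllib.request.Request(
        PREDICT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def run_until_done(instruction, capture, predict, execute, max_steps=20):
    """Capture, predict, act; stop when status is "done". Returns steps used.

    Assumes the response is a dict with "actions" (a list) and "status".
    """
    for step in range(1, max_steps + 1):
        result = predict(build_predict_payload(capture(), instruction))
        for action in result.get("actions", []):
            execute(action)
        if result.get("status") == "done":
            return step
    raise RuntimeError(f"status was not 'done' after {max_steps} steps")
```

In real use, predict would be something like lambda payload: post_predict(payload, api_key). Passing it in as a parameter lets you exercise the loop with a stub before pointing it at a live site.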
Where this beats brittle automation
A computer use agent does not depend on CSS classes or IDs. It reads the visual layout and uses relative coordinates from the screenshot to click inputs and buttons. When a site changes its markup, the agent still finds elements by where they appear on screen, because each step works from a fresh screenshot. This makes checkout flows more resilient to layout changes and partial updates. Unlike API-only tools, the agent can also handle multi-step flows where data is spread across several pages and you must navigate between them: it follows links, clicks buttons, and fills forms much as a human would.
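To act on relative coordinates, a client has to map them onto the actual screen. The exact coordinate format is not described here, so the sketch below assumes each coordinate is a fraction between 0 and 1 of the screenshot's width or height. to_pixels is a hypothetical helper, not part of the API.

```python
from typing import Tuple


def to_pixels(rel_x: float, rel_y: float, width: int, height: int) -> Tuple[int, int]:
    """Map relative coordinates (assumed fractions in [0, 1]) to pixel positions."""
    if width <= 0 or height <= 0:
        raise ValueError("screen size must be positive")
    # Clamp so out-of-range values, and exactly 1.0, still land on the screen.
    x = min(max(int(rel_x * width), 0), width - 1)
    y = min(max(int(rel_y * height), 0), height - 1)
    return x, y
```

Under that assumption, the same relative position resolves correctly at different resolutions: to_pixels(0.5, 0.25, 1280, 800) gives (640, 200), and to_pixels(0.5, 0.25, 1920, 1080) gives (960, 270).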
Build a reliable form filler and checkout agent by looping through /v1/predict, capturing screenshots, and acting until status is done. Use /v1/sessions to keep state across steps. Get a key at https://coasty.ai/developers and start automating your checkout flows.