Tutorial

Build a Self-Running QA Testing Bot with the Computer Use API

Sarah Chen | 12 min

Traditional QA scripts struggle with dynamic UIs, pop-ups, and layout shifts. A self-running QA bot that sees the screen and acts like a human solves this by driving real browsers and desktop apps. The Coasty Computer Use API lets you orchestrate a full testing run with a single POST /v1/runs request. This tutorial builds a QA bot that opens a site, clicks buttons, fills forms, and asserts results, all without brittle selectors.

How it works

The QA bot is built around one core Coasty endpoint, POST /v1/runs, which starts the full test. The request defines a machine_id (a cloud VM), a task (the test scenario), and optional instructions. You can append a system_prompt and set max_steps and deadline_seconds to bound the run. When the server receives the request, it provisions a machine and launches a computer use agent with a default cua_version of v3. The run moves through the states queued, running, and awaiting_human, and ends in one of the final states: succeeded, failed, cancelled, or timed_out. Each agent step costs $0.05. While the run is in progress you can stream events with GET /v1/runs/{id}/events, and at any point you can poll the status with GET /v1/runs/{id}.

A second endpoint, POST /v1/sessions, enables stateful trajectory memory if you want to build a custom loop that captures a screenshot, predicts actions, then acts. For element clicks in such a loop, POST /v1/ground maps a screenshot and a description to x,y coordinates. Vision predictions cost $0.05, and stateful predictions cost $0.04.
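Because each agent step is billed, max_steps also acts as a budget cap. The following is a minimal sketch of that arithmetic, assuming the $0.05 per-step price quoted above is the only charge for a run; the function name worst_case_run_cost is hypothetical and not part of the Coasty API.

```python
from decimal import Decimal

# Per-step price quoted in this tutorial; check current pricing before relying on it.
STEP_PRICE_USD = Decimal("0.05")


def worst_case_run_cost(max_steps: int, step_price: Decimal = STEP_PRICE_USD) -> Decimal:
    """Upper bound on agent-step cost if the run uses every allowed step."""
    if max_steps < 0:
        raise ValueError("max_steps must be non-negative")
    # Decimal avoids the rounding noise that float arithmetic adds to currency.
    return step_price * max_steps


print(worst_case_run_cost(50))  # 2.50
```

With max_steps set to 50, a run cannot spend more than $2.50 on agent steps under this assumption, which makes it easy to size a nightly suite before launching it.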

bash
curl https://coasty.ai/v1/runs \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"machine_id": "vm-12345", "task": "Open https://example.com, click the Get Started button, fill the name field with \"Tester\", then assert the page title contains \"Welcome\"", "cua_version": "v3", "max_steps": 50, "deadline_seconds": 600, "on_awaiting_human": "pause"}'
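The same request can be assembled from Python, which sidesteps shell quoting problems in the task text. This is a sketch using only the standard library; the field names and the X-API-Key header come from the curl example, while the helper name build_run_request and the placeholder key are assumptions for illustration. The request is built but not sent.

```python
import json
import os
import urllib.request

API_URL = "https://coasty.ai/v1/runs"


def build_run_request(api_key: str, machine_id: str, task: str,
                      max_steps: int = 50,
                      deadline_seconds: int = 600) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/runs request mirroring the curl example."""
    payload = {
        "machine_id": machine_id,
        "task": task,
        "cua_version": "v3",
        "max_steps": max_steps,
        "deadline_seconds": deadline_seconds,
        "on_awaiting_human": "pause",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),  # json.dumps escapes inner quotes
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    api_key=os.environ.get("COASTY_API_KEY", "placeholder-key"),  # placeholder is hypothetical
    machine_id="vm-12345",
    task=('Open https://example.com, click the Get Started button, '
          'fill the name field with "Tester", '
          'then assert the page title contains "Welcome"'),
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would submit it; the shape of the response body is not documented in this tutorial, so parsing it is left out here.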

QA bot job flow

  • POST /v1/runs with a machine_id and a natural language task that describes the test steps.
  • Server provisions a cloud VM and launches a computer use agent that sees the screen and acts.
  • Each agent step costs $0.05. The run stops once it reaches max_steps or deadline_seconds, whichever comes first.
  • Use on_awaiting_human to control what the run does when the agent hits a prompt that needs your approval.
  • Poll GET /v1/runs/{id} to check status, or stream events with GET /v1/runs/{id}/events.
  • For more control, use POST /v1/sessions to create a stateful trajectory and loop capture, predict, act.
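The polling step in the flow above can be sketched as a small loop. The code treats succeeded, failed, cancelled, and timed_out as the states that end a run, which is a reading of the state names rather than documented behavior. The fetch_status callable is a hypothetical stand-in for a GET /v1/runs/{id} call that returns the status string, so the example runs without network access.

```python
import time
from typing import Callable

# Assumed terminal states, inferred from the state names listed in this tutorial.
TERMINAL_STATES = {"succeeded", "failed", "cancelled", "timed_out"}


def wait_for_run(fetch_status: Callable[[str], str], run_id: str,
                 poll_seconds: float = 2.0, max_polls: int = 300,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the run reaches a terminal state and return that state."""
    for _ in range(max_polls):
        status = fetch_status(run_id)
        if status in TERMINAL_STATES:
            return status
        if status == "awaiting_human":
            # The run is paused on a prompt; surface it so someone can approve.
            print(f"run {run_id} is waiting for human approval")
        sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")


# Demo with canned statuses instead of real HTTP calls.
canned = iter(["queued", "running", "running", "succeeded"])
final_state = wait_for_run(lambda run_id: next(canned), "run-demo",
                           sleep=lambda seconds: None)
print(final_state)  # succeeded
```

In a CI job, the returned state maps naturally to an exit code: succeeded passes the build and anything else fails it.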

POST /v1/runs is the single entry point to drive a full QA test with real screen interaction.

Where this beats brittle automation

Traditional automation relies on XPath, CSS selectors, and fixed element IDs. When a layout changes, the selector no longer matches and the script fails. The Computer Use API lets the agent see the screen, understand context, and click buttons by description instead of by selector. It can handle pop-ups, dynamic content, and mixed UI environments such as desktop tools and browsers in the same session. Because tests describe intent rather than page structure, they need fewer updates when the UI changes and are less likely to break on complex workflows.

You can now build a self-running QA testing bot that drives real browsers and desktops with natural language tasks. Extend this by chaining multiple runs, integrating with test reports, or using workflows to orchestrate multi-step test suites. Get your API key at https://coasty.ai/developers and start automating your QA flows.
