Build a Self-Running QA Testing Bot with the Computer Use API
Writing brittle selectors and keeping your test suite in sync with UI changes is exhausting. You need a QA bot that sees the screen, understands what it is looking at, and acts like a human. The Coasty Computer Use API lets you spawn a real computer (a cloud VM) with a browser and send it instructions. The agent captures screenshots, reasons about them, and returns actions such as click, type, and scroll. You can assert on the result and stop the test when the job is done. In this guide you will build a self-running QA bot that navigates to a web app, fills a form, and checks the success message.
How the Computer Use API works
The Computer Use API drives a real desktop environment, not just API calls. It uses Task Runs to orchestrate the whole test. You POST to /v1/runs with a machine_id, a task description, and optional instructions. The cua_version defaults to v3, but you can use v4 for an autonomous agent with a pass/fail verifier. The server provisions a cloud VM, starts a browser, and runs your task. Each step the agent takes costs $0.05. A run's status is one of queued, running, awaiting_human, succeeded, failed, cancelled, or timed_out. You can stream events with GET /v1/runs/{id}/events to watch progress in real time.
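A client that polls a run needs to tell the statuses that end it from the ones that do not. The helper below is a minimal sketch of that split. The function name run_phase is hypothetical, and treating awaiting_human as non-terminal is an assumption based on its name, not something the API reference is quoted as saying here.

```shell
#!/usr/bin/env bash
# Classify a run status string into a coarse phase.
# run_phase is a hypothetical helper, not part of the Coasty API.
# Assumption: queued, running, and awaiting_human mean the run is still open.
run_phase() {
  case "$1" in
    succeeded|failed|cancelled|timed_out) echo "terminal" ;;
    queued|running|awaiting_human)        echo "pending" ;;
    *)                                    echo "unknown" ;;  # e.g. "null" from a malformed response
  esac
}
```

A polling loop could keep sleeping while run_phase prints pending, and bail out on unknown instead of looping forever on an unexpected response.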
#!/usr/bin/env bash
# Create a machine and run a QA test task with the Computer Use API.
# Export COASTY_API_KEY with a valid key from https://coasty.ai/developers
API_KEY="${COASTY_API_KEY:?Set COASTY_API_KEY before running this script}"
BASE_URL="https://coasty.ai/v1"

# Create a machine - you can reuse machine_id for multiple runs
MACHINE_RESPONSE=$(curl -s -X POST "${BASE_URL}/machines" \
  -H "X-API-Key: ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "os": "ubuntu_24_04",
    "size": "small",
    "image": "browser"
  }')
machine_id=$(echo "$MACHINE_RESPONSE" | jq -r '.machine_id')
echo "Created machine_id: ${machine_id}"

# Create a QA test run that opens a browser, fills a form, and asserts success
RUN_RESPONSE=$(curl -s -X POST "${BASE_URL}/runs" \
  -H "X-API-Key: ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d "{
    \"machine_id\": \"${machine_id}\",
    \"task\": \"Navigate to https://example.com/contact, fill the name field with 'Bot QA', type 'Hello' in the message box, and click the submit button. Wait for the page to load and confirm that the success message 'Thank you for your message!' is visible.\",
    \"cua_version\": \"v3\",
    \"max_steps\": 50,
    \"deadline_seconds\": 300
  }")
run_id=$(echo "$RUN_RESPONSE" | jq -r '.run_id')
echo "Created run_id: ${run_id}"

# Poll the run status until it reaches a terminal state
echo "Waiting for run to complete..."
exit_code=1  # assume failure until the run reports success
while true; do
  STATUS_RESPONSE=$(curl -s -X GET "${BASE_URL}/runs/${run_id}" \
    -H "X-API-Key: ${API_KEY}")
  status=$(echo "$STATUS_RESPONSE" | jq -r '.status')
  echo "Status: ${status}"
  if [ "$status" = "succeeded" ]; then
    echo "QA test PASSED."
    exit_code=0
    break
  elif [ "$status" = "failed" ] || [ "$status" = "cancelled" ] || [ "$status" = "timed_out" ]; then
    echo "QA test did not pass (status: ${status})."
    echo "$STATUS_RESPONSE" | jq '.'
    break
  fi
  sleep 2
done

# Clean up: stop the machine, then pass the result to the caller (for example CI)
curl -s -X POST "${BASE_URL}/machines/${machine_id}/stop" \
  -H "X-API-Key: ${API_KEY}"
exit "$exit_code"

What you get with a Computer Use QA bot
- A real browser on a cloud VM (ubuntu_24_04, small size).
- A task that describes the test in natural language (e.g., fill form and check success).
- cua_version v3 for explicit steps, or v4 for an autonomous agent with a pass/fail verifier.
- max_steps up to 50 and deadline_seconds up to 300 to keep tests fast.
- Status updates: queued, running, awaiting_human, succeeded, failed, cancelled, timed_out.
- Billed $0.05 per agent step (the server drives the browser and desktop).
- Cancel or resume runs via POST /v1/runs/{id}/cancel and POST /v1/runs/{id}/resume, and stop a machine via POST /v1/machines/{id}/stop.
- Stream live events with GET /v1/runs/{id}/events to add progress UI or logging.
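Because billing is per step, max_steps also bounds what a single run can cost. The sketch below works out that ceiling from the $0.05 per-step price quoted above. max_run_cost is a hypothetical helper name, and the calculation assumes there are no other charges, which this guide does not state either way.

```shell
#!/usr/bin/env bash
# Upper bound on the cost of one run: max_steps * price per step.
# max_run_cost is a hypothetical helper; the default price is the $0.05
# per agent step quoted in this guide. awk handles the decimal arithmetic.
max_run_cost() {
  local max_steps="$1"
  local price_per_step="${2:-0.05}"
  awk -v steps="$max_steps" -v price="$price_per_step" \
    'BEGIN { printf "%.2f\n", steps * price }'
}
```

With the example's max_steps of 50, max_run_cost 50 prints 2.50, so one run of the contact-form test costs at most $2.50 in agent steps under that assumption.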
A QA bot that drives a real browser and asserts results like a human.
Where this beats brittle automation
Traditional E2E frameworks rely on CSS selectors, XPath, or IDs. When the UI changes, your tests break. The Computer Use API sees the screen. It captures a screenshot, reasons about it, and returns actions such as click, type, and scroll. Because it works like a human, it adapts to layout changes, hidden elements, and dynamic content. You write your QA tests in natural language instead of maintaining a fragile selector map. This makes your test suite more resilient and easier to update.
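Since each test is just a sentence, the test definition reduces to a string that gets placed in the run payload. The sketch below builds that payload with jq so that quotes inside the task text are escaped correctly. build_run_payload is a hypothetical helper name, and the field names and values are copied from the example request in this guide, not from an API reference.

```shell
#!/usr/bin/env bash
# Build the JSON body for POST /v1/runs from a machine id and a
# natural-language task. build_run_payload is a hypothetical helper.
# jq --arg passes the values as strings and escapes quotes safely.
build_run_payload() {
  local machine_id="$1"
  local task="$2"
  jq -n \
    --arg machine_id "$machine_id" \
    --arg task "$task" \
    '{
      machine_id: $machine_id,
      task: $task,
      cua_version: "v3",
      max_steps: 50,
      deadline_seconds: 300
    }'
}
```

The output can be passed to curl with -d "$(build_run_payload "$machine_id" "$task")", which avoids hand-escaping the JSON when a task description contains double quotes.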
You now have a self-running QA bot that uses the Computer Use API to drive a real browser and assert results. You can extend this bot to run regression suites, visual regression checks, or smoke tests across multiple browsers. The API handles the VM lifecycle, so you only pay for the steps the agent takes. Ready to try it yourself? Get a key at https://coasty.ai/developers and start building your QA bot.