Tutorial

Ground UI Elements to Coordinates with the Coasty /v1/ground Endpoint

Alex Thompson | 5 min
Most UI automation relies on selectors: CSS classes, data attributes, XPath. These break when a designer renames a class or adds a wrapper div. The Coasty computer use API avoids this by grounding each action to the visual position of an element. The /v1/ground endpoint takes a screenshot and a short description of an element and returns its x,y coordinates, so your agent can click, type, or hover at that location.

How /v1/ground works

The /v1/ground endpoint maps a screenshot and an element description to x,y coordinates. It is part of the Coasty computer use API and runs on the same cloud infrastructure that powers the agent. You send a base64-encoded screenshot plus a natural-language description of the target element, and you receive a status and a coordinates object. The endpoint is stateless and does not require a session ID, so each request must carry its own screenshot.

bash
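#!/bin/bash
# Ground a UI element to coordinates with Coasty /v1/ground
# Requires: curl, jq, and an API key in ~/.coasty-key or $COASTY_API_KEY

API_BASE="https://coasty.ai/v1"
# Read the key file if present; otherwise fall back to the environment variable
API_KEY=$(cat "$HOME/.coasty-key" 2>/dev/null || echo "${COASTY_API_KEY:-}")

if [ -z "$API_KEY" ]; then
  echo "Error: no API key found in ~/.coasty-key or COASTY_API_KEY" >&2
  exit 1
fi

# Example screenshot (replace with the path to your own PNG)
SCREENSHOT_FILE="/tmp/screenshot.png"

# Build the JSON body with jq and stream it to curl on stdin. A base64 screenshot
# is usually too large to pass as a command-line argument, and `tr -d '\n'`
# removes line wraps portably (`base64 -w0` is GNU-only).
base64 < "$SCREENSHOT_FILE" | tr -d '\n' \
  | jq -Rs '{screenshot: ., description: "the primary CTA button in the top right"}' \
  | curl -sS --location "$API_BASE/ground" \
      --header "Content-Type: application/json" \
      --header "X-API-Key: $API_KEY" \
      --data-binary @- \
  | jq '.'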

Request fields

  • screenshot: base64-encoded PNG image of the UI.
  • description: plain-language description of the target element.
  • No optional parameters are defined in the current API spec.

Response fields

  • status: 'success' or an error code (e.g., 'invalid_input').
  • coordinates: an object with x and y integers representing the pixel position.
  • message: optional human-readable explanation.
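
The response fields above can be checked before acting on them. The following is a minimal sketch, assuming the response is a JSON object shaped exactly as listed (status, coordinates with x and y, optional message). parse_ground_response is a hypothetical helper name, not part of the Coasty API.

```bash
#!/bin/bash
# Sketch: extract click coordinates from a /v1/ground response.
# Assumes the response shape described above; parse_ground_response is a
# hypothetical helper, not part of any Coasty SDK.

# Prints "x y" on success; writes the error to stderr and returns non-zero otherwise.
parse_ground_response() {
  local response="$1"
  local status
  # Fails here (return 1) if the response is not valid JSON
  status="$(jq -r '.status // "missing_status"' <<<"$response")" || return 1

  if [ "$status" != "success" ]; then
    local message
    message="$(jq -r '.message // "no message provided"' <<<"$response")"
    echo "ground failed: $status ($message)" >&2
    return 1
  fi

  # -e makes jq exit non-zero when no coordinates are emitted
  jq -er '.coordinates | select(.x != null and .y != null) | "\(.x) \(.y)"' <<<"$response"
}
```

With the curl output stored in a variable such as RESPONSE, `read -r X Y < <(parse_ground_response "$RESPONSE")` would place the two integers in X and Y, or leave them empty if grounding failed.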

POST /v1/ground costs $0.03 per request and returns exact x,y coordinates for the described element.
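
To budget for a test suite, the per-request price can be multiplied out. This is a small sketch that uses only the $0.03 figure quoted above; estimate_ground_cost is a hypothetical helper, and it works in whole cents to avoid floating-point rounding in the shell.

```bash
#!/bin/bash
# Price per request in cents, taken from the $0.03 figure quoted above
GROUND_COST_CENTS=3

# estimate_ground_cost (hypothetical helper): prints the dollar cost of N requests
estimate_ground_cost() {
  local requests="$1"
  local cents=$(( requests * GROUND_COST_CENTS ))
  printf '$%d.%02d\n' $(( cents / 100 )) $(( cents % 100 ))
}
```

For example, a suite that grounds 40 elements per run and runs 25 times a day makes 1,000 requests, and `estimate_ground_cost 1000` prints `$30.00`.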

Where this beats brittle automation

When a product team updates a UI, they often refactor selectors without warning. Your test suite breaks and you spend hours updating XPath or CSS rules. With Coasty's computer use API, you describe what you want in plain language and the model grounds that description to coordinates on the current screenshot. This means your automation tracks the visual element even when classes change, making your tests more resilient and your builds more stable.
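
Once coordinates are available, acting on them is a separate step that the article does not prescribe. As one possible sketch, the helper below clicks at a point using xdotool. This assumes an X11 desktop with xdotool installed and a screenshot captured at the screen's native resolution, so that screenshot pixels map 1:1 to screen coordinates. click_at and the DRY_RUN switch are hypothetical names introduced for illustration.

```bash
#!/bin/bash
# click_at (hypothetical helper): move the pointer to x,y and left-click.
# Assumes X11 + xdotool and a 1:1 mapping between screenshot and screen pixels.
click_at() {
  local x="$1" y="$2"
  # Reject anything that is not a non-negative integer before moving the pointer
  if ! [[ "$x" =~ ^[0-9]+$ && "$y" =~ ^[0-9]+$ ]]; then
    echo "click_at: expected integer coordinates, got '$x' '$y'" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it, useful in CI without a display
    echo "xdotool mousemove $x $y click 1"
  else
    xdotool mousemove "$x" "$y" click 1
  fi
}
```

Because the endpoint is stateless, a sensible pattern is to take a fresh screenshot and call /v1/ground again before each click rather than caching coordinates across UI changes.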

Use /v1/ground to map elements to coordinates and build reliable, visual UI automation on top of the Coasty computer use API. Ready to try it yourself? Get an API key at https://coasty.ai/developers.
