Turn PyAutoGUI Code into Structured Actions with the Free Parse Endpoint
You have a PyAutoGUI script that clicks exact pixel coordinates. It breaks the moment a UI changes by a few pixels. You need a computer use agent that sees the real screen and drives the desktop like a human. The /v1/parse endpoint converts your PyAutoGUI code into structured actions. You get machine-readable steps without writing selectors yourself. This turns brittle coordinate scripts into robust computer use automation.
How /v1/parse works
POST to https://coasty.ai/v1/parse with a base64-encoded screenshot and your PyAutoGUI code. The endpoint returns a JSON object with a list of structured actions. Each action has type, target, and parameters. You can feed these actions into a computer use agent loop (capture → predict → act) or directly into a task run. This endpoint is free, so you can iterate on your automation logic without paying per step.
import base64
import os
import requests
COASTY_API_KEY = os.environ.get("COASTY_API_KEY")
BASE_URL = "https://coasty.ai/v1"
# Base64 encode a screenshot (replace this with your own image)
with open("screenshot.png", "rb") as f:
screenshot_b64 = base64.b64encode(f.read()).decode("utf-8")
pyautogui_code = '''
import pyautogui
pyautogui.moveTo(100, 200)
pyautogui.click()
pyautogui.moveTo(300, 400)
pyautogui.doubleClick()
'''
response = requests.post(
f"{BASE_URL}/parse",
headers={
"X-API-Key": COASTY_API_KEY,
"Content-Type": "application/json"
},
json={
"screenshot": screenshot_b64,
"pyautogui_code": pyautogui_code
}
)
if response.ok:
structured = response.json()
print("Structured actions:")
print(structured)
else:
print("Error:", response.text)What you get back
- ●An array of actions, each with type (e.g., 'move', 'click', 'double_click'), x and y coordinates, and optional parameters.
- ●The endpoint preserves the intent of your PyAutoGUI script but outputs a format that a computer use agent can consume.
- ●No API usage charge for parse. You pay only for agent steps when the agent acts on the structured actions.
POST /v1/parse with a screenshot and pyautogui_code to get structured actions for free.
Where this beats brittle automation
PyAutoGUI scripts rely on exact pixel coordinates. A one-pixel shift breaks the automation. A computer use agent that parses your PyAutoGUI code into structured actions can still use those coordinates as starting points. When the UI changes, you re-run parse on the new screenshot. The agent adapts. You get a computer use agent that understands the screen and reacts to real changes, not to fragile hard-coded positions.
Use the free /v1/parse endpoint to turn PyAutoGUI scripts into structured actions. Build a computer use agent that sees the screen and acts like a human. Get a key at https://coasty.ai/developers to start automating real desktops.