Turn PyAutoGUI Code into Structured Actions with the Free Coasty Parse Endpoint
Most pyautogui scripts rely on absolute coordinates. You manually calculate where a button is, copy pixel values, and paste them into click(x, y). This breaks every time the layout changes. You can rebuild the script by hand, or you can delegate the translation to the computer use API. The /v1/parse endpoint turns raw pyautogui code into a structured list of actions that your agent can follow. It is free, it runs locally, and you can reuse the output across runs, sessions, or workflows.
How the Parse Endpoint Works
Send a POST request to https://coasty.ai/v1/parse containing the pyautogui source code and any optional configuration. The endpoint parses the script and returns a JSON array of actions. Each action has a type, plus optional coordinates and metadata. You can then feed those actions into /v1/predict or /v1/sessions/{id}/predict to drive a computer use agent. This decouples the scripting layer from the execution layer and lets you maintain a single source of truth.
import os
import requests

# Read the API key from the environment rather than hard-coding it.
API_KEY = os.getenv("COASTY_API_KEY")
BASE_URL = "https://coasty.ai/v1"

# The pyautogui script to translate. It is sent as text and is not executed here.
pyautogui_code = '''import time
import pyautogui

time.sleep(1)
pyautogui.click(1000, 500)
time.sleep(0.5)
pyautogui.write("hello")
pyautogui.press("enter")
'''

resp = requests.post(
    f"{BASE_URL}/parse",
    headers={
        "X-API-Key": API_KEY,
        "Content-Type": "application/json",
    },
    json={
        "code": pyautogui_code,
        "language": "python",
    },
)
resp.raise_for_status()  # Stop on HTTP errors before reading the body.
actions = resp.json()["actions"]

print("Parsed actions:")
for i, a in enumerate(actions, 1):
    print(f"{i}. {a.get('type')} at {a.get('coordinates')}")

Response Fields Reference
- actions: an array of parsed actions. Each action is an object with a type string (e.g., click, type, press).
- coordinates: an optional object holding x and y values for click and other position-based steps.
- metadata: optional extra fields like delay_seconds, keys, or modifiers.
- error: if parsing fails, the response includes an error object with a code and message.
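The fields above can be read into small typed objects before you pass them on. The following is a minimal sketch, not the official client: it assumes coordinates and metadata sit inside each action object and that error appears at the top level of the response. The names ParsedAction, ParseError, and read_actions are hypothetical helpers, and the sample payload is illustrative data rather than real endpoint output.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedAction:
    """One parsed step; only `type` is treated as required."""
    type: str
    coordinates: Optional[dict] = None
    metadata: dict = field(default_factory=dict)


class ParseError(Exception):
    """Raised when the response carries an error object."""


def read_actions(payload: dict) -> List[ParsedAction]:
    # Assumed shape: {"actions": [...]} on success, {"error": {...}} on failure.
    error = payload.get("error")
    if error:
        raise ParseError(f"{error.get('code')}: {error.get('message')}")
    return [
        ParsedAction(
            type=a["type"],
            coordinates=a.get("coordinates"),
            metadata=a.get("metadata") or {},
        )
        for a in payload.get("actions", [])
    ]


# Illustrative payload only; the exact metadata layout is an assumption.
sample = {
    "actions": [
        {"type": "click", "coordinates": {"x": 1000, "y": 500}},
        {"type": "press", "metadata": {"keys": ["enter"]}},
    ]
}

for action in read_actions(sample):
    print(action.type, action.coordinates, action.metadata)
```

Treating coordinates as optional keeps keyboard-only steps such as press from failing on a missing key.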
POST /v1/parse is free and runs client-side; no credits are deducted.
Where This Beats Brittle Automation
Fixed coordinates become brittle the moment the window resizes or the page layout shifts. A computer use agent can see the screen and adapt, but you must give it clear, structured instructions. The parse endpoint gives you that structure without writing a parser from scratch. You can store the parsed actions as a JSON file, version them, and run them repeatedly through Sessions or Workflows. This approach is more resilient than maintaining giant if/else blocks of pixel checkers, and it lets you leverage the vision-based computer use API for higher-level tasks while keeping low-level steps in a script you control.
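Storing the parsed actions as a versioned JSON file could look like the sketch below. It uses only the Python standard library. The helper names save_actions and load_actions, the file name login_flow.actions.json, and the sample actions are hypothetical; substitute the list returned by the parse endpoint.

```python
import json
import tempfile
from pathlib import Path


def save_actions(actions, path):
    # Sorted keys and fixed indentation keep diffs small under version control.
    text = json.dumps(actions, indent=2, sort_keys=True) + "\n"
    Path(path).write_text(text, encoding="utf-8")


def load_actions(path):
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Illustrative data standing in for the "actions" array from the response.
actions = [
    {"type": "click", "coordinates": {"x": 1000, "y": 500}},
    {"type": "press", "metadata": {"keys": ["enter"]}},
]

# A temporary directory is used here so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "login_flow.actions.json"
    save_actions(actions, target)
    restored = load_actions(target)

print(restored == actions)  # True: the round trip preserves the actions
```

In a real project you would write the file next to the original script and commit both, so a change in the parsed output shows up in review.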
Use the free parse endpoint to translate existing pyautogui scripts into structured actions, then feed those actions into /v1/sessions or /v1/runs to run them with a computer use agent. Start building with the computer use API today. Get a key at https://coasty.ai/developers.