Tutorial

Turn PyAutoGUI Code into Structured Actions with the Free Parse Endpoint

James Liu||5 min
+N

You have a PyAutoGUI script that clicks exact pixel coordinates. It breaks the moment a UI changes by a few pixels. You need a computer use agent that sees the real screen and drives the desktop like a human. The /v1/parse endpoint converts your PyAutoGUI code into structured actions. You get machine-readable steps without writing selectors yourself. This turns brittle coordinate scripts into robust computer use automation.

How /v1/parse works

POST to https://coasty.ai/v1/parse with a base64-encoded screenshot and your PyAutoGUI code. The endpoint returns a JSON object with a list of structured actions. Each action has type, target, and parameters. You can feed these actions into a computer use agent loop (capture → predict → act) or directly into a task run. This endpoint is free, so you can iterate on your automation logic without paying per step.

python
import base64
import os
import requests

COASTY_API_KEY = os.environ.get("COASTY_API_KEY")
BASE_URL = "https://coasty.ai/v1"

# Base64 encode a screenshot (replace this with your own image)
with open("screenshot.png", "rb") as f:
    screenshot_b64 = base64.b64encode(f.read()).decode("utf-8")

pyautogui_code = '''
import pyautogui
pyautogui.moveTo(100, 200)
pyautogui.click()
pyautogui.moveTo(300, 400)
pyautogui.doubleClick()
'''

response = requests.post(
    f"{BASE_URL}/parse",
    headers={
        "X-API-Key": COASTY_API_KEY,
        "Content-Type": "application/json"
    },
    json={
        "screenshot": screenshot_b64,
        "pyautogui_code": pyautogui_code
    }
)

if response.ok:
    structured = response.json()
    print("Structured actions:")
    print(structured)
else:
    print("Error:", response.text)

What you get back

  • An array of actions, each with type (e.g., 'move', 'click', 'double_click'), x and y coordinates, and optional parameters.
  • The endpoint preserves the intent of your PyAutoGUI script but outputs a format that a computer use agent can consume.
  • No API usage charge for parse. You pay only for agent steps when the agent acts on the structured actions.

POST /v1/parse with a screenshot and pyautogui_code to get structured actions for free.

Where this beats brittle automation

PyAutoGUI scripts rely on exact pixel coordinates. A one-pixel shift breaks the automation. A computer use agent that parses your PyAutoGUI code into structured actions can still use those coordinates as starting points. When the UI changes, you re-run parse on the new screenshot. The agent adapts. You get a computer use agent that understands the screen and reacts to real changes, not to fragile hard-coded positions.

Use the free /v1/parse endpoint to turn PyAutoGUI scripts into structured actions. Build a computer use agent that sees the screen and acts like a human. Get a key at https://coasty.ai/developers to start automating real desktops.

Want to see this in action?

View Case Studies
Try Coasty Free