Turn PyAutoGUI Code into Structured Actions with the Free Coasty Parse Endpoint

James Liu | 5 min read

PyAutoGUI scripts are easy to write but hard to maintain across machines and UI updates. The Coasty parse endpoint turns them into structured actions you can feed into a computer use agent. This lets you reuse existing snippets and keep them up to date without rewriting everything.

How the parse endpoint works

Send a POST request to /v1/parse with a base64-encoded screenshot and your PyAutoGUI code. The server returns a structured action list you can feed into a computer use API call. The endpoint is free, so you can iterate quickly without spending credits.

python
import base64
import os
import requests

def run_pyautogui_parse():
    url = "https://coasty.ai/v1/parse"
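    api_key = os.getenv("COASTY_API_KEY")
    if not api_key:
        # Fail early instead of sending a request with a missing key
        raise RuntimeError("Set the COASTY_API_KEY environment variable first")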

    # Replace with a real screenshot of your target window
    screenshot_path = "target_window.png"
    with open(screenshot_path, "rb") as f:
        screenshot_b64 = base64.b64encode(f.read()).decode("utf-8")

    pyautogui_code = """
import pyautogui
pyautogui.moveTo(100, 200)
pyautogui.click()
pyautogui.hotkey('ctrl', 'c')
pyautogui.moveTo(300, 300)
pyautogui.click()
"""

    payload = {
        "screenshot": screenshot_b64,
        "code": pyautogui_code
    }

    headers = {
        "X-API-Key": api_key,
        "Content-Type": "application/json"
    }

    # A timeout keeps the script from hanging if the server does not respond
    resp = requests.post(url, json=payload, headers=headers, timeout=30)
    resp.raise_for_status()
    print(resp.json())

if __name__ == "__main__":
    run_pyautogui_parse()
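
Before sending a snippet, it can help to confirm locally that it is valid Python and to see which PyAutoGUI calls it contains. The sketch below is only a local sanity check built on the standard-library ast module. It is not how the parse endpoint works internally, and the helper name list_pyautogui_calls is hypothetical.

```python
import ast

def _arg_value(node):
    """Return the literal value of an argument, or its source text if not a literal."""
    try:
        return ast.literal_eval(node)
    except (ValueError, TypeError):
        return ast.unparse(node)  # requires Python 3.9+

def list_pyautogui_calls(source):
    """Return (function_name, positional_args) for each pyautogui.<name>(...) call."""
    tree = ast.parse(source)  # raises SyntaxError if the snippet is not valid Python
    found = [
        node for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "pyautogui"
    ]
    # Report calls in source order
    found.sort(key=lambda n: (n.lineno, n.col_offset))
    return [(n.func.attr, [_arg_value(a) for a in n.args]) for n in found]

snippet = """
import pyautogui
pyautogui.moveTo(100, 200)
pyautogui.click()
pyautogui.hotkey('ctrl', 'c')
"""

for name, args in list_pyautogui_calls(snippet):
    print(name, args)
```

A snippet that fails this check would not run as PyAutoGUI code either, so catching the SyntaxError locally saves a round trip.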

What you get back

  • A JSON array of actions, each with type, x, y, and any required parameters.
  • The endpoint never modifies your original code; it only describes the intent.
  • Structured actions can be fed into the /v1/predict endpoint for execution.
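
As a rough illustration of working with such a list, the sketch below counts actions by type and collects the coordinates they touch. The sample data is made up to match the description above: the field names type, x, and y come from that description, while the type values, the keys field, and the helper name summarize_actions are assumptions, and the real response may be shaped differently.

```python
from collections import Counter

# Hypothetical sample shaped like the description above; the real response
# may use different type values, field names, or nesting.
sample_actions = [
    {"type": "move", "x": 100, "y": 200},
    {"type": "click", "x": 100, "y": 200},
    {"type": "hotkey", "keys": ["ctrl", "c"]},
]

def summarize_actions(actions):
    """Count actions by type and collect the screen coordinates they touch."""
    if not isinstance(actions, list):
        raise TypeError("expected a list of action objects")
    counts = Counter()
    points = []
    for action in actions:
        counts[action.get("type", "unknown")] += 1
        # Only some actions carry coordinates, so check before reading them
        if "x" in action and "y" in action:
            points.append((action["x"], action["y"]))
    return dict(counts), points

counts, points = summarize_actions(sample_actions)
print(counts)
print(points)
```

A summary like this makes it easy to spot a parsed script that clicks somewhere unexpected before you pass the actions on for execution.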

POST /v1/parse is free and returns a structured action list from PyAutoGUI code.

Where this beats brittle automation

Automation that relies only on application APIs or element selectors breaks when the UI changes or when elements have no stable IDs. A computer use agent works from the same screen a human sees, so it can adapt to layout shifts. Feeding it structured actions from the parse endpoint combines the readability of PyAutoGUI with the resilience of computer use.

Now you can take old PyAutoGUI snippets, parse them, and run them as robust computer use actions. Get a key at https://coasty.ai/developers to start using the parse endpoint and other computer use API features.