Tutorial

Turn PyAutoGUI Code into Structured Actions with the Free Parse Endpoint

Sarah Chen||5 min
+N

PyAutoGUI is great for quick scripts, but it produces brittle pixel coordinates and mouse moves. You want reusable, inspectable steps that describe what the agent sees and does. The /v1/parse endpoint turns PyAutoGUI code into structured actions. You send the script, the API returns a plan of steps with coordinates and hints. You can then use those actions in a task run or workflow. This is a free endpoint for building a library of UI workflows.

How /v1/parse works

POST https://coasty.ai/v1/parse with a JSON body containing a script string. The endpoint parses the code and returns a list of structured actions. Each action has a type (e.g., click, type, scroll), coordinates, and a hint describing what the step does. You can pipe the resulting actions into /v1/sessions/{id}/predict or pass them directly into a task run. There is no charge for this call.

python
import os, base64, json, requests

COASTY_API_KEY = os.getenv("COASTY_API_KEY")
if not COASTY_API_KEY:
    raise RuntimeError("COASTY_API_KEY is required")

pyautogui_script = """
import pyautogui
time.sleep(1)
pyautogui.moveTo(500, 300)
pyautogui.click()
pyautogui.typewrite("hello world")
pyautogui.press("enter")
"""

url = "https://coasty.ai/v1/parse"
headers = {"X-API-Key": COASTY_API_KEY}
payload = {"script": pyautogui_script}

resp = requests.post(url, headers=headers, json=payload)
resp.raise_for_status()
actions = resp.json()
print(json.dumps(actions, indent=2))

What the response contains

  • The endpoint returns a JSON object with a key for actions (list).
  • Each action has a type field (e.g., click, type, scroll).
  • Coordinates are provided as x and y integers.
  • A hint field gives a human-readable description of the step.
  • There is no billing for this call. It is free.
  • You can reuse the actions in a session or task run.

Call POST /v1/parse once, reuse the structured actions in multiple runs.

Where this beats brittle automation

Pixel coordinates break when layouts shift. Selectors change or break. A computer use agent that sees the screen and acts like a human adapts to layout changes. By parsing PyAutoGUI scripts into structured actions, you can build a library of robust workflows that run on any screen size. You get inspectable, testable steps instead of magic numbers. Then you can drive those actions with a true computer use agent on real machines.

Create a reusable library of UI workflows by parsing PyAutoGUI scripts. Use the actions in sessions or task runs. Get a key at https://coasty.ai/developers and start building computer use agents that see and act like humans.

Want to see this in action?

View Case Studies
Try Coasty Free