Turn PyAutoGUI Code Into Structured Actions for Free with the Coasty Parse Endpoint
You have PyAutoGUI scripts that work on your machine, and you want to turn them into reusable, stateful computer use agents. The problem is that PyAutoGUI is pixel-based and fragile: a single UI change can break your script. The Coasty parse endpoint addresses this by converting PyAutoGUI code into structured actions that state what each step does (a click, text entry, a key press) in a form the Coasty computer use API can execute. The endpoint is free. It reads your Python logic and returns a normalized action payload ready to drive the Coasty computer use API. This lets you build agents that see the screen and act the way a human would, instead of replaying brittle pixel patterns.
How it works
The parse endpoint takes a string of Python code that uses the pyautogui module and returns a structured list of actions. The request requires two fields. The first is 'code', a string containing your PyAutoGUI logic. The second is 'cua_version', the version of the computer use API you intend to use, such as 'v3' or 'v4'. The endpoint runs the code in a sandbox and emits actions that describe clicks, text entry, keyboard presses, and other interactions in a normalized format. Each action includes 'type', 'x', 'y', and 'text' fields where relevant. The response body is a JSON object with a key named 'actions' that holds the list. This output can be fed directly into the predict endpoint of the computer use API to drive real desktop sessions.
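The request described above can also be assembled from Python. The following is a minimal sketch using only the standard library, not an official client: the URL and header shape are copied from the curl example in this article, and the helper names build_parse_request and parse_pyautogui are hypothetical. Building the body with json.dumps means the newlines and double quotes inside the code string are escaped for you.

```python
import json
import urllib.request
from typing import Any, Dict

# URL copied from the curl example in this article.
PARSE_URL = "https://coasty.ai/v1/parse"


def build_parse_request(code: str, cua_version: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the parse endpoint."""
    # json.dumps escapes the newlines and quotes inside the code string.
    body = json.dumps({"code": code, "cua_version": cua_version}).encode("utf-8")
    return urllib.request.Request(
        PARSE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_pyautogui(code: str, cua_version: str, api_key: str) -> Dict[str, Any]:
    """Send the request and decode the JSON response body (hypothetical helper)."""
    request = build_parse_request(code, cua_version, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the exact body and headers before any network call is made.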
# This example needs only curl and jq; no Python packages are required
export COASTY_API_KEY=$(cat ~/.coasty/key)
# Your PyAutoGUI code as a string
PYAUTOGUI_CODE='''import pyautogui
pyautogui.moveTo(100, 100)
pyautogui.click()
pyautogui.typewrite("hello world")
pyautogui.press("enter")
'''
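# Call the free parse endpoint. jq builds the JSON body so that the newlines
# and double quotes inside the code string are escaped correctly.
jq -n --arg code "$PYAUTOGUI_CODE" '{code: $code, cua_version: "v3"}' \
| curl -X POST https://coasty.ai/v1/parse \
-H "Authorization: Bearer $COASTY_API_KEY" \
-H "Content-Type: application/json" \
--data-binary @- \
| jq .
What you get back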
- A JSON object with top-level key 'actions' containing an array.
- Each action has a 'type' field such as 'click', 'type', or 'key_press'.
- Coordinates are normalized to screen pixels for the predict endpoint.
- The endpoint does not execute the code against a running desktop.
- It returns a representation of your intent that the Coasty computer use API can execute.
The parse endpoint is free. Use it to refactor any PyAutoGUI script into structured actions in one call.
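Before passing a response along, it can help to check that it has the shape described above. The sketch below is one way to do that, under stated assumptions: the field names 'actions', 'type', 'x', 'y', and 'text' come from this article, while the Action class, the load_actions helper, and the SAMPLE_BODY string are hypothetical. The sample is hand-written for illustration and is not real endpoint output.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Hand-written sample for illustration only; not captured from the endpoint.
SAMPLE_BODY = (
    '{"actions": ['
    '{"type": "click", "x": 100, "y": 100}, '
    '{"type": "type", "text": "hello world"}'
    "]}"
)


@dataclass
class Action:
    type: str
    x: Optional[float] = None
    y: Optional[float] = None
    text: Optional[str] = None


def load_actions(body: str) -> List[Action]:
    """Decode a response body and check the basic shape of each action."""
    data = json.loads(body)
    raw_actions = data.get("actions") if isinstance(data, dict) else None
    if not isinstance(raw_actions, list):
        raise ValueError("response has no 'actions' array")

    actions = []
    for item in raw_actions:
        if not isinstance(item, dict) or not isinstance(item.get("type"), str):
            raise ValueError("every action needs a string 'type' field")
        # x, y, and text are optional because they appear only where relevant.
        actions.append(
            Action(
                type=item["type"],
                x=item.get("x"),
                y=item.get("y"),
                text=item.get("text"),
            )
        )
    return actions
```

Failing early on a malformed body keeps a bad response from reaching a live desktop session.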
Where this beats brittle automation
PyAutoGUI scripts rely on absolute coordinates. If a window moves or its layout changes, the script clicks the wrong place and fails. The parse endpoint gives you a structured representation of your intent. You can then feed these actions into the Coasty computer use API, which uses vision to see the screen and click or type based on element descriptions. This means your agents adapt to layout changes, window positioning, and dynamic content without you editing coordinates by hand. You still get pixel-based fallbacks when needed. The combination of structured actions from parse and vision-based execution from the computer use API gives you automation that is easier to maintain and survives UI changes that would break a coordinate-only script.
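The fragility of absolute coordinates is easy to show with plain arithmetic. The sketch below involves no Coasty or PyAutoGUI calls; the Rect class, the button offsets, and the window positions are all made-up values chosen for illustration. A click hard-coded at (100, 100) lands on the button while the window sits at one position and misses once the window is moved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        """True when the point lies inside this rectangle."""
        return (
            self.left <= x < self.left + self.width
            and self.top <= y < self.top + self.height
        )

    def moved_to(self, left: int, top: int) -> "Rect":
        """Same size, new top-left corner."""
        return Rect(left, top, self.width, self.height)


def button_rect(window: Rect) -> Rect:
    # Illustrative layout: a 60x30 button placed 10 px right of and 20 px
    # below the window's top-left corner.
    return Rect(window.left + 10, window.top + 20, 60, 30)


# The coordinate a script would hard-code, as in pyautogui.moveTo(100, 100).
HARD_CODED_CLICK = (100, 100)

window = Rect(70, 60, 400, 300)
hits_before_move = button_rect(window).contains(*HARD_CODED_CLICK)
hits_after_move = button_rect(window.moved_to(300, 200)).contains(*HARD_CODED_CLICK)

print(hits_before_move)  # True: the button covers (80..139, 80..109)
print(hits_after_move)   # False: the button now starts at (310, 220)
```

The script did not change between the two checks; only the window position did, which is the failure mode that description-based execution is meant to avoid.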
Use the free parse endpoint to turn your PyAutoGUI scripts into structured actions. Feed them directly into the Coasty computer use API to build agents that see and act like humans. Visit https://coasty.ai/developers to get an API key and start building. For more details on the full computer use API, see the documentation at https://coasty.ai/docs.