Stateful Sessions vs Stateless Predict in the Coasty Computer Use API
Building a computer use agent that executes multi-step workflows, such as logging into a dashboard, filling in a form, and submitting a report, requires reliable trajectory memory. The Coasty Computer Use API offers two ways to compute actions: a stateless /v1/predict endpoint billed at $0.05 per call, and a stateful /v1/sessions/{id}/predict endpoint billed at $0.04 per call that stores the full history of screenshots and actions on the server. This guide explains the difference and shows how to use stateful sessions for long-running tasks.
How it works
Use stateless predict when you want a quick action on a single screenshot. The request requires a base64-encoded screenshot, an instruction, and the cua_version. The response includes a list of actions and a status. Repeat the capture, predict, act loop until the status is done. For stateful sessions, first create a session with POST /v1/sessions, which returns an id and the initial session state. Then POST to /v1/sessions/{id}/predict repeatedly, using the same capture, predict, act loop. The server persists the trajectory and returns the next actions based on the full history. Both endpoints require the X-API-Key header carrying your Coasty API key; the examples read it from the COASTY_API_KEY environment variable.
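The capture, predict, act loop described above can be sketched independently of the HTTP calls. This is a minimal sketch, not Coasty's client code: run_loop, capture, predict, act, and max_steps are hypothetical names, the "running" status and the click action shape in the demo are invented stand-ins, and the only response fields assumed are status and actions. The step cap is an added safeguard so that a task which never reports done does not keep making billed calls.

```python
from typing import Callable


def run_loop(
    capture: Callable[[], str],
    predict: Callable[[str], dict],
    act: Callable[[dict], None],
    max_steps: int = 25,
) -> int:
    """Run capture -> predict -> act until status is "done" or the cap is hit.

    Returns the number of predict calls made (each one is a billed step).
    """
    for step in range(1, max_steps + 1):
        screenshot_b64 = capture()        # fresh base64 screenshot
        result = predict(screenshot_b64)  # one predict call
        if result.get("status") == "done":
            return step
        for action in result.get("actions", []):
            act(action)
    raise RuntimeError(f"task not done after {max_steps} steps")


# Demo with stand-in functions (no network): the fake API finishes on call 3.
calls = {"n": 0}


def fake_predict(_screenshot: str) -> dict:
    calls["n"] += 1
    if calls["n"] >= 3:
        return {"status": "done", "actions": []}
    # Hypothetical in-progress response; the real schema may differ.
    return {"status": "running", "actions": [{"type": "click", "x": 10, "y": 20}]}


performed = []
steps_used = run_loop(lambda: "ZmFrZQ==", fake_predict, performed.append)
print(steps_used, len(performed))
```

Passing predict as a callable means the same loop can wrap either a stateless call or a session call, and it can be exercised with fakes before any billed request is made.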
#!/bin/bash
# Stateless predict example
# Expects COASTY_API_KEY in the environment and a single-line base64
# screenshot in screenshot_base64.txt.
: "${COASTY_API_KEY:?Set COASTY_API_KEY first}"
SCREENSHOT="$(cat screenshot_base64.txt)"

curl -s -X POST https://coasty.ai/v1/predict \
  -H "X-API-Key: $COASTY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "screenshot": "'"$SCREENSHOT"'",
    "instruction": "Click the login button",
    "cua_version": "v3"
  }' | jq .

import os
import base64
import requests


def stateless_predict(screenshot_b64: str, instruction: str, cua_version: str = "v3") -> dict:
    """Send one screenshot to /v1/predict and return the parsed JSON response."""
    # Fail early with a KeyError if the key is not set.
    api_key = os.environ["COASTY_API_KEY"]
    url = "https://coasty.ai/v1/predict"
    resp = requests.post(
        url,
        headers={"X-API-Key": api_key},
        json={
            "screenshot": screenshot_b64,
            "instruction": instruction,
            "cua_version": cua_version,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


# Example usage
with open("screenshot.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

result = stateless_predict(img_b64, "Click the login button")
print(result)

Stateful sessions
- POST /v1/sessions creates a session and returns an id and initial state.
- POST /v1/sessions/{id}/predict returns actions and updates session state.
- Each stateful predict costs $0.04, cheaper than the $0.05 per step for stateless predict.
- The trajectory is stored on the server, so you do not need to re-send every screenshot.
Use stateful sessions when your agent runs for many steps to save money and simplify your client code.
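To put a number on the saving, the per-call prices above can be multiplied out for a run of a given length. This is a rough sketch under one assumption the text does not confirm: that creating a session with POST /v1/sessions is not billed separately. The function and constant names are illustrative.

```python
from decimal import Decimal

STATELESS_PRICE = Decimal("0.05")  # USD per /v1/predict call (from the text)
STATEFUL_PRICE = Decimal("0.04")   # USD per /v1/sessions/{id}/predict call


def run_cost(steps: int, price_per_call: Decimal) -> Decimal:
    """Cost in USD of a run that makes `steps` predict calls."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # Assumption: session creation itself is free, so only predict calls count.
    return price_per_call * steps


for steps in (1, 20, 500):
    stateless = run_cost(steps, STATELESS_PRICE)
    stateful = run_cost(steps, STATEFUL_PRICE)
    print(f"{steps:>4} steps: stateless ${stateless}  stateful ${stateful}  "
          f"saved ${stateless - stateful}")
```

The difference is one cent per step, so it only becomes noticeable on long or frequently repeated runs. Decimal is used instead of float so the cent amounts stay exact.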
import os
import base64
import requests


def create_session(cua_version: str = "v3") -> dict:
    """Create a session; the response includes its id and initial state."""
    api_key = os.environ["COASTY_API_KEY"]
    url = "https://coasty.ai/v1/sessions"
    resp = requests.post(
        url,
        headers={"X-API-Key": api_key},
        json={"cua_version": cua_version},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


def session_predict(session_id: str, screenshot_b64: str, instruction: str) -> dict:
    """Send the latest screenshot; the server already holds the earlier trajectory."""
    api_key = os.environ["COASTY_API_KEY"]
    url = f"https://coasty.ai/v1/sessions/{session_id}/predict"
    resp = requests.post(
        url,
        headers={"X-API-Key": api_key},
        json={
            "screenshot": screenshot_b64,
            "instruction": instruction,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


# Session-based loop
session = create_session()
session_id = session["id"]

while True:
    # Capture the screen (this reads a file; swap in a real screenshot grab)
    with open("screenshot.png", "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode()

    # Predict next actions
    result = session_predict(session_id, img_b64, "Continue the task")
    if result["status"] == "done":
        break

    # Execute predicted actions
    for action in result.get("actions", []):
        # Apply action, e.g., pyautogui.click(action["x"], action["y"])
        pass

Where this beats brittle automation
Traditional automation relies on selectors, XPath expressions, and hardcoded DOM paths that break when UI elements move or change. The Coasty Computer Use API reasons directly over screenshots, so it works from the screen layout rather than the page structure. Stateful sessions let the agent remember past actions and adapt to changing UI elements, which makes it easier to handle dynamic dashboards, multi-step forms, and workflows that span multiple pages. When you use /v1/sessions/{id}/predict, you offload trajectory storage to the server, so each step sends only the latest screenshot instead of the history so far.
Pick stateless predict for simple one-off actions and stateful sessions for multi-step workflows. Get your API key at https://coasty.ai/developers and start building reliable computer use agents.