Engineering

Stateful Sessions vs Stateless Predict in the Computer Use API

Priya Patel | 6 min

You want an AI agent that can open a browser, log in, fill a form, and submit. A stateless predict request returns a single step, and every call starts from scratch with only the screenshot and instruction you send. That is fine for one-off actions, but it quickly becomes costly and brittle when the agent has to remember context across steps. The stateful sessions API gives you a persistent session ID that keeps the trajectory (the history of screenshots, actions, and state) between calls, so you can build multi-step workflows with a single agent.

How it works

A stateless predict request is a simple loop. You send a base64 screenshot, an instruction, and the cua_version field. Coasty returns actions and a status. You execute the actions, capture a new screenshot, and keep looping until the status is done.

A stateful session works differently. You first create a session with POST /v1/sessions. The response includes an id. Then for each step you call POST /v1/sessions/{id}/predict. The session remembers the trajectory, so you do not need to resend the full history on each call.

Both endpoints accept the same fields: screenshot (base64), instruction, and cua_version. The stateful endpoint also accepts any additional fields you send in the request body.

python
import base64
import os
import requests

def predict_stateless(api_key: str, screenshot_b64: str, instruction: str, cua_version: str = "v3") -> dict:
    url = "https://coasty.ai/v1/predict"
    headers = {"X-API-Key": api_key}
    payload = {
        "screenshot": screenshot_b64,
        "instruction": instruction,
        "cua_version": cua_version
    }
    # A timeout keeps the call from hanging forever if the server does not respond.
    resp = requests.post(url, headers=headers, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()

def predict_stateful(api_key: str, session_id: str, screenshot_b64: str, instruction: str, cua_version: str = "v3") -> dict:
    url = f"https://coasty.ai/v1/sessions/{session_id}/predict"
    headers = {"X-API-Key": api_key}
    payload = {
        "screenshot": screenshot_b64,
        "instruction": instruction,
        "cua_version": cua_version
    }
    # A timeout keeps the call from hanging forever if the server does not respond.
    resp = requests.post(url, headers=headers, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()

def create_session(api_key: str, cua_version: str = "v3") -> dict:
    url = "https://coasty.ai/v1/sessions"
    headers = {"X-API-Key": api_key}
    payload = {"cua_version": cua_version}
    # A timeout keeps the call from hanging forever if the server does not respond.
    resp = requests.post(url, headers=headers, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    api_key = os.environ.get("COASTY_API_KEY")
    if not api_key:
        raise ValueError("Set COASTY_API_KEY")
    # Example: create a new session
    session_res = create_session(api_key)
    session_id = session_res["id"]
    print("Session ID:", session_id)
    # Example: call predict on the stateful session.
    # Placeholder bytes only; a real call needs the base64 of an actual screenshot image.
    screenshot = base64.b64encode(b"fake_screenshot_bytes").decode("utf-8")
    instruction = "Click the login button"
    result = predict_stateful(api_key, session_id, screenshot, instruction)
    print("Actions:", result.get("actions"))
    print("Status:", result.get("status"))
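The listing above makes a single predict call. The "loop until status is done" part described earlier could be sketched roughly as follows. This is a minimal sketch under several assumptions: the status value is the literal string "done", actions come back as a list, and capture_screenshot and execute_actions are hypothetical helpers you would supply yourself (they are not part of the Coasty API). The max_steps cap is also an addition, there to stop a runaway loop.

```python
from typing import Callable, Dict, List


def run_until_done(
    predict: Callable[[str, str], Dict],
    capture_screenshot: Callable[[], str],
    execute_actions: Callable[[List], None],
    instruction: str,
    max_steps: int = 20,
) -> Dict:
    """Call predict repeatedly until the response status is "done".

    predict takes (screenshot_b64, instruction) and returns the parsed
    JSON response. capture_screenshot and execute_actions are hypothetical
    callbacks provided by the caller.
    """
    for _ in range(max_steps):
        # Each step starts from a fresh view of the screen.
        screenshot_b64 = capture_screenshot()
        result = predict(screenshot_b64, instruction)
        # Assumption: actions returned alongside a final status should
        # still be executed before stopping.
        execute_actions(result.get("actions", []))
        if result.get("status") == "done":  # assumed literal status value
            return result
    raise RuntimeError(f"No 'done' status after {max_steps} steps")
```

To drive a stateful session with it, you might pass predict=lambda shot, text: predict_stateful(api_key, session_id, shot, text). Passing predict as a callable keeps the same loop usable for the stateless endpoint.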

When to use each endpoint

  • Use stateless predict when you need quick, single-step actions and do not need to remember history across steps.
  • Use stateful sessions when you want an agent to complete multi-step workflows and preserve trajectory memory automatically.
  • The endpoints are priced differently per predict call: $0.04 per stateful predict and $0.05 per stateless predict. Creating a session with POST /v1/sessions costs $0.10.
  • Stateless predict is great for testing or one-off tasks, while stateful sessions are better for production workflows that need consistency and can amortize the session creation cost.
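To make the amortization concrete, here is a small cost comparison built only from the prices listed above. It is a sketch that assumes one predict call per step, one session per workflow, and no other charges; the function names are illustrative.

```python
from decimal import Decimal

# Prices quoted in this article.
STATELESS_PER_CALL = Decimal("0.05")
STATEFUL_PER_CALL = Decimal("0.04")
SESSION_CREATE = Decimal("0.10")


def stateless_cost(steps: int) -> Decimal:
    """Total cost of a workflow run entirely on stateless predict."""
    return STATELESS_PER_CALL * steps


def stateful_cost(steps: int) -> Decimal:
    """Total cost of one session plus its predict calls."""
    return SESSION_CREATE + STATEFUL_PER_CALL * steps


def break_even_steps() -> int:
    """Smallest step count where stateful costs no more than stateless."""
    steps = 1
    while stateful_cost(steps) > stateless_cost(steps):
        steps += 1
    return steps


print(break_even_steps())   # 10
print(stateless_cost(25))   # 1.25
print(stateful_cost(25))    # 1.10
```

Under these assumptions the two approaches cost the same at 10 steps ($0.50 each), and a session is cheaper from the 11th step onward. Shorter workflows cost slightly more with a session, so the case for stateful sessions there rests on trajectory memory rather than price.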

Stateful sessions give you trajectory memory at $0.04 per predict, so once the $0.10 session creation cost is amortized, you pay less while building reliable multi-step agents.

Where this beats brittle automation

Traditional automation relies on brittle selectors (IDs, classes, and XPath) that break when the UI changes. A computer use agent sees the screen and acts like a human, so it adapts to layout shifts and missing selectors. Stateful sessions make it practical to keep the agent alive across many steps, which means you can build workflows that handle errors, retries, and human approvals without stitching together many disconnected scripts. That is what a computer use agent does best: observe, decide, and act across a real desktop or browser.
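As one illustration of the retry handling mentioned above, a predict call could be wrapped in a small retry helper with exponential backoff. This is a sketch, not a Coasty feature: predict_with_retry and its parameters are hypothetical, and it assumes that repeating a failed predict call against the same session is safe, which the API description here does not confirm.

```python
import time
from typing import Callable, Dict, Tuple, Type


def predict_with_retry(
    call: Callable[[], Dict],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Run call(), retrying on failure with exponential backoff.

    call is a zero-argument function that performs one predict request.
    Waits base_delay, then 2 * base_delay, and so on between attempts.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except retry_on as exc:
            last_error = exc
            # No point sleeping after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"predict failed after {attempts} attempts") from last_error
```

In practice you would likely narrow retry_on to network errors such as requests.RequestException, and wrap a call like lambda: predict_stateful(api_key, session_id, screenshot, instruction).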

Start building reliable computer use agents with Coasty. Create a session to get trajectory memory, or use stateless predict for quick one-off actions. Get your API key at https://coasty.ai/developers.
