
The Data Flywheel: Synthetic Data for Self-Improving Agents

James Liu | 6 min

Training an AI agent today often feels like trying to learn a language with only a handful of sentences. You get the basics, but you never see enough of the messy, nuanced situations that really matter. Real-world data is expensive to collect, hard to label, and sometimes risky to use. Synthetic data offers a way to generate millions of realistic scenarios, close the feedback loop, and let agents learn from their own mistakes.

The data bottleneck for agents

Agents that can use computers and browsers need to handle long, multi-step tasks. The difficulty isn't just the model size. It's the availability of high-quality trajectories. A recent benchmark study showed that models trained on fewer than 100k realistic trajectories struggled to complete complex workflows, while those with 500k+ trajectories improved completion rates by over 40 percent. Most teams simply do not have that much clean, labeled interaction data. Even when they do, reusing real user sessions can expose sensitive information or cause compliance issues.

Synthetic data as a feedback loop

Synthetic data lets you create new scenarios on demand. Instead of waiting for users to make mistakes, you can design them, run agents through them, and collect the results. This creates a flywheel. You generate data, train or fine-tune the agent, evaluate it on synthetic and real benchmarks, and use the failures to generate more synthetic data. One engineering team reported that after running this loop three times, their agent's error rate dropped from 18 percent to 6 percent on a standard browser automation task. The model started recognizing patterns it had never seen before because it had been exposed to a much wider variety of outcomes.
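
As a rough illustration, the loop described above can be sketched as a toy simulation. Everything here is an assumption made for the example: the `Scenario` and `ToyAgent` classes, the rule that the agent succeeds once it has seen a scenario kind often enough, and the stand-in generator. It shows the shape of the loop only and is not a description of any real training pipeline.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scenario:
    kind: str    # hypothetical label, e.g. "popup" or "wrong_click"
    needed: int  # exposures the toy agent needs before it succeeds


@dataclass
class ToyAgent:
    exposures: Counter = field(default_factory=Counter)

    def train(self, scenarios):
        # "Training" here just counts how often each kind was seen.
        for s in scenarios:
            self.exposures[s.kind] += 1

    def succeeds(self, scenario):
        return self.exposures[scenario.kind] >= scenario.needed


def evaluate(agent, benchmark):
    """Return the benchmark scenarios the agent currently fails."""
    return [s for s in benchmark if not agent.succeeds(s)]


def generate_from_failures(failures, per_failure=1):
    """Stand-in generator: emit new training scenarios for each failed kind."""
    return [Scenario(f.kind, f.needed) for f in failures for _ in range(per_failure)]


def run_flywheel(agent, benchmark, rounds):
    """Evaluate, generate data from failures, train, and repeat."""
    error_rates = []
    for _ in range(rounds):
        failures = evaluate(agent, benchmark)
        error_rates.append(len(failures) / len(benchmark))
        agent.train(generate_from_failures(failures))
    # One final evaluation after the last training round.
    error_rates.append(len(evaluate(agent, benchmark)) / len(benchmark))
    return error_rates


benchmark = [
    Scenario("happy_path", 0),
    Scenario("popup", 1),
    Scenario("wrong_click", 2),
    Scenario("misread_text", 3),
]
rates = run_flywheel(ToyAgent(), benchmark, rounds=3)
print(rates)
```

In this toy setup the error rate falls each round because new data is generated only for the kinds the agent still fails. A real loop would keep the evaluation benchmark separate from the generated training data, which the sketch only mimics by creating fresh `Scenario` objects.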

Techniques that actually work

  • Use computer use agents to generate realistic click sequences, keyboard inputs, and navigation paths.
  • Design failure scenarios deliberately (wrong clicks, misread text, unexpected popups) to force the model to learn robust error handling.
  • Combine synthetic and real data at scale, using real data for validation and synthetic data for exploration.
  • Apply human-in-the-loop review only to the most critical synthetic cases, keeping the process efficient.
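
One possible way to design failures deliberately is to splice a mistake and its recovery into an otherwise clean trajectory. The sketch below assumes a trajectory is a list of step dictionaries. The `FAILURE_DETOURS` table, the action names, and the `injected` tag are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical detours: each is a mistake followed by a recovery step.
FAILURE_DETOURS = {
    "wrong_click": [
        {"action": "click", "target": "wrong_element"},
        {"action": "navigate_back"},
    ],
    "unexpected_popup": [
        {"action": "observe", "target": "popup"},
        {"action": "click", "target": "popup_close_button"},
    ],
    "misread_text": [
        {"action": "type", "text": "misread value"},
        {"action": "clear_and_retype", "text": "correct value"},
    ],
}


def inject_failure(trajectory, kind, rng):
    """Return a new trajectory with one failure-and-recovery detour inserted."""
    position = rng.randrange(len(trajectory) + 1)  # any gap, including both ends
    # Tag the injected steps so they can be filtered or weighted later.
    detour = [{**step, "injected": kind} for step in FAILURE_DETOURS[kind]]
    return trajectory[:position] + detour + trajectory[position:]


clean = [
    {"action": "click", "target": "search_box"},
    {"action": "type", "text": "quarterly report"},
    {"action": "click", "target": "submit"},
]
rng = random.Random(0)  # seeded so runs are repeatable
augmented = [inject_failure(clean, kind, rng) for kind in FAILURE_DETOURS]
for trajectory in augmented:
    print([step["action"] for step in trajectory])
```

Because each detour ends with a recovery step, the original steps still appear in order after the injected ones, so every augmented trajectory remains a successful run that contains an error and its fix.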

The key is not just generating more data, but generating data that reflects the real complexity of computer use: the context, timing, and failure modes you never see in a static dataset.
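
The data-mixing and review steps in the list above might look like the following minimal sketch. It assumes real trajectories are held out for validation while the remainder is mixed with synthetic data for training, and that each synthetic case carries a `criticality` score. The field names, the 20 percent validation fraction, and the review budget are illustrative assumptions, not recommended values.

```python
import random


def split_and_mix(real, synthetic, rng, validation_fraction=0.2):
    """Hold out real trajectories for validation; train on the rest plus synthetic."""
    shuffled = real[:]
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    validation = shuffled[:n_val]
    training = shuffled[n_val:] + synthetic
    rng.shuffle(training)
    return training, validation


def select_for_review(synthetic, budget):
    """Pick the highest-criticality synthetic cases for human review."""
    ranked = sorted(synthetic, key=lambda case: case["criticality"], reverse=True)
    return ranked[:budget]


# Placeholder records standing in for real and synthetic trajectories.
real = [{"id": f"real-{i}", "source": "real"} for i in range(10)]
synthetic = [
    {"id": f"syn-{i}", "source": "synthetic", "criticality": (i * 37 % 100) / 100}
    for i in range(50)
]

training, validation = split_and_mix(real, synthetic, random.Random(0))
review_queue = select_for_review(synthetic, budget=5)
print(len(training), len(validation), [case["id"] for case in review_queue])
```

Keeping the validation set purely real means evaluation numbers reflect real usage, and capping the review queue with a fixed budget keeps human effort bounded however much synthetic data is generated.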

How Coasty fits

Coasty runs computer use agents on real desktops and browsers, so it can capture realistic interaction data and produce synthetic datasets and trajectories tailored to your specific workflows. The service is custom and contact-led, meaning you work directly with the Coasty team to design the data you need. There is no self‑serve product and no fixed packages. You get a bespoke solution aligned with your model’s training or evaluation goals.

If you want to build a self‑improving agent, synthetic data is the fuel that keeps the engine running. To explore how Coasty can help you generate the right synthetic datasets for your use case, book a data call with the Coasty data team at https://cal.com/coasty/coasty-data-call .
