Your Computer Use Agent Can Be Hijacked in Seconds. Here's What You're Not Doing About It.

Emily Watson | 8 min read

In October 2024, a security researcher named Johann Rehberger did something that should have made every AI team on the planet stop and sweat. He took Anthropic's Claude computer use feature, fed it a malicious webpage, and watched it silently connect to a command-and-control server he controlled. No user interaction. No warning. The AI just did it. He called the attack 'ZombAIs.' The name is funny. The implications are not. Fast forward to May 2025, and a single vulnerability, CVE-2025-47241, in the popular Browser Use framework exposed over 1,500 AI projects to a silent exploit simultaneously. And IBM's 2025 Cost of a Data Breach report found that 97% of organizations that suffered AI-related breaches were running without proper AI access controls. Ninety-seven percent. So before you deploy your next computer use agent to handle real work on real systems, let's talk about what you're probably getting wrong.

The Threat Is Real and It's Already Here

People treat AI agent security like it's a future problem. It isn't. The ZombAIs attack worked because a computer use agent, by design, takes actions based on what it sees on screen. A malicious website can inject instructions directly into that visual context. The agent reads them as legitimate commands. It then executes them with whatever permissions it has on your system. This is called an indirect prompt injection attack, and it's not theoretical. Researchers at academic institutions published a full systematization of these vulnerabilities in 2025, cataloging how computer use agents are uniquely exposed compared to standard LLM chatbots. The reason is simple: a chatbot that gives a bad answer is annoying. A computer-using AI that executes a bad instruction can delete files, exfiltrate data, install backdoors, or pivot to other systems on your network. The attack surface is your entire desktop. IBM puts the average cost of an AI-related data breach at $4.9 million in 2025. That's not a rounding error. That's a company-altering event.

The 7 Security Practices Most Teams Skip

  • Sandbox everything. Your computer use agent should never run on a machine with access to production databases, internal admin panels, or sensitive credentials. Use isolated cloud VMs or containerized desktops. If it gets compromised, the blast radius stays contained.
  • Apply least-privilege access, ruthlessly. Gartner predicts 40% of enterprise apps will integrate AI agents by end of 2026, yet most agents are still deployed with admin-level permissions 'for convenience.' Give the agent exactly the access it needs for the task. Nothing more.
  • Treat every external website as hostile. If your computer use agent browses the web, any page it visits is a potential injection vector. Implement content filtering, restrict browsing to an allowlist where possible, and log every site visited.
  • Audit action logs in real time, not after the fact. In IBM's 2025 report, 63% of breached organizations lacked AI governance policies entirely. You need a full record of every click, keystroke, and file access your agent makes. That record isn't for compliance theater. It's for actual incident response.
  • Never hardcode credentials in agent context. This sounds obvious and yet it keeps happening. Credentials passed in system prompts or task descriptions can be extracted via prompt injection. Use secrets managers and short-lived tokens.
  • Validate outputs before they trigger downstream actions. If your agent is writing to a database, sending emails, or calling APIs, add a confirmation layer for high-stakes actions. Human-in-the-loop checkpoints are not a weakness; they're a kill switch.
  • Update your frameworks aggressively. CVE-2025-47241 hit 1,500+ projects because teams were running outdated versions of Browser Use. AI agent frameworks are moving fast and so are the people finding holes in them.
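Two of the practices above, the action log and the confirmation layer, fit naturally in one small wrapper around the agent's tool calls. What follows is a minimal sketch, not a reference implementation: the `ActionGate` class, the `HIGH_STAKES` set, and the shape of the `action` dict are all hypothetical names chosen for illustration, and the in-memory list stands in for whatever append-only log sink you actually use.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical set of action types that need a human decision first.
HIGH_STAKES = {"send_email", "db_write", "api_call", "file_delete"}


@dataclass
class ActionGate:
    # Callback that asks a human (or a policy engine) to approve an action.
    approve: Callable[[dict], bool]
    # Stand-in for an append-only log sink; assumed, not a real logging backend.
    log: list = field(default_factory=list)

    def _record(self, action: dict, decision: str) -> None:
        entry = {"ts": time.time(), "action": action, "decision": decision}
        self.log.append(json.dumps(entry, sort_keys=True))

    def run(self, action: dict, execute: Callable[[dict], object]):
        # High-stakes actions are executed only if the approver says yes.
        if action["type"] in HIGH_STAKES and not self.approve(action):
            self._record(action, "denied")
            return None
        # The decision is logged before execution, so a crash mid-action
        # still leaves a trace of what was attempted.
        self._record(action, "allowed")
        return execute(action)
```

With an approver that refuses everything, a plain click still runs while an email send is blocked, and both decisions end up in the log. In a real deployment the approver would likely page a person or consult a policy service, and the log would be shipped off the agent's machine so a compromised agent can't rewrite it.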

97% of organizations that suffered AI model breaches in 2025 were running without proper AI access controls, according to IBM. The tools to prevent this exist. Teams are just choosing not to use them.

The Sandboxing Problem Nobody Talks About

Most security guides stop at 'use a sandbox' and move on. That's not enough. The question is what kind of sandbox, and whether your computer use agent can actually do its job inside it. A lot of teams discover too late that their sandboxed environment is missing the apps, browser profiles, or network access the agent needs. So they start poking holes in the sandbox to make things work. Then the sandbox isn't really a sandbox anymore. It's just a VM with extra steps. The right architecture is an isolated cloud VM that's purpose-built for the specific workflow the agent runs, with network egress locked to only the domains that workflow requires, credentials injected at runtime via a secrets manager, and session recordings for every run. This isn't paranoia. Anthropic's own threat intelligence team documented a case in August 2025 where cybercriminals used AI coding tools to scale a data extortion operation, a technique Anthropic called 'vibe hacking.' If attackers are using AI to attack, you need to think harder about how you're defending the AI you're using to automate.
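To make "egress locked to only the domains that workflow requires" concrete, here is a rough sketch of a host allowlist check in Python. The `ALLOWED_HOSTS` values and the `is_allowed` helper are made up for illustration. The one design choice worth copying is that it compares the parsed hostname exactly, rather than searching the URL string for an allowed domain, because string matching is easy to fool with userinfo tricks or lookalike subdomains.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for a single reporting workflow.
ALLOWED_HOSTS = {"reports.example.com", "api.example.com"}


def is_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose parsed host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # excludes any user:password@ prefix and the port
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https" or host is None:
        return False
    # Exact match only: no substring or suffix matching.
    return host.rstrip(".") in allowed_hosts
```

A check like this belongs in the agent's browsing layer as a second line of defense. It doesn't replace firewall or proxy rules on the VM itself, since a compromised agent process can simply skip application-level checks.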

The Credential Problem Is Worse Than You Think

Let me paint you a picture. A developer sets up a computer use agent to automate a reporting workflow. To make it work, they paste the company's Google Workspace admin credentials into the agent's system prompt. The agent browses to a third-party data aggregator site to pull some numbers. That site has a hidden prompt injection in its HTML. The agent reads it, and now the attacker's instructions have access to the same context window as those admin credentials. This is not a hypothetical. Researchers have demonstrated this attack chain in controlled environments multiple times since 2024. The fix is not complicated: never put credentials in the agent's context. Use OAuth flows, short-lived API tokens, and secrets managers that inject credentials at the moment of use without ever exposing the raw value to the model. It requires a bit more setup. It also means you don't spend $4.9 million cleaning up a breach.
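One way to keep raw credentials out of the context window is to let the model work with placeholders and have the tool layer swap in the real value only when a request is actually sent. The sketch below assumes a lot: `SecretBroker`, the `{{secret:name}}` placeholder syntax, and the `fetch` callback are all hypothetical, and `fetch` stands in for a real secrets-manager client (it falls back to environment variables here purely so the example runs). It doesn't handle token expiry, which you'd still want from short-lived tokens.

```python
import os
import re

# Hypothetical placeholder syntax the model is allowed to see and emit.
PLACEHOLDER = re.compile(r"\{\{secret:([a-z0-9_]+)\}\}")


class SecretBroker:
    """Resolves placeholders at the moment of use; the model only sees placeholders."""

    def __init__(self, fetch=None):
        # `fetch` stands in for a secrets-manager lookup. Defaulting to
        # environment variables is an assumption made for this sketch.
        self._fetch = fetch or (lambda name: os.environ[name.upper()])
        self._issued = set()

    def resolve(self, text: str) -> str:
        """Called by the tool layer right before a request leaves the sandbox."""
        def swap(match):
            value = self._fetch(match.group(1))
            self._issued.add(value)
            return value
        return PLACEHOLDER.sub(swap, text)

    def redact(self, text: str) -> str:
        """Called on any tool output before it goes back into the model's context."""
        for value in self._issued:
            if value:  # never "redact" the empty string
                text = text.replace(value, "[REDACTED]")
        return text
```

The model would emit something like `Bearer {{secret:reporting_api}}`, the tool layer would call `resolve` just before sending, and every response would pass through `redact` on the way back. If an injected instruction then asks the agent to reveal its credentials, the only thing in context to leak is the placeholder.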

Why Coasty Was Built With This in Mind

I'm going to be straight with you. Most computer use agent tools were built to show off benchmark scores and ship fast. Security architecture came later, if at all. Coasty was built differently. It shows in the benchmark scores too: 82% on OSWorld, the highest of any computer use agent right now. But the score isn't the point. The point is how the architecture works. Coasty runs agents in isolated cloud VMs by default. You're not running automation on your local machine with your local credentials exposed. The agent operates in a contained environment, and you get full session recordings of every action it takes. That's your audit trail. That's your incident response foundation. The desktop app and BYOK support mean your data and API keys don't have to live on someone else's server in a way you can't control. Agent swarms for parallel execution are scoped and isolated too, so one compromised task thread can't bleed into another. It's not magic. It's just what a computer use agent looks like when the people building it actually thought about what happens when something goes wrong. If you're evaluating computer use tools right now, the security architecture should be question one, not an afterthought. Coasty's free tier is at coasty.ai if you want to see how a properly sandboxed computer use agent actually behaves.

Here's my take. The AI agent security conversation is about two years behind where it needs to be. Teams are racing to automate everything, deploying computer use agents with admin permissions on production machines, no audit logs, credentials baked into prompts, and zero incident response plan. And then they're shocked when a researcher publishes a new attack that makes it all fall apart. The ZombAIs attack was a warning. CVE-2025-47241 was a warning. The IBM report saying 97% of AI breach victims lacked proper access controls is not a warning anymore; it's a diagnosis. You don't need to be paranoid. You need to be deliberate. Sandbox your agents. Enforce least privilege. Log everything. Treat every external input as potentially hostile. And pick tools that were built with this in mind from the start. The best computer use agent isn't the one that moves the fastest. It's the one that moves fast without burning your company down. Start at coasty.ai.
