IDP plus RPA vs a single computer use agent for document workflows
Most document-heavy teams combine intelligent document processing (IDP) with robotic process automation (RPA) to capture, classify, and route invoices, contracts, and claims. The setup works until a portal ships its next release or a template changes. Then the bots stop, the backlog grows, and the automation budget gets questioned. The problem is not the software. It is the way we have been automating.
Why RPA breaks here
Traditional RPA binds to selectors, XPaths, and object IDs. When a vendor portal redesigns a button or an ERP renames a field, the bot crashes and a developer must rebuild the flow. A recent industry survey showed that 60 percent of RPA projects face at least one major redesign within six months of deployment, and each redesign takes 40 to 60 hours of developer time. For document workflows, the cost is even higher: every selector rebuild means a new extraction model, a new routing rule, and testing across ten or more systems. The team ends up with a patchwork of bots that each need constant babysitting, not a single, durable process.
What changes with computer use agents
- Survives UI changes
- No brittle selectors
- Recovers from exceptions
- Follows the SOP as written
- Works on legacy and Citrix
A computer use agent sees the screen and acts like a human, so it can keep working when the app changes.
The difference in practice
Think of the classic document intake process: extract fields with OCR, classify the document, route it to the right system, and file the original. A traditional RPA solution would use OCR, but then it would look for exact XPaths for the classification dropdown, the routing button, and the file upload area. When any of those elements moves, the workflow halts. A computer use agent does the same steps, but instead of looking for an XPath, it reads the text on the screen, identifies the correct option, clicks it, and uploads the file. When the UI changes, the agent reads the new layout, finds the updated elements, and continues. It can also handle unexpected states, such as a missing document, a CAPTCHA, or a system error, by reading the screen and trying the next logical action instead of stopping.
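To make the contrast concrete, here is a minimal Python sketch. It is a toy model, not a real RPA tool or agent API: the `Element` class, the two lookup functions, and the sample screens are all hypothetical, and a real computer use agent would read pixels with a vision model rather than match strings. The point is only that a stored structural address stops matching after a redesign, while a lookup on visible text can still find the control.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    """One clickable control on a simulated screen (hypothetical model)."""
    xpath: str  # structural address, as a selector-based bot would store it
    label: str  # visible text, as a person or a screen-reading agent sees it


def find_by_xpath(screen: list[Element], xpath: str) -> Optional[Element]:
    """Selector-style lookup: succeeds only on an exact structural match."""
    return next((e for e in screen if e.xpath == xpath), None)


def find_by_label(screen: list[Element], wanted: str) -> Optional[Element]:
    """Text-style lookup: match on visible text, ignoring case and position."""
    wanted = wanted.strip().lower()
    return next((e for e in screen if wanted in e.label.lower()), None)


# The same portal before and after a redesign (made-up data):
# the labels mostly survive, the page structure does not.
screen_v1 = [
    Element("/html/body/form/select[1]", "Document type"),
    Element("/html/body/form/button[2]", "Route"),
    Element("/html/body/form/input[3]", "Upload file"),
]
screen_v2 = [
    Element("/html/body/main/div[2]/select", "Document type"),
    Element("/html/body/main/div[3]/input", "Upload file"),
    Element("/html/body/main/div[4]/button", "Route document"),
]

stored_xpath = "/html/body/form/button[2]"  # recorded against screen_v1

for name, screen in (("v1", screen_v1), ("v2", screen_v2)):
    by_xpath = find_by_xpath(screen, stored_xpath)
    by_label = find_by_label(screen, "route")
    print(name, "xpath:", by_xpath is not None, "label:", by_label is not None)
```

On the first screen both lookups find the routing button. On the redesigned screen only the text lookup does, which is the failure mode described above in miniature.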
The one line a VP of automation should remember
RPA works when the UI is stable, but computer use agents work when the UI changes and processes drift.
How to move without the risk
You do not have to replace all your bots at once. Start with one high-pain, change-heavy process where RPA is already brittle. Document the current steps in plain English; a written SOP is already close to a prompt. Deploy a computer use agent on a test desktop, let it run the process end-to-end, and measure how often it needs human intervention. If the agent handles the process reliably with few restarts, you can expand to adjacent tasks. This phased approach lets you compare results side by side and decide where RPA still makes sense and where agents are the better choice. Over time, you can consolidate and simplify, reducing the number of bots while keeping coverage.
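One way to measure how often the agent needs human intervention is to log each pilot run and summarize the log. The sketch below assumes a hypothetical record format (`RunRecord`) and uses made-up sample data. The three metrics are illustrative choices, not a standard; pick thresholds that fit your own process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Outcome of one end-to-end pilot run (hypothetical log format)."""
    run_id: str
    completed: bool           # did the document reach its destination system?
    human_interventions: int  # times a person had to step in during the run


def summarize(runs: list[RunRecord]) -> dict[str, float]:
    """Return completion rate, share of runs needing a human, and mean interventions."""
    if not runs:
        raise ValueError("no runs recorded")
    total = len(runs)
    return {
        "completion_rate": sum(r.completed for r in runs) / total,
        "intervention_rate": sum(r.human_interventions > 0 for r in runs) / total,
        "interventions_per_run": sum(r.human_interventions for r in runs) / total,
    }


# Made-up sample data, for illustration only.
pilot = [
    RunRecord("inv-001", True, 0),
    RunRecord("inv-002", True, 1),
    RunRecord("inv-003", False, 2),
    RunRecord("inv-004", True, 0),
]
print(summarize(pilot))
```

Running the same summary over the existing RPA bot's logs for the same process gives the side-by-side comparison the phased approach relies on.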
If you are stuck in a rebuild-and-reselect cycle on document workflows, it is time to look at agents that can see the screen and follow SOPs. Talk to the Coasty team to see how a computer use agent can handle your next process. Book a demo at https://cal.com/coasty/15min.