
arXiv:2606.13239v1 Announce Type: cross Abstract: Existing computer-use agents remain fundamentally limited in professional software manipulation: GUI-based agents suffer from fragile visual grounding and long-horizon error accumulation, while API-based approaches struggle with heterogeneous protocols and inaccessible commercial interfaces. In this work, we identify the Component Object Model (COM) as a unified executable abstraction, proposing COM-as-Action: a new paradigm that reframes professional software interaction as deterministic program synthesis rather than sequential visual control. To
The proliferation of AI agents has highlighted the limitations of current interaction paradigms with complex software, creating an urgent need for more robust manipulation methods.
This research proposes a fundamental shift in how AI interacts with professional software, potentially enabling agents to perform complex workflows with greater reliability and autonomy.
The interaction model for AI agents with professional software is shifting from GUI-based and API-based methods to a more deterministic, synthesis-driven approach built on existing COM interfaces.
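To make the contrast with step-by-step visual control concrete, here is a minimal sketch of the general idea of treating an action as a synthesized program run against an application's object model. It is not the paper's implementation: `FakeApp`, `FakeSheet`, `run_synthesized_program`, and the `SYNTHESIZED` script are hypothetical names invented for illustration, and the stand-in classes replace a real COM server so the example runs on any platform.

```python
# Illustrative sketch only: FakeApp and FakeSheet are invented stand-ins for an
# application object model. They are not a real COM server and not the paper's code.


class FakeSheet:
    """Hypothetical worksheet that stores cell values by (row, col)."""

    def __init__(self):
        self.cells = {}

    def set_cell(self, row, col, value):
        self.cells[(row, col)] = value

    def get_cell(self, row, col):
        return self.cells.get((row, col))


class FakeApp:
    """Hypothetical application object exposed to the agent's program."""

    def __init__(self):
        self.sheets = []

    def add_sheet(self):
        sheet = FakeSheet()
        self.sheets.append(sheet)
        return sheet


def run_synthesized_program(source, app):
    """Run agent-written code with `app` as its only binding; return `result`."""
    namespace = {"app": app}
    exec(source, namespace)
    return namespace.get("result")


# What an agent might emit as a single action, instead of a long sequence of
# clicks and keystrokes that each depend on visual grounding.
SYNTHESIZED = """
sheet = app.add_sheet()
for i, value in enumerate([10, 20, 30], start=1):
    sheet.set_cell(i, 1, value)
sheet.set_cell(4, 1, sum(sheet.get_cell(i, 1) for i in range(1, 4)))
result = sheet.get_cell(4, 1)
"""

app = FakeApp()
total = run_synthesized_program(SYNTHESIZED, app)
print(total)
```

On Windows with the pywin32 package installed, one plausible wiring (an assumption, not something the source describes) would obtain the `app` object from `win32com.client.Dispatch("Excel.Application")` and have the synthesized script call that application's own object model. A real system would also need to sandbox or validate generated code before executing it.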
- AI agent developers
- Enterprise software companies
- Businesses adopting AI agents
- Productivity software users
- Companies reliant on fragile GUI automation
- Simplistic RPA providers
- Developers focused on visual scripting
- Manual software testers
AI agents will become significantly more capable of complex, multi-application tasks in existing enterprise environments without extensive retraining.
This improved reliability could accelerate the adoption of autonomous agents across a wide range of white-collar professional tasks, reducing human intervention.
The abstraction offered by COM-as-Action could foster a new ecosystem of 'component-aware' AI tools and services, drastically altering the landscape of enterprise software development and integration.
Read at arXiv cs.CL