
arXiv:2606.12147v1 Announce Type: new Abstract: We consider the problem of engineering autonomous intelligent agents that are capable of responsibly declining to comply with user requests. We argue that machine non-compliance comes in many different forms, and sketch the issues we should pursue on the road to building responsibly non-compliant intelligent machines. We anchor responsible non-compliance in justifications for task refusal, pathways to override the non-compliance, and careful tracking of security risks and liability transfers.
The increasing sophistication and autonomy of AI systems necessitate a framework for managing their compliance and non-compliance with user requests, moving beyond simple obedience.
This research addresses a fundamental challenge in deploying advanced AI agents: ensuring they can operate effectively and ethically in complex real-world scenarios while mitigating risks.
The explicit recognition and study of 'responsibly non-compliant machines' shifts the paradigm from purely subservient AI to agents capable of reasoned refusal, demanding new governance and ethical considerations.
- AI developers focused on safety and ethics
- High-stakes industries deploying autonomous AI
- Researchers in AI safety and alignment
- Developers neglecting safety and ethical frameworks
- Users expecting unconditional AI compliance
Further research and development will be spurred towards building AI systems with built-in ethical reasoning and mechanisms for justified non-compliance.
Legal and regulatory bodies will be prompted to develop frameworks for liability and accountability when AI agents exercise non-compliance, particularly in critical applications.
Public trust and acceptance of highly autonomous AI will likely increase if these systems are perceived as capable of responsible decision-making, even when diverging from direct commands.
Read at arXiv cs.AI