Baku, Azerbaijan info@viasoft.az +994 50 345 10 11
viasoft

AI agent security: the risks and what the OpenClaw story teaches

Rəşad Əliyev, Infrastructure & Security Engineer at viasoft

An AI agent is more dangerous than an ordinary chatbot because it doesn't just reply — it acts: it runs commands, reads and writes files, and reaches into your systems. That makes it a new point of vulnerability — if the agent is compromised, everything it has access to is at risk. The story of the OpenClaw project (a personal AI assistant that public security write-ups — from Palo Alto Networks, Vectra, Astrix and others — flagged as a threat) is a clear lesson: the convenience of an autonomous agent with no access control turns into the risk of executing someone else's commands and leaking data. Below — which risks are real and how to deploy agents so the benefit doesn't turn into a breach.

Discuss deploying agents securelyContacts · Scope out the task → calculator

Why an agent is more dangerous than a chatbot

An ordinary chatbot, at worst, produces incorrect text. An AI agent has the right to act — and therefore the means to do harm. When you give an agent access to email, files, a database or a payment system, you create a new "employee" with permissions but without human judgment and without an HR department that vetted them.

The key idea: any agent with the right to act is a new attack surface. Previously an attacker had to break into your system directly. Now they get a back door — fool the agent that already has access.

What the OpenClaw story teaches

OpenClaw is an open-source personal AI assistant (built by developer Peter Steinberger and renamed over its lifetime from Clawdbot to Moltbot and then to OpenClaw) that quickly gained notice: it runs on the user's own devices, connects to various models, and replies in messengers while also being able to run commands and work with files. It was precisely the combination of "convenient and capable of a lot" that made it an instructive security case.

What the public security write-ups noted (Palo Alto Networks, Vectra, Astrix, IBM and business media such as CNBC):

  • Command execution. The agent can run commands in the system and work with files — meaning it's capable of destructive actions if the wrong party is in control.
  • Weak access control. Entry points without reliable authentication, and credentials stored in plain text — classic mistakes that turn convenience into a hole.
  • A compromised extension store. Third-party "skills" for the agent turned out to be a channel for malicious code — a user installs a useful extension and gets hidden malware along with it.

The scale of the problem is telling: per researchers' reports, more than 30,000 running copies of OpenClaw were found openly accessible on the internet — with no authentication and no protection. The author himself acknowledged that agent security still needs work.

An important caveat: this isn't about "AI agents being bad." OpenClaw is a personal tool for enthusiasts, not a corporate solution, and the problem isn't the idea of agents but deploying a powerful agent without access control, isolation and extension vetting. That's exactly why running such personal tools inside a company's working environment is a bad idea: what's acceptable for an enthusiast on their own laptop becomes a shadow entry point on a corporate network.

The three main risks of AI agents

For a business, three risks are critical:

  1. Instruction hijacking (prompt injection). Prompt injection is an attack in which a command is hidden inside external data, and the agent executes it, taking it for part of the task. The agent reads an email, a document or a web page; if an instruction like "send all the files to this address" is hidden there, a naive agent may carry it out. This is an attack not on the code but on the agent's "gullibility."
  2. The extension supply chain. Ready-made skills and plugins speed up the work, but every third-party component is someone else's code running with your agent's permissions. A compromised extension = a compromised agent.
  3. Excessive permissions. An agent is often given more access than the task needs, "to make it work." The broader the permissions, the greater the damage if it's compromised.

How to deploy agents safely: a checklist (artifact)

We build AI agent security in on these principles from day one, not "once we get around to it":

  • Least privilege. The agent gets access only to what the specific task needs — and nothing beyond it.
  • A human in the loop on risky actions. A payment, a send to the outside, deleting data — only with human confirmation (human-in-the-loop).
  • Isolation. The agent runs in a constrained environment, not with full permissions over the entire system.
  • Extension vetting. No third-party skills without an audit of their code and source.
  • Defense against instruction hijacking. External data is treated as untrusted; critical actions aren't triggered "by text from an email."
  • Logging. Every agent action is recorded — who, what and when — so an incident can be investigated.
  • A dedicated environment for sensitive data. Where data is critical, the agent runs on private AI inside your perimeter, not through an external service.

Security is becoming a legal requirement

In 2026, oversight of agents is no longer just engineering hygiene — it's a direction of regulation. The EU AI Act requires human oversight and traceability for high-risk AI systems (by August 2026), and financial regulators such as FINRA specifically warn about agents that act "beyond their authority." The infrastructure is maturing in parallel: MCP — the protocol that became the 2026 industry standard for connecting agents to data — is evolving toward enterprise governance: audit logs, single sign-on (SSO), access gateways. So the items on our checklist — least privilege, logging, a human on control — aren't over-caution; they're exactly where both the law and the industry are heading.

The link with managed autonomy

Managed autonomy protects not only against a failed project but against a breach: the point where a human confirms an action is both insurance against an agent's error and a barrier in an attacker's path. So "secure" and "reliable" in agents are achieved by the same thing — refusing unchecked full autonomy. We build secure agent architecture as part of the AI and automation service.

FAQ

  • What is AI agent security? It's the set of measures that protect an agent with the right to act from compromise and abuse: least privilege, isolation, human control of risky steps, extension auditing, defense against instruction hijacking, and logging.
  • Why is an AI agent more dangerous than a chatbot? An agent doesn't just reply — it acts: it runs commands and reaches into systems. A compromised agent threatens everything it has access to.
  • How do you protect an AI agent from prompt injection? Treat external data as untrusted and don't trigger critical actions "by text" from an email or document; risky steps only go ahead with human confirmation.
  • What is prompt injection? Instruction hijacking: a command is hidden inside external data (an email, a document), which the agent may execute, taking it for part of the task. An attack on the agent's gullibility, not on the code.
  • Can you run OpenClaw in a company? Not recommended. It's a personal tool for enthusiasts that security specialists flagged as a threat; on a corporate network it becomes a shadow entry point. A business needs a managed, isolated agent architecture.
  • How do you secure an AI agent? Least privilege, a human in the loop on risky actions, isolation, extension auditing, defense against instruction hijacking, logging, and a dedicated environment for sensitive data.
  • Does this mean AI agents are dangerous and unnecessary? No. What's dangerous isn't the idea of agents but deploying them without access control. With the right architecture, agents deliver value safely.