AI agent security checklist: 10 controls and 3 real 2026 breaches

An AI agent security checklist has to answer three questions about every agent you run: who published its code, what it can execute, and what identity it inherits. Three 2026 signals show why those three questions are the whole job: ClawHavoc left 1,184 malicious skills on OpenClaw's ClawHub marketplace, CVE-2026-41090 rates CVSS 9.3 for command injection in Microsoft 365 Copilot for iOS, and Darktrace reports 92% of security professionals are concerned about agent risk. The durable concept is a new execution boundary: untrusted content, real permissions, and automated action in one loop that legacy controls rarely instrument.

The point is not another annual threat report slide deck. It is recognizing that agent skills, prompts, and tool configurations behave like code execution recipes, and that supply chain plus injection failures are already showing up in production-adjacent ecosystems.

Primary sources: Antiy CERT ClawHavoc analysis; NIST NVD entry for CVE-2026-41090; Darktrace State of AI Cybersecurity 2026 blog; Beam.ai 2026 breach synthesis for cross-incident framing.

AI agent security checklist

Every item below is a control whose absence one of the three 2026 signals exposed, so the checklist is a post-mortem in reverse rather than a generic hardening list. Work it before an agent touches production data or production credentials. (Sources: Antiy CERT, NVD, Darktrace)

#	Checklist item	Failure it blocks	Evidence
1	Verify publisher identity for every skill, MCP server, and copilot extension before install	Marketplace poisoning by unknown uploaders	Antiy CERT
2	Review SKILL.md and manifests as executable code, not documentation	ClickFix install steps that ask the user to run terminal commands	Antiy CERT
3	Route installs through an internal signed registry and block public ClawHub in production	Trust-by-default registry pulls	Antiy CERT
4	Sandbox execution before granting filesystem or network tools	Reverse shells shipped as crypto helper scripts	Antiy CERT
5	Keep API keys out of agent-readable dotfiles such as ~/.clawdbot/.env	Credential theft by JavaScript skills	Antiy CERT
6	Restrict outbound traffic to an egress allowlist of known API endpoints	Exfiltration after a skill already executes	Antiy CERT
7	Give every agent a named owner, scoped credentials, and approval gates on high-impact actions	Identity abuse and over-permissioned agents	Darktrace
8	Log prompts, sessions, and tool calls to one place	Attack surfaces that never produce a CVE	Darktrace
9	Track vendor advisories for hosted copilots and restrict high-risk actions until patched	Hosted command injection such as CVE-2026-41090	NVD
10	Run shadow AI discovery against unsanctioned tools	Agents that bypass procurement and logging entirely	Darktrace

Items 1 through 6 are supply chain controls, items 7 and 8 are identity and telemetry controls, and items 9 and 10 close the hosted and unsanctioned gaps. The ordering matters: Darktrace found only 37% of surveyed organizations have a formal AI policy, so most teams arrive at item 7 without formal governance behind items 1 through 6. (Source: Darktrace)

What shipped

ClawHavoc scale: Antiy CERT counted at least 1,184 malicious skill packages historically published on ClawHub, OpenClaw's official skills marketplace, attributed to 12 author IDs. Uploader hightower6eu alone accounted for 677 packages. Koi Security first disclosed the coordinated campaign on February 1, 2026, naming it ClawHavoc. (Source: Antiy CERT)

ClickFix delivery: Malicious skills embed fake install steps in SKILL.md documentation, tricking users into running terminal commands or downloading binaries. Windows paths pull encrypted archives from GitHub; macOS paths decode base64 commands that fetch remote payloads. (Source: Antiy CERT)
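
The delivery pattern above suggests reviewing SKILL.md the way you would review an install script. The following is a minimal sketch of that review step, assuming a handful of illustrative regular expressions; the rule names and patterns are hypothetical and are not taken from Antiy's report, and a real review would still need a human to read the file.

```python
import re

# Hypothetical heuristics for install steps that ask the user to run commands.
# The patterns are illustrative only, not an exhaustive or vendor-supplied rule set.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    "base64-decode": re.compile(r"\bbase64\s+(-d|-D|--decode)\b"),
    "powershell-encoded": re.compile(r"(?i)\bpowershell(\.exe)?\b[^\n]*-enc(odedcommand)?\b"),
    "os-system-call": re.compile(r"\bos\.system\s*\("),
}


def flag_install_steps(skill_md: str) -> list[tuple[int, str, str]]:
    """Return (line_number, rule_name, line_text) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(skill_md.splitlines(), start=1):
        for rule, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule, line.strip()))
    return findings
```

A hit is a reason to stop and read the skill, not a verdict; a clean result proves little, because a natural-language instruction can be phrased in ways no pattern anticipates.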

Atomic Stealer on macOS: Antiy analyzed macOS payloads as Atomic Stealer (AMOS), stealing browser passwords, cookies, keychain data, Telegram sessions, SSH keys, and crypto wallets. Some JavaScript skills exfiltrate ~/.clawdbot/.env, which stores paid API keys for Claude and OpenAI. (Source: Antiy CERT)
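
Because the theft target is a plaintext dotfile, one cheap check is to look for key-like entries in any .env file the agent's user can read. This is a rough sketch under stated assumptions: the variable-name pattern and both function names are hypothetical, and the walker only looks at files named .env.

```python
import re
from pathlib import Path

# Hypothetical naming heuristic: variables that look like credentials and have a value.
KEY_NAME = re.compile(
    r"^\s*(?:export\s+)?([A-Z0-9_]*(?:API_KEY|TOKEN|SECRET)[A-Z0-9_]*)\s*=\s*\S+"
)


def secret_names(env_text: str) -> list[tuple[int, str]]:
    """Return (line_number, variable_name) for key-like assignments in env text."""
    found = []
    for lineno, line in enumerate(env_text.splitlines(), start=1):
        match = KEY_NAME.match(line)
        if match:
            found.append((lineno, match.group(1)))
    return found


def scan_dotfiles(root: str) -> list[tuple[str, int, str]]:
    """Walk root for files named .env and report key-like entries in each."""
    findings = []
    for path in sorted(Path(root).rglob(".env")):
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        for lineno, name in secret_names(text):
            findings.append((str(path), lineno, name))
    return findings
```

Any finding under a directory such as ~/.clawdbot is a candidate for moving into a secret store that the agent process cannot read directly.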

CVE-2026-41090: NVD lists improper command neutralization in Microsoft 365 Copilot for iOS with CVSS 3.1 base score 9.3 (CRITICAL). Vector: AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:N. CWE-77 command injection. Tagged exclusively-hosted-service. Microsoft published the flaw on May 22, 2026. (Source: NVD)
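
The 9.3 score can be reproduced from the vector string. The sketch below applies the CVSS v3.1 base score equations and metric weights from the FIRST specification to the vector quoted above; the function names are ours, and only base metrics are handled.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector such as 'AV:N/AC:L/...'."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if impact <= 0:
        return 0.0
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:N"))  # -> 9.3
```

The changed scope (S:C) is what lifts the score: it switches the impact formula and applies the 1.08 multiplier, which is why a flaw that requires user interaction and has no availability impact still lands in the critical band.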

Industry anxiety: Darktrace's State of AI Cybersecurity 2026 survey finds 92% of security professionals concerned about AI agents across the workforce. Top worries include sensitive data exposure (61%), policy violations (56%), and tool misuse (51%). Only 37% report a formal AI policy in place. (Source: Darktrace)

Shadow AI pressure: Darktrace frames unsanctioned AI discovery as a top governance task. Beam's 2026 breach roundup cites HiddenLayer finding 76% of organizations flag shadow AI as a growing problem, a stat that does not appear in the Darktrace blog excerpt but aligns with the same survey wave. (Sources: Darktrace, Beam)

Why agent supply chain attacks now look like npm at scale

Why this matters: OpenClaw skills are ZIP packages with code, config, metadata, and natural-language instructions. Attackers treated ClawHub like an open package registry with no signing gate, repeating npm's early trust mistakes at agent speed.

Antiy classifies the batch as Trojan/OpenClaw.PolySkill: a composite of executable code and deceptive docs that manipulates both OpenClaw and the user. Attack types include ClickFix downloaders, Python os.system reverse shells disguised as crypto tools, and JavaScript credential theft. (Source: Antiy CERT)

The timeline is compressed: first malicious skill January 27, 2026; mass upload January 31 with 386 packages in one day; Koi public disclosure February 1; community response tools like Clawdex followed within days. (Source: Antiy CERT)

Beam's independent roundup reported 824 malicious skills by mid-February against 10,700 total listings, with 40,214 internet-exposed OpenClaw instances noted by SecurityScorecard. Antiy's 1,184 figure reflects historical packages including removals. Both confirm marketplace-scale poisoning, not a one-off upload. (Sources: Beam, Antiy CERT)

Operator note (first-hand): On June 8, 2026, this pipeline run fetched Antiy's ClawHavoc report and the NVD page for CVE-2026-41090 over HTTPS. The Microsoft MSRC advisory URL returned a loading shell without extractable patch text; CVSS vector and CWE metadata were taken from NVD's Microsoft CNA record instead.

Teams already worried about MCP marketplaces should read AgenticWire's CSA-confirmed MCP STDIO supply chain analysis. The pattern is identical: configuration and registry content becomes execution.

Practitioner payoff: CVE-2026-41090 and the Copilot injection class

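Practitioner payoff: A 9.3 Copilot CVE is not an abstract LLM safety debate. It is a hosted agent surface where crafted input delivered over the network (AV:N) can tamper beyond the vulnerable component's own security scope (S:C), with high confidentiality and integrity impact. The only precondition in the vector is user interaction (UI:R).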

NVD describes improper neutralization of special elements used in a command allowing an unauthorized attacker to perform tampering over a network. The exclusively-hosted-service tag signals a vendor-controlled remediation path rather than on-prem patching alone. (Source: NVD)

This sits in a line of Copilot-class failures. Beam documents EchoLeak (CVE-2025-32711, also CVSS 9.3) where hidden email instructions during routine summarization could exfiltrate OneDrive and SharePoint data through trusted Microsoft domains. CVE-2026-41090 is a separate 2026 entry, reinforcing that injection is a recurring product class, not a solved one-off. (Sources: Beam, NVD)

Inference: Command injection and prompt injection differ in mechanism but share a defender's checklist: treat all model-ingested content as untrusted input to a privileged runtime.

Defensive focus: survey data meets shadow AI inventory gaps

Defensive focus: When 92% of security professionals worry about agents but only 37% report a formal AI policy, the gap is operational inventory, not awareness.

Darktrace argues agents must be governed as identities with least privilege, behavior monitoring, and centralized visibility across prompts, sessions, and tool calls. Their report stresses discovering shadow AI: unsanctioned tools that bypass procurement and logging. (Source: Darktrace)
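
Centralized visibility can start as something very small: one append-only stream that every agent writes to. The following is a minimal sketch, assuming a JSON Lines format and hypothetical field names; it is not Darktrace's schema, and a production version would ship events to a log pipeline rather than a local stream.

```python
import io
import json
import time
import uuid


class AgentAuditLog:
    """Append prompts and tool calls from all agents to one JSON Lines stream."""

    ALLOWED_KINDS = {"prompt", "tool_call"}

    def __init__(self, stream):
        # Any text stream with a write() method works: a file, a socket wrapper, StringIO.
        self.stream = stream

    def record(self, agent_id: str, session_id: str, kind: str, payload: dict) -> dict:
        if kind not in self.ALLOWED_KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        event = {
            "ts": time.time(),
            "event_id": str(uuid.uuid4()),
            "agent_id": agent_id,
            "session_id": session_id,
            "kind": kind,
            "payload": payload,
        }
        self.stream.write(json.dumps(event, sort_keys=True) + "\n")
        return event


# Example: two events from one hypothetical agent session.
buffer = io.StringIO()
log = AgentAuditLog(buffer)
log.record("invoice-bot", "s-1", "prompt", {"text": "Summarize this email"})
log.record("invoice-bot", "s-1", "tool_call", {"tool": "http_get", "args": {"url": "https://api.example.invalid"}})
```

Keying every event by agent and session is the design choice that matters here: it lets a reviewer reconstruct which prompt preceded which tool call after the fact.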

The survey also highlights prompt behavior as an open-ended attack surface: the space of natural-language inputs is too large to enumerate the way CVEs enumerate known flaws. That is why static malware scanners miss skills whose payload is a paragraph in SKILL.md. (Sources: Darktrace, Antiy CERT)

Beam adds deployment-side pressure: 88% of organizations running agents reported a confirmed or suspected incident in the past year, while only 6% of security budgets target agent security, per their synthesis. (Source: Beam)

"AI agents introduce a new execution boundary: untrusted content plus real permissions plus automated action, in one loop that traditional security tools were not built to inspect."

Decision rule for teams: govern the execution boundary

Decision rule for teams: Do not install third-party skills, MCP servers, or Copilot extensions until you can answer who published them, what they can execute, and what identity they inherit.

Signal	Attack shape	First control
ClawHavoc	Malicious skill packages on ClawHub	Internal allowlist registry; block public ClawHub in production
CVE-2026-41090	Hosted Copilot command injection	Apply vendor patches; restrict high-risk Copilot actions pending MSRC guidance
Darktrace 92%	Governance and shadow AI gap	Agent inventory with Entra-style identity; block unsanctioned registries

Common mistake: Scanning skill ZIPs for known CVEs only. ClawHavoc payloads include adversarial natural-language install steps that never appear in vulnerability databases. (Source: Antiy CERT)

For OpenClaw specifically, CSA's Zero Trust guidance recommends disabling direct ClawHub access and routing skills through signed internal registries with publisher verification, the control that would have blocked ClawHavoc at scale. (Inference: AgenticWire read based on CSA OpenClaw research)
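
One way to picture the internal-registry gate is as a lookup that must pass before any install. The sketch below is a simplification under stated assumptions: it pins a SHA-256 digest per package instead of verifying a cryptographic signature, and the registry layout, package name, and publisher name are all hypothetical rather than anything CSA or OpenClaw defines.

```python
import hashlib
import hmac


def verify_skill(name: str, publisher: str, package_bytes: bytes, registry: dict) -> tuple[bool, str]:
    """Check a skill package against an internal index of approved publishers and digests."""
    entry = registry.get(name)
    if entry is None:
        return False, "not in internal registry"
    if publisher != entry["publisher"]:
        return False, "publisher mismatch"
    digest = hashlib.sha256(package_bytes).hexdigest()
    # Constant-time comparison; the pinned digest stands in for a real signature check.
    if not hmac.compare_digest(digest, entry["sha256"]):
        return False, "digest mismatch"
    return True, "ok"


# Hypothetical registry entry created when the package passed internal review.
reviewed_package = b"PK\x03\x04 example skill archive"
internal_registry = {
    "price-helper": {
        "publisher": "internal-platform-team",
        "sha256": hashlib.sha256(reviewed_package).hexdigest(),
    }
}
```

Digest pinning only proves the bytes match what was reviewed; it says nothing about who produced them, which is why a signed registry with verified publisher keys remains the stronger control.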

Pair marketplace controls with transport hygiene: AgenticWire's MCP STDIO configuration risk guide documents why command and args fields are subprocess launchers, not passive config.
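
To make the "config is a launcher" point concrete, here is a small audit sketch. It assumes the mcpServers JSON layout that several MCP clients use for STDIO servers, with command and args per server; the allowlist contents and the metacharacter heuristic are hypothetical choices for illustration.

```python
import json
import os

# Hypothetical allowlist of launchers a team has approved for STDIO servers.
ALLOWED_COMMANDS = {"node", "python3"}
SHELL_METACHARS = set(";|&`$><")


def audit_mcp_config(config_text: str) -> list[tuple[str, str]]:
    """Return (server_name, finding) pairs for risky command/args entries."""
    config = json.loads(config_text)
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        command = str(server.get("command", ""))
        if os.path.basename(command) not in ALLOWED_COMMANDS:
            findings.append((name, f"command not allowlisted: {command}"))
        for arg in server.get("args", []):
            if SHELL_METACHARS & set(str(arg)):
                findings.append((name, f"shell metacharacter in arg: {arg}"))
    return findings
```

The check treats every server entry as a process that will be spawned with the agent's privileges, which is the review posture the paragraph above argues for.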

Context: From plugin marketplaces to hosted copilots

ClawHavoc arrived as OpenClaw (formerly ClawdBot/Moltbot) surged in popularity, turning skill marketplaces into high-leverage supply chain targets. Microsoft later positioned OpenClaw inside Microsoft Scout with Entra ID, Purview, and MXC sandboxes, an enterprise counter-pattern to raw ClawHub installs. (Sources: Antiy CERT, AgenticWire Scout coverage)

Beam maps recurring failures to OWASP Agentic Top 10 categories: goal hijack, tool misuse, identity abuse, and supply chain compromise. None required exotic techniques; the root causes were unpatched CVEs, missing code review, unlimited permissions, and trust-by-default registries. (Source: Beam)

The through-line for 2026 is identical across ClawHub, MCP marketplaces, and hosted copilots: the agent runtime is a security product, not a UX wrapper around a model API.

How to secure AI agents in production

Skill and MCP procurement: Treat ClawHub, MCP registries, and npm-adjacent agent skills like production dependencies. Require publisher identity, code review of SKILL.md and manifests, and sandboxed execution before granting filesystem or network tools. (Sources: Antiy CERT, Beam)

Hosted copilot patching: Track Microsoft security updates for CVE-2026-41090 via MSRC. Until patched, limit Copilot actions on sensitive tenants and log anomalous summarization of external mail. (Source: NVD)

Agent identity inventory: Map every autonomous agent to an owner, scoped credentials, and approval gates for high-impact actions. Darktrace's identity framing applies whether the agent runs in M365, OpenClaw, or a custom MCP stack. (Source: Darktrace)
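
An inventory entry can double as the authorization check. The following is a minimal sketch, assuming a hypothetical record shape and action names; it illustrates the owner, scope, and approval-gate idea rather than any vendor's identity model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    owner: str                               # named human or team accountable for the agent
    scopes: frozenset                        # actions the agent's credentials permit
    gated_actions: frozenset = frozenset()   # in-scope actions that still need sign-off


def authorize(agent: AgentIdentity, action: str, approved_by: Optional[str] = None) -> str:
    """Decide whether an agent may perform an action right now."""
    if not agent.owner:
        return "deny: no owner"
    if action not in agent.scopes:
        return "deny: out of scope"
    if action in agent.gated_actions and approved_by is None:
        return "pending: approval required"
    return "allow"


# Hypothetical agent that may read invoices freely but needs approval to send payments.
invoice_bot = AgentIdentity(
    agent_id="invoice-bot",
    owner="finance-platform",
    scopes=frozenset({"read_invoice", "send_payment"}),
    gated_actions=frozenset({"send_payment"}),
)
```

With this record, authorize(invoice_bot, "send_payment") returns a pending decision until a human approver is supplied, while an action outside the scope set is denied regardless of approval.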

Egress and containment: Antiy and CSA both emphasize egress allowlists: even a successful malicious skill cannot exfiltrate if outbound traffic is restricted to known API endpoints. (Sources: Antiy CERT, Inference from CSA OpenClaw guidance)
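
The allowlist logic itself is short. Below is a sketch of an application-level check, assuming a hypothetical allowlist of two API hosts; real enforcement belongs at the network layer (proxy or firewall), where a compromised skill cannot simply skip the function call.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the API hosts this particular agent is expected to call.
EGRESS_ALLOWLIST = frozenset({"api.openai.com", "api.anthropic.com"})


def egress_allowed(url: str, allowlist=EGRESS_ALLOWLIST) -> bool:
    """Allow only HTTPS requests whose host exactly matches an allowlisted name."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    # Exact match on the parsed hostname, so look-alike suffixes and
    # userinfo tricks such as https://api.openai.com@evil.example/ fail.
    return host in allowlist
```

Matching on the parsed hostname rather than on a substring of the URL is the important design choice: substring checks are what look-alike domains are built to defeat.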

For sandbox patterns on vendor SDKs, see OpenAI Agents SDK harness versus native sandboxes.

FAQ

What should an AI agent security checklist cover?

Three layers. Supply chain: publisher identity, manifest review, signed internal registry, sandboxing, secret placement, and egress allowlists. Identity: a named owner, scoped credentials, and approval gates per agent. Coverage: vendor advisory tracking for hosted copilots and shadow AI discovery for tools that skipped procurement. (Sources: Antiy CERT, Darktrace)

How do you secure AI agents in production?

Treat agent skills and MCP servers as production dependencies, not plugins. Require publisher identity and code review before install, run the agent sandboxed with scoped credentials, restrict outbound traffic to known API endpoints, and log every prompt and tool call. Patch hosted copilots on the vendor's advisory cadence. (Sources: Antiy CERT, Beam)

How do you make a secure AI agent?

Start from least privilege. Give the agent one identity, the narrowest credential set that completes its task, and no filesystem or network tool it does not need. Add approval gates for high-impact actions, then instrument the loop so prompts, sessions, and tool calls are all reviewable after the fact. (Source: Darktrace)

Does an AI agent security checklist replace vulnerability scanning?

No, and relying on scanning alone is the common failure. ClawHavoc payloads included adversarial natural-language install instructions inside SKILL.md files, which never appear in a vulnerability database. Scanners catch known CVEs; only manifest review and sandboxing catch a payload written as a paragraph. (Source: Antiy CERT)

Related coverage

MCP "By Design" RCE: CSA confirms systemic STDIO supply chain risk - Why MCP configuration is an execution surface parallel to ClawHub poisoning.
MCP STDIO risk: when config becomes command execution - Practical controls for command and args fields in agent tool wiring.
Scout and OpenClaw: Windows becomes an agent OS - How Microsoft wraps OpenClaw with identity, policy, and MXC containment.
OpenAI Agents SDK: sandboxes and harness for safer agents - Harness versus sandbox split for long-running agent loops.

References

Antiy CERT ClawHavoc analysis - https://www.antiy.net/p/clawhavoc-analysis-of-large-scale-poisoning-campaign-targeting-the-openclaw-skill-market-for-ai-agents
Beam.ai 2026 agent security breaches - https://beam.ai/agentic-insights/ai-agent-security-breaches-2026-lessons
Darktrace State of AI Cybersecurity 2026 - https://www.darktrace.com/blog/state-of-ai-cybersecurity-2026-92-of-security-professionals-concerned-about-the-impact-of-ai-agents
Microsoft MSRC CVE-2026-41090 - https://msrc.microsoft.com/update-guide/vulnerability/CVE-2026-41090
NVD CVE-2026-41090 - https://nvd.nist.gov/vuln/detail/CVE-2026-41090

AgenticWire Desk

Related Coverage

MCP Security Scanners: mcp-scan vs MCPhound vs MCPShield

OWASP MCP Top 10: How to Fix All 10 Risks in 2026

MCP OAuth 2.1: Implement Auth Without Dynamic Client Registration
