
System Overview

claw-lens is a local-first observability tool for AI agents. It reads the data OpenClaw already writes to disk — session logs, cache traces, cron jobs, agent memory files, and configuration — parses it into SQLite, and serves it to a React frontend through an Express API. The entire system runs locally. No external dependencies, no deployment. A single npx claw-lens-cli starts everything. Session logs go through the Parser into SQLite. Cache traces, cron jobs, agent memory files, and config are read directly by the API layer on demand.

Core Principles

Assumption: claw-lens users are not necessarily engineers. They’re tech-savvy, understand products and business, and know enough to ship with AI agents. They may not write code today, but they learn fast — and will gradually build deeper technical understanding as they go. The decisions below follow from this.

1. Zero Configuration

User assumption: The only prerequisite is Node.js. No additional infrastructure, no external services, no configuration files. How it shows up in code:
  • Embedded SQLite via better-sqlite3, single file at ~/.openclaw/claw-lens.db.
  • On startup: auto-creates schema (first run) or applies schema changes (version upgrade, e.g. adding provider on messages, arguments on tool_calls; clearing and rebuilding tables when scoring logic changes), ingests all session data (skipping unchanged files), loads model definitions from the local OpenClaw installation, and opens the browser.
  • Simplest entry point: npx claw-lens-cli — no install required. Also supports npm install -g claw-lens-cli for global install. Two CLI flags: --port (default 4242, also respects the PORT env var) and --no-open (suppress browser auto-open). OPENCLAW_HOME defaults to ~/.openclaw. Separate dev mode and production build workflows are available for contributors.
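
The flag/env/default precedence for the port can be sketched as follows; resolvePort is a hypothetical helper and the actual CLI parsing may differ:

```typescript
// Hypothetical sketch of port resolution as described above:
// the --port flag wins, then the PORT env var, then the 4242 default.
function resolvePort(argv: string[], env: Record<string, string | undefined>): number {
  const i = argv.indexOf("--port");
  if (i !== -1 && argv[i + 1] !== undefined) return Number(argv[i + 1]);
  if (env.PORT !== undefined) return Number(env.PORT);
  return 4242;
}
```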

2. Cost First, Tokens on Demand

User assumption: The user cares about how much their agents cost. They understand $3.93 today instantly. They might not yet know what “42K input tokens” or “cache write tokens” means — but they will eventually, and when they do, the data is there. How it shows up in code:
  • The Overview KPI strip leads with Cost Today in USD, with 7-day and week-over-week comparisons, followed by Tokens Today, Sessions Today, Errors Today, and Cache Efficiency.
  • Cost is visible everywhere: per session in the session list, per model on the Overview, per agent on the Agents page.
  • fmtCost() formats all costs in USD with 4-decimal precision (e.g. $3.9253), down to per-message granularity. fmtTokens() formats token counts in human-readable scale (1.5M, 42.3K).
  • Token breakdown, cache hit rates (computed as cache_read / (cache_read + input_tokens)), per-model and per-agent cost splits, and cron vs. manual cost comparison are one click from the Overview (KPI cards link to the TokenUsage page). The detail is always there — the user gets to it when they need it.
  • Audit findings are classified by risk level (high/medium/low) and labeled with human-readable pattern names — e.g. “API Key / Access Token”, “AWS Access Key”, “Private Key (Credential)”, “Prompt Injection” — rather than raw pattern types or numeric scores.
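
The formatting helpers and the cache-efficiency formula above can be sketched as follows; these are illustrative reimplementations, not the actual UI source:

```typescript
// Illustrative sketch of fmtCost: USD with 4-decimal precision.
function fmtCost(usd: number): string {
  return `$${usd.toFixed(4)}`;
}

// Illustrative sketch of fmtTokens: human-readable scale (1.5M, 42.3K).
function fmtTokens(n: number): string {
  if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`;
  if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`;
  return String(n);
}

// Cache hit rate as defined above: cache_read / (cache_read + input_tokens).
function cacheHitRate(cacheRead: number, inputTokens: number): number {
  const denom = cacheRead + inputTokens;
  return denom === 0 ? 0 : cacheRead / denom;
}
```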

3. Local-First, Read-Only

claw-lens is a read-only observer. It reads the files OpenClaw already writes to disk and presents them — it does not instrument the agent runtime, does not modify session files, and does not send data off the machine. How it shows up in code:
  • Server binds to 127.0.0.1, CORS restricted to localhost. No outbound HTTP calls, no telemetry, no analytics. The only network connection is the WebSocket to the local OpenClaw Gateway.
  • All file operations are reads. The only file claw-lens writes is its own SQLite database (claw-lens.db). Stopping claw-lens has zero impact on running agents.
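
A minimal sketch of the loopback-only bind, using node:http in place of Express, so the names and details here are illustrative rather than the actual server code:

```typescript
import { createServer } from "node:http";

// Binding to 127.0.0.1 means the dashboard is unreachable from other
// machines; only processes on this host can connect.
const server = createServer((_req, res) => {
  // CORS restricted to a localhost origin (the Vite dev server, in this sketch).
  res.setHeader("Access-Control-Allow-Origin", "http://localhost:6060");
  res.end("ok");
});

// Port 0 picks a free port for the sketch; claw-lens itself binds 4242.
server.listen(0, "127.0.0.1");
```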

4. The Database Is a Cache, Not a Source of Truth

JSONL session files are the source of truth. The SQLite database is a derived index that can be rebuilt at any time. How it shows up in code:
  • ingestAll() scans all JSONL files on every startup, re-ingesting any file that is new or changed (unchanged files are skipped via an mtime + size check).
  • Deleting claw-lens.db and re-running npx claw-lens-cli restores everything. No data is lost because nothing in SQLite is original — it all comes from files on disk.
  • When we change how data is computed (e.g. risk scoring logic), affected tables are cleared and re-ingested from source. This is safe precisely because the database is disposable.
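
The change check can be sketched as a pure predicate over the stored watermark; field names follow the ingest_state table, while the actual check lives in db.ts:

```typescript
// A file is re-ingested only when it has no watermark yet (first run) or its
// observed mtime or size differs from the stored one.
interface FileWatermark {
  mtime_ms: number;
  size_bytes: number;
}

function shouldReingest(
  prev: FileWatermark | undefined,
  mtimeMs: number,
  sizeBytes: number,
): boolean {
  if (!prev) return true; // never seen before
  return prev.mtime_ms !== mtimeMs || prev.size_bytes !== sizeBytes;
}
```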

Module Responsibilities

CLI Entry (bin/claw-lens.ts)

Accepts the --port and --no-open flags and the OPENCLAW_HOME environment variable, then calls startServer() to boot the full service. Default port: 4242.

Parser (src/server/parser.ts)

Input: JSONL files under ~/.openclaw/agents/*/sessions/, including archived files with .deleted and .reset suffixes. Responsibilities:
  • findSessionFiles(): Scans all agent directories, discovers session files, deduplicates by session ID (prefers active files).
  • parseSessionFile(): Parses JSONL line by line, extracting messages, tool calls, and session metadata.
  • Cron detection: A session is marked as cron if the first user message contains a [cron:UUID task-name] prefix.
  • Non-billable message filtering: Excludes internal Gateway messages (e.g. delivery-mirror, gateway-injected) that don’t represent actual LLM calls, so they don’t inflate cost or token counts.
Output: ParsedSession, ParsedMessage[], ParsedToolCall[].
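
The cron prefix check can be sketched as follows; the exact regex, including the assumed UUID shape, is illustrative rather than the real parser.ts pattern:

```typescript
// Matches a leading "[cron:UUID task-name]" prefix on the first user message,
// as described above. The UUID shape (36 hex/hyphen chars) is an assumption.
const CRON_PREFIX = /^\[cron:([0-9a-f-]{36})\s+([^\]]+)\]/i;

function detectCron(firstUserMessage: string): { taskId: string; taskName: string } | null {
  const m = CRON_PREFIX.exec(firstUserMessage);
  return m ? { taskId: m[1], taskName: m[2] } : null;
}
```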

Database (src/server/db.ts)

Responsibilities:
  • openDb(): Opens SQLite with WAL mode and foreign keys enabled.
  • initSchema(): Creates all tables on first run, applies additive schema changes on version upgrades.
  • ingestAll(): Iterates all discovered session files, calls parseSessionFile() + ingestSession() + ingestAuditEvents() for each. Change detection (mtime + size) skips unmodified files. Supports force mode for full wipe and rebuild.
  • ingestSession(): Per-session upsert transaction covering sessions / messages / tool_calls tables.
  • After ingestion, calls rebuildAllBaselines() to update per-agent behavioral profiles used by anomaly detection.

API Routes (src/server/api/)

Route            Purpose
/api/sessions    Session list, filters, context health
/api/timeline    Cost & token trends bucketed by day/hour
/api/tools       Tool usage stats, duration distributions, heatmap
/api/stats       Agent-level statistics
/api/audit       Security audit event timeline
/api/profiler    Session rankings, token consumption analysis
/api/cron        Cron job management
/api/debug       Cache trace, context breakdown
/api/tokens      Token consumption summary and trends
/api/memory      Agent memory file reads
/api/refresh     Force re-ingest

WebSocket Proxy (src/server/api/live.ts)

live.ts powers two user-facing capabilities:
  1. File watcher: Watches session directories for .jsonl changes via fs.watch. On change, debounces 500ms, runs ingestAll(), then broadcasts data_updated to all connected browser clients. This keeps the dashboard current without manual refresh.
  2. Live Monitor (/live page): Proxies real-time agent activity events from the OpenClaw Gateway to the browser via /ws/live, triggering immediate data refresh on the Live Monitor page without waiting for the 30-second polling interval.
The Gateway connection reads an auth token from ~/.openclaw/openclaw.json and sends it as a Bearer header on the WebSocket handshake. If the Gateway process is not running or crashes, the connection drops; live.ts automatically retries with exponential backoff (1s, 2s, 4s, … capped at 30s) until the Gateway comes back, and in the meantime the file watcher still keeps the dashboard current. If session directories don’t exist yet (e.g. no agents have run), the UI falls back to 30-second polling.
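
The watcher’s 500ms debounce from step 1 can be sketched as follows; makeDebounced is a hypothetical helper, with onFlush standing in for the real ingestAll() plus data_updated broadcast:

```typescript
// Trailing debounce: every event restarts the window, so a burst of fs
// events within delayMs collapses into a single flush after the burst ends.
function makeDebounced(onFlush: () => void, delayMs = 500): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(onFlush, delayMs); // restart the window on every event
  };
}
```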

Audit System (src/server/audit/)

Independent security audit subsystem:
  • audit-parser.ts: Reads session JSONL and processes tool calls in three passes — (1) build a tool call map from assistant messages, (2) match tool results, assess risk flags, scan for sensitive data and prompt injection, (3) compute final risk scores with full session context (e.g. was a discovered credential followed by an external call?).
  • risk-scorer.ts: Three-level risk scoring. High (3): rm -rf /, credential exfiltration, prompt injection. Medium (2): sudo, exposed secrets, new external domains. Low (1): unusual hours, volume spikes, atypical file paths. Events scoring 0 are not surfaced in the UI.
  • baseline.ts: Builds a per-agent behavioral profile from the last 30 days — top 20 tools, top 20 directories, top 12 active hours, average tool calls per session, known domains. Rebuilt after every ingestion cycle.
  • anomaly.ts: Compares each tool call against the agent’s baseline to detect deviations — activity outside typical hours, tool call volume >3x the session average, or file access outside typical directories. These deviations become Low-level risk flags.
  • sensitive-data.ts: 34 regex patterns covering API keys (Anthropic, OpenAI, AWS, GitHub, etc.), private keys, database URIs, PII. Matched secrets are masked: first 6 + •••••• + last 4 characters.
  • sensitive-paths.ts: File path pattern matching for sensitive locations — .ssh/, .env, keychain, credential files, PEM/PKCS12 keys. Includes a whitelist for OpenClaw’s own workspace paths.
  • injection-scanner.ts: 9 prompt injection patterns — instruction override, role hijack, exfil request, base64 payload, DAN/jailbreak, etc.
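
The masking rule from sensitive-data.ts can be sketched as follows; the fallback for very short secrets is an assumption:

```typescript
// Masks a matched secret as described above: first 6 characters, a fixed
// "••••••" filler, then the last 4. Secrets too short to split safely are
// fully masked (an assumed behavior, not confirmed by the source).
function maskSecret(secret: string): string {
  if (secret.length <= 10) return "••••••";
  return secret.slice(0, 6) + "••••••" + secret.slice(-4);
}
```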

React Frontend (src/ui/)

Vite + React 19 + React Router. The dev server runs on port 6060 and proxies API requests to the backend on port 4242.
Path                 Component          Purpose
/                    Overview           KPI dashboard
/sessions            Sessions           Session table + filters
/agents              Agents             Agent-level stats + memory
/live                LiveMonitor        Real-time Gateway monitor
/audit               Audit              Security audit timeline
/profiler            Profiler           Tool timing analysis
/tokens              TokenUsage         Token consumption breakdown
/timeline            SessionTimeline    Message timeline
/memory              Memory             Agent memory viewer
/cron                Cron               Scheduled task management
/deepturn            AgentLoops         Deep turn analysis
/contextbreakdown    DebugContext       Context window breakdown
/cachetrace          DebugReplay        Cache trace viewer
/settings            Settings           Configuration

Data Model

Full column definitions for each table follow; foreign-key columns show how the tables relate.

sessions — one row per agent session.
Column                  Type   Notes
id                      text   PK, session UUID
agent_name              text   agent identifier
started_at / ended_at   int    Unix ms timestamps
total_messages          int    message count
total_cost              real   aggregated USD cost
total_tokens            int    aggregated tokens
primary_model           text   dominant model used
error_count             int    failed tool calls / errored messages
is_cron                 int    1 if session started from a cron task
cron_task               text   cron task name if applicable
task_summary            text   extracted task description
ingested_at             int    Unix ms, last ingest timestamp

messages — one row per LLM turn (user / assistant / tool result).
Column                               Type   Notes
id                                   text   PK, message UUID
session_id                           text   FK → sessions.id
agent_name                           text   denormalized for fast filtering
parent_id                            text   parent message (for branching)
timestamp                            int    Unix ms
model / provider                     text   e.g. claude-sonnet-4 / anthropic
role                                 text   user / assistant / tool
input_tokens / output_tokens         int    per-turn token usage
cache_read / cache_write             int    prompt cache hits/writes
total_tokens                         int    sum of the above
cost_total                           real   per-turn cost
cost_input / cost_output             real   split by direction
cost_cache_read / cost_cache_write   real   cache-related cost
stop_reason                          text   e.g. end_turn, tool_use, max_tokens
error_message                        text   populated when call errored
has_error                            int    boolean flag
is_tool_result                       int    1 if message is a tool result (not user-authored)

tool_calls — one row per tool invocation inside an assistant message.
Column        Type   Notes
id            text   PK (composite with message_id)
message_id    text   PK, parent assistant message
session_id    text   FK → sessions.id
agent_name    text   denormalized
timestamp     int    Unix ms
tool_name     text   e.g. bash, read, edit
duration_ms   int    execution time
success       int    boolean flag
arguments     text   JSON-encoded arguments

audit_events — security-relevant events extracted from tool calls.
Column       Type   Notes
id           int    PK, autoincrement
session_id   text   FK → sessions.id
agent_id     text   agent identifier
timestamp    int    Unix ms
event_type   text   category (path_access, exec, web_fetch, …)
tool_name    text   source tool
target       text   file path, URL, or command
extra_json   text   structured context
risk_flags   text   CSV of triggered flags
risk_score   int    0-3 (none / low / medium / high)
raw_input    text   original tool input
raw_output   text   original tool output

sensitive_findings — secrets and prompt-injection patterns detected in message content.
Column                      Type   Notes
id                          int    PK
audit_event_id              int    FK → audit_events.id
session_id                  text   FK → sessions.id
agent_id                    text   agent identifier
timestamp                   int    Unix ms
pattern_type                text   credential / api_key / prompt_injection / exfil_request / …
pattern_matched             text   the regex or keyword that matched
context                     text   redacted snippet around the match
severity                    text   low / medium / high
dismissed                   int    user-acknowledged flag
followed_by_external_call   int    1 if a web/exec call followed the finding

ingest_state — per-file watermark for incremental parsing.
Column        Type   Notes
file_path     text   PK, absolute JSONL path
mtime_ms      int    last observed mtime in ms
size_bytes    int    last observed size in bytes
ingested_at   int    Unix ms

agent_baselines — per-agent behavioral profile over the last 30 days.
Column                       Type   Notes
agent_id                     text   PK
computed_at                  int    Unix ms, last rebuild time
common_tools                 text   JSON array of frequently used tools
typical_paths                text   JSON array of frequently accessed paths
typical_hours                text   JSON array of active hours
avg_tool_calls_per_session   real   average tool calls per session
known_domains                text   JSON array of seen external domains

audit_ingest_state — per-file watermark for audit parsing (separate from session ingest).
Column        Type   Notes
file_path     text   PK, absolute JSONL path
mtime_ms      int    last observed mtime in ms
size_bytes    int    last observed size in bytes
ingested_at   int    Unix ms

settings — key-value store for schema version tracking and configuration.
Column   Type   Notes
key      text   PK (e.g. audit_scoring_version, session_aggregate_version)
value    text   version string or config value
Indexing strategy: sessions has separate indexes on agent_name and started_at; messages on session_id, timestamp, model; tool_calls on session_id, tool_name, timestamp; audit_events on (agent_id, timestamp), (event_type, timestamp), session_id. These cover the primary query paths from the API layer.

Technical Choices

SQLite (better-sqlite3) over PostgreSQL

SQLite is the only database that matches the operational model of a local-first tool. claw-lens is distributed via npx: there is no server to provision, no connection string to configure. The database is a single file at ~/.openclaw/claw-lens.db that lives next to the data it indexes. better-sqlite3 provides a low-overhead synchronous binding to SQLite’s C engine. The real performance gain comes from transaction batching: on startup, ingestAll wraps thousands of JSONL record upserts in a single db.transaction() block, which reduces fsync calls from one per row to one per transaction. WAL (Write-Ahead Logging) is enabled so that reads and writes can proceed independently: when you’re browsing the dashboard and a file watcher triggers a background re-ingest at the same time, your page loads don’t stall.
Considered: PostgreSQL (operational mismatch: requiring a running database process for a local tool defeats the purpose), LevelDB/RocksDB (no SQL: aggregations and joins across sessions/messages/tool_calls would be painful), Prisma + SQLite (unnecessary abstraction for a schema we fully control).
Trade-off: SQLite allows only one writer at a time, so under heavy ingestion, concurrent write attempts would queue behind a lock. In practice this is not an issue: claw-lens runs as a single Express server on one machine, and ingestion is the only write path. No concurrent write access from multiple processes, no cross-machine querying; both are acceptable for this use case.

Express over Fastify

Express is chosen for simplicity and debuggability. For a local, single-user dashboard, operational overhead matters more than performance. The router pattern maps cleanly to our ~10 API modules. Considered: Fastify (better performance and built-in schema validation, but both are unnecessary for internal APIs), Koa and Hono (similar capabilities but with smaller ecosystems or less mature tooling). Trade-off: Express lacks built-in validation and has middleware ordering pitfalls, but these are acceptable given the controlled environment.

WebSocket Proxy over Direct Browser Connection or SSE

The claw-lens server connects to the OpenClaw Gateway over WebSocket (authenticated with a local token) and forwards events to browser tabs via its own WebSocket endpoint (/ws/live). The browser side is receive-only, so SSE would work; but the upstream is already WebSocket (the ws library), and using the same protocol for the downstream half keeps it to one library, one connection model, and less code to maintain. This is a convenience choice, not a fundamental architectural requirement. Socket.IO adds a higher-level protocol layer we do not need, and plain polling adds too much latency for live monitoring. Trade-off: Reconnection is handled manually with exponential backoff (1s → 30s cap). If the Gateway is unavailable, claw-lens still serves historical data; the system degrades gracefully.
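
The backoff schedule can be sketched as a pure function; reconnectDelayMs is a hypothetical helper matching the 1s → 30s schedule described above, with a 0-based attempt counter:

```typescript
// Exponential backoff: 1s, 2s, 4s, 8s, … capped at 30s.
function reconnectDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```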

Rule-Based Risk Scoring, not ML

The audit system uses deterministic rules rather than probabilistic models. Each tool call is evaluated against a fixed rule set — sensitive path access, dangerous command patterns, secret exposure, prompt injection signatures — and assigned a risk level (High / Medium / Low). The rules are transparent: anyone can read risk-scorer.ts and understand exactly why something was flagged. Considered: LLM-based classification (hallucination risk on security judgments — not acceptable for a security feature), anomaly-only detection (misses known-bad patterns that deterministic rules catch reliably). Trade-off: Rules can’t catch novel attack patterns. Anomaly detection (volume spikes, unusual hours, new domains) provides a second layer, but truly novel threats require rule updates.
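
In this spirit, deterministic scoring reduces to matching a tool call against a fixed rule list and taking the highest matching level. The rules below are invented examples for illustration, not the real set in risk-scorer.ts:

```typescript
type RiskLevel = 0 | 1 | 2 | 3;

// Invented example rules; each is a named, human-readable pattern.
const RULES = [
  { name: "destructive-rm", pattern: /\brm\s+-rf\s+\//, level: 3 as RiskLevel },
  { name: "sudo", pattern: /\bsudo\b/, level: 2 as RiskLevel },
];

// Returns the highest level among matching rules plus which rules fired,
// so every flag is traceable to a readable rule name.
function scoreCommand(cmd: string): { level: RiskLevel; matched: string[] } {
  const hits = RULES.filter((r) => r.pattern.test(cmd));
  const level = hits.reduce<RiskLevel>((max, r) => (r.level > max ? r.level : max), 0);
  return { level, matched: hits.map((r) => r.name) };
}
```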

Disposable Database, Rebuild-Based Schema Evolution

claw-lens is distributed via npm/npx, so upgrades must be zero-maintenance. There is no deploy pipeline, DBA, or migration window — but the previous claw-lens.db file may still exist on disk with an older schema. The key design choice is that the SQLite database is a derived cache, not the system of record. JSONL session files are the source of truth. That allows us to favor rebuild-based compatibility over complex migrations.
  • Startup validation: On startup, initSchema runs all table creation and schema checks before the API begins serving queries. The schema is guaranteed to be current before any request is handled.
  • Additive changes: New columns are added with ALTER TABLE ADD COLUMN; if the column already exists, the operation is skipped.
  • Breaking changes: Versioned derived data (e.g. scoring logic, data format) is invalidated via version keys in the settings table (e.g. audit_scoring_version, session_aggregate_version) and rebuilt from source JSONL files.
  • Failure recovery: If the database becomes inconsistent, deleting claw-lens.db and restarting safely rebuilds the cache from source.
This works because claw-lens does not treat SQLite as primary storage; it treats it as an index and query layer over durable files on disk.
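
The additive-change path can be sketched as follows; addColumnIfMissing is a hypothetical helper, and the db parameter merely mirrors better-sqlite3’s exec() shape rather than the actual initSchema code:

```typescript
// Attempt the ALTER and treat SQLite's "duplicate column" error as
// "already applied" — the idempotent additive migration described above.
function addColumnIfMissing(
  db: { exec(sql: string): void },
  table: string,
  columnDef: string,
): void {
  try {
    db.exec(`ALTER TABLE ${table} ADD COLUMN ${columnDef}`);
  } catch (err) {
    // SQLite reports "duplicate column name: <col>" when the column exists.
    if (!String(err).includes("duplicate column")) throw err;
  }
}
```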

What We Don’t Build

These are intentional exclusions, not a to-do list.
  • Cloud sync: All data stays on the user’s machine. No account, no login, no data leaving localhost. A builder using AI agents for sensitive work shouldn’t worry about their session logs being uploaded anywhere. Serves: privacy-conscious users, enterprise builders.
  • Multi-tenant / team features: claw-lens is a single-player tool. One machine, one user, one SQLite file. Team observability is a different product with different trust boundaries. Serves: solo builders who want simplicity over collaboration overhead.
  • Instrumentation SDK: claw-lens doesn’t inject code into your agents and doesn’t require adding import clawLens from 'claw-lens' to your agent code. It reads JSONL files that OpenClaw already writes. Zero coupling. Serves: users who don’t want to modify their agent setup.
  • Alert routing / PagerDuty: Alerts were prototyped and removed (DROP TABLE alert_history, alert_rules, alert_routing_policies). A local dashboard that nobody else sees doesn’t need PagerDuty; the user is already looking at the screen. Serves: users who don’t have an ops team.