System Overview
claw-lens is a local-based observability tool for AI agents. It reads the data OpenClaw already writes to disk — session logs, cache traces, cron jobs, agent memory files, and configuration — parses it into SQLite, and serves it to a React frontend through an Express API. The entire system runs locally. No external dependencies, no deployment. A single `npx claw-lens-cli` starts everything.
Session logs go through the Parser into SQLite. Cache traces, cron jobs, agent memory files, and config are read directly by the API layer on demand.
Core Principles
Assumption: claw-lens users are not necessarily engineers. They’re tech-savvy, understand products and business, and know enough to ship with AI agents. They may not write code today, but they learn fast — and will gradually build deeper technical understanding as they go. The decisions below follow from this.

1. Zero Configuration
User assumption: The only prerequisite is Node.js. No additional infrastructure, no external services, no configuration files.

How it shows up in code:

- Embedded SQLite via `better-sqlite3`, single file at `~/.openclaw/claw-lens.db`.
- On startup: auto-creates schema (first run) or applies schema changes (version upgrade, e.g. adding `provider` on messages, `arguments` on tool_calls; clearing and rebuilding tables when scoring logic changes), ingests all session data (skipping unchanged files), loads model definitions from the local OpenClaw installation, and opens the browser.
- Simplest entry point: `npx claw-lens-cli` — no install required. Also supports `npm install -g claw-lens-cli` for global install. Two CLI flags: `--port` (default 4242, also respects the `PORT` env var) and `--no-open` (suppress browser auto-open). `OPENCLAW_HOME` defaults to `~/.openclaw`. Separate dev mode and production build workflows are available for contributors.
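The flag precedence described above (flag > `PORT` env var > default 4242) can be sketched as a small pure parser — `parseCliOptions` is a hypothetical name, not the actual export of `bin/claw-lens.ts`:

```typescript
// Sketch of the CLI option precedence: --port beats the PORT env var,
// which beats the 4242 default; --no-open suppresses browser auto-open;
// OPENCLAW_HOME falls back to ~/.openclaw. Names are illustrative.
interface CliOptions {
  port: number;
  open: boolean;
  openclawHome: string;
}

function parseCliOptions(
  argv: string[],
  env: Record<string, string | undefined>
): CliOptions {
  const flagIdx = argv.indexOf("--port");
  const flagPort = flagIdx >= 0 ? Number(argv[flagIdx + 1]) : NaN;
  const envPort = env.PORT !== undefined ? Number(env.PORT) : NaN;

  return {
    // first defined source wins: flag > env > default
    port: !Number.isNaN(flagPort) ? flagPort : !Number.isNaN(envPort) ? envPort : 4242,
    open: !argv.includes("--no-open"),
    openclawHome: env.OPENCLAW_HOME ?? `${env.HOME ?? ""}/.openclaw`,
  };
}
```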
2. Cost First, Tokens on Demand
User assumption: The user cares about how much their agents cost. They understand `$3.93 today` instantly. They might not yet know what “42K input tokens” or “cache write tokens” means — but they will eventually, and when they do, the data is there.
How it shows up in code:
- The Overview KPI strip leads with Cost Today in USD, with 7-day and week-over-week comparisons, followed by Tokens Today, Sessions Today, Errors Today, and Cache Efficiency.
- Cost is visible everywhere: per session in the session list, per model on the Overview, per agent on the Agents page.
- `fmtCost()` formats all costs in USD with 4-decimal precision (e.g. `$3.9253`), down to per-message granularity. `fmtTokens()` formats token counts in human-readable scale (`1.5M`, `42.3K`).
- Token breakdown, cache hit rates (computed as `cache_read / (cache_read + input_tokens)`), per-model and per-agent cost splits, and cron vs. manual cost comparison are one click from the Overview (KPI cards link to the TokenUsage page). The detail is always there — the user gets to it when they need it.
- Audit findings are classified by risk level (high/medium/low) and labeled with human-readable pattern names — e.g. “API Key / Access Token”, “AWS Access Key”, “Private Key (Credential)”, “Prompt Injection” — rather than raw pattern types or numeric scores.
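The formatting and cache-rate rules above can be sketched directly. A minimal version, assuming the thresholds implied by the examples (the real `fmtCost()` / `fmtTokens()` may differ in edge cases):

```typescript
// Sketch of the display helpers described above. Thresholds are assumptions.
function fmtCost(usd: number): string {
  // 4-decimal USD, down to per-message granularity: 3.9253 -> "$3.9253"
  return `$${usd.toFixed(4)}`;
}

function fmtTokens(n: number): string {
  // Human-readable scale: 1_500_000 -> "1.5M", 42_300 -> "42.3K"
  if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`;
  if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`;
  return String(n);
}

function cacheHitRate(cacheRead: number, inputTokens: number): number {
  // cache_read / (cache_read + input_tokens), guarding the zero case
  const denom = cacheRead + inputTokens;
  return denom === 0 ? 0 : cacheRead / denom;
}
```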
3. Local-Based, Read-Only
claw-lens is a read-only observer. It reads the files OpenClaw already writes to disk and presents them — it does not instrument the agent runtime, does not modify session files, and does not send data off the machine.

How it shows up in code:

- Server binds to `127.0.0.1`, CORS restricted to localhost. No outbound HTTP calls, no telemetry, no analytics. The only network connection is the WebSocket to the local OpenClaw Gateway.
- All file operations are reads. The only file claw-lens writes is its own SQLite database (`claw-lens.db`). Stopping claw-lens has zero impact on running agents.
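The loopback-only posture can be sketched with Node's stdlib (the real server uses Express; `isAllowedOrigin` and `startLocalServer` are illustrative names, not the actual code):

```typescript
import { createServer } from "node:http";

// Only browser origins on the local machine may call the API.
function isAllowedOrigin(origin: string): boolean {
  try {
    const { hostname } = new URL(origin);
    return hostname === "localhost" || hostname === "127.0.0.1";
  } catch {
    return false; // malformed Origin header -> not allowed
  }
}

// Binding to 127.0.0.1 (not 0.0.0.0) keeps the server unreachable
// from other machines, regardless of firewall configuration.
function startLocalServer(port: number) {
  const server = createServer((req, res) => {
    const origin = req.headers.origin;
    if (origin && isAllowedOrigin(origin)) {
      res.setHeader("Access-Control-Allow-Origin", origin);
    }
    res.end("ok");
  });
  server.listen(port, "127.0.0.1"); // loopback only
  return server;
}
```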
4. The Database Is a Cache, Not a Source of Truth
JSONL session files are the source of truth. The SQLite database is a derived index that can be rebuilt at any time.

How it shows up in code:

- `ingestAll()` scans all JSONL files and rebuilds the database on every startup (skipping unchanged files via mtime + size check).
- Deleting `claw-lens.db` and re-running `npx claw-lens-cli` restores everything. No data is lost because nothing in SQLite is original — it all comes from files on disk.
- When we change how data is computed (e.g. risk scoring logic), affected tables are cleared and re-ingested from source. This is safe precisely because the database is disposable.
Module Responsibilities
CLI Entry (bin/claw-lens.ts)
Accepts the `--port` and `--no-open` flags and the `OPENCLAW_HOME` environment variable. Calls `startServer()` to boot the full service. Default port 4242.
Parser (src/server/parser.ts)
Input: JSONL files under ~/.openclaw/agents/*/sessions/, including archived files with .deleted and .reset suffixes.
Responsibilities:
- `findSessionFiles()`: Scans all agent directories, discovers session files, deduplicates by session ID (prefers active files).
- `parseSessionFile()`: Parses JSONL line by line, extracting messages, tool calls, and session metadata.
- Cron detection: A session is marked as cron if the first user message contains a `[cron:UUID task-name]` prefix.
- Non-billable message filtering: Excludes internal Gateway messages (e.g. `delivery-mirror`, `gateway-injected`) that don’t represent actual LLM calls, so they don’t inflate cost or token counts.
Output: `ParsedSession`, `ParsedMessage[]`, `ParsedToolCall[]`.
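The cron prefix check and the non-billable filter above reduce to a regex and a tag set. A sketch — the exact regex and tag list are assumptions based on the examples given:

```typescript
// A session is cron-started when the first user message carries a
// "[cron:UUID task-name]" prefix. The UUID shape here is an assumption.
const CRON_PREFIX = /^\[cron:([0-9a-f-]{36})\s+([^\]]+)\]/i;

function detectCron(firstUserMessage: string): { task: string } | null {
  const m = CRON_PREFIX.exec(firstUserMessage);
  return m ? { task: m[2] } : null;
}

// Internal Gateway messages that never hit an LLM must not count toward
// cost or tokens. Tag list taken from the examples in the text.
const NON_BILLABLE = new Set(["delivery-mirror", "gateway-injected"]);

function isBillable(messageTag: string | undefined): boolean {
  return messageTag === undefined || !NON_BILLABLE.has(messageTag);
}
```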
Database (src/server/db.ts)
Responsibilities:
- `openDb()`: Opens SQLite with WAL mode and foreign keys enabled.
- `initSchema()`: Creates all tables on first run, applies additive schema changes on version upgrades.
- `ingestAll()`: Iterates all discovered session files, calls `parseSessionFile()` + `ingestSession()` + `ingestAuditEvents()` for each. Change detection (mtime + size) skips unmodified files. Supports force mode for full wipe and rebuild.
- `ingestSession()`: Per-session upsert transaction covering sessions / messages / tool_calls tables.
- After ingestion, calls `rebuildAllBaselines()` to update per-agent behavioral profiles used by anomaly detection.
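The change-detection step can be sketched as a pure predicate over the `ingest_state` watermark (type and function names here are illustrative, not the actual exports of `db.ts`):

```typescript
// Watermark recorded per file in the ingest_state table.
interface IngestState {
  mtimeMs: number;
  sizeBytes: number;
}

// A file is re-parsed only when its mtime or size differs from the
// recorded watermark, or when a full rebuild is forced.
function needsReingest(
  current: { mtimeMs: number; sizeBytes: number },
  recorded: IngestState | undefined,
  force = false
): boolean {
  if (force || recorded === undefined) return true;
  return (
    current.mtimeMs !== recorded.mtimeMs ||
    current.sizeBytes !== recorded.sizeBytes
  );
}
```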
API Routes (src/server/api/)
| Route | Purpose |
|---|---|
| `/api/sessions` | Session list, filters, context health |
| `/api/timeline` | Cost & token trends bucketed by day/hour |
| `/api/tools` | Tool usage stats, duration distributions, heatmap |
| `/api/stats` | Agent-level statistics |
| `/api/audit` | Security audit event timeline |
| `/api/profiler` | Session rankings, token consumption analysis |
| `/api/cron` | Cron job management |
| `/api/debug` | Cache trace, context breakdown |
| `/api/tokens` | Token consumption summary and trends |
| `/api/memory` | Agent memory file reads |
| `/api/refresh` | Force re-ingest |
WebSocket Proxy (src/server/api/live.ts)
live.ts powers two user-facing capabilities:
- File watcher: Watches session directories for `.jsonl` changes via `fs.watch`. On change, debounces 500ms, runs `ingestAll()`, then broadcasts `data_updated` to all connected browser clients. This keeps the dashboard current without manual refresh.
- Live Monitor (`/live` page): Proxies real-time agent activity events from the OpenClaw Gateway to the browser via `/ws/live`, triggering immediate data refresh on the Live Monitor page without waiting for the 30-second polling interval.
live.ts reads the Gateway token from `~/.openclaw/openclaw.json` and sends it as a Bearer header on the WebSocket handshake. If the Gateway process is not running (or crashes), the connection drops — live.ts automatically retries with exponential backoff (1s, 2s, 4s, … capped at 30s) until the Gateway comes back.
If the Gateway is not running, the file watcher still keeps the dashboard current. If session directories don’t exist yet (e.g. no agents have run), the UI falls back to 30-second polling.
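The retry schedule above (1s, 2s, 4s, … capped at 30s) is plain exponential backoff. A minimal sketch, with the actual socket connection abstracted behind a hypothetical `connect` callback:

```typescript
// Delay before reconnect attempt n (0-based): 1s, 2s, 4s, ... capped at 30s.
function backoffDelayMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 30_000);
}

// Hypothetical reconnect driver: `connect` resolves once the Gateway
// accepts the socket and rejects while it is down. The real live.ts
// wires this to the `ws` client; this sketch only shows the loop shape.
async function reconnectForever(
  connect: () => Promise<void>,
  sleep: (ms: number) => Promise<void>
): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      await connect();
      return; // connected; caller re-enters on the next disconnect
    } catch {
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```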
Audit System (src/server/audit/)
Independent security audit subsystem:
- `audit-parser.ts`: Reads session JSONL and processes tool calls in three passes — (1) build a tool call map from assistant messages, (2) match tool results, assess risk flags, scan for sensitive data and prompt injection, (3) compute final risk scores with full session context (e.g. was a discovered credential followed by an external call?).
- `risk-scorer.ts`: Three-level risk scoring. High (3): `rm -rf /`, credential exfiltration, prompt injection. Medium (2): `sudo`, exposed secrets, new external domains. Low (1): unusual hours, volume spikes, atypical file paths. Events scoring 0 are not surfaced in the UI.
- `baseline.ts`: Builds a per-agent behavioral profile from the last 30 days — top 20 tools, top 20 directories, top 12 active hours, average tool calls per session, known domains. Rebuilt after every ingestion cycle.
- `anomaly.ts`: Compares each tool call against the agent’s baseline to detect deviations — activity outside typical hours, tool call volume >3x the session average, or file access outside typical directories. These deviations become Low-level risk flags.
- `sensitive-data.ts`: 34 regex patterns covering API keys (Anthropic, OpenAI, AWS, GitHub, etc.), private keys, database URIs, PII. Matched secrets are masked: first 6 characters + `••••••` + last 4 characters.
- `sensitive-paths.ts`: File path pattern matching for sensitive locations — `.ssh/`, `.env`, keychain, credential files, PEM/PKCS12 keys. Includes a whitelist for OpenClaw’s own workspace paths.
- `injection-scanner.ts`: 9 prompt injection patterns — instruction override, role hijack, exfil request, base64 payload, DAN/jailbreak, etc.
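The masking rule in `sensitive-data.ts` (first 6 + `••••••` + last 4) is simple enough to sketch; the short-secret fallback below is an assumption, not confirmed behavior:

```typescript
// Mask a matched secret for display: keep the first 6 and last 4
// characters, replace the middle with a fixed "••••••" filler.
// Secrets too short to mask meaningfully are fully redacted (assumption).
function maskSecret(secret: string): string {
  if (secret.length <= 10) return "••••••";
  return `${secret.slice(0, 6)}••••••${secret.slice(-4)}`;
}
```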
React Frontend (src/ui/)
Vite + React 19 + React Router. Dev server runs on port 6060, proxied to backend 4242.
| Path | Component | Purpose |
|---|---|---|
| `/` | Overview | KPI dashboard |
| `/sessions` | Sessions | Session table + filters |
| `/agents` | Agents | Agent-level stats + memory |
| `/live` | LiveMonitor | Real-time Gateway monitor |
| `/audit` | Audit | Security audit timeline |
| `/profiler` | Profiler | Tool timing analysis |
| `/tokens` | TokenUsage | Token consumption breakdown |
| `/timeline` | SessionTimeline | Message timeline |
| `/memory` | Memory | Agent memory viewer |
| `/cron` | Cron | Scheduled task management |
| `/deepturn` | AgentLoops | Deep turn analysis |
| `/contextbreakdown` | DebugContext | Context window breakdown |
| `/cachetrace` | DebugReplay | Cache trace viewer |
| `/settings` | Settings | Configuration |
Data Model
The diagram below shows how tables relate to each other. Full column definitions follow in the tables.

sessions — one row per agent session.
| Column | Type | Notes |
|---|---|---|
id | text | PK, session UUID |
agent_name | text | agent identifier |
started_at / ended_at | int | Unix ms timestamps |
total_messages | int | message count |
total_cost | real | aggregated USD cost |
total_tokens | int | aggregated tokens |
primary_model | text | dominant model used |
error_count | int | failed tool calls / errored messages |
is_cron | int | 1 if session started from a cron task |
cron_task | text | cron task name if applicable |
task_summary | text | extracted task description |
ingested_at | int | Unix ms, last ingest timestamp |
messages — one row per LLM turn (user / assistant / tool result).
| Column | Type | Notes |
|---|---|---|
id | text | PK, message UUID |
session_id | text | FK → sessions.id |
agent_name | text | denormalized for fast filtering |
parent_id | text | parent message (for branching) |
timestamp | int | Unix ms |
model / provider | text | e.g. claude-sonnet-4 / anthropic |
role | text | user / assistant / tool |
input_tokens / output_tokens | int | per-turn token usage |
cache_read / cache_write | int | prompt cache hits/writes |
total_tokens | int | sum of above |
cost_total | real | per-turn cost |
cost_input / cost_output | real | split by direction |
cost_cache_read / cost_cache_write | real | cache-related cost |
stop_reason | text | e.g. end_turn, tool_use, max_tokens |
error_message | text | populated when call errored |
has_error | int | boolean flag |
is_tool_result | int | 1 if message is a tool result (not user-authored) |
tool_calls — one row per tool invocation inside an assistant message.
| Column | Type | Notes |
|---|---|---|
id | text | PK (composite with message_id) |
message_id | text | PK, parent assistant message |
session_id | text | FK → sessions.id |
agent_name | text | denormalized |
timestamp | int | Unix ms |
tool_name | text | e.g. bash, read, edit |
duration_ms | int | execution time |
success | int | boolean flag |
arguments | text | JSON-encoded arguments |
audit_events — security-relevant events extracted from tool calls.
| Column | Type | Notes |
|---|---|---|
id | int | PK, autoincrement |
session_id | text | FK → sessions.id |
agent_id | text | agent identifier |
timestamp | int | Unix ms |
event_type | text | category (path_access, exec, web_fetch, …) |
tool_name | text | source tool |
target | text | file path, URL, or command |
extra_json | text | structured context |
risk_flags | text | CSV of triggered flags |
risk_score | int | 0-3 (none / low / medium / high) |
raw_input | text | original tool input |
raw_output | text | original tool output |
sensitive_findings — secrets and prompt-injection patterns detected in message content.
| Column | Type | Notes |
|---|---|---|
id | int | PK |
audit_event_id | int | FK → audit_events.id |
session_id | text | FK → sessions.id |
agent_id | text | agent identifier |
timestamp | int | Unix ms |
pattern_type | text | credential / api_key / prompt_injection / exfil_request / … |
pattern_matched | text | the regex or keyword that matched |
context | text | redacted snippet around the match |
severity | text | low / medium / high |
dismissed | int | user-acknowledged flag |
followed_by_external_call | int | 1 if a web/exec call followed the finding |
ingest_state — per-file watermark for incremental parsing.
| Column | Type | Notes |
|---|---|---|
file_path | text | PK, absolute JSONL path |
mtime_ms | int | last observed mtime in ms |
size_bytes | int | last observed size in bytes |
ingested_at | int | Unix ms |
agent_baselines — per-agent behavioral profile over the last 30 days.
| Column | Type | Notes |
|---|---|---|
agent_id | text | PK |
computed_at | int | Unix ms, last rebuild time |
common_tools | text | JSON array of frequently used tools |
typical_paths | text | JSON array of frequently accessed paths |
typical_hours | text | JSON array of active hours |
avg_tool_calls_per_session | real | avg tool calls per session |
known_domains | text | JSON array of seen external domains |
audit_ingest_state — per-file watermark for audit parsing (separate from session ingest).
| Column | Type | Notes |
|---|---|---|
file_path | text | PK, absolute JSONL path |
mtime_ms | int | last observed mtime in ms |
size_bytes | int | last observed size in bytes |
ingested_at | int | Unix ms |
settings — key-value store for schema version tracking and configuration.
| Column | Type | Notes |
|---|---|---|
key | text | PK (e.g. audit_scoring_version, session_aggregate_version) |
value | text | version string or config value |
sessions has separate indexes on agent_name and started_at; messages on session_id, timestamp, model; tool_calls on session_id, tool_name, timestamp; audit_events on (agent_id, timestamp), (event_type, timestamp), session_id. These cover the primary query paths from the API layer.
Technical Choices
SQLite (better-sqlite3) over PostgreSQL
SQLite is the only database that matches the operational model of a local-based tool. claw-lens is distributed via `npx` — there is no server to provision, no connection string to configure. The database is a single file at `~/.openclaw/claw-lens.db` that lives next to the data it indexes.
better-sqlite3 provides a low-overhead synchronous binding to SQLite’s C engine. The real performance gain comes from transaction batching: on startup, ingestAll wraps thousands of JSONL record upserts in a single db.transaction() block, which reduces fsync calls from one-per-row to one-per-transaction. WAL (Write-Ahead Logging) is enabled so that reads and writes can proceed independently — when you’re browsing the dashboard and a file watcher triggers a background re-ingest at the same time, your page loads don’t stall.
Considered: PostgreSQL (operational mismatch — requiring a running database process for a local tool defeats the purpose), LevelDB/RocksDB (no SQL — aggregations and joins across sessions/messages/tool_calls would be painful), Prisma + SQLite (unnecessary abstraction for a schema we fully control).
Trade-off: SQLite allows only one writer at a time. Under heavy ingestion, concurrent write attempts would queue behind a lock. In practice this is not an issue: claw-lens runs as a single Express server on one machine, and ingestion is the only write path. No concurrent write access from multiple processes, no cross-machine querying — both are acceptable for this use case.
Express over Fastify
Express is chosen for simplicity and debuggability. For a local, single-user dashboard, operational overhead matters more than performance. The router pattern maps cleanly to our ~10 API modules.

Considered: Fastify (better performance and built-in schema validation, but both are unnecessary for internal APIs), Koa and Hono (similar capabilities but with smaller ecosystems or less mature tooling).

Trade-off: Express lacks built-in validation and has middleware ordering pitfalls, but these are acceptable given the controlled environment.

WebSocket Proxy over Direct Browser Connection or SSE
claw-lens server connects to the OpenClaw Gateway over WebSocket (authenticated with a local token), and forwards events to browser tabs via its own WebSocket endpoint (/ws/live).
The browser side is receive-only, so SSE would work. But the upstream is already WebSocket (ws library), and using the same protocol for the downstream half keeps it to one library, one connection model, and less code to maintain. This is a convenience choice, not a fundamental architectural requirement. Socket.IO adds a higher-level protocol layer we do not need. Polling is too latent for live monitoring.
Trade-off: Reconnection is handled manually with exponential backoff (1s → 30s cap). If the gateway is unavailable, claw-lens still serves historical data — the system degrades gracefully.
Rule-Based Risk Scoring, not ML
The audit system uses deterministic rules rather than probabilistic models. Each tool call is evaluated against a fixed rule set — sensitive path access, dangerous command patterns, secret exposure, prompt injection signatures — and assigned a risk level (High / Medium / Low). The rules are transparent: anyone can read `risk-scorer.ts` and understand exactly why something was flagged.
Considered: LLM-based classification (hallucination risk on security judgments — not acceptable for a security feature), anomaly-only detection (misses known-bad patterns that deterministic rules catch reliably).
Trade-off: Rules can’t catch novel attack patterns. Anomaly detection (volume spikes, unusual hours, new domains) provides a second layer, but truly novel threats require rule updates.
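The rule table can be compressed into a sketch like the following — the pattern lists are abbreviated illustrations, not the full rule set in `risk-scorer.ts`:

```typescript
// Deterministic risk scoring: the highest-severity matching rule wins.
// The flags on ToolCallFacts are hypothetical names for the conditions
// the text describes; the real scorer derives them from tool inputs.
type RiskScore = 0 | 1 | 2 | 3;

interface ToolCallFacts {
  command?: string;
  exposedSecret?: boolean;
  promptInjection?: boolean;
  newExternalDomain?: boolean;
  unusualHour?: boolean;
  volumeSpike?: boolean;
}

function scoreToolCall(f: ToolCallFacts): RiskScore {
  // High (3): destructive commands, prompt injection
  if (f.command?.includes("rm -rf /") || f.promptInjection) return 3;
  // Medium (2): privilege escalation, exposed secrets, new domains
  if (f.command?.startsWith("sudo ") || f.exposedSecret || f.newExternalDomain) return 2;
  // Low (1): behavioral anomalies from the baseline comparison
  if (f.unusualHour || f.volumeSpike) return 1;
  return 0; // score 0 is never surfaced in the UI
}
```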
Disposable Database, Rebuild-Based Schema Evolution
claw-lens is distributed via npm/npx, so upgrades must be zero-maintenance. There is no deploy pipeline, DBA, or migration window — but the previous `claw-lens.db` file may still exist on disk with an older schema.
The key design choice is that the SQLite database is a derived cache, not the system of record. JSONL session files are the source of truth. That allows us to favor rebuild-based compatibility over complex migrations.
- Startup validation: On startup, `initSchema` runs all table creation and schema checks before the API begins serving queries. The schema is guaranteed to be current before any request is handled.
- Additive changes: New columns are added with `ALTER TABLE ADD COLUMN`; if the column already exists, the operation is skipped.
- Breaking changes: Versioned derived data (e.g. scoring logic, data format) is invalidated via version keys in the `settings` table (e.g. `audit_scoring_version`, `session_aggregate_version`) and rebuilt from source JSONL files.
- Failure recovery: If the database becomes inconsistent, deleting `claw-lens.db` and restarting safely rebuilds the cache from source.
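The additive-change step can be sketched as a pure migration planner: given the columns a table already has and the columns the current version wants, emit only the missing `ALTER TABLE` statements (hypothetical helper; the real `initSchema` may differ):

```typescript
// Plan additive schema changes: add only columns that don't exist yet.
// Existing columns are never altered or dropped; breaking changes go
// through the version-key wipe-and-rebuild path instead.
function planAdditiveMigration(
  table: string,
  existing: string[],
  desired: Record<string, string> // column name -> SQL type
): string[] {
  const have = new Set(existing);
  return Object.entries(desired)
    .filter(([name]) => !have.has(name))
    .map(([name, type]) => `ALTER TABLE ${table} ADD COLUMN ${name} ${type}`);
}
```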
What We Don’t Build
These are intentional exclusions, not a to-do list.

| Not built | Why | Who this serves |
|---|---|---|
| Cloud sync | All data stays on the user’s machine. No account, no login, no data leaving localhost. A builder using AI agents for sensitive work shouldn’t worry about their session logs being uploaded anywhere. | Privacy-conscious users, enterprise builders. |
| Multi-tenant / team features | claw-lens is a single-player tool. One machine, one user, one SQLite file. Team observability is a different product with different trust boundaries. | Solo builders who want simplicity over collaboration overhead. |
| Instrumentation SDK | claw-lens doesn’t inject code into your agents. It doesn’t require you to add `import clawLens from 'claw-lens'` to your agent code. It reads JSONL files that OpenClaw already writes. Zero coupling. | Users who don’t want to modify their agent setup. |
| Alert routing / PagerDuty | Alerts were prototyped and removed (`DROP TABLE alert_history, alert_rules, alert_routing_policies`). A local dashboard that nobody else sees doesn’t need PagerDuty. The user is already looking at the screen. | Users who don’t have an ops team. |