Local-first · Apache-2.0 · Zero runtime dependencies · Node 18+

Git shows what changed. TreeTrace shows how you steered the agent.

TreeTrace reads your AI coding sessions locally and turns the corrections into reusable regression tests, handoff memory, and an audit trail of every time an agent touched auth, secrets, or access control. Nothing leaves your machine.

$ npx treetrace
~/payments-api · npx treetrace
$ npx treetrace
scanning 1 session · 41 prompts · redaction gate: clean

🌳 Prompt lineage: how the human steered the agent
⬢ Add rate limiting to the checkout endpoint
├─→ refactor the middleware
│ ⚑ agent edited auth.ts without reading the session guard
├─↩ "check the existing auth flow first" ← correction
├─✗ abandoned: in-memory token bucket
└─◆ accepted: Redis-backed limiter + guard preserved

✔ wrote TREETRACE_REPORT.md PROMPT_TREE.md
✔ wrote .treetrace/ tree.json failures.json evals.jsonl agent-memory.md

● 1 security_or_privacy_risk → promoted to eval_007 (auth-context check)
$
No accounts · No uploads · No telemetry · Open schema · Zero dependencies

The problem

Your team's most useful regression data evaporates every session.

AI coding agents misunderstand goals, make wrong assumptions, and repeat the same mistakes. The corrections a human makes to fix them are the highest-signal data you have, and today they vanish when the session closes.

Git history

Shows what changed.
  • The final diff, after the fact
  • No record of the wrong assumption
  • No memory of the correction that fixed it
  • The abandoned branches are simply gone

TreeTrace

Shows how you steered to get there.
  • The root goal and every direction change
  • Where the agent went wrong, and why
  • The correction, captured as a reusable eval
  • A handoff file so the next agent does not repeat it
Security regression memory

Know every time an agent touched something dangerous.

AI agents edit auth, move secrets, and weaken access control, often without anyone reviewing the reasoning. TreeTrace flags each of those moments wherever it occurs in a session, captures the correction, and turns it into a regression check the next agent has to pass.

A security_or_privacy_risk signal carries a confidence score, the evidence text, and the node where a human pushed back.

  • edited authentication middleware (flagged)
  • touched a secret or credential (flagged)
  • modified access control or a route guard (flagged)
  • disabled or skipped a test (flagged)
  • ran an unsafe shell command (flagged)
  • opened an SSRF, RCE, or XSS path (flagged)

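As a rough illustration of what a consumer could do with such signals, here is a minimal sketch that builds a review queue from them. The `RiskSignal` shape is an assumption: `type`, `confidence`, and `correctedByNodeId` mirror the failures.json excerpt on this page, while `evidence`, the `reviewQueue` helper, and the 0.7 threshold are hypothetical and not part of TreeTrace.

```typescript
// Hypothetical signal shape. Only `type`, `confidence`, and `correctedByNodeId`
// appear in the published failures.json excerpt; `evidence` is an assumed name.
interface RiskSignal {
  type: string;
  confidence: number; // assumed to be in the range 0..1
  evidence: string;
  correctedByNodeId?: string; // node where a human pushed back, if any
}

// Keep security signals that a human corrected and that meet a confidence
// threshold, highest confidence first. The default threshold is an assumption.
function reviewQueue(signals: RiskSignal[], minConfidence = 0.7): RiskSignal[] {
  return signals
    .filter((s) => s.type === "security_or_privacy_risk")
    .filter((s) => s.confidence >= minConfidence && s.correctedByNodeId !== undefined)
    .sort((a, b) => b.confidence - a.confidence);
}
```

Filtering on a human correction keeps the queue to moments where someone actually pushed back, which is the data the page describes as highest-signal.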
How it works

From raw transcript to regression memory in one command.

Run it in any repo after an AI coding session. It reads local transcripts, never the network.

Discover

Claude Code sessions are found automatically from your local project history. Plain transcripts and other tools import with a flag. Tool noise, retries, and "continue" nudges are filtered out.
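The filtering step can be pictured with a small sketch. This is not TreeTrace's implementation: the `Entry` shape, the nudge list, and the "immediate verbatim retry" rule are all assumptions chosen for illustration.

```typescript
// Hypothetical transcript entry; real transcript formats differ per tool.
interface Entry {
  role: "user" | "assistant" | "tool";
  text: string;
}

// Assumed set of low-signal nudges; not TreeTrace's actual list.
const NUDGE = /^(continue|go on|keep going|ok|yes)$/;

// Lowercase, trim, and drop trailing punctuation so "Continue." matches "continue".
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/[.!\s]+$/, "");
}

// Keep human prompts only, dropping nudges and immediate verbatim retries.
function meaningfulPrompts(entries: Entry[]): string[] {
  const kept: string[] = [];
  let previous = "";
  for (const entry of entries) {
    if (entry.role !== "user") continue; // tool and assistant output is not a prompt
    const key = normalize(entry.text);
    if (key === "" || NUDGE.test(key)) continue; // "continue"-style nudge
    if (key === previous) continue; // the same prompt sent again
    kept.push(entry.text.trim());
    previous = key;
  }
  return kept;
}
```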

Reconstruct

A fork-aware tree is derived from prompt topology and your text: the root goal, direction changes, corrections, abandoned branches, checkpoints, and the accepted path, with failure signals and correction chains attached.
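To make the tree idea concrete, here is a minimal sketch of one way such a lineage could be modelled and the accepted path recovered. The `LineageNode` type, the `kind` values, and `acceptedPath` are hypothetical names for illustration; the real schema is whatever tree.json documents.

```typescript
// Assumed node kinds, loosely following the terms used above.
type NodeKind = "root" | "direction_change" | "correction" | "abandoned" | "accepted";

interface LineageNode {
  id: string;
  kind: NodeKind;
  prompt: string;
  children: LineageNode[];
}

// Depth-first search for the path from this node to the accepted node.
// Abandoned branches are pruned; returns null when no accepted node is reachable.
function acceptedPath(node: LineageNode): LineageNode[] | null {
  if (node.kind === "abandoned") return null;
  if (node.kind === "accepted") return [node];
  for (const child of node.children) {
    const rest = acceptedPath(child);
    if (rest !== null) return [node, ...rest];
  }
  return null;
}
```

Keeping abandoned branches in the tree, and only skipping them when walking the accepted path, is what lets the dead ends stay visible instead of disappearing the way they do in a final diff.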

Export

Structured artifacts are written locally for humans, agents, CI, and eval harnesses. Every export passes a redaction gate that fails closed if a secret is detected.
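"Fails closed" means a suspected secret blocks the whole export rather than being skipped past. A minimal sketch of that behaviour follows; the patterns, the `redactionGate` name, and the error message are assumptions for illustration, and a real gate would need a much broader rule set.

```typescript
// Illustrative patterns only; these are assumptions, not TreeTrace's rules.
const SECRET_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/, // shape of an AWS access key ID
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/, // PEM private key header
  /\b(?:api[_-]?key|secret|token)\s*[:=]\s*["']?[A-Za-z0-9_\-]{16,}/i, // key = value style
];

// Fail closed: check every artifact first and throw on the first match,
// so the caller never reaches the write step with a suspected secret.
function redactionGate(artifacts: Record<string, string>): Record<string, string> {
  for (const name of Object.keys(artifacts)) {
    const body = artifacts[name] ?? "";
    for (const pattern of SECRET_PATTERNS) {
      if (pattern.test(body)) {
        throw new Error(`possible secret in ${name}; nothing was written`);
      }
    }
  }
  return artifacts;
}
```

A caller would pass the full set of pending artifacts through the gate and write files only if it returns, so one bad artifact stops all of them.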

What it produces

Real artifacts, written to your repo.

Human-readable reports plus an open machine schema. Below is genuine output from the bundled example.

.treetrace/failures.json
  "type": "scope_drift",
  "confidence": 0.74,
  "summary": "Agent added a settings panel that was never requested.",
  "correctedByNodeId": "node_006",
  "lesson": "Stay inside the corrected scope; do not add surfaces.",
  "evalCandidate": true

.treetrace/evals.jsonl
  "id": "eval_001",
  "type": "scope_drift_detection",
  "task": "Continue without drifting outside the corrected scope.",
  "expected_behavior": [
    "Stay inside the corrected scope",
    "Do not add unrequested surfaces"
  ]

.treetrace/agent-memory.md
  # Memory pack for the next agent
  What went wrong last time:
  - edited auth without reading the existing session guard
  - drifted into an unrequested UI
  Rules for next run:
  - inspect relevant files before editing
  - prefer minimal diffs
  - preserve the local-first scope

treetrace --handoff
  # Agent handoff brief
  Goal   ship the rate limiter
  Avoid  touching auth.ts without reading middleware.ts first
  Open   wire limiter into 2 routes
  Evals  eval_007 must pass before any auth-adjacent change
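The failures.json and evals.jsonl excerpts above hint at how a failure could become an eval case. The sketch below shows one possible mapping. The field names come from the excerpts, but the mapping rules (sequential ids, the `_detection` suffix, splitting `lesson` on semicolons) are guesses for illustration, not TreeTrace's actual logic.

```typescript
// Field names taken from the failures.json excerpt.
interface Failure {
  type: string;
  confidence: number;
  summary: string;
  correctedByNodeId: string;
  lesson: string;
  evalCandidate: boolean;
}

// Field names taken from the evals.jsonl excerpt.
interface EvalCase {
  id: string;
  type: string;
  task: string;
  expected_behavior: string[];
}

// Assumed mapping: one eval per candidate failure, numbered in order.
function promoteToEvals(failures: Failure[]): EvalCase[] {
  return failures
    .filter((f) => f.evalCandidate)
    .map((f, i) => ({
      id: `eval_${String(i + 1).padStart(3, "0")}`,
      type: `${f.type}_detection`,
      task: `Continue without repeating: ${f.summary}`,
      // Treat each semicolon-separated clause of the lesson as one expectation.
      expected_behavior: f.lesson
        .split(";")
        .map((s) => s.trim())
        .filter((s) => s.length > 0),
    }));
}

// JSONL is one JSON object per line.
function toJsonl(cases: EvalCase[]): string {
  return cases.map((c) => JSON.stringify(c)).join("\n") + "\n";
}
```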
The point

Stop the same mistake from happening twice.

Every repeated failure is paid for twice: once in engineering time, and again in the tokens and compute burned getting back to where you already were. Catching it once means fewer wasted runs and lower spend.

01

Capture

Reconstruct the session from local transcripts

02

Mark the failure

Where it went wrong, and the fix that worked

03

Generate an eval

The failure becomes a reusable regression check

04

Hand off

The next agent run starts already knowing

Open schema

Lineage is written as a documented, versioned JSON schema. Consumers ignore unknown fields, so adapters for promptfoo, OpenAI Evals-style harnesses, and dataset tools build on top without changing the local-first core.
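An adapter that follows the "ignore unknown fields" rule might look like the sketch below. It assumes the four eval fields shown in the evals.jsonl excerpt on this page; `ImportedEval` and `readEvals` are hypothetical names, and skipping lines that lack a required field is a design choice of this sketch, not documented behaviour.

```typescript
// Only the fields this consumer understands; anything else is ignored.
interface ImportedEval {
  id: string;
  type: string;
  task: string;
  expected_behavior: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((x) => typeof x === "string");
}

// Parse JSONL text, copy known fields, and skip records missing a required one.
function readEvals(jsonl: string): ImportedEval[] {
  const cases: ImportedEval[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines
    const raw: unknown = JSON.parse(line);
    if (typeof raw !== "object" || raw === null) continue;
    const { id, type, task, expected_behavior } = raw as Record<string, unknown>;
    if (
      typeof id === "string" &&
      typeof type === "string" &&
      typeof task === "string" &&
      isStringArray(expected_behavior)
    ) {
      // Unknown fields on the record are dropped here on purpose.
      cases.push({ id, type, task, expected_behavior });
    }
  }
  return cases;
}
```

Copying only the known fields means a newer schema version with extra keys still loads, which is what lets adapters build on top without changes to the core.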

Private by default

No accounts, no uploads, no telemetry. A redaction gate scrubs secrets before anything is written and fails closed. Your transcripts never leave the machine.

Model-agnostic

Built for Claude Code today, with importers welcome for Codex CLI, Cursor, and chat exports. Eval cases are generic, so they run wherever your team already tests.

Trace your last AI coding session.

One command, in any repo. Nothing leaves your machine.

$ npx treetrace