Self-learning
How captured patterns, relevance scoring, and organizational pattern stores compound over runs.
Self-learning is the outer loop around the inner loop. It turns each run's behavior into durable pattern data that the next run can draw on, and lets teams share that data without losing project-level specificity.
See the self-learning plugin for the plugin-level reference. This page covers the mechanism: what data is captured, how it is scored, and how it influences future runs.
Capture
During any run, agents classified as learning_agents write JSON entries to .learnings/pending/ describing:
- the task they performed
- the pattern (or mistake) they observed
- citations to files and line numbers
- context tags (for example, auth, api, testing)
- the agent that captured it
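For illustration, a pending entry might look like the following; the field names and values are hypothetical, but each field maps to one of the captured items above:

```json
{
  "task": "add refresh-token rotation to the login flow",
  "pattern": "check token expiry before refreshing, not after",
  "citations": ["src/auth/session.ts:42", "src/auth/refresh.ts:118"],
  "tags": ["auth", "api"],
  "agent": "implementer"
}
```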
The subagent-stop-hook.sh enforces that learning_agents acknowledge and capture learnings; agents that fail to do so are re-run up to 2 times.
Scoring
After each iteration of the code loop, run-loop.sh runs an 11-step pipeline:
1. Emit changed-files.json from git diff.
2. pattern_relevance.py scores patterns from 0.0 to 1.0 using context tags and keyword overlap.
3. merge_relevance.py appends |relevance_score|relevance_method to outcomes.log.
4. evaluate_goal.py produces goal-outcome.json.
5. merge_goal_outcome.py appends |goal_name|goal_success|goal_score.
6. verify_citations.py marks entries |unverified where citations do not exist.
7. merge_build_result.py appends |build_passed or |build_failed.
8. claude -p '/self-learning:process-learnings' classifies pending learnings.
9. write_merged_patterns.py atomically writes TOON with a .bak backup and a 50-pattern cap, sorted by confidence and flags.
10. compute_success_rates.py recomputes success rates and assigns flags.
11. claude -p '/self-learning:export-closedloop-learnings' merges ClosedLoop-specific learnings.
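Step 2 is the part most worth internalizing, since it decides how strongly a stored pattern pulls on the next run. A minimal sketch, assuming hypothetical pattern fields (tags, pattern); the real pattern_relevance.py may weight the two signals differently:

```python
def relevance(pattern: dict, task_tags: set[str], changed_files: list[str]) -> tuple[float, str]:
    """Score a pattern 0.0-1.0 and report which method produced the score."""
    tags = set(pattern.get("tags", []))
    shared = tags & task_tags
    if shared:
        # Context-tag overlap: fraction of the pattern's tags matched by the task.
        return len(shared) / len(tags), "tags"
    # Fallback: keyword overlap between the pattern text and changed file paths.
    words = set(pattern.get("pattern", "").lower().split())
    path_words = {
        part
        for f in changed_files
        for part in f.lower().replace("/", " ").replace(".", " ").replace("_", " ").split()
    }
    hits = words & path_words
    return (min(1.0, len(hits) / 5.0), "keywords") if hits else (0.0, "none")
```

The merge steps then grow each outcomes.log line field by field, so a single pipe-delimited line might end up carrying (illustrative values only) ...|0.83|tags|harden-auth|1|0.92|build_passed.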
Injection
Future runs draw on this data automatically:
- subagent-start-hook.sh injects up to 15 relevant org patterns into every agent's context, filtered and sorted by category (mistake > convention > pattern > insight) and relevance.
- pretooluse-hook.sh injects tool-specific patterns on Bash, Write, and Edit calls. Bash calls get build and test patterns; Write and Edit get language-specific patterns chosen by file extension.
Agents never see the raw pattern store. They see filtered, tool-aware, task-relevant patterns.
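A sketch of that filtering, combining both hooks' selection rules into one function for illustration; the pattern fields and the extension-to-language table are assumptions:

```python
from pathlib import Path

CATEGORY_RANK = {"mistake": 0, "convention": 1, "pattern": 2, "insight": 3}
EXT_TO_LANG = {".py": "python", ".ts": "typescript", ".go": "go"}  # illustrative

def select_patterns(patterns, tool, file_path=None, limit=15):
    if tool == "Bash":
        # Bash calls get build and test patterns.
        pool = [p for p in patterns if {"build", "testing"} & set(p["tags"])]
    elif tool in ("Write", "Edit") and file_path:
        # Write/Edit get language-specific patterns chosen by file extension.
        lang = EXT_TO_LANG.get(Path(file_path).suffix)
        pool = [p for p in patterns if lang and lang in p["tags"]]
    else:
        pool = list(patterns)
    # Mistakes surface before conventions, then patterns, then insights;
    # ties break on relevance. The injection is capped at `limit`.
    pool.sort(key=lambda p: (CATEGORY_RANK[p["category"]], -p["relevance"]))
    return pool[:limit]
```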
Goal-weighted success rates
Simple mode: success_rate = passes / applications. Goal-weighted mode:
- goal_success=1 → full weight contribution
- goal_success=0 → relevance_score * 0.5 contribution
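One plausible reading of those weights, as a sketch; how goal outcomes interact with build passes is not spelled out above, so this version weights every application directly by its goal outcome:

```python
def goal_weighted_success_rate(applications: list[tuple[int, float]]) -> float:
    """applications: (goal_success, relevance_score) pairs for one pattern."""
    if not applications:
        return 0.0
    # A goal success counts fully; a goal failure earns partial credit
    # scaled by how relevant the pattern was to the change.
    total = sum(1.0 if goal_success else relevance * 0.5
                for goal_success, relevance in applications)
    return total / len(applications)
```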
Matching is tiered from cheapest to most expensive: exact → case-insensitive → substring → Jaccard > 0.6.
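The tiering keeps the common case cheap. A sketch, where token-level Jaccard is an assumption about the similarity granularity:

```python
def names_match(a: str, b: str) -> bool:
    """Try cheap comparisons first; only fall through to set math."""
    if a == b:                                  # exact
        return True
    la, lb = a.lower(), b.lower()
    if la == lb:                                # case-insensitive
        return True
    if la in lb or lb in la:                    # substring
        return True
    ta, tb = set(la.split()), set(lb.split())   # Jaccard on word sets
    return bool(ta | tb) and len(ta & tb) / len(ta | tb) > 0.6
```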
Flag lifecycle
Patterns get flagged as they accumulate evidence:
- [REVIEW]: success rate below 40%
- [STALE]: no application in the last 10 iterations
- [UNTESTED]: no applications yet
- [PRUNE]: more than 20 applications with a success rate below 40%
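A sketch of those thresholds; the counter names are assumptions, and whether a pattern can carry several flags at once is not specified here:

```python
def assign_flags(applications: int, success_rate: float, idle_iterations: int) -> list[str]:
    flags = []
    if applications == 0:
        flags.append("[UNTESTED]")
    elif applications > 20 and success_rate < 0.40:
        flags.append("[PRUNE]")    # enough evidence to retire the pattern
    elif success_rate < 0.40:
        flags.append("[REVIEW]")   # struggling, but the sample is still small
    if idle_iterations >= 10:
        flags.append("[STALE]")    # no application in the last 10 iterations
    return flags
```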
Confidence is binned:
- high: >= 0.70
- medium: >= 0.40
- low: < 0.40
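The bins translate directly into code:

```python
def confidence_bin(confidence: float) -> str:
    if confidence >= 0.70:
        return "high"
    if confidence >= 0.40:
        return "medium"
    return "low"
```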
Organization sharing
/push-learnings and /pull-learnings require CLAUDE_ORG_ID. Echo prevention skips patterns that originated from the current project so contributions do not cycle back into the same project.
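Echo prevention reduces to a filter on each pattern's recorded origin; a minimal sketch with a hypothetical origin_project field:

```python
def accept_pulled_patterns(remote_patterns: list[dict], project_id: str) -> list[dict]:
    # Skip patterns this project contributed, so a /push-learnings
    # followed by /pull-learnings does not cycle them back in.
    return [p for p in remote_patterns if p.get("origin_project") != project_id]
```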
Retention
retention.yaml controls pruning (max_runs, max_sessions, max_log_lines, max_archive_age_days, lock_stale_hours, protected_window_minutes). /self-learning:prune-learnings runs the pruner manually; it also runs during step 9 of the post-iteration pipeline.
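An illustrative retention.yaml; the keys are the ones listed above, the values are made up:

```yaml
max_runs: 50                  # prune run artifacts beyond the newest 50
max_sessions: 100
max_log_lines: 10000
max_archive_age_days: 30
lock_stale_hours: 4           # reclaim locks older than this
protected_window_minutes: 60  # never prune anything newer than this
```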
Why deterministic computation matters
LLMs are extraordinary at classifying patterns but poor at counting. The pipeline delegates counting and scoring to Python scripts that operate on outcomes.log. This produces success rates you can trust.
Why this closes the loop
The inner loop produces an output per run. The outer loop produces better runs over time. Without self-learning, you are running an agent system; with it, you are running a learning team.