Forward Research Plan

Last reviewed: 2026-06-14.

The full plan lives in the repository root at FORWARD_RESEARCH_PLAN.md.

v0.68 taught a useful but uncomfortable lesson: QuarkLM can improve target-rank evidence while damaging profile coverage and branch diversity. The next step is therefore not another direct-answer knob but the operating system around training: experiment intent, corpus governance, candidate quarantine, closed-world verification, replay planning, recipes, and constraint-first promotion gates.

What We Reviewed

The v0.69 review cross-references three bodies of evidence:

continual-learning and replay research;
self-generated data, self-feedback, and model-collapse research;
public open-source mechanics from OLMo, Pythia, GPT-NeoX, nanoGPT, minGPT, LitGPT, LLM Foundry, Avalanche, Dolma, Open-Instruct, Self-Instruct, Self-Refine, and Hugging Face tokenizers.

Those sources are design references only. They do not change QuarkLM's purity boundary: no pretrained weights, no pretrained tokenizer, no external embeddings, no copied code, and no unledgered training data.

v0.70 adds the Deep research review, a deeper pass that cross-checks primary papers, official open-source mechanics, and the current QuarkLM codebase before the next implementation step.

Main Finding

Mature language-model projects do not improve by secretly changing one training knob at a time. They make data mixtures, recipes, replay buffers, evaluation sets, contamination checks, checkpoints, logs, and release artifacts explicit.

For QuarkLM, that means:

generated lessons must be candidates before they are training data;
replay must be planned before training, not reconstructed inside a loss;
every run needs a hypothesis and acceptance gate;
verifier checks must precede learned self-judgment;
promotion must reject loss or rank gains that erase coverage, diversity, retention, or unknown-policy behavior.
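
The first rule above can be sketched as a small lifecycle check. This is a minimal sketch, not QuarkLM's actual schema: the state names, the `Candidate` class, and the `training_rows` helper are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class CandidateState(Enum):
    # Hypothetical lifecycle states; the real QuarkLM states may differ.
    QUARANTINED = "quarantined"
    ADMITTED = "admitted"
    REJECTED = "rejected"


@dataclass
class Candidate:
    text: str
    state: CandidateState = CandidateState.QUARANTINED  # generated data starts quarantined


def training_rows(candidates):
    """Return text only from candidates that were explicitly admitted."""
    return [c.text for c in candidates if c.state is CandidateState.ADMITTED]


pool = [
    Candidate("generated lesson A"),
    Candidate("generated lesson B", CandidateState.ADMITTED),
    Candidate("generated lesson C", CandidateState.REJECTED),
]
rows = training_rows(pool)
```

Only the admitted lesson reaches `rows`; quarantined and rejected candidates cannot train weights in this sketch.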

Implementation Sequence

Experiment registry: record hypothesis, allowed data, planned artifacts, gates, failure criteria, and decision before every run.
Replay extraction: move profile-aware replay planning out of the transformer monolith and preserve the v0.67 behavior with focused tests.
Corpus hygiene: report source mixtures, duplicate pressure, train/eval overlap, generated-candidate ratios, and rare-profile coverage.
Candidate quarantine: store generated lessons, probes, and repair notes as candidates that cannot train weights until admitted.
Closed-world verifier: start deterministic, then later train a verifier only from admitted candidate history and run outcomes.
Recipe layer: make model, tokenizer, curriculum, replay plan, objective, optimizer, snapshot cadence, and promotion gates named and reproducible.
Constraint-first promotion: compare loss, rank, and top-k only after retention, leakage, unknown-policy, target coverage, and diversity pass.
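
The constraint-first ordering in the last step can be illustrated with a short gate function. This is a sketch under assumptions: the metric keys, the dictionary shape, and the `promotion_decision` name are hypothetical, and only the gate names come from the list above.

```python
# Gate names follow the list above; everything else here is illustrative.
CONSTRAINTS = ("retention", "leakage", "unknown_policy", "target_coverage", "diversity")


def promotion_decision(constraint_results, candidate, baseline):
    """Check every constraint first; compare loss, rank, and top-k only if all pass."""
    failed = [name for name in CONSTRAINTS if not constraint_results.get(name, False)]
    if failed:
        # Quality metrics are never consulted when a constraint fails.
        return {"promote": False, "blocked_on": failed}
    improved = (
        candidate["loss"] <= baseline["loss"]
        and candidate["rank"] <= baseline["rank"]
        and candidate["top_k"] >= baseline["top_k"]
    )
    return {"promote": improved, "blocked_on": []}


gates = {"retention": True, "leakage": True, "unknown_policy": True,
         "target_coverage": True, "diversity": False}
better = {"loss": 1.9, "rank": 3.0, "top_k": 0.6}
base = {"loss": 2.4, "rank": 5.0, "top_k": 0.5}
decision = promotion_decision(gates, better, base)
```

Here the candidate has better loss, rank, and top-k, but it is still blocked because the diversity gate fails.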

Near-Term Decision

v0.69 is strategy evidence, v0.70 is deep research evidence, and v0.71-v0.83 are the first operating-system implementation steps. None of those are model-quality promotion evidence.

v0.81 returns to objective-repair work under the narrower operating surfaces with profile target-share anti-collapse pressure; v0.82 screens that pressure and rejects it on branch diversity. v0.83 adds prompt-specific ownership margins and rejects the screen because trained snapshots still lose target-token coverage.

v0.84 adds baseline replay anchors and rejects the screen because trained snapshots preserve only half of the baseline QA/heldout coverage floor. v0.85 adds baseline-floor update gating and rejects the screen because the guard preserves the floor only by rejecting all attempted direct-answer updates. v0.86 adds adaptive baseline-floor retries and rejects the screen because all 200/200 retry attempts still violate the floor. v0.87 adds baseline-covered repair retries and rejects the screen because all 200/200 repaired attempts still violate the floor, setting up the v0.88 objective screen. v0.88 adds objective-side baseline-floor anchors and rejects the screen because all 200/200 objective-shaped attempts still violate the floor. v0.89 removes branch-diversity pressure and trains only baseline-covered floor anchors, but all 200/200 stabilization-only attempts still violate the floor, so v0.90 adds guard diagnostics before branch-diversity pressure is added back. v0.90 shows all 200 rejected attempts are stabilization-shaped, every adaptive scale fails 50 times, heldout violates all attempts, and the worst deficit is 0.25 on learning. v0.91 covers all 227 floor anchors across 12 profile-target groups and still rejects all 200/200 attempts.

v0.92 changes the repair shape to sequential source-profile floor batches, rejects all 2000 profile-local attempts, and records 200 no-effective-update outer attempts, so the next repair must isolate floor-preserving weight movement rather than only broaden anchor coverage or reorder profiles. v0.93 adds calibrated scales below 0.01 plus coverage-only guard probes and accepts one nonzero bridge:owner source-profile update at scale 0.0025, while model promotion remains blocked on branch diversity. v0.94 adds profile-scale memory, accepts 8 source-profile updates across 60 profile-scale attempts, and keeps promotion blocked on branch diversity.

v0.95 adds diversity-aware profile-scale acceptance, accepts 5 score-improving source-profile updates across 58 profile-scale attempts, rejects 11 floor-preserving score regressions, and keeps promotion blocked on branch diversity. v0.96 adds frontier target anchors, accepts 9 score-improving source-profile updates across 43 profile-scale attempts, lowers max dominant predicted rate to 0.9, and keeps promotion blocked on branch diversity. v0.97 adds coverage-frontier acceptance, accepts 1 coverage-gaining source-profile update across 68 attempts, rejects 15 coverage ties plus 2 coverage regressions, and shows the next repair should keep coverage auditing while allowing later missing-target repairs to continue. v0.98 adds coverage-prep frontier acceptance, accepts 9 source-profile updates across 43 attempts, separates 3 coverage gains from 6 coverage-preparation moves, and shows the next repair should turn preparation moves into direct coverage recovery. v0.99 adds coverage-recovery frontier retry, accepts 6 source-profile updates across 54 attempts, converts 2 prepared candidates into direct coverage recoveries, keeps 4 preparation fallbacks, and shows the next repair should stabilize branch diversity after recovery.

v0.100.0 adds branch-stable coverage-recovery acceptance, keeps the 2 recovery conversions, records 15 branch-stability checks, rejects 1 retry for branch-score regression, and shows the next repair should increase branch diversity without weakening the recovery floor. v0.101.0 adds branch-diversity recovery after safe profile updates, accepts 5 local branch-score refinements, falls back once, and shows the next repair should turn local score gains into target-token coverage for the collapsed profiles.

v0.71 implements experiment registry and run-intent schemas. v0.72 extracts replay planning into src/closed_world_lm/replay_plan.py while preserving the profile-aware replay behavior. v0.73 adds corpus hygiene and training-plan artifacts for source mixture, duplicates, train/eval overlap, candidate ratio, rare-profile coverage, allowed data sources, planned artifacts, and replay-plan summaries. v0.74 adds the Research implementation map, which ties each next mechanic to source clusters, public implementation patterns, QuarkLM gaps, and acceptance evidence before more code is added. v0.75 implements candidate quarantine artifacts and lifecycle states. v0.76 implements deterministic closed-world verifier checks. v0.77 implements recipes and constraint-first promotion. v0.78 implements transformer experiment/artifact surfaces, trainer utilities, and a direct-answer objective catalog. v0.79 implements transformer model/config and checkpoint metadata surfaces. v0.80 implements transformer eval/checkpoint-load surfaces.

v0.81 implements branch-balanced-context-profile-target-share-preserving-deficit-unlikelihood as the first post-surface anti-collapse objective. v0.82 screens it at runs/transformer-answer-v0.82-fullstack-profile-target-share-smoke-dim4-context80/ and rejects it because trained snapshots still collapse QA and heldout branch diversity before rank gains can be trusted.

v0.83 adds branch-balanced-context-profile-prompt-ownership-target-share-preserving-deficit-unlikelihood and screens it at runs/transformer-answer-v0.83-fullstack-prompt-ownership-smoke-dim4-context80/. The focused mechanic works, but the full screen remains rejected because rank gains still require target-token coverage collapse.

v0.84 adds branch-balanced-context-profile-baseline-anchored-prompt-ownership-target-share-preserving-deficit-unlikelihood and screens it at runs/transformer-answer-v0.84-fullstack-baseline-anchored-prompt-ownership-smoke-dim4-context80/. The run records 562 active baseline prediction anchors and avoids the v0.83 0.0 coverage collapse, but QA/heldout target-token coverage only reaches 0.125 against the 0.25 baseline floor.

v0.85 adds branch-balanced-context-profile-baseline-floor-gated-prompt-ownership-target-share-preserving-deficit-unlikelihood and screens it at runs/transformer-answer-v0.85-fullstack-baseline-floor-gated-prompt-ownership-smoke-dim4-context80/. The run records 562 active baseline prediction anchors and checks 50/50 attempted updates under a baseline-floor guard. The guard rejects all 50 attempts, preserving QA/heldout coverage at 0.25 but accepting no weight updates.
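
The guard behavior described above can be sketched as a trial update that is kept only when coverage stays at or above the floor. This is a toy illustration, not the QuarkLM guard: `guarded_update` and `toy_coverage` are hypothetical names, and the coverage function is a contrived stand-in chosen so that every update fails.

```python
BASELINE_FLOOR = 0.25  # QA/heldout target-token coverage floor from the text


def guarded_update(weights, delta, coverage_fn, floor=BASELINE_FLOOR):
    """Build trial weights and keep them only if coverage stays at or above the floor."""
    trial = [w + d for w, d in zip(weights, delta)]
    if coverage_fn(trial) < floor:
        return weights, False  # reject: the caller keeps the old weights
    return trial, True


def toy_coverage(weights):
    # Contrived assumption: any movement away from zero halves coverage.
    return 0.25 if all(w == 0.0 for w in weights) else 0.125


weights = [0.0, 0.0]
accepted = 0
for _ in range(50):  # 50 attempted updates, as in the v0.85 screen
    weights, ok = guarded_update(weights, [0.1, -0.1], toy_coverage)
    accepted += ok
```

In this toy setup the floor is preserved only because every update is rejected, which is the same failure shape the v0.85 screen reports.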

v0.86 adds branch-balanced-context-profile-baseline-floor-adaptive-prompt-ownership-target-share-preserving-deficit-unlikelihood and screens it at runs/transformer-answer-v0.86-fullstack-baseline-floor-adaptive-prompt-ownership-smoke-dim4-context80/. The run tries learning-rate scales 1.0, 0.25, 0.05, and 0.01 for each guarded direct-answer step. It records 200 attempted retry updates, rejects all 200, preserves QA/heldout coverage at 0.25, and accepts no weight updates.
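
The retry arithmetic can be checked with a small loop. This sketch assumes 50 guarded steps, which is consistent with 200 attempts over 4 scales and with the 50 attempts reported for v0.85; `adaptive_retry` and `always_reject` are hypothetical names.

```python
SCALES = (1.0, 0.25, 0.05, 0.01)  # learning-rate scales named in the text


def adaptive_retry(try_update, scales=SCALES):
    """Try each scale in order; return (accepted scale or None, attempts made)."""
    attempts = 0
    for scale in scales:
        attempts += 1
        if try_update(scale):
            return scale, attempts
    return None, attempts


def always_reject(scale):
    # Toy guard that rejects every scale, mirroring the v0.86 outcome.
    return False


total_attempts = 0
total_accepted = 0
for _step in range(50):  # assumed number of guarded direct-answer steps
    accepted_scale, attempts = adaptive_retry(always_reject)
    total_attempts += attempts
    total_accepted += accepted_scale is not None
```

With every scale rejected, each step consumes all four retries, giving 200 attempts and no accepted updates.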

v0.87 adds branch-balanced-context-profile-baseline-floor-repaired-prompt-ownership-target-share-preserving-deficit-unlikelihood and screens it at runs/transformer-answer-v0.87-fullstack-baseline-floor-repaired-prompt-ownership-clean-smoke-dim4-context80/. The run records 227 repair anchors and applies one bounded baseline-covered anchor repair before each failed adaptive retry is accepted or rejected. It records 200 repaired attempts, rejects all 200, preserves QA/heldout coverage at 0.25, and accepts no weight updates.

v0.88 adds branch-balanced-context-profile-baseline-floor-objective-prompt-ownership-target-share-preserving-deficit-unlikelihood and screens it at runs/transformer-answer-v0.88-fullstack-baseline-floor-objective-prompt-ownership-smoke-dim4-context80/. The run records 227 objective-side floor anchors and includes a balanced anchor batch in the same loss and backward pass as branch-diversity pressure. It records 200 objective anchor batches, rejects all 200 attempted updates, preserves QA/heldout coverage at 0.25, and accepts no weight updates.

v0.89 adds branch-context-profile-baseline-floor-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.89-fullstack-baseline-floor-stabilization-smoke-dim4-context80/. The run removes branch-diversity pressure from guarded attempts and trains only baseline-covered floor anchors. It records 227 stabilization anchors, 200 stabilization anchor batches, rejects all 200 attempted updates, preserves QA/heldout coverage at 0.25, and accepts no weight updates.

v0.90 adds baseline-floor rejection diagnostics and screens them at runs/transformer-answer-v0.90-fullstack-baseline-floor-stabilization-diagnostics-smoke-dim4-context80/. The run records rejected update-shape counts, rejected learning-rate scale counts, violation profile counts, compact floor diagnostic samples, and the worst rejected floor violation. It still rejects 200/200 attempts, but it now identifies the next repair targets: heldout, admissions, glossary, qa, and the worst-deficit learning profile.
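
A rejection summary of this kind could be assembled with counters. The record fields and the `summarize_rejections` helper below are hypothetical, and the three sample records are invented for illustration rather than taken from the run.

```python
from collections import Counter


def summarize_rejections(rejections):
    """Aggregate rejected guard attempts into compact diagnostic counts."""
    worst = max(rejections, key=lambda r: r["worst_deficit"])
    return {
        "shape_counts": Counter(r["update_shape"] for r in rejections),
        "scale_counts": Counter(r["scale"] for r in rejections),
        "profile_counts": Counter(
            p for r in rejections for p in r["violated_profiles"]
        ),
        "worst": (worst["worst_profile"], worst["worst_deficit"]),
    }


# Illustrative records only; field names are assumptions.
rejections = [
    {"update_shape": "stabilization", "scale": 1.0,
     "violated_profiles": ["heldout", "qa"],
     "worst_profile": "learning", "worst_deficit": 0.25},
    {"update_shape": "stabilization", "scale": 0.25,
     "violated_profiles": ["heldout"],
     "worst_profile": "glossary", "worst_deficit": 0.125},
    {"update_shape": "stabilization", "scale": 0.05,
     "violated_profiles": ["heldout", "admissions"],
     "worst_profile": "qa", "worst_deficit": 0.125},
]
report = summarize_rejections(rejections)
```

Counting violations per profile is what lets a report like this name repair targets instead of only restating the rejection total.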

v0.91 adds branch-context-profile-baseline-floor-profile-targeted-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.91-fullstack-baseline-floor-profile-targeted-stabilization-smoke-dim4-context80/. The run covers 227 floor anchors across 12 profile-target groups on every guarded attempt, but still rejects 200/200 profile-targeted updates with the same violation profile counts as v0.90.

v0.92 adds branch-context-profile-baseline-floor-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.92-fullstack-baseline-floor-sequential-profile-stabilization-smoke-dim4-context80/. The run covers 10 source-profile floor groups sequentially on every guarded attempt, rejects all 2000 profile-local attempts, and records 200 no-effective-update outer attempts. This shifts the next repair from profile ordering toward smaller or more isolated floor-preserving weight movement.

v0.93 adds branch-context-profile-baseline-floor-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.93-baseline-floor-calibrated-sequential-profile-stabilization-step1-dim4-context80/. The run records calibrated scales down to 0.0001, coverage-only guard probes, 50 profile-local attempts, 49 profile-local rejections, and one accepted nonzero bridge:owner update at scale 0.0025. The next repair should expand accepted calibrated movement beyond one source profile.

v0.94 adds branch-context-profile-baseline-floor-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.94-baseline-floor-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run searches calibrated scales per source profile, records 60 profile-scale attempts, accepts 8 source-profile updates, rejects 52 profile-scale attempts, and preserves the baseline floor. The next repair should turn this safe movement into branch-diverse behavior.

v0.95 adds branch-context-profile-baseline-floor-diversity-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.95-baseline-floor-diversity-profile-scale-calibrated-sequential-stabilization-configured-step1-dim4-context80/. The run keeps the calibrated per-profile scale search, records 58 profile-scale attempts, accepts 5 score-improving source-profile updates, rejects 42 floor regressions and 11 floor-preserving score regressions, and preserves the baseline floor. The next repair should convert non-regressive profile movement into full branch-diversity target coverage.
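
The acceptance rule described above can be sketched as a three-way classification, followed by a check that the reported counts add up. The `classify_attempt` name and the treatment of score ties as rejections are assumptions; the source does not say how ties are handled in v0.95.

```python
def classify_attempt(floor_ok, score_before, score_after):
    """Classify one profile-scale attempt under diversity-aware acceptance."""
    if not floor_ok:
        return "floor_regression"
    if score_after > score_before:
        return "accepted"
    # Assumption: anything that preserves the floor without improving
    # the score is rejected under the score-regression label.
    return "score_regression"


# Counts reported for the v0.95 screen.
reported = {"accepted": 5, "floor_regression": 42, "score_regression": 11}
total = sum(reported.values())
```

The three outcome counts sum to the 58 recorded profile-scale attempts, so every attempt falls into exactly one bucket.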

v0.96 adds branch-context-profile-baseline-floor-diversity-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.96-baseline-floor-diversity-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run adds 52 missing-target frontier anchors to eligible profile-scale batches, records 43 profile-scale attempts, accepts 9 score-improving source-profile updates, rejects 28 floor regressions and 6 floor-preserving score regressions, and preserves the baseline floor. The next repair should turn frontier-driven movement into full branch-diversity target coverage.

v0.97 adds branch-context-profile-baseline-floor-diversity-coverage-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.97-baseline-floor-diversity-coverage-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run keeps 52 missing-target frontier anchors active, records 68 profile-scale attempts, accepts 1 coverage-gaining source-profile update, rejects 50 floor regressions, 15 coverage ties, and 2 coverage regressions, and preserves accepted coverage deltas in the update guard. The next repair should keep the coverage-frontier audit but isolate missing-target repairs so one monotonic gain does not starve later source profiles.
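
Coverage-frontier acceptance can be pictured as a comparison between the sets of target tokens covered before and after an attempt. This is a sketch under assumptions: the set-based representation, the `coverage_outcome` name, and the rule that any lost token counts as a regression are not confirmed by the source.

```python
def coverage_outcome(floor_ok, covered_before, covered_after):
    """Classify an attempt by how the set of covered target tokens changed."""
    if not floor_ok:
        return "floor_regression"
    if covered_before - covered_after:
        return "coverage_regression"  # assumption: any lost token is a regression
    if covered_after - covered_before:
        return "coverage_gain"
    return "coverage_tie"


# Counts reported for the v0.97 screen.
reported = {
    "coverage_gain": 1,
    "floor_regression": 50,
    "coverage_tie": 15,
    "coverage_regression": 2,
}
total = sum(reported.values())
```

The four buckets account for all 68 attempts. Because ties are rejected, a strictly monotonic rule like this accepts very little, which matches the single accepted update in the screen.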

v0.98 adds branch-context-profile-baseline-floor-diversity-coverage-prep-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.98-baseline-floor-diversity-coverage-prep-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run keeps 52 missing-target frontier anchors active, records 43 profile-scale attempts, accepts 9 source-profile updates, separates 3 coverage gains from 6 coverage-preparation moves, rejects 28 floor regressions, 4 coverage ties without score gain, and 2 coverage regressions, and preserves the branch-diversity floor. This sets up the v0.99 coverage-recovery retry.

v0.99 adds branch-context-profile-baseline-floor-diversity-coverage-recovery-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.99-baseline-floor-diversity-coverage-recovery-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run keeps 52 missing-target frontier anchors active, records 54 profile-scale attempts, accepts 6 source-profile updates, identifies 6 prepared recovery candidates, runs 15 recovery retries over 95 records, converts 2 candidates into direct coverage recoveries, keeps 4 preparation fallbacks, rejects 38 floor regressions, 7 coverage ties without score gain, and 3 coverage regressions, and preserves coverage while still failing branch diversity. The next repair should make recovery-compatible updates less branch-collapsing.

v0.100.0 adds branch-context-profile-baseline-floor-diversity-branch-stable-coverage-recovery-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.100.0-baseline-floor-diversity-branch-stable-coverage-recovery-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run records 54 profile-scale attempts, 6 accepted source-profile updates, 6 prepared recovery candidates, 15 branch-stability checks, 2 branch-stable coverage recoveries, 4 preparation fallbacks, 7 floor regressions, 5 coverage ties, and 1 branch-score regression inside the recovery retry. The next repair should improve branch-diversity coverage while preserving this stricter recovery acceptance surface.

v0.101.0 adds branch-context-profile-baseline-floor-diversity-branch-stable-coverage-recovery-branch-diversity-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.101.0-baseline-floor-diversity-branch-diversity-recovery-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run records 52 profile-scale attempts, 6 accepted source-profile updates, 6 branch-diversity recovery candidates, 9 branch-diversity recovery attempts, 5 branch-score-improving refinements, 1 fallback, 1 floor-regression rejection, 1 score-regression rejection, and 2 score-tie rejections. The next repair should convert those local branch-score gains into target-token coverage for the profiles that still collapse.

v0.102.0 adds branch-context-profile-baseline-floor-diversity-branch-stable-coverage-recovery-branch-diversity-collapsed-profile-binding-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.102.0-baseline-floor-diversity-collapsed-profile-binding-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run records 54 profile-scale attempts, 11 accepted source-profile updates, 11 collapsed-profile binding candidates, 31 binding attempts, 1 accepted binding update, 10 fallbacks, 27 collapsed-profile ties, 1 floor-regression rejection, and 2 score-regression rejections. The next repair should target learning, owner, and paraphrases, the 3/9 eval profiles that remain collapsed.

v0.103.0 adds branch-context-profile-baseline-floor-diversity-branch-stable-coverage-recovery-branch-diversity-collapsed-profile-binding-remaining-profile-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.103.0-baseline-floor-diversity-remaining-profile-binding-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run records 56 profile-scale attempts, 11 accepted source-profile updates, 21 prioritized remaining-profile attempts, 6 prioritized acceptances, 15 prioritized rejections, 3 branch-diversity refinements, and 2 collapsed-profile binding updates. The next repair should preserve the new learning coverage gain while targeting the still-collapsed owner and paraphrases profiles.

v0.104.0 adds branch-context-profile-baseline-floor-diversity-branch-stable-coverage-recovery-branch-diversity-collapsed-profile-binding-remaining-profile-owner-paraphrase-frontier-profile-scale-calibrated-sequential-profile-stabilization-unlikelihood and screens it at runs/transformer-answer-v0.104.0-baseline-floor-diversity-owner-paraphrase-binding-frontier-profile-scale-calibrated-sequential-stabilization-step1-dim4-context80/. The run records 16 owner/paraphrase-prioritized attempts, 6 prioritized acceptances, 10 prioritized rejections, 75 learning-preservation checks, 24 preservation failures, and 33 narrowed collapsed-profile binding rejections. The next repair should convert those owner/paraphrase attempts from protected ties into target-token coverage or predicted-token diversity gains.

v0.105.0 adds corpus-only retrieval memory and screens it at runs/transformer-answer-v0.105.0-retrieval-memory-owner-paraphrase-frontier-profile-scale-step1-dim4-context80/. The run writes retrieval_memory_report.json, builds 497 cards from the closed corpus, answers 219/219 eval probes exactly, and records no external model, no external embeddings, no pretrained retriever, and no weight updates. The next repair should use retrieval success as an immediate memory-serving rail and train only the neural behavior that still fails branch-diversity and owner/paraphrase target-token diversity gates.
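
A corpus-only memory can be as simple as a lookup table built from the closed corpus. The sketch below assumes normalized exact-match keys; the source does not describe the actual retrieval method or the structure of a card, so `normalize`, `build_cards`, and `answer` are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so equivalent prompts share a key."""
    return " ".join(text.lower().split())


def build_cards(corpus_pairs):
    """Build memory cards from (prompt, answer) pairs found in the closed corpus."""
    return {normalize(prompt): answer for prompt, answer in corpus_pairs}


def answer(cards, probe):
    """Return the stored answer, or None when the probe is unknown."""
    return cards.get(normalize(probe))


# Toy corpus; real cards come from the ledgered training corpus.
corpus = [
    ("Who owns the bridge?", "the owner profile"),
    ("What is a glossary?", "a list of defined terms"),
]
cards = build_cards(corpus)
probes = [("who owns  the bridge?", "the owner profile"),
          ("What is a glossary?", "a list of defined terms")]
exact = sum(answer(cards, q) == expected for q, expected in probes)
```

Nothing here uses a model, embeddings, or weight updates, which is the property the v0.105.0 report records.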

v0.106.0 adds memory-guided consolidation planning and screens it at runs/transformer-answer-v0.106.0-memory-guided-consolidation-owner-paraphrase-frontier-profile-scale-step1-dim4-context80/. The run writes memory_consolidation_plan.json, records 9 memory-backed neural failed profiles, and ranks owner, paraphrases, glossary, admission_paraphrases, and admissions as the top consolidation priorities. v0.107.0 consumes that plan in runs/transformer-answer-v0.107.0-gated-memory-consolidation-owner-paraphrase-glossary-frontier-profile-scale-step1-dim4-context80/. The run targets owner, paraphrases, and glossary, records 26 memory-consolidation prioritized attempts with 8 acceptances and 18 rejections, keeps retrieval exact at 219/219, and still rejects promotion on branch_diversity_target. The next repair should use this evidence to improve branch diversity without treating retrieved answers as learned transformer weights.
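
One plausible way to rank consolidation priorities is to count, per profile, the probes that memory answers correctly but the transformer does not. The ranking criterion, the record fields, and the `consolidation_priorities` helper are assumptions; the source names the resulting priorities but not the scoring rule.

```python
from collections import Counter


def consolidation_priorities(probe_results, top_n=3):
    """Rank profiles by count of memory-backed neural failures."""
    failures = Counter()
    for result in probe_results:
        if result["memory_correct"] and not result["neural_correct"]:
            failures[result["profile"]] += 1
    # Sort by failure count descending, then by name for a stable order.
    ranked = sorted(failures.items(), key=lambda item: (-item[1], item[0]))
    return [profile for profile, _count in ranked][:top_n]


# Illustrative probe records, not run data.
probe_results = [
    {"profile": "owner", "memory_correct": True, "neural_correct": False},
    {"profile": "owner", "memory_correct": True, "neural_correct": False},
    {"profile": "paraphrases", "memory_correct": True, "neural_correct": False},
    {"profile": "glossary", "memory_correct": True, "neural_correct": True},
]
priorities = consolidation_priorities(probe_results)
```

Profiles the transformer already answers correctly drop out of the plan, so retrieval success is used as evidence of what to train rather than as learned behavior.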

v0.108.0 expands the source-plan window in runs/transformer-answer-v0.108.0-expanded-memory-consolidation-owner-paraphrase-heldout-qa-glossary-frontier-profile-scale-step1-dim4-context80/. The run consumes the v0.107.0 plan, targets owner, paraphrases, heldout, qa, and glossary, maps target-only profiles to admitted source labels, and keeps retrieval exact at 219/219. Branch diversity still blocks promotion, so the next repair should target missing first-token diversity directly.

v0.109.0 implements that repair direction in runs/transformer-answer-v0.109.0-missing-first-token-memory-consolidation-owner-paraphrase-heldout-qa-glossary-frontier-profile-scale-step1-dim4-context80/. The run consumes the v0.108.0 plan, extracts missing first-token target maps, runs 22 guarded missing-token attempts, accepts 1 coverage-gain update, keeps retrieval exact at 219/219, and still rejects promotion on branch_diversity_target. The next repair should use that evidence to target the remaining collapsed owner, paraphrases, and learning profiles without relaxing the target-token or branch-diversity gates.
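
A missing first-token target map can be sketched as a per-profile set difference between the first tokens the targets require and the first tokens the model actually predicts. Whitespace tokenization is an assumption made to keep the sketch self-contained (QuarkLM uses its own tokenizer), and `missing_first_tokens` is a hypothetical helper.

```python
def first_tokens(texts):
    """Collect the first whitespace-delimited token of each non-empty text."""
    return {text.split()[0] for text in texts if text.split()}


def missing_first_tokens(targets_by_profile, predictions_by_profile):
    """Map each profile to target first tokens the model never predicts."""
    missing = {}
    for profile, targets in targets_by_profile.items():
        predicted = first_tokens(predictions_by_profile.get(profile, []))
        gap = sorted(first_tokens(targets) - predicted)
        if gap:
            missing[profile] = gap
    return missing


# Illustrative data only.
targets = {
    "owner": ["alice owns it", "bob owns it"],
    "glossary": ["n is a count"],
}
predictions = {
    "owner": ["n", "n"],
    "glossary": ["n is a count"],
}
gaps = missing_first_tokens(targets, predictions)
```

Profiles with no gap are omitted, so the map lists only the tokens that guarded missing-token attempts would need to recover.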

v0.110.0 narrows that evidence in runs/transformer-answer-v0.110.0-remaining-collapsed-missing-first-token-memory-consolidation-owner-paraphrase-learning-frontier-profile-scale-step1-dim4-context80/. The run consumes the v0.109.0 plan, requires collapsed-profile source evidence, targets only owner, paraphrases, and learning, records 16 guarded missing-token attempts, accepts 1 coverage-gain update, keeps retrieval exact at 219/219, and still rejects promotion on branch_diversity_target. The next repair should make the pressure specific to each remaining profile's missing tokens and acceptance deltas.

v0.111.0 implements that profile-specific repair surface in runs/transformer-answer-v0.111.0-profile-specific-missing-first-token-memory-consolidation-owner-paraphrase-learning-frontier-profile-scale-step1-dim4-context80/. The run consumes the v0.110.0 plan, maps source labels to supported targets before missing-token pressure is applied, records 18 guarded missing-token attempts, keeps retrieval exact at 219/219, and still rejects promotion on branch_diversity_target. The next repair should use those target maps and acceptance deltas to recover paraphrases, owner, and re-emergent glossary target-token diversity without weakening constraint-first promotion.

v0.112.0 pauses that repair and adds root-cause diagnostics in runs/transformer-answer-v0.112.0-branch-diversity-root-cause-profile-specific-memory-consolidation-step1-dim4-context80/. The run consumes the v0.111.0 plan, targets owner, paraphrases, and glossary, keeps retrieval exact at 219/219, records 24 guarded missing-token attempts with 0 direct missing-token acceptances, and classifies the final branch failure as a critical target_routing_gap. The next repair should audit logit priors, output-bias escape paths, representation separation, and profile/target imbalance before introducing another branch objective.

v0.113.0 implements that audit in runs/transformer-answer-v0.113.0-branch-routing-audit-profile-specific-memory-consolidation-step1-dim4-context80/. The run consumes the v0.112.0 plan, targets owner, paraphrases, and learning, keeps retrieval exact at 219/219, records 18 guarded missing-token attempts with 0 direct acceptances and 6 fallbacks, and still rejects promotion on branch_diversity_target. branch_routing_audit records high output-bias escape risk, low representation separation across 9/9 multi-target profiles, and a glossary target-imbalance hotspot. The next repair should be chosen only after those logit-prior and hidden-state separation measurements identify a guarded repair surface.

v0.114.0 implements the logit-prior and centroid-separation measurements in runs/transformer-answer-v0.114.0-logit-prior-representation-instrumentation-profile-specific-memory-consolidation-step1-dim4-context80/. The run consumes the v0.113.0 plan, targets owner, paraphrases, and glossary, keeps retrieval exact at 219/219, records 24 guarded missing-token attempts with 0 direct acceptances and 8 fallbacks, and still rejects promotion on branch_diversity_target. branch_logit_prior_profiles show hidden-projection pressure across 9/9 multi-target profiles, while centroid margins remain poorly separated. The next repair should target that hidden-projection/representation surface under the existing guards.
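
One way to quantify representation separation is the smallest distance between per-target centroids of hidden states. The source does not define its centroid margin, so the definition below (minimum pairwise Euclidean distance) and the `centroid_margin` helper are assumptions.

```python
import math
from collections import defaultdict


def centroid_margin(hidden_states, target_tokens):
    """Return the minimum distance between per-target hidden-state centroids."""
    groups = defaultdict(list)
    for vector, token in zip(hidden_states, target_tokens):
        groups[token].append(vector)
    centroids = {
        token: [sum(column) / len(vectors) for column in zip(*vectors)]
        for token, vectors in groups.items()
    }
    tokens = sorted(centroids)
    if len(tokens) < 2:
        return None  # a margin needs at least two distinct targets
    return min(
        math.dist(centroids[a], centroids[b])
        for i, a in enumerate(tokens)
        for b in tokens[i + 1:]
    )


# Toy 2-dimensional hidden states for two target tokens.
hidden = [[0.0, 0.0], [0.0, 2.0], [3.0, 0.0], [3.0, 2.0]]
labels = ["a", "a", "b", "b"]
margin = centroid_margin(hidden, labels)
```

A margin near zero would mean prompts with different targets land on nearly the same hidden state, leaving the output layer little to separate them with.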

v0.115.0 implements the first hidden-projection candidate in runs/transformer-answer-v0.115.0-hidden-projection-margin-candidate-step1-dim4-context80/. It runs branch-hidden-projection-margin-unlikelihood with output bias frozen and lowers average collapsed-token hidden advantage from about 0.0842 to 0.0736. It still rejects promotion on branch_diversity_target, with all 9/9 multi-target profiles collapsed to "n" and 2 zero-coverage profiles. The next repair should scale hidden-projection routing pressure only under coverage-preserving gates.
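
The collapse reported above can be detected by measuring how often the most common predicted token appears within each multi-target profile. The threshold of 1.0 for "collapsed" and both helper names are assumptions for this sketch.

```python
from collections import Counter


def dominant_predicted_rate(predicted_tokens):
    """Return the most common predicted token and its share of predictions."""
    token, count = Counter(predicted_tokens).most_common(1)[0]
    return token, count / len(predicted_tokens)


def collapsed_profiles(predictions_by_profile, targets_by_profile):
    """Find multi-target profiles whose predictions all use a single token."""
    collapsed = {}
    for profile, predictions in predictions_by_profile.items():
        if not predictions or len(set(targets_by_profile[profile])) < 2:
            continue  # single-target profiles cannot show branch collapse
        token, rate = dominant_predicted_rate(predictions)
        if rate == 1.0:  # assumed threshold for full collapse
            collapsed[profile] = token
    return collapsed


# Illustrative first-token predictions and targets.
predictions = {"owner": ["n", "n", "n"], "glossary": ["n", "a", "n"]}
targets = {"owner": ["a", "b", "c"], "glossary": ["n", "a", "n"]}
collapsed = collapsed_profiles(predictions, targets)
```

In this toy data only `owner` is fully collapsed; `glossary` still has a dominant token but keeps some branch diversity.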
