Documentation · v0.42 · Research prototype
Big idea. Tiny package.
QuarkLM is a closed-world language model: random weights, no pretrained tokenizer, no external embeddings, and learning only through the admitted corpus. These docs cover the model, build loop, operating discipline, and security boundary.
- Run: runs/self-improve-v0.42/
- Admission probes: 48/48 direct · 84/84 paraphrase · 38/38 glossary
- Boundary: no pretrained weights · no external embeddings
- Diagnosis: passed · no external model
Navigate
Four entry points into QuarkLM.
- 01 Learn
Concepts and product model
Understand closed-world learning, the language model, the admitted dataset, and the current evidence.
- vision
- model
- self-improvement
- evidence
- 02 Build
Run and extend QuarkLM
Generate curriculum, train random weights, admit new facts, and add generated probes without crossing the purity boundary.
- quickstart
- admission
- probes
- commands
- 03 Operate
Promote releases with evidence
Use RC readiness, self-improvement reports, forgetting audits, provenance snapshots, and docs freshness gates.
- RC readiness
- release gates
- provenance
- docs drift
- 04 Secure
Keep the world closed
Guard against pretrained weights, unledgered text, prompt leakage, and claims outside the corpus.
- purity
- leakage
- unknowns
- boundaries
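As a rough illustration of what guarding against unledgered text could look like, the sketch below compares a set of candidate input paths against a ledger. The ledger format (one relative path per line, `#` comments), the example paths, and the function names `parse_ledger` and `find_unledgered` are assumptions made for this example, not QuarkLM's actual interface.

```python
def parse_ledger(text: str) -> set[str]:
    """Parse a hypothetical ledger: one relative path per line, '#' starts a comment line."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def find_unledgered(candidate_paths, ledger: set[str]) -> list[str]:
    """Return candidate paths the ledger does not list, sorted for stable reports."""
    return sorted(set(candidate_paths) - ledger)


# Hypothetical ledger contents and candidate inputs, for illustration only.
ledger = parse_ledger("# admitted corpus\ncorpus/facts.txt\ncorpus/glossary.txt\n")
violations = find_unledgered(["corpus/facts.txt", "scratch/notes.txt"], ledger)
if violations:
    print("unledgered inputs:", violations)
```

A check like this would plausibly run before training and before evaluation, so that any file outside the ledger stops the run instead of silently influencing it.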
Pick a path
Where are you trying to go?
Curated paths for the most common moves in the prototype: understand the experiment, admit new memory, and promote evidence without drift.
01 New to QuarkLM
Read the model, then run a smoke cycle
02 Teaching a new fact
Admit memory, generate probes, retrain weights
03 Protecting the experiment
Audit provenance, leakage, and docs freshness
Audit provenance, leakage, and docs freshness
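The "Teaching a new fact" path (admit memory, generate probes, retrain weights) can be pictured with a small sketch of its first two steps. The `Fact` and `Probe` shapes, the prompt templates, and the `admit` and `generate_probes` helpers are hypothetical and only show the general flow; the real admission format and probe generator may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    # Hypothetical structured fact: a subject, a relation, and the expected answer.
    subject: str
    relation: str
    answer: str


@dataclass(frozen=True)
class Probe:
    kind: str      # "direct" or "paraphrase"
    prompt: str
    expected: str


def admit(log: list[Fact], fact: Fact) -> bool:
    """Append a fact to the admitted-memory log unless it is already present."""
    if fact in log:
        return False
    log.append(fact)
    return True


def generate_probes(log: list[Fact]) -> list[Probe]:
    """Derive one direct and one paraphrase probe per admitted fact (assumed templates)."""
    probes = []
    for f in log:
        probes.append(Probe("direct", f"What is the {f.relation} of {f.subject}?", f.answer))
        probes.append(Probe("paraphrase", f"{f.subject}: {f.relation}?", f.answer))
    return probes


log: list[Fact] = []
admit(log, Fact("QuarkLM", "initialization", "random weights"))
probes = generate_probes(log)
for probe in probes:
    print(probe.kind, "|", probe.prompt, "->", probe.expected)
```

Because probes are derived only from the log, a fact that was never admitted never produces a probe, which keeps evaluation inside the same closed world as training.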
Primitives
The loop is thirteen auditable objects.
corpus.ledger
Ledger
The explicit list of files allowed to influence training or evaluation.
admission.log
Admitted memory
Structured facts that become learnable only after admission.
probe.audit
Generated probes
Direct and paraphrase checks derived from the admitted-memory log.
weight.run
Versioned weights
Randomly initialized checkpoints promoted only with recorded metrics.
forgetting.audit
Forgetting audit
A comparison against the previous promoted report.
diagnosis.report
Self-diagnosis
Rule-based repair recommendations derived from the run report, with no external model.
verifier.report
Closed-world verifier
Deterministic approval for candidate checks and training plans, with no external model.
recipe.run
Training recipe
A reproducible record of model, tokenizer, data, objective, optimizer, artifacts, gates, and rerun details.
transformer.surface
Transformer surfaces
Experiment, artifact, trainer, and objective catalog boundaries for answer-training screens.
checkpoint.meta
Checkpoint metadata
Centralized transformer config, checkpoint identity, dataset metadata, and run metadata.
eval.report
Eval report
Checkpoint loading, probe scoring, sample JSONL, and eval JSON assembled through narrow surfaces.
rc.boundary
RC boundary
Research Prototype RC and Language Model RC stay separate until transformer promotion gates pass.
docs.release
Docs gate
README, docs, and marketing content updated with each release when they reference current state.
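Of the primitives above, the forgetting audit is the simplest to picture in code: a comparison of the current run against the previous promoted report. The sketch below assumes each report reduces to a mapping from probe ID to pass/fail; that shape, the probe IDs, and the `forgetting_audit` name are illustrative assumptions, not the project's report schema.

```python
def forgetting_audit(previous: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """List probes that passed in the previous promoted report but do not pass now.

    A probe missing from the current report is treated as not passing.
    """
    return sorted(
        probe_id
        for probe_id, passed in previous.items()
        if passed and not current.get(probe_id, False)
    )


# Hypothetical reports: probe ID -> whether the probe passed.
previous = {"direct-001": True, "para-001": True, "gloss-001": False}
current = {"direct-001": True, "para-001": False, "gloss-001": True}

forgotten = forgetting_audit(previous, current)
print("forgotten probes:", forgotten)
```

Treating a missing probe as a failure is a deliberately conservative choice in this sketch: a probe that disappears between reports is flagged for review and not assumed to still pass.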
Eidetic Labs
Need the product story or the source?
The marketing page carries the concise product position. The repository and these docs remain the source of truth for commands, evidence, and release gates.