Documentation · v0.42 · Research prototype

Big idea. Tiny package.

QuarkLM is a closed-world language model: random weights, no pretrained tokenizer, no external embeddings, and learning only through the admitted corpus. These docs cover the model, build loop, operating discipline, and security boundary.

Quickstart · Read the evidence

$ python3 -m closed_world_lm.self_improve

Run: runs/self-improve-v0.42/
Admission probes: 48/48 direct · 84/84 paraphrase · 38/38 glossary
Boundary: no pretrained weights · no external embeddings
Diagnosis: passed · no external model
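
As a rough illustration of the boundary stated above (random weights, no pretrained tokenizer, learning only from admitted text), the sketch below builds a character vocabulary from admitted text and fills an embedding table from a seeded random generator. This is not QuarkLM's code: `build_vocab`, `init_weights`, the character-level tokenization, and the sample sentence are all hypothetical and chosen only to show the idea.

```python
import random

def build_vocab(admitted_texts):
    """Hypothetical helper: character vocabulary derived only from admitted text."""
    chars = sorted({ch for text in admitted_texts for ch in text})
    return {ch: index for index, ch in enumerate(chars)}

def init_weights(vocab_size, dim, seed):
    """Hypothetical helper: embedding table drawn from a seeded RNG.

    Nothing is loaded from disk or the network, so the only source of
    knowledge left is whatever training later sees in the admitted corpus.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

# Assumed sample of admitted text, for illustration only.
admitted = ["quark is a closed-world model"]
vocab = build_vocab(admitted)
weights = init_weights(len(vocab), dim=4, seed=0)
```

Seeding the generator keeps the random initialization reproducible, which is one plausible way to make a checkpoint's starting point auditable.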

Navigate

Four entry points into QuarkLM.

Pick a path

Where are you trying to go?

Curated paths for the most common moves in the prototype: understand the experiment, admit new memory, and promote evidence without drift.

01 New to QuarkLM
Read the model, then run a smoke cycle
02 Teaching a new fact
Admit memory, generate probes, retrain weights
03 Protecting the experiment
Audit provenance, leakage, and docs freshness

Primitives

The loop is seven auditable objects, listed first below; six supporting records follow.

corpus.ledger
Ledger
The explicit list of files allowed to influence training or evaluation.
admission.log
Admitted memory
Structured facts that become learnable only after admission.
probe.audit
Generated probes
Direct and paraphrase checks derived from the admitted-memory log.
weight.run
Versioned weights
Randomly initialized checkpoints promoted only with recorded metrics.
forgetting.audit
Forgetting audit
A comparison against the previous promoted report.
diagnosis.report
Self-diagnosis
Rule-based repair recommendations derived from the run report, with no external model.
verifier.report
Closed-world verifier
Deterministic approval for candidate checks and training plans, with no external model.
recipe.run
Training recipe
A reproducible record of model, tokenizer, data, objective, optimizer, artifacts, gates, and rerun details.
transformer.surface
Transformer surfaces
Experiment, artifact, trainer, and objective catalog boundaries for answer-training screens.
checkpoint.meta
Checkpoint metadata
Centralized transformer config, checkpoint identity, dataset metadata, and run metadata.
eval.report
Eval report
Checkpoint loading, probe scoring, sample JSONL, and eval JSON assembled through narrow surfaces.
rc.boundary
RC boundary
Research Prototype RC and Language Model RC stay separate until transformer promotion gates pass.
docs.release
Docs gate
README, docs, and marketing content updated with each release when they reference current state.
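
To make two of the primitives above more concrete, here is a hedged sketch of a ledger check and a forgetting audit. The function names, the path strings, and the `{probe_id: passed}` report shape are assumptions made for illustration; the real `corpus.ledger` and `forgetting.audit` formats are not described on this page.

```python
from pathlib import PurePosixPath

def unlisted_files(ledger, used_files):
    """Return files that influenced a run but do not appear on the ledger.

    `ledger` and `used_files` are iterables of path strings (assumed shape).
    """
    allowed = {PurePosixPath(path) for path in ledger}
    used = {PurePosixPath(path) for path in used_files}
    return sorted(str(path) for path in used - allowed)

def forgetting_audit(previous, candidate):
    """Return probe ids that passed in the previous promoted report but fail now.

    Both arguments map probe id -> bool (assumed shape). A probe missing
    from the candidate report is treated as a failure.
    """
    return sorted(
        probe_id
        for probe_id, passed in previous.items()
        if passed and not candidate.get(probe_id, False)
    )

# Hypothetical inputs, for illustration only.
leaks = unlisted_files(["corpus/a.txt"], ["corpus/a.txt", "web/b.txt"])
regressions = forgetting_audit(
    {"p1": True, "p2": True, "p3": False},
    {"p1": True, "p2": False},
)
```

In this sketch an empty result from both functions would be the condition for letting a candidate move forward; any non-empty list names exactly what has to be explained first.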

Eidetic Labs

Need the product story or the source?

The marketing page carries the concise product position. The repository and these docs remain the source of truth for commands, evidence, and release gates.

Product site · GitHub: eidetic-labs/quark-lm

Entry points: Concepts and product model · Run and extend QuarkLM · Promote releases with evidence · Keep the world closed
