DOSE Experiment Matrix

This page turns the current DOSE follow-up work into a fixed experiment matrix. The work stays benchmark-only for now: the supported public reproduction entry point is `python -m zeroproofml.benchmarks dose ...`, and the frozen direction-head follow-up is promoted to the benchmark mode `python -m zeroproofml.benchmarks dose --mode frozen-dirhead ...`.

Public API decision

Decision: DOSE-specific heads, samplers, and loss combinations remain benchmark-only for this release. No new `zeroproofml.layers` or `zeroproofml.losses` symbol is promoted solely on the basis of the DOSE direction-head or finite-MSE/censoring follow-up rows.

The public library surface should stay generic: use `AngularProjectiveHead`, `ProjectiveRationalMultiHead`, `LossWeightsCurriculum`, the core SCM losses, and `decode_strict_censored_3way(...)` when building similar workflows outside the benchmark harness. The DOSE-specific wrappers under `scripts/dose/`, as well as `scripts/run_dose_nextsteps.py` and the `frozen-dirhead` benchmark mode, are evidence generators, not stable APIs.

Promotion requires at least one of these to be true:

- The head/loss is recast as a domain-neutral primitive with tests and docs outside DOSE.
- The same primitive clears a benchmark-defined acceptance bar in another scientific domain.
- The release plan explicitly accepts the maintenance cost of a `zeroproofml.benchmarks.domains.dose`-scoped experimental API.

Matrix rules

- Keep seeds, dataset sizes, and metric definitions fixed across rows.
- Compare each treatment against the named control, not against an ad hoc appendix run.
- Keep the output layout under `results/paper_suite_dose/nextsteps/{pilot,full}/<variant>/` for `scripts/run_dose_nextsteps.py` runs.
- For frozen direction-head follow-up runs, use `python -m zeroproofml.benchmarks dose --mode frozen-dirhead ...`; it keeps `dirhead_only_results.json` and the parent per-seed DOSE result together under the same `seed_*` directory.
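
The layout and entry-point rules above can be sketched as a small helper. This is an illustration only, not code from the repository: `nextsteps_output_dir` and `nextsteps_argv` are hypothetical names, and the sketch uses only the `--stage` and `--variant` flags shown on this page.

```python
import sys
from pathlib import Path

# Stage names taken from the {pilot,full} layout described above.
STAGES = ("pilot", "full")
RESULTS_ROOT = Path("results/paper_suite_dose/nextsteps")


def nextsteps_output_dir(stage: str, variant: str) -> Path:
    """Return results/paper_suite_dose/nextsteps/<stage>/<variant> (hypothetical helper)."""
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {STAGES}, got {stage!r}")
    if not variant or "/" in variant:
        raise ValueError(f"invalid variant name: {variant!r}")
    return RESULTS_ROOT / stage / variant


def nextsteps_argv(stage: str, variant: str) -> list[str]:
    """Build the command line for one matrix row (hypothetical helper)."""
    nextsteps_output_dir(stage, variant)  # reuse the validation above
    return [
        sys.executable,
        "scripts/run_dose_nextsteps.py",
        "--stage", stage,
        "--variant", variant,
    ]


if __name__ == "__main__":
    print(nextsteps_output_dir("pilot", "angular"))
    print(" ".join(nextsteps_argv("pilot", "angular")))
```

Validating the stage name before building the command keeps a typo from silently creating a stray results directory outside the fixed layout.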

Matrix

Factor	Control	Treatment	Entry point	Notes
angular vs non-angular heads	`ZeroProofML_SCM_NoDetachRenorm`	`ZeroProofML_SCM_NoDetachRenorm_Angular`	`python scripts/run_dose_nextsteps.py --stage pilot --variant angular`	Same strict gate, isolates tuple parameterization.
detach-renorm on/off	`ZeroProofML_SCM` (`detach_renorm=True`)	`ZeroProofML_SCM_NoDetachRenorm` (`detach_renorm=False`)	`python scripts/run_paper_suite_dose.py --models ZeroProofML_SCM ZeroProofML_SCM_NoDetachRenorm ...`	Isolates detached renormalization without changing the strict SCM decode.
hard vs soft curriculum	`angular_curriculum_50` (hard)	`angular_soft_curriculum_ext` (soft)	`python scripts/run_dose_nextsteps.py --stage pilot --variant angular_curriculum_50` and `python scripts/run_dose_nextsteps.py --stage pilot --variant angular_soft_curriculum_ext`	Hard keeps the stronger `lambda_bot=20` schedule; soft uses the slower `lambda_bot=10` ramp plus longer runway.
class balancing / `DirBalanceSampler` (strict SCM mirror)	`ZeroProofML_SCM_NoDetachRenorm`	`dirbalance`	`python scripts/run_dose_nextsteps.py --stage pilot --variant dirbalance`	Same sampler ablation as the angular row, but without changing representation.
class balancing / `DirBalanceSampler` (angular)	`angular`	`angular_dirbalance`	`python scripts/run_dose_nextsteps.py --stage pilot --variant angular_dirbalance`	Implemented today by `--train-balance-censored-direction`, which uses a `WeightedRandomSampler` to preserve the in-range rate while balancing censored below/above samples.
finite-MSE auxiliary terms (strict SCM mirror)	`curriculum`	`curriculum_fmse`	`python scripts/run_dose_nextsteps.py --stage pilot --variant curriculum_fmse`	Mirrors the mixed-objective row on the original strict SCM head, so optimizer gains are separated from representation gains.
finite-MSE auxiliary terms (angular)	`angular_curriculum`	`angular_curriculum_fmse`	`python scripts/run_dose_nextsteps.py --stage pilot --variant angular_curriculum_fmse`	This is the current mixed-objective row: it keeps the safe censoring behavior terms from `angular_curriculum` and adds `--finite-mse-weight` plus `--finite-mse-eps` to explicitly optimize finite regression quality on the same run.
frozen direction-head follow-up	tuple-only decode from a saved strict-SCM checkpoint	frozen censored-direction head	`python -m zeroproofml.benchmarks dose --mode frozen-dirhead --device cpu` (internally runs `scripts/dose/train_dose_dirhead_only.py` then `scripts/dose/aggregate_dirhead_only.py`)	Freezes the backbone and measures direction utility separately from gate behavior.
joint direction head vs strict gate variants	`ZeroProofML_SCM_NoDetachRenorm`	`ZeroProofML_SCM_NoDetachRenorm_DirHead_DetachBackbone`	`python scripts/run_dose_nextsteps.py --stage pilot --variant dirhead_shared_gate`	Keeps the `|Q| < tau_infer` gate and adds a joint direction head with detached-backbone updates.
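
The angular `DirBalanceSampler` row describes weights that preserve the in-range rate while balancing censored below/above samples. The sketch below shows one weighting that satisfies that description. It is an assumption, not the benchmark's implementation: the label strings and the function name `direction_balanced_weights` are hypothetical, and the actual `--train-balance-censored-direction` weights may be computed differently.

```python
from collections import Counter

# Hypothetical label names; the benchmark's own encoding may differ.
IN_RANGE, BELOW, ABOVE = "in_range", "below", "above"


def direction_balanced_weights(labels: list[str]) -> list[float]:
    """Per-sample weights: in-range mass unchanged, censored mass split 50/50."""
    counts = Counter(labels)
    unknown = set(counts) - {IN_RANGE, BELOW, ABOVE}
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    n_below, n_above = counts[BELOW], counts[ABOVE]
    if n_below == 0 or n_above == 0:
        # Only one censored direction (or none) is present: nothing to balance.
        return [1.0] * len(labels)

    # Each censored direction receives half of the total censored mass,
    # so the censored share of the batch (and the in-range share) is unchanged.
    half_censored = (n_below + n_above) / 2
    per_class = {
        IN_RANGE: 1.0,
        BELOW: half_censored / n_below,
        ABOVE: half_censored / n_above,
    }
    return [per_class[label] for label in labels]


if __name__ == "__main__":
    demo = [IN_RANGE] * 6 + [BELOW] * 3 + [ABOVE]
    print(direction_balanced_weights(demo))
```

The resulting list can be passed as the `weights` argument of `torch.utils.data.WeightedRandomSampler` with `replacement=True`. In expectation, a draw then keeps the original in-range fraction while sampling the two censored directions equally often.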

Issue-separation reads

Do not read the rows above as a flat leaderboard. The useful question for each pair is whether the change is optimization-only (loss terms, schedule, sampler) or representation-only (head parameterization).

`scripts/run_dose_nextsteps.py` exposes the pairing catalog below through `build_issue_ablation_catalog(...)`, so plots and tables can reuse the same control/treatment definitions instead of re-encoding them in notebooks.

Diagnostic pair	Isolates	Control	Treatment	Interpretation
shared curriculum, swap head	`representation-only`	`curriculum`	`angular_curriculum`	If the angular head wins here, the remaining gap is not just schedule tuning.
add finite-MSE on strict SCM	`optimization-only`	`curriculum`	`curriculum_fmse`	If this closes the gap, the strict head was under-optimized rather than fundamentally unable to represent the target.
add finite-MSE on angular SCM	`optimization-only`	`angular_curriculum`	`angular_curriculum_fmse`	Confirms whether the same mixed objective still moves the frontier after the representation swap.
same mixed objective, swap head	`representation-only`	`curriculum_fmse`	`angular_curriculum_fmse`	If angular still wins after matching the loss, there is a real representation effect left.
add DirBalance on angular SCM	`optimization-only`	`angular`	`angular_dirbalance`	Separates sampler effects from the angular parameterization itself.
same DirBalance, swap head	`representation-only`	`dirbalance`	`angular_dirbalance`	If angular still wins with identical balancing, the direction trade-off is not just a sampling problem.
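
The pairs in the table above can be carried as plain data so that every plot computes treatment-minus-control deltas the same way. This is a minimal sketch: `DiagnosticPair` and `paired_deltas` are hypothetical names, and the structure is not claimed to match what `build_issue_ablation_catalog(...)` returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticPair:
    isolates: str   # "representation-only" or "optimization-only"
    control: str    # variant name of the control row
    treatment: str  # variant name of the treatment row


# Pairs copied from the table above.
PAIRS = (
    DiagnosticPair("representation-only", "curriculum", "angular_curriculum"),
    DiagnosticPair("optimization-only", "curriculum", "curriculum_fmse"),
    DiagnosticPair("optimization-only", "angular_curriculum", "angular_curriculum_fmse"),
    DiagnosticPair("representation-only", "curriculum_fmse", "angular_curriculum_fmse"),
    DiagnosticPair("optimization-only", "angular", "angular_dirbalance"),
    DiagnosticPair("representation-only", "dirbalance", "angular_dirbalance"),
)


def paired_deltas(metric_by_variant: dict[str, float]) -> dict[str, list[tuple[str, str, float]]]:
    """Group treatment-minus-control deltas by the issue each pair isolates."""
    grouped: dict[str, list[tuple[str, str, float]]] = {}
    for pair in PAIRS:
        delta = metric_by_variant[pair.treatment] - metric_by_variant[pair.control]
        grouped.setdefault(pair.isolates, []).append((pair.control, pair.treatment, delta))
    return grouped
```

A missing variant raises `KeyError`, which is deliberate in this sketch: a pair should not be reported with only one of its two rows.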

Canonical config IDs

- For rows that reuse the standard `phase17_dose_results.json` benchmark payload directly, the canonical config handle is the model key already stored under `runs[...]` (for example `ZeroProofML_SCM`, `ZeroProofML_SCM_NoDetachRenorm`, `ZeroProofML_SCM_NoDetachRenorm_Angular`, and `ZeroProofML_SCM_NoDetachRenorm_DirHead_DetachBackbone`).
- `scripts/run_dose_nextsteps.py` now writes `results/paper_suite_dose/nextsteps/{pilot,full}/<variant>/variant_config.json` with a stable `config_id`, the exact `run_paper_suite_dose.py` argv fragment, and a payload hash. Use that `config_id` in plot captions, table footnotes, and claim-audit notes. Curriculum variants also carry a structured `curriculum_schedule` block with the named preset plus the resolved target lambdas, so the hard/soft schedules are inspectable without reverse-engineering argv.
- The frozen direction-head follow-up writes the same kind of provenance under `dirhead_only_results.json["config"]`, and `aggregated/dirhead_only_summary.json["config"]` carries that ID forward.
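
This page does not specify how `config_id` is derived from the payload hash. As a hedged illustration of how a stable ID can be produced, the sketch below hashes a canonical JSON encoding and keeps the first 12 hex characters, matching the shape of the IDs listed on this page. The choice of SHA-256, the payload fields, and the names `payload_hash` and `make_config_id` are all assumptions, so this will not reproduce the published IDs.

```python
import hashlib
import json


def payload_hash(payload: dict) -> str:
    """SHA-256 over canonical JSON (sorted keys, compact separators)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_config_id(variant: str, payload: dict, prefix: str = "dose-nextsteps") -> str:
    """Assumed scheme: <prefix>-<variant>-<first 12 hex chars of the payload hash>."""
    return f"{prefix}-{variant}-{payload_hash(payload)[:12]}"


if __name__ == "__main__":
    # Placeholder payload; the real variant_config.json fields may differ.
    example = {"stage": "pilot", "argv": ["--models", "ZeroProofML_SCM_NoDetachRenorm"]}
    print(make_config_id("curriculum", example))
```

Sorting keys before hashing is what makes the ID stable: two payloads with the same content but different key order map to the same `config_id`, while any change to a value produces a different one.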

`run_dose_nextsteps.py` default variant IDs

Variant	Canonical config ID	Highlighted model
`curriculum`	`dose-nextsteps-curriculum-96bc4fcf9b29`	`ZeroProofML_SCM_NoDetachRenorm`
`curriculum_fmse`	`dose-nextsteps-curriculum_fmse-22b2ace936ca`	`ZeroProofML_SCM_NoDetachRenorm`
`dirbalance`	`dose-nextsteps-dirbalance-7dec73ccc06a`	`ZeroProofML_SCM_NoDetachRenorm`
`angular`	`dose-nextsteps-angular-0da83f140690`	`ZeroProofML_SCM_NoDetachRenorm_Angular`
`angular_curriculum`	`dose-nextsteps-angular_curriculum-d8e655f51927`	`ZeroProofML_SCM_NoDetachRenorm_Angular`
`angular_curriculum_50`	`dose-nextsteps-angular_curriculum_50-ddd871e97ad0`	`ZeroProofML_SCM_NoDetachRenorm_Angular`
`angular_soft_curriculum_ext`	`dose-nextsteps-angular_soft_curriculum_ext-40f62539a259`	`ZeroProofML_SCM_NoDetachRenorm_Angular`
`angular_curriculum_fmse`	`dose-nextsteps-angular_curriculum_fmse-d8193c71c762`	`ZeroProofML_SCM_NoDetachRenorm_Angular`
`angular_dirbalance`	`dose-nextsteps-angular_dirbalance-577a98ae7018`	`ZeroProofML_SCM_NoDetachRenorm_Angular`
`dirhead_shared_gate`	`dose-nextsteps-dirhead_shared_gate-b512ab9d0c8d`	`ZeroProofML_SCM_NoDetachRenorm_DirHead_DetachBackbone`
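
Every ID in the table above has the shape `dose-nextsteps-<variant>-<12 hex characters>`. A caption or footnote check can use that shape to catch an ID pasted next to the wrong variant. The pattern below is inferred from the listed IDs rather than taken from a documented format, and `parse_config_id` and `check_variant_ids` are hypothetical names.

```python
import re

# Pattern inferred from the IDs listed above; not a documented format.
CONFIG_ID_RE = re.compile(r"dose-nextsteps-(?P<variant>[a-z0-9_]+)-(?P<digest>[0-9a-f]{12})")


def parse_config_id(config_id: str) -> tuple[str, str]:
    """Split a nextsteps config ID into (variant, digest)."""
    match = CONFIG_ID_RE.fullmatch(config_id)
    if match is None:
        raise ValueError(f"not a nextsteps config ID: {config_id!r}")
    return match["variant"], match["digest"]


def check_variant_ids(ids_by_variant: dict[str, str]) -> None:
    """Raise if any ID embeds a different variant name than its key."""
    for variant, config_id in ids_by_variant.items():
        embedded, _ = parse_config_id(config_id)
        if embedded != variant:
            raise ValueError(f"{config_id!r} is listed under {variant!r}")


if __name__ == "__main__":
    check_variant_ids({
        "angular": "dose-nextsteps-angular-0da83f140690",
        "angular_curriculum_50": "dose-nextsteps-angular_curriculum_50-ddd871e97ad0",
    })
    print("ok")
```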

Frozen direction-head follow-up ID

Workflow	Canonical config ID	Notes
`train_dose_dirhead_only.py` + `aggregate_dirhead_only.py`	`dose-dirhead-only-12de1d9bc29c`	Default `base_model_key=ZeroProofML_SCM_NoDetachRenorm`, `tau_infer=0.025`, `epochs=6`, `batch_size=512`, `lr=5e-3`, balanced censored-direction sampling on.
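
Because the canonical ID above is tied to the default settings, it helps to confirm that a run used them before citing it. The sketch below compares a loaded config block against the defaults from the table. It assumes the block uses the same key names as the table, which this page does not confirm; `DirheadDefaults` and `diff_from_defaults` are hypothetical, and the sampling toggle is omitted because its key name is not given here.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DirheadDefaults:
    # Values copied from the table above; key names are assumed.
    base_model_key: str = "ZeroProofML_SCM_NoDetachRenorm"
    tau_infer: float = 0.025
    epochs: int = 6
    batch_size: int = 512
    lr: float = 5e-3


def diff_from_defaults(config: dict) -> dict:
    """Return {field: (default, actual)} for fields that differ or are missing."""
    expected = asdict(DirheadDefaults())
    return {
        key: (default, config.get(key))
        for key, default in expected.items()
        if config.get(key) != default
    }


if __name__ == "__main__":
    run_config = dict(asdict(DirheadDefaults()), epochs=12)
    print(diff_from_defaults(run_config))
```

An empty result means the run matches the listed defaults. Any other result suggests the run should be reported under its own recorded ID rather than the default one.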

Reporting contract

- Report the control and treatment rows together in any DOSE trade-off plot or table.
- Reuse the benchmark metric names from `docs/18_benchmarks.md` and `zeroproofml.benchmarks.metrics`.
- Cite the canonical `config_id` for every plotted or tabulated follow-up run.
- When a row uses a checkpoint follow-up (`train_dose_dirhead_only.py`), report the follow-up summary next to the parent seed-level benchmark artifact instead of as a standalone appendix result.
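
The first and third rules of the contract can be checked mechanically before a figure or table is published. The sketch below is an illustration under an assumed row schema (dicts with `variant` and `config_id` keys); `check_report_rows` is a hypothetical helper, not part of the benchmark harness.

```python
def check_report_rows(rows: list[dict], claimed_pairs: list[tuple[str, str]]) -> list[str]:
    """Return human-readable problems; an empty list means the rows pass.

    rows: one dict per reported run, with 'variant' and 'config_id' keys (assumed schema).
    claimed_pairs: (control, treatment) variant names the figure or table compares.
    """
    by_variant = {row["variant"]: row for row in rows}
    problems = []
    for control, treatment in claimed_pairs:
        for name in (control, treatment):
            row = by_variant.get(name)
            if row is None:
                problems.append(f"missing row for {name!r} in pair {control!r} -> {treatment!r}")
            elif not row.get("config_id"):
                problems.append(f"{name!r} has no config_id")
    # Drop duplicates (a control shared by several pairs) while keeping order.
    return list(dict.fromkeys(problems))


if __name__ == "__main__":
    rows = [
        {"variant": "angular", "config_id": "dose-nextsteps-angular-0da83f140690"},
        {"variant": "angular_dirbalance", "config_id": "dose-nextsteps-angular_dirbalance-577a98ae7018"},
    ]
    print(check_report_rows(rows, [("angular", "angular_dirbalance")]))
```

Only the pairs a figure claims to compare are checked, because a control such as `curriculum` serves several treatments and need not appear with all of them in every plot.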