Inference & Deployment¶

Inference in ZeroProofML v0.6.0 uses strict SCM semantics: no stochastic thresholds and explicit ⊥ outputs for singular inputs.

Runtime Rules¶

Use a fixed τ_infer to decide when denominators are treated as singular (|Q| < τ_infer ⇒ ⊥).
When you know the training-time margin τ_train (typically τ_train > τ_infer), you can detect the training–inference gap region τ_infer ≤ |Q| < τ_train and treat it as numerically risky at deployment time (e.g., log it, trigger fallbacks, or tighten τ_infer).
No gradient policies are applied; forward behaviour matches the strict SCM decode rule used by zeroproofml.inference.mode.strict_inference.
Encode ⊥ as nan when bridging to IEEE-754 via zeroproofml.utils.ieee_bridge.to_ieee.
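
The rules above can be sketched in plain Python. This is a minimal illustration, not the library's strict_inference implementation: strict_decode_sketch is a hypothetical helper, it works on Python floats rather than tensors, and it assumes τ_infer > 0. Treating non-finite P or Q as bottom follows the fault-like bottom rule described later on this page.

```python
import math


def strict_decode_sketch(p_values, q_values, tau_infer, tau_train=None):
    """Hypothetical sketch of the strict decode rule (not the library code)."""
    if tau_infer <= 0:
        raise ValueError("this sketch assumes tau_infer > 0")
    decoded, bottom_mask, gap_mask = [], [], []
    for p, q in zip(p_values, q_values):
        # Non-finite numerators or denominators are treated as bottoms.
        non_finite = not (math.isfinite(p) and math.isfinite(q))
        # Strict gate: |Q| < tau_infer => bottom.
        is_bottom = non_finite or abs(q) < tau_infer
        # Gap region: tau_infer <= |Q| < tau_train (only if tau_train is known).
        in_gap = (
            tau_train is not None
            and not is_bottom
            and tau_infer <= abs(q) < tau_train
        )
        # Bottom is encoded as NaN when bridging to IEEE-754.
        decoded.append(math.nan if is_bottom else p / q)
        bottom_mask.append(is_bottom)
        gap_mask.append(in_gap)
    return decoded, bottom_mask, gap_mask


decoded, bottom_mask, gap_mask = strict_decode_sketch(
    [1.0, 2.0, 3.0, math.inf],
    [2.0, 1e-8, 5e-5, 1.0],
    tau_infer=1e-6,
    tau_train=1e-4,
)
```

In this example the second sample is bottom because |Q| < τ_infer, the third is finite but flagged as gap-region, and the fourth is bottom because P is non-finite.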

Flattened FRU artifacts may record domain_assumptions such as ("x", "nonzero") as part of the artifact contract. They declare the caller-side facts that justify local symbolic simplification: when simplification cancels a symbolic factor covered by one of those facts, cancellation_domain_assumptions records the subset used for that cancellation so audit/export code does not have to infer it from the reduced P/Q pair. The library audits and records these assumptions as caller contracts only: the runtime does not enforce them unless an explicit validator is attached before evaluating the artifact. Feeding inputs that violate a declared nonzero assumption is a user-contract violation and makes the flattened artifact's behavior undefined relative to its declared semantics. ZeroProofML deployment and audit flattening paths default to simplification_mode="scm_strict" so bottom-preserving SCM factors remain in the emitted artifact unless a caller explicitly opts into unsafe field-rational simplification.
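
Because the runtime does not enforce declared assumptions by itself, a caller-side check can be attached before evaluation. The following is a minimal sketch under stated assumptions: make_assumption_validator is a hypothetical helper (not a ZeroProofML API), inputs are passed as a plain name-to-float mapping, and only the "nonzero" assumption kind from the example above is handled.

```python
def make_assumption_validator(domain_assumptions):
    """Build a checker for (name, kind) pairs such as ("x", "nonzero").

    Hypothetical helper for illustration; it is not part of ZeroProofML.
    """
    supported = {"nonzero": lambda value: value != 0}
    for name, kind in domain_assumptions:
        if kind not in supported:
            raise ValueError(f"unsupported assumption kind: {kind!r}")

    def validate(inputs):
        # Return every declared assumption that the inputs violate.
        violations = []
        for name, kind in domain_assumptions:
            if name not in inputs or not supported[kind](inputs[name]):
                violations.append((name, kind))
        return violations

    return validate


validate = make_assumption_validator([("x", "nonzero")])
ok = validate({"x": 2.0})
bad = validate({"x": 0.0})
```

A caller would refuse to evaluate the flattened artifact whenever the returned list is non-empty, rather than rely on behavior that the declared semantics leave undefined.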

Exporting Models¶

TorchScript is a legacy compatibility path in this project (and deprecated in recent PyTorch): use zeroproofml.inference.script_module(model) only if you still need to serve an existing TorchScript consumer.
ONNX: use zeroproofml.inference.export_onnx_model(...) or zeroproofml.inference.export_bundle(...). This is the preferred deployment path and the one exercised by bundle validation and CI.
Checkpoints: saved via SCMTrainer.save_checkpoint, compatible with both SCM-only and projective graphs.

Bundle validation¶

export_bundle(...) writes a bundle with a machine-readable tensor contract:

model.onnx
metadata.json (includes tau_infer, optional tau_train, the strict output contract, strict_inference_exports, per-input/per-output names + shapes + dtypes, explicit batch-axis semantics, optional preprocessing/postprocessing contract identifiers, optional normalization metadata, and mask/provenance semantics)

The strict inference output contract is versioned via strict_inference_schema_version in metadata.json. Version 2 is the first schema where fault_mask, semantic_bottom_mask, and bottom_provenance are guaranteed-present strict-inference result attributes rather than experimental opt-ins. The richer metadata.json payload is separately versioned via top-level schema_name / schema_version; format_version still guards the minimal bundle layout, while strict_inference_schema_version guards the strict ONNX output contract. Schema-v1 bundles use ["decoded", "bottom_mask", "gap_mask"]. Schema-v2 stable strict bundles use ["decoded", "bottom_mask", "gap_mask", "fault_mask", "semantic_bottom_mask", "bottom_provenance"] in that order. For paper reproduction, zeroproofml==0.4.3 is the canonical package pin; use the matching release tag v0.4.3 when replaying published artifacts. The committed artifacts/paper_2026/manifest.json records that package version, and paper-era bundles should be interpreted under their recorded schema-v1 metadata instead of being reinterpreted as current schema-v2 provenance bundles.

Warning: bottom_mask is the authoritative bottom signal. Any consumer that ignores bottom_mask is violating the deployment contract. The decoded payload may contain NaN; a downstream stage may coerce it with nan_to_num, drop or rewrite it through JSON/CSV conversions, or otherwise turn an invalid payload into a finite-looking downstream value.
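
One way to keep the bottom signal intact across a JSON hop is to serialize validity explicitly instead of relying on the NaN payload. This is a minimal sketch, not a library helper: to_json_records and the "valid" / "value" field names are hypothetical, and the inputs are plain Python sequences.

```python
import json
import math


def to_json_records(decoded, bottom_mask):
    """Serialize decoded values with an explicit validity flag per sample."""
    records = []
    for value, is_bottom in zip(decoded, bottom_mask):
        if is_bottom:
            # Never forward the payload of a bottom sample.
            records.append({"valid": False, "value": None})
        else:
            records.append({"valid": True, "value": float(value)})
    # allow_nan=False makes the encoder raise if a NaN slips through
    # on a sample that was not marked as bottom.
    return json.dumps(records, allow_nan=False)


payload = to_json_records([0.5, math.nan], [False, True])
records = json.loads(payload)
```

With this shape a downstream consumer gates on "valid" and cannot mistake a coerced sentinel for a finite prediction.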

strict_inference_exports declares whether a bundle records the historical merged_only_masks contract, the deprecated experimental_provenance_outputs contract, or the promoted stable_provenance_outputs contract. New schema-v2 bundles declare stable_provenance_outputs; recorded schema-v1 merged_only_masks and deprecated experimental_provenance_outputs bundles remain valid under their own metadata. Eager Python strict-inference results expose stable provenance attributes without changing the three-field tuple prefix. When the export config enables provenance metadata, metadata.json also records a versioned provenance schema object so sidecar consumers can tell whether the diagnostic layout is split_masks or bottom_provenance. inputs / outputs record the observed export-time tensor signatures using "batch" for the leading batch axis, and batch_axis_semantics declares the same expectation separately so non-Python consumers can bind feeds safely. If deployment depends on external transforms, preprocessing_contract_id, postprocessing_contract_id, and normalization_metadata provide a stable place to declare that sidecar contract in-band.
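
A non-Python consumer can perform the same output-order check on the parsed metadata before binding tensors. The sketch below is illustrative only: check_output_order is a hypothetical helper, and it assumes that metadata.json stores strict_inference_schema_version as an integer and the declared order under a top-level output_names key. validate_bundle(...) remains the supported validator.

```python
SCHEMA_V1_OUTPUTS = ["decoded", "bottom_mask", "gap_mask"]
SCHEMA_V2_OUTPUTS = SCHEMA_V1_OUTPUTS + [
    "fault_mask",
    "semantic_bottom_mask",
    "bottom_provenance",
]


def check_output_order(metadata):
    """Return the expected output names, or raise if the metadata disagrees."""
    version = metadata.get("strict_inference_schema_version")
    if version == 1:
        expected = SCHEMA_V1_OUTPUTS
    elif version == 2:
        expected = SCHEMA_V2_OUTPUTS
    else:
        raise ValueError(f"unknown strict inference schema version: {version!r}")
    actual = list(metadata.get("output_names", []))
    if actual != expected:
        raise ValueError(f"output order mismatch: {actual} != {expected}")
    return list(expected)


# Assumed key layout; a real metadata.json carries many more fields.
names = check_output_order(
    {"strict_inference_schema_version": 2, "output_names": SCHEMA_V2_OUTPUTS}
)
```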

Validate a bundle before shipping:

from zeroproofml.inference import validate_bundle

validate_bundle("path/to/bundle_dir")

Load and execute the same bundle through ONNX Runtime. Python unpacking keeps the three-field compatibility prefix, while schema-v2 bundles also expose the provenance attributes on the returned result object:

from zeroproofml.inference import load_onnx_runtime_bundle

runtime = load_onnx_runtime_bundle(
    "path/to/bundle_dir",
    providers=["CPUExecutionProvider"],
)
result = runtime.run(x_numpy)
decoded, bottom_mask, gap_mask = result
fault_mask = result.fault_mask
semantic_bottom_mask = result.semantic_bottom_mask
bottom_provenance = result.bottom_provenance

load_onnx_runtime_bundle(...) validates metadata.json, opens model.onnx, and returns the same three-field unpacking prefix as the eager strict inference helpers. Under recorded schema-v2 metadata, the ONNX graph itself has the six named outputs declared in output_names, and the loader maps the provenance outputs back onto fault_mask, semantic_bottom_mask, and bottom_provenance.

Minimal C++ consumer¶

A lightweight header-only C++ wrapper lives at examples/cpp/zeroproofml_bundle.hpp, with a reference consumer at examples/cpp/minimal_bundle_consumer.cpp. StableStrictBundle uses ONNX Runtime's C++ API plus nlohmann/json to validate the current schema-v2 stable bundle contract before running model.onnx: bundle format/version, strict_inference_exports, mask_semantics, batch_axis_semantics, and the six-output provenance-bearing output_names order.

The same contract applies after readback: any consumer that ignores bottom_mask is violating the deployment contract. Gate decoded values with bottom_mask before using them, because NaN bottom sentinels can be coerced by nan_to_num, lost in JSON/CSV conversions, or otherwise made to look finite downstream.

For the strict-flattened head hygiene explanation behind fault_mask versus semantic_bottom_mask, see docs/34_masks_and_provenance_guide.md.

Build it against your local ONNX Runtime and nlohmann/json install:

c++ -std=c++17 examples/cpp/minimal_bundle_consumer.cpp \
  -I/path/to/onnxruntime/include \
  -I/path/to/nlohmann \
  -L/path/to/onnxruntime/lib \
  -lonnxruntime \
  -o minimal_bundle_consumer
./minimal_bundle_consumer path/to/bundle_dir

The example intentionally targets current stable_provenance_outputs schema-v2 bundles. validate_bundle(...) still accepts recorded schema-v1 merged_only_masks bundles, but the C++ wrapper is a minimal current-contract consumer. It builds a dummy batch-1 float32 input from the exported tensor signature, so it can show the end-to-end decoded / bottom_mask / gap_mask plus provenance readback without pulling in any Python runtime.

For robotics or embedded adopters that already consume ONNX Runtime from C++, prefer the header wrapper over copying the example. A pure C ABI remains deferred until there is a concrete non-C++ runtime consumer that needs it.

For a quick in-process parity check before shipping, run the bundle against the original wrapped model on a smoke sample:

from zeroproofml.inference import run_bundle_reference_smoke_test

summary = run_bundle_reference_smoke_test(
    "path/to/bundle_dir",
    wrapped_model,
    (x_batch,),
    providers=["CPUExecutionProvider"],
)
print(summary["decoded_max_abs_diff"])

run_bundle_reference_smoke_test(...) is a pure-Python reference runner: it loads the bundle through onnxruntime, feeds the smoke inputs in metadata order, compares the stable (decoded, bottom_mask, gap_mask) prefix and any available provenance outputs against the wrapped model, and raises immediately if the bundle drifts. It validates bottom_mask first and compares decoded values only on non-bottom entries, so bottom payload sentinels do not carry semantics.
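
The comparison order described above (masks first, then decoded values on non-bottom entries only) can be sketched as follows. This is not the library's runner: compare_strict_outputs and its atol parameter are hypothetical, and the inputs are plain Python sequences rather than tensors.

```python
import math


def compare_strict_outputs(ref_decoded, ref_bottom, got_decoded, got_bottom, atol=1e-5):
    """Compare bottom masks first, then decoded values where not bottom."""
    if list(ref_bottom) != list(got_bottom):
        raise AssertionError("bottom_mask mismatch between bundle and reference")
    max_diff = 0.0
    for ref, got, is_bottom in zip(ref_decoded, got_decoded, ref_bottom):
        if is_bottom:
            # Bottom payload sentinels carry no semantics; skip them.
            continue
        diff = abs(ref - got)
        if not math.isfinite(diff):
            raise AssertionError("non-finite decoded value on a non-bottom entry")
        max_diff = max(max_diff, diff)
    if max_diff > atol:
        raise AssertionError(f"decoded drift {max_diff} exceeds {atol}")
    return {"decoded_max_abs_diff": max_diff}


summary = compare_strict_outputs(
    [0.5, math.nan], [False, True],
    [0.5000001, 123.0], [False, True],
)
```

The second sample differs wildly in its payload (NaN versus 123.0), but it is bottom in both runs, so it does not contribute to the reported difference.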

Reference deployment (robotics RR IK)¶

An end-to-end reference path (train → bundle → strict inference → fallback → report) is provided as:

python scripts/reference_robotics_deployment.py --device cpu --epochs 2 --n-samples 6000

It writes a self-contained run directory under results/reference_deploy_robotics/ including bundle/, VALIDATION_REPORT.md, and a concise VALIDATION_REPORT.summary.json sidecar for operator handoff. Because the run also emits inference_summary.json next to the bundle, the validation report now auto-includes the measured runtime statistics plus a short fallback-policy routing summary for that deployment. To regenerate the same operator report from an existing bundle directory, run python -m zeroproofml.report bundle <bundle_dir>.

Current runs also include a versioned output_contract.json that records the reference deployment artifact paths plus the expected inference-summary fields. Deployment runs now also emit strict_inference_audit.json, a versioned machine-readable sidecar with per-batch provenance breakdowns plus aggregated strict-inference monitor state for operator audits.

inference_summary.json continues to record the basic bottom/gap/fallback rates plus a hybrid_path_metrics block that records the current SCM-only (strict_only) baseline, an unconstrained decode baseline, routing frequency, finite-sample MAE, and runtime deltas for the hybrid fallback paths relative to both baselines. It also emits fallback_family_benchmark, a compact comparison of strict reject-only, strict gate + analytic/DLS fallback, strict gate + learned-local-expert fallback, and provenance-aware analytic/DLS fallback when the corresponding data is available. When the held-out evaluation uses provenance diagnostics it also records a provenance_routing_comparison block plus provenance-aware hybrid metrics that separate fault-triggered fallback routes from semantic-triggered rejects (fault -> analytic fallback, semantic -> reject). Those runs now also emit a provenance_routing_materiality block that quantifies semantic-bottom misroute reduction, unsafe-accept reduction, accepted-sample coverage delta, and runtime overhead against the published robotics promotion thresholds. The same summary includes provenance_benchmark_evidence, which packages the merged-mask vs provenance-aware comparison and materiality result as a machine-readable review artifact. The block is still a single-run measurement; it does not satisfy the required review-artifact-plus-rerun gate by itself.

For operator-facing plots, zeroproofml.utils.viz.plot_workspace_rate_heatmaps can turn RR workspace coordinates plus bottom/gap/fallback masks into workspace heatmaps, plot_2d_mask_map(...) / plot_3d_mask_map(...) can scatter raw mask decisions over 2D or 3D coordinates, plot_route_to_solver_overlay(...) can mark where invalid samples route to the analytic solver versus stay rejected, plot_fallback_route_timeline(...) can plot per-batch route/reject rates with provenance tags, plot_monitoring_batch_summary(...) can plot monitor batch summaries, and plot_detj_stratified_metrics(...) can bucket fallback or tracking metrics by |det(J)|. The bottom/fallback panels stay provenance-aware when fault/semantic diagnostics are present.

You can also call the same reference path from Python and receive structured artifact paths back:

from zeroproofml.reference_robotics_deployment import (
    ReferenceRoboticsDeploymentConfig,
    load_reference_robotics_deployment_artifacts,
    run_reference_robotics_deployment,
)

artifacts = run_reference_robotics_deployment(
    ReferenceRoboticsDeploymentConfig(device="cpu", epochs=2, n_samples=6000)
)
print(artifacts.bundle_model_path)
print(artifacts.bundle_metadata["benchmark_snapshot_id"])

same_run = load_reference_robotics_deployment_artifacts(artifacts.out_root)
print(same_run.validation_report_path)

load_reference_robotics_deployment_artifacts(...) validates output_contract.json when present and otherwise falls back to the legacy fixed layout so previously generated runs remain readable.

That same bundle is now the intended ROS 2 RR IK demo input: the companion workspace ships launch/rr_ik_strict_inference.launch.py, config/rr_ik_strict_inference.yaml, and zeroproofml_ros.rr_ik_demo so the reference deployment artifact can be exercised either through the ROS graph or through a local runtime smoke check without rebuilding another bundle. The ROS node and launch path expose qos_preset: use low_latency_control for fresh control-loop samples or offline_batch_replay for reliable deterministic rosbag/batch replay.

The node also publishes a Foxglove/PlotJuggler-friendly std_msgs/msg/Float64MultiArray on telemetry_topic; domain configs set /rr_ik/strict_inference/telemetry and /dose/offline_batch/telemetry. VISUALIZATION_TELEMETRY_FIELDS defines the vector order for batch mask counts, per-batch and running bottom/gap/provenance rates, plus fallback metrics such as fallback_rate, fault_fallback_rate, semantic_fallback_rate, and routing counts/rates, with missing optional values encoded as NaN. For named exporter workflows, the ROS package also provides build_visualization_telemetry_row(...) and write_visualization_telemetry_csv(...), which flatten those same diagnostics into the stable VISUALIZATION_TELEMETRY_EXPORT_FIELDS CSV schema.

For RViz workflows, the ROS companion package also exposes build_workspace_heatmap_marker_overlay(...) / build_workspace_heatmap_marker_array_message(...), which convert the plot_workspace_rate_heatmaps(...) summary into MarkerArray-compatible CUBE_LIST overlays for RR-style workspace heatmaps. For single-sample RR IK debugging, build_rr_ik_result_marker_overlay(...) / build_rr_ik_result_marker_array_message(...) build a second RViz overlay that shows the current arm pose, requested workspace displacement, and resolved end-effector target from an rr_ik_demo or StrictInferenceResult payload.

For a concrete larger graph, launch rr_ik_strict_inference_graph.launch.py: it starts the same strict-inference node, publishes one deterministic RR IK Float64MultiArray sample with ros2 topic pub, and attaches one-shot result and telemetry echo subscribers. Disable those helper CLI nodes when replacing them with a planner, controller, rosbag replay, Foxglove, PlotJuggler, or another downstream consumer.

The same companion workspace now includes the non-robotics offline example launch/dose_offline_batch.launch.py, config/dose_offline_batch.yaml, and demo/dose_offline_batch_input.yaml, which put a DOSE bundle on /dose/offline_batch/* topics with offline_batch_replay as the default QoS.

Pattern: strict gate + direction head (censoring/regimes)¶

For 3-way censoring problems (below / in-range / above), you can combine:

1) strict bottom gating via |Q| < τ_infer; and 2) a 2-class direction head to disambiguate which censored regime applies when the sample is bottom.

import torch

from zeroproofml.inference import decode_strict_censored_3way

# Given projective head outputs (P, Q) and optional direction logits.
decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=dir_logits,  # shape (B, 2)
    orientation_signal=weak_sign_side_channel,  # optional finite side channel
)

If you also thread the provenance attributes from strict_inference(...), the same helper can keep the direction head focused on semantic bottoms while fault-like bottoms use only an explicit finite orientation side channel:

decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=dir_logits,
    orientation_signal=weak_sign_side_channel,
    fault_mask=fault_mask,
    semantic_bottom_mask=semantic_bottom_mask,
    bottom_provenance=bottom_provenance,
)

When P or Q is non-finite, strict inference treats the sample as a fault-like bottom. +inf and -inf are IEEE artifacts, not SCM carrier values; the infinity artifact itself populates fault_mask, not semantic_bottom_mask. Semantic-bottom provenance is reserved for independent finite threshold or validity-factor triggers. The helper therefore does not derive a censored direction from an IEEE scalar sign. Orientation near singular boundaries must be preserved before or alongside bottoming by an independently computed weak_sign side channel, an angular/projective head, or an auxiliary direction head. Typed semantic labels can carry the same contract when the label source, rather than the overflowed scalar payload, owns the orientation.

For complex values, weak_sign projects finite nonzero payloads to the unit circle and returns bottom unchanged; it is not a reason to treat IEEE infinities as valid decoded payloads. Operationally, a weak-sign tag computed from the decoded scalar payload shares that payload's fault status: if the decoded payload overflows or otherwise becomes non-finite, the derived tag is dropped with the payload fault. Keep orientation only in an independently computed weak-sign side channel, an angular/projective direction head, or a typed semantic label.
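
The weak-sign behavior described above can be illustrated with a small stdlib sketch. It is not the library's weak_sign: weak_sign_sketch is a hypothetical helper, bottom is represented by None, mapping zero to 0j is an assumption, and non-finite inputs are mapped to bottom to mirror the rule that a derived tag is dropped with the payload fault.

```python
import cmath

BOTTOM = None  # hypothetical stand-in for the SCM bottom element


def weak_sign_sketch(z):
    """Project a finite nonzero value onto the unit circle; pass bottom through."""
    if z is BOTTOM:
        return BOTTOM
    z = complex(z)
    if not cmath.isfinite(z):
        # IEEE infinities and NaNs are faults, not orientation carriers.
        return BOTTOM
    if z == 0:
        # Assumption for this sketch: zero has no direction, so return 0.
        return 0j
    # Using the phase avoids overflow when |z| is very large.
    return cmath.rect(1.0, cmath.phase(z))


unit = weak_sign_sketch(3 + 4j)
dropped = weak_sign_sketch(complex(float("inf"), 0.0))
```

Here unit is approximately 0.6 + 0.8j, while the infinite payload yields bottom instead of a sign.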

Monitoring + fallbacks¶

At deployment time, treat masks as authoritative and log them explicitly:

from zeroproofml.inference import StrictInferenceMonitor, reject_on_gap
from zeroproofml.utils.logging import JsonlLogger

# `wrapped` is an SCMInferenceWrapper in eval mode; `x` is an input batch.
decoded, bottom_mask, gap_mask = wrapped(x)  # strict outputs
decoded_safe, accept_mask = reject_on_gap(decoded, bottom_mask, gap_mask)

event_logger = JsonlLogger("strict_inference_events.jsonl")
mon = StrictInferenceMonitor(
    bundle_id="my_bundle",
    event_logger=event_logger,
    acceptance_rate_drift_threshold=0.1,
)
mon.update(bottom_mask, gap_mask)
rates = mon.rates()
state = mon.export_state(
    include_histograms=True,
    include_batch_summaries=True,
)  # optional counts, histograms, and per-batch summaries

When an event_logger is configured, StrictInferenceMonitor.update(...) emits structured fault-bottom-trigger, semantic-bottom-trigger, numerical-hazard-trigger, gap-trigger, and acceptance-rate-drift records. The stable JsonlLogger works well for this because each event is just a JSON-serializable mapping. Treat sustained high fault_rate as a non-finite/runtime debugging signal, not a model-quality signal, and set alerting thresholds separately from semantic_bottom_rate. If you compute a finite tiny-denominator hazard mask outside the stable strict output tuple, pass it as numerical_hazard_mask; alert on numerical_hazard_rate separately from both provenance rates.
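
To make the separate-threshold advice concrete, here is a minimal running-rate tracker. It is a sketch, not StrictInferenceMonitor: MaskRateTracker and check_alerts are hypothetical names, masks are plain Python lists of booleans, and the thresholds in the usage lines are arbitrary illustration values.

```python
class MaskRateTracker:
    """Accumulate mask counts across batches and report running rates."""

    def __init__(self):
        self.total = 0
        self.provenance_total = 0
        self.counts = {"bottom": 0, "gap": 0, "fault": 0, "semantic_bottom": 0}

    def update(self, bottom_mask, gap_mask, fault_mask=None, semantic_bottom_mask=None):
        self.total += len(bottom_mask)
        self.counts["bottom"] += sum(bool(b) for b in bottom_mask)
        self.counts["gap"] += sum(bool(g) for g in gap_mask)
        if fault_mask is not None and semantic_bottom_mask is not None:
            # Provenance rates only cover batches that supplied both masks.
            self.provenance_total += len(fault_mask)
            self.counts["fault"] += sum(bool(f) for f in fault_mask)
            self.counts["semantic_bottom"] += sum(bool(s) for s in semantic_bottom_mask)

    def rates(self):
        out = {}
        if self.total:
            out["bottom_rate"] = self.counts["bottom"] / self.total
            out["gap_rate"] = self.counts["gap"] / self.total
        if self.provenance_total:
            out["fault_rate"] = self.counts["fault"] / self.provenance_total
            out["semantic_bottom_rate"] = (
                self.counts["semantic_bottom"] / self.provenance_total
            )
        return out


def check_alerts(rates, fault_threshold, semantic_threshold):
    """Alert on fault and semantic rates independently of each other."""
    alerts = []
    if rates.get("fault_rate", 0.0) > fault_threshold:
        alerts.append("fault_rate")
    if rates.get("semantic_bottom_rate", 0.0) > semantic_threshold:
        alerts.append("semantic_bottom_rate")
    return alerts


tracker = MaskRateTracker()
tracker.update(
    [True, False, False, True],
    [False, True, False, False],
    fault_mask=[True, False, False, False],
    semantic_bottom_mask=[False, False, False, True],
)
rates = tracker.rates()
alerts = check_alerts(rates, fault_threshold=0.1, semantic_threshold=0.5)
```

Both provenance rates are 0.25 here, but only fault_rate crosses its (tighter) threshold, which is the point of keeping the two alerts separate.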

Mask provenance attributes¶

Eager Python strict-inference results expose the stable bottom provenance split without changing the three-field unpacking path:

from zeroproofml.inference import (
    InferenceConfig,
    SCMInferenceWrapper,
    StrictInferenceMonitor,
    strict_inference,
)

cfg = InferenceConfig(tau_infer=1e-6, tau_train=1e-4)
result = strict_inference(P, Q, config=cfg)

fault_mask = result.fault_mask
semantic_bottom_mask = result.semantic_bottom_mask
bottom_provenance = result.bottom_provenance
decoded, bottom_mask, gap_mask = result  # still the stable three-field unpack

wrapped = SCMInferenceWrapper(model, config=cfg).eval()
wrapped_result = wrapped(x)

mon = StrictInferenceMonitor(bundle_id="my_bundle")
mon.update(
    wrapped_result.bottom_mask,
    wrapped_result.gap_mask,
    fault_mask=getattr(wrapped_result, "fault_mask", None),
    semantic_bottom_mask=getattr(wrapped_result, "semantic_bottom_mask", None),
    bottom_provenance=getattr(wrapped_result, "bottom_provenance", None),
)
rates = mon.rates()  # includes fault_rate / semantic_bottom_rate when present
state = mon.export_state(include_histograms=True, include_batch_summaries=True)

bottom_mask remains the authoritative fail-closed mask and equals fault_mask | semantic_bottom_mask. The richer diagnostics live on the same result object without changing stable unpacking. Consumers can route on bottom_mask for all bottoms, on fault_mask or semantic_bottom_mask for provenance-specific handling, or on bottom_provenance when split masks are not used; the enum carries the same finite/fault/semantic/mixed information. Non-finite P or Q payloads set bottom_mask and route through fault_mask; semantic bottoms are reserved for strict threshold-triggered bottoms.

When those diagnostics are threaded into decode_strict_censored_3way(...), the current prototype uses the direction head only for semantic bottoms. Fault-like bottoms use orientation_signal when supplied and otherwise carry the neutral class while bottom_mask remains the authoritative rejection signal.

route_to_analytic_solver(...) now accepts the same optional provenance diagnostics plus an optional bottom_mask; when supplied, semantic bottoms are left rejected in-place while fault-like bottoms and non-semantic invalids can still be routed to an analytic fallback. If you pass the same event_logger, the helper also emits structured fallback-routing records with a provenance_tag field so routed fault-like bottoms can be distinguished from unlabeled routes. StrictInferenceMonitor.export_state(...) can also include a batch_summaries list with per-batch bottom/gap/acceptance rates and provenance tag counts when diagnostics are supplied.

TorchScript exports continue to use the stable decoded, bottom_mask, gap_mask contract. Schema-v2 ONNX bundle exports use the six-output provenance-bearing contract declared above. If you export a bundle from a provenance-configured wrapper, the metadata.json sidecar records the versioned provenance schema declaration.
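
The mask relationships and the fault/semantic routing split can be sketched per sample as follows. This is illustrative only: ProvenanceSketch and route_sample are hypothetical, the enum values are not the library's encoding, and rejecting mixed-provenance samples is a conservative assumption rather than documented behavior.

```python
from enum import Enum


class ProvenanceSketch(Enum):
    # Hypothetical labels; the library's enum encoding may differ.
    FINITE = "finite"
    FAULT = "fault"
    SEMANTIC = "semantic"
    MIXED = "mixed"


def provenance_from_masks(fault, semantic):
    """Collapse the two split masks into one provenance label."""
    if fault and semantic:
        return ProvenanceSketch.MIXED
    if fault:
        return ProvenanceSketch.FAULT
    if semantic:
        return ProvenanceSketch.SEMANTIC
    return ProvenanceSketch.FINITE


def route_sample(fault, semantic):
    """Semantic bottoms stay rejected; fault-only bottoms may use a fallback."""
    bottom = fault or semantic  # bottom_mask == fault_mask | semantic_bottom_mask
    if not bottom:
        return "accept"
    if semantic:
        return "reject"  # includes mixed samples (assumption of this sketch)
    return "analytic_fallback"


cases = [(False, False), (True, False), (False, True), (True, True)]
labels = [provenance_from_masks(f, s) for f, s in cases]
routes = [route_sample(f, s) for f, s in cases]
```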

The legacy experimental_provenance override is retained for call-site compatibility, but eager Python result objects expose the provenance attributes regardless of that flag.

Stable Python contract¶

The v0.6.x mask set is now the working strict-inference result contract:

The stable Python output tuple remains (decoded, bottom_mask, gap_mask). bottom_mask stays authoritative for reject/fallback decisions.
Extra fields such as fault_mask, semantic_bottom_mask, and bottom_provenance are guaranteed attributes on eager Python result objects.
TorchScript exports continue to ship only the stable decoded, bottom_mask, and gap_mask outputs. Schema-v2 ONNX bundle exports ship those three outputs plus fault_mask, semantic_bottom_mask, and bottom_provenance.

Historical promotion criteria¶

The earlier promotion gate came from the design memo in 22_bottom_mask_provenance_design.md. It is retained here as historical context for bundle/export promotion criteria. The Q2 decision memo in 24_bottom_mask_provenance_q2_decision.md records the old keep-experimental bundle/export decision. The gate expected a like-for-like Q2 review on the same commit, bundle, and evaluation inputs:

Stable-contract non-regression: with provenance disabled, merged bottom_mask / gap_mask behavior stays unchanged.
At least one downstream win clears its threshold:
DOSE calibration improves false_in_range_rate_on_censored or false_censored_rate_on_in_range by >=10% relative at matched coverage within 1 percentage point.
Robotics fallback reduces unsafe accepts or misroutes by >=25% relative while accepted-sample coverage stays within 1 percentage point.
Composability/reporting reduces unresolved bottom/fallback ambiguity by >=15% relative with no more than 1 percentage point valid-coverage loss.
Operational cost stays bounded: the provenance path adds no more than 5% wall-clock overhead and does not require a stable ONNX or metadata contract change before approval.
Evidence is repeatable: the winning use case clears the threshold in the recorded Q2 review artifact and at least one rerun with the same measurement recipe.

At the time of that review, the fallback was to keep provenance experimental with a narrower scope rather than silently promote a weak contract.

The recorded Q2 decision was keep experimental for bundle/export promotion. The v0.6.x eager Python result contract promotes the provenance attributes, and schema-v2 deployment bundle outputs carry the same provenance through ONNX. The historical Q2 gate is therefore superseded for the v0.6.x fault/semantic mask set: fault_mask, semantic_bottom_mask, and bottom_provenance are stable result attributes, and the old criteria are not retroactively re-applied to that promotion.

Choosing `τ_infer` (post-hoc sweep)¶

If your strict gate is of the form |Q| < τ_infer, you can evaluate safety trade-offs post-hoc by sweeping τ_infer over cached |Q| values:

from zeroproofml.metrics import tau_infer_sweep_from_q_abs

# q_abs: cached |Q|, is_in_range: boolean ground-truth mask
curves = tau_infer_sweep_from_q_abs(q_abs, is_in_range=is_in_range, taus=[1e-6, 1e-5, 1e-4])

Use write_tau_infer_sweep(...) to save both a JSON payload and a compact Markdown report.

Typical trade-off: increasing τ_infer reduces false in-range predictions on truly censored points, but can increase false censoring on truly in-range points. When provenance splits are available from strict inference, the DOSE operating-point report also treats fault bottoms more strictly than semantic bottoms via fault_rate + 0.5 * semantic_bottom_rate, so tau_infer selection can distinguish numerical failures from semantic rejects.
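
The trade-off can be reproduced on toy data with a few lines of plain Python. This sketch does not mirror the return format of tau_infer_sweep_from_q_abs: tau_sweep_sketch and provenance_weighted_bottom_cost are hypothetical helpers, "censored" is assumed to mean "not in range", and the sample values are made up for illustration.

```python
def tau_sweep_sketch(q_abs, is_in_range, taus):
    """Compute both error rates for each candidate tau_infer."""
    n_in_range = sum(1 for r in is_in_range if r)
    n_censored = len(is_in_range) - n_in_range
    rows = []
    for tau in taus:
        # Censored samples that pass the gate (|Q| >= tau) look in-range.
        false_in_range = sum(
            1 for q, r in zip(q_abs, is_in_range) if not r and q >= tau
        )
        # In-range samples that fail the gate (|Q| < tau) are censored.
        false_censored = sum(
            1 for q, r in zip(q_abs, is_in_range) if r and q < tau
        )
        rows.append(
            {
                "tau_infer": tau,
                "false_in_range_rate_on_censored": (
                    false_in_range / n_censored if n_censored else 0.0
                ),
                "false_censored_rate_on_in_range": (
                    false_censored / n_in_range if n_in_range else 0.0
                ),
            }
        )
    return rows


def provenance_weighted_bottom_cost(fault_rate, semantic_bottom_rate):
    # Weighting quoted in the text: fault bottoms count fully, semantic half.
    return fault_rate + 0.5 * semantic_bottom_rate


# Toy data: two censored samples with small |Q|, three in-range samples.
q_abs = [1e-7, 5e-6, 5e-5, 1e-2, 2e-2]
is_in_range = [False, False, True, True, True]
rows = tau_sweep_sketch(q_abs, is_in_range, taus=[1e-6, 1e-5, 1e-4])
```

On this toy set the smallest threshold lets one censored sample through, the middle one separates the groups cleanly, and the largest starts censoring an in-range sample.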

Named operating points¶

These are named deployment presets for selecting thresholds + fallback behavior. They are not magic values: use a post-hoc sweep (or a held-out calibration set) to set the actual τ_infer/τ_train for your domain.

The DOSE benchmark harness now writes aggregated/dose_operating_points.{json,md} for completed runs so these names are backed by recorded FP_cens/FN_in/direction-F1 measurements plus the run's threshold values. An optional balanced candidate is evaluated there too, but it is only promoted when it selects a distinct winner; otherwise the public preset set remains safety_first, direction_aware, and accuracy_first. When provenance split metrics are present, that report also records a provenance-weighted bottom-cost signal so semantic bottoms can count less heavily than fault bottoms in the tau_infer calibration tie-breaks.

`safety_first`¶

Goal: minimize unsafe accepts by rejecting both strict-bottom and gap-region samples.

from zeroproofml.inference import InferenceConfig, reject_on_gap, safe_sentinel, strict_inference

cfg = InferenceConfig(tau_infer=1e-4, tau_train=1e-3)
decoded, bottom_mask, gap_mask = strict_inference(P, Q, config=cfg)

decoded, accept_mask = reject_on_gap(decoded, bottom_mask, gap_mask)
decoded = safe_sentinel(decoded, ~accept_mask, sentinel=float("nan"))

Trade-off: lower false-finite-on-censored (or “unsafe accept”) at the cost of higher rejection / false-censoring rates.

`direction_aware`¶

Goal: when bottoms happen, keep the censored direction meaningful by using an auxiliary direction head.

from zeroproofml.inference import decode_strict_censored_3way

decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=dir_logits,  # (B, 2)
)

Trade-off: adds an extra head (and training/eval complexity), but produces more actionable outputs on censored / singular inputs.

`accuracy_first`¶

Goal: maximize coverage by keeping τ_infer tight and treating the gap region as “monitor-only”.

from zeroproofml.inference import StrictInferenceMonitor, reject_on_bottom

decoded, bottom_mask, gap_mask = wrapped(x)
decoded, accept_mask = reject_on_bottom(decoded, bottom_mask)

mon = StrictInferenceMonitor(bundle_id="my_bundle")
mon.update(bottom_mask, gap_mask)  # log rates; provenance keywords stay optional

Trade-off: higher coverage and in-range accuracy when things are well-behaved, but more risk exposure near the training–inference gap unless you monitor and gate carefully.
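
The coverage difference between safety_first and accuracy_first comes down to whether the gap mask participates in the accept decision. The sketch below shows that on plain Python lists; accept_mask_for_preset and coverage are hypothetical helpers that only mimic the accept logic described for reject_on_gap and reject_on_bottom, and direction_aware is left out because it changes decoding rather than acceptance.

```python
def accept_mask_for_preset(preset, bottom_mask, gap_mask):
    """Return per-sample accept decisions for a named operating point."""
    if preset == "safety_first":
        # Reject both strict bottoms and gap-region samples.
        return [not (b or g) for b, g in zip(bottom_mask, gap_mask)]
    if preset == "accuracy_first":
        # Reject strict bottoms only; the gap region is monitor-only.
        return [not b for b in bottom_mask]
    raise ValueError(f"preset not covered by this sketch: {preset!r}")


def coverage(accept_mask):
    """Fraction of samples that were accepted."""
    return sum(accept_mask) / len(accept_mask)


bottom_mask = [True, False, False, False]
gap_mask = [False, True, False, False]

safe = accept_mask_for_preset("safety_first", bottom_mask, gap_mask)
loose = accept_mask_for_preset("accuracy_first", bottom_mask, gap_mask)
```

On these four samples safety_first accepts two and accuracy_first accepts three; the extra accepted sample is exactly the gap-region one that should then be monitored.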

Safety Checklist¶

Validate coverage on a held-out set using losses.coverage.coverage before shipping.
Monitor the rate of ⊥ outputs in production; aggressive rejection loss during training usually lowers this.
For robotics or control, keep orientation in a finite weak-sign, angular/projective, typed-label, or direction-head side channel; do not rely on +inf / -inf decoded payloads as orientation carriers.