Inference & Deployment¶
Inference in ZeroProofML v0.6.0 uses strict SCM semantics: no stochastic thresholds and explicit ⊥ outputs for singular inputs.
Runtime Rules¶
- Use a fixed τ_infer to decide when denominators are treated as singular (|Q| < τ_infer ⇒ ⊥).
- When you know the training-time margin τ_train (typically τ_train > τ_infer), you can detect the training–inference gap region τ_infer ≤ |Q| < τ_train and treat it as numerically risky at deployment time (e.g., log it, trigger fallbacks, or tighten τ_infer).
- No gradient policies are applied; forward behaviour matches the strict SCM decode rule used by zeroproofml.inference.mode.strict_inference.
- Encode ⊥ as nan when bridging to IEEE-754 via zeroproofml.utils.ieee_bridge.to_ieee.
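The runtime rules can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: strict_gate is a hypothetical helper name, and the scalar (non-tensor) form is an assumption made to keep the example self-contained. It marks a sample as ⊥ when |Q| < τ_infer or when P or Q is non-finite, flags the gap region τ_infer ≤ |Q| < τ_train, and encodes ⊥ as NaN.

```python
import math

def strict_gate(p, q, tau_infer, tau_train=None):
    """Return (decoded, is_bottom, in_gap) for one scalar P/Q pair."""
    # Non-finite inputs or a denominator inside the singular band are bottom.
    is_bottom = (
        not math.isfinite(p)
        or not math.isfinite(q)
        or abs(q) < tau_infer
    )
    # Gap region: tau_infer <= |Q| < tau_train (only when tau_train is known).
    in_gap = (
        not is_bottom
        and tau_train is not None
        and abs(q) < tau_train
    )
    # Bottom is bridged to IEEE-754 as NaN; the mask stays the real signal.
    decoded = math.nan if is_bottom else p / q
    return decoded, is_bottom, in_gap

for q in (1e-9, 5e-5, 0.5):
    print(q, strict_gate(1.0, q, tau_infer=1e-6, tau_train=1e-4))
```

The fixed threshold makes the decision deterministic: the same (P, Q) pair always lands in the same region, which is what lets the masks be logged and audited.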
Flattened FRU artifacts may record domain_assumptions such as
("x", "nonzero") as part of the artifact contract. They declare the
caller-side facts that justify local symbolic simplification: when
simplification cancels a symbolic factor covered by one of those facts,
cancellation_domain_assumptions records the subset used for that cancellation
so audit/export code does not have to infer it from the reduced P/Q pair.
The library audits and records these assumptions as caller contracts only: the
runtime does not enforce them unless an explicit validator is attached
before evaluating the artifact. Feeding inputs that violate a declared nonzero
assumption is a user-contract violation and makes the flattened artifact's
behavior undefined relative to its declared semantics.
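As an illustration of what an explicit validator could look like, the sketch below checks declared nonzero assumptions against a mapping of named inputs before evaluation. The function name, the input mapping, and the choice to raise ValueError are assumptions for this example; the library's actual validator hook is not shown here.

```python
def check_domain_assumptions(inputs, domain_assumptions):
    """Raise ValueError when an input violates a declared assumption."""
    for name, fact in domain_assumptions:
        if fact != "nonzero":
            # Only the "nonzero" fact from the text is handled in this sketch.
            raise ValueError(f"unsupported assumption {fact!r} for {name!r}")
        if name not in inputs:
            raise ValueError(f"missing input {name!r}")
        if inputs[name] == 0:
            raise ValueError(f"input {name!r} violates its nonzero assumption")

assumptions = [("x", "nonzero")]
check_domain_assumptions({"x": 2.5}, assumptions)  # passes silently
try:
    check_domain_assumptions({"x": 0.0}, assumptions)
except ValueError as exc:
    print(exc)
```

Running a check like this before evaluation turns a silent contract violation into an explicit error at the call site.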
ZeroProofML deployment and audit flattening paths default to
simplification_mode="scm_strict" so bottom-preserving SCM factors remain in
the emitted artifact unless a caller explicitly opts into unsafe
field-rational simplification.
Exporting Models¶
- TorchScript is a legacy compatibility path in this project (and deprecated in recent PyTorch): use zeroproofml.inference.script_module(model) only if you still need to serve an existing TorchScript consumer.
- ONNX: use zeroproofml.inference.export_onnx_model(...) or zeroproofml.inference.export_bundle(...). This is the preferred deployment path and the one exercised by bundle validation and CI.
- Checkpoints: saved via SCMTrainer.save_checkpoint, compatible with both SCM-only and projective graphs.
Bundle validation¶
export_bundle(...) writes a bundle with a machine-readable tensor contract:
- model.onnx
- metadata.json (includes tau_infer, optional tau_train, the strict output contract, strict_inference_exports, per-input/per-output names + shapes + dtypes, explicit batch-axis semantics, optional preprocessing/postprocessing contract identifiers, optional normalization metadata, and mask/provenance semantics)
The strict inference output contract is versioned via strict_inference_schema_version in metadata.json.
Version 2 is the first schema where fault_mask, semantic_bottom_mask, and
bottom_provenance are guaranteed-present strict-inference result attributes
rather than experimental opt-ins.
The richer metadata.json payload is separately versioned via top-level
schema_name / schema_version; format_version still guards the minimal
bundle layout, while strict_inference_schema_version guards the strict ONNX
output contract. Schema-v1 bundles use
["decoded", "bottom_mask", "gap_mask"]. Schema-v2 stable strict bundles use
["decoded", "bottom_mask", "gap_mask", "fault_mask",
"semantic_bottom_mask", "bottom_provenance"] in that order.
For paper reproduction, zeroproofml==0.4.3 is the canonical package pin; use
the matching release tag v0.4.3 when replaying published artifacts. The
committed artifacts/paper_2026/manifest.json records that package version, and
paper-era bundles should be interpreted under their recorded schema-v1 metadata
instead of being reinterpreted as current schema-v2 provenance bundles.
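A consumer that wants to interpret each bundle under its own recorded schema can dispatch on strict_inference_schema_version. The following is a rough sketch, not the library's validator: it assumes the version is stored as an integer and that output_names is a top-level key of metadata.json, and the trimmed payload is invented for illustration.

```python
import json

# Output orders quoted in the text, keyed by strict_inference_schema_version.
EXPECTED_OUTPUT_NAMES = {
    1: ["decoded", "bottom_mask", "gap_mask"],
    2: ["decoded", "bottom_mask", "gap_mask", "fault_mask",
        "semantic_bottom_mask", "bottom_provenance"],
}

def check_output_order(metadata):
    """Raise ValueError if output_names does not match the declared schema."""
    version = metadata["strict_inference_schema_version"]
    expected = EXPECTED_OUTPUT_NAMES.get(version)
    if expected is None:
        raise ValueError(f"unknown strict inference schema version: {version!r}")
    actual = list(metadata["output_names"])
    if actual != expected:
        raise ValueError(f"output order mismatch: {actual} != {expected}")
    return expected

# Hypothetical, heavily trimmed metadata.json payload for illustration.
metadata = json.loads(
    '{"strict_inference_schema_version": 1, '
    '"output_names": ["decoded", "bottom_mask", "gap_mask"]}'
)
print(check_output_order(metadata))
```

For real bundles, validate_bundle(...) remains the supported check; the sketch only shows why a schema-v1 bundle should never be read with the six-output layout.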
Warning:
bottom_mask is the authoritative bottom signal. Any consumer that ignores bottom_mask is violating the deployment contract. The decoded payload may contain NaN; a downstream stage may coerce it with nan_to_num, drop or rewrite it through JSON/CSV conversions, or otherwise turn an invalid payload into a finite-looking downstream value.
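One way to keep the bottom signal intact across a JSON boundary is to serialize the mask explicitly instead of relying on the NaN sentinel. The sketch below uses only the Python standard library; to_json_records and the record layout are hypothetical names chosen for this example.

```python
import json
import math

def to_json_records(decoded, bottom_mask):
    """Serialize decoded values with the bottom flag carried explicitly."""
    records = []
    for value, is_bottom in zip(decoded, bottom_mask):
        # Gate on the mask, not on the payload: bottoms never ship a number.
        records.append({
            "value": None if is_bottom else value,
            "bottom": bool(is_bottom),
        })
    # allow_nan=False raises ValueError if a NaN slipped past the mask,
    # instead of writing a bare NaN token that other parsers may mangle.
    return json.dumps(records, allow_nan=False)

print(to_json_records([0.25, math.nan, 3.0], [False, True, False]))
```

Because the flag travels with every record, a downstream consumer does not need to guess whether a missing or zero value was originally a bottom.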
strict_inference_exports declares whether a bundle records the historical
merged_only_masks contract, the deprecated experimental_provenance_outputs
contract, or the promoted stable_provenance_outputs contract. New schema-v2
bundles declare stable_provenance_outputs; recorded schema-v1
merged_only_masks and deprecated experimental_provenance_outputs bundles
remain valid under their own metadata. Eager Python strict-inference results
expose stable provenance attributes without changing the three-field tuple
prefix.
When the export config enables provenance metadata, metadata.json also records
a versioned provenance schema object so sidecar consumers can tell whether the
diagnostic layout is split_masks or bottom_provenance.
inputs / outputs record the observed export-time tensor signatures using
"batch" for the leading batch axis, and batch_axis_semantics declares the
same expectation separately so non-Python consumers can bind feeds safely.
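A non-Python consumer typically needs to compare a concrete feed shape against the recorded signature. The sketch below shows that comparison in Python for brevity. The entry layout (name, shape, dtype keys) and the rule that the batch axis accepts any size of at least 1 are assumptions for illustration, not the documented metadata format.

```python
def shape_matches(declared, actual):
    """Compare a declared signature such as ["batch", 3] with a concrete shape."""
    if len(declared) != len(actual):
        return False
    for want, got in zip(declared, actual):
        if want == "batch":
            # The symbolic batch axis accepts any positive size.
            if got < 1:
                return False
        elif want != got:
            return False
    return True

# Hypothetical input entry; the real metadata.json layout may differ.
entry = {"name": "x", "shape": ["batch", 3], "dtype": "float32"}
print(shape_matches(entry["shape"], (8, 3)))   # True
print(shape_matches(entry["shape"], (8, 4)))   # False
print(shape_matches(entry["shape"], (3,)))     # False
```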
If deployment depends on external transforms, preprocessing_contract_id,
postprocessing_contract_id, and normalization_metadata provide a stable
place to declare that sidecar contract in-band.
Validate a bundle before shipping:
from zeroproofml.inference import validate_bundle
validate_bundle("path/to/bundle_dir")
Load and execute the same bundle through ONNX Runtime. Python unpacking keeps the three-field compatibility prefix, while schema-v2 bundles also expose the provenance attributes on the returned result object:
from zeroproofml.inference import load_onnx_runtime_bundle
runtime = load_onnx_runtime_bundle(
    "path/to/bundle_dir",
    providers=["CPUExecutionProvider"],
)
result = runtime.run(x_numpy)
decoded, bottom_mask, gap_mask = result
fault_mask = result.fault_mask
semantic_bottom_mask = result.semantic_bottom_mask
bottom_provenance = result.bottom_provenance
load_onnx_runtime_bundle(...) validates metadata.json, opens model.onnx,
and returns the same three-field unpacking prefix as the eager strict inference
helpers. Under recorded schema-v2 metadata, the ONNX graph itself has the six
named outputs declared in output_names, and the loader maps the provenance
outputs back onto fault_mask, semantic_bottom_mask, and
bottom_provenance.
Minimal C++ consumer¶
A lightweight header-only C++ wrapper lives at
examples/cpp/zeroproofml_bundle.hpp, with a reference consumer at
examples/cpp/minimal_bundle_consumer.cpp. StableStrictBundle uses ONNX
Runtime's C++ API plus nlohmann/json to validate the current schema-v2 stable
bundle contract before running model.onnx: bundle format/version,
strict_inference_exports, mask_semantics, batch_axis_semantics, and the
six-output provenance-bearing output_names order.
The same contract applies after readback: any consumer that ignores
bottom_mask is violating the deployment contract. Gate decoded values with
bottom_mask before using them, because NaN bottom sentinels can be coerced
by nan_to_num, lost in JSON/CSV conversions, or otherwise made to look finite
downstream.
For the strict-flattened head hygiene explanation behind fault_mask versus
semantic_bottom_mask, see
docs/34_masks_and_provenance_guide.md.
Build it against your local ONNX Runtime and nlohmann/json install:
c++ -std=c++17 examples/cpp/minimal_bundle_consumer.cpp \
    -I/path/to/onnxruntime/include \
    -I/path/to/nlohmann \
    -L/path/to/onnxruntime/lib \
    -lonnxruntime \
    -o minimal_bundle_consumer
./minimal_bundle_consumer path/to/bundle_dir
The example intentionally targets current stable_provenance_outputs
schema-v2 bundles. validate_bundle(...) still accepts recorded schema-v1
merged_only_masks bundles, but the C++ wrapper is a minimal current-contract
consumer. It builds
a dummy batch-1 float32 input from the exported tensor signature to show the
end-to-end decoded / bottom_mask / gap_mask plus provenance readback
without pulling in any Python runtime.
For robotics or embedded adopters that already consume ONNX Runtime from C++, prefer the header wrapper over copying the example. A pure C ABI remains deferred until there is a concrete non-C++ runtime consumer that needs it.
For a quick in-process parity check before shipping, run the bundle against the original wrapped model on a smoke sample:
from zeroproofml.inference import run_bundle_reference_smoke_test
summary = run_bundle_reference_smoke_test(
    "path/to/bundle_dir",
    wrapped_model,
    (x_batch,),
    providers=["CPUExecutionProvider"],
)
print(summary["decoded_max_abs_diff"])
run_bundle_reference_smoke_test(...) is a pure-Python reference runner: it
loads the bundle through onnxruntime, feeds the smoke inputs in metadata
order, compares the stable (decoded, bottom_mask, gap_mask) prefix and any
available provenance outputs against the wrapped model, and raises immediately
if the bundle drifts. It validates bottom_mask first and compares decoded
values only on non-bottom entries, so bottom payload sentinels do not carry
semantics.
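The mask-first comparison order described above can be sketched in a few lines. This is an illustration of the idea, not the code behind run_bundle_reference_smoke_test(...): the function name, the list-based inputs, and the atol tolerance are all assumptions.

```python
import math

def compare_strict_outputs(ref_decoded, ref_bottom, got_decoded, got_bottom, atol=1e-6):
    """Return the max abs diff over non-bottom entries; raise on any drift."""
    # Step 1: the bottom masks must agree exactly.
    if list(ref_bottom) != list(got_bottom):
        raise AssertionError("bottom_mask mismatch")
    # Step 2: compare payloads only where the sample is not bottom.
    max_diff = 0.0
    for ref, got, is_bottom in zip(ref_decoded, got_decoded, ref_bottom):
        if is_bottom:
            continue  # sentinel payloads (NaN or otherwise) carry no semantics
        diff = abs(ref - got)
        if not math.isfinite(diff):
            raise AssertionError("non-finite payload outside bottom_mask")
        max_diff = max(max_diff, diff)
    if max_diff > atol:
        raise AssertionError(f"decoded drift {max_diff} exceeds {atol}")
    return max_diff

diff = compare_strict_outputs(
    [1.0, math.nan, 2.0], [False, True, False],
    [1.0, 0.0, 2.0], [False, True, False],
)
print(diff)  # 0.0
```

Note that the two runs disagree on the bottom payload (NaN versus 0.0) and still compare equal, because only the mask decides whether that entry is valid.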
Reference deployment (robotics RR IK)¶
An end-to-end reference path (train → bundle → strict inference → fallback → report) is provided as:
python scripts/reference_robotics_deployment.py --device cpu --epochs 2 --n-samples 6000
It writes a self-contained run directory under results/reference_deploy_robotics/ including bundle/, VALIDATION_REPORT.md, and a concise VALIDATION_REPORT.summary.json sidecar for operator handoff. Because the run also emits inference_summary.json next to the bundle, the validation report now auto-includes the measured runtime statistics plus a short fallback-policy routing summary for that deployment. To regenerate the same operator report from an existing bundle directory, run python -m zeroproofml.report bundle <bundle_dir>.
Current runs also include a versioned output_contract.json that records the
reference deployment artifact paths plus the expected inference-summary fields.
Deployment runs now also emit strict_inference_audit.json, a versioned
machine-readable sidecar with per-batch provenance breakdowns plus aggregated
strict-inference monitor state for operator audits.
inference_summary.json continues to record the basic bottom/gap/fallback
rates plus a hybrid_path_metrics block that records the current SCM-only
(strict_only) baseline, an unconstrained decode baseline, routing frequency,
finite-sample MAE, and runtime deltas for the hybrid fallback paths relative to
both baselines. It also emits fallback_family_benchmark, a compact comparison
of strict reject-only, strict gate + analytic/DLS fallback, strict gate +
learned-local-expert fallback, and provenance-aware analytic/DLS fallback when
the corresponding data is available. When the held-out evaluation uses
provenance diagnostics it also records a provenance_routing_comparison block
plus provenance-aware hybrid metrics that separate fault-triggered fallback
routes from semantic-triggered rejects (fault -> analytic fallback,
semantic -> reject). Those runs now also emit a
provenance_routing_materiality block that quantifies semantic-bottom
misroute reduction, unsafe-accept reduction, accepted-sample coverage delta,
and runtime overhead against the published robotics promotion thresholds. The
same summary includes provenance_benchmark_evidence, which packages the
merged-mask vs provenance-aware comparison and materiality result as a
machine-readable review artifact. The block is still a single-run measurement;
it does not satisfy the required review-artifact-plus-rerun gate by itself.
For operator-facing plots, zeroproofml.utils.viz.plot_workspace_rate_heatmaps
can turn RR workspace coordinates plus bottom/gap/fallback masks into
workspace heatmaps, plot_2d_mask_map(...) / plot_3d_mask_map(...) can
scatter raw mask decisions over 2D or 3D coordinates,
plot_route_to_solver_overlay(...) can mark where invalid samples route to
the analytic solver versus stay rejected, plot_fallback_route_timeline(...)
can plot per-batch route/reject rates with provenance tags,
plot_monitoring_batch_summary(...) can plot monitor batch summaries, and
plot_detj_stratified_metrics(...) can bucket fallback or tracking metrics by
|det(J)|. The bottom/fallback panels stay provenance-aware when
fault/semantic diagnostics are present.
You can also call the same reference path from Python and receive structured artifact paths back:
from zeroproofml.reference_robotics_deployment import (
    ReferenceRoboticsDeploymentConfig,
    load_reference_robotics_deployment_artifacts,
    run_reference_robotics_deployment,
)
artifacts = run_reference_robotics_deployment(
    ReferenceRoboticsDeploymentConfig(device="cpu", epochs=2, n_samples=6000)
)
print(artifacts.bundle_model_path)
print(artifacts.bundle_metadata["benchmark_snapshot_id"])
same_run = load_reference_robotics_deployment_artifacts(artifacts.out_root)
print(same_run.validation_report_path)
load_reference_robotics_deployment_artifacts(...) validates
output_contract.json when present and otherwise falls back to the legacy fixed
layout so previously generated runs remain readable.
That same bundle is now the intended ROS 2 RR IK demo input: the companion
workspace ships launch/rr_ik_strict_inference.launch.py,
config/rr_ik_strict_inference.yaml, and zeroproofml_ros.rr_ik_demo so the
reference deployment artifact can be exercised either through the ROS graph or
through a local runtime smoke check without rebuilding another bundle. The ROS
node and launch path expose qos_preset: use low_latency_control for fresh
control-loop samples or offline_batch_replay for reliable deterministic
rosbag/batch replay. The node also publishes a Foxglove/PlotJuggler-friendly
std_msgs/msg/Float64MultiArray on telemetry_topic; domain configs set
/rr_ik/strict_inference/telemetry and /dose/offline_batch/telemetry.
VISUALIZATION_TELEMETRY_FIELDS defines the vector order for batch mask counts,
per-batch and running bottom/gap/provenance rates, plus fallback metrics such
as fallback_rate, fault_fallback_rate, semantic_fallback_rate, and
routing counts/rates, with missing optional values encoded as NaN. For named
exporter workflows, the ROS package also provides
build_visualization_telemetry_row(...) and
write_visualization_telemetry_csv(...), which flatten those same diagnostics
into the stable VISUALIZATION_TELEMETRY_EXPORT_FIELDS CSV schema.
For RViz workflows, the ROS companion package also exposes
build_workspace_heatmap_marker_overlay(...) /
build_workspace_heatmap_marker_array_message(...), which convert the
plot_workspace_rate_heatmaps(...) summary into MarkerArray-compatible
CUBE_LIST overlays for RR-style workspace heatmaps. For single-sample RR IK
debugging, build_rr_ik_result_marker_overlay(...) /
build_rr_ik_result_marker_array_message(...) build a second RViz overlay that
shows the current arm pose, requested workspace displacement, and resolved
end-effector target from an rr_ik_demo or StrictInferenceResult payload.
For a concrete larger graph, launch
rr_ik_strict_inference_graph.launch.py: it starts the same strict-inference
node, publishes one deterministic RR IK Float64MultiArray sample with
ros2 topic pub, and attaches one-shot result and telemetry echo subscribers.
Disable those helper CLI nodes when replacing them with a planner, controller,
rosbag replay, Foxglove, PlotJuggler, or another downstream consumer.
The same companion workspace now includes the non-robotics offline example
launch/dose_offline_batch.launch.py, config/dose_offline_batch.yaml, and
demo/dose_offline_batch_input.yaml, which put a DOSE bundle on
/dose/offline_batch/* topics with offline_batch_replay as the default QoS.
Pattern: strict gate + direction head (censoring/regimes)¶
For 3-way censoring problems (below / in-range / above), you can combine:
1) strict bottom gating via |Q| < τ_infer; and
2) a 2-class direction head to disambiguate which censored regime applies when the sample is bottom.
import torch
from zeroproofml.inference import decode_strict_censored_3way
# Given projective head outputs (P, Q) and optional direction logits.
decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=dir_logits,  # shape (B, 2)
    orientation_signal=weak_sign_side_channel,  # optional finite side channel
)
If you also thread the provenance attributes from strict_inference(...), the
same helper can keep the direction head focused on semantic bottoms while
fault-like bottoms use only an explicit finite orientation side channel:
decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=dir_logits,
    orientation_signal=weak_sign_side_channel,
    fault_mask=fault_mask,
    semantic_bottom_mask=semantic_bottom_mask,
    bottom_provenance=bottom_provenance,
)
When P or Q is non-finite, strict inference treats the sample as a
fault-like bottom. +inf and -inf are IEEE artifacts, not SCM carrier
values; the infinity artifact itself populates fault_mask, not
semantic_bottom_mask. Semantic-bottom provenance is reserved for independent
finite threshold or validity-factor triggers. The helper therefore does not
derive a censored direction from an IEEE scalar sign. Orientation near
singular boundaries must be preserved before or alongside bottoming by an
independently computed weak_sign side channel, an angular/projective head, or
an auxiliary direction head. Typed semantic labels can carry the same contract
when the label source, rather than the overflowed scalar payload, owns the
orientation.
For complex values, weak_sign projects finite nonzero payloads to the unit
circle and returns bottom unchanged; it is not a reason to treat IEEE
infinities as valid decoded payloads.
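The behaviour described for complex values can be pictured with a small stand-alone sketch. It is not the library's weak_sign: the BOTTOM placeholder, the function name, the mapping of zero to 0j, and the choice to return bottom for non-finite inputs are assumptions made for this example.

```python
import cmath

BOTTOM = None  # hypothetical stand-in for the SCM bottom element

def weak_sign_sketch(z):
    """Project a finite nonzero complex value onto the unit circle."""
    if z is BOTTOM:
        return BOTTOM  # bottom passes through unchanged
    z = complex(z)
    if not cmath.isfinite(z):
        # IEEE infinities and NaN are faults, not orientation carriers.
        return BOTTOM
    if z == 0:
        return 0j  # assumption: zero has no direction to project
    return z / abs(z)

print(weak_sign_sketch(3 + 4j))
print(weak_sign_sketch(BOTTOM))
print(weak_sign_sketch(complex(float("inf"), 0.0)))
```

The point of the sketch is the last case: an infinite payload yields no direction at all, so orientation has to come from a separately computed side channel.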
Operationally, a weak-sign tag computed from the decoded scalar payload shares
that payload's fault status: if the decoded payload overflows or otherwise
becomes non-finite, the derived tag is dropped with the payload fault. Keep
orientation only in an independently computed weak-sign side channel, an
angular/projective direction head, or a typed semantic label.
Monitoring + fallbacks¶
At deployment time, treat masks as authoritative and log them explicitly:
from zeroproofml.inference import StrictInferenceMonitor, reject_on_gap
from zeroproofml.utils.logging import JsonlLogger
decoded, bottom_mask, gap_mask = wrapped(x) # strict outputs
decoded_safe, accept_mask = reject_on_gap(decoded, bottom_mask, gap_mask)
event_logger = JsonlLogger("strict_inference_events.jsonl")
mon = StrictInferenceMonitor(
    bundle_id="my_bundle",
    event_logger=event_logger,
    acceptance_rate_drift_threshold=0.1,
)
mon.update(bottom_mask, gap_mask)
rates = mon.rates()
state = mon.export_state(
    include_histograms=True,
    include_batch_summaries=True,
)  # optional counts, histograms, and per-batch summaries
When an event_logger is configured, StrictInferenceMonitor.update(...)
emits structured fault-bottom-trigger, semantic-bottom-trigger,
numerical-hazard-trigger, gap-trigger, and acceptance-rate-drift records.
The stable JsonlLogger works well for this because each event is just a
JSON-serializable mapping. Treat sustained high fault_rate as a
non-finite/runtime debugging signal, not a model-quality signal, and set
alerting thresholds separately from semantic_bottom_rate. If you compute a
finite tiny-denominator hazard mask outside the stable strict output tuple,
pass it as numerical_hazard_mask; alert on numerical_hazard_rate
separately from both provenance rates.
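Keeping the alert thresholds separate can be as simple as a per-rate limit table. The sketch below assumes the rates are available as a plain mapping keyed by fault_rate, semantic_bottom_rate, and numerical_hazard_rate; the helper name and the limit values are hypothetical and should be replaced with numbers from your own baseline traffic.

```python
def provenance_alerts(rates, limits):
    """Return the names of rates that exceed their own limit."""
    fired = []
    for name, limit in limits.items():
        value = rates.get(name)
        # Rates that were not recorded are skipped rather than treated as zero.
        if value is not None and value > limit:
            fired.append(name)
    return fired

# Hypothetical limits; fault_rate is kept much tighter than the semantic rate.
limits = {
    "fault_rate": 0.001,
    "semantic_bottom_rate": 0.05,
    "numerical_hazard_rate": 0.01,
}
rates = {"fault_rate": 0.004, "semantic_bottom_rate": 0.02}
print(provenance_alerts(rates, limits))  # ['fault_rate']
```

Here the semantic rate is five times the fault rate yet only fault_rate fires, which matches the guidance to treat faults as a runtime debugging signal with its own budget.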
Mask provenance attributes¶
Eager Python strict-inference results expose the stable bottom provenance split without changing the three-field unpacking path:
from zeroproofml.inference import (
    InferenceConfig,
    SCMInferenceWrapper,
    StrictInferenceMonitor,
    strict_inference,
)
cfg = InferenceConfig(tau_infer=1e-6, tau_train=1e-4)
result = strict_inference(P, Q, config=cfg)
fault_mask = result.fault_mask
semantic_bottom_mask = result.semantic_bottom_mask
bottom_provenance = result.bottom_provenance
decoded, bottom_mask, gap_mask = result # still the stable three-field unpack
wrapped = SCMInferenceWrapper(model, config=cfg).eval()
wrapped_result = wrapped(x)
mon = StrictInferenceMonitor(bundle_id="my_bundle")
mon.update(
    wrapped_result.bottom_mask,
    wrapped_result.gap_mask,
    fault_mask=getattr(wrapped_result, "fault_mask", None),
    semantic_bottom_mask=getattr(wrapped_result, "semantic_bottom_mask", None),
    bottom_provenance=getattr(wrapped_result, "bottom_provenance", None),
)
rates = mon.rates() # includes fault_rate / semantic_bottom_rate when present
state = mon.export_state(include_histograms=True, include_batch_summaries=True)
bottom_mask remains the authoritative fail-closed mask and equals
fault_mask | semantic_bottom_mask. The richer diagnostics live on the same
result object without changing stable unpacking.
Consumers can route on bottom_mask for all bottoms, on fault_mask or
semantic_bottom_mask for provenance-specific handling, or on
bottom_provenance when split masks are not used; the enum carries the same
finite/fault/semantic/mixed information.
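The relationship between the split masks, the merged mask, and the four-way provenance label can be sketched with plain Python lists. The string labels below are placeholders: the actual bottom_provenance enum encoding is not specified here, and both helper names are hypothetical.

```python
def provenance_labels(fault_mask, semantic_bottom_mask):
    """Derive a finite/fault/semantic/mixed label per sample."""
    labels = []
    for fault, semantic in zip(fault_mask, semantic_bottom_mask):
        if fault and semantic:
            labels.append("mixed")
        elif fault:
            labels.append("fault")
        elif semantic:
            labels.append("semantic")
        else:
            labels.append("finite")
    return labels

def check_bottom_invariant(bottom_mask, fault_mask, semantic_bottom_mask):
    """Verify bottom_mask == fault_mask | semantic_bottom_mask elementwise."""
    merged = [f or s for f, s in zip(fault_mask, semantic_bottom_mask)]
    return list(bottom_mask) == merged

fault = [False, True, False, True]
semantic = [False, False, True, True]
print(provenance_labels(fault, semantic))
print(check_bottom_invariant([False, True, True, True], fault, semantic))
```

Every label other than finite corresponds to a set bit in bottom_mask, so routing on the merged mask alone stays fail-closed even when a consumer ignores provenance.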
Non-finite P or Q payloads set bottom_mask and route through
fault_mask; semantic bottoms are reserved for strict threshold-triggered
bottoms.
When those diagnostics are threaded into decode_strict_censored_3way(...),
the current prototype uses the direction head only for semantic bottoms.
Fault-like bottoms use orientation_signal when supplied and otherwise carry
the neutral class while bottom_mask remains the authoritative rejection
signal.
route_to_analytic_solver(...) now accepts the same optional provenance
diagnostics plus an optional bottom_mask; when supplied, semantic bottoms are
left rejected in-place while fault-like bottoms and non-semantic invalids can
still be routed to an analytic fallback. If you pass the same event_logger,
the helper also emits structured fallback-routing records with a
provenance_tag field so routed fault-like bottoms can be distinguished from
unlabeled routes. StrictInferenceMonitor.export_state(...) can also include a
batch_summaries list with per-batch bottom/gap/acceptance rates and
provenance tag counts when diagnostics are supplied.
TorchScript exports continue to use the stable decoded, bottom_mask,
gap_mask contract. Schema-v2 ONNX bundle exports use the six-output
provenance-bearing contract declared above.
If you export a bundle from a provenance-configured wrapper, the
metadata.json sidecar records the versioned provenance schema declaration.
The legacy experimental_provenance override is retained for call-site
compatibility, but eager Python result objects expose the provenance attributes
regardless of that flag.
Stable Python contract¶
The v0.6.x mask set is now the working strict-inference result contract:
- The stable Python output tuple remains (decoded, bottom_mask, gap_mask). bottom_mask stays authoritative for reject/fallback decisions.
- Extra fields such as fault_mask, semantic_bottom_mask, and bottom_provenance are guaranteed attributes on eager Python result objects.
- TorchScript exports continue to ship only the stable decoded, bottom_mask, and gap_mask outputs. Schema-v2 ONNX bundle exports ship those three outputs plus fault_mask, semantic_bottom_mask, and bottom_provenance.
Historical promotion criteria¶
The earlier promotion gate came from the design memo in
22_bottom_mask_provenance_design.md.
It is retained here as historical context for bundle/export promotion criteria.
The Q2 decision memo in
24_bottom_mask_provenance_q2_decision.md
records the old keep-experimental bundle/export decision.
The gate expected a like-for-like Q2 review on the same commit, bundle, and
evaluation inputs:
- Stable-contract non-regression: with provenance disabled, merged bottom_mask/gap_mask behavior stays unchanged.
- At least one downstream win clears its threshold:
  - DOSE calibration improves false_in_range_rate_on_censored or false_censored_rate_on_in_range by >=10% relative at matched coverage within 1 percentage point.
  - Robotics fallback reduces unsafe accepts or misroutes by >=25% relative while accepted-sample coverage stays within 1 percentage point.
  - Composability/reporting reduces unresolved bottom/fallback ambiguity by >=15% relative with no more than 1 percentage point valid-coverage loss.
- Operational cost stays bounded: the provenance path adds no more than 5% wall-clock overhead and does not require a stable ONNX or metadata contract change before approval.
- Evidence is repeatable: the winning use case clears the threshold in the recorded Q2 review artifact and at least one rerun with the same measurement recipe.
At the time of that review, the fallback was to keep provenance experimental with a narrower scope rather than silently promote a weak contract.
The recorded Q2 decision was keep experimental for bundle/export promotion.
The v0.6.x eager Python result contract promotes the provenance attributes,
and schema-v2 deployment bundle outputs carry the same provenance through ONNX.
The historical Q2 gate is therefore superseded for the v0.6.x
fault/semantic mask set: fault_mask, semantic_bottom_mask, and
bottom_provenance are stable result attributes, and the old criteria are not
retroactively re-applied to that promotion.
Choosing τ_infer (post-hoc sweep)¶
If your strict gate is of the form |Q| < τ_infer, you can evaluate safety trade-offs post-hoc by sweeping τ_infer over cached |Q| values:
from zeroproofml.metrics import tau_infer_sweep_from_q_abs
# q_abs: cached |Q|, is_in_range: boolean ground-truth mask
curves = tau_infer_sweep_from_q_abs(q_abs, is_in_range=is_in_range, taus=[1e-6, 1e-5, 1e-4])
Use write_tau_infer_sweep(...) to save both a JSON payload and a compact Markdown report.
Typical trade-off: increasing τ_infer reduces false in-range predictions on truly censored points, but can increase false censoring on truly in-range points.
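To make the trade-off concrete, here is a minimal pure-Python sweep over a toy set of cached |Q| values. It is not tau_infer_sweep_from_q_abs and makes no claim about that function's return format; sweep_tau_infer and the toy data are invented for illustration, while the two rate names follow the metrics mentioned elsewhere in this document.

```python
def sweep_tau_infer(q_abs, is_in_range, taus):
    """Tabulate both error rates for each candidate tau_infer."""
    n_in_range = sum(1 for ok in is_in_range if ok)
    n_censored = len(is_in_range) - n_in_range
    rows = []
    for tau in taus:
        bottom = [q < tau for q in q_abs]  # strict gate: |Q| < tau_infer
        # Censored samples that the gate let through as finite predictions.
        missed = sum(1 for b, ok in zip(bottom, is_in_range) if not b and not ok)
        # In-range samples that the gate rejected.
        over_rejected = sum(1 for b, ok in zip(bottom, is_in_range) if b and ok)
        rows.append({
            "tau_infer": tau,
            "false_in_range_rate_on_censored": missed / n_censored if n_censored else 0.0,
            "false_censored_rate_on_in_range": over_rejected / n_in_range if n_in_range else 0.0,
        })
    return rows

# Toy data: three censored samples and three in-range samples.
q_abs = [1e-7, 5e-6, 5e-5, 2e-5, 0.4, 0.9]
is_in_range = [False, False, False, True, True, True]
for row in sweep_tau_infer(q_abs, is_in_range, taus=[1e-6, 1e-5, 1e-4]):
    print(row)
```

On this toy set the false in-range rate falls from 2/3 to 1/3 to 0 as τ_infer grows, while the false censoring rate stays at 0 until the largest threshold, where it rises to 1/3.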
When provenance splits are available from strict inference, the
DOSE operating-point report also treats fault bottoms more strictly than
semantic bottoms via fault_rate + 0.5 * semantic_bottom_rate, so
tau_infer selection can distinguish numerical failures from semantic rejects.
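The weighting above is a one-line formula, shown here as a small sketch with a hypothetical function name and invented candidate rates. Only the expression fault_rate + 0.5 * semantic_bottom_rate comes from the text; how the report uses the resulting number is not reproduced.

```python
def provenance_weighted_bottom_cost(fault_rate, semantic_bottom_rate, semantic_weight=0.5):
    """Weight semantic bottoms less heavily than fault bottoms."""
    return fault_rate + semantic_weight * semantic_bottom_rate

# Two invented candidates with the same total bottom rate (0.04).
cost_a = provenance_weighted_bottom_cost(fault_rate=0.03, semantic_bottom_rate=0.01)
cost_b = provenance_weighted_bottom_cost(fault_rate=0.01, semantic_bottom_rate=0.03)
print(cost_a > cost_b)  # True: the fault-heavy candidate costs more
```

A merged bottom rate would rank these two candidates as equal; the weighted cost separates them because candidate A's bottoms are mostly numerical failures.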
Named operating points¶
These are named deployment presets for selecting thresholds + fallback behavior. They are not magic values — use a post-hoc sweep (or a held-out calibration set) to set the actual τ_infer/τ_train for your domain.
The DOSE benchmark harness now writes aggregated/dose_operating_points.{json,md}
for completed runs so these names are backed by recorded FP_cens/FN_in/direction-F1
measurements plus the run's threshold values. An optional balanced candidate is
evaluated there too, but it is only promoted when it selects a distinct winner;
otherwise the public preset set remains safety_first, direction_aware, and
accuracy_first. When provenance split metrics are present, that report also
records a provenance-weighted bottom-cost signal so semantic bottoms can count
less heavily than fault bottoms in the tau_infer calibration tie-breaks.
safety_first¶
Goal: minimize unsafe accepts by rejecting both strict-bottom and gap-region samples.
from zeroproofml.inference import InferenceConfig, reject_on_gap, safe_sentinel, strict_inference
cfg = InferenceConfig(tau_infer=1e-4, tau_train=1e-3)
decoded, bottom_mask, gap_mask = strict_inference(P, Q, config=cfg)
decoded, accept_mask = reject_on_gap(decoded, bottom_mask, gap_mask)
decoded = safe_sentinel(decoded, ~accept_mask, sentinel=float("nan"))
Trade-off: lower false-finite-on-censored (or “unsafe accept”) at the cost of higher rejection / false-censoring rates.
direction_aware¶
Goal: when bottoms happen, keep the censored direction meaningful by using an auxiliary direction head.
from zeroproofml.inference import decode_strict_censored_3way
decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=dir_logits,  # (B, 2)
)
Trade-off: adds an extra head (and training/eval complexity), but produces more actionable outputs on censored / singular inputs.
accuracy_first¶
Goal: maximize coverage by keeping τ_infer tight and treating the gap region as “monitor-only”.
from zeroproofml.inference import StrictInferenceMonitor, reject_on_bottom
decoded, bottom_mask, gap_mask = wrapped(x)
decoded, accept_mask = reject_on_bottom(decoded, bottom_mask)
mon = StrictInferenceMonitor(bundle_id="my_bundle")
mon.update(bottom_mask, gap_mask) # log rates; provenance keywords stay optional
Trade-off: higher coverage and in-range accuracy when things are well-behaved, but more risk exposure near the training–inference gap unless you monitor and gate carefully.
Safety Checklist¶
- Validate coverage on a held-out set using losses.coverage.coverage before shipping.
- Monitor the rate of ⊥ outputs in production; aggressive rejection loss during training usually lowers this.
- For robotics or control, keep orientation in a finite weak-sign, angular/projective, typed-label, or direction-head side channel; do not rely on +inf/-inf decoded payloads as orientation carriers.