Projective Learning Mode
Projective learning lifts selected subgraphs to homogeneous tuples (N, D) so training occurs on a smooth manifold while inference retains strict SCM semantics.
When to Use

- Rational heads that should avoid instantiating ⊥ during training.
- Safety-critical outputs where distinguishing +∞ vs −∞ matters (use with the sign-consistency loss).
- Scenarios where gradient dead zones around Q ≈ 0 hurt convergence.
Forward/Backward Contract

- Encoding: φ(x) = (x, 1) for finite values; φ(⊥) = (1, 0).
- Decoding: φ⁻¹(N, D) = N/D when D ≠ 0, otherwise ⊥.
- Detached renormalisation: (N, D) ← (N, D) / sg(√(N² + D²) + γ) keeps tuples bounded without leaking gradients through the norm.
- Gradients: standard autograd on (N, D); coverage and penalty terms are computed after decoding.
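The encode/decode/renormalise contract above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: `BOTTOM`, `encode`, `decode`, and `renormalize` are hypothetical names, and the stop-gradient `sg(·)` is a no-op on plain floats.

```python
import math

BOTTOM = None  # stand-in sentinel for ⊥ (assumption: any sentinel works)

def encode(x):
    """phi: finite x -> (x, 1); bottom -> (1, 0)."""
    return (1.0, 0.0) if x is BOTTOM else (float(x), 1.0)

def decode(N, D):
    """phi^{-1}: (N, D) -> N/D when D != 0, otherwise bottom."""
    return BOTTOM if D == 0.0 else N / D

def renormalize(N, D, gamma=1e-12):
    """Detached renormalisation: scale by 1 / sg(sqrt(N^2 + D^2) + gamma).
    In an autograd framework the norm would sit inside a stop-gradient;
    with plain floats the detachment is implicit."""
    s = math.sqrt(N * N + D * D) + gamma
    return (N / s, D / s)

# Round trips: finite values and bottom both survive encode/decode,
# and renormalisation leaves the decoded ratio unchanged.
assert decode(*encode(2.5)) == 2.5
assert decode(*encode(BOTTOM)) is BOTTOM
```

Note that renormalisation changes only the tuple's magnitude, never the decoded value N/D, which is why it is safe to apply between training steps.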
Integration Steps

- Lift targets to tuples with training.targets.lift_targets.
- Use GradientPolicy.PROJECT inside projective regions to mask gradients when a path decodes to ⊥.
- Combine implicit, margin, and sign-consistency losses to shape the tuple dynamics.
- Decode to SCM at boundaries and apply coverage/rejection losses there.
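The masking and implicit-loss steps above can be illustrated without the library. This is a minimal pure-Python sketch under assumptions: `implicit_residual` uses the standard cross-ratio form (P·y_D − Q·y_N)², and `project_mask_losses` mimics PROJECT-style masking by simply skipping terms whose predicted tuple would decode to ⊥; the actual GradientPolicy.PROJECT and implicit_loss may differ in detail.

```python
def implicit_residual(P, Q, yN, yD):
    """Cross-ratio residual: zero exactly when (P, Q) and (yN, yD)
    represent the same projective value P/Q == yN/yD."""
    return (P * yD - Q * yN) ** 2

def project_mask_losses(preds, targets, tau=1e-6):
    """PROJECT-style masking: drop terms whose predicted tuple decodes
    to bottom (|Q| < tau), so no gradient would flow through those paths."""
    kept = [
        implicit_residual(P, Q, yN, yD)
        for (P, Q), (yN, yD) in zip(preds, targets)
        if abs(Q) >= tau
    ]
    return sum(kept) / max(len(kept), 1)

# A matching tuple contributes zero loss; a bottom-decoding tuple is masked.
assert project_mask_losses([(2.0, 1.0)], [(2.0, 1.0)]) == 0.0
assert project_mask_losses([(1.0, 0.0)], [(2.0, 1.0)]) == 0.0
```

The implicit form avoids dividing by Q, which is precisely what keeps the loss smooth near poles.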
Gap Region
Training uses stochastic thresholds (τ_train_min, τ_train_max) to avoid learning a brittle boundary at exactly τ_train. Inference sets a fixed τ_infer and returns ⊥ when |Q| < τ_infer.
When τ_train > τ_infer, the interval τ_infer ≤ |Q| < τ_train is the gap region where inference still returns a finite value but the denominator is small enough to be numerically risky. Use strict_inference(..., InferenceConfig(tau_infer=τ_infer, tau_train=τ_train)) to obtain an explicit gap_mask for monitoring.
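The threshold logic above can be sketched per-tuple in plain Python. This is an illustrative sketch, not the strict_inference implementation; `strict_decode_with_masks` is a hypothetical helper that returns the same three pieces of information (decoded value, ⊥ mask, gap mask).

```python
def strict_decode_with_masks(N, D, tau_infer=1e-6, tau_train=1e-4):
    """Decode one tuple and report bottom/gap flags.

    bottom: |D| <  tau_infer              -> return None (⊥)
    gap:    tau_infer <= |D| < tau_train  -> finite but numerically risky
    """
    bottom = abs(D) < tau_infer
    gap = (not bottom) and abs(D) < tau_train
    value = None if bottom else N / D
    return value, bottom, gap

# A denominator inside the gap region still decodes, but is flagged.
value, bottom, gap = strict_decode_with_masks(1.0, 1e-5)
assert not bottom and gap
```

Logging the gap mask during evaluation is a cheap way to see how often the model relies on numerically fragile denominators.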
Angular parameterization (unit-circle tuples)
For some censoring / boundary problems, you can eliminate the free tuple magnitude by predicting an angle θ and emitting a unit-circle tuple:
P = cos(θ), Q = sin(θ)
This removes the projective scale gauge and can make strict gating around Q≈0 more symmetric/stable.
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

from zeroproof.inference import InferenceConfig, SCMInferenceWrapper
from zeroproof.layers.angular_projective import AngularProjectiveHead
from zeroproof.losses.implicit import implicit_loss
from zeroproof.training import SCMTrainer, TrainingConfig


class ToyAngularProjective(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU()
        )
        self.head = AngularProjectiveHead(input_dim=32, output_dim=1, theta_scale=1.0)

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        h = self.backbone(x)
        p, q = self.head(h)
        return p.squeeze(-1), q.squeeze(-1)


model = ToyAngularProjective()
wrapped = SCMInferenceWrapper(model, config=InferenceConfig(tau_infer=1e-6, tau_train=1e-4))

# Train on smooth tuples (wrapper in train mode passes (P, Q) through unchanged).
x = torch.linspace(-1.0, 1.0, 512).unsqueeze(-1)
y = 1.0 / (x + 0.1)  # may include large-magnitude targets near the pole
train_loader = DataLoader(TensorDataset(x, y.squeeze(-1)), batch_size=128, shuffle=True)


def loss_fn(outputs, lifted_targets):
    p, q = outputs
    y_n, y_d = lifted_targets
    return implicit_loss(p, q, y_n, y_d)


opt = torch.optim.AdamW(wrapped.parameters(), lr=1e-3)
trainer = SCMTrainer(model=wrapped, optimizer=opt, loss_fn=loss_fn, train_loader=train_loader)
trainer.fit()

# Infer in strict mode (wrapper in eval mode decodes and emits masks).
wrapped.eval()
decoded, bottom_mask, gap_mask = wrapped(x)
```