Projective Learning Mode

Projective learning lifts selected subgraphs to homogeneous tuples (N, D) so training occurs on a smooth manifold while inference retains strict SCM semantics.

When to Use

  • Rational heads that should avoid instantiating ⊥ during training.
  • Safety-critical outputs where distinguishing +∞ vs −∞ matters (use with sign consistency loss).
  • Scenarios where gradient dead zones around Q ≈ 0 hurt convergence.

Forward/Backward Contract

  • Encoding: φ(x) = (x, 1) for finite values; φ(⊥) = (1, 0).
  • Decoding: φ⁻¹(N, D) = N/D when D ≠ 0, otherwise ⊥.
  • Detached renormalisation: (N, D) ← (N, D) / sg(√(N² + D²) + γ), where sg(·) denotes stop-gradient and γ is a small stabilising constant; this keeps tuples bounded without leaking gradients through the norm.
  • Gradients: Standard autograd on (N, D); coverage/penalties computed after decoding.
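
The encode/decode/renormalise contract above can be sketched with plain torch ops. This is an illustrative sketch, not the library's own API: the helper names are hypothetical, and ⊥ is represented as a boolean mask rather than a sentinel value.

```python
import torch


def phi(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Lift finite values to tuples: φ(x) = (x, 1)."""
    return x, torch.ones_like(x)


def phi_inv(n: torch.Tensor, d: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Decode: N/D where D ≠ 0; entries with D = 0 are flagged as ⊥."""
    bottom_mask = d == 0
    safe_d = torch.where(bottom_mask, torch.ones_like(d), d)
    return n / safe_d, bottom_mask


def renorm(n: torch.Tensor, d: torch.Tensor, gamma: float = 1e-12):
    """Detached renormalisation: divide by sg(√(N² + D²) + γ).

    detach() plays the role of sg(·): the norm is treated as a constant
    during backprop, so no gradient leaks through it.
    """
    scale = (torch.sqrt(n ** 2 + d ** 2) + gamma).detach()
    return n / scale, d / scale
```

After renormalisation the tuple has (approximately) unit norm, but the decoded value N/D is unchanged, since both components are scaled by the same factor.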

Integration Steps

  1. Lift targets to tuples with training.targets.lift_targets.
  2. Use GradientPolicy.PROJECT inside projective regions to mask gradients when a path decodes to ⊥.
  3. Combine implicit, margin, and sign-consistency losses to shape the tuple dynamics.
  4. Decode to SCM at boundaries and apply coverage/rejection losses there.
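
Step 2's masking behaviour can be emulated with plain autograd. This is a sketch of the idea behind GradientPolicy.PROJECT, not the library's implementation: entries whose tuple would decode to ⊥ (|Q| below a threshold) are detached so they contribute no gradient.

```python
import torch


def project_mask(p: torch.Tensor, q: torch.Tensor, tau: float):
    # Entries with |Q| < tau decode to ⊥; detach them so they push no
    # gradient into the network, while other entries train normally.
    bottom = q.abs() < tau
    p = torch.where(bottom, p.detach(), p)
    q = torch.where(bottom, q.detach(), q)
    return p, q, bottom
```

torch.where routes gradients only through the selected branch, so detached (⊥) entries receive exactly zero gradient while the rest of the batch is unaffected.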

Gap Region

Training uses stochastic thresholds (τ_train_min, τ_train_max) to avoid learning a brittle boundary at exactly τ_train. Inference sets a fixed τ_infer and returns ⊥ when |Q| < τ_infer.

When τ_train > τ_infer, the interval τ_infer ≤ |Q| < τ_train is the gap region where inference still returns a finite value but the denominator is small enough to be numerically risky. Use strict_inference(..., InferenceConfig(tau_infer=τ_infer, tau_train=τ_train)) to obtain an explicit gap_mask for monitoring.
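
The two thresholds translate into masks directly. The following is a sketch of the logic behind the masks strict_inference returns (the helper name is illustrative):

```python
import torch


def threshold_masks(q: torch.Tensor, tau_infer: float, tau_train: float):
    # |Q| < tau_infer: decoded to ⊥ at inference.
    # tau_infer <= |Q| < tau_train: finite output, but numerically risky
    # (the gap region worth monitoring).
    abs_q = q.abs()
    bottom_mask = abs_q < tau_infer
    gap_mask = (abs_q >= tau_infer) & (abs_q < tau_train)
    return bottom_mask, gap_mask
```

The two masks are disjoint by construction: an entry is either rejected as ⊥, flagged as gap, or treated as safely finite.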

Angular Parameterization (Unit-Circle Tuples)

For some censoring or boundary problems, you can eliminate the free tuple magnitude by predicting a single angle θ and emitting a unit-circle tuple:

  • P = cos(θ)
  • Q = sin(θ)

This removes the projective scale gauge and can make strict gating around Q ≈ 0 more symmetric and numerically stable.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

from zeroproof.inference import InferenceConfig, SCMInferenceWrapper
from zeroproof.layers.angular_projective import AngularProjectiveHead
from zeroproof.losses.implicit import implicit_loss
from zeroproof.training import SCMTrainer


class ToyAngularProjective(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
        self.head = AngularProjectiveHead(input_dim=32, output_dim=1, theta_scale=1.0)

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        h = self.backbone(x)
        p, q = self.head(h)
        return p.squeeze(-1), q.squeeze(-1)


model = ToyAngularProjective()
wrapped = SCMInferenceWrapper(model, config=InferenceConfig(tau_infer=1e-6, tau_train=1e-4))

# Train on smooth tuples (wrapper in train mode passes (P, Q) through unchanged).
x = torch.linspace(-1.0, 1.0, 512).unsqueeze(-1)
y = 1.0 / (x + 0.1)  # may include large-magnitude targets near the pole
train_loader = DataLoader(TensorDataset(x, y.squeeze(-1)), batch_size=128, shuffle=True)


def loss_fn(outputs, lifted_targets):
    # SCMTrainer is assumed to lift raw targets to (N, D) tuples via
    # training.targets.lift_targets before calling the loss (step 1).
    p, q = outputs
    y_n, y_d = lifted_targets
    return implicit_loss(p, q, y_n, y_d)


opt = torch.optim.AdamW(wrapped.parameters(), lr=1e-3)
trainer = SCMTrainer(model=wrapped, optimizer=opt, loss_fn=loss_fn, train_loader=train_loader)
trainer.fit()

# Inference in strict mode (wrapper in eval mode decodes and emits masks).
wrapped.eval()
decoded, bottom_mask, gap_mask = wrapped(x)
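
A minimal stand-in for the angular head shows the construction in isolation. This is an illustrative sketch, and the real AngularProjectiveHead may differ: a linear layer predicts θ, and the emitted tuple is (cos θ, sin θ), so ‖(P, Q)‖ = 1 by construction and only the angle carries information.

```python
import torch
import torch.nn as nn


class MinimalAngularHead(nn.Module):
    """Predict an angle θ and emit the unit-circle tuple (cos θ, sin θ)."""

    def __init__(self, input_dim: int, output_dim: int, theta_scale: float = 1.0) -> None:
        super().__init__()
        self.linear = nn.Linear(input_dim, output_dim)
        self.theta_scale = theta_scale

    def forward(self, h: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        theta = self.theta_scale * self.linear(h)
        # P² + Q² = 1 for every output; the decoded value is P/Q = cot(θ),
        # so strict gating on |Q| reduces to gating on |sin(θ)|.
        return torch.cos(theta), torch.sin(theta)
```

Because the magnitude is pinned to the unit circle, the detached renormalisation step becomes a no-op for this head, and the gap-region thresholds act symmetrically on θ around multiples of π.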