Real-time training stability monitoring. Identifies which layer is failing. Intervenes automatically. 100% detection, 0% false positives, 90% auto-recovery across 30 seeds and 6 architectures.
Drop into your existing training loop. No changes to your model, optimizer, or data pipeline.
```python
from arc_vigil import BendexMonitor, BendexConfig, BendexIntervention

monitor = BendexMonitor(model, config=BendexConfig())
intervention = BendexIntervention(model, optimizer)

for step, batch in enumerate(dataloader):
    event = monitor.observe(step)    # detects instability
    intervention.step(event, step)   # intervenes if needed
    loss = model(batch)
    loss.backward()
    intervention.apply_grad_clip()
    optimizer.step()
    optimizer.zero_grad()
```
Three things no other training monitor does simultaneously.
Mean 78-step lead time before instability shows up in any standard metric. When Arc Vigil fires, your loss curve hasn't moved yet.
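To make the idea of a lead time concrete, here is a minimal sketch of early-warning detection via a rolling z-score on an internal training statistic (for example, a per-layer activation norm). The class name, window size, and threshold are illustrative assumptions; this is not Arc Vigil's actual algorithm.

```python
from collections import deque
import math

class EarlyWarning:
    """Fire when a monitored statistic deviates sharply from its recent history,
    which can happen many steps before the loss curve itself moves."""

    def __init__(self, window=50, threshold=4.0):
        self.window = deque(maxlen=window)  # rolling history of the statistic
        self.threshold = threshold          # z-score needed to fire

    def observe(self, value):
        """Return True if `value` is a large outlier vs. the rolling window."""
        fired = False
        if len(self.window) >= 10:  # wait for a minimal baseline
            mean = sum(self.window) / len(self.window)
            var = sum((v - mean) ** 2 for v in self.window) / len(self.window)
            std = math.sqrt(var) or 1e-12  # guard against zero variance
            fired = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return fired
```

The key design point: the monitored statistic can destabilize well before the loss does, so a detector watching it fires early.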
Identifies the specific module that deviated first. No other tool tells you where the problem is — they only tell you something is wrong.
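One simple way to attribute instability to a specific module is to give every module its own running baseline and flag the first one that deviates. The sketch below does this with per-module gradient norms; the names, ratio, and EMA decay are illustrative assumptions, not Arc Vigil's internal mechanism.

```python
class ModuleAttributor:
    """Track a running baseline per module and name the first one that deviates."""

    def __init__(self, ratio=10.0):
        self.baseline = {}  # module name -> running mean of its grad norm
        self.ratio = ratio  # how far above baseline counts as deviation

    def observe(self, grad_norms):
        """grad_norms: dict mapping module name -> gradient norm this step.
        Returns the name of the first deviating module, or None."""
        culprit = None
        for name, norm in grad_norms.items():
            base = self.baseline.get(name)
            if culprit is None and base is not None and norm > self.ratio * base:
                culprit = name
            # exponential moving average as the per-module baseline
            self.baseline[name] = norm if base is None else 0.9 * base + 0.1 * norm
        return culprit
```

Because each module is compared against its own history, a layer whose norms are always large does not drown out a small layer that suddenly spikes.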
Launches a three-phase intervention automatically the moment a detection triggers. No other tool intervenes; they only alert.
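To illustrate what a staged, automatic intervention can look like, here is a minimal escalation sketch. The three phases shown (tighten gradient clipping, cut the learning rate, roll back to a checkpoint) are illustrative guesses, not Arc Vigil's documented phases.

```python
class StagedIntervention:
    """Escalate through intervention phases while instability persists,
    and reset once training stabilizes again."""

    PHASES = ("tighten_clip", "cut_lr", "rollback")  # hypothetical phase names

    def __init__(self):
        self.phase = 0  # index of the next phase to apply

    def step(self, event_fired):
        """Return the action for this step, escalating on repeated events."""
        if not event_fired:
            self.phase = 0  # stability restored: reset escalation
            return None
        action = self.PHASES[min(self.phase, len(self.PHASES) - 1)]
        self.phase += 1
        return action
```

The design choice worth noting: escalation is stateful, so cheap, reversible actions are tried first and a rollback only happens when milder interventions fail.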
Validated head-to-head against the standard instability detection baselines.
| Method | Detection | False Positives | Recovery | Attributes Module | Auto-Intervenes |
|---|---|---|---|---|---|
| Arc Vigil | 100% | 0% | 90% | Yes | Yes |
| Loss spike | 100% | 80% | 0% | No | No |
| Gradient norm | 90% | 50% | 0% | No | No |
| Patience | 100% | 50% | 0% | No | No |
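For reference, the three baseline methods in the table can be sketched as follows, implemented as they are commonly described. The thresholds are illustrative; the exact settings used in the benchmark may differ.

```python
class LossSpike:
    """Fire when loss jumps by more than `factor`x over the previous step."""
    def __init__(self, factor=2.0):
        self.prev = None
        self.factor = factor
    def fired(self, loss):
        spike = self.prev is not None and loss > self.factor * self.prev
        self.prev = loss
        return spike

class GradNormThreshold:
    """Fire when the global gradient norm crosses a fixed threshold."""
    def __init__(self, threshold=100.0):
        self.threshold = threshold
    def fired(self, grad_norm):
        return grad_norm > self.threshold

class Patience:
    """Fire when loss fails to improve for `patience` consecutive steps."""
    def __init__(self, patience=20):
        self.best = float("inf")
        self.bad = 0
        self.patience = patience
    def fired(self, loss):
        if loss < self.best:
            self.best = loss
            self.bad = 0
        else:
            self.bad += 1
        return self.bad >= self.patience
```

All three only react after the loss or gradients have already degraded, which is why their recovery column is 0%: by the time they fire, there is no signal about which module failed or what to do next.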
30-seed benchmark across 6 architectures (MLP, CNN, ViT, DistilBERT, GPT-2, ResNet-50), plus a learning-rate spike stress test.
Arc Vigil is fully open source: available on PyPI and GitHub with no limits or sign-up required.