Satellite Health Monitor
An end-to-end anomaly-detection system for a simulated low-Earth-orbit (LEO) satellite. A Rust physics engine generates realistic telemetry; a Python XGBoost classifier watches the stream in real time and fires alerts within ~65 seconds of fault onset. Every prediction comes with a SHAP explanation traceable to physical quantities.
Context
Satellite anomaly detection is a classic real-time ML problem: the data stream is continuous, faults can be subtle, and a false positive is almost as costly as a missed alert. The challenge is compounded by periodic physical events (eclipse every ~95 minutes) that create abrupt telemetry changes indistinguishable from certain faults unless the model has a long enough temporal horizon.
I built both the simulator and the detector from scratch. The simulator runs in Rust for speed and models orbital mechanics in the Earth-centered inertial (ECI) frame, attitude control, eclipse transitions, thermal effects, and battery dynamics. Each 24-hour run injects one fault at a random time and produces a labelled CSV. The detector is trained on 500 such runs (~43 million raw telemetry rows before feature engineering).
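The quoted row count is consistent with one telemetry row per second. The text does not state the sampling rate, so the 1 Hz figure below is an assumption. A quick check of the arithmetic:

```python
RUNS = 500            # from the text
RUN_HOURS = 24        # from the text
SAMPLE_RATE_HZ = 1.0  # assumption: the sampling rate is not stated in the text

# Rows in one 24-hour run, then across the whole training corpus.
rows_per_run = int(RUN_HOURS * 3600 * SAMPLE_RATE_HZ)
total_rows = RUNS * rows_per_run

print(rows_per_run, total_rows)
```

Under that assumption each run contributes 86,400 rows, and 500 runs give 43.2 million, which matches the "~43 million" figure.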
Method
orbital-mechanics.rs: ECI orbit + attitude dynamics, eclipse model, fault injection at a random start time.
preprocess.py: Rolling stats (30 s to 600 s), slopes (1 min to 1 hr), eclipse counters. 532 features total.
train.py: 400 trees, multi:softprob, 8 classes. Time split: 70% train / 30% test.
evaluate.py: Per-class threshold on adjusted probability = proba / threshold. Battery = 4.0, solar_panel = 3.5.
explain.py: TreeExplainer on 500 stratified test rows per class; global + per-class importance.
inference.py: Sliding buffer (4,000 raw rows), predict every 30 s, alert on 3 consecutive hits.
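The per-class threshold step in evaluate.py can be read as dividing each class probability by a class-specific divisor before picking the winner. The following is a minimal sketch of that reading, not the project's actual code. Only the battery and solar_panel divisors come from the text. The four-class list (the real model has 8 classes), the default divisor of 1.0 for unlisted classes, and the function name are assumptions made for illustration.

```python
# Hypothetical class list: only "battery" and "solar_panel" are named in the
# text; the other entries are placeholders.
CLASSES = ["nominal", "battery", "solar_panel", "other_fault"]
THRESHOLDS = {"battery": 4.0, "solar_panel": 3.5}  # values from the text
DEFAULT_THRESHOLD = 1.0  # assumption: unlisted classes are left unscaled


def adjusted_prediction(proba):
    """Return (class_name, adjusted_scores) for one row of class probabilities."""
    if len(proba) != len(CLASSES):
        raise ValueError("expected one probability per class")
    # adjusted probability = proba / threshold, applied class by class
    adjusted = [
        p / THRESHOLDS.get(name, DEFAULT_THRESHOLD)
        for name, p in zip(CLASSES, proba)
    ]
    # Pick the class with the largest adjusted score.
    best = max(range(len(CLASSES)), key=adjusted.__getitem__)
    return CLASSES[best], adjusted


label, scores = adjusted_prediction([0.30, 0.60, 0.05, 0.05])
print(label, scores)
```

With these divisors, a raw battery probability of 0.60 is scaled to 0.15 and loses to a nominal score of 0.30. Under this reading, a divisor above 1 makes the classifier demand much stronger evidence before it reports that class.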
Anomaly classes
Results: Live Demo
Pick a scenario and watch the detector in action. Telemetry plays back at accelerated speed; the ML panel updates every 30 simulated seconds. The alert fires when the model predicts the same class three times in a row with at least 50 % confidence. SHAP bars show which features drove each prediction.
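The alert rule described above (same class three times in a row, each with at least 50% confidence) can be sketched as a small state machine. This is an illustrative sketch rather than the code in inference.py. The class name ConsecutiveAlert, the "nominal" label, and the choice to reset the streak on a nominal or low-confidence prediction are assumptions.

```python
from collections import deque


class ConsecutiveAlert:
    """Fire when the same fault class is predicted N times in a row."""

    def __init__(self, required_hits=3, min_confidence=0.5, nominal_label="nominal"):
        self.required_hits = required_hits
        self.min_confidence = min_confidence
        self.nominal_label = nominal_label  # assumption: nominal never alerts
        self.recent = deque(maxlen=required_hits)

    def update(self, label, confidence):
        """Feed one prediction; return the alerting class name or None."""
        # A nominal or low-confidence prediction breaks the streak.
        if label == self.nominal_label or confidence < self.min_confidence:
            self.recent.clear()
            return None
        self.recent.append(label)
        # Alert only when the window is full and holds a single class.
        if len(self.recent) == self.required_hits and len(set(self.recent)) == 1:
            return label
        return None


detector = ConsecutiveAlert()
stream = [("nominal", 0.99), ("battery", 0.62), ("battery", 0.71), ("battery", 0.80)]
alerts = [detector.update(label, conf) for label, conf in stream]
print(alerts)
```

With predictions every 30 simulated seconds, the first and third hit of a streak are 60 seconds apart. Requiring a streak trades some latency for robustness against one-off misclassifications.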
Conclusion
The final system achieves macro F1 = 1.00 on a held-out time split and zero false positives across all tested nominal runs, including during the eclipse entry and sunlight re-entry windows that originally caused spurious solar-panel alerts. The SHAP breakdown confirms that each anomaly class is driven by the physically appropriate signal: angular velocity noise for wheel friction, z-axis drift for thruster failure, battery SOC slope for degradation, and the long-range power slope for solar panel faults during eclipse.
The key lesson is architectural: when an anomaly can be masked by a periodic physical event, you need a feature whose temporal horizon spans that period. Short rolling windows encode current state; long-range slopes encode pre-event baselines. Combining both is what made the eclipse ambiguity learnable.
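To make the short-window versus long-horizon distinction concrete, here is a minimal sketch with two synthetic power traces that look identical over the last 30 seconds but differ over the preceding hour. The function names, the 1 Hz sampling rate, and the synthetic signals are assumptions for illustration. The project's real features are computed in preprocess.py and may be defined differently.

```python
def rolling_mean(values, window):
    """Mean of the last `window` samples (fewer if the history is shorter)."""
    tail = values[-window:]
    return sum(tail) / len(tail)


def ols_slope(values, window, dt=1.0):
    """Least-squares slope (units per second) over the last `window` samples."""
    tail = values[-window:]
    n = len(tail)
    if n < 2:
        return 0.0
    t_mean = (n - 1) * dt / 2.0
    v_mean = sum(tail) / n
    num = sum((i * dt - t_mean) * (v - v_mean) for i, v in enumerate(tail))
    den = sum((i * dt - t_mean) ** 2 for i in range(n))
    return num / den


# Two synthetic one-hour power traces at an assumed 1 Hz. Both end with
# 30 s of zero power (eclipse); only one was degrading beforehand.
SUNLIT, ECLIPSE = 3570, 30
healthy = [100.0] * SUNLIT + [0.0] * ECLIPSE
degrading = [100.0 - 20.0 * i / SUNLIT for i in range(SUNLIT)] + [0.0] * ECLIPSE

short_h, short_d = rolling_mean(healthy, 30), rolling_mean(degrading, 30)
long_h, long_d = ols_slope(healthy, 3600), ols_slope(degrading, 3600)
print(short_h, short_d)  # identical: the 30 s window only sees the eclipse
print(long_h, long_d)    # different: the 1 hr slope sees the earlier decline
```

The 30-second mean cannot tell the two traces apart, while the one-hour slope is more negative for the degrading trace. That is the sense in which long-range slopes carry the pre-event baseline.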