SST El Niño iForest

The standard way of tracking El Niño has always bugged me. Indices like the ONI boil everything down to a single area-averaged value for the Niño 3.4 region, which washes out the intense, localized temperature spikes that individual ocean buoys actually record. I wanted to see whether unsupervised machine learning could catch these hidden spatial anomalies by looking directly at more than 178,000 daily observations from the raw TAO buoy network.

To test this, we set up a head-to-head benchmark pitting Isolation Forest against Local Outlier Factor (LOF) and One-Class SVM. We ran two very different feature scenarios. One used Sea Surface Temperature (SST) data alone, and the other was a multivariate setup that added atmospheric variables such as wind and humidity. We wanted to see whether giving the models more complex data would actually yield better anomaly detection.

The results were a reality check. The multivariate setup did not improve performance. More features do not help when the target proxy is itself derived from ocean temperatures, because the atmospheric variables carry little information about the label that SST does not already provide.

The real winner was LOF using the simple SST-only features. It posted the best F1-score in the benchmark, which suggests that catching El Niño in this dataset comes down to isolating local density deviations. Even though LOF won on F1, Isolation Forest stayed robust across every hyperparameter setting we tried. That makes it the most viable tool for scaling up to massive sensor networks.
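The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: it assumes scikit-learn, replaces the TAO data with synthetic SST values, and the helper names (`make_synthetic_sst`, `compare_detectors`), the 5% contamination rate, and all hyperparameters are hypothetical choices made for the example. The scores it produces say nothing about the real benchmark.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def make_synthetic_sst(n=2000, anomaly_rate=0.05, seed=0):
    """Build a toy SST-only dataset (hypothetical stand-in for TAO buoy data)."""
    rng = np.random.default_rng(seed)
    n_anom = round(n * anomaly_rate)
    # "Normal" days: temperatures clustered around 27 degrees C.
    normal = rng.normal(27.0, 0.8, size=n - n_anom)
    # "Anomalous" days: warm spikes spread over a wider, hotter range.
    warm = rng.uniform(29.5, 33.0, size=n_anom)
    sst = np.concatenate([normal, warm])
    labels = np.concatenate([
        np.zeros(n - n_anom, dtype=int),
        np.ones(n_anom, dtype=int),
    ])
    order = rng.permutation(n)
    return sst[order].reshape(-1, 1), labels[order]


def compare_detectors(X, y, contamination=0.05, seed=0):
    """Fit three unsupervised detectors and score each with F1 against y."""
    X_scaled = StandardScaler().fit_transform(X)
    detectors = {
        "IsolationForest": IsolationForest(
            contamination=contamination, random_state=seed
        ),
        "LOF": LocalOutlierFactor(n_neighbors=35, contamination=contamination),
        "OneClassSVM": OneClassSVM(kernel="rbf", gamma="scale", nu=contamination),
    }
    scores = {}
    for name, model in detectors.items():
        # fit_predict returns +1 for inliers and -1 for outliers.
        pred = model.fit_predict(X_scaled)
        scores[name] = f1_score(y, (pred == -1).astype(int))
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_sst()
    for name, score in compare_detectors(X, y).items():
        print(f"{name}: F1 = {score:.3f}")
```

To mimic the multivariate scenario, extra columns (for example wind and humidity) would be stacked onto `X` before calling `compare_detectors`; the labels are only used for scoring, never for fitting.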

Repository

SST El Niño iForest