Seismic Phase Detection with Limited Data


Abstract

Seismic phase picking is fundamental to earthquake detection and hazard assessment, yet modern deep learning approaches typically require hundreds of thousands of labeled seismograms—limiting their applicability in data-scarce regions. We present a U-Net-based framework that achieves high-accuracy P-wave arrival detection using only a few hundred training samples by integrating signal-derived features directly into the model input. These features, including multiscale STA/LTA ratios, Hilbert-envelope derivatives, frequency-band energy attributes, and maximum-amplitude volatility, provide the network with physically meaningful cues analogous to those used by human analysts. Trained on as few as 92 seismograms, the model achieves sub-second median pick errors (0.46–0.75 s) and robust convergence, while models trained solely on raw waveforms exhibit systematic biases exceeding 9 s. Feature-weight analysis reveals that envelope- and frequency-based features dominate model decision-making, confirming that incorporating domain knowledge guides learning toward interpretable, physically consistent representations. The approach dramatically reduces data requirements—by more than three orders of magnitude compared to state-of-the-art pickers such as PhaseNet or EQTransformer—while maintaining comparable accuracy. This signal-guided neural architecture bridges traditional seismological feature extraction with modern machine learning, providing a practical, data-efficient, and interpretable solution for automated phase picking in both well-instrumented and data-limited regions.
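As a rough illustration of one of the signal-derived features named in the abstract, the sketch below computes classic STA/LTA ratios at several window scales and stacks them as extra input channels. It is not the authors' implementation. The function names (`sta_lta`, `multiscale_sta_lta`), the window lengths, and the synthetic trace are all assumptions chosen for illustration. The other features mentioned (Hilbert-envelope derivatives, frequency-band energy, maximum-amplitude volatility) are not shown.

```python
def sta_lta(signal, n_sta, n_lta, eps=1e-12):
    """Short-term / long-term average ratio of signal energy.

    Both windows are trailing (they end at the current sample).
    Samples before the first full long window are left at 0.0.
    """
    if not 0 < n_sta <= n_lta:
        raise ValueError("expected 0 < n_sta <= n_lta")
    # Prefix sums of squared amplitude give O(1) window averages.
    csum = [0.0]
    for x in signal:
        csum.append(csum[-1] + x * x)
    ratio = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (csum[i + 1] - csum[i + 1 - n_sta]) / n_sta
        lta = (csum[i + 1] - csum[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / (lta + eps)
    return ratio


def multiscale_sta_lta(signal, scales):
    """Return one STA/LTA channel per (n_sta, n_lta) pair in `scales`."""
    return [sta_lta(signal, n_sta, n_lta) for n_sta, n_lta in scales]


# Hypothetical synthetic trace: low-amplitude noise, then an abrupt
# onset at sample 200 standing in for a P-wave arrival.
trace = [0.01 * (-1) ** i for i in range(200)] + [1.0 * (-1) ** i for i in range(100)]

# Window lengths (in samples) are illustrative assumptions only.
features = multiscale_sta_lta(trace, scales=[(5, 50), (10, 100)])

# Index of the largest ratio in the finest-scale channel; it falls
# shortly after the onset because the short window must fill first.
peak = max(range(len(trace)), key=lambda i: features[0][i])
print("onset at 200, finest-scale STA/LTA peak at", peak)
```

In a feature-augmented setup of the kind the abstract describes, channels like these would be concatenated with the raw waveform before being passed to the network, so that the model sees onset cues directly rather than having to learn them from few examples.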
