AI Seismology and the Hidden Earthquake Catalog of Alaska's Subduction Zone

The story in four numbers

~1,750

Previously undetected earthquakes identified by machine-learning analysis of continuous seismic data along the Alaska subduction zone — events that were recorded by instruments but invisible to conventional detection methods

155 mi

Length of the subduction zone boundary segment that the new microseismic catalog delineates in southern Alaska — a fault boundary whose geometry is now mapped with a spatial resolution the prior catalog could not provide

M1.5–2.5

Approximate completeness magnitude of conventional Alaska seismograph networks — the threshold below which events are recorded but not reliably identified, representing the detection gap that machine learning is now closing

9.2

Magnitude of the 1964 Good Friday earthquake — the largest seismic event in North American recorded history, originating on the same Alaska subduction zone that the AI-expanded catalog is now mapping in finer resolution

// The thesis in one paragraph

Earthquake catalogs — the comprehensive records of seismic events that seismologists use to understand fault behaviour, stress accumulation, and rupture history — have always been incomplete at the bottom end of the magnitude scale. The instruments exist to record ground motion from very small earthquakes; the analytical methods to distinguish genuine seismic signals from the noise floor of continuous data streams historically have not. A machine-learning analysis of seismic recordings from southern Alaska has changed that equation for one of the most consequential fault boundaries in the North American seismological record, identifying approximately 1,750 earthquakes along a 155-mile segment of the Alaska subduction zone that conventional detection methods had not registered. The firm reads this as a demonstration of a principle that now applies across geophysical monitoring: neural-network-based pattern recognition is systematically revealing a physical world that was always present but operationally invisible, and the strategic implications for seismic hazard assessment, infrastructure planning, and the insurance markets that price catastrophic risk are beginning to be material.

The completeness problem in earthquake catalogs

The seismograph network monitoring earthquake activity in Alaska and the broader Pacific Northwest is among the densest and best-instrumented on Earth. It is maintained by the Alaska Earthquake Center, the USGS Earthquake Hazards Program, the IRIS Consortium, and multiple research institutions operating broadband seismometers across one of the most seismically active regions on the planet. Yet the earthquake catalog those networks have produced is profoundly incomplete at the small-magnitude end. This incompleteness is not a failure of instrumentation. Modern broadband seismometers are sensitive enough to record ground motion from earthquakes with magnitudes below zero, and the continuous data streams they generate contain the ground motion of every event that occurs within detection range. The incompleteness is a failure of the analytical methods used to detect and characterise seismic arrivals in data that contain not only genuine tectonic signals but also the broadband noise of ocean waves, industrial activity, wind loading on station infrastructure, and dozens of other sources of ground motion that a conventional detection algorithm cannot reliably distinguish from a small earthquake. The completeness magnitude of a seismograph network is the magnitude threshold above which the network reliably detects essentially all events; in Alaska it ranges from approximately M1.5 to M2.5 depending on station density and local noise environment. Below that threshold, events are recorded but not catalogued, so the catalogued counts fall away from the Gutenberg-Richter relationship (the size-frequency distribution that governs how many events occur at each magnitude level) precisely where earthquakes are most numerous.

For the Alaska subduction zone boundary, this means that the seismic catalog used to characterise fault behaviour, delineate seismogenic zones, and calibrate probabilistic hazard models has been systematically missing the smallest and most spatially informative events the fault produces.
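To make the completeness problem concrete, the sketch below builds a synthetic catalog that is complete only above M2.0, then estimates the completeness magnitude with the maximum-curvature heuristic and the Gutenberg-Richter b-value with the Aki maximum-likelihood formula. Every number here (the true b-value of 1.0, the M2.0 threshold, the rate at which smaller events are lost) is a hypothetical chosen for illustration, not a value from the Alaska analysis.

```python
import math
import random
from collections import Counter


def completeness_max_curvature(mags, bin_width=0.1):
    """Estimate Mc as the most populated magnitude bin (maximum curvature)."""
    counts = Counter(round(m / bin_width) for m in mags)
    peak_bin = max(counts, key=counts.get)
    return peak_bin * bin_width


def b_value_aki(mags, mc, bin_width=0.1):
    """Aki maximum-likelihood b-value with a half-bin correction for binned magnitudes."""
    used = [m for m in mags if m >= mc - 1e-9]
    mean_mag = sum(used) / len(used)
    return math.log10(math.e) / (mean_mag - (mc - bin_width / 2.0))


# Synthetic catalog (hypothetical numbers): true b = 1.0, every event at or
# above M2.0 is catalogued, smaller events are progressively lost.
random.seed(0)
TRUE_B, TRUE_MC, M_MIN = 1.0, 2.0, 1.0
catalog = []
for _ in range(50_000):
    m = round(M_MIN + random.expovariate(TRUE_B * math.log(10)), 1)
    p_detect = 1.0 if m >= TRUE_MC else 10 ** (-2.0 * (TRUE_MC - m))
    if random.random() < p_detect:
        catalog.append(m)

mc_est = completeness_max_curvature(catalog)
b_above_mc = b_value_aki(catalog, mc_est)
b_naive = b_value_aki(catalog, min(catalog))  # pretends the whole catalog is complete
print(f"Mc ~ {mc_est:.1f}, b above Mc ~ {b_above_mc:.2f}, naive b ~ {b_naive:.2f}")
```

Fitting only above the estimated completeness magnitude should recover a b-value close to the one used to generate the data, while fitting the whole incomplete catalog gives a visibly lower value. That difference is the practical reason hazard calibration is sensitive to where the completeness threshold sits.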

// Section 01 of 04

01 · Why conventional networks miss small earthquakes

The mechanics of conventional seismic event detection are straightforward in principle and difficult in practice. Understanding where the method fails clarifies why machine learning produces such a different result when applied to the same continuous data streams.

The dominant automated detection algorithm in operational seismology for several decades has been the Short-Term Average over Long-Term Average (STA/LTA) trigger, which identifies a seismic arrival by detecting a rapid increase in the ratio of signal amplitude in a short time window relative to a longer background window. The method is robust, computationally cheap, and works well for moderate and large earthquakes whose P-wave arrivals produce a clear amplitude step above the ambient noise level. Its limitation is a hard boundary between detectable and undetectable events that is set by the local noise floor: an earthquake whose P-wave arrival does not produce a sufficient amplitude step above background — because the event is too small, too distant, or because the station is in a high-noise environment — does not trigger a detection. No amount of threshold tuning recovers an event that sits below this boundary; the method lacks the ability to distinguish a genuine low-amplitude seismic arrival from the background noise in which it is embedded. Template matching — a second-generation method that cross-correlates continuous data streams against waveform templates from known events — extends detection sensitivity substantially by exploiting the fact that earthquakes on the same fault patch produce highly similar waveforms. Where template matching finds events that STA/LTA misses, it does so by recognising the specific waveform shape of a known event type, not by improving the general signal-to-noise discrimination. Its limitation is that it requires an existing catalog of template events and is most effective where the target earthquake family is already partially characterised. 
The machine-learning analysis applied to the Alaska data operates differently from both: it learns the signal features that distinguish genuine seismic arrivals from noise across the full complexity of the data, without being constrained by a fixed amplitude threshold or a predefined template library. The result is a detection capability that extends substantially below the completeness magnitude of the conventional catalog, recovering events that neither STA/LTA nor template matching would have identified.
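The STA/LTA trigger described above can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming a 100 Hz sampling rate, 0.5 s and 10 s windows, and a trigger ratio of 4.0; the helper names and the decaying-sinusoid "arrival" are hypothetical stand-ins, not the configuration of any operational network.

```python
import math
import random


def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, aligned to window ends."""
    csum = [0.0]
    for x in signal:
        csum.append(csum[-1] + x * x)
    ratio = [0.0] * len(signal)  # stays 0.0 until the long window has filled
    for end in range(n_lta, len(signal) + 1):
        sta = (csum[end] - csum[end - n_sta]) / n_sta
        lta = (csum[end] - csum[end - n_lta]) / n_lta
        ratio[end - 1] = sta / lta if lta > 0 else 0.0
    return ratio


def add_arrival(trace, onset, amplitude, fs=100.0, freq=5.0, decay_s=1.0, dur_s=3.0):
    """Add a decaying sinusoid as a stand-in for a P-wave arrival (illustrative only)."""
    for k in range(int(dur_s * fs)):
        t = k / fs
        trace[onset + k] += amplitude * math.exp(-t / decay_s) * math.sin(2 * math.pi * freq * t)


random.seed(1)
trace = [random.gauss(0.0, 1.0) for _ in range(6000)]  # 60 s of unit-variance noise
add_arrival(trace, onset=2000, amplitude=8.0)   # clear event, well above the noise
add_arrival(trace, onset=4000, amplitude=0.5)   # weak event, below the noise level

ratio = sta_lta(trace, n_sta=50, n_lta=1000)    # 0.5 s and 10 s windows at 100 Hz
THRESHOLD = 4.0                                 # assumed trigger level
triggered = [i for i, r in enumerate(ratio) if r > THRESHOLD]
print("first trigger sample:", triggered[0] if triggered else None)
print("any trigger near the weak event:", any(4000 <= i < 4300 for i in triggered))
```

The strong arrival trips the trigger within a fraction of a second of its onset, while the weak one barely moves the ratio. Lowering the threshold far enough to catch it would also admit ordinary noise fluctuations, which is the trade-off the paragraph above describes.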

The conventional seismograph network does not fail to detect small earthquakes because the instruments cannot sense them. It fails because the analytical layer between the instrument and the catalog is built around methods that were designed for a different signal-to-noise regime. Machine learning does not improve the instrument. It replaces the analytical layer, and in doing so recovers seismicity that was sitting in the archived data waiting to be found.
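Template matching, the second-generation method discussed in this section, can be illustrated the same way. The sketch below slides a known waveform along noisy data and computes the normalised cross-correlation at each lag. The chirp-shaped template, the buried copy of it, and the fixed 0.4 detection threshold are all assumptions for illustration; operational studies typically set the threshold from the statistics of the correlation trace itself.

```python
import math
import random


def normalized_xcorr(data, template):
    """Sliding normalised cross-correlation (Pearson r) of template against data."""
    m = len(template)
    t_mean = sum(template) / m
    t = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t))
    cc = []
    for start in range(len(data) - m + 1):
        window = data[start:start + m]
        w_mean = sum(window) / m
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        dot = sum(a * b for a, b in zip(t, w))
        cc.append(dot / (t_norm * w_norm) if t_norm > 0 and w_norm > 0 else 0.0)
    return cc


def chirp_template(n=200, fs=100.0, f0=2.0, f1=12.0):
    """Hann-tapered linear chirp standing in for a known event waveform (illustrative)."""
    sweep = (f1 - f0) / (n / fs)
    out = []
    for i in range(n):
        t = i / fs
        taper = 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
        out.append(taper * math.sin(2.0 * math.pi * (f0 * t + 0.5 * sweep * t * t)))
    return out


random.seed(2)
template = chirp_template()
data = [random.gauss(0.0, 1.0) for _ in range(3000)]  # unit-variance noise
ONSET = 1500
for i, x in enumerate(template):
    data[ONSET + i] += 2.0 * x  # scaled repeat of the template event, buried in the noise

cc = normalized_xcorr(data, template)
best = max(range(len(cc)), key=lambda i: cc[i])
CC_THRESHOLD = 0.4  # assumed fixed threshold for this toy example
print(f"best lag {best}, cc {cc[best]:.2f}, detected: {cc[best] > CC_THRESHOLD}")
```

The correlation peak stands out because the whole waveform shape contributes to it, not just its amplitude. The same property explains the method's limit: an event whose waveform resembles no template in the library produces no peak at all.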

// Section 02 of 04

02 · What machine learning actually does to seismic data

The application of deep learning to seismological event detection is not a single technique — it is a family of approaches, each designed to solve a specific part of the problem of extracting reliable event detections from continuous, noisy, multi-channel recordings.

The dominant deep-learning architectures in this application domain are convolutional neural networks (CNNs) and recurrent neural networks (RNNs), often combined in hybrid architectures and trained on labelled seismic waveform examples drawn from large regional and global earthquake databases; published training sets range from hundreds of thousands to more than a million waveforms. The training process teaches the network to recognise the characteristic shapes of P-wave and S-wave arrivals, the compressional and shear energy pulses that radiate from an earthquake source and arrive at a seismograph at predictable times determined by the velocity structure of the crust between the source and the receiver. A network trained on a large and diverse waveform dataset learns to identify these shapes across an enormous range of signal-to-noise ratios, source distances, and frequency contents, because it has seen many of those variants in training and has learned the features that are consistent across them. When applied to a continuous seismic data stream, such a network scans the data in short windows and assigns each window a probability of containing a P-wave arrival, an S-wave arrival, or background noise. Detections are declared where the probability exceeds a specified threshold, and the resulting event list can be considerably more complete at small magnitudes than anything a conventional algorithm produces on the same data. The Alaska analysis likely used one of the established published architectures (PhaseNet, EQTransformer, or a similar model), possibly fine-tuned on regional Alaska waveform data to improve performance in the specific noise environment of southern Alaska stations.
The 1,750 newly identified events represent the portion of the machine-learning catalog that clears a quality threshold above the detection boundary — events for which the algorithm assigned high-confidence arrival picks that were subsequently confirmed by location analysis showing a coherent spatial pattern consistent with subduction zone seismicity rather than spurious detections scattered randomly through the data.
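The step from a probability trace to a list of picks can be sketched without any model at all. The example below assumes a picker has already produced a per-sample P-wave probability (here a synthetic trace with three bumps) and shows how the choice of threshold decides which arrivals enter the catalog. The function names, bump heights, and thresholds are hypothetical, and no real PhaseNet or EQTransformer API is called.

```python
import math


def picks_from_probability(prob, threshold=0.5, min_gap=50):
    """Return (sample_index, probability) for the peak of each run above threshold."""
    picks = []
    i, n = 0, len(prob)
    while i < n:
        if prob[i] < threshold:
            i += 1
            continue
        j = i
        while j < n and prob[j] >= threshold:
            j += 1
        peak = max(range(i, j), key=lambda k: prob[k])
        if picks and peak - picks[-1][0] < min_gap:
            if prob[peak] > picks[-1][1]:
                picks[-1] = (peak, prob[peak])  # keep the stronger of two close picks
        else:
            picks.append((peak, prob[peak]))
        i = j
    return picks


def synthetic_probability(n, bumps, width=10.0, floor=0.02):
    """Stand-in for a picker's output: Gaussian bumps on a low background."""
    trace = []
    for i in range(n):
        p = floor + sum(h * math.exp(-0.5 * ((i - c) / width) ** 2) for c, h in bumps)
        trace.append(min(p, 1.0))
    return trace


# Three hypothetical arrivals with peak probabilities of about 0.92, 0.37 and 0.72.
p_wave_prob = synthetic_probability(3000, [(300, 0.90), (1200, 0.35), (2000, 0.70)])
strict = picks_from_probability(p_wave_prob, threshold=0.5)
loose = picks_from_probability(p_wave_prob, threshold=0.3)
print([idx for idx, _ in strict])  # -> [300, 2000]
print([idx for idx, _ in loose])   # -> [300, 1200, 2000]
```

A stricter threshold yields fewer, higher-confidence picks and a looser one admits marginal arrivals. A quality cut of this kind, followed by a check that the located events form a coherent spatial pattern, is one plausible reading of how a headline count such as 1,750 is arrived at.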

// Exhibit 1 · Seismic detection method comparison: capability and limitation

Characterisations represent typical configurations across common detection architectures. Performance varies by noise environment, station density, and regional calibration. Not a vendor evaluation.

Method	Min detectable magnitude (typical)	Requires training data	Processing speed	Generalises to new event types
STA/LTA amplitude trigger	Mc ~1.5–2.5	No	Real-time	Yes (amplitude-based)
Template matching	Mc ~0.5–1.0 (with good templates)	Yes (template catalog)	Moderate (computationally intensive)	Limited (requires matching template)
Deep learning (CNN/RNN)	Mc ~0.0–0.5 (well-trained models)	Yes (large labelled dataset)	Fast (GPU-accelerated)	Yes (generalised waveform features)
Human analyst review	Mc ~0.5–1.0 (experienced analyst)	Implicit (expertise)	Slow (not scalable to continuous data)	Yes (contextual judgment)

// Section 03 of 04

03 · What 1,750 new earthquakes reveal about the subduction zone boundary

The value of the expanded catalog is not in the count of new events — it is in the spatial pattern those events collectively define, and what that pattern tells seismologists about the geometry and behaviour of the fault interface that has been generating some of the largest earthquakes in recorded history.

The Alaska subduction zone is the convergent boundary where the Pacific Plate descends beneath the North American Plate along a trench that extends from the Gulf of Alaska westward through the Aleutian Island chain. It is the source of the 1964 Good Friday earthquake, at magnitude 9.2 the largest seismic event in North American recorded history and among the largest ever instrumentally recorded anywhere on Earth. The fault interface that produced that rupture extends along the southern margin of Alaska and the Aleutian arc for roughly 4,000 kilometres (about 2,500 miles). Its behaviour (which segments are locked and accumulating elastic strain, which are slipping freely, where the seismogenic zone gives way to the deeper ductile regime) governs the long-term seismic hazard of southern Alaska, including the tsunami risk that affects coastlines from Alaska to Japan. The 155-mile boundary segment that the machine-learning analysis maps is a portion of this interface, and the spatial distribution of the 1,750 newly identified events provides information about the fault geometry at a scale the prior catalog could not resolve. Specifically, clusters of microseismicity define the edges of fault patches: the boundaries between locked zones, where earthquakes do not occur because the fault is in contact and not slipping, and more freely slipping zones, where microearthquakes occur continuously as the fault moves. This boundary mapping is not directly predictive, since the presence of a locked patch does not tell you when it will rupture, but it is the foundational information on which probabilistic seismic hazard models are built, and improving it directly improves the accuracy of those models for the infrastructure and population centres that depend on them.

The 1,750 new earthquakes are not the finding — the spatial pattern they form is. Each event individually tells you that a small slip occurred on the fault at a specific location and time. Taken together, they draw the outline of the fault interface with a resolution that changes the probabilistic hazard calculation for every structure built above it.

// WHAT THE EXPANDED CATALOG CHANGES

Fault geometry resolution: the spatial density of microseismic events defines the boundaries between locked and creeping fault segments at a scale the prior catalog could not resolve, directly improving the input data for seismic hazard models.

Stress state mapping: the spatial pattern of microseismicity indicates where elastic strain is being accommodated by slow slip or fault creep versus where it is accumulating on locked patches, providing a dynamic picture of the stress field that static geodetic measurements alone cannot supply.

Detection completeness: the expanded catalog closes the gap in the magnitude-frequency distribution below the conventional network completeness threshold, enabling better calibration of the Gutenberg-Richter relationship that governs hazard model frequency estimates.

Temporal analysis potential: a more complete catalog enables better detection of temporal clustering and seismicity rate changes that may precede larger events.

// WHAT THE EXPANDED CATALOG DOES NOT CHANGE

Earthquake prediction: the detection of more microearthquakes does not enable reliable prediction of when or where the next large rupture will occur; the physical mechanisms connecting microseismicity to major earthquake nucleation remain an open research problem.

The locked patch question: knowing where a fault segment is locked does not tell you when it will rupture; the recurrence interval and loading history require additional paleoseismic and geodetic data that the seismic catalog cannot provide.

Infrastructure vulnerability: improved hazard characterisation informs design standards and retrofitting priorities over planning horizons of years to decades; it does not reduce the vulnerability of existing structures.

The tsunami early warning timeline: detection of a large subduction zone earthquake triggers tsunami warnings in minutes; microseismic catalog quality does not materially affect this operational timeline.

// Section 04 of 04

04 · From expanded catalog to seismic hazard — the practical stakes

Probabilistic seismic hazard assessment is the translation layer between the scientific record of past earthquakes and the engineering standards, insurance pricing, and infrastructure investment decisions that depend on quantified estimates of future ground shaking. The quality of that translation depends directly on the completeness of the input catalog.

The infrastructure exposure in southern Alaska is concentrated, strategically significant, and disproportionately dependent on accurate seismic hazard characterisation. The Trans-Alaska Pipeline System, which carries crude oil from the North Slope fields at Prudhoe Bay to the Valdez marine terminal approximately 800 miles to the south, crosses multiple active fault zones and was specifically engineered to survive moderate seismic displacement. Its design basis, however, was calibrated against the hazard models available at the time of construction in the 1970s, models that are now several decades old and built on earthquake catalogs that the ML-expanded record demonstrates were incomplete. Anchorage, the largest city in Alaska, sits on poorly consolidated glacial sediments that amplify seismic shaking significantly relative to bedrock sites, and its building stock spans a wide range of vintage and seismic performance. The military installations in Alaska, including Fort Wainwright, Joint Base Elmendorf-Richardson, and several radar and communications facilities along the Aleutian chain, represent strategic assets whose operational continuity under seismic loading is an explicit defence infrastructure requirement. For each of these assets, the probabilistic seismic hazard analysis (PSHA) that governs design standards and retrofitting decisions is only as accurate as the earthquake catalog it draws on. The Gutenberg-Richter relationship that PSHA uses to estimate the frequency of large earthquakes from the observed rate of small ones is calibrated on the catalogued record, and a catalog that is systematically missing thousands of events below the completeness threshold can bias that rate estimate, particularly where the missing events are concentrated in space rather than uniformly distributed.
The discovery of 1,750 events concentrated along a specific 155-mile fault segment is exactly the kind of spatially structured completeness gap that can introduce systematic errors into regional hazard models, and its correction provides a direct improvement in the foundation on which those models stand. The reinsurance and catastrophe modelling industries, which price Alaska earthquake risk using PSHA-derived exceedance curves, are the commercial community with the most immediate sensitivity to the quality of the underlying catalog — and the most direct financial incentive to track the improvement of that catalog as ML-enhanced seismic analysis becomes standard practice.
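The sensitivity described above can be shown with a small worked calculation. The sketch extrapolates the annual rate of large events from an observed rate of small ones using the Gutenberg-Richter relationship, then converts that rate into a Poisson probability over a 50-year horizon. The reference rate of 1,000 events per year at M2 and the two b-values are hypothetical, and a real PSHA also integrates over fault sources, magnitude limits, and ground-motion models that are omitted here.

```python
import math


def rate_at_or_above(mag, ref_rate, ref_mag, b):
    """Annual rate of events >= mag, extrapolated from the rate at ref_mag via G-R."""
    return ref_rate * 10 ** (-b * (mag - ref_mag))


def prob_at_least_one(annual_rate, years):
    """Poisson probability of one or more events in the given time span."""
    return 1.0 - math.exp(-annual_rate * years)


REF_MAG, REF_RATE = 2.0, 1000.0  # hypothetical: 1,000 events per year at M >= 2
for b in (1.0, 0.9):
    rate_m7 = rate_at_or_above(7.0, REF_RATE, REF_MAG, b)
    p50 = prob_at_least_one(rate_m7, 50)
    print(f"b={b}: M>=7 rate {rate_m7:.4f}/yr, 50-yr probability {p50:.2f}")
```

Under these assumed inputs, changing b from 1.0 to 0.9 roughly triples the extrapolated rate of M7 and larger events. Because b is estimated from the small-magnitude end of the catalog, this is the mechanism by which catalog completeness feeds through to the exceedance curves that engineers and insurers rely on.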

Near-term: catalog enrichment and hazard model refinement

The immediate application of the expanded Alaska seismic catalog is as an input to the next revision of the USGS National Seismic Hazard Model for Alaska, which is updated on a roughly 6-year cycle and which explicitly incorporates new seismicity data, new fault source characterisation, and new ground motion models as they become available. The 1,750 new events and the fault segment geometry they define will be evaluated by the hazard model team against the existing catalog and fault source model, and where they reveal systematic incompleteness or geometric inaccuracy in the prior model, they will affect the ground motion exceedance curves that underpin building codes, infrastructure design standards, and insurance risk pricing across the region.

Long-term: AI-standard seismology and global catalog quality

The broader implication of the Alaska result is methodological: the same ML analysis framework can be applied to continuous seismic data archives globally, and wherever it is applied it is likely to reveal microseismicity that the conventional catalog has systematically missed. The Alaska result is one of a growing number of similar analyses (on the Cascadia Subduction Zone, on the Himalayan frontal thrust, on the Manila Trench), each of which is improving the fault characterisation that hazard models require. As ML-enhanced seismology becomes standard practice and the improved catalogs accumulate, the probabilistic hazard models for the world's most seismically exposed infrastructure will be progressively recalibrated against a more complete picture of the fault systems they are meant to characterise.

The incomplete record and what fills it

The earthquake catalog that seismologists have used to characterise the Alaska subduction zone was not wrong — it was a faithful representation of what conventional detection methods could reliably identify from the data those methods were designed to analyse. What machine learning has now demonstrated is that the data itself contained substantially more information than those methods could extract, and that the gap between the recorded data and the catalogued events was not noise but signal — 1,750 genuine earthquakes that were systematically below the analytical horizon of the tools used to search for them. Filling that gap does not resolve the fundamental unpredictability of large earthquake occurrence; the physics of fault rupture nucleation are not changed by a more complete microseismicity catalog. What it does is provide the most accurate available picture of where the fault is active, where stress is being accommodated, and where the boundaries between seismogenic and aseismic behaviour lie — the foundational information that hazard models, infrastructure design, and risk pricing depend on, and that has been systematically incomplete for as long as seismology has been monitoring this boundary. That incompleteness is now being closed, one archive at a time, by the same neural network architectures that the technology sector has deployed for image recognition and natural language processing — applied, with equal effectiveness, to the detection of the Earth's smallest recorded expressions of tectonic motion.

// The closing thought

The firm reads the Alaska microseismicity result as a representative case of a broader transformation in observational science: the application of machine learning to data archives that were collected with instruments capable of recording more than the analytical methods of the time could extract. The value of those archives was always present in the data; what has changed is the analytical capability to realise it. For the Alaska subduction zone specifically, the 1,750 recovered events and the fault geometry they define represent an improvement in the quality of the seismic hazard foundation that will compound through every model, every standard, and every risk calculation built on top of it.

Sources: Peer-reviewed seismological research on Alaska subduction zone microseismicity; USGS National Seismic Hazard Model documentation; Alaska Earthquake Center operational network data; PhaseNet and EQTransformer model publications (Zhu and Beroza, 2019; Mousavi et al., 2020); USGS Good Friday 1964 earthquake records. This note is for informational purposes only and does not constitute investment advice.

How does your firm's infrastructure or reinsurance exposure to Pacific Northwest seismic risk account for the evolving completeness of the underlying hazard catalog?
Open a conversation.
