Machine Learning Insights into the Geochemical Life Cycle of the Columbia River Flood Basalts
Abstract
Flood basalts are challenging to characterize in detail, despite their enormous erupted volumes, due to their age and chemical homogeneity. Here we explore machine learning (ML) approaches for classification and pattern identification in whole-rock geochemical data of the Columbia River Flood Basalts (CRFB), which provide key constraints on magma generation, transport, and emplacement. We utilize supervised and unsupervised ML workflows to (1) classify unknown samples into an assumed CRFB stratigraphy, and (2) identify geochemical patterns independent of eruptive order that fingerprint the underlying petrologic processes associated with magma genesis and ascent. We synthesize a large new database of chemical analyses and use a high-dimensional approach that leverages all possible ratios of major and trace elements. The supervised model achieves ~99% accuracy for classification into Formation-level classes and ~89% accuracy at the member level from a complete labeled dataset, and lower (~90%) Formation-level accuracy for a partially labeled dataset. Unsupervised clustering suggests similarities between member classes across stratigraphic boundaries in mantle source composition and crustal processing pathways. In particular, we find compositional clusters that point to a rapid expansion of crustal storage within the Grande Ronde, and primitive samples spanning all Formations that likely fingerprint persistent recharge from multiple mantle sources. Comparison of supervised and unsupervised approaches reinforces known patterns in CRFB chemistry but also highlights areas where further data assimilation could lead to robust insights. This approach represents a powerful framework for classification and highlights significant future opportunities to objectively characterize petrologic datasets like the CRFB.
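As an illustration of the "all possible ratios" feature construction mentioned above, the following is a minimal sketch rather than the authors' actual pipeline. It assumes strictly positive concentrations and uses log-ratios over unordered element pairs (so a/b and b/a differ only in sign); the function name `pairwise_log_ratios` and the sample values are hypothetical and are not taken from the CRFB database.

```python
from itertools import combinations
from math import log


def pairwise_log_ratios(sample):
    """Return log(a/b) for every unordered pair of elements in one sample.

    `sample` maps element or oxide names to strictly positive concentrations.
    Using log-ratios is an assumption of this sketch, not a detail from the abstract.
    """
    names = sorted(sample)  # fixed ordering so feature names are reproducible
    return {
        f"{a}/{b}": log(sample[a] / sample[b])
        for a, b in combinations(names, 2)
    }


# Made-up concentrations for illustration only.
example = {"SiO2": 54.0, "MgO": 4.5, "TiO2": 2.0, "Zr": 160.0}
features = pairwise_log_ratios(example)
print(len(features))  # n elements give n * (n - 1) / 2 ratio features
```

With n measured elements this produces n(n-1)/2 features per sample, which is why the ratio space grows quickly into a high-dimensional input for either a supervised classifier or a clustering step.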