Proof of Concept for Dunham Classification of Carbonate Thin Sections Using Vision Transformers
Abstract
The classification of carbonate rocks using petrographic thin sections is a fundamental task in sedimentology and is traditionally performed by geologists using schemes such as Dunham’s depositional-texture classification. However, this approach is time-consuming and may be prone to inter-observer variability, as reported in prior studies. This paper presents a proof of concept for using a Vision Transformer (ViT) to classify carbonate thin-section images into Dunham classes. The dataset from Lokier and Al Junaibi (2016) is used for fine-tuning and consists of 14 carbonate samples imaged at multiple magnifications and polarization states and labeled through expert consensus. A pre-trained ViT model (vit-base-patch16-224-in21k) is fine-tuned using open-source tools, with an emphasis on reproducibility through transparent methodology and publicly available code. The fine-tuned model achieves high accuracy on a held-out test split derived from the same dataset; however, this performance likely reflects overfitting due to the limited size of the dataset. Evaluation on an independent validation dataset sourced from the CarbonateWorld website yields a lower accuracy of 67%, consistent with known ambiguities in carbonate texture classification. Notably, when considering the top-2 predicted classes, validation accuracy improves to 85.71%, indicating that secondary prediction probabilities capture these ambiguities. Overall, the results highlight both the potential of ViT-based models for carbonate classification and the importance of open, expert-labeled datasets. Rather than proposing a state-of-the-art classifier, this study demonstrates a reproducible and extensible workflow. By providing a clear methodology and code snippets, it enables readers to fine-tune ViT models for their own use cases.
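To make the distinction between top-1 and top-2 accuracy concrete, the following is a minimal sketch of how such a metric can be computed from per-class probabilities. It is not the code used in this study: the function name `top_k_accuracy`, the three-class label set, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    probs: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Fraction of samples whose true label is among the k most probable classes."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(probs, labels):
        # Rank class indices from most to least probable.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical toy data (NOT results from the paper): three illustrative
# classes, e.g. 0 = mudstone, 1 = wackestone, 2 = packstone.
toy_probs = [
    [0.7, 0.2, 0.1],  # true class 0: correct at top-1
    [0.4, 0.5, 0.1],  # true class 0: missed at top-1, recovered at top-2
    [0.1, 0.3, 0.6],  # true class 2: correct at top-1
    [0.5, 0.1, 0.4],  # true class 1: missed at both top-1 and top-2
]
toy_labels = [0, 0, 2, 1]

top1 = top_k_accuracy(toy_probs, toy_labels, k=1)
top2 = top_k_accuracy(toy_probs, toy_labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

In this toy example the second sample illustrates the effect described above: the model's first choice is an adjacent texture class, but the true class appears as its second choice, so the sample counts as correct only under the top-2 criterion.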