Comparative Study on Vision Transformer and Convolutional Neural Networks for Solar Image Classification

Yuchen Ao
Dong Chen

Abstract

With the rapid advancement of solar observational technologies and the surge in multi-wavelength data acquisition, solar physics has entered the era of big data, posing new challenges for image analysis and classification. In this study, we present a systematic comparison between the Vision Transformer (ViT) and Convolutional Neural Networks (CNNs), focusing on their performance and underlying mechanisms in classifying solar images of the photosphere and chromosphere observed by the New Vacuum Solar Telescope (NVST). Using transfer learning with ImageNet-1k pretrained weights and data augmentation strategies, both types of model were trained and evaluated on a multi-class dataset of manually labeled solar images. Our results show that while ViT achieves classification performance comparable to that of CNNs, it exhibits greater potential in handling images containing multiple solar structures. Visualizations produced with Grad-CAM reveal that ViT tends to attend more broadly and in a more semantically coherent way to key solar features, such as sunspots and their penumbrae. In contrast, CNNs are more prone to focusing on dominant local features, which may limit their effectiveness in complex classification scenarios. Finally, we reveal for the first time the potential of ViT in solar image segmentation and recognition, highlighting the strong alignment of its attention maps with characteristic solar morphological features.
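As a rough illustration of the Grad-CAM step mentioned in the abstract, the sketch below computes a Grad-CAM heatmap from one layer's feature maps and the gradients of a class score with respect to them. It is not the authors' implementation: the function name `grad_cam`, the array shapes, and the toy inputs are assumptions made for illustration, and for a ViT the patch-token activations would first have to be reshaped into a spatial grid.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Return a Grad-CAM heatmap of shape (H, W), scaled to [0, 1].

    activations: feature maps of one layer, shape (K, H, W)  (assumed layout)
    gradients:   d(class score)/d(activations), same shape
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")

    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)

    # Keep only regions that push the class score up (ReLU).
    cam = np.maximum(cam, 0.0)

    # Normalize for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input (hypothetical values): two channels on a 2x2 grid.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 1.0   # channel 0 fires in the top-left cell
acts[1, 1, 1] = 1.0   # channel 1 fires in the bottom-right cell
grads = np.stack([np.full((2, 2), 2.0), np.full((2, 2), -1.0)])

heatmap = grad_cam(acts, grads)
```

In this toy case the channel with a positive average gradient dominates the map, while the channel with a negative average gradient is suppressed by the ReLU, which is why the heatmap highlights only the top-left cell.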

Version published to 10.21203/rs.3.rs-7187076/v1 on Research Square
Jul 29, 2025

Ensemble Deep Learning for Real-Bogus Classification with Sky Survey Images

This article has 4 authors:
1. Pakpoom Prommool
2. Sirikan Chucherd
3. Natthakan Iam-On
4. Tossapon Boongoen
This article has no evaluations. Latest version Sep 2, 2025

A Vision Transformer Model for the Detection of Glaucoma from Optic Disc Photographs

This article has 9 authors:
1. Ella Bouris
2. Brayden K. Leyva
3. Ojo Perpetua Odugbo
4. Jericho Lawson
5. Sang Wook Jin
6. Zhe Fei
7. Esteban Morales
8. Omar Alkhalili
9. Joseph Caprioli
This article has no evaluations. Latest version Sep 8, 2025

SpectraViT: A Novel Hybrid Architecture for Enhanced Melanoma Classification

This article has 4 authors:
1. Samridhi Raj Sinha
2. Asmi Parikh
3. Archana Bhise
4. Supriya Agarwal
This article has no evaluations. Latest version Aug 26, 2025
