ResNet34-Based Galaxy Morphology Classification with Machine Unlearning

Abstract

Galaxy morphology classification is fundamental to observational astronomy. The structure of a galaxy, whether it is smooth and elliptical, features spiral arms and rotation, or is an optical artifact, reveals much about how it formed, whether it has merged with others, and where it is headed evolutionarily. As surveys have grown from thousands to hundreds of thousands of galaxies, manual classification has become impossible, and automated deep-learning pipelines are now the standard approach for scaling galaxy morphology classification. A challenge remains, however: large citizen-science datasets such as Galaxy Zoo 2 suffer from label noise, because volunteers have different levels of expertise and sometimes disagree on ambiguous images. This noise is especially damaging for rare classes, where even a small percentage of wrong labels can significantly compromise what the model learns. Moreover, once a model is trained on noisy data, standard fine-tuning offers no effective way to remove the influence of label errors short of retraining from scratch. This paper addresses both problems simultaneously. We trained a ResNet34-based CNN on 61,578 Galaxy Zoo 2 images to classify galaxies into three categories (Smooth, Featured/Disk, and Artifact), then applied three machine unlearning methods to reduce the influence of approximately 10–12% intentionally mislabeled samples: Gradient Ascent Unlearning, Fisher Forgetting, and Full Retraining. Our classifier achieved 85.58% validation accuracy, 79.34% balanced accuracy, and a macro-F1 score of 75.15%. Among the unlearning methods, Gradient Ascent was fastest (11.01% forget-set accuracy in 103.6 seconds), while Full Retraining gave the best retention (99.64% at 2649.4 seconds). Our experiments also reveal an important implementation constraint with significant practical consequences: Fisher Forgetting can collapse when its numerical stabilizer is too small, because parameters with low Fisher importance then receive very large noise perturbations instead of being selectively forgotten.
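The stabilizer failure mode in Fisher Forgetting can be illustrated with a minimal sketch. This is not the paper's implementation; it is a simplified NumPy version of the common scheme in which each parameter is perturbed by Gaussian noise scaled inversely to its estimated Fisher importance, with a stabilizer added before the division (the function name and constants below are illustrative assumptions):

```python
import numpy as np

def fisher_forget(params, fisher, alpha=0.1, stabilizer=1e-8, rng=None):
    """Perturb parameters with noise scaled inversely to Fisher importance.

    Low-importance parameters receive larger noise (they are "forgotten"
    more aggressively). If `stabilizer` is too small, any parameter with
    near-zero Fisher importance gets an enormous perturbation, which is
    the collapse described in the abstract.
    """
    rng = rng or np.random.default_rng(0)
    noise_scale = alpha / np.sqrt(fisher + stabilizer)
    return params + rng.normal(size=params.shape) * noise_scale

# Toy case: one high-importance weight, one near-zero-importance weight.
params = np.array([1.0, 1.0])
fisher = np.array([10.0, 1e-12])

small_stab = fisher_forget(params, fisher, stabilizer=1e-12)  # collapses
large_stab = fisher_forget(params, fisher, stabilizer=1e-3)   # stays bounded
```

With the tiny stabilizer, the second weight's noise scale is on the order of 10^4, so that parameter is destroyed rather than selectively forgotten; raising the stabilizer caps the scale at roughly `alpha / sqrt(stabilizer)`.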
