Biomimetic Transfer Learning-based Complex Gastrointestinal Polyp Classification
Abstract
Background: This research investigates the application of Artificial Intelligence (AI), particularly biomimetic convolutional neural networks (CNNs), for the automatic classification of gastrointestinal (GI) polyps in endoscopic images. The study combines AI and transfer learning techniques to support early detection of colorectal cancer by enhancing diagnostic accuracy with pre-trained models.

Methods: The Kvasir dataset, comprising 4,000 annotated endoscopic images across eight polyp categories, was used. Images were pre-processed via normalisation, resizing, and data augmentation. Several CNN architectures, including optimised state-of-the-art models such as ResNet50, DenseNet121, and MobileNetV2, were trained and evaluated. Models were assessed through training, validation, and testing phases using performance metrics such as overall accuracy, confusion matrix, precision, recall, and F1 score.

Results: ResNet50 achieved the highest validation accuracy at 90%, followed closely by DenseNet121 with 87.5% and MobileNetV2 with 86.5%. The models demonstrated good generalisation, with small differences between training and validation accuracy. Average inference time was under 0.5 seconds on a computer with limited resources, confirming real-time applicability. Confusion matrix analysis indicates that errors most frequently occurred between visually similar classes, which are also difficult for less-experienced physicians to distinguish. These errors underscore the difficulty of distinguishing subtle features in gastrointestinal imagery and highlight the value of model-assisted diagnostics in supporting clinical decision-making.

Conclusions: The obtained results confirm that deep learning-based CNN architectures, combined with transfer learning and optimisation techniques, can accurately classify endoscopic images and support medical diagnostics. Recommended solutions to address classification challenges include advanced data augmentation strategies, such as rotation, flipping, contrast adjustment, and scaling, to artificially increase dataset diversity and improve model generalisation. Additionally, explainability techniques were applied, most notably Gradient-weighted Class Activation Mapping (Grad-CAM), which generates visual heatmaps highlighting the regions within an image that influenced the model's prediction. These methods help identify potential sources of error and improve transparency, making CNN-based decision systems more interpretable for clinicians.
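To make the pre-processing step concrete, the following is a minimal sketch of the kind of resizing, normalisation, and augmentation pipeline described above, written with torchvision. The image size, rotation range, and contrast setting are illustrative assumptions rather than the study's reported values.

```python
from torchvision import transforms

IMG_SIZE = 224  # assumed input size for ImageNet-pretrained backbones

# Training pipeline: scaling, flipping, rotation, and contrast adjustment
# artificially increase dataset diversity, as recommended in the conclusions.
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(IMG_SIZE, scale=(0.8, 1.0)),  # scaling
    transforms.RandomHorizontalFlip(),                         # flipping
    transforms.RandomRotation(degrees=15),                     # rotation
    transforms.ColorJitter(contrast=0.2),                      # contrast adjustment
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],           # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

# Validation/test pipeline: deterministic resizing and normalisation only.
eval_transforms = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```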
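A transfer-learning setup in the spirit of the one described can be sketched as follows: an ImageNet-pretrained ResNet50 whose final fully connected layer is replaced by an eight-class head for the Kvasir categories. The choice to freeze the backbone, the optimiser, and the learning rate are assumptions; the abstract does not specify the study's exact fine-tuning configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 8  # the eight Kvasir image categories

# Load an ImageNet-pretrained ResNet50 backbone.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Freeze the convolutional layers and train only the new classification head
# (one common transfer-learning strategy; full fine-tuning is the alternative).
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
```

The same pattern applies to the other backbones mentioned above by swapping the final layer: `model.classifier` for DenseNet121 and `model.classifier[1]` for MobileNetV2 in torchvision.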
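The evaluation metrics mentioned above (overall accuracy, per-class precision, recall, F1 score, and the confusion matrix) can be computed from test-set predictions with scikit-learn; the label arrays below are placeholders, not the study's data.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Placeholder ground-truth labels and model predictions over the eight classes.
y_true = [0, 1, 2, 2, 3, 5, 7, 7]
y_pred = [0, 1, 2, 3, 3, 5, 7, 5]

print("Overall accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, digits=3))  # precision, recall, F1 per class
print(confusion_matrix(y_true, y_pred))                 # rows: true class, columns: predicted class
```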
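As an illustration of how Grad-CAM heatmaps can be produced for such a model, the sketch below hooks the last convolutional block (`layer4`) of the ResNet50 defined earlier, backpropagates the predicted class score, and combines the pooled gradients with the stored activations into a normalised heatmap. The input tensor `x` is a random placeholder for a pre-processed endoscopic image, and this hook-based implementation is a generic minimal version, not the study's own code.

```python
import torch
import torch.nn.functional as F

# Stand-in for one pre-processed endoscopic image (batch of 1, 3 x 224 x 224).
x = torch.randn(1, 3, 224, 224)
x.requires_grad_(True)  # ensures gradients flow back through the frozen backbone

activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["value"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["value"] = grad_output[0].detach()

# Hook the last convolutional block of ResNet50.
h1 = model.layer4.register_forward_hook(save_activation)
h2 = model.layer4.register_full_backward_hook(save_gradient)

model.eval()
logits = model(x)
class_idx = logits.argmax(dim=1).item()
logits[0, class_idx].backward()  # gradient of the predicted class score

# Channel weights = global average of the gradients; heatmap = weighted,
# ReLU-ed sum of the activation maps, upsampled to the input resolution.
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalise to [0, 1]

h1.remove()
h2.remove()
```

Overlaying `cam` on the original image highlights which regions drove the prediction, which is how such heatmaps support clinician review of the model's decisions.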