Leveraging Pretrained Vision Transformers for Automated Cancer Diagnosis in Optical Coherence Tomography Images
Abstract
This study presents a novel approach to brain cancer detection based on Optical Coherence Tomography (OCT) images and advanced machine learning techniques. The research addresses the critical need for accurate, real-time differentiation between cancerous and noncancerous brain tissue during neurosurgical procedures. The proposed method combines a pre-trained Vision Transformer (ViT) model, specifically DINOv2, with a convolutional neural network (CNN) operating on Grey Level Co-occurrence Matrix (GLCM) texture features. This dual-path architecture leverages both the global context capture of transformers and the local texture analysis strengths of the GLCM + CNN combination. The dataset comprised OCT images from 11 patients, with 5,831 B-frame slices used for training and validation and 1,610 slices for testing. The model achieved high accuracy in distinguishing cancerous from noncancerous tissue: 99.7% ± 0.1% on the training dataset, 99.4% ± 0.1% on the validation dataset, and 94.9% on the test dataset. This approach demonstrates significant potential for improving intraoperative decision-making in brain cancer surgery, offering real-time, high-accuracy tissue classification and surgical guidance.
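To make the texture path concrete, the following is a minimal, self-contained sketch of GLCM feature extraction as used in classical texture analysis. The abstract does not specify the offsets, grey-level quantisation, or which Haralick features feed the CNN, so the offset (dx=1, dy=0), the 4-level quantisation, and the choice of contrast and homogeneity here are illustrative assumptions, not the authors' exact pipeline.

```python
# Hypothetical GLCM sketch: count co-occurrences of grey levels at a
# fixed pixel offset, then derive two classic Haralick texture features
# (contrast, homogeneity) of the kind a small CNN path could consume.

def glcm(image, dx=1, dy=0, levels=4):
    """Count co-occurrences of grey levels at offset (dx, dy)."""
    h, w = len(image), len(image[0])
    m = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                m[image[y][x]][image[ny][nx]] += 1
    return m

def glcm_features(m):
    """Normalise the co-occurrence matrix and return (contrast, homogeneity)."""
    total = sum(sum(row) for row in m)
    contrast = homogeneity = 0.0
    for i, row in enumerate(m):
        for j, count in enumerate(row):
            p = count / total
            contrast += (i - j) ** 2 * p          # penalises large grey-level jumps
            homogeneity += p / (1 + abs(i - j))   # rewards near-diagonal mass
    return contrast, homogeneity

# Tiny 4-level test patch: smooth blocks give low contrast, high homogeneity.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
contrast, homogeneity = glcm_features(glcm(patch))
```

In a full dual-path model, features like these (or the raw GLCM) would be fed to the CNN branch, while the DINOv2 ViT branch processes the OCT B-frame directly, with the two representations fused before the final cancerous/noncancerous classifier.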