Pancreatic Tumor Detection in Computed Tomography Images Using Rotary Positional Siamese Vision Transformer
Abstract
Vision Transformer (ViT)-based techniques are gaining traction in medical and cancer imaging, including pancreatic cancer applications. Recently, several studies have applied ViT-based Deep Learning (DL) techniques to computed tomography (CT) images for pancreatic cancer diagnosis. However, existing methods frequently suffer from high computational complexity and, in particular, from false-negative diagnoses of malignant lesions that require clinical attention. This paper proposes an intelligent pancreatic tumor detection method, the Rotary Positional Siamese Vision Transformer (RPSViT), with the objective of accurately detecting and classifying patients with pancreatic tumors and localizing abnormalities within CT images. RPSViT adopts a patch-based approach in which input images are split into fixed-size patches, and each patch is treated as a token via Linear Patch Embedding. Rotary Positional Embeddings are then added to help the model capture spatial relationships within the sample image, thereby improving the accuracy of tumor localization in complex anatomical regions. The Siamese Transformer Encoder efficiently extracts high-level feature vectors from the input samples and performs the final classification for disease detection. We trained RPSViT on the Pancreatic-CT-CBCT-SEG dataset and observed significant performance improvements over conventional methods. RPSViT improved accuracy by 11% and reduced the misclassification rate by 48%, showing its robustness in detecting pancreatic tumors. These promising results demonstrate that RPSViT can play an important role in paving the way for accurate diagnosis, while also highlighting the potential of vision transformer-based architectures in medical imaging.
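The front end of the pipeline described in the abstract (patch splitting, linear patch embedding, rotary positional embedding) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the toy 4x4 image, the hypothetical `weights` matrix, and the `base` constant of 10000 are all assumptions made for illustration, and the rotation shown is the common one-dimensional rotary form, which may differ from the variant used in RPSViT. The Siamese Transformer Encoder is not shown.

```python
import math


def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([image[r][c]
                            for r in range(top, top + patch_size)
                            for c in range(left, left + patch_size)])
    return patches


def linear_embed(patch, weights):
    """Project a flattened patch to a token; weights has shape (embed_dim, patch_len)."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


def apply_rotary(token, position, base=10000.0):
    """Rotate each consecutive feature pair by an angle that grows with position."""
    dim = len(token)
    if dim % 2:
        raise ValueError("token dimension must be even")
    rotated = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)  # lower frequency for later pairs
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = token[i], token[i + 1]
        rotated.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return rotated


# Toy usage: a 4x4 "image" split into four 2x2 patches, embedded into 4-dim tokens.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = split_into_patches(image, 2)
# Hypothetical fixed projection; in a trained model these weights are learned.
weights = [[0.1 * (i + j + 1) for j in range(4)] for i in range(4)]
tokens = [apply_rotary(linear_embed(p, weights), pos)
          for pos, p in enumerate(patches)]
```

Because each feature pair is rotated rather than shifted, the dot product between two rotated tokens depends only on the difference of their positions. That property is what allows attention scores to reflect relative spatial layout between patches.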