Multi-Class Liver Disease Classification Using a Hybrid Deep Learning Framework Based on YOLO and CaiT Transformer Architectures
Abstract
Clinicians must distinguish liver disease patterns quickly, yet visual reading of scans is subjective and the classes often look alike. This work builds a two-part deep-learning tool: it first extracts local shape features with a YOLOv8m backbone, then lets a Class Attention Image Transformer (CaiT) attend to the whole image. The data came from Roboflow: 3,976 annotated liver images showing four structural lesions (steatosis, ballooning, inflammation, fibrosis), split 70%/20%/10%. The pipeline keeps YOLOv8m as the visual encoder, passes its feature tensor through an adapter so the Transformer can ingest it, sends the result to a CaiT-XS24-384 block that produces context vectors, and ends with a dense layer that outputs the class. Performance was measured with accuracy, balanced accuracy, precision, recall, F1-score, Cohen's kappa, and the confusion matrix. On the held-out test set, the hybrid model reached 95.00% accuracy, 95.03% balanced accuracy, 95.25% precision, 95.03% recall, 95.00% F1-score, and 93.33% Cohen's kappa. These numbers indicate high agreement between predicted and true labels and stable separation among the disease classes. In comparison with other methods, combining convolutional locality with Transformer-based class attention delivered solid performance while balancing semantic detail against classification reliability for liver disease images.
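The adapter step described above (turning a CNN feature tensor into tokens a Transformer can ingest) can be sketched in NumPy. The tensor shape, channel count, and embedding dimension below are illustrative assumptions, not values reported by the paper:

```python
import numpy as np

# Hypothetical backbone output: (batch, channels, height, width).
# C and the 12x12 grid are stand-ins for a YOLOv8m-like feature map.
B, C, H, W = 2, 576, 12, 12
D = 288  # assumed Transformer embedding dimension (illustrative)

feats = np.random.randn(B, C, H, W)

# Flatten the spatial grid into a token sequence: (B, H*W, C).
tokens = feats.reshape(B, C, H * W).transpose(0, 2, 1)

# Linear projection mapping the channel dim C to the embed dim D,
# i.e. a 1x1-convolution-style adapter.
W_proj = np.random.randn(C, D) / np.sqrt(C)
embedded = tokens @ W_proj  # (B, H*W, D), ready for class-attention layers

print(embedded.shape)  # (2, 144, 288)
```

Each of the 144 tokens corresponds to one spatial cell of the backbone's feature map, so the CaiT block can attend across the whole image while still seeing locally extracted features.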
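The evaluation metrics listed in the abstract can all be computed with scikit-learn. The snippet below is a minimal sketch on a toy four-class example (labels 0-3 standing in for steatosis, ballooning, inflammation, fibrosis); the label arrays are illustrative, not the paper's data:

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             precision_score, recall_score, f1_score,
                             cohen_kappa_score, confusion_matrix)

# Toy ground-truth and predicted labels; one class-2 image misread as class 3.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 1, 2, 3, 3, 3]

acc   = accuracy_score(y_true, y_pred)
bacc  = balanced_accuracy_score(y_true, y_pred)        # mean per-class recall
prec  = precision_score(y_true, y_pred, average="macro")
rec   = recall_score(y_true, y_pred, average="macro")
f1    = f1_score(y_true, y_pred, average="macro")
kappa = cohen_kappa_score(y_true, y_pred)              # chance-corrected agreement
cm    = confusion_matrix(y_true, y_pred)               # 4x4 count matrix

print(acc, bacc, kappa)  # 0.875 0.875 0.833...
```

Balanced accuracy averages recall per class, so it stays informative even if the four lesion classes are unevenly represented, and Cohen's kappa discounts agreement expected by chance.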