Real Time Detection of Deepfakes Using the Efficient Swin Attention Network with Global and Local Facial Features
Abstract
The rapid advance of deepfake technology poses significant challenges to ensuring authenticity and combating misinformation. Although recent deepfake detection approaches have achieved notable progress, many methods that perform robustly across diverse datasets fail to strike a balance between accuracy and real-time efficiency. This research proposes the Efficient-Swin Attention Network (ESANet), a novel framework that leverages local and global facial features for enhanced real-time deepfake detection. Our framework integrates EfficientNet-B0 for lightweight local feature extraction with the Swin Transformer to capture hierarchical global relationships; an efficient feature fusion mechanism combines the strengths of both models into a comprehensive feature representation. We evaluate ESANet on three benchmark datasets: FaceForensics++, CelebV1, and CelebV2. The experimental results demonstrate that ESANet achieves accuracies of 96.5%, 95.3%, and 94.8% on the FaceForensics++, CelebV1, and CelebV2 datasets, respectively, while maintaining inference times low enough for real-time use. Furthermore, cross-dataset tests demonstrate the robustness and generalizability of the proposed scheme, which effectively addresses the challenges of real-time deepfake detection.
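To make the dual-branch design described above concrete, the sketch below shows one plausible way to pair an EfficientNet-B0 branch (local features) with a Swin Transformer branch (global features) and fuse their pooled outputs through a small classification head. It is a minimal illustration built on standard torchvision backbones; the class name DualBackboneDetector, the choice of the Swin-Tiny variant, the fusion-by-concatenation design, and all layer sizes are assumptions made for illustration and are not taken from the paper.

# Minimal PyTorch sketch of a dual-backbone detector in the spirit of ESANet.
# The exact fusion mechanism, feature dimensions, and training setup used by
# ESANet are not specified here; this is an illustrative assumption, not the
# authors' implementation.
import torch
import torch.nn as nn
from torchvision import models


class DualBackboneDetector(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Local-feature branch: EfficientNet-B0 (1280-dim pooled features).
        # weights=None keeps the sketch self-contained; pretrained weights
        # could be loaded instead.
        self.local = models.efficientnet_b0(weights=None)
        self.local.classifier = nn.Identity()
        # Global-feature branch: Swin-Tiny transformer (768-dim pooled features).
        self.glob = models.swin_t(weights=None)
        self.glob.head = nn.Identity()
        # Fusion by concatenation followed by a small classification head
        # (an assumption; the paper describes its own fusion mechanism).
        self.fusion = nn.Sequential(
            nn.Linear(1280 + 768, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.2),
            nn.Linear(512, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f_local = self.local(x)    # (B, 1280) local texture cues
        f_global = self.glob(x)    # (B, 768) hierarchical global context
        return self.fusion(torch.cat([f_local, f_global], dim=1))


if __name__ == "__main__":
    model = DualBackboneDetector().eval()
    with torch.no_grad():
        logits = model(torch.randn(1, 3, 224, 224))  # one aligned face crop
    print(logits.shape)  # torch.Size([1, 2]) -> real vs. fake scores

Concatenating the two pooled feature vectors keeps the fusion step lightweight, which is consistent with the abstract's emphasis on real-time inference, but the paper's actual fusion mechanism may be more elaborate.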