Real-Time Deepfake Detection via Frame-Level EfficientNet Ensemble and Client-Server Deployment

Abstract

DeepFake technology threatens the integrity of digital content by enabling the generation of hyperrealistic but misleading media. In this paper, we propose a frame-based DeepFake detection framework designed to improve both detection accuracy and computational efficiency. Our approach draws on three large-scale datasets: Celeb-DF, FaceForensics++ (FF++), and DeeperForensics, comprising close to 17 GB of training data. Preprocessing includes frame extraction and face alignment with Dlib's 68-point facial landmark detector to preserve high-resolution facial features. Two CNN models, EfficientNet-B4 and EfficientNet-B5, are trained and ensembled to detect both coarse and fine-grained forgery artifacts. The ensemble achieves an AUC of 0.9958 and an F1-score of 0.9726 on benchmark sets, on par with state-of-the-art techniques. To ensure real-world deployability, the detection pipeline is integrated into a modular, cross-platform client-server architecture: a responsive web frontend and an Android app deliver real-time predictions together with Grad-CAM-based visualizations. Unlike prior methods, our end-to-end system couples video DeepFake detection with a real-time, cross-platform deployment pipeline, providing a scalable, interpretable, and accessible platform that bridges research and practical application.