Classification of deepfake images with RANSAC for feature extraction and a hybrid model of YOLOv5 and ResNet-50
Abstract
Deepfake technology is used to create fabricated visual and audio content derived from an individual's existing media. It can replace a person's face and voice with fabricated material to produce a lifelike result, a practice that is unethical and harmful to society. Deepfakes are now widely used in cybercrime, including identity theft, cyber extortion, financial fraud, and the creation of fake content for blackmail. This study uses Random Sample Consensus (RANSAC) for efficient extraction of keypoint features from both original and fake frames. These features aid the training of deep-learning models: You Only Look Once (YOLOv5), ResNet-50, and a novel hybrid model that integrates the strongest features of YOLOv5 and ResNet-50. The models are applied to the Celeb-DF dataset, which consists of original and fake frames of many celebrities. The proposed hybrid YOLOv5 + ResNet-50 model substantially outperformed the individual models, attaining an accuracy of 99.58%, whereas the YOLOv5 and ResNet-50 models achieved 90.9% and 88.91%, respectively. These findings support the suitability of the proposed approach for practical applications such as digital media verification, social media monitoring, and identity protection, where accurate and trustworthy detection of manipulated media is essential.
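To make the RANSAC step more concrete, the following is a minimal sketch of the consensus idea applied to matched keypoints. It is not the authors' pipeline: the abstract does not state which geometric model RANSAC fits or how keypoints are detected, so the pure 2D translation model, the function name `ransac_translation`, the threshold, the iteration count, and the toy data are all assumptions chosen for illustration.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=200, seed=0):
    """Return (shift, inlier_indices) for matched keypoint pairs.

    matches: list of ((x1, y1), (x2, y2)) keypoint pairs.
    Model: a pure 2D translation (dx, dy). This model is an assumption
    made for illustration; the abstract does not specify one.
    """
    if not matches:
        raise ValueError("matches must be non-empty")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches whose residual is within the threshold.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if ((ax + dx - bx) ** 2 + (ay + dy - by) ** 2) ** 0.5 <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier displacements.
    n = len(best_inliers)
    dx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    dy = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (dx, dy), best_inliers


# Toy data (hypothetical): eight consistent matches and two outliers.
good = [((float(i), float(2 * i)), (i + 5.0, 2 * i - 3.0)) for i in range(8)]
bad = [((0.0, 0.0), (40.0, 40.0)), ((1.0, 1.0), (-30.0, 12.0))]
shift, inliers = ransac_translation(good + bad)
print(shift, inliers)
```

In this sketch, the surviving inlier keypoints are the ones that would be passed on as features. In practice a richer model such as a homography or affine transform would likely be fitted, using keypoints from a real detector.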