Leveraging Convolutional and Transformer Synergy for Robust Camera Source Attribution
Abstract
The majority of contemporary communication relies on electronic devices, with digital images and videos serving as the primary media of modern communication. Identifying the source camera model is crucial in digital forensics, with applications in authenticity verification and copyright protection. Deep learning-based techniques for verifying the origin of digital media have grown in prominence over the past decade. Motivated by the need to reduce model complexity while improving performance on this task, we analyze the spectral content of video frames to provide an effective solution for video source identification. In this work, we propose a scalable approach for device identification that extracts spectrum images from video frames using the Fast Fourier Transform (FFT) and feeds them into a hybrid Convolutional Neural Network (CNN)-Transformer model. The self-attention mechanism enables the model to learn both local and global representations. The proposed approach is compared with state-of-the-art source camera identification techniques, and the experimental findings demonstrate its efficacy in terms of accuracy and robustness, with the proposed model achieving an overall accuracy above 98%.
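To make the described pipeline concrete, the sketch below illustrates the two stages named in the abstract: converting a video frame into a log-magnitude FFT spectrum image and classifying it with a hybrid CNN-Transformer. This is a minimal illustrative assumption, not the authors' implementation; the layer sizes, the helper names (frame_to_spectrum, CNNTransformer), and the number of camera classes are hypothetical.

```python
# Minimal sketch (assumed, not the authors' code): FFT spectrum extraction from a
# video frame followed by a hybrid CNN + Transformer classifier in PyTorch.
# Layer widths and NUM_CLASSES are illustrative placeholders.

import numpy as np
import torch
import torch.nn as nn

NUM_CLASSES = 10  # hypothetical number of camera models

def frame_to_spectrum(frame_gray: np.ndarray) -> torch.Tensor:
    """Convert a grayscale frame (H, W) into a normalized log-magnitude FFT spectrum (1, H, W)."""
    fft = np.fft.fftshift(np.fft.fft2(frame_gray))
    spectrum = np.log1p(np.abs(fft)).astype(np.float32)
    spectrum = (spectrum - spectrum.mean()) / (spectrum.std() + 1e-8)
    return torch.from_numpy(spectrum).unsqueeze(0)

class CNNTransformer(nn.Module):
    """CNN backbone for local features, Transformer encoder for global self-attention."""
    def __init__(self, num_classes: int = NUM_CLASSES, embed_dim: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, embed_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.cnn(x)                        # (B, C, H', W') local features
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H'*W', C) patch tokens
        tokens = self.transformer(tokens)          # global self-attention over tokens
        return self.head(tokens.mean(dim=1))       # pooled logits over camera classes

# Usage: one 256x256 frame -> class logits over camera models.
frame = np.random.rand(256, 256)
logits = CNNTransformer()(frame_to_spectrum(frame).unsqueeze(0))
print(logits.shape)  # torch.Size([1, 10])
```

In this reading, the CNN captures local sensor and compression artifacts in the spectrum image, while the Transformer's self-attention aggregates them globally before classification, matching the abstract's claim of combined local and global representation learning.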
