Leveraging 3DCNN and Weighted Similarity Metrics for Enhanced Content-Based Video Retrieval
Abstract
With the widespread adoption of high-speed networks such as 4G and 5G, along with the explosive growth of social media platforms, video content is now frequently captured and shared online without accompanying metadata such as tags or descriptions. This absence of textual annotations presents a significant challenge for indexing and retrieving relevant video content. Content-Based Video Retrieval systems address this issue by analyzing the visual content of videos rather than relying on external metadata. However, only limited efforts in the literature have jointly explored both the spatial and temporal context of video data for retrieval. To address this gap, we propose a Content-Based Video Retrieval framework that leverages a 3D Convolutional Neural Network, specifically the R(2+1)D architecture enhanced with transfer learning. This model decomposes spatiotemporal convolutions to more effectively capture both spatial and temporal video features. In addition, we introduce a novel classification-similarity-based weighted distance approach, which overcomes the limitations of traditional distance-based and classifier-based retrieval methods. Experimental evaluation on the UCF101 dataset demonstrates that the proposed system achieves a significant improvement in retrieval performance, with over a 20% increase in AUC compared to baseline techniques.
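To make the decomposition of spatiotemporal convolutions concrete, the following minimal sketch counts the weights of a full 3D convolution and of its (2+1)D factorization into a spatial 1 x d x d convolution followed by a temporal t x 1 x 1 convolution. The intermediate channel count follows the parameter-matching rule described for the original R(2+1)D design; the function names and the example layer sizes are hypothetical and chosen only for illustration, not taken from this work's implementation.

```python
def conv3d_params(n_in, n_out, t, d):
    """Weights in a full t x d x d 3D convolution (bias terms ignored)."""
    return n_in * n_out * t * d * d


def matched_mid_channels(n_in, n_out, t, d):
    """Intermediate channel count M that keeps the (2+1)D block at or
    below the parameter count of the 3D convolution it replaces."""
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def conv2plus1d_params(n_in, n_out, t, d, mid=None):
    """Weights in the factorized block: spatial conv, then temporal conv."""
    if mid is None:
        mid = matched_mid_channels(n_in, n_out, t, d)
    spatial = n_in * mid * d * d   # 1 x d x d convolution, n_in -> mid channels
    temporal = mid * n_out * t     # t x 1 x 1 convolution, mid -> n_out channels
    return spatial + temporal


if __name__ == "__main__":
    # Illustrative layer: 64 -> 64 channels with a 3 x 3 x 3 kernel (assumed sizes).
    print(conv3d_params(64, 64, 3, 3))
    print(matched_mid_channels(64, 64, 3, 3))
    print(conv2plus1d_params(64, 64, 3, 3))
```

With matched parameter budgets, the factorized block differs from the 3D convolution mainly in that a nonlinearity can sit between the spatial and temporal steps, which is one reason the factorization is commonly credited with easier optimization.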