Leveraging 3DCNN and Weighted Similarity Metrics for Enhanced Content-Based Video Retrieval


Abstract

With the widespread adoption of high-speed networks such as 4G and 5G, along with the explosive growth of social media platforms, video content is now frequently captured and shared online without accompanying metadata such as tags or descriptions. This absence of textual annotations presents a significant challenge for indexing and retrieving relevant video content. Content-Based Video Retrieval systems address this issue by analyzing the visual content of videos rather than relying on external metadata. However, only limited efforts in the literature have jointly explored both the spatial and temporal context of video data for retrieval. To address this gap, we propose a Content-Based Video Retrieval framework that leverages a 3D Convolutional Neural Network, specifically the R(2+1)D architecture enhanced with transfer learning. This model decomposes spatiotemporal convolutions to more effectively capture both spatial and temporal video features. In addition, we introduce a novel classification-similarity-based weighted distance approach, which overcomes the limitations of traditional distance-based and classifier-based retrieval methods. Experimental evaluation on the UCF101 dataset demonstrates that the proposed system achieves a significant improvement in retrieval performance, with over a 20% increase in AUC compared to baseline techniques.
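
The abstract does not give the formula for the weighted distance, so the following is only a minimal sketch of one way a feature distance and a classification-similarity term could be combined for ranking. The function names, the use of cosine distance for both terms, and the mixing weight `alpha` are all assumptions made for illustration and are not taken from the article. Feature vectors and class-probability vectors are assumed to have been produced already by a network such as R(2+1)D.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def weighted_distance(q_feat, q_prob, v_feat, v_prob, alpha=0.5):
    """Hypothetical combination (an assumption, not the article's formula):
    alpha weights the distance between embeddings, and (1 - alpha) weights
    the distance between predicted class-probability vectors."""
    feat_dist = cosine_distance(q_feat, v_feat)
    cls_dist = cosine_distance(q_prob, v_prob)
    return alpha * feat_dist + (1.0 - alpha) * cls_dist


def rank_videos(q_feat, q_prob, database, alpha=0.5):
    """Rank database entries (video_id, features, class_probs) by
    ascending weighted distance to the query."""
    scored = [
        (weighted_distance(q_feat, q_prob, feat, prob, alpha), video_id)
        for video_id, feat, prob in database
    ]
    scored.sort()
    return [video_id for _, video_id in scored]
```

In this sketch, `alpha = 1` would reduce to purely distance-based retrieval and `alpha = 0` to retrieval driven only by classifier output; intermediate values blend the two, which is the general idea the abstract describes.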
