Video-to-Video Retrieval Using a Multidimensional (3D) CNN and Hash method

Aboulfazl Gharahsouflou
Vafa Maihami
Keyhan Khamforoosh


Abstract

Every day, as imaging equipment advances, a large amount of film and video data is produced and uploaded to the Internet. Extracting the desired video from a large database has become a major challenge, and many researchers have proposed solutions, each with its own strengths and weaknesses. Some methods increase the accuracy of feature extraction but produce a large volume of information, which reduces efficiency. Other methods increase retrieval speed by ignoring some features to reduce the volume of extracted information; this loses important information and, as a result, lowers matching accuracy. A content-based video retrieval system consists of three basic steps: key frame extraction, important feature extraction, and similarity comparison. Hashing is one of the methods used for information retrieval, mostly for image retrieval. In this paper, we propose a new framework that uses hashing to solve the video retrieval problem and a multidimensional (3D) CNN to obtain spatial and temporal video features. We use CNNs pre-trained on ImageNet, a large visual database designed for visual object recognition research, which have achieved good results. Other important advantages of these network models are large savings in workload and time and fewer problems caused by insufficient training data. In this regard, we have also chosen the method of [35] to extract features from a trained 3D CNN model. The features that the pre-trained network extracts from each key frame are mapped to a binary space by a hashing function to obtain compact binary video codes.
Compared with existing methods, the experimental results show that the proposed method achieves some improvement in retrieval accuracy on the commonly used video datasets UCF-101 and THUMOS'14, which verifies its feasibility.
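
The hashing and similarity-comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the hashing function, so the sketch assumes a sign-of-random-projection hash as a stand-in, and the feature vectors are hypothetical placeholders for the features a 3D CNN would produce. The names `make_projection`, `hash_features`, `hamming`, and `retrieve` are illustrative only.

```python
import random


def make_projection(dim, n_bits, seed=0):
    """Build n_bits random Gaussian hyperplanes of dimension dim (assumed hash, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_features(features, projection):
    """Map a real-valued feature vector to an integer whose bits are the signs of its projections."""
    code = 0
    for row in projection:
        dot = sum(w * x for w, x in zip(row, features))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, top_k=3):
    """Return the ids of the top_k database entries closest to the query in Hamming distance."""
    ranked = sorted(database.items(), key=lambda item: hamming(query_code, item[1]))
    return [video_id for video_id, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical 4-dimensional features standing in for 3D CNN outputs.
    projection = make_projection(dim=4, n_bits=16)
    features = {
        "video_a": [0.9, 0.1, -0.3, 0.5],
        "video_b": [-0.8, 0.4, 0.2, -0.6],
        "video_c": [0.85, 0.15, -0.25, 0.45],
    }
    database = {vid: hash_features(f, projection) for vid, f in features.items()}
    query = hash_features([0.9, 0.1, -0.3, 0.5], projection)
    print(retrieve(query, database, top_k=2))
```

Comparing compact binary codes with Hamming distance is what makes hashing attractive for large databases: each comparison reduces to an XOR and a bit count, rather than a distance computation over full-length real-valued features.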

Version published to 10.21203/rs.3.rs-7085737/v1 on Research Square
Jul 15, 2025

