Video-to-Video Retrieval Using a Multidimensional (3D) CNN and Hash method

Abstract

With advances in imaging equipment, a large amount of data is produced every day in the form of films and videos and uploaded to the Internet. Extracting the desired video frame from a large database has become a major challenge, and many specialists have proposed solutions, each with its own strengths and weaknesses. Some methods increase the accuracy of feature extraction but produce a large amount of information, which reduces their efficiency. Other earlier methods increase retrieval speed by reducing the volume of extracted features, ignoring some of them; this loses important information and, as a result, lowers the accuracy of image matching. A content-based video retrieval system consists of three basic steps: key frame extraction, important feature extraction, and similarity comparison. Hashing is one of the methods used for information retrieval and has mostly been applied to image retrieval. In this paper, we propose a new hashing-based framework for the video retrieval problem that uses a multidimensional (3D) CNN to obtain spatial and temporal video features. We use CNNs pre-trained on ImageNet, a large visual database designed for use in visual object recognition software research, which have achieved good results. Other important advantages of these network models are large savings in workload and time and fewer problems caused by insufficient training data. In this regard, we have also chosen the method of [35] to extract features from a trained 3D CNN model. The features extracted from each key frame are transferred to a binary space by the pre-trained network using a hashing function to obtain compressed binary video codes.
Compared with existing methods, the experimental results show that the proposed method achieves a certain improvement in retrieval accuracy on the commonly used video datasets UCF-101 and THUMOS'14, which verifies the feasibility of the method.
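
The hashing and similarity-comparison steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the hashing function, so the code below assumes a simple stand-in: the sign of a fixed random projection. The 3D CNN features are replaced by random vectors, and every name here (`make_projection`, `hash_features`, `hamming`, `retrieve`) is hypothetical, chosen only for illustration. It is not the paper's implementation.

```python
import random


def make_projection(dim, n_bits, seed=0):
    """Assumed stand-in for a learned hash layer: a fixed random projection."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_features(features, projection):
    """Map a real-valued feature vector to a binary code packed into an int.

    Each bit is 1 when the corresponding projection of the features is positive.
    """
    code = 0
    for row in projection:
        activation = sum(w * x for w, x in zip(row, features))
        code = (code << 1) | (1 if activation > 0 else 0)
    return code


def hamming(a, b):
    """Number of bit positions in which two packed codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, top_k=3):
    """Return the names of the top_k database entries closest in Hamming distance."""
    ranked = sorted(database.items(), key=lambda item: hamming(query_code, item[1]))
    return [name for name, _ in ranked[:top_k]]


# Toy usage: random vectors stand in for features from a pre-trained 3D CNN.
rng = random.Random(1)
dim, n_bits = 16, 32
projection = make_projection(dim, n_bits)
videos = {f"video_{i}": [rng.gauss(0.0, 1.0) for _ in range(dim)] for i in range(5)}
database = {name: hash_features(feats, projection) for name, feats in videos.items()}

# The query is a slightly perturbed copy of one stored feature vector.
query = [x + 0.01 * rng.gauss(0.0, 1.0) for x in videos["video_2"]]
results = retrieve(hash_features(query, projection), database)
print(results)
```

Comparing packed binary codes with XOR and a bit count is what makes hashing attractive for large databases: the comparison cost depends on the code length rather than on the dimensionality of the original features.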
