A Hybrid Movie Recommendation System Using BERT-Based Semantic Embeddings and SVD Collaborative Filtering
Abstract
Recommender systems on modern digital platforms help users find relevant content, a task that becomes harder as the volume of available information grows. Traditional collaborative filtering learns from past user ratings, whereas content-based methods rely on item information such as titles and genres. Both approaches are widely used, and both have clear drawbacks: collaborative filtering degrades when rating data are sparse or when new users or items appear, and content-based systems often fail to capture the meaning of text. This paper proposes a hybrid movie recommendation system that combines the strengths of both methods. A pre-trained DistilBERT model, which extracts contextual meaning from text, converts movie titles and genres into semantic vectors. In parallel, Singular Value Decomposition (SVD) models user preferences from historical rating data. The outputs of the two models are combined through weighted linear fusion to produce the final recommendation score. Experiments were conducted on the MovieLens dataset using standard Top-N evaluation metrics. The results show that the hybrid model consistently outperforms the standalone BERT-based and SVD-based systems. At the best-performing fusion weight, the model achieves an NDCG@10 of approximately 0.91, indicating robust ranking performance. These results suggest that combining semantic content understanding with collaborative signals yields more precise, reliable, and personalized movie recommendations, particularly in cold-start and sparse-data scenarios. The study thus presents a lightweight hybrid recommendation framework that balances semantic understanding and collaborative learning.
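The weighted linear fusion and the NDCG@10 metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so several choices here are assumptions: the fusion weight `alpha` is applied to the collaborative (SVD) score and `1 - alpha` to the content (DistilBERT similarity) score, both score vectors are min-max normalized before mixing, and NDCG uses linear gain with a log2 discount. The function names `min_max`, `fuse`, and `ndcg_at_k` and the toy numbers are hypothetical and are not results from the paper.

```python
import numpy as np


def min_max(x):
    """Rescale a score vector to [0, 1]; a constant vector maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return (x - x.min()) / span


def fuse(cf_scores, content_scores, alpha):
    """Weighted linear fusion of two per-item score vectors.

    Assumption: alpha weights the collaborative score and (1 - alpha)
    weights the content score; the paper may define it the other way.
    """
    return alpha * min_max(cf_scores) + (1.0 - alpha) * min_max(content_scores)


def ndcg_at_k(relevance_in_ranked_order, k=10):
    """NDCG@k with linear gain: DCG = sum(rel_i / log2(i + 1)), i from 1."""
    rel = np.asarray(relevance_in_ranked_order, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = np.sum(rel[:k] * discounts[:k])
    ideal = np.sort(rel)[::-1]  # best possible ordering of the same items
    idcg = np.sum(ideal[:k] * discounts[:k])
    return 0.0 if idcg == 0 else float(dcg / idcg)


# Toy, made-up scores for four candidate movies for one user.
cf_scores = np.array([4.5, 3.0, 2.0, 4.0])       # e.g. SVD-predicted ratings
content_scores = np.array([0.2, 0.9, 0.1, 0.8])  # e.g. embedding cosine similarity
relevance = np.array([1, 0, 0, 1])               # held-out ground truth

fused = fuse(cf_scores, content_scores, alpha=0.5)
ranking = np.argsort(-fused)                      # item indices, best first
print("ranking:", ranking)
print("NDCG@3:", ndcg_at_k(relevance[ranking], k=3))
```

Sweeping `alpha` over a grid such as `np.linspace(0, 1, 11)` and keeping the value with the highest validation NDCG is one plausible way to select a fusion weight; the abstract does not state how its weight was chosen.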