Meta-Ensemble Learning for IMDb Ratings: A Stacked Hybrid Model Integrating Gradient Boosting and Deep Neural Networks
Abstract
Accurate prediction of IMDb movie ratings matters to film-industry stakeholders because it informs investment, marketing, and content-recommendation decisions. Standard machine learning techniques struggle with high-cardinality categorical columns, poorly captured feature interactions, and non-linear relations between reviews and metadata. This paper proposes the Meta-Ensemble Predictor (MEP), a new hybrid framework that integrates several gradient boosters (CatBoost, LightGBM, XGBoost) with one another and with a deep neural network, with the ensemble tuned by a meta-learning algorithm that uses a Random Forest as the final predictor. Using a dataset of 33,600 movies released from 1960 to 2024, with metadata such as genre, director, cast, runtime, box office, and voter ratings, MEP applies TF-IDF text vectorization, polynomial feature interactions, and dimensionality reduction to improve feature representation. In comparison with other state-of-the-art models, the proposed model achieves an RMSE of 0.4389 and an accuracy of 96.96%, where a prediction counts as correct if it falls within 1 point of the actual IMDb rating. The study demonstrates that ensemble learning can capture the complex patterns in movie ratings, making MEP a useful predictive analytics tool for the film industry.