A Comprehensive Framework for Multi-Aspect Analysis of Amazon Customer Reviews Using Machine and Deep Learning

Abstract

The rapid growth of e-commerce has transformed customer engagement, with online reviews emerging as a major driver of purchasing decisions. This paper presents a comprehensive framework for multi-aspect analysis of Amazon U.S. customer reviews, applying both machine learning (ML) and deep learning (DL) approaches to the review text. The framework addresses three core research questions: (1) predicting the star rating from review content, (2) categorizing reviews by product type, and (3) predicting review helpfulness. To ensure consistency, the dataset underwent systematic preprocessing consisting of tokenization, stopword removal, lemmatization, and lowercasing. Two text representation techniques, TF-IDF and BERT, were employed to capture term significance and contextual semantics, respectively. A wide range of ML models (Logistic Regression, Naive Bayes, Random Forest, SVM, XGBoost) and DL architectures (RNN, LSTM, GRU) were implemented and evaluated using F1-score for the imbalanced tasks (rating and category prediction) and accuracy for the balanced task (helpfulness prediction). Results show that BERT-based embeddings consistently improve model performance, particularly for sequential architectures such as RNN and GRU. Overall, the framework establishes an interpretable, scalable pipeline for extracting actionable insights from large-scale e-commerce text, contributing to data-driven decision-making and improved customer understanding on online retail platforms.
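The choice of F1-score for the imbalanced tasks can be illustrated with a minimal sketch. It assumes macro-averaged F1 (the abstract does not state which averaging was used) and a hypothetical toy label set in which five-star reviews dominate; the function names `accuracy` and `macro_f1` are illustrative and not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 8 five-star reviews and 2 one-star reviews.
y_true = [5] * 8 + [1] * 2
# A trivial classifier that always predicts the majority class.
y_pred = [5] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy data the majority-class predictor reaches an accuracy of 0.8 while its macro F1 is about 0.444, because the minority class contributes an F1 of zero. This is why accuracy alone can be misleading when star ratings or product categories are unevenly distributed.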
