Identification of the Recurrence of Differentiated Thyroid Cancer by Stacking Classifier

Abstract

This study compares the performance of several machine learning models for predicting the recurrence of well-differentiated thyroid cancer, using accuracy, sensitivity, precision, F1 score, specificity, the area under the ROC curve (AUC), and the Kappa statistic. The models ranked are Logistic Regression (LR), Naïve Bayes (NB), Support Vector Machine (SVM), Decision Trees (DT), Random Forest (RF), and the proposed Stacked model. The results suggest that ensemble learning methods, especially the proposed Stacked model, improve on individual classifiers across most of the measures. The Stacked model showed higher sensitivity, precision, F1 score, and AUC at the larger train-test splits (such as 80-20%) and with 30-fold cross-validation, where accuracy reached 100% and remained consistent. Random Forest also achieved good accuracy and ran faster on large data sets. Decision Trees achieved their best results with the 80-20 split and 30-fold cross-validation. Naïve Bayes, which was used as a baseline, scored lowest on all metrics, indicating that it is poorly suited to this data set. Among the ensemble models, the newly designed Stacked model gives the best prediction accuracy for thyroid cancer recurrence, while Random Forest is preferred for large-volume data sets. The results imply that both the use of ensemble classifiers and the choice of training data split contribute to building better models for complex classification problems.
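The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. The abstract does not state the meta-learner, the hyperparameters, or the data set layout, so the logistic-regression meta-learner, the default settings, the 5-fold internal stacking folds, and the synthetic data from `make_classification` are all assumptions made for illustration. Only the five base learners, the 80-20 split, and the list of metrics come from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data; the real thyroid data set is not used here.
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           weights=[0.7, 0.3], random_state=0)

# 80-20 train-test split, one of the splits named in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The five base learners named in the abstract, with default settings (assumed).
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("nb", GaussianNB()),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# ASSUMPTION: logistic regression as the meta-learner, trained on
# out-of-fold predictions from 5 internal folds.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_prob = stack.predict_proba(X_test)[:, 1]

# The metrics listed in the abstract; class 1 is treated as "recurrence".
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "sensitivity": recall_score(y_test, y_pred, pos_label=1),
    "specificity": recall_score(y_test, y_pred, pos_label=0),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
    "auc": roc_auc_score(y_test, y_prob),
    "kappa": cohen_kappa_score(y_test, y_pred),
}

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Replacing `stack` with any single base learner in the same script gives the per-classifier scores needed for a side-by-side comparison. A cross-validated version could use `cross_validate` from `sklearn.model_selection` in place of the single split.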
