Machine learning-based prediction of subclinical mastitis in large-scale dairy herds using a locally established somatic cell count threshold

Abstract

This study aims to establish a somatic cell count (SCC) threshold for identifying cows with subclinical mastitis (SCM) and to develop a machine learning model that predicts SCM from individual cow data, milk production data and milk composition data. Milk samples were collected from 2420 cows for the California Mastitis Test (CMT), SCC determination and milk composition analysis. Information on individual cows and their milk production was obtained from farm records. The diagnostic SCC threshold was identified on the basis of CMT scores using Youden’s index. Two sets of models were trained: one with all the variables (M1) and another with five selected variables (M2). The SCC threshold yielding the highest Youden’s index was 353,000 cells/mL. For Model 1 (M1), CatBoost achieved the highest accuracy (74.5%) and precision (0.716), while naïve Bayes attained the highest recall (0.780) and the decision tree produced the highest F1 score (0.662). CatBoost also recorded the highest AUC (0.801). Milk conductivity (mS/cm) consistently emerged as the most influential predictor across nearly all the algorithms. In Model 2 (M2), overall performance decreased, with logistic regression achieving the highest accuracy (67.7%) and AUC (0.686), the support vector machine demonstrating the highest precision (0.658), and naïve Bayes outperforming the other methods in terms of recall and F1 score.
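
As an illustration of the threshold-selection step, the sketch below shows one common way to pick a cut-off by maximising Youden’s index, J = sensitivity + specificity − 1, against a binary reference test. It is not the authors’ code: the function name `youden_threshold`, the rule that a cow is called positive when its SCC is at or above the candidate value, the binarisation of CMT scores into positive/negative, and the toy numbers are all assumptions made for the example.

```python
def youden_threshold(scc_values, cmt_positive):
    """Return (threshold, J) maximising Youden's J = sensitivity + specificity - 1.

    Assumption: a sample is classified positive when SCC >= threshold,
    and cmt_positive holds the binary reference label for each sample.
    """
    n_pos = sum(cmt_positive)
    n_neg = len(cmt_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both CMT-positive and CMT-negative samples")

    best_threshold, best_j = None, -1.0
    # Every observed SCC value is tried as a candidate cut-off.
    for t in sorted(set(scc_values)):
        tp = sum(1 for s, y in zip(scc_values, cmt_positive) if y and s >= t)
        tn = sum(1 for s, y in zip(scc_values, cmt_positive) if not y and s < t)
        j = tp / n_pos + tn / n_neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Made-up toy data (SCC in 1,000 cells/mL); these are not values from the study.
scc = [120, 180, 250, 300, 360, 400, 520, 900]
cmt = [False, False, False, True, False, True, True, True]
threshold, j = youden_threshold(scc, cmt)
print(threshold, j)
```

Breaking ties in favour of the lowest threshold is a design choice that leans towards sensitivity; a study could equally report the midpoint between adjacent candidates or prefer the higher cut-off.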
