The livestock industry is a cornerstone of the global economy, contributing approximately 40% to agricultural GDP, according to the Food and Agriculture Organization of the United Nations (Plummer and Plummer, 2012). However, its success is hindered by infectious diseases caused by bacteria, viruses and fungi. Among these, bovine mastitis poses a significant challenge, severely affecting milk quality and production.
Waseem et al. (2020) emphasized that this disease is mainly caused by bacterial pathogens such as Staphylococcus aureus and Streptococcus agalactiae, which cause inflammation of the udder. These pathogens inhabit the udder and teat skin, eventually colonizing and growing within the teat canal. Early identification and elimination of mastitis during lactation can yield substantial economic benefits by mitigating its adverse effects.
Bovine mastitis is classified into clinical and subclinical forms according to whether visible signs of inflammation are present. Subclinical mastitis, while not visibly detectable, significantly affects the somatic cell count and the temperature of the udder’s skin surface (Vieira et al., 2021) and is more prevalent than clinical mastitis (Seddar-Yagoub et al., 2023). To identify mastitis, various detection methods have been developed, including the California Mastitis Test (CMT) kit (Bouamra et al., 2024; Dingwell et al., 2003) and the Somatic Cell Counter unit (Rychtarova et al., 2021), which enable effective monitoring and diagnosis based on observable symptoms and changes.
Research shows a strong correlation between CMT scores and somatic cell count (SCC), establishing the CMT kit as a reliable, cost-effective tool for detecting subclinical mastitis (SCM) (Cai et al., 2018; Ma et al., 2021; Zhou et al., 2022). Machine learning, a branch of artificial intelligence, further enhances early detection by analyzing large datasets, supporting advances in disease detection and classification that benefit farmers and scientists.
Mikail and Keskin (2013) proposed using a Support Vector Machine (SVM) for subclinical mastitis detection, achieving 50% training accuracy and 85% testing accuracy and outperforming logistic regression. However, false negatives posed significant risks, particularly with insufficient somatic cell count data. Similarly, Shaltout et al. (2014) demonstrated the effectiveness of Information Gain (IG) for feature selection, achieving 90% accuracy in influenza classification using a decision tree classifier.
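To make the feature-selection idea concrete, Information Gain scores a feature by how much knowing its value reduces the entropy of the class label. The following is a minimal sketch of that calculation for categorical features, not the implementation used by Shaltout et al.; the function names, the feature names (`cmt_positive`, `parity_high`) and the four toy records are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy records, invented for illustration only (not real data).
labels = ["mastitis", "mastitis", "healthy", "healthy"]
features = {
    "parity_high": [1, 0, 1, 0],   # split leaves the classes evenly mixed
    "cmt_positive": [1, 1, 0, 0],  # split separates the classes completely
}

# Rank features from most to least informative.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)
print(ranking)
```

In a feature-selection step, only the top-ranked features would be passed on to the classifier. Continuous measurements such as SCC would first need to be discretized, a choice this sketch leaves open.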
Ryan et al. (2021) demonstrated early mastitis detection through continuous cattle monitoring, leveraging changes in milk components such as fat, protein, lactose and somatic cell count (SCC), achieving 85% model accuracy.
Bobbo et al. (2021) compared machine learning models for mastitis detection using SCC and identified the Random Forest classifier as the most accurate when milk components were used for model construction.
Ma et al. (2021) proposed a non-invasive method to estimate cattle body temperature using machine learning techniques such as Linear Regression and SVM. The approach achieved 63.8% accuracy for detecting common illnesses but struggled to predict outcomes from historical data for individual cattle.
Grodkowski et al. (2022) identified key features for designing a mastitis prediction model, demonstrating that logistic regression outperforms artificial neural networks (ANN) when using features such as cattle movement, feed intake, resting period and rumination.
Rao et al. (2023) evaluated various machine learning models for cattle disease prediction using Kaggle datasets and discussed different techniques for disease detection.
Ankhita et al. (2020) compared the performance of KNN and SVM for mastitis detection, recommending SVM for disease detection applications due to its superior performance.
Wang et al. (2022) proposed a supervised deep learning approach for mastitis detection, achieving 99.9% accuracy and outperforming machine learning models run with default parameters.
Mohan et al. (2019) introduced an expert system for animal disease diagnosis using Convolutional Neural Networks (CNNs), achieving 98.8% accuracy with a diagnostic system consisting of a convolution layer and a pooling layer that takes RGB images as input.
Abdul Ghafoor and Sitkowska (2021) proposed a machine learning-based system for detecting clinical mastitis in cattle, improving detection speed based on symptoms exhibited by the cattle. Their comparison of machine learning models revealed that the K-nearest neighbors (KNN) model achieved 99.46% accuracy, with sensitivity and specificity of 94.7% and 98.9%, respectively. However, because KNN is a lazy learner that defers all computation to prediction time, it is computationally less efficient when making predictions.
The proposed mastitis detection algorithm aims to accurately predict cattle mastitis status for real-time applications. The subsequent sections cover the proposed methodology, which utilizes optimized feature descriptors and machine learning algorithms, followed by experimental results and a comparison of the algorithm’s performance with various models. The conclusion is provided in Section IV.