Article
Classification of Daily Body Weight Gains in Beef Calves Using
Decision Trees, Artificial Neural Networks, and
Logistic Regression
Wilhelm Grzesiak 1 , Daniel Zaborski 1, * , Renata Pilarczyk 1 , Jerzy Wójcik 1 and Krzysztof Adamczyk 2
1 Department of Ruminants Science, West Pomeranian University of Technology, Klemensa Janickiego 29,
71-270 Szczecin, Poland
2 Department of Genetics, Animal Breeding and Ethology, University of Agriculture in Krakow,
al. Mickiewicza 24/28, 30-059 Kraków, Poland; krzysztof.adamczyk@urk.edu.pl
* Correspondence: daniel.zaborski@zut.edu.pl; Tel.: +48-9-1449-6813; Fax: +48-9-1449-6800
Simple Summary: In the management of beef cattle, it can be useful to divide individuals based on
a specific trait value (below and above average). This in turn allows for focusing on a larger group
of animals with the aim of improving, e.g., their growth rate or obtaining a more uniform group in
terms of a given trait. Classifying calves into less (below average) and more (above average) efficient
growth creates an opportunity for producers to direct their efforts towards the “worse” animals and
improve their performance through adjustments in nutrition, animal grouping, or reorganization
of work. In this study, models were developed based on data from a beef farm. They were used
to classify beef calves into poorer and better growth groups. In order to obtain more input data,
predictions were made for the third calf. Among the analyzed models, random forest was the most
effective. The most significant factors influencing daily body weight gains were also identified
and discussed in the present study. The results demonstrate that machine learning models can be
useful for classifying calves based on their growth rates. However, it is necessary to maintain proper
breeding documentation from which the predictors can be obtained.
Citation: Grzesiak, W.; Zaborski, D.; Pilarczyk, R.; Wójcik, J.; Adamczyk, K. Classification of Daily Body
Weight Gains in Beef Calves Using Decision Trees, Artificial Neural Networks, and Logistic Regression.
Animals 2023, 13, 1956. https://doi.org/10.3390/ani13121956
Academic Editor: Sébastien Buczinski
Received: 17 April 2023
Revised: 7 June 2023
Accepted: 9 June 2023
Published: 11 June 2023
Abstract: The aim of the present study was to compare the predictive performance of decision trees,
artificial neural networks, and logistic regression used for the classification of daily body weight gains
in beef calves. A total of 680 pure-breed Simmental and 373 Limousin cows from the largest farm in
the West Pomeranian Province, whose calves were fattened between 2014 and 2016, were included in
the study. Pre-weaning daily body weight gains were divided into two categories: A (equal to or
lower than the weighted mean for each breed and sex) and B (higher than the mean). Models were
developed separately for each breed. Sensitivity, specificity, accuracy, and area under the curve on
a test set for the best model (random forest) were 0.83, 0.67, 0.76, and 0.82 and 0.68, 0.86, 0.78, and
0.81 for the Limousin and Simmental breeds, respectively. The most important predictors were daily
weight gains of the dam when she was a calf, daily weight gains of the first calf, sex of the third calf,
milk yield at first lactation, birth weight of the third calf, dam birth weight, dam hip height, and
second calving season. The selected machine learning models can be used quite effectively for the
classification of calves based on their daily weight gains.
Keywords: classification; body weight gains; beef calves; decision trees; artificial neural networks;
logistic regression
achieve optimal body weight at slaughter within a specified time. It is also significant to
“balance” the herd structure or the group of animals in terms of factors such as body weight
gains, despite individual variations. Focusing on a larger group of animals to improve a
certain trait is sometimes more desirable than paying special attention to single individuals,
which in turn is typical of breeding.
Classifying calves into less (below average) and more (above average) efficiently growing
categories allows the producer to direct efforts towards “inferior” animals in order to
improve performance through adjustments in nutrition, animal grouping, reorganization
of work, etc. Regular, at least monthly, body weight control by weighing of animals is
a fundamental activity that determines fattening progress (growth rate analysis) and its
completion. It should, however, be noted that weighing is a stressful activity for animals
(especially under grazing conditions). It reduces their welfare and generates additional
financial costs and organizational challenges for a farmer. However, it is possible to use
alternative methods of body weight estimation and data processing by the application of
certain statistical models [6,7]. Advanced technologies, including machine learning methods,
can reduce human errors, increase farmers’ profits, improve farm productivity and
animal welfare, and aid in the development of more holistic, humane, and environmentally
friendly practices [8].
Predicting the magnitude of a trait (defined class) and determining the influence of
various factors and rules that affect this prediction becomes possible with the use of classical
statistical methods, the so-called machine learning methods, and artificial intelligence [9].
However, determining in advance which method will provide the highest accuracy can be
challenging [10].
Decision trees are classification or regression models formulated in a tree-like architecture.
The dataset is progressively organized into smaller, homogeneous subsets, forming a
connected tree graph [11]. Each internal node of the tree structure represents a pairwise
comparison of a selected trait. Each branch depicts the outcome of that comparison [12].
Leaf nodes represent the final decision or prediction made after traversing the path from
the root to the leaf, expressed as a classification rule [13–15].
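The root-to-leaf traversal described above can be illustrated with a minimal Python sketch. The tree below is a hypothetical toy example: the variable names (`dam_daily_gain_g`, `calf_birth_weight_kg`) and both thresholds are invented for illustration and are not taken from the models fitted in this study.

```python
# Hypothetical toy tree: split variables and thresholds are invented for
# illustration and are NOT taken from the fitted CART/CHAID models.
TREE = {
    "feature": "dam_daily_gain_g",
    "threshold": 1000.0,
    "left": {"leaf": "A"},  # branch taken when value <= threshold
    "right": {
        "feature": "calf_birth_weight_kg",
        "threshold": 38.0,
        "left": {"leaf": "A"},
        "right": {"leaf": "B"},
    },
}


def classify(node, record, path=None):
    """Walk from the root to a leaf; return (class label, list of rule conditions)."""
    path = [] if path is None else path
    if "leaf" in node:
        return node["leaf"], path
    value = record[node["feature"]]
    if value <= node["threshold"]:
        cond = f'{node["feature"]} <= {node["threshold"]}'
        return classify(node["left"], record, path + [cond])
    cond = f'{node["feature"]} > {node["threshold"]}'
    return classify(node["right"], record, path + [cond])


label, rule = classify(TREE, {"dam_daily_gain_g": 1100.0, "calf_birth_weight_kg": 40.0})
print("class", label, "IF", " AND ".join(rule))
```

Each internal node holds one comparison, each branch one outcome, and the conditions collected along the path form the classification rule attached to the leaf.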
Decision trees, belonging to machine learning methods, allow for predicting the level
of different traits while controlling for the factors affecting it. The application of these
techniques in various zootechnical analyses is increasing [16–20]. The obtained information
allows for identifying individuals with potentially low values of a given trait. This in turn
enables actions towards improving this condition or indicating animals for culling [16].
The choice of decision trees is based on many useful aspects of their applicability, such as
the clear explanation of the obtained solutions and the intuitiveness of interpretation [21].
Moreover, unlike artificial neural networks, the construction of a decision tree model is easy
to follow. The quality of trees is not only determined by their predictive performance, but
also by the splitting rules illustrating useful relationships [22]. Artificial neural networks
(ANN), on the other hand, are a combination of multiple processing units that create a specific
topology and mimic complex biological functions to solve regression or classification
problems [15].
In terms of the previous studies on the application of statistical methods to daily weight
gain prediction, Cominotte et al. [6] predicted body weight and average daily body weight
gains in beef cattle based on three-dimensional images using multiple linear regression,
least absolute shrinkage and selection operator, partial least squares, and ANN. The ANN
was superior for predicting average daily body weight gains for the weaning to stocker,
weaning to the beginning of feedlot, weaning to the end of feedlot, stocker to the beginning
of feedlot, and beginning to the end of feedlot periods. Benedeti et al. [23] estimated and
validated regression equations to predict carcass weight, empty body weight gain, and
retained energy of Zebu cattle based on the independent variables such as slaughter plant,
sex, genotype, shrunk body weight, carcass gain, and equivalent empty body weight. The
authors stated that the equations accurately and precisely estimated empty body weight
gain of Zebu and Zebu-cross cattle in the independent validation dataset based on daily
carcass gain. Zhao et al. [7] applied commercial software (Cornell Net Carbohydrate
and Protein System) for predicting daily weight gains in Chinese local beef breeds based
on an ingested metabolizable energy allowance automatically calculated by the computer
model. The generated predictions were fairly accurate as evidenced by low biases between
predicted and observed values. Lee et al. [24] used different machine learning methods
(linear regression, tree regression, adaptive boosting, and a deep neural network) for
predicting average daily weight gains in pigs based on temperature, humidity, feed intake,
and the current animal weight. The applied algorithms were capable of predicting the
trait accurately despite the heterogeneity of the growth characteristics of pigs. In
addition, ANNs were superior to other models. Osorio et al. [25] predicted average daily
gains in lambs based on the digestibility and composition of the diet using a regression
model and obtained acceptable accuracy and precision. Finally, Aranda et al. [26] utilized a
deterministic model for predicting daily weight gains in growing steers grazing tropical
pastures. It included the effects of protein and energy intake from forages and supplements.
However, its accuracy was quite limited.
The aim of this study was to classify daily body weight gains in beef calves (below
or above the regional average based on data recorded on a farm) using decision trees
(classification and regression trees (CART), chi-square automatic interaction detection
(CHAID) trees, and random forests (RF)). The obtained results were compared with another
artificial intelligence method, i.e., ANN, and a classical statistical method, i.e., logistic
regression (LR). The second aim of the present study was to identify the variables that
contributed most to the classification of daily body weight gains.
The article is organized as follows: Section 2 describes materials and methods. Section 3
contains results. Section 4 discusses the results and describes related works in the same field.
Section 5 includes the final conclusions and possible avenues for future work.
Table 1. Selected morphometric parameters and performance indicators for the examined dams
grouped by breed.
Beef cattle were kept in a rotational grazing system with voluntary use of shelters and
fed a total mixed ration (TMR). The silage consisting mainly of corn, grass, alfalfa, and crop
plants supplemented with vitamin–mineral premixes was fed twice a day. The management
and feeding were consistent for all animals during the analyzed period (adjusted for animal
age and body weight), and no significant changes or deviations that could affect body
weight gains were observed. A more detailed description of the management system is
provided by Pilarczyk and Wójcik [28].
The data from farm records about dams and their offspring for three successive
calvings were included in the models (Table 2).
A total of 21 predictor variables were used. The classification (predicted) variable was
the average daily body weight gain of the third calf of the cow, measured until its weaning
at approximately 210 days of age. The gains were calculated from birth to the weaning of
each calf. The weighted average was determined for calves of the same breed and sex from
the two previous periods of evaluating beef cattle productive value in the breeding region
where the research was conducted. For the purpose of classification, daily body weight
gains of calves from the third calving of the cow (third production period) were divided
into two categories (classes):
• Class A—the gains equal to or lower than the established weighted average body
weight gains. This class represented calves with “worse” daily body weight gains.
• Class B—the gains higher than the established weighted average body weight gains.
This class represented calves with “better” daily body weight gains.
The classes were reasonably balanced for both the Simmental (Class A: 229 calves (43.45%);
Class B: 298 calves (56.55%)) and Limousin (Class A: 175 calves (55.73%); Class B: 139 calves
(44.27%)) breeds.
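The labelling rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example weights, and the reference mean of 1100 g are hypothetical (the actual reference values are those of Table 3), while the class counts are the ones reported in the text.

```python
def daily_gain_g(birth_weight_kg, weaning_weight_kg, age_days):
    """Average daily gain from birth to weaning, in grams per day."""
    return (weaning_weight_kg - birth_weight_kg) * 1000.0 / age_days


def gain_class(gain_g, weighted_mean_g):
    """Class A: gain <= weighted mean; Class B: gain > weighted mean."""
    return "A" if gain_g <= weighted_mean_g else "B"


# Hypothetical reference value for one breed-sex group (not from Table 3).
HYPOTHETICAL_MEAN_G = {("Limousin", "bull"): 1100.0}

# Hypothetical calf: 40 kg at birth, 280 kg at weaning, weaned at 210 days.
gain = daily_gain_g(birth_weight_kg=40.0, weaning_weight_kg=280.0, age_days=210)
label = gain_class(gain, HYPOTHETICAL_MEAN_G[("Limousin", "bull")])

# Class shares recomputed from the counts reported in the text.
simmental = {"A": 229, "B": 298}
share_a = 100.0 * simmental["A"] / sum(simmental.values())
print(round(gain, 1), label, round(share_a, 2))
```

A gain exactly equal to the reference mean falls into Class A, in line with the "equal to or lower" definition.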
The average daily weight gains for calves of a given breed and sex from the two
previous periods of evaluating beef cattle productive value (successive production years)
were obtained from the “Evaluation of Beef Cattle Productive Value” published by the
Polish Association of Beef Cattle Breeders and Producers (https://bydlo.com.pl/ocena-
wartosci-uzytkowej-bydla-miesnego/, accessed on 24 January 2023). They are presented in
Table 3.
Table 3. Weighted average daily body weight gains (g) during the study period for Limousin and
Simmental heifer and bull calves in the West Pomeranian Province.
The same training set was used to prepare the LR, ANN, and decision tree (CART,
CHAID, and RF) models. Similarly, the same test set was used to verify the predictive
performance of each model, i.e., their ability to classify (detect) daily body weight gains
of calves above and below the average daily body weight gain determined for a given
breed and sex. In the process of hyper-parameter tuning, the default values were used
as the basis for each model. They were subsequently modified in order to check whether
model performance improved. Model selection was based on a validation set or a 10-fold
cross-validation. In the case of ANN, the so-called automatic network designer was used to
find the best model among multilayer perceptrons (MLP) with one and two hidden layers,
radial basis function (RBF), and linear networks. The total number of the analyzed networks
was 200, whereas activation functions for the hidden and output neurons included linear,
logistic, hyperbolic tangent, and exponential ones. The sum of squares and cross-entropy
were used as error functions and weight reduction was performed from 0.001 to 0.01. For
RF, the optimal number of component trees was determined on the validation set, whereas
the 10-fold cross-validation was applied to find the best CART, CHAID, and LR models.
The final parameters for individual models are presented in Table 5.
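The 10-fold cross-validation used for model selection can be sketched in plain Python. The names `k_fold_indices` and `cross_validated_score`, the sample size of 200, and the random seed are illustrative assumptions; the study itself relied on Statistica, and its internal fold assignment is not described here.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k folds; return a list of (train, validation) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


def cross_validated_score(n_samples, fit_and_score, k=10, seed=0):
    """Average a user-supplied fit_and_score(train, validation) callable over k folds."""
    splits = k_fold_indices(n_samples, k, seed)
    return sum(fit_and_score(tr, va) for tr, va in splits) / len(splits)


# Hypothetical training-set size; each case is used for validation exactly once.
splits = k_fold_indices(200, k=10)
print(len(splits), [len(va) for _, va in splits])
```

Candidate hyper-parameter settings would each be passed through `cross_validated_score` with the same folds, and the setting with the best average score retained.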
\[
\mathrm{SEN} = \frac{TP}{TP + FN}, \qquad \mathrm{SPF} = \frac{TN}{FP + TN}, \qquad \mathrm{ACC} = \frac{TP + TN}{TP + FP + FN + TN}
\]
where:
                                     Observed
Predicted                 Class A (Lower Gains)   Class B (Higher Gains)
Class A (lower gains)     TP                      FP
Class B (higher gains)    FN                      TN
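As a worked illustration of these definitions, the short Python sketch below computes the three indicators from the four cells of the classification matrix, with Class A (lower gains) treated as the positive class. The counts are hypothetical and do not come from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sen = tp / (tp + fn)                   # correctly detected Class A calves
    spf = tn / (fp + tn)                   # correctly detected Class B calves
    acc = (tp + tn) / (tp + fp + fn + tn)  # all correct classifications
    return sen, spf, acc


# Hypothetical counts, chosen only to exercise the formulas.
sen, spf, acc = classification_metrics(tp=45, fp=12, fn=9, tn=28)
print(round(sen, 2), round(spf, 2), round(acc, 2))
```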
chart for better readability, so that the bar for the most important variable is the highest).
The importance of variables to the LR model was primarily analyzed based on their
statistical significance, and the ranks were subsequently assigned as for the models above.
For ANN, predictor importance was determined using sensitivity analysis, taking into
account the ratio and rank of each variable. The ratio should be interpreted as follows:
if the value for a given variable is above one, removing that variable from the training
set may result in decreased model quality, and vice versa: values below one indicate that
the contribution of the variable is insignificant. The chart shows the five most important
variables for each model.
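The interpretation of the sensitivity-analysis ratio can be sketched as follows. The ratio values are hypothetical (only the variable abbreviations are borrowed from this study), and the helper `rank_by_ratio` is an illustrative name, not part of the software used.

```python
def rank_by_ratio(ratios):
    """Return [(rank, variable, ratio, contributes)] sorted by decreasing ratio."""
    ordered = sorted(ratios.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, name, ratio, ratio > 1.0)  # ratio above one: removal may hurt the model
        for rank, (name, ratio) in enumerate(ordered, start=1)
    ]


# Hypothetical ratios for four predictors; the values are invented.
ratios = {"CG3": 1.08, "DDBWG": 1.42, "HH": 0.97, "CDBWG1": 1.15}
ranking = rank_by_ratio(ratios)
for rank, name, ratio, contributes in ranking:
    print(rank, name, ratio, "contributes" if contributes else "negligible")
```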
This correlation coefficient is based on true positives and false positives as well as
true negatives and false negatives. It is a balanced measure even when individual classes
have different sample sizes. MCC ranges from −1 to +1, where values close to −1 indicate
perfect misclassification and those close to +1 show excellent classification. Values around
0 suggest random classification [30,31].
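The behaviour at the ends and the middle of this range can be checked with a short Python sketch based on the standard definition of MCC, (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)). Returning 0 when the denominator is zero is a common convention assumed here, and the counts are hypothetical.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


print(mcc(50, 0, 0, 50))    # every case classified correctly
print(mcc(0, 50, 50, 0))    # every case classified incorrectly
print(mcc(25, 25, 25, 25))  # predictions unrelated to the observed classes
```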
Statistical analyses were performed using Statistica software (v. 13.3, Tibco Inc., Tulsa,
OK, USA 2018) and Statistica Neural Networks program (StatSoft Inc., Tulsa, OK, USA).
3. Results
The CART and CHAID models obtained in the present study are presented in Figure S1.
Table S1 summarizes their quality indicators (SEN, SPF, ACC, and MCC) calculated on the
training set for the Limousin and Simmental breeds. Table S2 presents the number of cases
in classification matrices obtained for the training set.
For the Limousin breed, RF had the highest sensitivity (0.83) and CHAID had the
lowest one (0.55). The sensitivity of other models ranged from 0.72 for ANN to 0.78 for
CART. Statistically significant differences in sensitivity were found only between CHAID and
the other models (Table 7). The specificity ranged from 0.59 for LR to 0.77 for CHAID,
although the differences were not statistically significant. The accuracy ranged from
0.64 for CHAID to 0.76 for RF.
For the Simmental breed, ANN had the highest sensitivity (0.77), while LR had the
lowest one (0.66). The sensitivity of other models was similar (0.73 for CHAID, 0.70 for
CART, and 0.68 for RF). These values did not differ significantly. The RF model had
the highest specificity (0.86), followed by CART (0.80). These values were statistically
significantly different from those for other models (0.67 for LR, 0.62 for CHAID and ANN).
Table 7. Model performance indicators and Matthews correlation coefficients (MCC) on the test set
for the individual models.
        Limousin                         Simmental
Model   SEN       SPF    ACC    MCC      SEN    SPF        ACC       MCC
LR      0.76^B    0.59   0.68   0.35     0.66   0.67^Bb    0.67^b    0.33^b
CART    0.78^B    0.70   0.74   0.48     0.70   0.80^ACa   0.76      0.50
CHAID   0.55^Aa   0.77   0.64   0.33     0.73   0.62^BC    0.67^b    0.35
RF      0.83^B    0.67   0.76   0.52     0.68   0.86^A     0.78^a    0.55^a
ANN     0.73^b    0.67   0.70   0.39     0.77   0.62^BC    0.69      0.38^b
LR—logistic regression, CART—classification and regression trees, CHAID—chi-square automatic interaction
detector, RF—random forest, ANN—artificial neural networks, SEN—sensitivity, SPF—specificity, ACC—accuracy,
MCC—Matthews correlation coefficient; values with different superscript letters differ at p ≤ 0.05 (small letters)
and p ≤ 0.01 (capital letters).
The RF model also had the highest overall accuracy (0.78), while the LR and CHAID
models had the lowest one (0.67). These differences were statistically significant. The
ACC values for ANN and CART were 0.69 and 0.76, respectively (Table 7). The respective
classification matrices are shown in Table S3.
The MCC values for the Limousin breed ranged from 0.33 for CHAID to 0.52 for RF,
but these differences were not statistically significant. For the Simmental breed, the MCC
values ranged between 0.33 for LR and 0.55 for RF. Significant differences (p < 0.05) were
observed between the MCC values for RF and LR, as well as RF and CHAID (Table 7).
ROC curves with AUC values for individual models are presented in Figure 1. For the
Limousin breed, RF had the largest AUC (0.82), indicating that it was the best model based
on this criterion. CHAID had the smallest AUC (0.67). For other models, the AUC values
ranged from 0.74 to 0.78. Similar AUC values were observed for the Simmental breed. RF
had the highest AUC (0.81), while CHAID had the lowest one (0.71). The AUC values for
other models ranged from 0.75 to 0.78.
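For readers less familiar with AUC, the sketch below computes it through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The scores and labels are hypothetical, and the pairwise loop is an illustrative choice that favours clarity over speed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of the positive class and observed labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
value = auc(scores, labels)
print(round(value, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.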
Figure 1. Receiver operating characteristic (ROC) curves for individual models on the test set (the
Limousin and Simmental breeds). RF = random forest, CHAID = chi-square automatic interaction
detector, CART = classification and regression trees, LR = logistic regression, ANN = artificial
neural network.
Predictor Importance
For each model, the most important predictor was daily body weight gains of the dam when she
was a calf (DDBWG). Daily body weight gains for the first calf (CDBWG1) were included among
the top five most important predictive variables for all models, except CHAID. Sex of the third
calf (CG3) was important for all models, except RF. Additionally, milk yield from calving to
weaning at first lactation (CMY1) was important for all models, except ANN. The remaining
variables (milk yield from calving to weaning at second lactation (CMY2), birth weight of the
third calf (CBW3), dam birth weight (DBW), hip height (HH), and second calving season (CS2))
were important for some models (Figure 2). The full set of predictor variables with their ranks
is presented in the Supplementary Material Table S4.
Figure 2. Comparison of the top five most important variables for individual models. DDBWG =
daily body weight gains of the dam when she was a calf; CDBWG1 = daily body weight gains for
the first calf; CG3 = sex of the third calf; CMY2 = milk yield from calving to weaning at second
lactation; CMY1 = milk yield from calving to weaning at first lactation; CBW3 = birth weight of the
third calf; DBW = dam birth weight; HH = hip height; CS2 = second calving season; RF = random
forest, CHAID = chi-square automatic interaction detector, CART = classification and regression
trees, LR = logistic regression, ANN = artificial neural network.
4. Discussion
The novelty of our approach consisted in the use of routinely collected on-farm data
for daily body weight gain prediction. The second advantage was the prediction of daily
body weight gains in beef cattle (and not body weight itself). There are few studies
about daily weight gain prediction in cattle, especially using machine learning methods.
Finally, the classification of daily weight gains was applied in our study, which enabled
animal grouping.
4.1. Model Quality and Predictive Performance
The analysis of two separate data sets, divided by breed, resulted from the heterogeneity
of classification accuracy in data segments, as indicated by Magidson [32] and Ratner [33].
Due to an insufficient sample size, the data set was not further divided according to sex.
The performance indicators obtained on the test set for individual models were generally
lower than those on the training set. Exceptions were sensitivity, accuracy, and MCC
Animals 2023, 13, 1956 10 of 16
measurements. Moreover, the prediction pertains to the daily weight gains of the third calf
of the cow while maintaining continuity (i.e., provided that the data for the first and second
calf of the same dam are available). In future research, a larger sample size, additional
predictors, and other types of predictive models should be used.
5. Conclusions
The present study showed that the applied predictive models based on farm-available
data had the ability to fairly accurately classify daily body weight gains in the third
production cycle. They showed high sensitivity in predicting lower gains, moderate
specificity (ability of the model to correctly identify higher body weight gains), balanced
accuracy, and moderate MCC values. The RF model performed the best in terms of various
quality indicators, with most of them being higher than those for other models, whereas the
CHAID model exhibited the worst predictive performance. LR and ANN also performed
well. However, decision trees and ANN have an advantage over LR, because they do not
require assumptions of their applicability. Among the 21 variables, several of the most
important ones can be utilized, namely daily body weight gains of the dam when she was a
calf, daily body weight gains for the first calf, sex of the third calf (from the third pregnancy),
milk yield from calving to weaning at first lactation, milk yield from calving to weaning at
second lactation, birth weight of the third calf, dam birth weight, hip height, and second
calving season, although not for all models, each of which requires an individual approach.
It should be mentioned that random forest can be used for the preliminary processing of
on-farm collected data to assess the chances of dams’ offspring for low or high body gains
according to the adopted criterion.
In the future, predictors that turned out to be less important in the present study
could be excluded from the models, which would facilitate their practical application on
beef farms. On the other hand, the current predictive performance could be increased by
including additional predictors (not utilized in the present study), such as animal welfare
parameters (e.g., temperature–humidity index), the occurrence of (chronic) diseases, the
proportion of animals that died during the fattening period, economic factors related to the
producer’s decisions about fattening management (e.g., the ratio of livestock prices to feed
prices), and those associated with animal feeding. These new variables would, however,
require additional measurements to be taken. It is also recommended to develop ensemble
models (based on majority voting), which would integrate decision trees, artificial neural
networks, logistic regression, etc., in order to further improve predictive performance.
Taking into account prediction results for beef calves obtained in the present study, similar
models could also be used for classifying daily weight gains in other beef breeds for different
production regions. Such a classification system may be extended to other livestock species
(e.g., pigs) and combined with dressing percentage prediction. Finally, the developed
models could be implemented in a computer program.
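The majority-voting ensemble recommended above could be prototyped along the following lines. This is a sketch under stated assumptions, not an implementation from the study: the per-model predictions are hypothetical, and resolving ties in favour of Class A (so that borderline calves are flagged for attention) is an arbitrary choice made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent class label; ties resolve to the alphabetically first label."""
    counts = Counter(predictions)
    top = max(counts.values())
    winners = sorted(label for label, count in counts.items() if count == top)
    return winners[0]  # with labels "A" and "B", a tie yields "A" (assumed rule)


# Hypothetical predictions for one calf from five models (LR, CART, CHAID, RF, ANN).
votes = ["A", "B", "B", "A", "B"]
print(majority_vote(votes))
```

With an odd number of component models and two classes, ties cannot occur, so the tie rule matters only if a model is dropped or abstains.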
References
1. Yin, T.; König, S. Genetic Parameters for Body Weight from Birth to Calving and Associations between Weights with Test-Day,
Health, and Female Fertility Traits. J. Dairy Sci. 2018, 101, 2158–2170. [CrossRef] [PubMed]
2. Noinan, K.; Wicha, S.; Chaisricharoen, R. The IoT-Based Weighing System for Growth Monitoring and Evaluation of Fattening
Process in Beef Cattle Farm. In Proceedings of the 2022 Joint International Conference on Digital Arts, Media and Technology
with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT
& NCON), Online, 26–28 January 2022; pp. 384–388.
3. Ruchay, A.; Kober, V.; Dorofeev, K.; Kolpakov, V.; Dzhulamanov, K.; Kalschikov, V.; Guo, H. Comparative Analysis of Machine
Learning Algorithms for Predicting Live Weight of Hereford Cows. Comput. Electron. Agric. 2022, 195, 106837. [CrossRef]
4. Wang, Z.; Shadpour, S.; Chan, E.; Rotondo, V.; Wood, K.M.; Tulpan, D. ASAS-NANP SYMPOSIUM: Applications of Machine
Learning for Livestock Body Weight Prediction from Digital Images. J. Anim. Sci. 2021, 99, skab022. [CrossRef]
5. Gjergji, M.; de Moraes Weber, V.; Silva, L.O.C.; da Costa Gomes, R.; De Araújo, T.L.A.C.; Pistori, H.; Alvarez, M. Deep Learning
Techniques for Beef Cattle Body Weight Prediction. In Proceedings of the 2020 International Joint Conference on Neural Networks
(IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
6. Cominotte, A.; Fernandes, A.F.A.; Dorea, J.R.R.; Rosa, G.J.M.; Ladeira, M.M.; van Cleef, E.; Pereira, G.L.; Baldassini, W.A.; Neto,
O.M. Automated Computer Vision System to Predict Body Weight and Average Daily Gain in Beef Cattle during Growing and
Finishing Phases. Livest. Sci. 2020, 232, 103904. [CrossRef]
7. Zhao, J.S.; Zhou, Z.M.; Ren, L.P.; Xiong, Y.Q.; Du, J.P.; Meng, Q.X. Evaluation of Dry Matter Intake and Daily Weight Gain
Predictions of the Cornell Net Carbohydrate and Protein System with Local Breeds of Beef Cattle in China. Anim. Feed Sci. Technol.
2008, 142, 231–246. [CrossRef]
8. Neethirajan, S. The Role of Sensors, Big Data and Machine Learning in Modern Animal Farming. Sens. Bio-Sens. Res. 2020,
29, 100367. [CrossRef]
9. Yusuf, M. Understanding the Relationship between Weather Variables, Dry Matter Intake, and Average Daily Gain of Beef Cattle.
Ph.D. Thesis, North Dakota State University, Fargo, ND, USA, 2021.
10. White, B.J.; Amrine, D.E.; Larson, R.L. Big Data Analytics and Precision Animal Agriculture Symposium: Data to Decisions.
J. Anim. Sci. 2018, 96, 1531–1539. [CrossRef]
11. Chen, W.; Xie, X.; Wang, J.; Pradhan, B.; Hong, H.; Bui, D.T.; Duan, Z.; Ma, J. A Comparative Study of Logistic Model Tree,
Random Forest, and Classification and Regression Tree Models for Spatial Prediction of Landslide Susceptibility. Catena 2017, 151,
147–160. [CrossRef]
12. Xia, Y.; Liu, C.; Li, Y.; Liu, N. A Boosted Decision Tree Approach Using Bayesian Hyper-Parameter Optimization for Credit
Scoring. Expert Syst. Appl. 2017, 78, 225–241. [CrossRef]
13. Gómez, M.A.; Ibáñez, S.J.; Parejo, I.; Furley, P. The Use of Classification and Regression Tree When Classifying Winning and
Losing Basketball Teams. Kinesiology 2017, 49, 47–56. [CrossRef]
14. Khandelwal, M.; Armaghani, D.J.; Faradonbeh, R.S.; Yellishetty, M.; Majid, M.Z.A.; Monjezi, M. Classification and Regression
Tree Technique in Estimating Peak Particle Velocity Caused by Blasting. Eng. Comput. 2017, 33, 45–53. [CrossRef]
15. Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674.
[CrossRef] [PubMed]
16. Grzesiak, W.; Rzewucka-Wójcik, E.; Zaborski, D.; Szatkowska, I.; Kotarska, K.; Dybus, A. Classification of Daily Body Weight
Gains in Beef Cattle via Neural Networks and Decision Trees. Appl. Eng. Agric. 2014, 30, 307–313. [CrossRef]
17. Eyduran, E.; Zaborski, D.; Waheed, A.; Celik, S.; Karadas, K.; Grzesiak, W. Comparison of the Predictive Capabilities of Several
Data Mining Algorithms and Multiple Linear Regression in the Prediction of Body Weight by Means of Body Measurements in
the Indigenous Beetal Goat of Pakistan. Pak. J. Zool. 2017, 49, 257–265. [CrossRef]
18. Aksoy, A.; Ertürk, Y.E.; Eyduran, E.; Tariq, M.M. Comparing Predictive Performances of MARS and CHAID Algorithms for
Defining Factors Affecting Final Fattening Live Weight in Cultural Beef Cattle Enterprises. Pak. J. Zool. 2018, 50, 2279–2286.
[CrossRef]
19. Hakem, M.; Boulouard, Z.; Kissi, M. Classification of Body Weight in Beef Cattle via Machine Learning Methods: A Review.
Procedia Comput. Sci. 2022, 198, 263–268. [CrossRef]
20. Tirink, C. Comparison of Bayesian Regularized Neural Network, Random Forest Regression, Support Vector Regression and
Multivariate Adaptive Regression Splines Algorithms to Predict Body Weight from Biometrical Measurements in Thalli Sheep.
Kafkas Üniversitesi Vet. Fakültesi Derg. 2022, 28, 411–419.
21. Stiglic, G.; Kocbek, P.; Fijacko, N.; Zitnik, M.; Verbert, K.; Cilar, L. Interpretability of Machine Learning-Based Prediction Models
in Healthcare. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2020, 10, e1379. [CrossRef]
22. Goel, P.K.; Prasher, S.O.; Patel, R.M.; Landry, J.-A.; Bonnell, R.B.; Viau, A.A. Classification of Hyperspectral Data by Decision
Trees and Artificial Neural Networks to Identify Weed Stress and Nitrogen Status of Corn. Comput. Electron. Agric. 2003, 39,
67–93. [CrossRef]
23. Benedeti, P.D.B.; Filho, S.C.V.; Chizzotti, M.L.; Marcondes, M.I.; de Sales Silva, F.A. Development of Equations to Predict Carcass
Weight, Empty Body Gain, and Retained Energy of Zebu Beef Cattle. Animal 2021, 15, 100028. [CrossRef] [PubMed]
24. Lee, W.; Han, K.-H.; Kim, H.T.; Choi, H.; Ham, Y.; Ban, T.-W. Prediction of Average Daily Gain of Swine Based on Machine
Learning. J. Intell. Fuzzy Syst. 2019, 36, 923–933. [CrossRef]
25. Osorio, A.I.; Mendoza, G.D.; Plata, F.X.; Martínez, J.A.; Vargas, L.; Ortega, G.C. A Simulation Model to Predict Body Weight Gain
in Lambs Fed High-Grain Diets. Small Rumin. Res. 2015, 123, 246–250. [CrossRef]
26. Aranda, E.; González, S.; Arjona, E.; Plata, F.; Vargas, L. A Simulation Model to Predict Body Weight Gain in Growing Steers
Grazing Tropical Pastures. Agric. Syst. 2006, 90, 99–111.
27. Deodhar, M.; Ghosh, J. A Framework for Simultaneous Co-Clustering and Learning from Complex Data. In Proceedings of the
13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, USA, 12–15 August 2007;
pp. 250–259.
28. Pilarczyk, R.; Wójcik, J. Comparison of Calf Rearing Results and Nursing Cow Performance in Various Beef Breeds Managed
under the Same Conditions in North-Western Poland. Czech J. Anim. Sci. 2007, 52, 325. [CrossRef]
29. Demler, O.V.; Pencina, M.J.; D’Agostino Sr, R.B. Misuse of DeLong Test to Compare AUCs for Nested Models. Stat. Med. 2012, 31,
2577–2587. [CrossRef] [PubMed]
30. Boughorbel, S.; Jarray, F.; El-Anbari, M. Optimal Classifier for Imbalanced Data Using Matthews Correlation Coefficient Metric.
PLoS ONE 2017, 12, e0177678. [CrossRef] [PubMed]
31. Chicco, D.; Jurman, G. The Advantages of the Matthews Correlation Coefficient (MCC) over F1 Score and Accuracy in Binary
Classification Evaluation. BMC Genom. 2020, 21, 6. [CrossRef]
32. Magidson, J. Some Common Pitfalls in Causal Analysis of Categorical Data. J. Mark. Res. 1982, 19, 461–471. [CrossRef]
33. Ratner, B. Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data; CRC Press: Boca Raton,
FL, USA, 2003.
34. Antipov, E.; Pokryshevskaya, E. Applying CHAID for Logistic Regression Diagnostics and Classification Accuracy Improvement.
J. Target. Meas. Anal. Mark. 2010, 18, 109–117. [CrossRef]
35. Więckowska, B. PQStat User Guide; PQStat Software: Poznań, Poland, 2013.
36. Luque, A.; Carrasco, A.; Martín, A.; de Las Heras, A. The Impact of Class Imbalance in Classification Performance Metrics Based
on the Binary Confusion Matrix. Pattern Recognit. 2019, 91, 216–231. [CrossRef]
37. Hill, T.; Lewicki, P. Statistics: Methods and Applications; StatSoft: Tulsa, OK, USA, 2007; ISBN 1-884233-59-7.
38. Breiman, L. Classification and Regression Trees; Chapman & Hall: Boca Raton, FL, USA, 1984; ISBN 978-0-412-04841-8.
39. Neslin, S.A.; Gupta, S.; Kamakura, W.; Lu, J.; Mason, C.H. Defection Detection: Measuring and Understanding the Predictive
Accuracy of Customer Churn Models. J. Mark. Res. 2006, 43, 204–211. [CrossRef]
40. Levin, N.; Zahavi, J. Continuous Predictive Modeling—A Comparative Analysis. J. Interact. Market. 1998, 12, 5–22. [CrossRef]
41. Soberon, F.; Raffrenato, E.; Everett, R.W.; Van Amburgh, M.E. Preweaning Milk Replacer Intake and Effects on Long-Term
Productivity of Dairy Calves. J. Dairy Sci. 2012, 95, 783–793. [CrossRef] [PubMed]
42. Hoseyni, F.; Mahjoubi, E.; Zahmatkesh, D.; Yazdi, M.H. Effects of Dam Parity and Pre-Weaning Average Daily Gain of Holstein
Calves on Future Milk Production. J. Dairy Res. 2016, 83, 453–455. [CrossRef]
43. Beal, W.E.; Notter, D.R.; Akers, R.M. Techniques for Estimation of Milk Yield in Beef Cows and Relationships of Milk Yield to Calf
Weight Gain and Postpartum Reproduction. J. Anim. Sci. 1990, 68, 937–943. [CrossRef] [PubMed]
44. Rahbar, R.; Abdullahpour, R.; Sadeghi-Sefidmazgi, A. Effect of Calf Birth Weight on Milk Production of Holstein Dairy Cattle in
Desert Climate. J. Anim. Behav. Biometeorol. 2016, 4, 65–70. [CrossRef]
45. Han, L.; Heinrichs, A.J.; De Vries, A.; Dechow, C.D. Relationship of Body Weight at First Calving with Milk Yield and Herd Life.
J. Dairy Sci. 2021, 104, 397–404. [CrossRef]
46. Weik, F.; Hickson, R.E.; Morris, S.T.; Garrick, D.J.; Archer, J.A. Genetic Parameters for Maternal Performance Traits in Commercially
Farmed New Zealand Beef Cattle. Animals 2021, 11, 2509. [CrossRef]
47. Williams, J.L.; Łukaszewicz, M.; Bertrand, J.K.; Misztal, I. Genotype by Region and Season Interactions on Weaning Weight in
United States Angus Cattle. J. Anim. Sci. 2012, 90, 3368–3374. [CrossRef]
48. Casasús, I.; Sanz, A.; Villalba, D.; Ferrer, R.; Revilla, R. Factors Affecting Animal Performance during the Grazing Season in a
Mountain Cattle Production System. J. Anim. Sci. 2002, 80, 1638–1651. [CrossRef]
49. Janovick, N.A.; Russell, J.R.; Strohbehn, D.R.; Morrical, D.G. Productivity and Hay Requirements of Beef Cattle in a Midwestern
Year-Round Grazing System. J. Anim. Sci. 2004, 82, 2503–2515. [CrossRef]
50. Bowen, J.M.; Haskell, M.J.; Miller, G.A.; Mason, C.S.; Bell, D.J.; Duthie, C.A. Early Prediction of Respiratory Disease in Preweaning
Dairy Calves Using Feeding and Activity Behaviors. J. Dairy Sci. 2021, 104, 12009–12018. [CrossRef]
51. Becker, C.A.; Aghalari, A.; Marufuzzaman, M.; Stone, A.E. Predicting Dairy Cattle Heat Stress Using Machine Learning Techniques.
J. Dairy Sci. 2021, 104, 501–524. [CrossRef]
52. Uckardes, F.; Narinc, D.; Kucukonder, H.; Rathert, T.C. Application of Classification Tree Method to Determine Factors Affecting
Fertility in Japanese Quail Eggs. J. Anim. Sci. Adv. 2014, 4, 1017–1023. [CrossRef]
53. Kramer, E.; Cavero, D.; Stamer, E.; Krieter, J. Mastitis and Lameness Detection in Dairy Cows by Application of Fuzzy Logic.
Livest. Sci. 2009, 125, 92–96. [CrossRef]
54. Kamphuis, C.; Mollenhorst, H.; Heesterbeek, J.A.P.; Hogeveen, H. Detection of Clinical Mastitis with Sensor Data from Automatic
Milking Systems Is Improved by Using Decision-Tree Induction. J. Dairy Sci. 2010, 93, 3616–3627. [CrossRef]
55. Piwczyński, D.; Nogalski, Z.; Sitkowska, B. Statistical Modeling of Calving Ease and Stillbirths in Dairy Cattle Using the
Classification Tree Technique. Livest. Sci. 2013, 154, 19–27. [CrossRef]
56. Pietersma, D.; Lacroix, R.; Lefebvre, D.; Wade, K.M. Induction and Evaluation of Decision Trees for Lactation Curve Analysis.
Comput. Electron. Agric. 2003, 38, 19–32. [CrossRef]
57. Ortiz-Pelaez, Á.; Pfeiffer, D.U. Use of Data Mining Techniques to Investigate Disease Risk Classification as a Proxy for Compromised
Biosecurity of Cattle Herds in Wales. BMC Vet. Res. 2008, 4, 24. [CrossRef] [PubMed]
58. Shahinfar, S.; Page, D.; Guenther, J.; Cabrera, V.; Fricke, P.; Weigel, K. Prediction of Insemination Outcomes in Holstein Dairy
Cattle Using Alternative Machine Learning Algorithms. J. Dairy Sci. 2014, 97, 731–742. [CrossRef] [PubMed]
59. Montgomery, M.E.; White, M.E.; Martin, S.W. A Comparison of Discriminant Analysis and Logistic Regression for the Prediction
of Coliform Mastitis in Dairy Cows. Can. J. Vet. Res. 1987, 51, 495. [PubMed]
60. Yang, X.Z.; Lacroix, R.; Wade, K.M. Investigation into the Production and Conformation Traits Associated with Clinical Mastitis
Using Artificial Neural Networks. Can. J. Anim. Sci. 2000, 80, 415–426. [CrossRef]
61. Basarab, J.A.; Rutter, L.M.; Day, P.A. The Efficacy of Predicting Dystocia in Yearling Beef Heifers: II. Using Discriminant Analysis.
J. Anim. Sci. 1993, 71, 1372–1380. [CrossRef] [PubMed]
62. Pastell, M.E.; Kujala, M. A Probabilistic Neural Network Model for Lameness Detection. J. Dairy Sci. 2007, 90, 2283–2292.
[CrossRef]
63. Abell, C.E.; Johnson, A.K.; Karriker, L.A.; Millman, S.T.; Stalder, K.J. Using Classification Trees to Detect Lameness in Sows. Anim.
Ind. Rep. 2013, AS 659, ASL R2830.
64. Tambuyzer, T.; de Waele, T.; Meyfroidt, G.; Berghe, G.; Goddeeris, B.M.; Berckmans, D.; Aerts, J.-M. Algorithms of Biomarkers
for Monitoring Infection/Inflammation Processes in Pigs. In Proceedings of the Animal Hygiene and Sustainable Livestock
Production, XVth International Congress of the International Society for Animal Hygiene, Vienna, Austria, 3–7 July 2011; Volume
1, pp. 155–158.
65. Heirbaut, S.; Jing, X.P.; Stefańska, B.; Pruszyńska-Oszmałek, E.; Buysse, L.; Lutakome, P.; Zhang, M.Q.; Thys, M.; Vandaele, L.;
Fievez, V. Diagnostic Milk Biomarkers for Predicting the Metabolic Health Status of Dairy Cattle during Early Lactation. J. Dairy
Sci. 2023, 106, 690–702. [CrossRef]