Indian Journal of Animal Research

  • Chief Editor: K.M.L. Pathak

  • Print ISSN 0367-6722

  • Online ISSN 0976-0555

  • NAAS Rating 6.50

  • SJR 0.263

  • Impact Factor 0.5 (2023)

Indian Journal of Animal Research, volume 56 issue 9 (September 2022) : 1158-1163

Comparison of Different Count Models for Investigation of Some Environmental Factors Affecting Stillbirth in Holsteins

Y. Gevrekçi1,*, Ö.İ. Güneri2, Ç. Takma1, A. Yeşilova3
1Department of Animal Science, Faculty of Agriculture, University of Ege, Izmir, Turkey.
2Department of Statistics, Faculty of Science, University of Muğla Sıtkı Koçman, Muğla, Turkey.
3Department of Animal Science, Faculty of Agriculture, University of Yüzüncü Yıl, Van, Turkey.
Cite article: Gevrekçi Y., Güneri Ö.İ., Takma Ç., Yeşilova A. (2022). Comparison of Different Count Models for Investigation of Some Environmental Factors Affecting Stillbirth in Holsteins. Indian Journal of Animal Research. 56(9): 1158-1163. doi: 10.18805/IJAR.BF-1415.
Background: The objective of this study was to compare different count data models for stillbirth data. For this type of data, Poisson regression or one of its alternatives may be preferred.

Methods: Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, Poisson-logit hurdle and negative binomial-logit hurdle regressions were compared and used to examine the effects of the independent variables gender, parity and herd-year-season on stillbirth. The log-likelihood statistic, Akaike Information Criterion, Bayesian Information Criterion and rootograms were used to compare model performance. According to these criteria, the negative binomial-logit hurdle regression model was chosen as the best model.

Result: The parameter estimates obtained with the negative binomial-logit hurdle regression model for the effects of gender, parity and herd-year-season on stillbirth were significant (p<0.01). Stillbirth incidence was higher in males than in females and decreased as parity increased. The negative binomial-logit hurdle model was therefore the best model for overdispersed stillbirth count data.
Count data are generally right-skewed and do not follow a normal distribution (Pittman et al., 2018). They consist of non-negative integer observations (zero or greater) and have a discrete distribution. For this kind of data, Poisson regression or other models may be preferred, depending on the relationship between the mean and the variance.
       
The Poisson regression model has a single parameter and assumes equidispersion, that is, the mean is equal to the variance. However, equality of mean and variance rarely occurs in practice. The case where the variance is greater than the mean is called overdispersion; the case where it is smaller than the mean is called underdispersion (Cox, 1983). Ignoring overdispersion in the analysis leads to underestimated standard errors of the parameters and hence to misinterpretation.
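The mean-variance relationship described above can be written out as follows. This is a sketch in standard notation: the symbols μ and α and the quadratic (NB2) form of the negative binomial variance are assumptions made for illustration, since the text does not state which parameterisation was used.

```latex
% Poisson: a single parameter \mu, with mean equal to variance (equidispersion)
P(Y = y) = \frac{e^{-\mu}\,\mu^{y}}{y!}, \qquad y = 0, 1, 2, \ldots, \qquad
E(Y) = \operatorname{Var}(Y) = \mu .

% Negative binomial (NB2 form, assumed here): an extra dispersion parameter \alpha > 0
E(Y) = \mu, \qquad \operatorname{Var}(Y) = \mu + \alpha\,\mu^{2} > \mu .
```

Under this form, α > 0 gives a variance larger than the mean (overdispersion), and the Poisson case is recovered as α approaches 0.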
       
Quasi-Poisson, negative binomial (Poisson-gamma), generalized Poisson and generalized negative binomial regression models are used under overdispersion (Hilbe, 2007; Zeileis et al., 2007). When the dependent variable contains excess zeros, zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated generalized Poisson (ZIGP), zero-inflated generalized negative binomial (ZIGNB), Poisson hurdle (PH) and negative binomial hurdle (NBH) models are used (Greene, 1994; Famoye and Karan, 2006; Zhuo et al., 2008; Yeşilova et al., 2010).
       
Garay et al. (2011) reported that such models are used for count data in many disciplines, such as medicine (Böhning et al., 1999), public health (Zhou and Tu, 2000), environmental science (Agarwal et al., 2002), agriculture (Hall, 2000) and manufacturing (Lambert, 1992). Within agriculture, various biological materials have a zero-inflated and/or Poisson distribution structure (Nehara et al., 2013; Kokate et al., 2014; Yeşilova and İnanç, 2016).
       
Stillbirth and calving difficulty are economically important traits for the profitability of cattle enterprises. Therefore, many studies have been conducted on stillbirth (Eriksson et al., 2004; Gevrekçi, 2006; Gevrekçi et al., 2006; Zaborski et al., 2014; Takma et al., 2016). Meyer et al. (2001) examined the effects of mortality, calving season, gestation period, calving year, gender, parity and dystocia on stillbirth rates of Holsteins in the USA and found all factors significant.
               
Stillbirth is an important record for sustainable agriculture and the economy. Therefore, the factors that affect it must be known and controlled. Stillbirth is a count variable that shows overdispersion and zero inflation.

In this study, six count models suited to overdispersed and zero-inflated data are introduced for modelling this variable. In addition, the effects of gender, parity and herd-year-season (hys) on stillbirth were investigated with these models.
The material for this study consists of Holstein calf birth records collected during 2003-2005 from 3980 herds belonging to members of the US National Association of Animal Breeders in Columbia and Missouri. The calves of 404460 Holstein cows with 15 different parities were scored as live (0) if they were alive, or as stillbirth (1) if they died within 48 h of parturition. The calving months were classified as May-September (1, summer) and October-April (2, winter), and the years were coded as 2003 (1), 2004 (2) and 2005 (3). These groups were combined into a single herd-year-season categorical variable. In addition, because observations were scarce for the sixth and subsequent lactations, those records were evaluated under the sixth lactation.
       
In this study, stillbirth count data were analyzed with models for overdispersion and zero inflation: Poisson regression (PR), negative binomial regression (NBR), ZIP, ZINB, PH and NBH. When the data are overdispersed, the NBR model is used with a log link function between the dependent variable and the vector of independent variables (Hilbe, 2007). Zero inflation (ZI) occurs when a data set contains more zeros than expected. The zero-inflated Poisson (ZIP) model can handle such excess zeros in the observed data.
       
The PH and NBH regression models are built in two stages. The first stage is known as the transition stage and follows a binomial distribution. At this stage, a logit model is used to model non-zero counts versus zero counts. The second stage is known as the event stage, where PH and NBH use truncated Poisson and truncated negative binomial distributions, respectively, to model only observations greater than zero (Hilbe, 2007). In all regression models, parameter estimates were obtained by the maximum likelihood (ML) method; the log-likelihood equations of the count models are given in Table 1.
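The two-stage structure of a hurdle model can be illustrated with a short Python sketch of the Poisson-logit hurdle probability function. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the linear predictor value of -3.0 and the rate 0.5 are hypothetical, and the Poisson (rather than negative binomial) event stage is chosen only to keep the example short.

```python
import math


def logistic(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def poisson_pmf(k, mu):
    """Ordinary Poisson probability of observing the count k."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def hurdle_poisson_pmf(k, p_positive, mu):
    """P(Y = k) under a Poisson-logit hurdle model.

    p_positive: probability of crossing the hurdle (Y > 0), from the logit stage.
    mu: rate parameter of the zero-truncated Poisson used in the event stage.
    """
    if k == 0:
        return 1.0 - p_positive
    # Event stage: Poisson probability rescaled to exclude the zero outcome.
    truncated = poisson_pmf(k, mu) / (1.0 - math.exp(-mu))
    return p_positive * truncated


# Hypothetical parameter values, for illustration only.
p_positive = logistic(-3.0)  # stage 1: non-zero versus zero counts
mu = 0.5                     # stage 2: zero-truncated Poisson rate
total = sum(hurdle_poisson_pmf(k, p_positive, mu) for k in range(40))
print(round(total, 6))
```

Because the zero probability comes only from the first stage and the positive counts only from the truncated distribution, the two parts can be estimated separately, and the probabilities still sum to one.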
 

Table 1: Log likelihood of count models.


       
The log-likelihood (LL), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) statistics were used to compare the count models (Table 2).
 

Table 2: The formulas of comparison criteria.


       
In model selection, the model with the largest LL value and the smallest AIC and BIC values is taken as the best model. Rootograms are another tool used in model comparison. All statistical analyses were performed using STATA 14 and R 4.0.4.
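The selection rule can be sketched in Python using the standard definitions AIC = -2LL + 2k and BIC = -2LL + k ln(n), where k is the number of estimated parameters and n the number of observations. These textbook forms are assumed to correspond to the criteria named in the text; the model names, log-likelihood values and sample size below are hypothetical and are not results from the study.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion (standard form)."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion (standard form)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Hypothetical fits: model name -> (log-likelihood, number of parameters).
fits = {
    "model_a": (-1250.0, 4),
    "model_b": (-1100.0, 5),
    "model_c": (-1098.5, 9),
}
n_obs = 1000  # hypothetical sample size

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in fits.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(best_by_aic, best_by_bic)
```

In this toy comparison model_c has the largest LL, but its extra parameters are penalised, so both criteria favour model_b. This is why LL alone is not a sufficient basis for choosing between models of different size.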
In this study, the stillbirth dependent variable took the value 0 (live birth) in 376295 observations (95.56%) and the value 1 (stillbirth) in 17476 observations (4.44%). The large share of zeros shows that the data set is suitable for zero-inflated models. The zero-inflation value is greater than 1.335, showing that zero inflation is effective.
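As a rough illustration of how excess zeros can be checked, the Python sketch below first recomputes the two shares from the counts quoted in the text, then compares the observed share of zeros in a small made-up sample with the share a Poisson distribution with the same mean would predict, exp(-mean). The toy counts and the function names are hypothetical, and this simple comparison is not the zero-inflation statistic reported in the study.

```python
import math


def zero_share(counts):
    """Fraction of observations equal to zero."""
    return sum(1 for c in counts if c == 0) / len(counts)


def poisson_zero_prob(mean_count):
    """Probability of a zero under a Poisson distribution with this mean."""
    return math.exp(-mean_count)


# Shares recomputed from the counts quoted in the text.
n_live, n_still = 376295, 17476
share_live = n_live / (n_live + n_still)
share_still = n_still / (n_live + n_still)
print(round(share_live * 100, 2), round(share_still * 100, 2))

# Hypothetical toy counts (not the study's data) with many zeros.
toy_counts = [0] * 90 + [1] * 4 + [2] * 3 + [5] * 3
toy_mean = sum(toy_counts) / len(toy_counts)
observed = zero_share(toy_counts)
expected = poisson_zero_prob(toy_mean)
print(observed, round(expected, 4))
```

When the observed share of zeros clearly exceeds the Poisson expectation, as in the toy sample, zero-inflated or hurdle models become candidates.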
       
The frequency and percentage values of the subgroups of the gender and parity independent variables are summarized in Table 3. The hys independent variable is not given in the table due to the large number of subgroups (n=16321).
 

Table 3: The frequencies and percentages of gender and parity.


       
The distribution of the stillbirth count data is shown in Fig 1.
 

Fig 1: Frequency distribution plot of stillbirth data.


       
Fig 1 shows that the stillbirth data have a positively skewed distribution. The fit criteria used to select the model that best fits the data set are given in Table 4. Comparing all models, the NBH model fitted the data better than the other models, with the largest LL value and the smallest AIC and BIC values. The Poisson regression model showed the poorest fit, with the smallest LL and the largest AIC and BIC values. The mean of the dependent variable (0.715) was reported to be greater than its variance (0.328). The Pearson χ² dispersion statistic for the Poisson regression was 3.767. Since this value is greater than 1, the dependent variable is overdispersed.
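A Pearson-based dispersion check of this kind is commonly computed as the Pearson χ² divided by the residual degrees of freedom. The Python sketch below shows that calculation for an intercept-only Poisson fit; it assumes this is the form of the statistic meant in the text, and the toy counts and function name are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    observed: observed counts
    fitted: fitted Poisson means, one per observation
    n_params: number of estimated regression parameters
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    df = len(observed) - n_params
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / df


# Hypothetical toy counts (not the study's data).
toy_counts = [0, 0, 0, 0, 0, 0, 0, 1, 4, 5]
# Intercept-only Poisson model: every fitted mean equals the sample mean.
mu_hat = sum(toy_counts) / len(toy_counts)
stat = pearson_dispersion(toy_counts, [mu_hat] * len(toy_counts), n_params=1)
print(round(stat, 3))
```

A value near 1 is consistent with the Poisson assumption, while a value well above 1, as in this toy sample, points to overdispersion.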
 

Table 4: Model selection criteria.


 
Rootogram diagrams were used to visually compare the models used in the study (Fig 2).
 

Fig 2: Rootogram diagrams of models.


       
The bars in the rootograms show the difference between the square roots of the estimated and observed stillbirth numbers. The rootograms of the models used in this study support the comparison results obtained with the fit criteria (Fig 2).
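The quantity plotted in a rootogram can be sketched in Python as follows. For each count value, the example computes the square root of the frequency expected under a fitted model minus the square root of the observed frequency. A plain Poisson model is used as the fitted model, and the toy counts and the function name are hypothetical; the study's own rootograms were drawn for all six models.

```python
import math
from collections import Counter


def rootogram_gaps(counts, mu, max_count):
    """sqrt(expected frequency) - sqrt(observed frequency) for each count value.

    Expected frequencies come from a Poisson distribution with mean mu.
    """
    n = len(counts)
    observed = Counter(counts)
    gaps = {}
    for k in range(max_count + 1):
        expected = n * math.exp(-mu) * mu ** k / math.factorial(k)
        gaps[k] = math.sqrt(expected) - math.sqrt(observed[k])
    return gaps


# Hypothetical toy counts (not the study's data) with many zeros.
toy_counts = [0] * 90 + [1] * 4 + [2] * 3 + [5] * 3
mu_hat = sum(toy_counts) / len(toy_counts)
gaps = rootogram_gaps(toy_counts, mu_hat, max_count=5)
for k, gap in gaps.items():
    print(k, round(gap, 3))
```

A negative gap means the model predicts too few observations at that count (here at zero), and a positive gap means it predicts too many. A well-fitting model keeps all gaps close to zero.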
       
The parameter estimates of the effects in the NBH model, which was accepted as the best model in the study, are given in Table 5. All independent variables were statistically significant. In the logit part, the incidence rate ratio values [Exp (b)] indicate that the change between genders caused a 13.33% decrease with respect to not observing stillbirth in calves. The corresponding ratios for parity and hys were 1.030 and 1.00004, respectively. In the truncated negative binomial (log) part, the incidence of stillbirth was 25.9% lower in female calves than in male calves. The incidence of stillbirth decreased by 53.8% as parity increased. As hys changed, stillbirth incidence changed by a factor of 1.00001.
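The percentages quoted above follow from exponentiated coefficients through the relation percent change = (Exp(b) - 1) × 100. The Python sketch below shows the conversion. The ratio 0.741 is not quoted in the text; it is the value implied by the reported 25.9% reduction and is used only for illustration, as are the function names.

```python
import math


def percent_change_from_ratio(ratio):
    """Percent change implied by an exponentiated coefficient Exp(b)."""
    return (ratio - 1.0) * 100.0


def percent_change_from_coef(coef):
    """Percent change implied by a coefficient b on the log (or logit) scale."""
    return percent_change_from_ratio(math.exp(coef))


# 0.741 is the ratio implied by the reported 25.9% reduction (assumption).
print(round(percent_change_from_ratio(0.741), 1))
# 1.030 is the ratio quoted in the text for parity in the logit part.
print(round(percent_change_from_ratio(1.030), 1))
# Same result when starting from the coefficient rather than the ratio.
print(round(percent_change_from_coef(math.log(0.741)), 1))
```

Ratios below 1 therefore read as percentage decreases and ratios above 1 as percentage increases. In the logit part the exponentiated coefficients act on odds, while in the count part they act on the expected count.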
 

Table 5: Parameter estimates of NBH regression model.


 
However, the overall gender, parity and hys effects in Table 5 do not give detailed information on the levels of each categorical independent variable. Therefore, male calves were taken as the reference category for gender and the first lactation as the reference category for parity.
       
In Table 6, the coefficients, significance values and incidence rate ratios [Exp (b)] of the independent variables on stillbirth are given relative to the reference categories. In the logit part, the value for females was 14.7% higher than for males with respect to stillbirth not being observed. When the effect of parity was examined, stillbirths decreased as parity increased. Compared to parity 1, stillbirth decreased by 4.9% in parity 2, 4.1% in parity 3, 2.6% in parity 4, 1.3% in parity 5 and 0.3% in parity 6. In the log part, stillbirths were 26.6% lower in female calves than in males. Compared to parity 1, stillbirths decreased by 74.3% in parity 2, 84.9% in parity 3, 89.5% in parity 4, 93.7% in parity 5 and 95.2% in parity 6. Because of the large number of subgroups of the hys independent variable, they are not given in Table 6.
 

Table 6: Parameter estimates of reference categories in NBH regression model.


       
The six different count models (PR, NBR, ZIP, ZINB, PH and NBH) were compared for analyzing stillbirth count data. NBH was chosen as the best model among these models.
       
In this study, the rate of stillbirth was higher in the first lactation than in later lactations. This result is consistent with the literature (Meyer et al., 2001; Gevrekçi, 2006). Some researchers (Berglund and Philipsson, 1992; Steinbock et al., 1997) stated that stillbirth is less associated with excess birth weight and calving difficulty, while others (Berglund, 1996; Philipsson et al., 1998) reported that the rate of stillbirth is higher at first calving. Takma et al. (2016) used logistic regression to model the effects of gender, parity and herd-season on the occurrence of stillbirth in Holstein Friesian calves. They reported that the stillbirth rate decreased as parity increased and that stillbirth rates were higher in the summer period than in the winter period. In the present study, the effects of gender, parity and hys as independent variables were significant in the NBH model. The stillbirth count data depended on the gender of Holstein Friesian calves. In addition, stillbirth decreased as parity increased.
Consequently, parity, gender of the calf, gestation length, the genetic potential of the calf and management and feeding factors affect stillbirth, an economically important trait in dairy farming that is also influenced by environmental conditions. It is important to know the categorical factors and the size of their effects in order to control stillbirth in calves. Therefore, the number of stillbirths and the factors affecting it should be estimated accurately with appropriate models. Improvement programs that use reliable estimates to reduce stillbirths in dairy cattle can help prevent economic losses and increase the profitability of enterprises.
The authors wish to thank Dr. Daniel Gianola, Dr. Kent Weigel and US National Association of Animal Breeders for providing data. TUBITAK (The Scientific and Technological Research Council of Turkey) and Ege University Research Institute are gratefully acknowledged for financial support.

  1. Agarwal, D.K., Gelfand, A.E., Pousty, S.C. (2002). Zero-inflated models with application to spatial count data. Environmental and Ecological Statistics. 9: 341-355. DOI: 10.1023/A:1020910605990.

  2. Berglund, B. and Philipsson, J. (1992). Increasing Stillbirth Rate in the Swedish Friesian Population. Proceedings of the 43rd Annual Meeting EAAP. Madrid, Spain.

  3. Berglund, B. (1996). Ongoing Research on the Causes of Variation in Calving Performance and Stillbirths in Swedish Dairy Cattle. Interbull Bulletin. 12: 78-83. 

  4. Böhning, D., Dietz, E., Schlattmann, P. (1999). The zero inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology. Journal of the Royal Statistical Society A. 162: 195-209. DOI: 10.1111/1467-985X.00130.

  5. Cox, D.R. (1983). Some remarks on overdispersion. Biometrika. 70(1): 269-274. DOI: 10.2307/2335966.

  6. Dalrymple, M.L., Hudson, I.L., Ford, R.P.K. (2003). Finite mixture, zero-inflated poisson and hurdle models with application to SIDS. Computational Statistics and Data Analysis. 41: 491-504. DOI: 10.1016/S0167-9473(02)00187-1.

  7. Eriksson, S., Näsholm, A., Johansson, K., Philipsson, J. (2004). Genetic parameters for calving difficulty, stillbirth and birth weight for Hereford and Charolais at first and later parities. Journal of Animal Science. 82: 375-383. DOI: 10.2527/2004.822375x.

  8. Famoye, F. and Karan, P.S. (2006). Zero-inflated generalized Poisson regression model with an application to domestic violence data. Journal of Data Science. 5(4): 117-130.

  9. Garay, A.M., Hashimoto, E.M., Ortega, E.M.M., Lachos, L.H. (2011). On estimation and influence diagnostics for zero-inflated negative binomial regression models. Computational Statistics and Data Analysis. 55(3): 1304-1318. DOI: 10.1016/j.csda.2010.09.019.

  10. Gevrekçi, Y., Chang, Y., Kızılkaya, K., Gianola, D., Weigel, K., Akbaş, Y. (2006). Bayesian inference for calving ease and stillbirth in Holsteins using a bivariate threshold sire-maternal grandsire model. Proceedings of 8th World Congress of Genetics Applied to Livestock Production. Belo Horizonte, Brazil.

  11. Gevrekçi, Y. (2006). Estimation of genetic parameters of calving ease and stillbirth as threshold traits by using Gibbs Sampling. PhD Thesis. Ege University, Turkey.

  12. Greene, W.H. (1994). Accounting for excess zeros and sample selection in Poisson and negative binomial regression models. Working Paper. Department of Economics, New York University. pp. 94-10.

  13. Hall, D.B. (2000). Zero inflated poisson and binomial regression with random effects: A case study. Biometrics. 56(4): 1030-1039.  

  14. Hilbe, J.M. (2007). Negative Binomial Regression. Cambridge University Press, Cambridge, UK. DOI: 10.1017/CBO9780511973420.

  15. Kokate, L.S., Singh, A., Banu, R., Gandhi, R.S., Chakravarty, A. K., Gupta, A.K., Sachdeva, G.K. (2014). Prediction of 305-day lactation milk yield based on bimonthly test day values in Karan Fries cattle. Indian Journal of Animal Research. 48(2):103-105. DOI: 10.5958/j.0976-0555.48.2.023.

  16. Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 34(1): 1-13. DOI: 10.1080/00401706.1992.10485228.

  17. Meyer, C.L., Berger, P.J., Koehler, K.J., Thompson, J.R., Sattler, C.G. (2001). Phenotypic trends in incidence of stillbirth for Holsteins in the United States. Journal of Dairy Science. 84: 515-523. DOI: 10.3168/jds.S0022-0302(01)74502-X.

  18. Nehara, M., Singh, A., Gandhi, R.S., Chakravarty, A.K., Gupta, A.K., Sachdeva, G.K. (2013). Phenotypic, genetic and environmental trends in milk yield and milk production efficiency traits in Karan Fries cattle. Indian Journal of Animal Research. 47: 402-06.

  19. Philipsson, J., Steinbock, L., Berglund, B. (1998). Considering stillbirths in the breeding program. Interbull Bulletin. 18: 25-27.

  20. Pittman, B., Buta, E., Krishnan-Sarin, S., O’Malley, S.S., Liss, T., Gueorguieva, R. (2018). Models for analyzing zero-inflated and overdispersed count data: an application to cigarette and marijuana use. Nicotine and Tobacco Research. 22(8): 1390-1398. DOI: 10.1093/ntr/nty072.

  21. Steinbock, L., Gates, P., Berglund, B., Philipsson, J. (1997). Direct and maternal genetic effects on stillbirths at different parities in Swedish Holsteins. Proceedings of 48th Annual Mtg. EAAP. Vienna, Austria. 

  22. Takma, Ç., İşçi Güneri, Ö., Gevrekçi, Y. (2016). Investigation of stillbirth rate using Logistic Regression Analysis in Holstein Friesian calves. Journal of Agriculture Faculty of Ege University. 53: 245-250. DOI: 10.20289/zfdergi.389278.

  23. Yeşilova, A., Kaydan, B., Kaya, Y. (2010). Modelling insect-egg data with excess zeros using zero-inflated regression models. Hacettepe Journal of Mathematics and Statistics. 39(2): 273-282. 

  24. Yeşilova, A. and İnanç, D.E. (2016). Modeling mite counts using Poisson and negative binomial regressions. Fresenius Environmental Bulletin. 25: 5062-5066. DOI: 10.18805/ijare.v50i6.6674.

  25. Zaborski, D., Grzesiak, W., Kotarska, K., Szatkowska, I., Jedrzejczak, M. (2014). Detection of difficult calvings in dairy cows using boosted classification trees. Indian Journal of Animal Research. 48: 452-458.

  26. Zeileis, A., Kleiber, C., Jackman, S. (2007). Regression models for count data in R. Journal of Statistical Software. 27: 1-25. DOI: 10.18637/jss.v027.i08.

  27. Zhou, X. and Tu, W. (2000). Confidence intervals for the mean of diagnostic test charge data containing zeros. Biometrics. 56(4): 1118-1125. 

  28. Zhuo, L., Stacey, K., Lawrence, J.C., Lisa, K.H., Richard, H.L.M.O. (2008). Modeling motor vehicle crashes for street racers using zero-inflated models. Accident Analysis and Prevention. 40: 835-839. DOI: 10.1016/j.aap.2007.09.022.
