Descriptive statistics
 
Descriptive statistics for the milk production traits of Holstein Friesian cows are presented in Table 1. Milk yield per 30 days had the highest mean (997.92 L). Fat percentage had the lowest mean (1.59%) and the highest coefficient of variation (32.33%), whereas solid-not-fat had the lowest coefficient of variation (5.55%).
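The coefficient of variation is conventionally the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation was used, which the text does not state, and the input values are made up for illustration; they are not data from this study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m * 100


# Hypothetical fat percentages for illustration only (not study data).
fat_pct = [1.0, 2.0, 3.0]
cv = coefficient_of_variation(fat_pct)
print(f"CV = {cv:.2f}%")
```

A trait with a large CV, such as fat percentage here, varies widely relative to its mean, whereas a small CV, as for solid-not-fat, indicates a comparatively uniform trait.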
 
Correlation matrix
 
Phenotypic correlations between milk constituents and milk yield are shown in Table 2. Milk yield per day (MYD) had a highly significant negative correlation (p<0.01) with protein percentage (PP) and solid-not-fat (SNF). Milk yield per 30 days (MY30D) had a highly significant negative correlation (p<0.01) with fat percentage (FP) and PP. There was no significant association (p>0.05) between MY30D and lactose percentage (LP). MY30D had a highly significant positive correlation (p<0.01) with SNF.
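The text does not state which correlation coefficient underlies Table 2. As a minimal sketch, assuming Pearson's product-moment correlation, the coefficient between a yield trait and a milk constituent could be computed as follows. The function name and the input values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical daily milk yields (L) and protein percentages, illustration only.
milk_yield_day = [1.0, 2.0, 3.0]
protein_pct = [6.0, 4.0, 2.0]
r = pearson_r(milk_yield_day, protein_pct)
```

A negative coefficient means the constituent tends to fall as yield rises. Whether a given coefficient is significant depends on the sample size, which this sketch does not test.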
 
Amplified nucleotide sequence analysis
 
Fig 1 shows the amplified PCR products of the β-LG gene in the Holstein Friesian cows used in this study. Amplification generated an amplicon of 447 bp.
 
Sequence analysis and alignment on 5174T>C of exon 4
 
The sequence analysis and alignment for SNP 5174T>C in exon 4 are shown in Fig 2. Intron 3, exon 4 and intron 5 of the β-LG gene were sequenced; two SNPs were detected in exon 4, three in intron 3 and none in intron 5 (Fig 2A). A thymine (T) to cytosine (C) transition was detected at position 5174 of exon 4 relative to the β-LG reference sequence (accession number: X14710.1). BLAST was used for the pairwise DNA alignment (Fig 2B), which placed the SNP at 5174T>C (red line). BLAST was also used for the protein sequence alignment (Fig 2C), which showed that the SNP is nonsynonymous (red box): isoleucine (I) is replaced by valine (V) at position 882 relative to the β-LG reference (acc. no. NP_776354.2).
 
Sequence analysis and alignments on 5123C>G of intron 3
 
The sequence analysis and alignment for SNP 5123C>G in intron 3 are shown in Fig 3. A cytosine (C) to guanine (G) transversion was detected at position 5123 of intron 3 relative to the β-LG reference sequence (accession number: X14710.1) (Fig 3A). BLAST was used for the pairwise DNA alignment (Fig 3B), which placed the SNP at 5123C>G (red line).

Sequence analysis and alignment on 4982G>A of intron 3
 
The sequence analysis and alignment for SNP 4982G>A in intron 3 are shown in Fig 4. A guanine (G) to adenine (A) transition was detected at position 4982 of intron 3 relative to the β-LG reference sequence (accession number: X14710.1) (Fig 4A). BLAST was used for the pairwise DNA alignment (Fig 4B), which placed the SNP at 4982G>A (red line).
 
Sequence analysis and alignments on 5099T>C of intron 3
 
The sequence analysis and alignment for SNP 5099T>C in intron 3 are shown in Fig 5. A thymine (T) to cytosine (C) transition was detected at position 5099 of intron 3 relative to the β-LG reference sequence (accession number: X14710.1) (Fig 5A). BLAST was used for the pairwise DNA alignment (Fig 5B), which placed the SNP at 5099T>C (red line).

Sequence analysis and alignments on 5251C>T of exon 4
 
The sequence analysis and alignment for SNP 5251C>T in exon 4 are shown in Fig 6. A cytosine (C) to thymine (T) transition was detected at position 5251 of exon 4 relative to the β-LG reference sequence (accession number: X14710.1) (Fig 6A). The sequence alignment placed the SNP at 5251C>T (red line) (Fig 6B). The protein sequence alignment, determined with BLAST, showed that the SNP is nonsynonymous (red box) (Fig 6C): glycine (G) is replaced by aspartic acid (D) at position 852 relative to the β-LG reference (acc. no. NP_776354.2).
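A substitution between two purines (A, G) or between two pyrimidines (C, T) is a transition; a change between a purine and a pyrimidine is a transversion. The short sketch below applies that rule to the five SNP labels reported in this section. The helper names are illustrative only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def parse_snp(label):
    """Split a label such as '5174T>C' into (position, ref, alt)."""
    ref_part, alt = label.split(">")
    return int(ref_part[:-1]), ref_part[-1], alt


snps = ["5174T>C", "5123C>G", "4982G>A", "5099T>C", "5251C>T"]
classes = {s: classify_substitution(*parse_snp(s)[1:]) for s in snps}
for label, kind in classes.items():
    print(label, kind)
```

By this rule, 5123C>G is the only transversion among the five SNPs; the other four are transitions.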
 
Genotypic and allelic frequencies
 
The allelic and genotypic frequencies at the β-LG gene locus in the Holstein Friesian population are shown in Table 3. Two alleles and two genotypes (one homozygous and one heterozygous) were observed for each SNP. For SNPs 5174T>C, 5123C>G, 4982G>A, 5099T>C and 5251C>T, the frequencies of alleles T, C, G, T and C were higher than those of alleles C, G, A, C and T, respectively. The genotypic frequencies of CT, CC, AG, CT and TC were higher than those of TT, GC, GG, TT and CC for SNPs 5174T>C, 5123C>G, 4982G>A, 5099T>C and 5251C>T, respectively. The chi-square (χ²) test for 5123C>G showed that the genotypic and allelic frequencies did not differ significantly from Hardy-Weinberg expectations (χ² = 1.23), indicating that the genotypic and allelic frequencies at this SNP remain constant from generation to generation. In contrast, SNPs 5174T>C, 4982G>A, 5099T>C and 5251C>T deviated significantly from Hardy-Weinberg equilibrium (P<0.05), indicating that the genotypic and allelic frequencies at these SNPs change from generation to generation.
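A Hardy-Weinberg test of this kind compares observed genotype counts with the counts expected from the estimated allele frequencies. The sketch below is a minimal illustration for a biallelic SNP, assuming a standard Pearson chi-square with one degree of freedom (critical value 3.841 at the 0.05 level). The genotype counts are hypothetical and are not the counts behind Table 3.

```python
CHI2_CRITICAL_1DF_005 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (p, q, chi2) for a biallelic locus from observed genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (a monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, q, chi2


# Hypothetical counts: one homozygote class and the heterozygote class only,
# mirroring the two-genotype pattern described in the text.
p, q, chi2 = hwe_chi_square(50, 50, 0)
deviates = chi2 > CHI2_CRITICAL_1DF_005
print(f"p={p:.2f} q={q:.2f} chi2={chi2:.2f} deviates_from_HWE={deviates}")
```

Under this convention, the reported χ² = 1.23 for 5123C>G lies below 3.841, which is consistent with no significant departure from Hardy-Weinberg expectations.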
 
Polymorphism information analysis
 
The genetic diversity and polymorphism information for the population are shown in Table 4. Homozygosity at the β-LG gene was higher than heterozygosity for SNPs 5174T>C, 5123C>G, 4982G>A, 5099T>C and 5251C>T, with effective allele numbers (Ne) of 1.92, 1.22, 1.92, 1.92 and 1.72, respectively. The polymorphism information content (PIC) indicated high polymorphism within the Holstein Friesian population for SNPs 5174T>C, 4982G>A, 5099T>C and 5251C>T, but moderate polymorphism for SNP 5123C>G.
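The diversity indices named above are commonly derived from allele frequencies alone. The sketch below assumes the standard definitions: expected homozygosity as the sum of squared allele frequencies, heterozygosity as its complement, Ne as the reciprocal of homozygosity, and PIC as defined by Botstein et al. Whether Table 4 used exactly these formulas is an assumption, and the allele frequencies shown are hypothetical.

```python
from itertools import combinations


def diversity_indices(allele_freqs):
    """Return expected homozygosity, heterozygosity, Ne and PIC for one locus."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(f * f for f in allele_freqs)
    heterozygosity = 1 - homozygosity
    ne = 1 / homozygosity  # effective number of alleles
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    cross_term = sum(2 * (a * a) * (b * b)
                     for a, b in combinations(allele_freqs, 2))
    pic = 1 - homozygosity - cross_term
    return {"Ho": homozygosity, "He": heterozygosity, "Ne": ne, "PIC": pic}


# Hypothetical biallelic SNP with equal allele frequencies (illustration only).
indices = diversity_indices([0.5, 0.5])
print(indices)
```

For a biallelic SNP, Ne can range from 1 (one allele fixed) to 2 (both alleles equally frequent), so values such as 1.92 indicate nearly balanced alleles and 1.22 a strongly skewed pair.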
 
Association analysis of β-LG gene with milk production traits
 
Marker-trait associations are presented in Table 5. For SNPs 5174T>C, 4982G>A and 5099T>C, the genotype pairs (TT and CT), (GG and AG) and (TT and CT), respectively, did not differ significantly for MY30D, PP and SNF (p>0.05). However, they differed significantly for MYD, FP and LP (p<0.05): genotypes TT, GG and TT performed better than CT, AG and CT, respectively, for MYD, whereas CT, AG and CT performed better than TT, GG and TT, respectively, for FP. For SNP 5123C>G, genotypes CC and CG differed significantly for MY30D, FP, SNF and LP (p<0.05): genotype CC performed better for MY30D and SNF, while genotype CG performed better for FP and LP. The two genotypes did not differ significantly for MYD and PP (p>0.05). For SNP 5251C>T, genotypes CC and CT differed significantly for FP, SNF and LP (p<0.05): genotype CC performed better than CT for FP, whereas CT performed better than CC for SNF and LP. Genotypes CC and CT did not differ significantly for MYD, MY30D and PP (p>0.05).
       
Understanding the relationships between traits is vital for improving the quantity and quality of milk produced by dairy animals, because it makes it possible to anticipate how one trait will change in response to selection for another (El-Moghazy et al., 2015). In the present study, milk yield per day was negatively related to protein % and SNF. Milk yield per 30 days increased as fat % and protein % decreased and as SNF increased. Milk yield was not related to lactose %, and milk yield per day was not related to fat %. These findings agree with El-Moghazy et al. (2015), who found that SNF was positively correlated with milk yield in Egyptian buffaloes, but disagree with the same study's finding that fat, protein and lactose were positively correlated with milk yield. They also disagree with Alphonsus and Essien (2012), who reported that SNF, fat and protein were not significantly correlated with total milk yield in Friesian × Bunaji and Bunaji cows in Nigeria. They agree with Yoon et al. (2004), who reported that milk yield was negatively associated with protein and fat in Holstein cows in Korea. The differences between this study and the others might be due to differences in species, breed and environment. The results of this study imply that milk yield per day increases as protein % and SNF decrease, whereas fat % and lactose % do not affect milk yield per day. Milk yield per 30 days increases as SNF increases and decreases as fat % and protein % increase, and it is not affected by lactose %.
       
The current study revealed two novel nonsynonymous SNPs, 5174T>C and 5251C>T, and three other novel SNPs, 5123C>G, 4982G>A and 5099T>C. Alim et al. (2015), who investigated DNA polymorphisms in the β-LG gene associated with milk production traits in Holstein dairy cattle in China, discovered a SNP (1810C>T) in exon 3 of the β-LG gene. Mancini et al. (2013) found a SNP (C>A) at position 968, an upstream gene variant of the β-LG gene, in Italian Brown cattle in Italy. Yang et al. (2012) investigated polymorphism in exon 4 of the β-LG variant B precursor gene and its relationship with milk production traits and protein formation in Chinese Holstein cattle and identified three nonsynonymous SNPs (5239C>A, 5240A>C and 5305C>T), that is, three SNPs that caused amino acid changes. The disagreement among studies might be due to differential gene expression, which affects animals' production traits. The results of this study suggest that SNPs 5174T>C and 5251C>T cause amino acid changes from isoleucine to valine and from glycine to aspartic acid, respectively, which affect the structure and function of the protein, meaning that the new protein will change the relationship between the genotypes and the traits. The population was in Hardy-Weinberg equilibrium (HWE) for SNP 5123C>G but not for SNPs 5174T>C, 4982G>A, 5099T>C and 5251C>T. Alim et al. (2015) reported that the chi-square test for SNP 1810C>T showed all genotypic frequencies in their population to be in HWE, indicating that allele frequencies stayed the same across generations. Yang et al. (2012) reported that, according to the chi-square test, their three SNPs (5240A>C, 5239C>A and 5305C>T) were not in HWE. In the present study, HWE for SNP 5123C>G implies that the allelic and genotypic frequencies at this SNP of the β-LG gene in Holstein Friesian cows do not change from generation to generation, whereas the departure from HWE for SNPs 5174T>C, 4982G>A, 5099T>C and 5251C>T implies that the genotypic and allelic frequencies at these SNPs change from generation to generation.
       
Marker-trait association results for SNPs 5174T>C and 5099T>C showed no statistically significant association of genotypes TT and CT with milk yield per 30 days, protein % and SNF. For SNP 4982G>A, genotypes GG and AG had no association with milk yield per 30 days, protein % and SNF. Genotypes CC and CG of SNP 5123C>G had no association with milk yield per day and protein %. For SNP 5251C>T, genotypes CC and CT had no association with protein % and milk yield. Zaglool et al. (2016) investigated the relationship of β-LG gene polymorphism with fat, protein and milk yield in Holstein Friesian cattle in Egypt; they found three genotypes (AA, AB and BB) and reported that AA had higher protein % and milk yield, while BB had higher fat %. Those results are not in line with the present study. The results of this study for SNP 5123C>G on milk yield and SNP 5251C>T on fat % agree with Hristov et al. (2011), who found two genotypes (AA and AB) of the β-LG gene in Bulgarian Black Pied cattle and reported that the BB genotype had the greatest effect on milk yield and fat %. For SNPs 5174T>C, 4982G>A and 5099T>C on SNF and SNP 5251C>T on milk yield, the current study agrees with Tolenkhomba et al. (2014), who found two genotypes (AB and BB) that had no significant effect on milk yield and SNF in Sahiwal cattle in India. The results of the current study contradict those of Dogru (2015), who investigated β-lactoglobulin genetic variation in Brown-Swiss dairy cows in Turkey and its relationship with quality traits and milk yield, and found no significant association between the genotypes (AA, AB and BB) of the β-LG gene and milk production constituents. The differences from the current study might be due to the different environmental conditions and breeds used.
       
The TT genotype of SNPs 5174T>C and 5099T>C of the β-LG gene might be used as a genetic marker for enhancing milk yield per day and lactose %, whereas genotype CT might be used to improve fat %. Genotype CC of SNP 5123C>G might be used to increase milk yield per 30 days and SNF, while CG might be used to improve lactose % and fat %. Genotype CT of SNP 5251C>T might be used as a genetic marker to increase SNF and lactose %, whereas CC might be used to improve fat %. The GG genotype of SNP 4982G>A of the β-LG gene might be used as a genetic marker for enhancing milk yield per day and lactose %, whereas genotype AG might be used to improve fat %.