Pipeline to Investigate Genomic Erosion Indices for the Conservation of Agrobiodiversity

Anupama Roy1,2
Lashika Meena1
Nilesh Joshi3
Mir Asif Iquebal1
Dinesh Kumar1
Sarika Jaiswal1,*
1Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-110 012, India.
2The Graduate School, ICAR-IARI, New Delhi-110 012, India.
3Pulse Chickpea Laboratory, Division of Genetics, ICAR-IARI, New Delhi-110 012, India.
  • Submitted: 26-10-2024
  • Accepted: 14-11-2025
  • First Online: 09-12-2025
  • doi: 10.18805/BKAP811

With the disappearance of harvested species, varieties and breeds, a wide range of unharvested species is also disappearing. The scale of loss is extensive: habitat loss, changes in habitat configuration, overgrazing and over-exploitation of species over the past centuries have led to 'genomic erosion', a process characterized by reduced genetic diversity, increased inbreeding and accumulation of harmful mutations in the population. Still, genomic erosion estimates of modern-day populations lack concordance with the declining population sizes and conservation status of agrobiodiversity. This work presents a newly designed, highly flexible and scalable pipeline that compares patterns of genomic erosion across samples from different periods. It uses state-of-the-art bioinformatics tools and techniques to process whole-genome re-sequencing data from historical and modern samples and produces estimates of several genomic erosion indices that can be compared directly. On this basis, extinction risks for different varieties and breeds can be identified and conservation strategies for agricultural biodiversity can be framed.
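As a rough illustration of what comparing an erosion index between periods can look like, the sketch below computes per-sample heterozygosity (one simple proxy for genetic diversity) and the change in its mean between historical and modern samples. It is not the pipeline described here: the function names `heterozygosity` and `erosion_delta`, the genotype encoding as allele-pair tuples, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def heterozygosity(genotypes):
    """Fraction of called diploid genotypes that are heterozygous.

    `genotypes` is a list of (allele_a, allele_b) tuples; None marks a
    missing call (hypothetical encoding chosen for this sketch).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(a != b for a, b in called) / len(called)


def erosion_delta(historical, modern):
    """Mean heterozygosity of modern samples minus that of historical ones.

    A negative value indicates that diversity has been lost over time.
    """
    h_hist = mean(heterozygosity(sample) for sample in historical)
    h_mod = mean(heterozygosity(sample) for sample in modern)
    return h_mod - h_hist


# Toy data (invented): two historical and two modern samples, four sites each.
historical = [
    [(0, 1), (0, 1), (0, 0), (1, 1)],
    [(0, 1), (0, 0), (0, 1), None],
]
modern = [
    [(0, 0), (0, 1), (0, 0), (1, 1)],
    [(0, 0), (0, 0), (1, 1), (1, 1)],
]

delta = erosion_delta(historical, modern)
print(f"Change in mean heterozygosity: {delta:.3f}")
```

In practice the genotypes would come from variant calls on the aligned re-sequencing data, and inbreeding and mutational load would need their own indices; this sketch covers only the diversity component.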



Published In
Bhartiya Krishi Anusandhan Patrika
