Bioinformatics analysis was employed to identify drought-resilient genes, focusing on a selected group of ten gene symbols in Oryza sativa L. The findings summarised in Table 1 highlight several rice genes critically associated with tolerance to water deficit. The table lists ten genes reported to play significant roles in mediating the plant’s adaptive responses to drought-induced stress. The identified gene symbols are LOC4323843, LOC4337576, LOC4343219, LOC4347708, LOC4346460, LOC4345942, LOC4327045, LOC4325652, LOC4334350 and LOC4326935. All identified genes were classified as protein-coding and carried a Reference Sequence (RefSeq) status of validated or model, which supports the accuracy of their reference sequence information.
Each gene occupies a unique chromosomal position and is annotated with an individual locus tag: OSNPB_010971100 for LOC4323843, OSNPB_050107900 for LOC4337576, OSNPB_070476900 for LOC4343219, OSNPB_090537700 for LOC4347708, OSNPB_090133600 for LOC4346460, OSNPB_080499800 for LOC4345942, OSNPB_010785700 for LOC4327045, OSNPB_010184900 for LOC4325652, OSNPB_030786400 for LOC4334350 and OSNPB_010702500 for LOC4326935. All analysed genes originate from the Oryza sativa Japonica group and each is associated with several alternative designations recorded across scientific references and databases, such as DI19-5, OsDi19-5, OsJ_004810 and P0518C01.8 for LOC4323843; DI1, DI19-6, OsDi19-6 and OsJ_016081 for LOC4337576; cdsp32, OsCDSP32 and OsJ_24223 for LOC4343219; PAP2 for LOC4346460; OJ1118_A06.9 for LOC4345942; R1G1A, bip104, SSRP1-A and OsJ_00664 for LOC4325652; and rDRP1 and rab25 for LOC4326935. These aliases reflect the variation in nomenclature used for drought-tolerance genes across resources (Table 1).
The genomic context analysis of the chromosomes carrying the ten genes (listed in gene order: Chromosome 1 - NC_089035.1, Chromosome 5 - NC_089039.1, Chromosome 7 - NC_089041.1, Chromosome 9 - NC_089043.1, Chromosome 9 - NC_089043.1, Chromosome 8 - NC_089042.1, Chromosome 1 - NC_089035.1, Chromosome 1 - NC_089035.1, Chromosome 3 - NC_089037.1 and Chromosome 1 - NC_089035.1) indicated a high density of drought-responsive loci distributed along proximal as well as distal regions. Multiple candidate genes were identified, notably including transcription factor family members, which show strong associations with tolerance to abiotic stresses (Fig 1A-1J).
The observed clustering of stress-responsive genes within syntenic chromosomal regions indicates the presence of potential regulatory hubs supporting adaptive functions. Furthermore, conserved promoter motifs and cis-regulatory sequences were identified upstream of these candidate loci, underscoring their likely involvement in transcriptional control during water-deficit stress. Collectively, chromosome 1, which carries four of the ten candidate genes, emerges as a pivotal genomic region governing several drought-resilience pathways in rice.
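The chromosomal distribution described above can be tallied directly from the gene-to-chromosome assignments quoted in the text. The following is a minimal Python sketch, not part of the original analysis. The names `GENE_CHROMOSOME` and `genes_per_chromosome` are illustrative, and the pairing assumes that the chromosome accessions are listed in the same order as the ten gene symbols.

```python
from collections import Counter

# Gene symbol -> (chromosome number, RefSeq chromosome accession), as quoted in the text.
# Assumption: the chromosome accessions follow the same order as the gene symbols.
GENE_CHROMOSOME = {
    "LOC4323843": (1, "NC_089035.1"),
    "LOC4337576": (5, "NC_089039.1"),
    "LOC4343219": (7, "NC_089041.1"),
    "LOC4347708": (9, "NC_089043.1"),
    "LOC4346460": (9, "NC_089043.1"),
    "LOC4345942": (8, "NC_089042.1"),
    "LOC4327045": (1, "NC_089035.1"),
    "LOC4325652": (1, "NC_089035.1"),
    "LOC4334350": (3, "NC_089037.1"),
    "LOC4326935": (1, "NC_089035.1"),
}


def genes_per_chromosome(mapping):
    """Count how many of the candidate genes sit on each chromosome."""
    return Counter(chrom for chrom, _accession in mapping.values())


counts = genes_per_chromosome(GENE_CHROMOSOME)
for chrom, n in counts.most_common():
    print(f"Chromosome {chrom}: {n} gene(s)")
```

Under this pairing, chromosome 1 carries four of the ten genes and chromosome 9 carries two, with the remaining genes on chromosomes 3, 5, 7 and 8.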
Ten rice genes encoding proteins implicated in drought stress resistance were further characterised by mapping gene identifiers to corresponding symbols and compiling details regarding their gene sequences and protein products. Transcript sizes ranged from 994 nucleotides in LOC4323843 to 2529 nucleotides in LOC4325652. At the protein level, polypeptide lengths spanned from 202 amino acids (LOC4323843) to 641 amino acids (LOC4325652). These variations demonstrate extensive structural and functional diversity among drought-associated genes and their encoded proteins, with findings summarised in Table 2.
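As a simple consistency check on the reported extremes, the coding sequence needed to encode each protein can be compared with the transcript length. This is a hedged sketch only: it assumes that each reported transcript is the mRNA that encodes the reported protein and that the coding sequence ends in a single stop codon. The names `min_cds_length`, `reported` and `summary` are illustrative.

```python
STOP_CODON_NT = 3  # one stop codon terminates the coding sequence


def min_cds_length(protein_aa):
    """Nucleotides needed to encode a protein of the given length, plus a stop codon."""
    return protein_aa * 3 + STOP_CODON_NT


# Shortest and longest entries as reported in the text (Table 2).
reported = {
    "LOC4323843": {"transcript_nt": 994, "protein_aa": 202},
    "LOC4325652": {"transcript_nt": 2529, "protein_aa": 641},
}

summary = {}
for gene, values in reported.items():
    cds_nt = min_cds_length(values["protein_aa"])
    # Whatever is left over would be untranslated sequence under the stated assumptions.
    noncoding_nt = values["transcript_nt"] - cds_nt
    summary[gene] = (cds_nt, noncoding_nt)
    print(f"{gene}: CDS {cds_nt} nt, remaining transcript {noncoding_nt} nt")
```

In both cases the coding sequence fits within the transcript, leaving 385 and 603 nucleotides respectively that would correspond to untranslated regions.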
Genomic context analysis of the loci LOC4323843, LOC4337576, LOC4343219, LOC4347708, LOC4346460, LOC4345942, LOC4327045, LOC4325652, LOC4334350 and LOC4326935 revealed that they are positioned within drought-responsive regions and encode transcripts containing predicted regulatory and functional domains linked to stress adaptation. The constructed gene models display distinct exon–intron structures, generating transcripts that are translated into proteins carrying conserved motifs associated with abiotic stress signalling pathways. Functional annotation further indicates that these products may operate as transcriptional regulators or stress-responsive proteins, thereby contributing to cellular stability under conditions of water scarcity. Additionally, surrounding genomic regions harbour other stress-related genes, indicating potential co-regulation or coordinated expression patterns during drought exposure (Fig 2A-2J).
The examined gene data were cross-referenced with standard databases to retrieve complete mRNA and protein sequences, along with source sequence details and UniProt accession numbers. Each gene is represented by paired RefSeq identifiers for its mRNA transcript and its encoded protein; the protein accessions are NP_001359124.1, NP_001406959.1, P_001409084.1, P_015651317.1, NP_001390665.1, XP_015648389.1, XP_015622164.1, NP_001388399.1, XP_015628802.1 and NP_001393222.1. The presence of these paired records supports the reliability of the molecular characterisation of these genes within genomic repositories (Table 3).
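RefSeq accession prefixes encode the molecule type and the level of support: NM_ and NP_ denote known mRNA and protein records, whereas XM_ and XP_ denote model (computationally predicted) mRNA and protein records. The sketch below, which is illustrative and not the authors' procedure, classifies the accessions quoted above by prefix. The names `REFSEQ_PREFIXES` and `classify_accession` are assumptions introduced here. Accessions whose prefix is not recognised are reported as such rather than guessed.

```python
# Standard RefSeq prefixes relevant here: (molecule type, record status).
REFSEQ_PREFIXES = {
    "NM_": ("mRNA", "known"),
    "XM_": ("mRNA", "model"),
    "NP_": ("protein", "known"),
    "XP_": ("protein", "model"),
}


def classify_accession(accession):
    """Return (molecule, status) for a RefSeq accession, or None if the prefix is unrecognised."""
    return REFSEQ_PREFIXES.get(accession[:3])


# Accessions exactly as they appear in the text.
accessions = [
    "NP_001359124.1", "NP_001406959.1", "P_001409084.1", "P_015651317.1",
    "NP_001390665.1", "XP_015648389.1", "XP_015622164.1", "NP_001388399.1",
    "XP_015628802.1", "NP_001393222.1",
]

classified = {acc: classify_accession(acc) for acc in accessions}
for acc, result in classified.items():
    label = f"{result[0]}, {result[1]}" if result else "unrecognised prefix"
    print(f"{acc}: {label}")
```

All recognised accessions resolve to protein records, and the two entries printed with an unrecognised prefix would need to be checked against Table 3.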
The results in Table 4 and Table 5 show that source sequences are also documented for the genes, for instance, AK069516 and AU057067 for LOC4323843; AP014961, AY335486 and EG709698 for LOC4337576; AU172639, CB653462, CI167083, CI286584 and CI688566 for LOC4343219; A0A0P0XPV8, A2Z3I6, A3C107, Q69JX7 and Q940D3 for LOC4347708; CA754409, CB678401 and CI199149 for LOC4346460; AB117991, AK066832 and AK068959 for LOC4325652; and AY333185 for LOC4326935. The TrEMBL section of the UniProt Knowledgebase (UniProtKB) also contains corresponding entries, which are computationally annotated and unreviewed, thereby broadening the scope of the analysis and facilitating the tracking of predicted protein models. Such integration improves the dependability of datasets used in functional annotation and diagnostic assessments of genes. Moreover, these findings represent a critical step in clarifying links between nucleotide sequence and protein architecture, ultimately supporting the development of computational diagnostic tools for drought stress resilience.
A phylogenetic tree was generated using alignment scores for NC_089035.1, NC_089039.1, NC_089041.1, NC_089043.1, NC_089043.1, NC_089042.1, NC_089035.1, NC_089035.1, NC_089037.1 and NC_089035.1, revealing a close evolutionary association between the genomic sequences. The clustering arrangement demonstrates strong sequence conservation, indicating probable functional similarity and shared evolutionary origins. Branch length measurements displayed limited divergence, reinforcing the hypothesis that these loci trace back to a common ancestral gene. Furthermore, the grouping of these sequences within the same phylogenetic clade emphasises their likely contribution to preserving conserved biological roles, potentially linked to mechanisms of drought tolerance. This conserved evolutionary signature offers additional support for the functional relevance of these genomic loci in Oryza sativa L. (Fig 3A-3J).
Plants adapt to environmental challenges by modulating their physiological processes and developmental programs through genome-wide changes in gene expression. Epigenetic regulators, including deoxyribonucleic acid (DNA) methylation and demethylation, are thought to exert critical influence in this process (Wang et al., 2011). The degree of drought tolerance exhibited by any plant species is largely determined by the presence and efficiency of genetic mechanisms underlying adaptation (Kim et al., 2020). Bioinformatics-based platforms and analytical tools are highly valuable for evaluating plant responses to abiotic stressors (Neelapu and Chaitanya, 2024).
High-throughput technologies, such as RNA sequencing (RNA-seq), enable detailed profiling of differential gene expression, offering insights into genes central to stress resilience. The integration of bioinformatics, genomics and next-generation sequencing provides a deeper understanding of molecular pathways responsible for tolerance to diverse stress conditions. Such knowledge can be strategically applied to accelerate the breeding of stress-resilient crops and to enhance both yield performance and crop quality (Mu et al., 2022).
This research applied an integrative bioinformatics framework to uncover multiple drought-associated candidate genes, including regulatory transcription factors and functional enzymes, using publicly accessible rice transcriptome datasets. The outcomes support the prevailing model in which drought resilience arises from a complex genetic network controlling traits such as stomatal regulation, deep root systems and cellular equilibrium (Kumar et al., 2015). The capacity to interrogate multi-omics datasets computationally facilitates systematic dissection of this multifactorial trait.
A key strength of the approach was the incorporation of co-expression network methodologies, notably weighted gene co-expression network analysis (WGCNA). This enabled identification of complete gene modules strongly associated with drought-related traits, extending beyond traditional differential expression analysis. Central hub genes within these modules, such as protein kinases mediating signalling events, were recognised as high-priority targets due to their potential to orchestrate the expression of numerous downstream genes (Ambrosino et al., 2020).
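The idea of a hub gene can be illustrated with the first step of a weighted co-expression analysis. For an unsigned network, the adjacency between genes i and j is the absolute Pearson correlation raised to a soft-thresholding power β, and the connectivity of a gene is the sum of its adjacencies. The sketch below is a simplified illustration in plain Python, not the WGCNA R package and not the pipeline used in this study. The expression values are hypothetical toy data, β = 6 is an illustrative choice, and topological overlap and module detection are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def adjacency(expr, beta=6):
    """Unsigned soft-thresholded adjacency: a_ij = |cor(x_i, x_j)| ** beta (self-pairs excluded)."""
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacencies to all other genes."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


# Hypothetical toy expression values across five samples (not data from this study).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8],
    "geneC": [1.2, 1.9, 3.1, 4.2, 4.8],
    "geneD": [5.0, 1.0, 4.0, 2.0, 3.0],
}

adj = adjacency(toy_expr)
k = connectivity(adj)
hub = max(k, key=k.get)  # the most connected gene in this toy network
print(f"Most connected gene: {hub}")
```

Raising the correlation to a power suppresses weak correlations while retaining strong ones, so a gene whose profile is only loosely related to the others, such as `geneD` in the toy data, contributes almost nothing to the network.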
In addition, exploitation of rice pangenome datasets made it possible to detect allelic diversity among these candidate loci across drought-tolerant and drought-susceptible cultivars. This step is especially important, as it connects genetic polymorphisms with functional adaptation, thereby equipping plant breeders with precise molecular markers for targeted selection (Gao et al., 2019).
Thus, although bioinformatics offers a powerful hypothesis-generating engine, functional validation of the predicted genes remains essential. The candidates identified here should be rigorously tested using clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 gene editing to generate knockout and/or overexpression lines in relevant genetic backgrounds, and their role in drought resilience should then be substantiated through thorough phenotypic and physiological evaluation (Razzaq et al., 2021).
Future studies should aim to integrate additional layers of biological data to enhance prediction accuracy. This encompasses incorporating epigenomic data to capture how DNA methylation and histone modifications control stress memory, as well as metabolomic profiles for linking genetic regulation with physiological outcomes (Zitnik et al., 2019). Moreover, machine learning models trained on multi-omics data from field-grown plants under diverse drought regimes hold strong potential for estimating the most effective gene combinations for breeding (Crossa et al., 2025).
Bioinformatics tools enable the identification of stress-responsive genes, molecular markers and regulatory elements, whereas AI techniques strengthen predictive modelling, inference of gene regulatory networks and real-time plant monitoring. Together, these innovations are vital for developing stress-resilient plant varieties capable of thriving under increasingly extreme environmental conditions caused by global climate change and anthropogenic pressures. Practical applications include predicting drought-resistant gene variants, identifying salt-tolerant crop cultivars and enabling real-time monitoring of plant health under extreme temperatures by means of AI-driven phenomics platforms (Zhang et al., 2024).