Analysis of SSR Characterization in Transcriptome for Pampas Grass Cortaderia selloana

Rina Wu1
Yujing Liu1
Bo Xu1
Xiuli Zhang1
Shiyu Zuo1
Yanhua Wu1,*
1Liaoning Agricultural Vocational and Technical College, School of Landscape Architecture, Yingkou, China.
  • Submitted 09-07-2025

  • Accepted 03-08-2025

  • First Online 16-09-2025

  • doi 10.18805/LRF-886

Background: The perennial herbaceous pampas grass (Cortaderia selloana: Poaceae) integrates ecological and ornamental functions, serving both environmental rehabilitation and landscape enhancement. Its robust stress tolerance also positions it as a valuable genetic resource for improving resilience in other plant species. The study aimed to analyze characteristics of simple sequence repeats (SSRs) in the transcriptome of pampas grass (Cortaderia selloana) to lay a foundation for developing molecular markers for this species.

Methods: Using C. selloana leaves, high-throughput sequencing was performed on the NovaSeq X Plus platform. SSR loci were identified from transcriptome data using MISA software, followed by analysis of their distribution and sequence characteristics.

Result: A total of 333,731 reference sequences (unigenes) were obtained, among which 20,684 SSR loci were mined (frequency of occurrence 6.20%, average distribution distance 7.47 kb). SSR repeat motifs were dominated by mononucleotides (7781 SSRs, accounting for 37.62%), trinucleotides (8232 SSRs, 39.80%), and dinucleotides (4064 SSRs, 19.65%). Among 171 repeat motifs, A/T mononucleotide motifs predominated (91.34%), followed by AG/CT dinucleotide motifs (60.58%). SSR repeats were mostly distributed within the 6-15 range, with numbers of all six nucleotide repeat types progressively decreasing as motif repeat count increased. SSR lengths were mostly concentrated in the 12-40 bp range, with SSRs within this length range accounting for 56.90% of all SSR loci. Among these, 3329 SSR loci (16.10%) exhibited higher polymorphism potential (length > 20 bp). SSR loci in C. selloana exhibit high frequency, diverse repeat motifs, high polymorphism potential and considerable potential for research on genetic diversity, molecular identification of cultivars, and development of novel molecular markers in this species.

The perennial herbaceous pampas grass (Cortaderia selloana: Poaceae) is widely used in landscape design and the cut flower industry because of its tall, upright growth habit and dense, ornamental panicles. Beyond its aesthetic value, C. selloana is remarkably stress resistant: it tolerates cold, salinity, drought, and waterlogging, and can survive in both moist and arid soils (Zhang, 2021). Consequently, this species has considerable potential for use in ecological restoration. Because its extensive root system penetrates 30-50 cm into soil, it can be used to stabilize slopes and reduce erosion, making it an ideal species for revegetating embankments and riparian zones. This species also has considerable potential for saline-alkali soil remediation, because it actively secretes organic acids and adsorbs ions through its roots, lowering soil salinity and improving aggregate structure. It has become a pioneer species for vegetation restoration in saline-affected areas of northern and northwestern China. It is also highly adaptable to wetland environments, purifies water by absorbing nutrients (e.g., nitrogen and phosphorus), and plays an important role in rehabilitating degraded aquatic ecosystems. Thus, C. selloana integrates ecological and ornamental functions, serving both environmental rehabilitation and landscape enhancement. Its robust stress tolerance also positions it as a valuable genetic resource for improving resilience in other plant species. However, because C. selloana is a cross-pollinating species with a complex genetic background, research on its germplasm identification, genetic diversity analysis, and molecular-assisted breeding is limited.
       
Traditional morphological markers are more susceptible to environmental influences, whereas simple sequence repeat (SSR) molecular markers have become important in plant genetic research because of their codominant nature, high polymorphism, and technical simplicity (Sun et al., 2022; Raghu et al., 2019). They are widely applied in the construction of genetic linkage maps, in-depth investigation of germplasm genetic diversity, analysis of genetic relationships and differentiation and population structure studies (Yan et al., 2020). Currently, SSR primer development using transcriptome data has been successfully implemented for multiple plant species, such as Vigna radiata (greengram) (Das et al., 2023), Pisum sativum (pea) (Singh et al., 2022), Elymus sibiricus (Siberian wildrye) (Zhang et al., 2019), Dactylis glomerata (orchardgrass) (Huang et al., 2015) and Melilotus albus (white sweetclover) (Yan et al., 2017).
               
Given the incomplete genomic annotation of C. selloana, mining SSR loci in functional gene regions based on transcriptome sequencing is an efficient approach. This study analyzes the composition and sequence characteristics of SSR loci distributed across the C. selloana transcriptome. The objectives are to lay the foundation for germplasm evaluation, genetic map construction and molecular-assisted breeding. These advances will facilitate the effective use of C. selloana in ecological restoration applications.
Plant materials
 
The experiment was conducted from March 2024 to March 2025 in the greenhouse of the College of Landscape Architecture, Liaoning Agricultural Vocational and Technical College, and at Shanghai Majorbio Bio-Pharm Technology Co., Ltd. Cortaderia selloana seeds were purchased from Beijing Huayuanli Agricultural Technology Co. Seeds were sown in pots for germination and maintained in a greenhouse under uniform cultivation practices. At the 6-8-leaf stage, the fourth or fifth fully expanded leaves were collected from each seedling, pooled, flash-frozen in liquid nitrogen and stored for subsequent analysis.
 
Experimental procedures
 
Transcriptome library construction, sequencing, and assembly
 
Total RNA was extracted from C. selloana leaves using Tiangen RNA extraction kits. RNA concentration and purity were assessed via NanoDrop 2000 spectrophotometry, while integrity was verified by agarose gel electrophoresis. RNA quality numbers were determined using an Agilent 5300 system. After passing quality control, libraries underwent high-throughput sequencing on a NovaSeq X Plus platform (Shanghai Majorbio Bio-Pharm Technology Co., Ltd.). Raw reads were filtered, quality-controlled and de novo assembled to generate unigenes, which served as reference sequences for subsequent analyses.
 
SSR screening and statistical analysis
 
SSR loci within unigenes were identified using MISA (http://pgrc.ipk-gatersleben.de/misa/). Minimum repeat thresholds were set as follows: mononucleotides (10 repeats), and di-/tri-/tetra-/penta-/hexanucleotides (6, 5, 5, 5, and 5 repeats, respectively). Compound SSRs were defined as two SSR loci separated by < 100 bp. MISA output was collated using Microsoft Excel 2021 for sequence characterization and data visualization.
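The screening thresholds above can be illustrated with a minimal Python sketch. It is not MISA itself, only a simplified regular-expression search for perfect repeats that uses the same minimum repeat counts and the same < 100 bp rule for compound loci. The function names (find_ssrs, is_compound) and the demo sequence are hypothetical and were chosen for illustration.

```python
import re

# Minimum repeat counts per motif length, as stated in the text
# (mono 10, di 6, tri to hexa 5).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Simplified illustration only; MISA applies its own logic for
    overlapping and compound loci.
    """
    seq = seq.upper()
    hits = []
    for unit_len in sorted(min_repeats):
        n = min_repeats[unit_len]
        # One motif of unit_len bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "AGAG").
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


def is_compound(hit_a, hit_b, max_gap=100):
    """True if two SSRs on one sequence are separated by fewer than max_gap bp."""
    (s1, m1, r1), (s2, _, _) = sorted([hit_a, hit_b])
    end1 = s1 + len(m1) * r1
    return 0 <= s2 - end1 < max_gap


# Made-up demo sequence: (A)12, (AG)7 and (CCG)5 separated by short spacers.
demo = "GGC" + "A" * 12 + "CCGT" + "AG" * 7 + "TTC" + "CCG" * 5 + "AT"
ssrs = find_ssrs(demo)
```

In this sketch, the three repeats in the demo sequence are reported with their start positions, motifs and repeat counts, and adjacent pairs qualify as compound because the spacers are far shorter than 100 bp.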
SSR analysis of the Cortaderia selloana transcriptome
 
High-throughput sequencing generated 333,731 non-redundant unigenes of average length 463.04 bp and total length 154,530,802 bp (Table 1). Screening these unigenes identified 20,684 SSR loci, representing an SSR frequency of 6.20% (SSR loci relative to total unigenes). Of the unigenes, 18,242 contained SSR loci (SSR incidence 5.47%). Additionally, we detected 1197 unigene sequences harboring compound SSR loci (0.36% of total unigenes) and 987 sequences containing ≥ 2 SSR loci (0.30% of total unigenes).

Table 1: Analysis of SSR in transcriptome of Cortaderia selloana ‘Rosea’.
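The summary statistics reported above follow directly from the four counts given in the text. The short sketch below recomputes them; the variable names are illustrative, and the formulas (loci per unigene for frequency, SSR-containing unigenes per unigene for incidence, total length per locus for average distance) are assumed from how the reported values relate to one another.

```python
# Counts reported in the text.
TOTAL_UNIGENES = 333_731
TOTAL_LENGTH_BP = 154_530_802
SSR_LOCI = 20_684
UNIGENES_WITH_SSR = 18_242


def pct(part, whole):
    """Percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# SSR frequency: SSR loci relative to the number of unigenes.
ssr_frequency = pct(SSR_LOCI, TOTAL_UNIGENES)
# SSR incidence: unigenes containing at least one SSR.
ssr_incidence = pct(UNIGENES_WITH_SSR, TOTAL_UNIGENES)
# Average distribution distance: total assembled length per SSR locus, in kb.
avg_distance_kb = round(TOTAL_LENGTH_BP / SSR_LOCI / 1000, 2)
# Mean unigene length in bp.
mean_unigene_len = round(TOTAL_LENGTH_BP / TOTAL_UNIGENES, 2)

print(ssr_frequency, ssr_incidence, avg_distance_kb, mean_unigene_len)
```

Under these assumed definitions, the recomputed values agree with the reported 6.20%, 5.47%, 7.47 kb and 463.04 bp.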


 
Types of SSR repeat motifs
 
SSR repeat motifs ranged from mono- to hexanucleotides, with significant variation in abundances. Mononucleotide (7781; 37.62% of all SSRs) and trinucleotide repeats (8232; 39.80%) predominated, followed by dinucleotide repeats (4064; 19.65%). Tetra- to hexanucleotide repeats each accounted for < 2.00%, while compound SSRs constituted 1197 motifs (5.79%). Distribution patterns revealed pentanucleotide repeats to have the longest average distribution distance (1136.26 kb) but the lowest frequency (0.04%). Conversely, trinucleotide repeats had the shortest average distance (18.77 kb) and highest frequency (2.47%) (Table 2).

Table 2: SSR repeat types, numbers and distribution characteristics in transcriptome.


 
Base composition and proportion of SSR repeat motifs
 
A total of 20,684 SSRs comprising 171 repeat motifs were detected in the C. selloana transcriptome, with an overall frequency of occurrence of 6.20% (Table 3). The number of motif types increased across the six repeat classes (mono-, di-, tri-, tetra-, penta-, and hexanucleotide), with 2, 4, 10, 31, 41, and 83 types, respectively (frequencies ranging 0.01%-2.13%). However, the total number of SSR loci trended downward.

Table 3: Repetition and number of different repeat types.


       
In terms of base composition, mononucleotide repeats had the fewest types (2), with A/T being the dominant type, forming 7107 loci, accounting for 91.34% of this motif category. Dinucleotide repeats included 4 types, mainly AG/CT, which formed 2462 loci (60.58% of dinucleotide motifs). The CG/CG type was the least frequent, with only 196 loci (4.82%). Trinucleotide repeats comprised 10 types, with CCG/CGG (2488 loci, 30.22%) being the most abundant, followed by AGG/CCT (1443 loci, 17.53%) and AGC/CTG (1082 loci, 13.14%). Least frequent was ACT/AGT (87 loci, 1.06%). Tetranucleotide repeats included 31 types, with AAAG/CTTT (42 loci, 12.80%) the dominant type, then AAAC/GTTT (35 loci, 10.67%). Remaining types each accounted for < 10%, with the least frequent being ACGT/ACGT and ACCC/GGGT (1 locus each, 0.31% each). Pentanucleotide repeats had 41 types, with AGAGG/CCTCT (27 loci, 19.85%) and AAAAG/CTTTT (25 loci, 18.38%) being the most numerous. Hexanucleotide repeats had the highest number of types (83), with AAAAAG/CTTTTT (14 loci, 9.79%) being most abundant (Table 3).
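Motif labels such as AG/CT or CCG/CGG group together all rotations of a repeat unit on both strands. The sketch below shows one way such grouping can be done. It assumes the common convention of taking the alphabetically smallest rotation of the motif and of its reverse complement; the function names are hypothetical and this is not necessarily the exact procedure used to build Table 3.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """All cyclic shifts of a motif (GA -> GA, AG)."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def motif_class(motif):
    """Label such as 'AG/CT' shared by all rotations on both strands."""
    motif = motif.upper()
    fwd = min(rotations(motif))
    rev = min(rotations(reverse_complement(motif)))
    first, second = sorted([fwd, rev])
    return f"{first}/{second}"


def is_primitive(motif):
    """False if the motif is a repeat of a shorter unit (AA, ATAT)."""
    n = len(motif)
    return not any(motif == motif[:k] * (n // k)
                   for k in range(1, n) if n % k == 0)


def count_classes(k):
    """Number of distinct motif classes possible for unit length k."""
    motifs = ("".join(p) for p in product("ACGT", repeat=k))
    return len({motif_class(m) for m in motifs if is_primitive(m)})


print(motif_class("GA"), motif_class("GGC"), motif_class("CTCTC"))
print([count_classes(k) for k in (1, 2, 3, 4)])
```

Under this convention there are 2, 4, 10 and 33 possible classes for mono- to tetranucleotide units, so the 2, 4 and 10 types reported above would cover every possible mono-, di- and trinucleotide class, and the 31 tetranucleotide types would cover nearly all.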

Repeat count distribution of SSR motifs
 
An inverse relationship existed between repeat count and motif abundance across all six nucleotide repeat types (Table 4). Key distribution patterns emerged. Mononucleotides were primarily distributed at 6-15 repeats (92.94% of mono-SSRs). Dinucleotides were dominated by 6-10 repeats (81.23% of di-SSRs). Trinucleotides were concentrated at 1-10 repeats (98.99% of tri-SSRs). Collectively, these dominant repeat ranges accounted for 90.32% of all SSR loci. Tetra-, penta-, and hexanucleotides exhibited preferential low-repeat distributions (≤ 6 repeats), with numbers of their main repeat motifs being 324, 109, and 110, respectively (1.57%, 0.53%, and 0.53% of all SSRs, respectively).

Table 4: Repeat times of different SSR motifs.
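The combined share of 90.32% can be reproduced as a weighted sum of the three within-type percentages, using the locus counts reported earlier for mono-, di- and trinucleotide SSRs. The sketch below is a plain arithmetic check with illustrative variable names; the locus counts per range are estimated from rounded percentages, so they are approximate.

```python
TOTAL_SSRS = 20_684

# Locus counts per repeat type, as reported in the text.
counts = {"mono": 7_781, "di": 4_064, "tri": 8_232}
# Share (%) of each type that falls in its dominant repeat-count range.
share_in_range = {"mono": 92.94, "di": 81.23, "tri": 98.99}

# Approximate number of loci inside the dominant ranges.
loci_in_range = sum(counts[t] * share_in_range[t] / 100 for t in counts)

# Share of all SSR loci covered by the three dominant ranges.
combined_share = round(100 * loci_in_range / TOTAL_SSRS, 2)
print(round(loci_in_range), combined_share)
```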


 
Evaluation of SSR availability
 
After filtering out fragments of length < 12 bp, the lengths of SSRs with different motifs in C. selloana were analyzed (Table 5). The length of dinucleotide SSRs was mainly concentrated at 12-20 bp (66.78% of all dinucleotide SSRs). The length of trinucleotide SSRs was also mainly 12-20 bp (75.66% of all trinucleotide SSRs). For tetra-, penta-, and hexanucleotide repeats, SSR lengths were mainly 12-20, 21-40, and 21-40 bp, respectively (56.40%, 86.76%, and 76.92% of their respective total numbers). Nucleotide lengths of compound SSRs were mainly concentrated at 20-40, 41-60, and 61-80 bp (25.65%, 21.55%, and 17.88% of all compound SSRs, respectively).

Table 5: Length distribution of different SSR repeat types in Cortaderia selloana.


       
SSR lengths were mainly concentrated in the range of 12-40 bp, with 11,769 sequences (56.90% of all 20,684 SSRs). Among these SSR loci, sequences of 12-20 bp length were most numerous (9127, 44.13%), followed by those of 21-40 bp length (2217, 10.72%). Some 1112 sequences were > 40 bp in length (5.38% of all SSRs). By individual length, the 15-bp trinucleotide repeat was the most numerous (4640 in total, 22.43%), followed by 18 bp and 12 bp (7.68% and 6.04% of all SSRs, respectively). There were 3329 SSRs of length > 20 bp (16.10% of all SSRs). We speculate that these longer sequences may have greater polymorphism potential.
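The length classes used above follow from SSR length = motif length × repeat count. The sketch below bins a locus by length and checks that the two longer classes reported in the text add up to the 3329 loci > 20 bp. The bin boundaries are taken from the text; the function names are illustrative.

```python
TOTAL_SSRS = 20_684
# Locus counts per length class, as reported in the text.
REPORTED = {"12-20": 9_127, "21-40": 2_217, ">40": 1_112}


def ssr_length(unit_len, repeats):
    """Length in bp of a perfect SSR."""
    return unit_len * repeats


def length_bin(length_bp):
    """Assign a length to the classes used in the text."""
    if length_bp < 12:
        return "<12"      # filtered out before the length analysis
    if length_bp <= 20:
        return "12-20"
    if length_bp <= 40:
        return "21-40"
    return ">40"


# A trinucleotide at the 5-repeat threshold is 15 bp long.
print(length_bin(ssr_length(3, 5)))
# A mononucleotide at the 10-repeat threshold is below the 12 bp cut-off.
print(length_bin(ssr_length(1, 10)))

# Loci longer than 20 bp are the sum of the two longer classes.
longer_than_20 = REPORTED["21-40"] + REPORTED[">40"]
share_longer_than_20 = round(100 * longer_than_20 / TOTAL_SSRS, 1)
print(longer_than_20, share_longer_than_20)
```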
       
With the rapid advancement of high-throughput sequencing, reference genomes for numerous plant species have been assembled. However, non-model ornamental grasses, particularly those with complex chromosome ploidy and limited molecular biology research, still lack comprehensive genomic resources. Current transcriptome sequencing of C. selloana relies on references from related species within the family, resulting in significantly lower data yields and fewer annotated unigenes compared with more commonly studied plants.
       
We obtained 333,731 C. selloana unigenes through transcriptome sequencing. From these, 20,684 SSR loci were identified. With a frequency of occurrence of 5.47%, frequency of appearance of 6.20%, and average distribution distance of 7.47 kb, the SSR loci appear to be sparsely distributed in the transcriptome, possibly related to the genomic characteristics of gramineous plants. Li et al. (2023) used MISA software to search for SSR loci in 130,393 unigene sequences of the Narenga porphyrocoma transcriptome; from 14,233 unigenes, 16,372 SSR loci were obtained, with a frequency of appearance of 10.92%. Mao et al. (2022) searched for SSR loci in the third-generation full-length transcriptome of Psammochloa villosa and found 93,563 SSR repeat sequences distributed in 56,824 unigenes (appearance frequency 50.83%). In the full-length transcriptome of Littledalea racemosa with 30,624 unigene sequences, 14,089 SSR repeat sequences were detected using MISA software (frequency of appearance 46.01%) (Fu et al., 2025). Yin et al. (2024) searched for SSR loci in 214,676 unigene sequences of the Hordeum brevisubulatum transcriptome, and detected 24,877 SSRs in 21,618 unigene sequences (frequency of appearance 11.59%). Wang et al. (2022) performed transcriptome sequencing on an Agropyron mongolicum hybrid and, from 110,115 unigenes, obtained 5620 SSR loci after locus searching (frequency of appearance 5.10%). Compared with these gramineous plants, the SSR frequency of appearance in C. selloana is higher than for A. mongolicum (5.10%), but lower than for P. villosa, L. racemosa, N. porphyrocoma and H. brevisubulatum. The frequency of occurrence of SSR loci in C. selloana thus differs from that in transcriptomes of other gramineous plants. Inherent differences in gene structure among species may explain these differences, although factors such as the size of the analysis database, SSR retrieval tools, and parameter settings of retrieval conditions cannot be excluded (Fu et al., 2021).
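The cross-species comparison can be made concrete by recomputing each frequency as SSR loci per unigene from the counts quoted above. The sketch below does this for the species whose total unigene counts are given in the text (P. villosa is omitted because its total is not stated). The dictionary layout and names are illustrative.

```python
# (total unigenes, SSR loci) as quoted in the text.
STUDIES = {
    "C. selloana": (333_731, 20_684),
    "N. porphyrocoma": (130_393, 16_372),
    "L. racemosa": (30_624, 14_089),
    "H. brevisubulatum": (214_676, 24_877),
    "A. mongolicum": (110_115, 5_620),
}

# Frequency of appearance: SSR loci per 100 unigenes.
frequency = {
    species: round(100 * loci / unigenes, 2)
    for species, (unigenes, loci) in STUDIES.items()
}

# Species ordered from lowest to highest frequency.
ranking = sorted(frequency, key=frequency.get)

# For N. porphyrocoma the text also gives the SSR-containing unigenes.
n_porphyrocoma_incidence = round(100 * 14_233 / 130_393, 2)

for species in ranking:
    print(f"{species}: {frequency[species]}%")
print("N. porphyrocoma, SSR-containing unigenes:", n_porphyrocoma_incidence)
```

Recomputed this way, the values for C. selloana, L. racemosa, H. brevisubulatum and A. mongolicum match those quoted. For N. porphyrocoma, loci per unigene gives about 12.56%, whereas the quoted 10.92% appears to correspond to SSR-containing unigenes (14,233 of 130,393). This suggests the cited studies may not all define frequency in the same way, which is worth keeping in mind when comparing them.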
       
Regarding repeat types, C. selloana SSRs were dominated by mononucleotide (37.62%) and trinucleotide repeats (39.80%), with dinucleotides secondary (19.65%). Tetra- to hexanucleotide repeats collectively comprised < 3% of all SSRs. This distribution aligns with patterns in other Poaceae species, namely L. racemosa (mono, 36.88%; tri, 39.34%; di, 21.13%) and H. brevisubulatum (mono, 50.90%; tri, 28.78%; di, 17.48%) (Yin et al., 2021), confirming the prevalence of low-order repeats in grasses. Notably, C. selloana exhibited a markedly higher proportion of trinucleotide repeats (39.80%) than H. brevisubulatum (28.78%), suggesting enhanced variation potential in coding regions and greater utility for polymorphic marker development.
       
In terms of motif composition, the dominant motif types in C. selloana are consistent with those of Gramineae. Among the 171 repeat motif types detected in the C. selloana transcriptome, the mononucleotide A/T type accounts for the highest proportion (91.34%), the dinucleotides are dominated by AG/CT (60.58%), and the trinucleotide CCG/CGG is also a high-frequency type (30.22%). This base preference may be related to the codon usage preference of Gramineae plants and the high mutation rate in AT-enriched regions of the transcriptome. The finding that AG/CT is the dominant dinucleotide motif and CCG/CGG the dominant trinucleotide motif in C. selloana is consistent with results for other Gramineae plants such as Teosinte (Li et al., 2023) and Coix lacryma-jobi (Ouyang et al., 2021). This observation also supports research reporting the dominant repeat motifs of dinucleotides and trinucleotides in SSRs of most monocotyledonous plants to be AG/CT and CCG/CGG, respectively (Kantety et al., 2002). However, dominant motifs in plants such as Diospyros rhombifolia (Wang et al., 2022) and Brassica campestris chinensis var. purpuraria (Xi et al., 2022) differ, possibly because of species specificity or search conditions. Repeat motif types vary among species, and the dominant motifs within each repeat type also change. This variation is conducive to the development of SSR molecular markers. The CCG/CGG motif also has the highest proportion in L. racemosa (31.32%) and N. porphyrocoma (20.26%), indicating its evolutionary conservation in monocotyledons.
       
High polymorphism, the core value of SSR markers, depends on repeat unit counts and the abundance of ≥ 20 bp fragments. Generally, SSRs ≥ 20 bp exhibit high polymorphism, and those of 12-20 bp show moderate polymorphism. In C. selloana, SSR repeat counts predominantly ranged 6-15 (92.94% of mononucleotide SSRs), with lengths concentrated in 12-40 bp (56.90%). Notably, 3329 SSR loci (16.10%) exceeded 20 bp. These longer repeats have enhanced polymorphism potential because of extended motif repetitions, making them promising targets for developing highly informative molecular markers.
               
Study limitations must be acknowledged. First, the SSR frequency in C. selloana (6.20%) is markedly lower than that of P. villosa (50.83%) and L. racemosa (46.01%), possibly because of sequencing depth or unigene assembly completeness. Second, tetra- to hexanucleotide repeats constituted < 3% of all SSRs, constraining development of complex polymorphic markers. Future efforts could integrate genomic sequencing data to mine genomic SSRs for enhanced marker density, and validate high-polymorphism loci through primer screening. These advances will support cultivar identification, genetic linkage map construction, and stress-resistance gene localization in C. selloana.
This study provides the first systematic characterization of SSR distribution patterns in the transcriptome of C. selloana, yielding three principal findings. The 20,684 identified SSR loci (frequency 6.20%) were dominated by mononucleotide (37.62%) and trinucleotide repeats (39.80%). Dominant motifs (A/T (mono-), AG/CT (di-) and CCG/CGG (tri-)) align with Poaceae evolutionary traits. High polymorphism potential for marker development exists, with 3329 SSR loci (16.10%) exceeding 20 bp length.
       
The widespread distribution and rich diversity of SSR loci in C. selloana represent a robust foundation for efficient marker development. Significant potential exists to apply these results in research on genetic diversity, molecular identification of cultivars, and development of novel molecular markers in this species.
RN W, YH W and YJ L designed the experiment, RN W, B X and SY Z performed the experiments and wrote the manuscript, XL Z revised the article. This study was supported by the Science and Technology Department General Project of Liaoning Province (2023-MSLH-303 and 2023-MSLH-305), Yingkou Enterprise Doctor Dual Innovation Program Project (YKSCJH2024-022) and the Research Project of Liaoning Agricultural Vocational and Technical College (LnzkB202320). The authors would like to express their gratitude to EditSprings for the expert linguistic services provided.
 
Disclaimers

The views and conclusions expressed in this article are solely those of the authors and do not necessarily represent the views of their affiliated institutions. The authors are responsible for the accuracy and completeness of the information provided, but do not accept any liability for any direct or indirect losses resulting from the use of this content.
 
Informed consent
 
The collection of plant seeds conforms to China’s regulatory standards and current laws.
The authors declare that there are no conflicts of interest regarding the publication of this article. No funding or sponsorship influenced the design of the study, data collection, analysis, decision to publish, or preparation of the manuscript.

  1. Das, T.R. and Baisakh, B. (2023). SSR-marker assisted evaluation of genetic diversity in greengram [Vigna radiata (L.) Wilczek]. Legume Research. 48(7): 1110-1116. doi: 10.18805/LR-5151.

  2. Fu, G., Liu, Y.P., Su, X. (2021). Analysis of SSR characteristics for Elsholtzia densa Benth. based on transcriptome data. Acta Botanica Boreali-Occidentalia Sinica. 41(4): 654-663. 

  3. Fu, G., Liu, Y.P., Su, X., et al. (2025). Analysis of SSR characterization in full-length transcriptome and development of SSR molecular markers for Littledalea racemosa. Acta Prataculturae Sinica. 34(7): 107-119.

  4. Huang, L.K., Yan, H.D., Zhao, X.X., et al. (2015). Identifying differentially expressed genes under heat stress and developing molecular markers in orchard grass (Dactylis glomerata L.) through transcriptome analysis. Molecular Ecology Resources. 15(6): 1497-1509.

  5. Kantety, R.V., La Rota, M., Matthews, D.E., et al. (2002). Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Molecular Biology. 48(5/6): 501-510.

  6. Li, H.B., Wu, Y., Zhu, K., Gui, Y.Y., Wei, J.J., Zhou, H. (2023). Analysis of SSR, SNP sequence features and phylogeny of the transcriptome of Narenga porphyrocoma (Hance) Bor. Journal of Southern Agriculture. 54(3): 849-858.

  7. Mao, X.R., Liu, Y.P., Su, X., et al. (2022). Characteristics analysis of simple sequence repeat (SSR) loci in Psammochloa villosa (Poaceae) based on transcriptome data. Acta Agrestia Sinica. 30(8): 1990-2001. 

  8. Ouyang, Y., Li, L.L., Shi, H.Y., Zhao, Y., Zhu, F.Z., Yin, Y.Y., Wang, L.Z. (2021). Transcriptome sequencing and gene function annotation of Coix lacryma-jobi L. var. ma-yuen Stapf. Central South Pharmacy. 19(7): 1286-1293. doi: 10.7539/j.issn.1672-2981.2021.07.004.

  9. Raghu, R., Ravikumar, R.L., Subramanya., Sunil, A.E. (2019). Cross transferability of Chickpea genic SSR markers developed from Fusarium wilt resistance loci to orphan legumes. Legume Research. 44(4): 388-400. doi: 10.18805/LR-4119.

  10. Singh, S., Singh, B., Sharma, V.R., Kumar, M., Sirohi, U. (2022). Assessment of genetic diversity and population structure in pea (Pisum sativum L.) germplasm based on morphological traits and SSR markers. Legume Research. 45(6): 683-688. doi: 10.18805/LR-4751.

  11. Sun, L.J., He, J.J., Wang, J. et al. (2022). Development of SSR markers based on full-length transcriptome sequencing and genetic diversity analysis of Halogeton glomeratus. Acta Prataculturae Sinica. 31(8): 199-210.

  12. Wang, X.Y., Fu, B.Z., Zhang, Z.Q., Mi, F.G. (2022). SSR and SNP characteristics analysis in Agropyron cristatum×A. desertorum cv. Hycrest-Mengnong based on transcriptome sequencing. Molecular Plant Breeding. 1-8.

  13. Xi, D.D., Gao, L., Li, X.F., Zhu, Y.Y., Zhu, H.F. (2022). Development and identification of SSR molecular markers based on transcriptome sequencing of caitai. Molecular Plant Breeding. https://kns.cnki.net/kcms/detail/46.1068.S.20220916.1333.050.html.

  14. Yan, R.J., Schnabel, K.E., Rowden, A.A., et al. (2020). Population structure and genetic connectivity of squat lobsters (Munida Leach, 1820) associated with vulnerable marine ecosystems in the southwest Pacific Ocean. Frontiers in Marine Science. 6: 791.

  15. Yan, Z.Z., Wu, F., Luo, K., Zhao, Y.F., Yan, Q., Zhang, Y.F., Wang, Y.R., and Zhang, J.Y. (2017). Cross-species transferability of EST-SSR markers developed from the transcriptome of Melilotus and their application to population genetics research. Sci. Rep. 7(1): 17959.

  16. Yin, H., Li, B., Zhang, M.Z., et al. (2021). Analysis of SSR sequence characteristics of Hordeum brevisubulatum transcriptome. Molecular Plant Breeding. 8: 1-9.

  17. Yin, H., Li, B., Zhang, M.Z., Fu, W.F., Sun, X., Gong, L. and Cui, G.W. (2024). Analysis of SSR sequence characteristics of Hordeum brevisubulatum transcriptome. Molecular Plant Breeding. 22(19): 6352-6358. 

  18. Zhang, Y.S. (2021). Study on tissue culture and ramet propagation of Cortaderia selloana ‘Pumila’ and Cortaderia selloana ‘Silver Comet’. Master Thesis, Guiyang, Guizhou, China.

  19. Zhang, Z.Y., Xie, W.G., Zhao, Y.Q., et al. (2019). EST-SSR marker development based on RNA-sequencing of E. sibiricus and its application for phylogenetic relationships analysis of seventeen Elymus species. BMC Plant Biology. 19(1): 2.

Analysis of SSR Characterization in Transcriptome for Pampass Grass Cortaderia selloana

R
Rina Wu1
Y
Yujing Liu1
B
Bo Xu1
X
Xiuli Zhang1
S
Shiyu Zuo1
Y
Yanhua Wu1,*
1Liaoning Agricultural Vocational and Technical College, School of Landscape Architecture, Yingkou, China.
  • Submitted09-07-2025|

  • Accepted03-08-2025|

  • First Online 16-09-2025|

  • doi 10.18805/LRF-886

Background: The perennial herbaceous pampas grass (Cortaderia selloana: Poaceae) integrates ecological and ornamental functions, serving both environmental rehabilitation and landscape enhancement. Its robust stress tolerance also positions it as a valuable genetic resource for improving resilience in other plant species. The study aimed to analyze characteristics of simple sequence repeats (SSRs) in the transcriptome of pampass grass (Cortaderia selloana) to lay a foundation for developing molecular markers for this species.

Methods: Using C. selloana leaves, high-throughput sequencing was performed on the NovaSeq × Plus platform. SSR loci were identified from transcriptome data using MISA software, followed by analysis of their distribution and sequence characteristics.

Result: A total of 333.731 reference sequences (unigenes) were obtained, among which 20.684 SSR loci were mined (frequency of occurrence 6.20%, average distribution distance 7.47 kb). SSR repeat motifs were dominated by mononucleotides (7781 SSRs, accounting for 37.62%), trinucleotides (8232 SSRs, 39.80%), and dinucleotides (4064 SSRs, 19.65%). Among 171 repeat motifs, A/T mononucleotide motifs predominated (91.34%), followed by AG/CT dinucleotide motifs (60.58%). SSR repeats were mostly distributed within the 6-15 range, with numbers of all six nucleotide repeat types progressively decreasing as motif repeat count increased. SSR lengths were mostly concentrated in the 12-40 bp range, with SSRs within this length range accounting for 56.90% of all SSR loci. Among these, 3329 SSR loci (16.10%) exhibited higher polymorphism potential (length > 20 bp). SSR loci in C. selloana exhibit high frequency, diverse repeat motifs, high polymorphism potential and considerable potential for research on genetic diversity, molecular identification of cultivars, and development of novel molecular markers in this species.

The perennial herbaceous pampas grass (Cortaderia selloana: Poaceae) is widely used in landscape design and the cut flower industry because of its tall, upright growth habit and dense, ornamental panicles. Beyond its aesthetic value, however, C. selloana is remarkably stress resistant, and tolerates cold, salinity, drought, and waterlogging. Besides, it can survive in both moist and arid soils (Zhang, 2021). Consequently, this species has considerable potential for use in ecological restoration. Because its extensive root system penetrates 30-50 cm into soil, it can be used to stabilize slopes and reduce erosion, making it an ideal species to plant for revegetating embankments and riparian zones. This species also has considerable potential for saline-alkali soil remediation, because it actively secretes organic acids and adsorbs ions through its roots, lowering soil salinity and improving aggregate structure. It has become a pioneer species for vegetation restoration in saline-affected areas of northern and northwestern China. It is also highly adaptable to wetland environments, purifies water by absorbing nutrients (e.g., nitrogen and phosphorus), and plays an important role in rehabilitating degraded aquatic ecosystems. Thus, C. selloana integrates ecological and ornamental functions, serving both environmental rehabilitation and landscape enhancement. Its robust stress tolerance also positions it as a valuable genetic resource for improving resilience in other plant species. However, as a cross-pollinating species with a complex genetic background, research on germplasm identification, genetic diversity analysis, and molecular-assisted breeding for C. selloana is limited.
       
Traditional morphological markers are more susceptible to environmental influences, whereas simple sequence repeat (SSR) molecular markers have become important in plant genetic research because of their codominant nature, high polymorphism, and technical simplicity (Sun et al., 2022; Raghu et al., 2019). They are widely applied in the construction of genetic linkage maps, in-depth investigation of germplasm genetic diversity, analysis of genetic relationships and differentiation and population structure studies (Yan et al., 2020). Currently, SSR primer development using transcriptome data has been successfully implemented for multiple plant species, such as Vigna radiata (greengram) (Das et al., 2023), Pisum sativum (pea) (Singh et al., 2022), Elymus sibiricus (Siberian wildrye) (Zhang et al., 2019), Dactylis glomerata (orchardgrass) (Huang et al., 2015) and Melilotus albus (white sweetclover) (Yan et al., 2017).
               
Given the incomplete genomic annotation of C. selloana,  mining SSR loci in functional gene regions based on transcriptome sequencing is an efficient approach. This study analyzes the composition and sequence characteristics of SSR loci distributed across the C. selloana transcriptome. Objectives are to lay the foundation for germplasm evaluation, genetic map construction and molecular-assisted breeding. These advances will facilitate the effective use of C. selloana in ecological restoration applications.
Plant materials
 
The experiment was conducted from March 2024 to March 2025 in the greenhouse of the College of Landscape Architecture, Liaoning Agricultural Vocational and Technical College, and at Shanghai Majorbio Bio-Pharm Technology Co., Ltd. Cortaderia selloana seeds were purchased from Beijing Huayuanli Agricultural Technology Co. Seeds were sown in pots for germination and maintained in a greenhouse under uniform cultivation practices. At the 6-8-leaf stage, the fourth or fifth fully expanded leaves were collected from each seedling, pooled, flash-frozen in liquid nitrogen and stored for subsequent analysis.
 
Experimental procedures
 
Transcriptome library construction, sequencing, and assembly
 
Total RNA was extracted from C. selloana leaves using Tiangen RNA extraction kits. RNA concentration and purity were assessed via NanoDrop 2000 spectrophotometry, while integrity was verified by agarose gel electrophoresis. RNA quality numbers were determined using an Agilent 5300 system. After passing quality control, libraries underwent high-throughput sequencing on a NovaSeq × Plus platform (Shanghai Majorbio Bio-Pharm Technology Co., Ltd.). Raw reads were filtered, quality-controlled and de novo assembled to generate unigenes, which served as reference sequences for subsequent analyses.
 
SSR screening and statistical analysis
 
SSR loci within unigenes were identified using MISA (http://pgrc.ipk-gatersleben.de/misa/). Minimum repeat thresholds were set as follows: mononucleotides (10 repeats), and di-/tri-/tetra-/penta-/hexanucleotides (6, 5, 5, 5, and 5 repeats, respectively). Compound SSRs were defined as two SSR loci separated by < 100 bp. MISA output was collated using Microsoft Excel 2021 for sequence characterization and data visualization.
SSR analysis of the Cortaderia selloana transcriptome
 
High-throughput sequencing generated 333,731 non-redundant unigenes of average length 463.04 bp and total length 154,530,802 bp (Table 1). Screening these unigenes identified 20,684 SSR loci, representing an SSR frequency of 6.20%. Among these, 18,242 sequences contained SSR loci (SSR incidence 5.47%). Additionally, we detected 1197 unigene sequences harboring compound SSR loci (0.36% of total unigenes) and 987 sequences containing ≥ 2 SSR loci (0.30% of total unigenes).

Table 1: Analysis of SSR in transcriptome of Cortaderia selloana ‘Rosea’.


 
Types of SSR repeat motifs
 
SSR repeat motifs ranged from mono- to hexanucleotides, with marked variation in abundance. Mononucleotide (7781; 37.62% of all SSRs) and trinucleotide repeats (8232; 39.80%) predominated, followed by dinucleotide repeats (4064; 19.65%). Tetra- to hexanucleotide repeats each accounted for < 2.00%, while compound SSRs constituted 1197 motifs (5.79%). Pentanucleotide repeats had the longest average distribution distance (1136.26 kb) and the lowest frequency (0.04%). Conversely, trinucleotide repeats had the shortest average distance (18.77 kb) and highest frequency (2.47%) (Table 2).
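The frequency and distance figures can be reproduced with simple ratios. The sketch below assumes that frequency is the number of loci divided by the number of unigenes, and that average distribution distance is the total unigene length divided by the number of loci. These definitions are inferred because they match the reported values; they are not stated explicitly in the text. The function names are hypothetical.

```python
# Values reported in the text (Table 1 and the repeat-type counts).
TOTAL_UNIGENES = 333_731
TOTAL_LENGTH_BP = 154_530_802


def frequency_pct(n_loci, n_unigenes=TOTAL_UNIGENES):
    """SSR loci per 100 unigenes (assumed definition of 'frequency')."""
    return 100.0 * n_loci / n_unigenes


def mean_distance_kb(n_loci, total_bp=TOTAL_LENGTH_BP):
    """Kilobases of unigene sequence per SSR locus (assumed definition)."""
    return total_bp / n_loci / 1000.0


for label, n in [("all SSRs", 20_684), ("trinucleotide", 8_232)]:
    print(f"{label}: {frequency_pct(n):.2f}%, {mean_distance_kb(n):.2f} kb")
```

Under these assumptions the overall values come out at 6.20% and 7.47 kb, and the trinucleotide values at 2.47% and 18.77 kb, in agreement with the figures given above.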

Table 2: SSR repeat types, numbers and distribution characteristics in transcriptome.


 
Base composition and proportion of SSR repeat motifs
 
A total of 20,684 SSRs comprising 171 repeat motifs were detected in the C. selloana transcriptome, with an overall frequency of occurrence of 6.20% (Table 3). The number of distinct motif types increased with motif length across the six repeat classes (mono-, di-, tri-, tetra-, penta- and hexanucleotide), with 2, 4, 10, 31, 41, and 83 types, respectively (frequencies ranging 0.01%-2.13%). However, the total number of SSR loci trended downward.

Table 3: Repetition and number of different repeat types.


       
In terms of base composition, mononucleotide repeats had the fewest types (2), with A/T being the dominant type, forming 7107 loci, accounting for 91.34% of this motif category. Dinucleotide repeats included 4 types, mainly AG/CT, which formed 2462 loci (60.58% of dinucleotide motifs). The CG/CG type was the least frequent, with only 196 loci (4.82%). Trinucleotide repeats comprised 10 types, with CCG/CGG (2488 loci, 30.22%) being the most abundant, followed by AGG/CCT (1443 loci, 17.53%) and AGC/CTG (1082 loci, 13.14%). Least frequent was ACT/AGT (87 loci, 1.06%). Tetranucleotide repeats included 31 types, with AAAG/CTTT (42 loci, 12.80%) the dominant type, then AAAC/GTTT (35 loci, 10.67%). Remaining types accounted for < 10%, with the least frequent being ACGT/ACGT and ACCC/GGGT (1 locus each, 0.31% each). Pentanucleotide repeats had 41 types, with AGAGG/CCTCT (27 loci, 19.85%) and AAAAG/CTTTT (25 loci, 18.38%) being the most numerous. Hexanucleotide repeats had the highest number of types (83), with AAAAAG/CTTTTT (14 loci, 9.79%) being most abundant (Table 3).
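Paired labels such as AG/CT or CCG/CGG reflect the convention of treating a motif, its rotations and its reverse complement as a single type. A minimal Python sketch of that grouping is given below. The function names are hypothetical, and the choice of the alphabetically smallest form as the class label is an assumption made for illustration.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """All cyclic shifts of a motif, e.g. AG -> [AG, GA]."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical(motif):
    """Alphabetically smallest form among rotations on both strands."""
    motif = motif.upper()
    return min(rotations(motif) + rotations(reverse_complement(motif)))


def is_primitive(motif):
    """False when the motif is a shorter unit repeated (e.g. AA, ATAT)."""
    k = len(motif)
    return not any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0)


def count_classes(k):
    """Number of distinct motif types of length k after grouping."""
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return len({canonical(m) for m in kmers if is_primitive(m)})


print(canonical("CT"), canonical("CGG"))
print([count_classes(k) for k in (1, 2, 3)])
```

For motif lengths 1, 2 and 3 this grouping yields 2, 4 and 10 classes, matching the numbers of mono-, di- and trinucleotide types reported above.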

Repeat count distribution of SSR motifs
 
An inverse relationship existed between repeat count and motif abundance across all six nucleotide repeat types (Table 4). Key distribution patterns emerged. Mononucleotides were primarily distributed at 6-15 repeats (92.94% of mono-SSRs). Dinucleotides were dominated by 6-10 repeats (81.23% of di-SSRs). Trinucleotides were concentrated at 1-10 repeats (98.99% of tri-SSRs). Collectively, these dominant repeat ranges accounted for 90.32% of all SSR loci. Tetra-, penta-, and hexanucleotides exhibited preferential low-repeat distributions (≤ 6 repeats), with numbers of their main repeat motifs being 324, 109, and 110, respectively (1.57%, 0.53%, and 0.53% of all SSRs, respectively).

Table 4: Repeat times of different SSR motifs.


 
Evaluation of SSR availability
 
After filtering out fragments of length < 12 bp, the lengths of SSRs with different motifs in C. selloana were analyzed (Table 5). The length of dinucleotide SSRs was mainly concentrated at 12-20 bp (66.78% of all dinucleotide SSRs). The length of trinucleotide SSRs was also mainly 12-20 bp (75.66% of all trinucleotide SSRs). For tetra-, penta-, and hexanucleotide repeats, SSR lengths were mainly 1-20, 21-40, and 21-40 bp, respectively (56.40%, 86.76%, and 76.92% of their respective total numbers). Nucleotide lengths of compound SSRs were mainly concentrated at 20-40, 41-60, and 61-80 bp (25.65%, 21.55%, and 17.88% of all compound SSRs, respectively).

Table 5: Length distribution of different SSR repeat types in Cortaderia selloana.


       
SSR lengths were mainly concentrated in the range of 12-40 bp, with 11,769 sequences (56.90% of all 20,684 SSRs). Among these SSR loci, sequences of 12-20 bp length were most numerous (9127, 44.13%), followed by those of 21-40 bp length (2217, 10.72%). Some 1112 sequences were > 40 bp in length (5.38% of all SSRs). Across all SSRs, the most numerous single length class was the 15-bp trinucleotide repeat (4640 in total, 22.43%), followed by 18 bp and 12 bp (7.68% and 6.04% of all SSRs, respectively). There were 3329 SSRs of length > 20 bp (16.10% of all SSRs). We speculate that these longer sequences may have greater polymorphism potential.
       
With the rapid advancement of high-throughput sequencing, reference genomes for numerous plant species have been assembled. However, non-model ornamental grasses, particularly those with complex chromosome ploidy and limited molecular biology research, still lack comprehensive genomic resources. Current transcriptome sequencing of C. selloana relies on references from related species within the family, resulting in significantly lower data yields and fewer annotated unigenes compared with more commonly studied plants.
       
We obtained 333,731 C. selloana unigenes through transcriptome sequencing. From these, 20,684 SSR loci were identified. With a frequency of occurrence of 5.47%, frequency of appearance of 6.20%, and average distribution distance of 7.47 kb, the SSR loci appear to be sparsely distributed in the transcriptome, possibly related to the genomic characteristics of gramineous plants. Li et al. (2023) used MISA software to search for SSR loci in 130,393 unigene sequences of the Narenga porphyrocoma transcriptome; from 14,233 unigenes, 16,372 SSR loci were obtained, with a frequency of appearance of 10.92%. Mao et al. (2022) searched SSR loci in the third-generation full-length transcriptome of Psammochloa villosa and found 93,563 SSR repeat sequences distributed in 56,824 unigenes (appearance frequency 50.83%). In the full-length transcriptome of Littledalea racemosa with 30,624 unigene sequences, 14,089 SSR repeat sequences were found using MISA software (frequency of appearance 46.01%) (Fu et al., 2025). Yin et al. (2024) searched for SSR loci in 214,676 unigene sequences of the Hordeum brevisubulatum transcriptome, and detected 24,877 SSRs in 21,618 unigene sequences (frequency of appearance 11.59%). Wang et al. (2022) performed transcriptome sequencing on an Agropyron mongolicum hybrid and obtained 5620 SSR loci from 110,115 unigenes (frequency of appearance 5.10%). Compared with these gramineous plants, the SSR frequency of appearance in C. selloana is higher than for A. mongolicum (5.10%), but lower than for P. villosa, L. racemosa, N. porphyrocoma and H. brevisubulatum. The frequency of occurrence of SSR loci in C. selloana thus differs from that in the transcriptomes of other gramineous plants. Inherent differences in gene structure among species may explain these differences, although factors such as the size of the analysis database, SSR retrieval tools, and parameter settings of retrieval conditions cannot be excluded (Fu et al., 2021).
       
Regarding repeat types, C. selloana SSRs were dominated by mononucleotide (37.62%) and trinucleotide repeats (39.80%), with dinucleotides secondary (19.65%). Tetra- to hexanucleotide repeats collectively comprised < 3% of all SSRs. This distribution aligns with patterns in other Poaceae species, namely L. racemosa (mono, 36.88%; tri, 39.34%; di, 21.13%) and H. brevisubulatum (mono, 50.90%; tri, 28.78%; di, 17.48%) (Yin et al., 2021), confirming the prevalence of low-order repeats in grasses. Notably, C. selloana exhibited a markedly higher proportion of trinucleotide repeats (39.80%) than H. brevisubulatum (28.78%), suggesting enhanced variation potential in coding regions and greater utility for polymorphic marker development.
       
In terms of motif composition, the dominant motif types in C. selloana are consistent with those of Gramineae. Among the 171 repeat motif types detected in the C. selloana transcriptome, the mononucleotide A/T type accounts for the highest proportion (91.34%); dinucleotides are dominated by AG/CT (60.58%), and the trinucleotide CCG/CGG is also a high-frequency type (30.22%). This base preference may be related to the codon usage preference of Gramineae plants and the high mutation rate in AT-enriched regions of the transcriptome. The finding that AG/CT is the dominant dinucleotide motif and CCG/CGG the dominant trinucleotide motif in C. selloana is consistent with results for other Gramineae plants such as teosinte (Li et al., 2023) and Coix lacryma-jobi (Ouyang et al., 2021). This observation also supports research reporting the dominant repeat motifs of dinucleotides and trinucleotides in SSRs of most monocotyledonous plants to be AG/CT and CCG/CGG, respectively (Kantety et al., 2002). However, dominant motifs in plants such as Diospyros rhombifolia (Wang et al., 2022) and Brassica campestris chinensis var. purpuraria (Xi et al., 2022) differ, possibly because of species specificity or search conditions. Repeat motif types vary among species, and the dominant motifs within each repeat type also change, which is conducive to development of SSR molecular markers. The CCG/CGG motif also has the highest proportion in L. racemosa (31.32%) and N. porphyrocoma (20.26%), indicating its evolutionary conservatism in monocotyledons.
       
High polymorphism, the core value of SSR markers, depends on repeat unit counts and the abundance of ≥ 20 bp fragments. Generally, SSRs ≥ 20 bp exhibit high polymorphism, and those of 12-20 bp show moderate polymorphism. In C. selloana, mononucleotide repeat counts predominantly ranged 6-15 (92.94% of mononucleotide SSRs), and SSR lengths were concentrated in 12-40 bp (56.90% of all loci). Notably, 3329 SSR loci (16.10%) exceeded 20 bp. These longer repeats have enhanced polymorphism potential because of extended motif repetitions, making them promising targets for developing highly informative molecular markers.
               
Study limitations must be acknowledged. First, the SSR frequency in C. selloana (6.20%) is significantly lower than that of P. villosa (50.83%) and L. racemosa (46.01%), possibly because of sequencing depth or unigene assembly completeness. Second, tetra- to hexanucleotide repeats constituted < 3% of all SSRs, constraining development of complex polymorphic markers. Future efforts could integrate genomic sequencing data to mine genomic SSRs for enhanced marker density, and validate high-polymorphism loci through primer screening. These advances will support cultivar identification, genetic linkage map construction, and stress-resistance gene localization in C. selloana.
This study provides the first systematic characterization of SSR distribution patterns in the transcriptome of C. selloana, yielding three principal findings. The 20,684 identified SSR loci (frequency 6.20%) were dominated by mononucleotide (37.62%) and trinucleotide repeats (39.80%). Dominant motifs (A/T (mono-), AG/CT (di-) and CCG/CGG (tri-)) align with Poaceae evolutionary traits. High polymorphism potential for marker development exists, with 3329 SSR loci (16.10%) exceeding 20 bp length.
       
The widespread distribution and rich diversity of SSR loci in C. selloana represent a robust foundation for efficient marker development. Significant potential exists to apply these results in research on genetic diversity, molecular identification of cultivars, and development of novel molecular markers in this species.
RN W, YH W and YJ L designed the experiment, RN W, B X and SY Z performed the experiments and wrote the manuscript, XL Z revised the article. This study was supported by the Science and Technology Department General Project of Liaoning Province (2023-MSLH-303 and 2023-MSLH-305), Yingkou Enterprise Doctor Dual Innovation Program Project (YKSCJH2024-022) and the Research Project of Liaoning Agricultural Vocational and Technical College (LnzkB202320). The authors would like to express their gratitude to EditSprings for the expert linguistic services provided.
 
Disclaimers

The views and conclusions expressed in this article are solely those of the authors and do not necessarily represent the views of their affiliated institutions. The authors are responsible for the accuracy and completeness of the information provided, but do not accept any liability for any direct or indirect losses resulting from the use of this content.
 
Informed consent
 
The collection of plant seeds conforms to China’s regulatory standards and current laws.
The authors declare that there are no conflicts of interest regarding the publication of this article. No funding or sponsorship influenced the design of the study, data collection, analysis, decision to publish, or preparation of the manuscript.

  1. Das, T.R. and Baisakh, B. (2023). SSR-marker assisted evaluation of genetic diversity in greengram [Vigna radiata (L.) Wilczek]. Legume Research. 48(7): 1110-1116. doi: 10.18805/LR-5151.

  2. Fu, G., Liu, Y.P., Su, X. (2021). Analysis of SSR characteristics for Elsholtzia densa Benth. based on transcriptome data. Acta Botanica Boreali-Occidentalia Sinica. 41(4): 654-663. 

  3. Fu, G., Liu, Y.P., Su, X., et al. (2025). Analysis of SSR characterization in full-length transcriptome and development of SSR molecular markers for Littledalea racemosa. Acta Prataculturae Sinica. 34(7): 107-119.

  4. Huang, L.K., Yan, H.D., Zhao, X.X., et al. (2015). Identifying differentially expressed genes under heat stress and developing molecular markers in orchard grass (Dactylis glomerata L.) through transcriptome analysis. Molecular Ecology Resources. 15(6): 1497-1509.

  5. Kantety, R.V., La Rota, M., Matthews, D.E., et al. (2002). Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Molecular Biology. 48(5/6): 501-510.

  6. Li, H.B., Wu, Y., Zhu, K., Gui, Y.Y., Wei, J.J., Zhou, H. (2023). Analysis of SSR, SNP sequence features and phylogeny of the transcriptome of Narenga porphyrocoma (Hance) Bor. Journal of Southern Agriculture. 54(3): 849-858.

  7. Mao, X.R., Liu, Y.P., Su, X., et al. (2022). Characteristics analysis of simple sequence repeat (SSR) loci in Psammochloa villosa (Poaceae) based on transcriptome data. Acta Agrestia Sinica. 30(8): 1990-2001. 

  8. Ouyang, Y., Li, L.L., Shi, H.Y., Zhao, Y., Zhu, F.Z., Yin, Y.Y., Wang, L.Z. (2021). Transcriptome sequencing and gene function annotation of Coix lacryma-jobi L. var. ma-yuen Stapf. Central South Pharmacy. 19(7): 1286-1293. doi: 10.7539/j.issn.1672-2981.2021.07.004.

  9. Raghu, R., Ravikumar, R.L., Subramanya., Sunil, A.E. (2019). Cross transferability of Chickpea genic SSR markers developed from Fusarium wilt resistance loci to orphan legumes. Legume Research. 44(4): 388-400. doi: 10.18805/LR-4119.

  10. Singh, S., Singh, B., Sharma, V.R., Kumar, M., Sirohi, U. (2022). Assessment of genetic diversity and population structure in pea (Pisum sativum L.) germplasm based on morphological traits and SSR markers. Legume Research. 45(6): 683-688. doi: 10.18805/LR-4751.

  11. Sun, L.J., He, J.J., Wang, J., et al. (2022). Development of SSR markers based on full-length transcriptome sequencing and genetic diversity analysis of Halogeton glomeratus. Acta Prataculturae Sinica. 31(8): 199-210.

  12. Wang, X.Y., Fu, B.Z., Zhang, Z.Q., Mi, F.G. (2022). SSR and SNP characteristics analysis in Agropyron cristatum×A. desertorum cv. Hycrest-Mengnong based on transcriptome sequencing. Molecular Plant Breeding. 1-8.

  13. Xi, D.D., Gao, L., Li, X.F., Zhu, Y.Y., Zhu, H.F. (2022). Development and identification of SSR molecular markers based on transcriptome sequencing of caitai. Molecular Plant Breeding. https://kns.cnki.net/kcms/detail/46.1068.S.20220916.1333.050.html.

  14. Yan, R.J., Schnabel, K.E., Rowden, A.A., et al. (2020). Population structure and genetic connectivity of squat lobsters (Munida Leach, 1820) associated with vulnerable marine ecosystems in the southwest Pacific Ocean. Frontiers in Marine Science. 6: 791.

  15. Yan, Z.Z., Wu, F., Luo, K., Zhao, Y.F., Yan, Q., Zhang, Y.F., Wang, Y.R., and Zhang, J.Y. (2017). Cross-species transferability of EST-SSR markers developed from the transcriptome of Melilotus and their application to population genetics research. Sci. Rep. 7(1): 17959.

  16. Yin, H., Li, B., Zhang, M.Z., et al. (2021). Analysis of SSR sequence characteristics of Hordeum brevisubulatum transcriptome. Molecular Plant Breeding. 8: 1-9.

  17. Yin, H., Li, B., Zhang, M.Z., Fu, W.F., Sun, X., Gong, L. and Cui, G.W. (2024). Analysis of SSR sequence characteristics of Hordeum brevisubulatum transcriptome. Molecular Plant Breeding. 22(19): 6352-6358. 

  18. Zhang, Y.S. (2021). Study on tissue culture and ramet propagation of Cortaderia selloana ‘Pumila’ and Cortaderia selloana ‘Silver Comet’. Master Thesis, Guiyang, Gui Zhou, China. 

  19. Zhang, Z.Y., Xie, W.G., Zhao, Y.Q., et al. (2019). EST-SSR marker development based on RNA-sequencing of E. sibiricus and its application for phylogenetic relationships analysis of seventeen Elymus species. BMC Plant Biology. 19(1): 2.
Published In
Legume Research
