
Dual-classification of Alfalfa Genetic Materials and Salt Ion Types Based on Physiological Responses under Controlled Temperatures

Ugur Ozkan1,*
1Department of Field Crops, Faculty of Agriculture, Ankara University, Diskapi, Ankara.
  • Submitted: 18-06-2025
  • Accepted: 15-08-2025
  • First Online: 10-09-2025
  • doi: 10.18805/LRF-880

Background: In this study, a machine learning based approach was developed to classify alfalfa genetic materials (one cultivar and three synthetic genotypes) and different salt ion types [sodium chloride (NaCl), calcium chloride (CaCl2) and potassium chloride (KCl)] according to the physiological responses of these plants under three controlled temperatures.

Methods: The raw dataset included the following germination features: germination energy (GE), germination percentage (GP), germination index (GI), mean germination days (MGD), root length (RL), plumule length (PL), fresh weight (FW), dry weight (DW) and seedling vigor (SV). Classification models were trained on the dataset using Multilayer Perceptron (MLP), K-nearest Neighbors (KNN), Decision Tree, Random Forest, Gradient Boosted Trees (GBT), Extreme Gradient Boosting (XGBoost) and Tree Ensemble algorithms. Model performance was assessed using ten-fold cross-validation and evaluated on the test subset of each fold using statistical measures such as accuracy, error rate and Cohen’s kappa coefficient.

Result: The Tree Ensemble algorithm had the highest accuracy rates, 99.60% and 92.20%, for the classification of alfalfa genetic materials and salt ion types, respectively. Random Forest and XGBoost followed, with accuracy rates of up to 99.50% and 91.20% for the classification of alfalfa genetic materials and salt ion types, respectively. All Cohen’s kappa values were above 85.00%, indicating that the classes were distinguished with a high level of reliability. These findings indicate that alfalfa genetic materials can be classified more accurately than salt ion types. DW and PL emerged as the most important features for the classification of alfalfa genetic materials and salt ion types, respectively.

Alfalfa is one of the most important perennial forage legumes and is cultivated predominantly as a cross-pollinated, tetraploid (2n = 4x = 32) species (Ozkan and Benlioglu, 2021), even though a diploid form exists. Alfalfa is the most cultivated forage legume worldwide owing to its adaptability, high yield potential, good quality and tolerance of repeated cutting (Bhattarai et al., 2020). Türkiye, especially Eastern Türkiye, is one of the putative centers of diversity and origin suggested for alfalfa, which has important economic and cultural roles in Anatolian agriculture (Wang and Sakiroglu, 2021). In addition, more than 600 thousand ha were cultivated and 19 million tons of alfalfa were produced in Türkiye in 2024 (Turkstat, 2025a). Salinity stress is a major abiotic factor that limits crop production, especially in arid and semiarid regions (Rasool et al., 2013). In Türkiye, approximately 15187 km² of arable land is classified as saline or sodic soil under the FAO salinity classification framework (Abrol et al., 1988), constituting 5.48% of the total cultivable area. Butcher et al. (2016) projected that salinization may affect ~50% of global arable land by 2050, thereby threatening agricultural sustainability and creating food insecurity risks.
       
This may cause an ionic imbalance that limits water uptake by plants and impairs nutrient ion homeostasis (Szabolcs, 1989). Bertrand et al. (2015) pointed out that alfalfa is moderately tolerant to salinity compared to other forage plants. Munns and Tester (2008) also indicated that legumes have high salt tolerance. Various types of salt ions negatively impact alfalfa by causing a decrease in root and shoot length, as well as a reduction in both fresh and dry weight (Ozkan, 2025). Notwithstanding, significant heterogeneity exists in salinity stress tolerance across Medicago sativa cultivars and genotypes (Huang et al., 2018; Benabderrahim et al., 2020). Sodium chloride (NaCl), a neutral salt, has been widely studied and is considered to be the predominant salt used in salinity stress research on plants (Li et al., 2010; Ashrafi et al., 2018). However, despite their prevalence, relatively little attention has been paid to the effects of other salts, such as calcium and potassium, on crops such as alfalfa (Bhattarai et al., 2020). Understanding how these alternative salts influence plant physiology could provide a more comprehensive view of salinity stress and inform strategies to improve crop resilience.
       
Machine learning (ML) is now used in a broad range of agricultural applications and has become increasingly significant and efficient in recent years. Complicated data arise at several phases of agricultural production and are difficult to analyze with conventional approaches, which makes ML algorithms an attractive alternative. Because they can process enormous volumes of data, identify trends in data and support decision-making based on these patterns, ML algorithms have evolved into potent tools in agriculture (Gerdan Koc et al., 2024). These algorithms, particularly artificial neural networks, can build meaningful models from data and use them to carry out tasks such as classification, prediction and optimization effectively (Assani et al., 2023; Molina Menéndez and Parraga Alava, 2024). Classification is integral to all scientific disciplines, serving as a crucial tool for organizing knowledge across various applications. It facilitates the identification and differentiation of diverse entities, phenomena and concepts, thereby providing a structured framework (Rahman et al., 2018). In this regard, ML algorithms help to manage agricultural operations in a more sustainable, regulated and efficient way (Polat et al., 2025).
               
Numerous studies have been conducted on the selection of alfalfa under stressful conditions (Badran et al., 2015; Zhang et al., 2017; Song et al., 2019; Xia et al., 2020; Xia et al., 2021). Nevertheless, none of these studies investigated which features are most important and influential during the germination phase. Moreover, other studies that classified alfalfa genetic materials using ML algorithms relied on image data, satellite images, spectral features and/or vegetation indices and did not analyze physiological features (Chandel et al., 2021; Zhang et al., 2023; Zhao et al., 2023; Ma et al., 2024). This study aims to fill two gaps in the literature: (1) the non-use of physiological features in ML algorithms for classifying alfalfa genetic materials during the selection phase, here examined under different salt types (NaCl, CaCl2 and KCl) and their effects, and (2) the lack of analysis of the feature importance affecting genotype selection during the germination stage of alfalfa.
Alfalfa genetic materials and stress treatments
 
Four non-dormant alfalfa genetic materials were used: a cultivar CUF-101 and three synthetic genotypes; Genotype 3 (Gen3), Genotype 9 (Gen9) and Genotype 16 (Gen16). CUF-101 (fall dormancy class 9) was selected as a reference cultivar due to its established growth characteristics. The synthetic genotypes, derived from 192 Peruvian clones and adapted to Urfa Province, Türkiye, were selected based on shared non-dormancy characteristics, seedling vigor, seed yield, cutting yield and mature plant performance to ensure comparability with CUF-101.
       
A laboratory experiment was conducted using three controlled temperatures [Tmin (18 °C), Tmid (25 °C) and Tmax (32 °C)] and six salinity treatments of three salt ion types [100 and 200 mM of sodium chloride (NaCl), calcium chloride (CaCl2) and potassium chloride (KCl); n = 6] at the Department of Field Crops, Faculty of Agriculture, Ankara University between 2023 and 2024. CUF-101 and the three synthetic alfalfa genotypes were evaluated under all combinations of temperature and salinity. Treatments were designated as Tmin, Tmid and Tmax for temperature and NaCl, CaCl2 and KCl for salt ion type.
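The full factorial layout described above can be enumerated in a few lines. The sketch below is illustrative only: the variable names are hypothetical, and it assumes three replicates per combination, which is the replicate count the article reports for this experiment.

```python
from itertools import product

# Factor levels as named in the text; variable names are illustrative.
temperatures = ["Tmin", "Tmid", "Tmax"]            # 18, 25 and 32 °C
salts = ["NaCl", "CaCl2", "KCl"]
concentrations_mM = [100, 200]
materials = ["CUF-101", "Gen3", "Gen9", "Gen16"]
replicates = [1, 2, 3]                             # assumed: three replicates

# Every temperature x salt x concentration x material x replicate combination.
design = list(product(temperatures, salts, concentrations_mM, materials, replicates))

n_treatment_cells = len(design) // len(replicates)
print(n_treatment_cells, len(design))
```

Under these assumptions the design has 72 treatment cells and 216 observation rows.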
       
Three replicates of 50 seeds per alfalfa genetic material were placed between three moistened filter paper sheets, with solution volumes equivalent to three times the dry mass of the paper and supplemented with 5 ml of salt solution per paper. Seeds were incubated in sealed plastic bags in a growth chamber at controlled temperature (Tmin, Tmid, Tmax) under a 12/12 h light/dark photoperiod for 10 days. The filter papers were replaced every two days to prevent salt accumulation and contamination. Treatments comprised the three salt ion types applied to the four alfalfa genetic materials (one cultivar and three synthetic genotypes). Germination was recorded upon radicle emergence (≥2 mm) (Bhattarai et al., 2020). Germination energy (GE), germination percentage (GP), germination index (GI), mean germination days (MGD), root length (RL), plumule length (PL), fresh weight (FW), dry weight (DW) and seedling vigor (SV) were assessed during the germination process to classify salt ion types and alfalfa genetic materials. Germinated seeds were recorded after 10 days of exposure to stress conditions. After determining GE, GP and GI, 20 normal seedlings were randomly chosen to measure RL, PL, FW and DW. RL and PL were measured with a millimeter ruler, and FW and DW with an analytical balance.
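The article does not print the formulas it used for the germination features. As a hedged illustration, the sketch below computes two of them under commonly used definitions: GP as the share of sown seeds that germinated, and MGD as the count-weighted mean day of germination. Both definitions, the function names and the daily counts are assumptions, not values or methods taken from the study.

```python
def germination_percentage(germinated, sown):
    """GP (%) = germinated seeds / sown seeds * 100 (assumed definition)."""
    return 100.0 * germinated / sown

def mean_germination_days(daily_counts):
    """MGD = sum(day * newly germinated on that day) / total germinated.

    daily_counts[d - 1] holds the seeds that newly germinated on day d.
    This weighted-mean form is an assumed, commonly used definition.
    """
    total = sum(daily_counts)
    return sum(day * n for day, n in enumerate(daily_counts, start=1)) / total

# Hypothetical counts for one replicate of 50 seeds over the 10-day test.
daily = [0, 5, 15, 10, 5, 3, 2, 0, 0, 0]
gp = germination_percentage(sum(daily), sown=50)
mgd = mean_germination_days(daily)
print(gp, mgd)
```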
 
Data collection
 
In this study, three controlled temperatures (Tmin, Tmid, Tmax, n = 3), six salinity treatments of three salt ion types (100 and 200 mM NaCl, CaCl2 and KCl for each, n = 6) and four alfalfa genetic materials (CUF-101 and three alfalfa synthetic genotypes, n = 4) were used as treatments with three replicates (nrow = 216). In addition, three biological replicates per treatment-genetic material combination were used to ensure robustness against individual plant variability. This study uses the Synthetic Minority Over-sampling Technique (SMOTE) as a data augmentation technique to increase the sample count of underrepresented classes. Instead of copying existing examples, SMOTE creates new synthetic samples for the minority class by interpolating between the feature vector of a sample and those of its nearest neighbors within that class. These synthetic samples improve class representation, support more efficient training of classification models and enhance model robustness (Chawla et al., 2002).
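The interpolation idea behind SMOTE can be sketched with the standard library alone. This is a minimal illustration of the technique, not the implementation used in the study (which is not shown in the article); the function name, the choice of k and the feature values are hypothetical.

```python
import random

def smote_like_samples(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical (DW, PL) feature vectors for an under-represented class.
minority = [[0.10, 2.0], [0.12, 2.4], [0.11, 2.1], [0.14, 2.6]]
new_points = smote_like_samples(minority, n_new=5)
print(new_points)
```

Because each new point is a convex combination of two real samples, it always stays inside the range spanned by the minority class.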
 
Machine learning classification algorithms
 
Seven machine learning classification algorithms were selected for their proven efficacy in handling high-dimensional physiological data and complex feature interactions: Multilayer Perceptron (MLP), K-nearest Neighbors (KNN), Decision Tree, Random Forest, Gradient Boosted Trees (GBT), Extreme Gradient Boosting (XGBoost) and Tree Ensemble. They are presented, detailed and formulated (Eq. 1-8) below. All steps were implemented in Python using the NumPy, SciPy, Seaborn, Pandas, Matplotlib and Scikit-learn libraries and run in a Jupyter notebook.
       
Multilayer Perceptron (MLP) is one of the most basic and widely used types of artificial neural networks and is used for classification, regression and time series forecasting (Rumelhart et al., 1986). MLP consists of an input layer, one or more hidden layers and an output layer. The neurons in each layer are fully connected to all neurons in the previous layer and the learning process is carried out by updating the weights (Haykin, 1994). The network is trained using backpropagation and gradient descent algorithms. The goal is to minimize a loss function (e.g., cross entropy or MSE) that measures the difference between the true value and the prediction. MLP offers a powerful approach with the ability to model nonlinear functions. The output of each neuron is calculated as follows:

$$z = \varphi\left(\sum_{i} w_i x_i + b\right) \tag{1}$$

Where:
xi = Input features,
wi = Learned weights,
b = Bias term,
φ(.) = Activation function (e.g. ReLU, sigmoid, tanh),
z = Output of the neuron.
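The neuron output defined above, an activation applied to the weighted sum of the inputs plus a bias, can be sketched as follows. The input, weight and bias values are made up for illustration.

```python
import math

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(x, w, b, activation):
    """z = activation(sum_i w_i * x_i + b)."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(weighted_sum)

# Illustrative values only; the weighted sum here is about -0.4.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

z_relu = neuron_output(x, w, b, relu)        # negative sum is clipped to 0
z_sigmoid = neuron_output(x, w, b, sigmoid)  # squashed into (0, 1)
print(z_relu, z_sigmoid)
```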
       
K-nearest neighbors (KNN) is a nonparametric, instance-based algorithm used in supervised learning applications including classification and regression. Based on the similarity between data points, this lazy learning approach generates predictions without an explicit training phase (Halder et al., 2024). Using a distance metric, most usually the Euclidean distance, the algorithm finds the "k" closest training instances for a new input and predicts the output based on their labels (Fix, 1985). Given a training set {(x1, y1), ..., (xN, yN)}, the Euclidean distance between a new input x' and each training sample xi is computed to identify the closest neighbors, enabling prediction through proximity-based inference. This metric is especially effective when working with continuous and normalized datasets. The Euclidean distance between two data points is calculated by the following formula:

$$d(x', x_i) = \sqrt{\sum_{j=1}^{n} \left(x'_j - x_{ij}\right)^2} \tag{2}$$

Where
x' = (x'1, x'2, ..., x'n) = Feature vector of the new observation,
xi = (xi1, xi2, ..., xin) = Features of the ith sample in the training set,
n = Number of features,
d(x', xi) = Euclidean distance between the new observation and the training sample.
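As a small illustration of the distance-then-vote procedure described above, the sketch below computes Euclidean distances and predicts by majority label among the k nearest training samples. The training points and labels are invented for the example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train, new_x, k=3):
    """train is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda pair: euclidean(new_x, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples labelled by salt ion type.
train = [
    ([1.0, 1.0], "NaCl"), ([1.2, 0.9], "NaCl"),
    ([3.0, 3.1], "KCl"), ([3.2, 2.9], "KCl"), ([2.9, 3.0], "KCl"),
]

d = euclidean([0.0, 0.0], [3.0, 4.0])
label = knn_predict(train, [1.1, 1.0], k=3)
print(d, label)
```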
       
Decision trees (DT) are hierarchical, rule-based models used in supervised learning for both classification and regression. The model recursively splits data into subsets based on feature values, forming a tree structure where decisions are made at internal nodes and final predictions at leaf nodes. In classification tasks, splitting is typically guided by criteria such as information gain or the Gini index (Quinlan, 1986; Breiman, 2001). The Gini index measures the impurity of the class labels in a node and is calculated with the following formula:

$$Gini = 1 - \sum_{i=1}^{c} p_i^2 \tag{3}$$

Where
c = The number of class labels,
pi = The proportion of samples in the node that belong to class i.
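A short sketch of the Gini index as one minus the sum of squared class proportions in a node. The label lists are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """1 - sum_i p_i^2, with p_i the proportion of class i in the node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

pure = gini_impurity(["Gen3"] * 4)                        # single class
mixed = gini_impurity(["Gen3", "Gen3", "Gen9", "Gen9"])   # 50/50 split
print(pure, mixed)
```

A pure node scores 0, and an evenly mixed two-class node scores 0.5, the maximum for two classes.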
       
Random Forest is a tree-based ensemble learning algorithm used for classification and regression tasks. It constructs multiple decision trees using data subsets and the bagging method, combining their outputs to improve model performance (Breiman, 2001). Each tree votes for a class and the final prediction is determined by majority vote. The predicted class label is the class suggested by the largest number of trees:

$$\hat{y} = \arg\max_{c \in C} \sum_{t=1}^{T} \mathbb{I}\left(h_t(x) = c\right) \tag{4}$$

Where
C = Class set,
T = Total number of trees,
ht(x) = Prediction of the tth decision tree,
I(.) = Indicator function; returns 1 if the condition is true, 0 otherwise,
ŷ = Class with the most votes (most predicted).
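The majority vote can be written directly as a sum of indicator terms per class. The sketch below assumes the per-tree predictions are already available; the vote list is invented.

```python
def predict_by_vote(classes, tree_predictions):
    """Count, for each class c, the trees with h_t(x) == c and return the
    class with the largest count together with all counts."""
    counts = {c: sum(1 for h in tree_predictions if h == c) for c in classes}
    return max(counts, key=counts.get), counts

classes = ["CUF-101", "Gen3", "Gen9", "Gen16"]
votes = ["Gen16", "Gen3", "Gen16", "Gen16", "Gen9"]  # hypothetical tree outputs

y_hat, counts = predict_by_vote(classes, votes)
print(y_hat, counts)
```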
       
Gradient boosted trees (GBT) are based on boosting, an iterative ensemble method that combines multiple weak learners (classifiers that perform slightly better than random guessing) into a strong predictor (Schapire, 1990; Polikar, 2012). Unlike bagging, which treats all training instances equally via bootstrap sampling, boosting focuses more on instances misclassified by previous learners. Gradient Boosting, a specific form of this approach, builds decision trees sequentially to minimize prediction error. Boosted Trees, resulting from this combination, offer improved classification performance (Gupte et al., 2014). The fundamental concept is that every new model progressively reduces the errors of the previous models as measured by the loss function; hence, the prediction function is updated iteratively as follows:

$$F_m(x) = F_{m-1}(x) + \gamma_m h_m(x) \tag{5}$$

Where
Fm(x) = The updated prediction,
Fm-1(x) = The prediction of the previous iteration,
hm(x) = Weak learner trained on the negative gradient of the loss function,
γm = Learning rate controlling the contribution of each new learner.
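To make the iterative update concrete, here is a toy sketch in which each weak learner is a one-threshold stump fitted to the current residuals. It assumes a squared-error loss, for which the negative gradient equals the residual, and a constant learning rate; the data, names and settings are illustrative and unrelated to the study's models.

```python
def fit_stump(xs, residuals):
    """Least-squares regressor with a single threshold on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm

def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    base = sum(ys) / len(ys)  # F_0: constant prediction
    learners = []

    def predict(x):
        # F_m(x) = F_0 + sum over fitted learners of rate * h(x)
        return base + sum(learning_rate * h(x) for h in learners)

    for _ in range(n_rounds):
        # For 0.5 * (y - F)^2 the negative gradient is the residual y - F.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        learners.append(fit_stump(xs, residuals))
    return predict

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9]
model = gradient_boost(xs, ys)

mean_y = sum(ys) / len(ys)
mse_baseline = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_boosted = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(mse_baseline, mse_boosted)
```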
       
Extreme gradient boosting (XGBoost) is a high-performance ensemble algorithm based on Gradient Boosted Decision Trees (GBDT), optimized for speed, scalability and efficiency (Brownlee, 2016; Chen and Guestrin, 2016). It uses a Taylor expansion for more accurate loss optimization and includes an L2 regularization term to control overfitting. Its support for parallel computing, hyperparameter tuning and resource efficiency makes it effective for large-scale datasets (Zhuo et al., 2025). The objective function of this algorithm is as follows:

$$Obj = \sum_{i=1}^{N} l\left(y_i, \hat{y}_i\right) + \sum_{m=1}^{M} \Omega\left(f_m\right) \tag{6}$$

Where
l(yi, ŷi) = Loss function between true value yi and prediction ŷi,
fm = New decision tree learned in the mth iteration,
Ω(f) = Regularization term for model complexity.
       
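XGBoost uses the quadratic (second-order) Taylor series expansion to optimize the above objective function, thus providing more efficient and precise optimization:

$$Obj^{(m)} \approx \sum_{i=1}^{N} \left[ l\left(y_i, \hat{y}_i^{(m-1)}\right) + g_i f_m(x_i) + \frac{1}{2} h_i f_m^2(x_i) \right] + \Omega\left(f_m\right) \tag{7}$$

Where

$$g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{(m-1)}\right)}{\partial\, \hat{y}_i^{(m-1)}}, \qquad h_i = \frac{\partial^2 l\left(y_i, \hat{y}_i^{(m-1)}\right)}{\partial \left(\hat{y}_i^{(m-1)}\right)^2}, \qquad \Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$

gi, hi = First- and second-order derivatives of the loss with respect to the previous prediction,
Ω(f) = Regularization term,
T = Number of leaves in the tree,
wj = Score of the jth leaf,
γ, λ = Regularization control parameters.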
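One consequence of the second-order objective with an L2 penalty on leaf scores is a closed-form best score for each leaf, w* = -G / (H + λ), where G and H are the sums of the first and second derivatives over the samples in that leaf. This standard result comes from Chen and Guestrin (2016) rather than from the present article; the sketch below assumes a squared-error loss and invented targets.

```python
def optimal_leaf_weight(gradients, hessians, reg_lambda):
    """w* = -G / (H + lambda), with G and H summed over the leaf's samples."""
    G, H = sum(gradients), sum(hessians)
    return -G / (H + reg_lambda)

def leaf_objective(gradients, hessians, reg_lambda, gamma):
    """Objective contribution of one leaf at w*: -0.5 * G^2 / (H + lambda) + gamma."""
    G, H = sum(gradients), sum(hessians)
    return -0.5 * G * G / (H + reg_lambda) + gamma

# Squared-error loss 0.5 * (y - y_hat)^2 gives g = y_hat - y and h = 1.
y_true = [1.0, 2.0, 3.0]
y_prev = [0.0, 0.0, 0.0]          # predictions before adding the new tree
g = [p - y for p, y in zip(y_prev, y_true)]
h = [1.0] * len(y_true)

w_star = optimal_leaf_weight(g, h, reg_lambda=1.0)
obj = leaf_objective(g, h, reg_lambda=1.0, gamma=0.0)
print(w_star, obj)
```

With λ = 1 the leaf score is 1.5 rather than the unregularized mean residual of 2.0, which shows how the L2 term shrinks leaf scores toward zero.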
       
Tree ensemble learning is a modeling approach based on ensemble learning strategies, where multiple decision trees work together. The aim of this algorithm is to overcome the limited capacity of a single decision tree by combining multiple models, thus increasing both accuracy and generalization ability. This approach encompasses both bagging (e.g. Random Forest) and boosting (e.g. Gradient Boosting, XGBoost) strategies (Zhou, 2025). The general formula is as follows:

$$\hat{y}(x) = \sum_{m=1}^{M} \gamma_m h_m(x) \tag{8}$$

Where,
ŷ(x) = Final prediction of the model,
hm(x) = mth decision tree (or weak learner),
γm = Weight of the mth model,
M = Total number of trees.
 
Model evaluation
 
Cohen’s kappa, accuracy, recall, precision, specificity and F1-score were the performance measures computed for the classification of alfalfa genetic materials and salt ion types. Ten-fold cross-validation was used to assess model performance (Fig 1): the dataset was split into ten equal parts and, in every iteration, nine parts were used for training and one part for testing (MacCallum et al., 1992). The final results were obtained by averaging the measurements over all folds. Feature importance was reported for the algorithm that achieved the best results in the classification of alfalfa genetic materials and salt ion types (Fig 3).

Fig 1: Ten-fold cross validation.
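The ten-fold procedure can be sketched as an index generator. This is an illustrative standard-library version, not the code used in the study; the sample count of 216 corresponds to the raw design size reported in the text, and the shuffling seed is arbitrary.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal parts
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(216, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so averaging a metric over the ten folds uses every observation once for testing.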


 
Cohen’s kappa is calculated as follows:

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

Po = Relative observed agreement among raters.
Pe = Hypothetical probability of agreement occurring by chance.
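Cohen's kappa compares the observed agreement Po with the chance agreement Pe as (Po - Pe) / (1 - Pe). The sketch below computes it from a confusion matrix whose rows are true classes and whose columns are predicted classes; the counts are invented.

```python
def cohens_kappa(confusion):
    """kappa = (Po - Pe) / (1 - Pe) from a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    n = len(confusion)
    # Observed agreement: share of samples on the diagonal.
    p_o = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals, summed per class.
    p_e = sum(
        (sum(confusion[i]) / total) * (sum(row[i] for row in confusion) / total)
        for i in range(n)
    )
    return (p_o - p_e) / (1 - p_e)

confusion = [[45, 5],
             [10, 40]]  # hypothetical two-class result
kappa = cohens_kappa(confusion)
print(kappa)
```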
       
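Accuracy is a metric that indicates the overall ability of the system to predict correctly. It can be defined as follows:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

TP: True positive; the model correctly predicts the positive class.
TN: True negative; the model correctly predicts the negative class.
FP: False positive; the model incorrectly predicts the positive class.
FN: False negative; the model incorrectly predicts the negative class.

Recall, also called sensitivity or true positive rate, is the proportion of actual positives that are correctly predicted.

$$Recall = \frac{TP}{TP + FN}$$

True negative rate is another term for specificity.

$$Specificity = \frac{TN}{TN + FP}$$

The precision or positive predictive value (PPV) is a metric that quantifies the accuracy of the system’s positive predictions.

$$Precision = \frac{TP}{TP + FP}$$

F1-score is a combined measure of performance, the harmonic mean of the precision and recall measures.

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$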
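The measures in this section follow directly from the four confusion-matrix counts under their standard definitions. The sketch below uses invented counts for a single class treated as positive against all others.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard definitions based on TP, TN, FP and FN counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)            # sensitivity, true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)         # positive predictive value
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

m = binary_metrics(tp=40, tn=45, fp=5, fn=10)  # hypothetical counts
print(m)
```

For a multi-class problem such as the four genetic materials, these values are typically computed once per class in a one-vs-rest fashion.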

Overall classification performance and confusion matrix of the alfalfa genetic materials and salt ion types
 
In the classifications made according to salt ion type, the model performance decreased slightly compared to the alfalfa genetic materials. However, high accuracy rates have generally been observed (Table 1).

Table 1: Accuracy, error, Cohen’s kappa value for alfalfa genetic materials and salt ion types for ML algorithms.


       
For the classification of alfalfa genetic materials, the Tree Ensemble algorithm achieved the peak accuracy of 99.60%, followed by Random Forest (99.50%), XGBoost (99.50%), GBT (99.00%) and DT (97.80%) (Table 1). For salt ion type classification, Tree Ensemble again performed best with an accuracy of 92.20%, followed by XGBoost (91.20%), DT (90.90%) and Random Forest (90.60%). KNN and MLP showed the lowest accuracies (85.60% and 83.10%, respectively), consistent with Pazoki and Pazoki (2011), who reported 86.48% accuracy for wheat cultivar classification. The models nevertheless remained robust to stress variability across treatments (e.g., NaCl vs. KCl). This study’s approach, which was based on physiological responses, outperformed Ghamari (2012), who used morphological responses in chickpeas (79.00% accuracy). However, Ajaz and Hussain (2015) achieved a higher accuracy of 95.20% for wheat classification using MLP, likely because of simpler class distinctions. Tree-based ensembles consistently performed well, emphasizing their suitability for complex biological data involving interacting stressors (salt, temperature and genetics). The dominance of Tree Ensemble also aligns with its hybrid architecture, which combines bagging (Random Forest) and boosting (XGBoost) to mitigate overfitting while capturing nonlinear interactions (Brownlee, 2016; Chen and Guestrin, 2016). In addition, the results show that these methods can successfully learn complex structures in biological data and improve classification accuracy.
       
The models clearly captured class-specific patterns, reflecting their capacity to generalize over stress variability, even if salinity stress-induced responses overlap with physiological responses. Near-perfect genetic material classification likely originates from fixed genomic differences, whereas salt ion confusion (e.g., NaCl vs. KCl) reflects shared ion-specific toxicity pathways. This emphasizes the strength and flexibility of the ML algorithms used in difficult biological classification problems. Along with overall accuracy, the model performance at the class level was assessed using a confusion matrix. The confusion matrices for alfalfa genetic material and salt ion type classifications are shown in Fig 2.

Fig 2: Confusion matrix of alfalfa genetic materials (a), Salt ion types (b).


       
The average confusion matrices obtained from ten-fold cross-validation are presented in Fig 2. The intraclass accuracy was at least 98.90% for alfalfa genetic material (CUF-101, 99.50%; Gen3, 98.90%; Gen9, 99.40%; Gen16, 99.70%), with interclass confusion <0.60% (Fig 2). Misclassifications occurred primarily between Gen3 and Gen9, reflecting their shared genetic background. CaCl2, KCl and NaCl, the salt ion types in this study, showed high but differing accuracies (89.30%, 91.90% and 90.70%, respectively). This indicates that the model can distinguish between alfalfa genetic materials with higher success rates than between salt ion types. Cross-prediction errors occurred between CaCl2 and KCl (5.50%) and between CaCl2 and NaCl (6.20%), indicating overlapping physiological responses in alfalfa genetic material. This overlap is consistent with shared osmotic stress pathways across chloride salts (Baha, 2022), in which NaCl and KCl induced near-identical reductions in GP and MGD. CaCl2 and NaCl also caused statistically similar RL and DW responses in alfalfa (Gao et al., 2023). These findings point to physiological processes whereby moderate salinity levels from various salt ion types may induce similar osmotic stress or ion toxicity effects (Quan et al., 2021), producing similar patterns in RL, DW and FW.
       
The models demonstrated a high capacity for classification of alfalfa genetic materials and salt ion types, with correct classification rates exceeding 90.00%. These findings indicate that the developed models exhibit balanced and reliable performance in terms of both overall accuracy and class-level accuracy. For alfalfa genetic material classification, the Tree Ensemble, Random Forest and XGBoost algorithms achieved accuracy rates exceeding 99.00% with error rates of 0.50% or less. These algorithms yielded balanced results at the class level, with correct classification rates for each alfalfa genetic material exceeding 98.00%. This suggests that the alfalfa genetic materials were physiologically more distinct and therefore easier to classify. Higher accuracy values in alfalfa genetic material classification (Table 1) showed that the models effectively learned patterns associated with genetic material-specific responses, including weight values, germination features and growth dynamics. These intergenotypic variations provide models with robust decision boundaries that facilitate high-performance classification. The accuracy rates decreased for salt ion type classification, because different salt ion types can induce partially overlapping physiological responses in plants. Even so, the Tree Ensemble, XGBoost and Random Forest algorithms achieved satisfactory classification, with an accuracy of over 90% (Table 1). These results indicate that salinity-induced physiological responses have a learnable structure.

Class-wise evaluation of classification algorithms for alfalfa genetic materials and salt ion types
 
For each class, the performance of several ML algorithms was tested on alfalfa genetic materials and salt ion types. Tree-based ensemble algorithms, such as Random Forest, XGBoost and GBT, generally achieved high accuracy, showing that they are good at handling complex data (Table 2). The models accurately classified alfalfa genetic material, reflecting differences in genetic responses. They did not perform as well with salt ion types because different salt ions provoke similar stress reactions. Nevertheless, the models were fairly accurate, suggesting that some responses to salt stress assist in the classification. Tables 2 and 3 provide a detailed look at how each algorithm distinguishes between the alfalfa genetic materials and between the salt ion types. Variations in genetic material are easier to classify because they show clear patterns, whereas the effects of different salt ions are difficult to separate. Overall, the models performed well and were balanced across classes. The algorithms were evaluated using overall accuracy and detailed metrics, such as recall, precision, specificity, F1-score and sensitivity for each class, as described by Koklu and Ozkan (2020) and Kautz et al. (2017).

Table 2: Performance metrics of alfalfa genetic materials for ML algorithms.



Table 3: Performance metrics of salt ion types for ML algorithms.


       
The highest performance in alfalfa genetic material classification was obtained with the Tree Ensemble, XGBoost and Random Forest algorithms (Table 2). In particular, the Tree Ensemble algorithm achieved recall, precision and F1-score values greater than 99.00% for all alfalfa genetic materials. Similarly, the XGBoost and Random Forest algorithms achieved high and balanced class-level success in alfalfa genetic materials, especially for Gen9 and CUF-101, with F1-score values above 0.995. GBT and DT yielded similar results, whereas MLP and KNN were distinguished by their lower class-level performance values, as in Pazoki and Pazoki (2011). When evaluated at the class level, most of the models reached precision, recall and F1-score values above 99.00% in alfalfa genetic material classification, whereas these values remained in the range of 89.00-92.00% in salt ion type classification.
       
Table 3 shows the performances of the algorithms at the class level for the different salt ion types (CaCl2, KCl and NaCl). Tree Ensemble showed the strongest performance here as well, achieving high class success rates with recall = 0.937 and F1-score = 0.929 for NaCl. KCl in particular was classified with very high accuracy by both XGBoost and Tree Ensemble (recall > 0.93). The class-level performance was slightly lower for salt ion types than for alfalfa genetic material classification. This difference could be explained by the physiological responses to different salt ion types, which often overlap and are less distinct than the more consistent and marked variations observed between alfalfa genetic materials. Supporting the results of this study, Gao et al. (2023) and Baha (2022) reported near-identical germination responses (GP, MGD, RL and FW) under CaCl2-NaCl and NaCl-KCl comparisons. The higher misclassification between NaCl and KCl reflects shared Cl⁻ toxicity pathways and osmotic effects, which dominate physiological responses at moderate salinity.
       
Feature importance analysis, based on normalized information gain averaged over the ten cross-validation folds, revealed context-dependent predictive contributions (Fig 3). In alfalfa genetic material classification, DW showed maximal influence (29.20%), followed by GE (24.60%), GP (18.00%), SV (16.90%) and FW (16.50%), whereas GI, RL and PL showed minimal influence (<5.00% combined). Features linked to physiological responses during germination, such as DW and GE, are therefore more important in alfalfa genetic material classification. Gao et al. (2023) stated that genetic variations showed stronger physiological responses in these features at the germination stage. Okumuş et al. (2024) noted that, in some ML algorithms, germination features of forage pea cultivars such as DW, FW and RL were more important than salt doses and controlled temperatures. DW, referred to as total dry mass in their study, was also one of the most important features for the selection of soybean genotypes under salinity and drought conditions in the study of de Oliveira et al. (2023). The consistency in the importance of DW across different legume species under stress conditions suggests its potential as a reliable indicator of stress tolerance. This finding could have significant implications for breeding programs aimed at developing cultivars that are more resilient. Germination features such as GI, MGD, RL and PL show lower influence in the classification of alfalfa genetic materials.

Fig 3: Average feature importance based on information gain ranking of alfalfa genetic materials (a) and Salt ion types (b).
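Information gain, the ranking criterion named for Fig 3, is the reduction in label entropy achieved by splitting on a feature. The article does not state how gains were normalized, so the sketch below simply divides each gain by their sum; that choice, the thresholds and the feature values are all assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy before the split minus the size-weighted entropy after it."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    after = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - after

# Hypothetical measurements for four seedlings of two genotypes.
labels = ["Gen3", "Gen3", "Gen16", "Gen16"]
dw = [0.10, 0.11, 0.20, 0.21]   # separates the genotypes cleanly
rl = [2.0, 3.0, 2.1, 3.1]       # does not separate them at all

gains = {
    "DW": information_gain(dw, labels, threshold=0.15),
    "RL": information_gain(rl, labels, threshold=2.5),
}
total = sum(gains.values())
normalized = {name: gain / total for name, gain in gains.items()}  # assumed scheme
print(gains, normalized)
```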


       
In salt ion type classification, the ranking of features by normalized information gain differed. PL (13.20%) and RL (11.70%) showed the greatest influence on salt ion types, followed by FW (11.00%) and DW (10.40%). These features reflect physiological adaptations and stress reactions caused by different salt ions. Although DW was not as dominant as in alfalfa genetic material classification, it still contributed to model decisions. In the classification of salt ion types, length measurements taken during germination were therefore the most influential. These changes in feature importance offer new insights into model behavior and plant responses, showing which features are more important under different stress conditions (Molnar, 2020). The learning process prioritizes important features based on context, supporting the biological understanding of classification results.
 
Heat map analysis of classification alfalfa genetic materials under salt ion types and controlled temperatures
 
The goal of these analyses was to find out which synthetic genotypes are classified better or worse under particular stress conditions and to see how stress-specific responses affect the model. Two separate heat map analyses were carried out to see how stress factors affect the classification performance of alfalfa genetic materials (Fig 4 and 5).

Fig 4: Heat map analysis of the classification of alfalfa genetic materials under salt ion type (NaCl, CaCl2, KCl) conditions.



Fig 5: Heat map of alfalfa genetic materials classification ability levels under temperatures (Tmin, Tmid, Tmax).


       
Heat map analysis of the model outputs revealed that alfalfa genetic material classification performance varied according to the salt ion type applied. Gen16, especially under the CaCl2 treatment, reached the highest number of correct classifications with 479 samples and was the synthetic genotype that could be separated most clearly by the model under these conditions. In general, alfalfa genetic material had a high level of classification performance under all three salt ion types.
       
Alfalfa genetic material classification performance under the three temperature conditions (Tmin, Tmid and Tmax) was analyzed comparatively, together with its implications for model stability (Fig 5). Gen16 had the highest number of correct classifications, 487 samples, in the Tmid condition. This indicates that Tmid is the condition under which Gen16 is most clearly classified by the model. However, the number of correct classifications decreased to 410 samples in the Tmin condition, suggesting that low temperatures both weaken biological responses and limit the model’s classification ability. Gen3 had the most correct classifications, 479 samples, under Tmax, indicating that the model could successfully classify this synthetic genotype even under high-temperature conditions. When both stress conditions were evaluated together, the classification performance varied not only with the alfalfa genetic material but also with the type of stress to which the material was exposed, and some synthetic genotypes could be classified more clearly under certain conditions. This finding is consistent with studies showing that alfalfa genetic material responds to environmental stress in specific ways and that temperature tolerance varies depending on alfalfa genetic material (Parent and Tardieu, 2012; Basbag et al., 2017).
 
Decision trees based on alfalfa genetic materials and salt ion types
 
Decision trees enable the interpretation and intuitive evaluation of model decisions by visualizing how features are used in the classification model (Blockeel et al., 2023; de Oliveira et al., 2023; Zhao et al., 2024). The decision tree structure shows how the model proceeds in alfalfa genetic material classification and which features are prioritized (Fig 6 and 7). The root node is DW, indicating that this response is the primary criterion for separating alfalfa genetic material. After the first split on DW, the model turned to features such as FW, GE, GI, MGD and SV at the second level. This indicates that weight features play a significant role in the classification of alfalfa genetic material. The Gen3 and Gen16 synthetic genotypes were separated through different sub-pathways; GE and GI were used in some branches, whereas FW and SV were used in others. At the terminal (leaf) nodes, classification accuracy was high (close to 100%) and was achieved with few decision rules.
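To illustrate why one feature ends up at the root node, the sketch below selects the binary split with the lowest weighted impurity. It assumes Gini impurity as the split criterion, which the text does not specify, and uses hypothetical toy values for DW and FW rather than measurements from the study.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels, feature_names):
    """Return (feature, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted(set(row[j] for row in rows))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, t, score)
    return best


# Hypothetical toy data: columns are (DW, FW); values are illustrative only.
rows = [(0.010, 0.20), (0.012, 0.35), (0.020, 0.22), (0.022, 0.33)]
labels = ["Gen3", "Gen3", "Gen16", "Gen16"]

root = best_split(rows, labels, ["DW", "FW"])
print(f"root split: {root[0]} <= {root[1]:.3f} (weighted Gini {root[2]:.2f})")
```

In this toy case DW separates the two classes completely, so it is chosen as the root; a real tree repeats the same search recursively within each branch.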

Fig 6: Decision tree based on alfalfa genetic materials.



Fig 7: Decision tree based on salt ion types.


               
The decision structure of the tree-based model for salt ion type classification (CaCl2, NaCl and KCl) is presented in Fig 7. PL at the root node indicates that this feature was the first decision criterion for salt ion type classification. After the initial PL split, the model focused on features such as FW and MGD at the second level. This structure shows that physiological responses to salt stress are important for classification ability. In the decision path on the lower left branch, samples belonging to the KCl class were classified with high accuracy. By contrast, more complex decision paths appear to be required for CaCl2, with the model building deeper structures to separate this class.
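One way to quantify how deep a tree must go to separate each class is to record the depth of every leaf by class. The sketch below does this for a hypothetical nested-tuple tree; the structure, features and thresholds are invented for illustration and do not reproduce Fig 7.

```python
from collections import defaultdict


def leaf_depths(node, depth=0, out=None):
    """Collect leaf depths per class label.

    A node is either a class label (str) or a tuple
    (feature, threshold, left_subtree, right_subtree).
    """
    if out is None:
        out = defaultdict(list)
    if isinstance(node, str):
        out[node].append(depth)
        return out
    _feature, _threshold, left, right = node
    leaf_depths(left, depth + 1, out)
    leaf_depths(right, depth + 1, out)
    return out


# Hypothetical tree: PL at the root, then FW and MGD at the second level.
toy_tree = (
    "PL", 2.0,
    ("FW", 0.3, "KCl", "NaCl"),
    ("MGD", 3.0,
        ("FW", 0.25, "CaCl2", "NaCl"),
        ("PL", 4.0, "CaCl2", ("MGD", 4.5, "CaCl2", "NaCl"))),
)

depths = leaf_depths(toy_tree)
for salt, ds in depths.items():
    print(salt, "mean leaf depth:", sum(ds) / len(ds))
```

A class whose leaves sit deeper on average needs more decision rules to isolate, which is the sense in which a class can require "deeper structures".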
This study introduces a novel ML framework for classifying alfalfa genetic material and salt ion types using germination physiology under controlled temperatures. The results obtained from the ML algorithms revealed that alfalfa genetic materials can be classified more easily than salt ion types. Genetic differences produced distinct physiological signatures, enabling near-perfect classification, whereas salt ion types (NaCl, CaCl2 and KCl) showed greater misclassification. Genetic differences were found to trigger strong and varied physiological responses in alfalfa during germination. In contrast, moderate salinity levels from different salt ions can lead to similar osmotic stress or ion toxicity effects, which complicates the classification of salt ion types at low concentrations. Therefore, to classify the effects of salt ion type effectively, it is essential to assess physiological responses at higher salt concentrations. Dry weight (DW) was identified as the most important feature for classifying alfalfa genetic material, whereas plumule length (PL) was the primary feature for classifying salt ion types. Further research exploring additional salt ion types, higher concentrations and additional physiological responses could enhance our understanding of salt tolerance mechanisms in alfalfa and contribute to more effective breeding strategies. Stress intensities beyond osmotic thresholds (>200 mM) should be tested to amplify ion-specific responses and to validate DW and PL as selection features in field trials. This approach may accelerate salt-resilient alfalfa breeding by identifying context-critical traits.
The present study was not supported by any institute.
 
Disclaimers
 
The authors are responsible for the accuracy and completeness of the information provided, but do not accept any liability for any direct or indirect losses resulting from the use of this content.
 
Data availability statement
 
Data are available from the corresponding author upon request.
The author declares that he has no conflicts of interest.

  1. Abrol, I., Yadav, J.S.P. and Massoud, F. (1988). Salt-affected soils and their management. Food and Agriculture Org. 39.

  2. Ajaz, R.H. and Hussain, L. (2015). Seed classification using machine learning techniques. Seed. 2(5): 1098-1102. 

  3. Ashrafi, E., Razmjoo, J. and Zahedi, M. (2018). Effect of salt stress on growth and ion accumulation of alfalfa (Medicago sativa L.) cultivars. Journal of Plant Nutrition. 41(7): 818-831. https://doi.org/10.1080/01904167.2018.1426017.

  4. Assani, N., Matić, P., Kaštelan, N. and Čavka, I.R. (2023). A review of artificial neural networks applications in maritime industry. IEEE Access. 11: 139823-139848. 10.1109/ACCESS.2023.3341690.

  5. Badran, A., ElSherebeny, E.A. and Salama, Y. (2015). Performance of some alfalfa cultivars under salinity stress conditions. Journal of Agricultural Science. 7(10): 281. http://dx.doi.org/10.5539/jas.v7n10p281.

  6. Baha, N. (2022). Comparative effects of osmotic and salt stresses on germination and seedling growth of alfalfa: Physiological responses involved. Agriculturae Conspectus Scientificus. 87(4): 311-319. 

  7. Basbag, M., Aydin, A. and Sakiroglu, M. (2017). Evaluating agronomic performance and investigating molecular structure of drought and heat tolerant wild alfalfa (Medicago sativa L.) collection from the Southeastern Turkey. Biochemical Genetics. 55: 63-76. https://doi.org/10.1007/s10528-016-9772-7.

  8. Benabderrahim, M.A., Guiza, M. and Haddad, M. (2020). Genetic diversity of salt tolerance in tetraploid alfalfa (Medicago sativa L.). Acta Physiologiae Plantarum. 42(1): 5. https://doi.org/10.1007/s11738-019-2993-8.

  9. Bertrand, A., Dhont, C., Bipfubusa, M., Chalifour, F.P., Drouin, P. and Beauchamp, C.J. (2015). Improving salt stress responses of the symbiosis in alfalfa using salt-tolerant cultivar and rhizobial strain. Applied Soil Ecology. 87: 108-117. https://doi.org/10.1016/j.apsoil.2014.11.008.

  10. Bhattarai, S., Biswas, D., Fu, Y.B. and Biligetu, B. (2020). Morphological, physiological and genetic responses to salt stress in alfalfa: A review. Agronomy. 10(4): 577. https://doi.org/10.3390/agronomy10040577.

  11. Blockeel, H., Devos, L., Frénay, B., Nanfack, G. and Nijssen, S. (2023). Decision trees: From efficient prediction to responsible AI. Frontiers in Artificial Intelligence. 6: 1124553. https://doi.org/10.3389/frai.2023.1124553.

  12. Breiman, L. (2001). Random forests. Machine learning. 45: 5-32. https://doi.org/10.1023/A:1010933404324.

  13. Brownlee, J. (2016). How to develop your first xgboost model in python. Machine Learning Mastery. 

  14. Butcher, K., Wick, A.F., DeSutter, T., Chatterjee, A. and Harmon, J. (2016). Soil salinity: A threat to global food security. Agronomy Journal. 108(6): 2189-2200. https://doi.org/10.2134/agronj2016.06.0368.

  15. Chandel, A.K., Khot, L.R. and Yu, L.X. (2021). Alfalfa (Medicago sativa L.) crop vigor and yield characterization using high-resolution aerial multispectral and thermal infrared imaging technique. Computers and Electronics in Agriculture. 182: 105999. https://doi.org/10.1016/j.compag.2021.105999.

  16. Chawla, N.V., Bowyer, K.W., Hall, L.O. and Kegelmeyer, W.P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research. 16: 321-357. https://doi.org/10.1613/jair.953.

  17. Chen, T. and Guestrin, C. (2016). A scalable tree boosting system. Proc 22nd ACM SIGKDD Int Conf Knowl Discov Data Min.

  18. de Oliveira, B.R., Zuffo, A.M., Steiner, F., Aguilera, J.G. and Gonzales, H.H.S. (2023). Classification of soybean genotypes during the seedling stage in controlled drought and salt stress environments using the decision tree algorithm. Journal of Agronomy and Crop Science. 209(5): 724-733. https://doi.org/10.1111/jac.12654.

  19. Fix, E. (1985). Discriminatory analysis: nonparametric discrimination, consistency properties. USAF school of Aviation Medicine. (Vol. 1). 

  20. Gao, Z., Liu, J., Zhu, Q., Li, Q., Liu, J., Cui, Y., Mu, Y. and Rasheed, A. (2023). Effect of six single salt stresses on germination of alfalfa (Medicago sativa L.). Applied Ecology and Environmental Research. 21(6). http://dx.doi.org/10.15666/aeer/2106_57715783.

  21. Gerdan Koc, D., Koc, C., Polat, H.E. and Koc, A. (2024). Artificial intelligence-based camel face identification system for sustainable livestock farming. Neural Computing and Applications. 36(6): 3107-3124. https://doi.org/10.1007/s00521-023-09238-w.

  22. Ghamari, S. (2012). Classification of chickpea seeds using supervised and unsupervised artificial neural networks. African Journal of Agricultural Research. 7(21): 3193-3201. doi: 10.5897/AJAR11.2071.

  23. Gupte, A., Joshi, S., Gadgul, P., Kadam, A. and Gupte, A. (2014). Comparative study of classification algorithms used in sentiment analysis. International Journal of Computer Science and Information Technologies. 5(5): 6261-6264. 

  24. Halder, R.K., Uddin, M.N., Uddin, M.A., Aryal, S. and Khraisat, A. (2024). Enhancing K-nearest neighbor algorithm: A comprehensive review and performance analysis of modifications. Journal of Big Data. 11(1): 113. https://doi.org/10.1186/s40537-024-00973-y.

  25. Haykin, S. (1994). Intelligent signal processing. In Advances in Signal Processing for Nondestructive Evaluation of Materials. Springer. (pp. 1-12).

  26. Huang, K., Dai, X., Xu, Y., Dang, S., Shi, T., Sun, J. and Wang, K. (2018). Relation between level of autumn dormancy and salt tolerance in lucerne (Medicago sativa). Crop and Pasture Science. 69(2): 194-204. https://doi.org/10.1071/CP17121.

  27. Kautz, T., Eskofier, B.M. and Pasluosta, C.F. (2017). Generic performance measure for multiclass-classifiers. Pattern Recognition. 68: 111-125. 10.1016/j.patcog.2017.03.008.

  28. Koklu, M. and Ozkan, I.A. (2020). Multiclass classification of dry beans using computer vision and machine learning techniques. Computers and Electronics in Agriculture. 174. 10.1016/j.compag.2020.105507.

  29. Li, R., Shi, F., Fukuda, K. and Yang, Y. (2010). Effects of salt and alkali stresses on germination, growth, photosynthesis and ion accumulation in alfalfa (Medicago sativa L.). Soil Science and Plant Nutrition. 56(5): 725-733. https://doi.org/10.1111/j.1747-0765.2010.00506.x.

  30. Ma, H., Zhao, W., Duan, W., Ma, F., Li, C. and Li, Z. (2024). Inversion model of soil salinity in alfalfa covered farmland based on sensitive variable selection and machine learning algorithms. PeerJ. 12: e18186. https://doi.org/10.7717/peerj.18186.

  31. MacCallum, R.C., Roznowski, M. and Necowitz, L.B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological bulletin. 111(3): 490. 

  32. Molina Menéndez, E. and Parraga Alava, J. (2024). Artificial Neural Networks for Classification Tasks: A Systematic Literature Review. Enfoque UTE: 1-10. https://doi.org/10.29019/enfoqueute.1058.

  33. Molnar, C. (2020). Interpretable machine learning. Lulu.com.

  34. Munns, R. and Tester, M. (2008). Mechanisms of salinity tolerance. Annu. Rev. Plant Biol. 59(1): 651-681. https://doi.org/10.1146/annurev.arplant.59.032607.092911.

  35. Okumuş, O., Say, A., Eren, B., Demirel, F., Uzun, S., Yaman, M. and Aydın, A. (2024). Using machine learning algorithms to investigate the impact of temperature treatment and salt stress on four forage peas (Pisum sativum var. arvense L.). Horticulturae. 10(6): 656. https://doi.org/10.3390/horticulturae10060656.

  36. Ozkan, U. (2025). Interactive effects of the temperature and salinity on germination of alfalfa (Medicago sativa L.). Legume Research - an International Journal. 48(5): 762-772. doi: 10.18805/LRF-839.

  37. Ozkan, U. and Benlioglu, B. (2021). Karyotypical Identification of Some Important Alfalfa (Medicago sativa L.) Lines in Turkey. Turkish Journal of Agriculture-Food Science and Technology. 9(4): 740-744. https://doi.org/10.24925/turjaf.v9i4.740-744.4094.

  38. Parent, B. and Tardieu, F. (2012). Temperature responses of developmental processes have not been affected by breeding in different ecological areas for 17 crop species. New Phytologist. 194(3): 760-774. https://doi.org/10.1111/j.1469-8137.2012.04086.x.

  39. Pazoki, A. and Pazoki, Z. (2011). Classification system for rain fed wheat grain cultivars using artificial neural network. African Journal of Biotechnology. 10(41): 8031-8038. doi: 10.5897/AJB11.488.

  40. Polikar, R. (2012). Ensemble Machine Learning: Methods and Applications, chapter Ensemble Learning. In: Springer.

  41. Polat, H.E., Koc, D.G., Ertuğrul, Ö., Koç, C. and Ekinci, K. (2025). Deep learning based individual cattle face recognition using data augmentation and transfer learning. Journal of Agricultural Sciences. 31(1): 137-150. doi: 10.15832/ankutbd.1509798.

  42. Quan, X., Liang, X., Li, H., Xie, C., He, W. and Qin, Y. (2021). Identification and characterization of wheat germplasm for salt tolerance. Plants. 10(2): 268. https://doi.org/10.3390/plants10020268.

  43. Quinlan, J.R. (1986). Induction of decision trees. Machine learning. 1: 81-106. 

  44. Rahman, S.A.Z., Mitra, K.C. and Islam, S.M. (2018). Soil classification using machine learning methods and crop suggestion based on soil series. 2018 21st international conference of computer and information technology (ICCIT).

  45. Rasool, S., Hameed, A., Azooz, M., Siddiqi, T. and Ahmad, P. (2013). Salt stress: Causes, types and responses of plants. Ecophysiology and responses of plants under salt stress. 1-24. https://doi.org/10.1007/978-1-4614-4747-4_1.

  46. Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1986). Learning representations by back-propagating errors. Nature. 323(6088): 533-536. 

  47. Schapire, R. E. (1990). The strength of weak learnability. Machine Learning. 5: 197-227. https://doi.org/10.1007/BF00116037.

  48. Song, Y., Lv, J., Ma, Z. and Dong, W. (2019). The mechanism of alfalfa (Medicago sativa L.) response to abiotic stress. Plant Growth Regulation. 89: 239-249. https://doi.org/10.1007/s10725-019-00530-1.

  49. Szabolcs, I. (1989). Salt-affected soils. 

  50. Turkstat. (2025a). Periodic Gross Domestic Product. Turkish Statistical Institute. https://data.tuik.gov.tr/Bulten/Index?p=Donemsel-Gayrisafi-Yurt-Ici-Hasila-I.-Ceyrek:-OcakMart,-2024-53753. Retrieved 28 April 2025.

  51. Wang, Z. and Sakiroglu, M. (2021). The origin, evolution and genetic diversity of alfalfa. In The Alfalfa Genome. Springer. (pp. 29-42). https://doi.org/10.1007/978-3-030-74466-3_3. 

  52. Xia, F., Wang, C., Li, Y., Yang, Y., Zheng, C., Fan, H. and Zhang, Y. (2021). Influence of priming with exogenous selenium on seed vigour of alfalfa (Medicago sativa L.). Legume Research-An International Journal. 44(9): 1124-1127. doi: 10.18805/LR-587.

  53. Xia, F., Wang, F., Wang, Y., Wang, C., Tian, R., Ma, J., Zhu, H. and Dong, K. (2020). Influence of boron priming on the antioxidant ability of alfalfa seeds. Legume Research-An International Journal. 43(6): 788-793. doi: 10.18805/LR-536.

  54. Zhang, H., Li, X., Nan, X., Sun, G., Sun, M., Cai, D. and Gu, S. (2017). Alkalinity and salinity tolerance during seed germination and early seedling stages of three alfalfa (Medicago sativa L.) cultivars. Legume Research-An International Journal. 40(5): 853-858. doi: 10.18805/lr.v0i0.8401.

  55. Zhang, J., Zhang, A., Liu, Z., He, W. and Yang, S. (2023). Multi-index fuzzy comprehensive evaluation model with information entropy of alfalfa salt tolerance based on LiDAR data and hyperspectral image data. Frontiers in Plant Science. 14: 1200501. https://doi.org/10.3389/fpls.2023.1200501.

  56. Zhao, G.y., Ohsu, K., Saputra, H.K., Okada, T., Suzuki, J., Kuwahara, Y. and Fujita, M. (2024). Enhancing interpretability of tree-based models for downstream salinity prediction: Decomposing feature importance using the Shapley additive explanation approach. Results in Engineering. 23: 102373. https://doi.org/10.1016/j.rineng.2024.102373.

  57. Zhao, W., Ma, F., Yu, H. and Li, Z. (2023). Inversion model of salt content in alfalfa-covered soil based on a combination of UAV spectral and texture information. Agriculture. 13(8): 1530. https://doi.org/10.3390/agriculture13081530.

  58. Zhou, Z.H. (2025). Ensemble Methods: Foundations and Algorithms. CRC Press. 

  59. Zhuo, H., Li, T., Lu, W., Zhang, Q., Ji, L. and Li, J. (2025). Prediction model for spontaneous combustion temperature of coal based on PSO-XGBoost algorithm. Scientific Reports. 15(1): 2752. https://doi.org/10.1038/s41598-025-87035-2.
