Smart Detection of Macro Nutrient Deficiency in Soybean Plant using Convolutional Neural Network

Department of Computer Science and Engineering, Government College of Engineering Srirangam, Trichy-620 012, Tamil Nadu, India.
  • Submitted: 19-06-2025
  • Accepted: 25-10-2025
  • First Online: 12-11-2025
  • doi: 10.18805/LR-5534

Background: Legumes play an important role in improving soil quality and human nutrition. Soybean is a legume rich in protein and oil. Nutrient deficiency in soybean plants can impair growth and reduce yield. Modern technologies such as computer vision and deep learning are being leveraged to identify nutrient deficiencies in soybean plants.

Methods: A convolutional neural network with six feature extraction blocks and one classification block was developed to identify macro nutrient deficiency in soybean plants. Images were first collected, pre-processed and labeled. They were then split into training and testing images in the ratio 80:20. The proposed convolutional neural network model was trained on the training images and the final trained model was evaluated on the testing images.

Result: In detecting deficiencies of the macro nutrients nitrogen, phosphorus and potassium in soybean plants, the proposed convolutional neural network architecture achieved a testing accuracy of 97.43%. An accuracy comparison against existing models such as VGG16, ResNet50 and MobileNetV3 demonstrates that the proposed model effectively identifies nutrient deficiencies in soybean plants. The proposed system is thus designed to support farmers in making timely decisions and to contribute to food security by leveraging deep learning techniques.

Soybean, the most important legume crop in the Fabaceae family, is well known for its high protein (37%-48%) and oil (16%-21%) content (Bagale, 2021). The soybean plant requires fifteen nutrients for proper growth and development. These nutrients are classified into two categories, namely macro nutrients and micro nutrients. Nitrogen, phosphorus, potassium, sulfur, calcium and magnesium are the macro nutrients needed for the structural and functional growth of the soybean plant. The micro nutrients needed for enzymatic and cellular regulation functions are copper, iron, manganese, zinc, boron, chloride, molybdenum and nickel. Thus, nutrient management is a primary and essential task in soybean cultivation (Lisciani et al., 2024).
       
This paper focuses on deficiencies of the macro nutrients nitrogen, potassium and phosphorus in the soybean plant. A lack of these nutrients shows up as changes in leaf color or as reduced growth. Nitrogen deficiency causes chlorosis in intermediate leaves and, when the deficiency is severe, the older leaves may suffer intense chlorosis resulting in yellow coloration (Ma et al., 2010). Phosphorus deficiency causes stunted growth and older leaves may turn dark green or bluish-green (Li et al., 2010). Potassium deficiency leads to yellowing along the leaf margins, starting at the tips and edges of the older leaves (Wang et al., 2015). These nutrient deficiencies can be identified by visual observation or by laboratory analysis of leaf tissue. Visual diagnosis relies heavily on expert experience and is subjective, whereas laboratory analysis provides accurate nutrient concentration data but is time-consuming, labour-intensive and dependent on proper sampling and infrastructure. Precision agriculture aims to improve crop yields by analyzing relevant data for decision making while ensuring sustainable development. Recent developments in computer vision and deep learning provide opportunities for automating nutrient deficiency detection in plants (Lavanya et al., 2022).
 
Related works
 
Iron deficiency in soybean plants was analyzed from visual images of soybean leaves by examining the dark green color index, canopy size and pixel ranges with machine learning algorithms such as decision tree, random forest and AdaBoost. AdaBoost identified iron deficiency chlorosis in soybean leaves with an F1-score of 0.75 (Hassanijalilian et al., 2020). Potassium deficiency was studied on a purpose-collected dataset using image processing and a deep convolutional neural network. The binary classification model, with the classes potassium deficient and healthy, achieved a precision of about 99% (Sartin et al., 2020).
       
Machine learning algorithms were employed to analyze the secondary macronutrient content of soybean plants from spectral information. Spectral images of soybean leaves were collected at the reproductive stage and the levels of the macronutrients calcium, magnesium and sulfur were determined. Pearson correlation analysis and K-means clustering were applied to divide the genotypes into clusters (Santana et al., 2024). Hyperspectral remote sensing has gained attention as a non-destructive method for observing nutrient deficiencies in crops. Potassium status in soybean plants was evaluated in three severity categories using principal component analysis (PCA) and linear discriminant analysis (LDA). Spectral reflectance patterns were found to be influenced by potassium availability: severe potassium deficiency (SPD) led to notably higher reflectance in the visible spectrum, primarily due to decreased chlorophyll and pigment concentrations. At all growth stages, PCA explained 100% of the variance in distinguishing severe potassium deficiency, while LDA achieved 70% accuracy in the training phase and 59% in the validation phase (Furlanetto et al., 2024).
       
A data-driven method was developed by analyzing the effects of varying nitrogen, magnesium and potassium levels on hydroponically grown soybeans. Nutrient profiling was conducted at various plant stages and feature selection based on chi-squared testing identified key predictors of water uptake. Random Forest performed best for the nitrogen and magnesium treatments, with R-squared scores of 0.63 and 0.81 respectively, while Support Vector Regression performed best for the potassium treatment, with an R-squared of 0.85. SHapley Additive exPlanations (SHAP) were used to provide insights into nutrient contributions, enabling better understanding and optimization of hydroponic nutrient management (Dhal et al., 2024). The deep learning based object detection model YOLOv8 was used to identify nitrogen, phosphorus and potassium deficiency in soybean plants. It was trained on 6020 RGB images and achieved precision scores ranging from 90.03% to 96.54%, with potassium deficiency detected most accurately. This work offered a fast, accurate and scalable approach to improving nutrient management in precision agriculture (Jeong et al., 2025).
       
Apart from these, there are studies on nutrient deficiency identification in other plants. Nutrient deficiency in rice and banana leaves was identified using VGG-16 and Inception-v3, with the Inception-v3 model reaching 93% accuracy (Mkhatshwa et al., 2024). ConvNet based models were employed to identify six nutrient deficiencies in palm leaves and achieved an accuracy of 94% (Ibrahim et al., 2022). Calcium and magnesium deficiencies in tomato plants were predicted using transfer learning based feature extraction with Inception-V3, ResNet50 and VGG16, followed by random forest and SVM classifiers for deficiency detection (Kusanur and Chakravarthi, 2021). The VGG16 deep learning model was used to detect nitrogen, phosphorus and potassium deficiencies in hydroponic basil with an accuracy of 94% (Gul and Bora, 2023).
       
The visible physical symptoms that nutrient deficiencies produce in plants have motivated researchers in precision agriculture to investigate this area more extensively. Since few deep learning models exist for identifying nutrient deficiency in soybean plants, this study focuses on detecting nitrogen, phosphorus and potassium deficiency, which would help farmers make early decisions.

The workflow of the proposed system for identifying nutrient deficiency in soybean plants is shown in Fig 1. First, images are captured in the field, resized to a uniform size and labeled according to the nutrient deficiency. Next, they are split into two groups in the ratio 80:20: 80% of the soybean leaf images are used to train the CNN model and the remaining 20% are used to test it. Finally, performance metrics are calculated from the testing results of the trained model. This experiment was carried out from July 2024 to May 2025 at Government College of Engineering Srirangam, Trichy.

Fig 1: Flow diagram of the proposed system.


 
Dataset collection
 
The proposed model analyses leaf images of the soybean plant to identify nutrient deficiency. Soybean leaf images were captured in the field and labelled with the help of agricultural experts. The collected dataset contains 3309 images of soybean leaves grouped under four categories, namely Nitrogen deficient (ND), Potassium deficient (KD), Phosphorus deficient (PD) and Healthy (HY). There are 817 images in the Nitrogen deficient category, 805 in the Phosphorus deficient category, 835 in the Potassium deficient category and 852 in the Healthy category. Images from Bevers et al. (2022) were referred to for the Healthy and Potassium deficient categories. A sample image from each category is presented in Fig 2.
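The 80:20 split of these class counts can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the split is made separately within each class (a stratified split) and that the test share is rounded down, neither of which is stated in the text. The function name `stratified_split` and the seed are hypothetical.

```python
import random

# Class counts reported in the text (ND, PD, KD, HY).
CLASS_COUNTS = {"ND": 817, "PD": 805, "KD": 835, "HY": 852}


def stratified_split(class_counts, test_percent=20, seed=0):
    """Split the image indices of each class into train and test lists.

    Assumption: the split is done per class and the test share is
    rounded down; the paper states only the overall 80:20 ratio.
    """
    rng = random.Random(seed)
    train, test = {}, {}
    for label, count in class_counts.items():
        indices = list(range(count))
        rng.shuffle(indices)
        n_test = count * test_percent // 100  # integer arithmetic, rounds down
        test[label] = indices[:n_test]
        train[label] = indices[n_test:]
    return train, test


train_idx, test_idx = stratified_split(CLASS_COUNTS)
for label in CLASS_COUNTS:
    print(label, len(train_idx[label]), len(test_idx[label]))
```

Splitting within each class keeps the four categories in roughly the same proportion in both groups, which matters when per-class metrics are reported later.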

Fig 2: Sample soybean leaf images.


 
Pre-processing collected images
 
Image pre-processing enhances the quality of raw image data and improves the efficiency of subsequent image processing tasks (Maharana et al., 2022). The pre-processing operations used in the proposed system are resizing, normalization and augmentation. Images captured in the field may vary in size and resolution, but the deep learning models used to detect deficiency in soybean leaf images expect all inputs to be of the same size. For this experiment, all images are resized to 256 × 256 pixels before being processed by the convolutional neural network model.
       
After resizing, the pixel intensities of the images are standardized using z-score normalization, which subtracts the mean of the pixel intensity values and divides by their standard deviation. The next pre-processing step is data augmentation, which expands the size and diversity of the training dataset to reflect the variation found in real-world data and so improves the generalization ability of the CNN model. In this paper, augmentation is carried out using the geometric transformations rotation, flipping and cropping.
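The normalization and augmentation steps can be illustrated on a toy single-channel image held as nested lists. This is a sketch under assumptions: the text does not say whether the mean and standard deviation are computed per image, per channel or over the whole dataset, nor which rotation angles or crop sizes were used. The per-image statistics, the 90 degree rotation and the center crop below are illustrative choices, and all function names are hypothetical.

```python
from statistics import fmean, pstdev


def z_score_normalize(pixels):
    """Standardize a flat list of pixel intensities to zero mean, unit variance."""
    mean = fmean(pixels)
    std = pstdev(pixels)
    if std == 0:  # constant image: avoid division by zero
        return [0.0 for _ in pixels]
    return [(p - mean) / std for p in pixels]


def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate a 2D image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def center_crop(image, size):
    """Cut a size x size patch from the middle of a 2D image."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


toy = [[1, 2], [3, 4]]
normalized = z_score_normalize([p for row in toy for p in row])
flipped = flip_horizontal(toy)
rotated = rotate_90(toy)
```

In practice these transformations would be applied only to the training images, so that the test images remain an unmodified sample of field conditions.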
 
Architecture of the proposed CNN model
 
Deep learning models such as convolutional neural networks (CNN) are used to examine visual information such as images and videos (El et al., 2024). CNN models are widely used in the agriculture domain for plant disease classification (Mostafa et al., 2025; Bhavani, 2025), weed detection (García-Navarrete et al., 2024; Hashemi-Beni et al., 2022), crop yield prediction (Kalmani et al., 2024), etc. This paper proposes a CNN model to identify nutrient deficiency in soybean plants.
       
The proposed CNN architecture has two major parts, namely the feature extraction part and the classification part. The feature extraction part analyses the input image and extracts numeric features, which are used by the classification part. In this paper, feature extraction is done using convolution and average pooling operations. The classification part is built with fully connected neural network layers. Its output classifies the extracted features of the input image as Nitrogen deficient, Phosphorus deficient, Potassium deficient or Healthy.
       
Fig 3 depicts the architecture of the CNN model developed in this experiment. It has six feature extraction blocks. Each block contains three parts, namely a convolution layer, a batch normalization layer and an average pooling layer.

Fig 3: Architecture of the proposed CNN model.


       
Batch normalization is applied after the activation function has been applied to the feature maps from the convolution layer. This stabilizes and accelerates the learning process of the CNN. The output of the last block is flattened, that is, the multiple 2D feature maps are converted to a single one-dimensional array that is passed to the classification block built with fully connected layers. In the proposed architecture, there are three dense layers with 64, 8 and 4 neurons respectively. The four neurons in the output layer represent the four classes, namely Nitrogen deficient, Phosphorus deficient, Potassium deficient and Healthy. The major operations performed in the proposed model are explained below.
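The block structure described above could be expressed in Keras roughly as follows. This is a sketch, not the authors' code: the number of filters per block is not given in the text, so `BLOCK_FILTERS` is purely hypothetical, and the 2×2 pooling window and three-channel RGB input are likewise assumptions. Only the six-block layout, the 3×3 kernels with stride 1, the order of activation and batch normalization, and the 64-8-4 dense layers come from the description.

```python
# Hypothetical filter counts: the paper does not list them in the text.
BLOCK_FILTERS = [16, 32, 64, 64, 128, 128]
DENSE_UNITS = [64, 8, 4]  # from the text: two hidden layers and a 4-class output


def build_model(input_shape=(256, 256, 3)):
    """Build a six-block CNN following the description in the text."""
    # Imported here so the module loads even where TensorFlow is absent.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))  # assumed RGB input
    for filters in BLOCK_FILTERS:
        # 3x3 convolution, stride 1, no padding, ReLU applied before batch norm.
        model.add(layers.Conv2D(filters, (3, 3), strides=1, padding="valid",
                                activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.AveragePooling2D(pool_size=(2, 2)))  # assumed 2x2 window
    model.add(layers.Flatten())
    for units in DENSE_UNITS[:-1]:
        model.add(layers.Dense(units, activation="relu"))
    model.add(layers.Dense(DENSE_UNITS[-1], activation="softmax"))
    return model
```

Where TensorFlow is installed, calling `build_model().summary()` lists the layer shapes and parameter counts for these assumed filter sizes.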
 
Convolution layer
 
The convolution operation is the heart of the CNN: filters are slid over the image to produce a feature map. In this operation, each pixel in the sliding window of the image is multiplied by the corresponding element of the kernel and all the products are then summed. This operation is defined in equation (1).

$\mathrm{Fmap}(i, j) = \sum_{m=1}^{M} \sum_{n=1}^{N} \mathrm{IM}(i + m - 1, j + n - 1) \cdot \mathrm{KM}(m, n)$          ...(1)

Where,
IM = The input feature map.
KM = The kernel of size M×N.
Fmap = The output feature map obtained after the convolution operation.
       
In this paper, the kernel size M×N is 3×3 and a stride of 1 is adopted. The output dimension of the feature map along each axis is therefore given by equation (2).

$\mathrm{Fmap}_{dim} = \mathrm{IM}_{dim} - \mathrm{KM}_{dim} + 1$          ...(2)

Next, an activation function is applied to each value of the feature map. In this paper, the rectified linear unit (ReLU), which replaces negative values with zero, is used; it is given by equation (3).

$\mathrm{ReLU}(x) = \max(0, x)$          ...(3)
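The convolution, output-size and activation steps of equations (1) to (3) can be traced on a small example. The sketch below uses plain Python lists with a toy 5×5 image and a 3×3 kernel whose values are purely illustrative; `conv2d_valid` is a hypothetical helper name. As in most CNN libraries, the kernel is applied without flipping.

```python
def relu(x):
    """Equation (3): replace negative values with zero."""
    return max(0, x)


def conv2d_valid(image, kernel):
    """Equation (1) with stride 1 and no padding.

    Each output value is the sum of element-wise products between the
    kernel and the image window whose top-left corner is at (i, j).
    """
    m, n = len(kernel), len(kernel[0])
    out_h = len(image) - m + 1      # equation (2) along the height
    out_w = len(image[0]) - n + 1   # equation (2) along the width
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(m) for b in range(n))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5x5 image and a 3x3 vertical-edge kernel (illustrative values only).
image = [[1, 2, 3, 4, 5] for _ in range(5)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

feature_map = conv2d_valid(image, kernel)
activated = [[relu(v) for v in row] for row in feature_map]
print(len(feature_map), len(feature_map[0]))  # 3 3
```

The 5×5 input shrinks to 3×3, as equation (2) predicts (5 - 3 + 1 = 3). By the same rule, a 256×256 input yields a 254×254 feature map after the first convolution.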

Average pooling layer
 
The next important operation in the CNN is pooling. This operation reduces the dimensions of the feature map while summarizing the values in each window. This paper uses average pooling, which is given by equation (4).

$\mathrm{AvgPool}(i, j) = \frac{1}{M_1 \times N_1} \sum_{(m, n) \in W(i, j)} \mathrm{Fmap}(m, n)$          ...(4)

Where,
M1×N1 = The size of the window for the pooling operation.
W(i, j) = The set of feature map positions covered by the window that produces output element (i, j).
Average pooling sums all the values within the window and divides the sum by the total number of values in that window.
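A small sketch of average pooling follows, together with a trace of how the spatial size changes through the six feature extraction blocks. The pooling window size is not stated in the text; the 2×2 non-overlapping window used here is an assumption, and both function names are hypothetical.

```python
def average_pool(feature_map, window=2):
    """Equation (4): mean of each non-overlapping window x window patch."""
    out_h = len(feature_map) // window
    out_w = len(feature_map[0]) // window
    return [
        [
            sum(feature_map[i * window + a][j * window + b]
                for a in range(window) for b in range(window)) / (window * window)
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def trace_spatial_size(size=256, blocks=6, kernel=3, window=2):
    """Side length after each convolution (equation (2)) plus pooling block."""
    sizes = []
    for _ in range(blocks):
        size = size - kernel + 1   # 3x3 convolution, stride 1, no padding
        size = size // window      # pooling drops any incomplete window
        sizes.append(size)
    return sizes


pooled = average_pool([[1, 3, 5, 7],
                       [1, 3, 5, 7],
                       [2, 4, 6, 8],
                       [2, 4, 6, 8]])
print(pooled)                # [[2.0, 6.0], [3.0, 7.0]]
print(trace_spatial_size())  # [127, 62, 30, 14, 6, 2]
```

Under these assumptions a 256×256 input is reduced to a 2×2 map per filter after the sixth block, which keeps the flattened vector passed to the dense layers small.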
 
Fully connected layer
 
The feature maps from the last feature extraction block of the CNN are first flattened to a one-dimensional array. This array is fed to a feed-forward neural network built with fully connected neurons, that is, every neuron in one layer has a direct connection to all neurons in the following layer. The proposed model has three such layers with 64, 8 and 4 neurons respectively. The 4 neurons in the output layer correspond to the number of classes under study. The output of each neuron is given by equations (5) and (6).

$z = \sum_{i=1}^{X} w_i x_i + b$          ...(5)

$O = A(z)$          ...(6)

Where,
wi = Weight associated with the ith neuron of the previous layer.
xi = Input received from the ith neuron of the previous layer.
X = Count of neurons in the previous layer.
b = Bias value.
A(z) = Activation function applied on z to generate the output O.
       
In this experiment, the ReLU activation function given by equation (3) is used in the hidden layers. The final output layer uses the softmax activation function, which converts the numerical values into a probability distribution. The neuron with the highest probability gives the predicted nutrient deficiency category. The softmax function is given by equation (7).

$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}}$          ...(7)

Where,
zi = The value of z from equation (5) for the ith output neuron.
N = The number of classes; in this experiment N = 4 as there are four classes.
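The weighted sum, activation and softmax steps of the fully connected layers can be worked through numerically. The sketch below is illustrative only: the weights, biases and three-element input are made-up numbers, and the ordering of the output neurons as ND, PD, KD, HY is an assumption.

```python
import math


def dense(inputs, weights, biases, activation):
    """Equations (5) and (6) for every neuron in one layer.

    weights[k][i] is the weight from input i to neuron k.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(z)
    return activation(outputs)


def relu_layer(values):
    """Equation (3) applied to each value of a layer."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Equation (7); subtracting the maximum avoids overflow in exp."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["ND", "PD", "KD", "HY"]  # assumed output-neuron order

# Illustrative numbers only: 3 inputs feeding the 4-neuron output layer.
features = [0.5, 1.0, -0.5]
weights = [[0.2, 0.1, 0.0], [0.0, 0.3, 0.1], [1.0, 1.0, 0.0], [-0.5, 0.2, 0.4]]
biases = [0.0, 0.1, 0.0, 0.0]

probabilities = dense(features, weights, biases, softmax)
predicted = CLASSES[probabilities.index(max(probabilities))]
```

Here the third neuron has the largest weighted sum (1.5), so it receives the highest probability and the toy input is assigned to that class.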
 
Nutrient deficiency detection
 
The experiments for this paper were carried out in a runtime environment with a T4 GPU and 12 GB of RAM. The CNN architecture was implemented in the Python programming language using the Keras library. The final hyper-parameter values used to train the CNN model, obtained after several rounds of fine-tuning, are given in Table 1.

Table 1: Hyper-parameters used to build CNN model.


       
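After setting up the environment and hyper-parameters, the proposed CNN model is trained with 80% of the images. The trained CNN model is then tested with the remaining 20% of the images, which were not used for training. The metrics used to evaluate the trained CNN model are precision, recall, F1-score and accuracy. Classwise precision, recall and F1-score for each nutrient deficiency are calculated using equations (8), (9) and (10) respectively.

Precision of class k is given by:

$\mathrm{Precision}_k = \frac{TP_k}{TP_k + FP_k}$          ...(8)

Recall of class k is given by:

$\mathrm{Recall}_k = \frac{TP_k}{TP_k + FN_k}$          ...(9)

F1-score of class k is given by:

$\mathrm{F1}_k = \frac{2 \times \mathrm{Precision}_k \times \mathrm{Recall}_k}{\mathrm{Precision}_k + \mathrm{Recall}_k}$          ...(10)

Where,
TPk = Number of test images of class k that are correctly classified as class k.
FPk = Number of test images of other classes that are wrongly classified as class k.
FNk = Number of test images of class k that are wrongly classified as another class.

The macro average of each metric is calculated by summing the classwise metric values and dividing by the number of classes (here, 4). The weighted average of each metric uses the number of test images in each class as the weights.

The overall percentage of accuracy is given by equation (11).

$\mathrm{Accuracy} = \frac{\text{Number of correctly classified test images}}{\text{Total number of test images}} \times 100$          ...(11)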
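The classwise metrics, their macro and weighted averages and the overall accuracy can all be derived from a confusion matrix, as the following sketch shows. The matrix values are hypothetical and are not the counts in Fig 6; `classification_report` is a hypothetical helper written for this illustration, using the standard definitions of precision, recall and F1-score.

```python
def classification_report(confusion, labels):
    """Per-class and averaged metrics from a square confusion matrix.

    confusion[t][p] counts images of true class t predicted as class p.
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    report, supports = {}, {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[t][k] for t in range(n)) - tp
        fn = sum(confusion[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
        supports[label] = tp + fn  # number of test images in this class
    metrics = ("precision", "recall", "f1")
    macro = {m: sum(report[lab][m] for lab in labels) / n for m in metrics}
    weighted = {m: sum(report[lab][m] * supports[lab] for lab in labels) / total
                for m in metrics}
    accuracy = 100 * sum(confusion[k][k] for k in range(n)) / total
    return report, macro, weighted, accuracy


# Hypothetical counts for illustration; not the values in Fig 6.
labels = ["ND", "PD", "KD", "HY"]
confusion = [
    [8, 1, 1, 0],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 0, 0, 10],
]
report, macro, weighted, accuracy = classification_report(confusion, labels)
print(accuracy)  # 90.0
```

When every class has the same number of test images, as in this toy matrix, the macro and weighted averages coincide; they diverge only when class sizes differ.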

The proposed CNN architecture has six convolution and average pooling blocks for feature extraction. Because batch normalization sits between the convolution and average pooling layers, the architecture contains some non-trainable parameters. Table 2 gives the number of parameters in each block along with the numbers of trainable and non-trainable parameters. In the proposed CNN architecture, the number of non-trainable parameters is negligible compared to the number of trainable parameters.
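The origin of the trainable and non-trainable parameters can be made concrete with the usual counting rules for each layer type. The sketch below follows the way Keras reports batch normalization, where the moving mean and variance are counted as non-trainable. The convolution example with 32 filters on a three-channel input is hypothetical and is not taken from Table 2; only the 64-8-4 dense layout comes from the text.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter; all trainable."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def batch_norm_params(channels):
    """Return (trainable, non_trainable) counts for one batch-norm layer.

    Scale and shift are learned; the moving mean and variance are
    updated from batch statistics rather than by gradient descent.
    """
    return 2 * channels, 2 * channels


def dense_params(inputs, units):
    """One weight per input-neuron pair plus one bias per neuron."""
    return (inputs + 1) * units


# 32 filters on an RGB input is a hypothetical example, not a value from Table 2.
print(conv_params(3, 3, 3, 32))   # 896
print(batch_norm_params(32))      # (64, 64)
# These two follow from the 64-8-4 dense layout given in the text.
print(dense_params(64, 8))        # 520
print(dense_params(8, 4))         # 36
```

Average pooling and flattening add no parameters, so the non-trainable total comes only from the batch normalization layers and grows with the number of channels rather than with the kernel area.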

Table 2: Parameters count in the proposed CNN model.


 
Training performance
 
The accuracy and loss in each epoch of CNN training and validation were recorded and plotted. Fig 4 shows the accuracy of the CNN model during training and validation and Fig 5 shows the corresponding training and validation loss. These graphs show that training and validation saturate at 46 epochs with an accuracy of 97% and a loss of 0.0313; training was therefore stopped at 50 epochs to avoid overfitting.

Fig 4: Accuracy comparison during training and validation.



Fig 5: Loss comparison during training and validation.


 
Testing performance
 
The evaluation metrics, namely classwise precision, classwise recall, classwise F1-score and accuracy, are calculated using equations (8), (9), (10) and (11) respectively. First, a confusion matrix is drawn to identify the numbers of correctly classified and misclassified images; it is given in Fig 6. The confusion matrix shows misclassification rates of 3.07%, 1.86%, 2.99% and 2.35% for the ND, PD, KD and HY classes respectively.

Fig 6: Confusion matrix for soybean macro nutrient deficiency detection.


       
Next, the evaluation metrics are calculated and presented as an evaluation report in Table 3. The classwise precision and recall of each category lie between 96% and 98%, while the F1-score (the harmonic mean of precision and recall) of every class falls between 97% and 97.7%. This indicates that the proposed CNN model performs well in identifying nutrient deficiency. The overall accuracy is 97.43% and the macro and weighted averages of all the metrics are also 97.43%. This shows that there are very few false predictions and that the model correctly identifies the nutrient deficiency for most of the test images.
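The reported misclassification rates and overall accuracy can be cross-checked with simple arithmetic. The counts below are assumptions, not values read from Fig 6: the test-set sizes are taken as 20% of each class count rounded down, and the misclassified counts are the small integers that reproduce the reported rates under that assumption.

```python
# Assumed test-set sizes: 20% of each class count, rounded down.
test_counts = {"ND": 163, "PD": 161, "KD": 167, "HY": 170}
# Assumed misclassified counts, chosen to match the reported rates;
# they are inferred, not read from Fig 6.
misclassified = {"ND": 5, "PD": 3, "KD": 5, "HY": 4}

rates = {k: round(100 * misclassified[k] / test_counts[k], 2) for k in test_counts}
total = sum(test_counts.values())
correct = total - sum(misclassified.values())
accuracy = round(100 * correct / total, 2)

print(rates)     # {'ND': 3.07, 'PD': 1.86, 'KD': 2.99, 'HY': 2.35}
print(accuracy)  # 97.43
```

Under these assumed counts, 644 of 661 test images are classified correctly, which agrees with the reported 97.43% accuracy.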

Table 3: Proposed CNN model-evaluation report.


       
The CNN model proposed in this paper is compared against existing state-of-the-art CNN models, namely VGG16, ResNet50 and MobileNetV3, each with its final layer replaced by a four-neuron layer. Fig 7 compares their accuracy percentages. The proposed CNN model performs on par with the pre-trained CNN models.

Fig 7: Proposed CNN model vs other deep learning models-Accuracy comparison.

Nutrient deficiency can hinder the development and yield of soybean plants; hence, this paper developed a CNN model to identify nutrient deficiency in soybean leaves. The input images were collected from the field, labeled and split into two groups for training and testing. The newly built CNN model was trained on the training group to detect nutrient deficiency and then verified with the group reserved for testing. The results of the proposed CNN model were evaluated in terms of precision, recall, F1-score and accuracy. The testing results show that the proposed model works well in identifying nutrient deficiency in soybean plants, with an accuracy of 97.43%. The proposed CNN model could be used for early identification of nutrient deficiency in soybean plants and would help farmers minimize production loss through early decision making.
All authors report no conflicts of interest.  

  1. Bagale, S. (2021). Nutrient management for soybean crops. International Journal of Agronomy. 2021(1): 3304634.

  2. Bevers, N., Sikora, E.J. and Hardy, N.B. (2022). Pictures of diseased soybean leaves by category captured in field and with controlled backgrounds: Auburn soybean disease image dataset (ASDID) [Data set]. Zenodo. https://doi.org/10.5061/dryad.41ns1rnj3.

  3. Bhavani, R. (2025). Detection of leaf diseases in soybean plant using autoencoder and multinomial logistic regression. Legume Research. 48(5): 876-884. doi: 10.18805/LR-5461.

  4. Dhal, S.B., Mahanta, S., Moore, J.M. and Kalafatis, S. (2024). Machine learning-based analysis of nutrient and water uptake in hydroponically grown soybeans. Scientific Reports. 14(1): 24337.

  5. El, S.M., Mothe, J. and Ivanovici, M. (2024). Images and CNN applications in smart agriculture. European Journal of Remote Sensing. 57(1): 2352386.

  6. Furlanetto, R.H., Crusiol, L.G.T., Nanni, M.R., Oliveira Junior, A.D. and Sibaldelli, R.N.R. (2024). Hyperspectral data for early identification and classification of potassium deficiency in soybean plants [Glycine max (L.) Merrill]. Remote Sensing. 16(11): 1900.

  7. García-Navarrete, O.L., Correa-Guimaraes, A. and Navas-Gracia, L.M. (2024). Application of convolutional neural networks in weed detection and identification: A systematic review. Agriculture. 14(4): 568.

  8. Gul, Z. and Bora, S. (2023). Exploiting pre-trained convolutional neural networks for the detection of nutrient deficiencies in hydroponic basil. Sensors. 23(12): 5407.

  9. Hashemi-Beni, L., Gebrehiwot, A., Karimoddini, A., Shahbazi, A. and Dorbu, F. (2022). Deep convolutional neural networks for weeds and crops discrimination from UAS imagery. Frontiers in Remote Sensing. 2022(3): 755939.

  10. Hassanijalilian, O., Igathinathane, C., Bajwa, S. and Nowatzki, J. (2020). Rating iron deficiency in soybean using image processing and decision-tree based models. Remote Sensing. 12(24): 4143.

  11. Ibrahim, S., Hasan, N., Sabri, N., Abu Samah, K.A.F. and Rahimi Rusland, M. (2022). Palm leaf nutrient deficiency detection using convolutional neural network (CNN). International Journal of Nonlinear Analysis and Applications. 13(1): 1949-1956.

  12. Jeong, M., Park, S., Kwon, S.M., Lim, K., Jung, D.R., Lee, H.S., Kim, H.J. and Shin, J.H. (2025). Rapid detection of soybean nutrient deficiencies with YOLOv8s for precision agriculture advancement. Scientific Reports. 2025(15): 13810.

  13. Kalmani, V.H., Dharwadkar, N.V. and Thapa, V. (2024). Crop yield prediction using deep learning algorithm based on CNN-LSTM with attention layer and skip connection. Indian Journal of Agricultural Research. 59(8): 1303-1311. doi: 10.18805/IJARe.A-6300.

  14. Kusanur, V. and Chakravarthi, V.S. (2021). Using transfer learning for nutrient deficiency prediction and classification in tomato plant. International Journal of Advanced Computer Science and Applications. 12(10): 784-790.

  15. Lavanya, M., Devi, M.K. and Vani, M.S. (2022). Deep learning for identification of plant nutrient deficiencies. Journal of Pharmaceutical Negative Results. 13(9): 3284-3291.

  16. Li, X., Gai, J., Chang, W. and Zhang, C. (2010). Identification of phosphorus starvation tolerant soybean (Glycine max) germplasms. Frontiers of Agriculture in China. 2020(4): 272-279.

  17. Lisciani, S., Marconi, S. and Le Donne, C. (2024). Legumes and common beans in sustainable diets: nutritional quality, environmental benefits, spread and use in food preparations. Frontiers in Nutrition. 11: 1385232.

  18. Ma, L., Fang, J., Chen, Y. and Gong, S. (2010). Color analysis of leaf images of deficiencies and excess nitrogen content in soybean leaves. Proceedings of International Conference on E-Product E-Service and E-Entertainment. IEEE. pp 1-3.

  19. Maharana, K., Mondal, S. and Nemade, B. (2022). A review: Data pre-processing and data augmentation techniques. Proceedings of International Conference on Intelligent Engineering Approach. 3(1): 91-99.

  20. Mkhatshwa, J., Kavu, T. and Daramola, O. (2024). Analysing the performance and interpretability of CNN-based architectures for plant nutrient deficiency identification. Computation. 12(6): 113.

  21. Mostafa, A., Alnuaim, A. and AlZubi, A.A. (2025). Utilizing convolutional neural networks for accurate detection of leaf diseases in fava beans. Legume Research. 48(3): 494-502. doi: 10.18805/LRF-823.

  22. Santana, D.C., Izabela, C.O., Sâmela, B.C., Paulo, H.M., Marcelo, C.M.T.F., João, L.D.S. and Larissa, P.R.T. (2024). Classification of soybean genotypes as to calcium, magnesium and sulfur content using machine learning models and UAV-Multispectral sensor. AgriEngineering. 6(2): 1581-1593.

  23. Sartin, M., Alexandre C.R., Kappes, C. and Tercio A.S.F. (2020). Classifying the macronutrient deficiency in soybean leaf with deep learning. Proceedings of the XVII National Meeting on Artificial and Computational Intelligence. pp 638-649.

  24. Wang, X.G., Zhao, X.H., Jiang, C.J., Li, C.H., Cong, S., Wu, D., Chen, Y.Q., Yu, H.Q. and Wang, C.Y. (2015). Effects of potassium deficiency on photosynthesis and photoprotection mechanisms in soybean [Glycine max (L.) Merr.]. Journal of Integrative Agriculture. 14(5): 856-863.

Smart Detection of Macro Nutrient Deficiency in Soybean Plant using Convolutional Neural Network

1Department of Computer Science and Engineering, Government College of Engineering Srirangam, Trichy-620 012, Tamil Nadu, India.
  • Submitted19-06-2025|

  • Accepted25-10-2025|

  • First Online 12-11-2025|

  • doi 10.18805/LR-5534

Background: Legumes play an important role in improving soil quality and nutrition in humans. Soybean is one of the legumes which are rich in protein and oil content. Nutrients deficiency in soybean plant could affect the growth of the plants and might lead to loss in its yield. The developments in modern technologies such as computer vision and deep learning are being leveraged in identifying nutrient deficiencies in soybean plants.

Methods: Convolutional neural network with six feature extraction blocks and one classification block is developed to identify macro nutrient deficiency in soybean plants. Images are first collected, pre-processed and labeled. They are then split into training images and testing images in the ratio of 80:20. The proposed convolutional neural network model is trained over training images and the final trained model is tested with testing images.

Result: In detecting the deficiency of macro nutrients such as nitrogen, phosphorus and potassium in soybean plants, the testing results of the proposed convolutional neural network architecture achieved an accuracy of 97.43%. Accuracy comparison against existing models such as VGG16, ResNet50 and MobileNetV3 demonstrate that the proposed model effectively identifies nutrient deficiencies in soybean plants. Thus, the proposed system is designed to support farmers in making timely decisions and to contribute to food security by leveraging deep learning techniques.
Soybean, the most important legume crop that belongs to Fabaceae family is well known for its richness in protein (37%-48%) and oil (16%-21%) content (Bagale, 2021). Soybean plant requires fifteen nutrients for proper growth and development of the plant. These nutrients are classified into two categories namely macro nutrients and micro nutrients. Nitrogen, phosphorous, potassium, sulfur, calcium, magnesium are the macro nutrients needed for structural and functional growth of soybean plant. The micro nutrients needed for enzymatic and cellular regulation functions are copper, iron, manganese, zinc, boron, chloride, molybdenum and nickel. Thus, nutrient management is a primary and essential task in soybean cultivation (Lisciani et al., 2024).
       
This paper focuses on analyzing the deficiency of macro nutrients such as nitrogen, potassium and phosphorus in soybean plant. The symptoms of lack of these nutrients in soybean plant show sign in the color of the leaves or deterioration of growth. Nitrogen deficiency in soybean plant causes chlorosis in intermediate leaves and when the deficiency is severe, the older leaves might suffer intense chlorosis resulting in yellow coloration (Ma et al., 2010). Phosphorus deficiency in soybean plant causes stunted growth and older leaves may have dark green or bluish-green coloration (Li et al., 2010). Potassium deficiency in soybean plant leads to yellowing along the leaf margins starting at the tip and edges of the older leaves (Wang et al., 2015). These nutrient deficiencies could be identified by visual observation or laboratory analysis of leaf tissues. While visual diagnosis relies heavily on expert experience and is subjective, laboratory analysis provides accurate nutrient concentration data but is time-consuming, labour-intensive and dependent on proper sampling and infrastructure. Precision agriculture aims to improve crop yields by analyzing the potential data for decision making and ensures sustainable development. Recent developments in computer vision and deep learning techniques provide opportunities for automating nutrient deficiency detection in plant (Lavanya et al., 2022).
 
Related works
 
Iron deficiency in soybean plants was analyzed using visual images of soybean leaves by examining the dark green color index, canopy size and pixel ranges with the help of machine learning algorithms such as decision tree, random forest and AdaBoost. Adaboost identified iron deficiency in soybean leaves with an F1-score of 0.75 in identifying the iron deficiency chlorosis (Hassanijalilian et al., 2020). Deficiency of potassium macronutrient was studied with their collected dataset using image processing technique and convolutional deep neural network. It was a binary classification model with classes potassium deficiency or healthy which could achieve a precision of about 99% (Sartin et al., 2020).
       
Machine learning algorithms were employed in analyzing the secondary macronutrient content in soybean plants based on spectral information. Spectral images of soybean leaves were collected at the reproductive stage and macronutrient such as calcium, magnesium and sulfur levels were determined. Pearson correlation analysis and K-means clustering were applied to divide the genotype into clusters (Santana et al., 2024). Hyperspectral remote sensing has gained attention as a non-destructive method for observing nutrient deficiencies in crops. Potassium status in soybean plants were evaluated under three categories based on their severity levels using PCA and LDA. It was observed that spectral reflectance patterns are influenced by potassium availability, where SPD conditions led to notably higher reflectance in the visible spectrum, primarily due to decreased chlorophyll and pigment concentrations. At all the growth stages, PCA explained 100% variance to distinguish severe potassium deficiency and LDA achieved 70% accuracy in training phase and 59% accuracy in validation phase (Furlanetto et al., 2024).
       
A data-driven method was developed by analyzing the effects of varying nitrogen, magnesium and potassium levels on hydroponically grown soybeans. Nutrient profiling was conducted at various plant stages, and feature selection based on chi-squared testing identified key predictors of water uptake. Random Forest performed best for the nitrogen and magnesium treatments with R-square scores of 0.63 and 0.81 respectively, while Support Vector Regression performed best for the potassium treatment with an R-square of 0.85. This work used SHapley Additive exPlanations (SHAP) to provide insights into nutrient contributions, enabling better understanding and optimization of hydroponic nutrient management (Dhal et al., 2024). The deep learning based object detection model YOLOv8 was used to identify nitrogen, phosphorus and potassium deficiency in soybean plants. YOLOv8 was trained on 6020 RGB images and achieved precision scores ranging from 90.03% to 96.54%, with potassium deficiency detected most accurately. This work offered a fast, accurate and scalable approach to improve nutrient management in precision agriculture (Jeong et al., 2025).
       
Apart from this, there are studies on nutrient deficiency identification in other plants. Nutrient deficiency in rice and banana leaves was identified using VGG-16 and Inception-v3, with the Inception-v3 model reaching 93% accuracy (Mkhatshwa et al., 2024). ConvNet based models were employed to identify six nutrient deficiencies in palm leaves and reached an accuracy of 94% (Ibrahim et al., 2022). Calcium and magnesium deficiencies in tomato plants were predicted using transfer learning based feature extraction with Inception-V3, ResNet50 and VGG16, with random forest and SVM used for deficiency detection (Kusanur and Chakravarthi, 2021). The VGG16 deep learning model was used to detect nitrogen, phosphorus and potassium deficiencies in hydroponic basil with an accuracy of 94% (Gul and Bora, 2023).
       
The manifestation of physical symptoms due to nutrient deficiencies in plants has motivated researchers in precision agriculture to investigate this area more extensively. Since only a few deep learning models exist for identifying nutrient deficiency in soybean plants, this study focuses on detecting nitrogen, phosphorus and potassium macro nutrient deficiencies, which would help farmers in early decision making.
The workflow of the proposed system to identify nutrient deficiency in soybean plants is shown in Fig 1. As the first step, images are captured from the field, pre-processed to a uniform size and labeled according to the nutrient deficiency. Next, they are split into two groups in the ratio of 80:20. The 80% group of soybean leaf images is used to train the CNN model and the 20% group is used to test it. Finally, performance metrics are calculated from the testing results of the trained model. This experiment was carried out from July 2024 to May 2025 at Government College of Engineering Srirangam, Trichy.

Fig 1: Flow diagram of the proposed system.


 
Dataset collection
 
The proposed model analyses leaf images of the soybean plant to identify nutrient deficiency. Soybean leaf images are captured from the field and labelled with the help of agricultural experts. The collected dataset contains 3309 images of soybean leaves grouped under four categories, namely Nitrogen deficient (ND), Phosphorus deficient (PD), Potassium deficient (KD) and Healthy (HY). There are 817 images in the Nitrogen deficient category, 805 images in the Phosphorus deficient category, 835 images in the Potassium deficient category and 852 images in the Healthy category. Images from Bevers et al. (2022) were referred to for the Healthy and Potassium deficient categories. A sample image from each category is presented in Fig 2.
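The 80:20 split of this dataset can be illustrated with a minimal sketch. The text does not say whether the split was stratified by class, so the per-class splitting, the `stratified_split` helper and the random seed below are assumptions made only for illustration; the class sizes are the ones reported above.

```python
import random

def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), splitting each class separately.

    Stratification is an assumption; the paper only states an 80:20 ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)  # size of the training part
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return train_idx, test_idx

# Class sizes as reported for the collected dataset (3309 images in total).
class_sizes = {"ND": 817, "PD": 805, "KD": 835, "HY": 852}
labels = [name for name, count in class_sizes.items() for _ in range(count)]

train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class keeps the four categories in roughly the same proportion in both groups, which matters little here because the classes are already nearly balanced.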

Fig 2: Sample soybean leaf images.


 
Pre-processing collected images
 
Image pre-processing enhances the quality of raw image data and improves the efficiency of subsequent image processing tasks (Maharana et al., 2022). The pre-processing operations used in the proposed system are resizing, normalization and augmentation. Images captured from the field may vary in size and resolution, but the deep learning models employed to detect deficiency in soybean leaf images expect input images of the same size. For this experiment, all images are resized to 256 × 256 pixels before being processed by the convolutional neural network model.
       
After resizing, the pixel intensities of the images are standardized using z-score normalization, which subtracts the mean of the pixel intensity values and divides by their standard deviation. The next pre-processing step in this experiment is data augmentation. Image augmentation expands the size and diversity of the training dataset by introducing variations that occur in real-world data, which improves the generalization ability of the CNN model. In this paper, data augmentation is carried out using geometric transformations such as rotation, flipping and cropping.
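A small sketch of z-score normalization and two of the geometric augmentations may make these steps concrete. It works on a tiny grayscale image stored as nested lists; whether the authors normalize per image or over the whole dataset is not stated, so per-image statistics are an assumption here, and the helper names are hypothetical.

```python
import statistics

def zscore_normalize(image):
    """Standardize a 2D image (list of rows) to zero mean and unit std.

    Per-image statistics are an assumption; dataset-wide statistics
    would be used the same way.
    """
    pixels = [value for row in image for value in row]
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(value - mean) / std for value in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[0, 50], [100, 250]]
normalized = zscore_normalize(image)
flipped = horizontal_flip(image)
rotated = rotate_90_clockwise(image)
```

Each augmented copy keeps the label of the original leaf image, so the training set grows without new field captures.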
 
Architecture of the proposed CNN model
 
Deep learning models such as convolutional neural networks (CNN) are used to analyze visual information such as images and videos (El et al., 2024). CNN models are widely used in agriculture for plant disease classification (Mostafa et al., 2025; Bhavani, 2025), weed detection (García-Navarrete et al., 2024; Hashemi-Beni et al., 2022), crop yield prediction (Kalmani et al., 2024), etc. This paper proposes a CNN model to identify nutrient deficiency in soybean plants.
       
The proposed CNN architecture has two major parts, namely the feature extraction part and the classification part. The feature extraction part analyses the image under study and extracts numeric features, which are used by the classification part. In this paper, feature extraction is done using the convolution operation and the average pooling operation. The classification part is built with fully connected neural network layers. Its output classifies the features extracted from each input image as Nitrogen deficient, Phosphorus deficient, Potassium deficient or Healthy.
       
Fig 3 depicts the architecture of the CNN model developed in this experiment. It has six blocks for feature extraction process. Each feature extraction block contains three parts namely convolution layer, batch normalization layer and average pooling layer.

Fig 3: Architecture of the proposed CNN model.


       
Batch normalization is applied after the activation function has been applied to the feature maps from the convolution layer. This stabilizes and accelerates the learning process of the CNN. The output of the last block is flattened, that is, the multiple 2D feature maps are converted to a single one-dimensional array that is passed to the classification block built with fully connected layers. In the proposed architecture, there are three dense layers with 64, 8 and 4 neurons respectively. The four neurons in the output layer represent the four classes, namely Nitrogen deficient, Phosphorus deficient, Potassium deficient and Healthy. The major operations of the proposed model are explained below.
 
Convolution layer
 
The convolution operation is the heart of the CNN, in which filters are slid over the image to produce a feature map. In this operation, each pixel in the sliding window of the image is multiplied with the corresponding element in the kernel and all the products are summed up. This operation can be defined as in equation (1).

$$F_{map}(i, j) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \mathrm{IM}(i + m,\, j + n) \cdot \mathrm{KM}(m, n)$$             ...(1)

Where,
IM = The input feature map.
KM = The kernel of size M×N.
F_{map} = The output feature map obtained after the convolution operation.
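The multiply-and-sum operation described here can be sketched in a few lines of plain Python. The function name, the example image and the vertical-edge kernel are illustrative choices, not taken from the paper; no padding and a stride of 1 are assumed.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (both lists of rows) with stride 1 and
    no padding; each output value is the sum of elementwise products."""
    m, n = len(kernel), len(kernel[0])
    out_h = len(image) - m + 1
    out_w = len(image[0]) - n + 1
    fmap = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for a in range(m):
                for b in range(n):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        fmap.append(row)
    return fmap

# Illustrative 4x4 input and a 3x3 vertical-edge kernel (hypothetical values).
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # 2x2 map; every entry is -6.0 for this input
```

A 4×4 input with a 3×3 kernel yields a 2×2 feature map, since each spatial dimension shrinks by the kernel size minus one.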
       
In this paper, the kernel size M×N is 3×3 and a stride of 1 is adopted. Thus, when no padding is applied, the output dimension of the feature map is given by equation (2).

$$F_{map,dim} = \mathrm{IM}_{dim} - \mathrm{KM}_{dim} + 1$$             ...(2)

Next, an activation function is applied to each value of the feature map. In this paper, the rectified linear unit (ReLU), which replaces negative values with zero, is used as the activation function and is given by equation (3).

$$\mathrm{ReLU}(x) = \max(0, x)$$             ...(3)
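As a worked illustration of the output-size rule and the activation function, the sketch below traces the spatial size of a 256 × 256 input through six feature extraction blocks. The pooling window size is not stated in the text, so a 2×2 window with stride 2 is an assumption, as are the function names.

```python
def relu(x):
    """ReLU activation: negative inputs become zero."""
    return max(0.0, x)

def conv_output_size(input_size, kernel_size=3):
    """Output size for stride 1 and no padding."""
    return input_size - kernel_size + 1

def trace_feature_map_sizes(input_size=256, blocks=6, kernel_size=3, pool_size=2):
    """Spatial size after each conv + pooling block.

    pool_size=2 (with stride 2) is an assumption, not taken from the paper.
    """
    sizes = []
    size = input_size
    for _ in range(blocks):
        size = conv_output_size(size, kernel_size)  # 3x3 convolution
        size = size // pool_size                    # average pooling
        sizes.append(size)
    return sizes

sizes = trace_feature_map_sizes()
print(sizes)
```

Under these assumptions the last block outputs 2×2 feature maps, so the flattened vector passed to the dense layers has four values per filter of the final convolution layer.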

Average pooling layer
 
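The next important operation in the CNN is the pooling operation. It reduces the dimensions of the feature map by summarizing the values in each local window of the input. This paper utilizes average pooling, which is given by equation (4).

$$\mathrm{AvgPool} = \frac{1}{M_1 \times N_1} \sum_{m=1}^{M_1} \sum_{n=1}^{N_1} F_{map}(m, n)$$             ...(4)

Where,
M_1×N_1 = The size of the window for the pooling operation.
F_{map}(m, n) = The value at position (m, n) of the feature map within the current window.
Average pooling sums up all the values within the window and divides the sum by the total number of values in that window.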
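A minimal sketch of average pooling on a small feature map follows. Non-overlapping windows (stride equal to the window size) and the 2×2 window are assumptions for illustration; the input values are hypothetical.

```python
def average_pool(fmap, window=2):
    """Average each non-overlapping window x window patch of `fmap`."""
    pooled = []
    for i in range(0, len(fmap) - window + 1, window):
        row = []
        for j in range(0, len(fmap[0]) - window + 1, window):
            values = [fmap[i + a][j + b]
                      for a in range(window)
                      for b in range(window)]
            row.append(sum(values) / (window * window))
        pooled.append(row)
    return pooled

fmap = [[1, 3, 2, 4],
        [5, 7, 6, 8],
        [9, 11, 10, 12],
        [13, 15, 14, 16]]

pooled = average_pool(fmap)
print(pooled)  # [[4.0, 5.0], [12.0, 13.0]]
```

Each output value is the mean of one 2×2 patch, so the 4×4 map is halved in both dimensions.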
 
Fully connected layer
 
The feature maps from the last feature extraction block of the CNN are first flattened to a one-dimensional array. This array is fed to a feed-forward neural network built with fully connected layers, in which every neuron in one layer is directly connected to all neurons in the following layer. The proposed model has three such layers with 64, 8 and 4 neurons respectively. The 4 neurons in the output layer correspond to the four classes under study. The output of each neuron is given by equations (5) and (6).

$$z = \sum_{i=1}^{X} w_i x_i + b$$              ...(5)

$$O = A(z)$$              ...(6)

Where,
w_i = Weight associated with the ith neuron of the previous layer.
x_i = Input received from the ith neuron of the previous layer.
X = Count of neurons in the previous layer.
b = Bias value.
A(z) = Activation function applied on z to generate the output O.
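The weighted-sum-plus-bias computation of a fully connected layer can be sketched as follows. The weights, biases and inputs are made-up numbers chosen so the result is easy to check by hand; `dense_layer` is a hypothetical helper, not part of any library.

```python
def relu(z):
    return max(0.0, z)

def dense_layer(inputs, weights, biases, activation):
    """Compute one fully connected layer.

    weights[k] holds the weights of neuron k, one per input value.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs

inputs = [1.0, 2.0]
weights = [[0.5, -1.0],   # neuron 0: z = 0.5 - 2.0 + 0.0 = -1.5
           [1.0, 1.0]]    # neuron 1: z = 1.0 + 2.0 + 0.5 = 3.5
biases = [0.0, 0.5]

outputs = dense_layer(inputs, weights, biases, relu)
print(outputs)  # [0.0, 3.5]
```

The first neuron's negative pre-activation is clipped to zero by ReLU, while the second passes through unchanged.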
       
In this experiment, the ReLU activation function given by equation (3) is used in the hidden layers. The final output layer uses the softmax activation function, which converts the numerical values into a probability distribution. The neuron with the highest probability determines the predicted nutrient deficiency category. The softmax function is given by equation (7).

$$\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}}$$              ...(7)

Where,
z_i = The value computed by the ith neuron of the output layer before activation.
N = The number of classes; in this experiment N = 4 as there are four classes.
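The following sketch shows how softmax turns four output scores into probabilities and how the predicted category is read off. The order of the class names and the example scores are assumptions for illustration; subtracting the maximum score before exponentiating is a common numerical-stability step and does not change the result.

```python
import math

# Assumed class order; the paper does not state the output ordering.
CLASS_NAMES = ["Nitrogen deficient", "Phosphorus deficient",
               "Potassium deficient", "Healthy"]

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    shift = max(scores)  # for numerical stability only
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores):
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]

scores = [2.0, 1.0, 0.1, -1.0]  # hypothetical output-layer values
probs = softmax(scores)
label, confidence = predict_class(scores)
print(label, round(confidence, 3))
```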
 
Nutrient deficiency detection
 
The experiments for this paper were carried out in a runtime environment with a T4 GPU and 12 GB RAM. The CNN architecture was implemented in the Python programming language using the Keras library. After several rounds of fine tuning, the final values of the hyper-parameters used to train the CNN model are given in Table 1.

Table 1: Hyper-parameters used to build CNN model.


       
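After setting up the environment and hyper-parameters, the proposed CNN model is trained with 80% of the images. The trained CNN model is then tested with the remaining 20% of the images, which were not used for training. The metrics used to evaluate the trained CNN model are precision, recall, F1-score and accuracy. Classwise precision, classwise recall and classwise F1-score for each nutrient deficiency category are calculated using equations (8), (9) and (10) respectively.

Precision of class k is given by:

$$Precision_k = \frac{TP_k}{TP_k + FP_k}$$              ...(8)

Recall of class k is given by:

$$Recall_k = \frac{TP_k}{TP_k + FN_k}$$              ...(9)

F1-score of class k is given by:

$$F1_k = \frac{2 \times Precision_k \times Recall_k}{Precision_k + Recall_k}$$              ...(10)

Where,
TP_k = Number of class k images correctly predicted as class k.
FP_k = Number of images of other classes wrongly predicted as class k.
FN_k = Number of class k images wrongly predicted as another class.
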
 
The macro average of each metric is calculated by summing the classwise metric values and dividing by the number of classes (here, 4). For the weighted average of each metric, the number of test images in each class is used as the weight.
               
The overall percentage of accuracy is given by equation (11).

$$Accuracy = \frac{\text{Number of correctly classified test images}}{\text{Total number of test images}} \times 100$$              ...(11)
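The classwise metrics, their macro and weighted averages and the overall accuracy can all be derived from a confusion matrix, as in the sketch below. The 4×4 matrix is a small hypothetical example, not the paper's result, and `classification_report` here is a local helper rather than a library function.

```python
def classification_report(confusion):
    """confusion[i][j] = number of class-i images predicted as class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # missed class-c images
        fp = sum(confusion[r][c] for r in range(k)) - tp   # wrongly labeled as c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall,
                          "f1": f1, "support": tp + fn})
    names = ("precision", "recall", "f1")
    macro = {m: sum(r[m] for r in per_class) / k for m in names}
    weighted = {m: sum(r[m] * r["support"] for r in per_class) / total
                for m in names}
    accuracy = 100.0 * sum(confusion[c][c] for c in range(k)) / total
    return per_class, macro, weighted, accuracy

# Hypothetical confusion matrix (rows: true class, columns: predicted class).
confusion = [[9, 1, 0, 0],
             [0, 10, 0, 0],
             [1, 0, 8, 1],
             [0, 0, 0, 10]]

per_class, macro, weighted, accuracy = classification_report(confusion)
print(round(accuracy, 2), round(macro["f1"], 4))
```

Weighted-average recall always equals overall accuracy (as a fraction), because both reduce to the number of correct predictions divided by the number of test images.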

The proposed CNN architecture has six convolution and average pooling blocks for feature extraction. Because batch normalization is placed between the convolution and average pooling layers, the architecture contains some non-trainable parameters. Table 2 gives the number of parameters in each block along with the number of trainable and non-trainable parameters. In the proposed CNN architecture, the number of non-trainable parameters is negligible compared to the number of trainable parameters.
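How trainable and non-trainable parameters arise can be illustrated with a short counting sketch. The filter counts per block are not given in the text, so the `filters` tuple below is purely hypothetical, as is the 2×2 final feature map size (which follows from assuming 2×2 pooling). The batch normalization count follows the Keras convention of two trainable values (scale and shift) and two non-trainable values (moving mean and moving variance) per channel.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels

def batchnorm_params(channels):
    """Return (trainable, non_trainable) counts for one batch-norm layer."""
    return 2 * channels, 2 * channels

def dense_params(inputs, units):
    """Weights plus one bias per neuron."""
    return (inputs + 1) * units

def count_parameters(filters, input_channels=3, final_spatial=2,
                     dense_units=(64, 8, 4)):
    trainable = 0
    non_trainable = 0
    channels = input_channels
    for f in filters:
        trainable += conv_params(channels, f)
        bn_trainable, bn_frozen = batchnorm_params(f)
        trainable += bn_trainable
        non_trainable += bn_frozen
        channels = f
    features = final_spatial * final_spatial * channels  # flattened size
    for units in dense_units:
        trainable += dense_params(features, units)
        features = units
    return trainable, non_trainable

# Hypothetical filter counts for the six blocks (not from the paper).
filters = (16, 32, 64, 64, 128, 128)
trainable, non_trainable = count_parameters(filters)
print(trainable, non_trainable)
```

With these illustrative sizes the non-trainable share is well under one percent of the total, which matches the qualitative observation that it is negligible.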

Table 2: Parameters count in the proposed CNN model.


 
Training performance
 
The accuracy and loss in each epoch of CNN training and validation are recorded and plotted. Fig 4 displays the accuracy of the CNN model during training and validation and Fig 5 gives the corresponding training and validation loss. From these graphs it is observed that training and validation saturate at 46 epochs with an accuracy of 97% and a loss of 0.0313; training was therefore stopped at 50 epochs to avoid overfitting.
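One common way to decide when a loss curve has saturated is a patience rule: stop once the validation loss has not improved for a fixed number of epochs. The sketch below is a generic illustration of that rule with a made-up loss curve; it is not claimed to be the procedure the authors used, and the `patience` value is an assumption.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops at the first epoch where the validation loss has not
    improved by more than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch

# Hypothetical validation losses, one value per epoch.
val_losses = [0.9, 0.5, 0.3, 0.2, 0.21, 0.22, 0.2, 0.25, 0.3]
stop_epoch, best_epoch = early_stopping_epoch(val_losses)
print(stop_epoch, best_epoch)  # 7 4
```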

Fig 4: Accuracy comparison during training and validation.



Fig 5: Loss comparison during training and validation.


 
Testing performance
 
The evaluation metrics classwise precision, classwise recall, classwise F1-score and accuracy are calculated using equations (8), (9), (10) and (11) respectively. First, a confusion matrix is drawn to identify the number of correctly classified and misclassified images, and is given in Fig 6. From the confusion matrix it is observed that the misclassification rates are 3.07%, 1.86%, 2.99% and 2.35% for the ND, PD, KD and HY classes respectively.

Fig 6: Confusion matrix for soybean macro nutrient deficiency detection.


       
Next, the evaluation metrics are calculated and presented as an evaluation report in Table 3. The classwise precision and classwise recall of each category lie between 96% and 98%, while the harmonic mean (F1-score) of every class lies between 97% and 97.7%. This reflects that the proposed CNN model performs well in identifying nutrient deficiency. It is also observed that the overall accuracy is 97.43% and that the macro average and weighted average of all the metrics are also 97.43%. This indicates that there are very few false predictions and that the model correctly identifies the nutrient deficiency for most of the input test images.
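The reported misclassification rates and overall accuracy can be cross-checked with simple arithmetic. The per-class test counts and error counts below are not stated in the text; they are inferred here as one set of whole numbers (roughly 20% of each class) that reproduces the reported percentages, and should be read as an assumption rather than as the actual confusion matrix.

```python
# (test images, misclassified images) per class: inferred, not from the paper.
inferred_counts = {"ND": (163, 5), "PD": (161, 3), "KD": (167, 5), "HY": (170, 4)}

# Misclassification rate per class = misclassified / test images * 100.
rates = {name: round(100 * wrong / n, 2)
         for name, (n, wrong) in inferred_counts.items()}

total_images = sum(n for n, _ in inferred_counts.values())
total_wrong = sum(wrong for _, wrong in inferred_counts.values())
accuracy = round(100 * (total_images - total_wrong) / total_images, 2)

print(rates)
print(total_images, total_wrong, accuracy)
```

Under these inferred counts, 17 errors among 661 test images reproduce both the four classwise rates and the 97.43% overall accuracy, so the reported figures are mutually consistent.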

Table 3: Proposed CNN model-evaluation report.


       
The CNN model proposed in this paper is compared against existing state-of-the-art CNN models, namely VGG16, ResNet50 and MobileNetV3, each with its final layer replaced by a four-neuron output layer. Fig 7 gives the comparison chart of their accuracy percentages. It is evident that the proposed CNN model performs on par with the pre-trained CNN models.

Fig 7: Proposed CNN model vs other deep learning models-Accuracy comparison.

Nutrient deficiency can hinder the development and yield of soybean plants; hence, this paper developed a CNN model to identify nutrient deficiency in soybean leaves. The input images were collected from the field, labeled and split into two groups for training and testing. The newly built CNN model was trained on the training group to detect nutrient deficiency, and the trained model was verified with the group reserved for testing. The results obtained by the proposed CNN model were evaluated for precision, recall, F1-score and accuracy. The testing results show that the proposed model works well in identifying nutrient deficiency in soybean plants, with an accuracy of 97.43%. The proposed CNN model could be used for early identification of nutrient deficiency in soybean plants and would help farmers minimize production loss through early decision making.
None.
All authors report no conflicts of interest.  

  1. Bagale, S. (2021). Nutrient management for soybean crops. International Journal of Agronomy. 2021(1): 3304634.

  2. Bevers, N., Sikora, E.J. and Hardy, N.B. (2022). Pictures of diseased soybean leaves by category captured in field and with controlled backgrounds: Auburn soybean disease image dataset (ASDID) [Data set]. Zenodo. https://doi.org/10.5061/dryad.41ns1rnj3.

  3. Bhavani, R. (2025). Detection of leaf diseases in soybean plant using autoencoder and multinomial logistic regression. Legume Research. 48(5): 876-884. doi: 10.18805/LR-5461.

  4. Dhal, S.B., Mahanta, S., Moore, J.M. and Kalafatis, S. (2024). Machine learning-based analysis of nutrient and water uptake in hydroponically grown soybeans. Scientific Reports. 14(1): 24337.

  5. El, S.M., Mothe, J. and Ivanovici, M. (2024). Images and CNN applications in smart agriculture. European Journal of Remote Sensing. 57(1): 2352386.

  6. Furlanetto, R.H., Crusiol, L.G.T., Nanni, M.R., Oliveira Junior, A. D. and Sibaldelli, R.N.R. (2024). Hyperspectral data for early identification and classification of potassium deficiency in soybean plants [Glycine max (L.) Merrill]. Remote Sensing. 16(11): 1900.

  7. García-Navarrete, O.L., Correa-Guimaraes, A. and Navas-Gracia, L.M. (2024). Application of convolutional neural networks in weed detection and identification: A systematic review.  Agriculture. 14(4): 568.

  8. Gul, Z. and Bora, S. (2023). Exploiting pre-trained convolutional neural networks for the detection of nutrient deficiencies in hydroponic basil. Sensors. 23(12): 5407.

  9. Hashemi-Beni, L., Gebrehiwot, A., Karimoddini, A., Shahbazi, A. and Dorbu, F. (2022). Deep convolutional neural networks for weeds and crops discrimination from UAS imagery.  Frontiers in Remote Sensing. 2022(3): 755939.

  10. Hassanijalilian, O., Igathinathane, C., Bajwa, S. and Nowatzki, J. (2020). Rating iron deficiency in soybean using image processing and decision-tree based models. Remote Sensing. 12(24): 4143.

  11. Ibrahim, S., Hasan, N., Sabri, N., Abu Samah, K.A.F. and Rahimi Rusland, M. (2022). Palm leaf nutrient deficiency detection using convolutional neural network (CNN). International Journal of Nonlinear Analysis and Applications. 13(1): 1949-1956.

  12. Jeong, M., Park, S., Kwon, S.M., Lim, K., Jung, D.R., Lee, H.S., Kim, H.J. and Shin, J.H. (2025). Rapid detection of soybean nutrient deficiencies with YOLOv8s for precision agriculture advancement. Scientific Reports. 2025(15): 13810.

  13. Kalmani, V.H., Dharwadkar, N.V. and Thapa, V. (2024). Crop yield prediction using deep learning algorithm based on CNN-LSTM with attention layer and skip connection. Indian Journal of Agricultural Research. 59(8): 1303-1311. doi: 10.18805/IJARe.A-6300.

  14. Kusanur, V. and Chakravarthi, V.S. (2021). Using transfer learning for nutrient deficiency prediction and classification in tomato plant. International Journal of Advanced Computer Science and Applications. 12(10): 784-790.

  15. Lavanya, M., Devi, M.K. and Vani, M.S. (2022). Deep learning for identification of plant nutrient deficiencies. Journal of Pharmaceutical Negative Results. 13(9): 3284-3291.

  16. Li, X., Gai, J., Chang, W. and Zhang, C. (2010). Identification of phosphorus starvation tolerant soybean (Glycine max) germplasms. Frontiers of Agriculture in China. 2020(4): 272-279.

  17. Lisciani, S., Marconi, S. and Le Donne, C. (2024). Legumes and common beans in sustainable diets: nutritional quality, environmental benefits, spread and use in food preparations.  Frontiers in Nutrition. 11: 1385232.

  18. Ma, L., Fang, J., Chen, Y. and Gong, S. (2010). Color analysis of leaf images of deficiencies and excess nitrogen content in soybean leaves. Proceedings of International Conference on E-Product E-Service and E-Entertainment. IEEE. pp 1-3.

  19. Maharana, K., Mondal, S. and Nemade, B. (2022). A review: Data pre-processing and data augmentation techniques. Proceedings of International Conference on Intelligent Engineering Approach. 3(1): 91-99.

  20. Mkhatshwa, J., Kavu, T. and Daramola, O. (2024). Analysing the performance and interpretability of CNN-based architectures for plant nutrient deficiency identification. Computation. 12(6): 113.

  21. Mostafa, A., Alnuaim, A. and AlZubi, A.A. (2025). Utilizing convolutional neural networks for accurate detection of leaf diseases in fava beans. Legume Research. 48(3): 494-502. doi: 10.18805/LRF-823.

  22. Santana, D.C., Izabela, C.O., Sâmela, B.C., Paulo, H.M., Marcelo, C.M.T.F., João, L.D.S. and Larissa, P.R.T. (2024). Classification of soybean genotypes as to calcium, magnesium and sulfur content using machine learning models and UAV-multispectral sensor. AgriEngineering. 6(2): 1581-1593.

  23. Sartin, M., Alexandre C.R., Kappes, C. and Tercio A.S.F. (2020). Classifying the macronutrient deficiency in soybean leaf with deep learning. Proceedings of the XVII National Meeting on Artificial and Computational Intelligence. pp 638-649.

  24. Wang, X.G., Zhao, X.H., Jiang, C.J., Li, C.H., Cong, S., Wu, D., Chen, Y.Q., Yu, H.Q. and Wang, C.Y. (2015). Effects of potassium deficiency on photosynthesis and photoprotection mechanisms in soybean [Glycine max (L.) Merr.]. Journal  of Integrative Agriculture. 14(5): 856-863.
Published In
Legume Research
