Fruit quality is a critical factor throughout the supply chain, affecting waste reduction, economic viability and public health. Recognizing the limitations of traditional, often subjective, human-based quality assessment, this research proposes the application of machine learning algorithms as a more objective and efficient solution. Machine learning and image processing are used in the food industry for tasks such as sorting and grading (Begum and Hazarika, 2022). We leverage a variety of datasets, capturing mango varieties and the diseases found in mangoes, to rigorously evaluate the capacity of various machine learning models to predict key quality parameters accurately. This investigation seeks to establish a robust and scalable framework for enhancing quality control practices within the agricultural and food sectors.
Recent advances in this research area focus on fruit classification and identification, variety identification, maturity grading and defect detection. A key trend is the increasing use of deep learning techniques, particularly Convolutional Neural Networks (CNNs), which have demonstrated remarkable accuracy in learning complex visual features directly from mango images (Li and Chen, 2022; Patel and Gupta, 2023). Transfer learning, where models pre-trained on large image datasets are fine-tuned for mango-specific tasks, has also gained prominence due to its ability to achieve high performance with smaller datasets (Reddy and Rao, 2020; Joseph and Abraham, 2021).
Studies have explored a diverse range of supervised learning algorithms beyond deep learning. For instance, Random Forests (RF), Support Vector Machines (SVM), K-Nearest Neighbors (KNN) and Decision Trees have been employed for classification based on features extracted through traditional image processing techniques or even non-image data like spectral information from Near-Infrared Spectroscopy (NIR) (Singh and Kumar, 2021; Das and Ghosh, 2022; Suresh and Mohan, 2024). These methods often provide interpretable models and can be effective depending on the specific classification task and the nature of the available data. Ensemble methods, which combine the predictions of multiple learning algorithms, have also been investigated to improve the robustness and accuracy of mango fruit classification systems (Devi and Jain, 2024).
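As an illustration of how an ensemble can combine the predictions of several classifiers, the following minimal sketch applies majority voting to per-model label predictions. The voting rule, the function name `majority_vote` and the example labels are assumptions made for illustration; the cited studies may combine models differently.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label predictions by majority vote.

    predictions_per_model: list of lists, one inner list of labels per model,
    all of the same length (one label per sample).
    Ties go to the label that appears first in model order.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")
    combined = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers (e.g. RF, SVM, KNN) on four images
rf_pred = ["ripe", "unripe", "ripe", "ripe"]
svm_pred = ["ripe", "ripe", "unripe", "ripe"]
knn_pred = ["unripe", "unripe", "unripe", "ripe"]

ensemble_pred = majority_vote([rf_pred, svm_pred, knn_pred])
print(ensemble_pred)
```

With an odd number of models and two classes, ties cannot occur; for more classes the tie rule above is one arbitrary but deterministic choice.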
Furthermore, the focus of research has expanded to address real-world challenges such as variations in lighting, background complexity and the presence of noise in images. Preprocessing techniques like image enhancement, noise reduction and data augmentation are commonly applied to improve the quality of input data and the generalization ability of the classification models (Bhat and Mir, 2023). The development of mobile-based applications for on-site mango classification indicates a growing interest in deploying these technologies for practical use in agriculture and supply chain management (Joseph and Abraham, 2021). These advancements collectively highlight the significant potential of supervised learning to automate and enhance the efficiency and accuracy of mango fruit classification in various agricultural and commercial contexts.
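The data augmentation step mentioned above can be sketched as follows. This is a minimal, library-free example on a tiny grayscale image stored as a list of rows; the two transformations (horizontal flip and brightness scaling), the function names and the scaling factors are assumptions chosen for illustration, not the pipeline of any cited study.

```python
def horizontal_flip(image):
    """Mirror a 2-D grayscale image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, clamping to the 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def augment(image):
    """Return the original image plus simple flipped and re-lit variants."""
    return [
        image,
        horizontal_flip(image),
        adjust_brightness(image, 0.5),  # darker, simulating weak lighting
        adjust_brightness(image, 1.5),  # brighter, simulating strong lighting
    ]


# Tiny hypothetical 2x3 grayscale image
image = [[10, 100, 200],
         [0, 50, 250]]
variants = augment(image)
for v in variants:
    print(v)
```

Each training image then contributes several variants, which exposes the classifier to lighting and orientation changes it may meet in the field.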
Convolutional neural networks (CNNs) learn complex features, capture local patterns and generalise across varying lighting conditions, backgrounds and mango conditions. Vision Transformers (ViTs) perform well on large datasets and excel at modelling global dependencies.
This research combines the strengths of CNNs and ViTs to examine whether their fusion improves mango quality assessment accuracy and whether it generalises across multiple public datasets representing mango ripeness levels and mango variants. The work addresses the classification problem on several mango-related datasets using supervised learning. The datasets cover mangoes and the variations observed in them. Various machine learning and deep learning algorithms are tested on these datasets. The primary algorithms are Convolutional Neural Networks (CNNs) and their derivatives, and the Vision Transformer architecture is also used for classification. Feature extraction is performed with ResNet, MobileNet and ShuffleNet. The performance metrics obtained from these algorithms are then compared to determine the most accurate predictor for each dataset. The study aims at a robust, scalable and reproducible framework for mango quality assessment that can also be applied to self-collected repositories created to address agricultural problems.
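The comparison described above can be sketched in a few lines. The example below assumes one simple fusion rule, a weighted average of the class probabilities produced by two models (often called late fusion), and then selects the candidate with the highest accuracy. The fusion rule, all names and all numbers are hypothetical: the values are made up for illustration and are not results of this or any other study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def argmax_labels(probabilities, class_names):
    """Pick the most probable class name for each row of probabilities."""
    return [class_names[row.index(max(row))] for row in probabilities]


def late_fusion(probs_a, probs_b, weight_a=0.5):
    """Weighted average of two models' class probabilities (assumed rule)."""
    return [
        [weight_a * pa + (1.0 - weight_a) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(probs_a, probs_b)
    ]


# Made-up toy values for illustration only; not experimental results.
class_names = ["unripe", "ripe", "overripe"]
y_true = ["unripe", "ripe", "overripe", "ripe"]
cnn_probs = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.1, 0.2, 0.7], [0.2, 0.6, 0.2]]
vit_probs = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.2, 0.5, 0.3], [0.1, 0.8, 0.1]]

candidates = {
    "cnn": cnn_probs,
    "vit": vit_probs,
    "fusion": late_fusion(cnn_probs, vit_probs),
}
scores = {
    name: accuracy(y_true, argmax_labels(probs, class_names))
    for name, probs in candidates.items()
}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Repeating the same loop per dataset gives the per-dataset comparison; whether fusion actually helps is an empirical question that toy values like these cannot answer.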
Related work
The classification of fruits, fruit variants and fruit diseases is automated using computer vision and machine learning algorithms based on visual features. A significant body of research focuses on developing and comparing different image processing techniques and feature extraction methods for fruit recognition, often tested across publicly available and custom-built fruit image datasets of varying complexity. Comparative analyses across different fruit image datasets, considering factors such as image resolution, lighting conditions, fruit variety and background clutter, provide valuable insights into the robustness and generalizability of different classification approaches.
Transfer learning with a Convolutional Neural Network applied to the FIDS 30 dataset, which consists of mixed fruit images, gave 94.8% accuracy for classification and fruit detection (Geerthik et al., 2024). On the same dataset, AlexNet gave 75% accuracy (Geerthik et al., 2024). When a Recurrent Neural Network was applied to FIDS 30, the result obtained was 98.47% (Dhiman and Hu, 2021).
Fruit-360, a public dataset available on the Kaggle website, has been used by many researchers to test their models. It consists of more than 40,000 images of various fruits organised in training and testing folders (Oltean, 2025). A Convolutional Neural Network applied to images of apple, lemon and mango from the Fruit-360 dataset gave 95% accuracy (Bobde et al., 2021).
The fresh and rotten fruit images dataset is publicly available. VGG16 was used to extract features of specific fruits such as apples, oranges and bananas. Among the various classifiers tested on the dataset, the Support Vector Machine gave 99% accuracy (Mehta et al., 2021). Another work on the fresh and rotten fruit images used VGG16 and YOLOv5 on apples, oranges, bananas, grapes, tomatoes, onions, chilli and capsicum. VGG16 gave approximately 91% accuracy (Akshi et al., 2024). A publicly available dataset of fresh and stale images of fruits and vegetables was tested with the CNN, BiLSTM, CNN LSTM and CNN BiLSTM algorithms on selected items: apple, banana, tomato, bitter gourd, capsicum and orange. CNN BiLSTM performed better than the other three algorithms, with a result of 97.76% (Yuan and Alhudhaif, 2024).
Banana ripeness was tested on the banana-ripening-process dataset using a Deep CNN approach. The dataset is publicly available and contains more than 18,000 images. The variants YOLOv8n to YOLOv8x were tested on this dataset, with accuracy ranging from 94.60% (YOLOv8n) to 96.30% (YOLOv8x) (Aishwarya and Vinesh, 2023). As with bananas, deep learning methods have also been applied to grape leaves to detect diseases. Among DenseNet121, VGG19, VGG16, InceptionV3 and ResNet50V2, DenseNet121 achieved the highest accuracy, 99.86% (Patil and More, 2025).
O.O. Abayomi-Alli, R. Damaševičius, S. Misra and A. Abayomi-Alli created a new dataset known as FruitQ. The dataset consists of images of 11 fruits, each with three freshness classes: fresh, mildly rotten and fully rotten. It contains more than nine thousand images. Various deep learning algorithms, such as ShuffleNet, SqueezeNet, EfficientNet, ResNet18 and MobileNet-V2, were tested on this dataset. ResNet18 gave the best performance, with up to 99.80% accuracy (Abayomi-Alli et al., 2024).
Ripeness level in mango fruit has mostly been tested using self-collected repositories. Images of mango variants found in Malaysia, such as Harumanis and Sala, were collected to check the ripeness level. A Support Vector Machine classifier and an odour sensor were used to classify the images into ripe and unripe categories (Huang et al., 2023). To improve mango grading accuracy on ripeness level, various classifiers such as Random Forest, Gradient Boosting, Support Vector Machine (SVM), K-Nearest Neighbors (KNN) and Gaussian Naïve Bayes were applied to the Mango dataset. This dataset was prepared with Himsagor mangoes found in Bangladesh. A CNN and VGG16 were used for feature extraction. The test accuracy obtained is given in Table 1.
A dataset based on Alphonso mangoes from Mysore, Karnataka, was prepared and tested using various machine learning classifiers: threshold-based, Naïve Bayes, LDA, SVM, KNN and PNN. Hierarchical classification gave approximately 83% accuracy and single-shot multiclass classification gave 82% accuracy (Raghavendra et al., 2020).
Crop yield prediction has also been addressed with deep learning algorithms in settings where conventional methods are usually applied, resulting in lower prediction error and a strong correlation between predicted and actual values (Kalmani et al., 2025).
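The two quantities mentioned above, prediction error and correlation between predicted and actual values, can be computed as in the following minimal sketch. Root-mean-square error and the Pearson coefficient are assumed here as representative choices; the cited study may report different measures, and the yield values are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Made-up yields for illustration only; not data from any study.
actual_yield = [2.0, 4.0, 6.0, 8.0]
predicted_yield = [3.0, 5.0, 7.0, 9.0]

print(rmse(actual_yield, predicted_yield))
print(pearson_r(actual_yield, predicted_yield))
```

In this toy case every prediction is off by a constant amount, so the correlation is perfect while the error is not zero, which is why the two measures are usually reported together.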
A study on mango plant disease detection using ConvNeXt and Vision Transformer (ViT) achieved 98.40% accuracy on the MangoFruitDDS dataset and 99.87% accuracy on the MangoLeafBD dataset. These datasets are publicly available and cover various diseases observed on mango fruit and mango leaves (Alamri et al., 2025).