## Indian Journal of Agricultural Research

**Chief Editor** V. Geethalakshmi | **Print ISSN** 0367-8245 | **Online ISSN** 0976-058X | **NAAS Rating** 5.60 | **SJR** 0.293

Frequency :

Bi-monthly (February, April, June, August, October and December)

Indexing Services :

BIOSIS Preview, ISI Citation Index, Biological Abstracts, Elsevier (Scopus and Embase), AGRICOLA, Google Scholar, CrossRef, CAB Abstracting Journals, Chemical Abstracts, Indian Science Abstracts, EBSCO Indexing Services, Index Copernicus

Indian Journal of Agricultural Research, volume 56 issue 5 (October 2022) : 519-526

Wheat Crop Yield Estimation using Geomatics Tools in Saharanpur District

Ankush Kumar Gupta^{1}, Pramod Soni^{2}

**Submitted** 05-07-2019 | **Accepted** 18-02-2021 | **First Online** 18-06-2021

Agricultural crop production is usually estimated as the product of the area under the crop and the average yield per unit area of the crop. There are several methods for crop yield estimation. The traditional method is based on crop cutting experiments (CCE), which involve sample surveys to estimate the yield. This method is labour-intensive, expensive and imprecise (Sukhatme and Panse, 1951).

Remote-sensing data acquired by satellite have a wide scope for agricultural applications owing to their synoptic and repetitive coverage. Since the early 1970s, the remote sensing approach has shown great potential around the world for improving agricultural statistics, and spectral data acquired *via* satellite have been extensively utilized for crop yield modeling in various parts of the world. Dubey *et al*. (1994) found that the yield obtained from crop-cutting sites was linearly related to the Ratio Vegetation Index (RVI) derived from Landsat MSS data of the corresponding sites in Ludhiana and Patiala districts of Punjab. Prasad *et al*. (2006) considered NDVI, soil moisture, surface temperature and rainfall data of Iowa state, US, for 19 years for crop yield assessment and prediction using a piecewise linear regression method with break-point. The predicted values were found to be very close to the observed values (R^{2}=0.78 for corn and R^{2}=0.86 for soybean). Rojas (2007) reported the development of an operational spectro-agrometeorological yield model for maize in the six large maize-growing provinces of Kenya, using a spectral index (the NDVI derived from SPOT-Vegetation), meteorological data obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) model and crop-water status indicators estimated by the Crop-Specific Water Balance (CSWB) model. When the jack-knife re-sampling technique was applied, the forecast capability of the model improved (R^{2}=0.81 and RMSE=0.359 t/ha). Bazgeer *et al*. (2006) developed wheat yield prediction models for Hoshiarpur and Rupnagar districts of Punjab using different agrometeorological indices, the spectral index NDVI and trend predicted yield (TPY). The agromet-spectral-trend-yield model could explain 96% (SE=87 kg/ha) and 91% (SE=146 kg/ha) of the wheat yield variations for Hoshiarpur and Rupnagar districts, respectively.

Verma *et al*. (2011) predicted wheat yield in various districts of Haryana state for the period 2001-02 to 2007-08 with a linear regression model that showed an R^{2} value of 0.81 and a standard error of 192.73. Iqbal *et al*. (2012) constructed a statistical model of wheat yield in the Punjab province, Pakistan, using multiple regression analysis. Bazgeer *et al*. (2007) used different meteorological variables with agrometeorological indices for wheat yield prediction in Hamedan district during 2003-04 and 2004-05. Verma *et al*. (2016) developed zonal weather models (R^{2} of 0.81) for district-level mustard yield estimation on an agro-climatic zone basis in Haryana state based on five weather variables, *i.e.*, maximum temperature, minimum temperature, rainfall, relative humidity and sunshine hours.

The objective of the present study is to prepare and evaluate a suitable statistical model for crop yield prediction using remote sensing data for Saharanpur district of Uttar Pradesh.

The present study was conducted over the Saharanpur district of Uttar Pradesh, India (Fig 1). The district is situated between 29°34'45" and 30°21'30"N latitude and 77°09'00" and 78°14'45"E longitude having an area of 3,689 km

Different meteorological parameters, satellite data and ground truth data were collected from various sources. The collected information helped in identifying the best period for remote sensing data acquisition and in developing the wheat crop yield prediction. A digital boundary map of Saharanpur district was prepared using the geographic information system. Remote sensing data from the Landsat-7 (2006-2013) and Landsat-8 (2013-2016) satellites were collected from the United States Geological Survey (USGS) for the period 2006-2016, during the flowering stage (maximum vegetation growth) of the wheat crop (mainly the month of March). A total of 11 images were used in the present study.

Ground truth information (40 locations, 28

Initially, the district boundary mask of Saharanpur was generated in Q-GIS using the topographic map at a scale of 1:250,000. A false colour composite (FCC) using spectral data from band-2 (Green), band-3 (Red) and band-4 (near infrared) was generated in ERDAS (Earth Resources Data Analysis System) IMAGINE software. Further, the ground truth information obtained from the GPS was used to identify the wheat crop plots. The GPS was taken to the plots and the longitude and latitude of each plot were recorded. These locations were then identified on the FCCs. The coordinates of each plot in terms of scan-line and column number were recorded to identify these plots on the normalized difference vegetation indices (Table 2) and the area was calculated as given in Table 3.
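As an illustration of the index used to locate the wheat plots, the following minimal sketch computes NDVI from red and near-infrared reflectance using the standard definition, NDVI = (NIR - Red) / (NIR + Red). The function names and the choice to return 0.0 where both bands are zero are assumptions made for this example; the exact index definitions used in the study are those of Table 2.

```python
def ndvi(nir, red):
    """Standard NDVI for one pixel: (NIR - Red) / (NIR + Red).

    Returning 0.0 when both bands are zero is an assumption of this
    sketch to avoid division by zero (e.g. for masked pixels).
    """
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

Dense, healthy vegetation reflects strongly in the near infrared and weakly in the red, so its NDVI approaches 1, whereas bare soil and water give values near or below zero.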

The land use land cover (LULC) of the district was classified into six categories: (1) wheat crop, (2) other crop, (3) urban, (4) forest, (5) water and (6) sand. Supervised classification was carried out by preparing the separability curve. Bands 3, 5 and 6 showed the maximum separability; hence, they were used for supervised classification with the maximum likelihood classification method. The accuracy assessment was carried out with a threshold of 85%. The total area of each class was calculated by counting the number of pixels in the class and multiplying it by the pixel area given by the spatial resolution of the image (30 m × 30 m). The relative deviation of the wheat crop acreage was estimated by comparing it with the actual data obtained from the district headquarters.
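The pixel-counting arithmetic described above can be sketched as follows. The function names are hypothetical, and the relative deviation formula (absolute difference as a percentage of the reported acreage) is an assumption, since the text does not state the exact expression used.

```python
def acreage_ha(pixel_count, pixel_side_m=30.0):
    """Area of a class in hectares from its pixel count.

    Each 30 m x 30 m pixel covers 900 m^2, i.e. 0.09 ha.
    """
    return pixel_count * pixel_side_m ** 2 / 10_000.0


def relative_deviation_pct(estimated, reported):
    """Assumed definition: |estimated - reported| as a percentage of reported."""
    return abs(estimated - reported) * 100.0 / reported
```

For example, 1,000 wheat pixels correspond to 90 ha, and an estimate of 94 ha against a reported 100 ha gives a relative deviation of 6%.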

Finally, linear regression techniques were applied to formulate a suitable model. In the first step, correlation analysis was performed to find the preliminary relationship between the spectral indices/agrometeorological variables and wheat crop yield. Next, step-wise linear regression was performed to find the best set of variables that can effectively predict the wheat crop yield. At each step of the regression, the variables having non-significant coefficients (at the 95% confidence level) were removed from the subsequent steps. The general equation of the regression is given below:
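The elimination procedure described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the data are a synthetic toy set (not the study's data), a fixed cutoff on |t| (the hypothetical parameter `t_crit`) stands in for the p-value test, and one variable is dropped per pass, whereas the study removes several variables in a single step. All function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients and t-statistics (coefficient / standard error).

    Assumes the fit is not exact (non-zero residual variance).
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [y[k] - sum(X[k][j] * beta[j] for j in range(p)) for k in range(n)]
    s2 = sum(r * r for r in resid) / (n - p)  # residual variance
    # Diagonal of (X'X)^-1, obtained by solving against each unit vector.
    diag = [solve(xtx, [1.0 if r == i else 0.0 for r in range(p)])[i]
            for i in range(p)]
    t = [beta[i] / math.sqrt(s2 * diag[i]) for i in range(p)]
    return beta, t


def backward_eliminate(columns, y, t_crit):
    """Refit repeatedly, dropping the least significant term until all |t| >= t_crit."""
    names = list(columns)
    while True:
        X = [[columns[name][k] for name in names] for k in range(len(y))]
        beta, t = ols(X, y)
        worst = min(range(len(names)), key=lambda i: abs(t[i]))
        if abs(t[worst]) >= t_crit or len(names) == 1:
            return dict(zip(names, beta))
        names.pop(worst)


# Synthetic toy data: y depends on x1 only; x2 is an irrelevant predictor.
x1 = [-1, -1, -1, -1, 1, 1, 1, 1]
x2 = [-1, -1, 1, 1, -1, -1, 1, 1]
noise = [-1, 1, -1, 1, -1, 1, -1, 1]
y = [10 + 5 * a + e for a, e in zip(x1, noise)]
model = backward_eliminate({"const": [1.0] * 8, "x1": x1, "x2": x2}, y, t_crit=2.0)
```

In this toy case the irrelevant predictor `x2` has a t-statistic of zero and is removed on the first pass, leaving the constant and `x1`. With real data, a statistics package that reports p-values directly would normally be used instead of a hand-written solver.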

Yield = *b*_{1} + *b*_{2} × *T*_{min} + *b*_{3} × *T*_{max} + *b*_{4} × *Rain* + *b*_{5} × *SH* + *b*_{6} × *GDD* + *b*_{7} × *TD* + *b*_{8} × *HTU* + *b*_{9} × *NDVI*

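Here, *T*_{min} and *T*_{max} are the minimum and maximum temperatures, *Rain* is the rainfall, *SH* is the sunshine hours, *GDD* is the growing degree days, *TD* is the temperature difference, *HTU* is the heliothermal units, *NDVI* is the normalized difference vegetation index and *b*_{1} to *b*_{9} are the regression coefficients.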

The various agrometeorological variables and spectral indices from the year 2006 to 2016 are shown in Table 4. The ground truth information about wheat crop yield was not available for the year 2016-17.

**Land use and land cover classification**

Fig 3 shows the classified image of Saharanpur district. The wheat crop covers most of the district, with some urban area in the central part and forests in the northern part of the district.

Accuracy assessment was performed after supervised classification. The classified image was accepted only when the accuracy was above 85%. The accuracy assessment report is shown in Table 5. The assessment is based on the comparison of two maps: the classified map, based on the analysis of remote sensing data, and the reference map, based on information derived from the actual ground. The reference total is the true class total on the reference map and the classified total is the class total on the classified map. On this basis, the accuracy for the wheat crop was found to be 85.71%, with an overall average accuracy of 81.22%.

**Acreage estimation and relative deviation**

For each vegetation index (VI), a thematic map with two layers was prepared using the respective range of the wheat crop, as shown in Fig 4. Wheat pixels were collected using the attribute table and the wheat acreage was calculated by multiplying the total number of pixels by the pixel area (30 m × 30 m) of the input image. The acreages estimated using the various VIs are shown in Table 5. The NDVI shows the least relative deviation of 5.97%, whereas TNDVI shows the maximum deviation of 9.79%.

**Correlation analysis**

Initially, correlation coefficients between wheat yield and the spectral indices/agrometeorological variables were obtained, as shown in Fig 5. The crop yield is strongly correlated with NDVI and minimum temperature, with correlation coefficients of 0.89 and -0.71, respectively. The yield has a feeble dependency on sunshine hours and temperature difference. Although rainfall plays an important role in crop production, it is not very effective in the Saharanpur district, as crop yield shows a small positive correlation of 0.54 with rainfall, probably due to the availability of other water sources in the district.

**Stepwise linear regression**

**Step-1 of regression**

In the first step, all eight variables were included and multivariate linear regression was performed. The general equation of the regression is given below:

Yield = *b*_{1} + *b*_{2} × *T*_{min} + *b*_{3} × *T*_{max} + *b*_{4} × *Rain* + *b*_{5} × *SH* + *b*_{6} × *GDD* + *b*_{7} × *TD* + *b*_{8} × *HTU* + *b*_{9} × *NDVI*

The best-fitted equation of regression is shown below.

Yield = 401.4 - 50.32 × *T*_{min} + 12.99 × *T*_{max} + 7.69 × *Rain* + 10.45 × *SH* + 23.96 × *GDD* + 21.93 × *TD* + 1.09 × *HTU* + 2348.7 × *NDVI*

The plot of observed and fitted wheat crop yield is shown in Fig 6 and the coefficients along with their uncertainty are given in Table 6. The t-stat is the coefficient divided by its standard deviation in the regression and the p-value represents the significance of the coefficient. A t-stat of more than 1 and a p-value of less than 0.05 are considered to indicate good significance of the coefficient and the corresponding variable.

With all the variables included in the regression, the model shows an RMSE of about 28.6 kg/ha and an R^{2} value of 0.993. However, from Table 7, we can see that the coefficients *b*_{5} and *b*_{6} (coefficients of SH and GDD) have very small t-stats (less than 0.1) and stand out from the other variables; therefore, they were removed from the analysis in the next step of regression.

**Step-2 of regression**

In the second step, six variables were included after discarding SH and GDD and regression was performed. The general equation for the regression is:

Yield = *b*_{1} + *b*_{2} × *T*_{min} + *b*_{3} × *T*_{max} + *b*_{4} × *Rain* + *b*_{5} × *TD* + *b*_{6} × *HTU* + *b*_{7} × *NDVI*

After performing the linear regression, the best-fitted equation is given below.

Yield = 379.13 - 35.90 × *T*_{min} + 18.21 × *T*_{max} + 7.69 × *Rain* + 26.49 × *TD* + 1.71 × *HTU* + 2348.7 × *NDVI*

The plot of observed and fitted crop yield is shown in Fig 7 and the coefficients with their uncertainty statistics are shown in Table 8.

By excluding SH and GDD from the regression, the model shows an improved RMSE of about 24.85 kg/ha and an R^{2} value of 0.993. However, from Table 8, we can see that the coefficients *b*_{1}, *b*_{4} and *b*_{6} (constant, Rain and HTU) have small t-stats and p-values of more than 0.05. Thus, they are not significant at the 95% confidence level; therefore, these variables were removed from the analysis in the next step of regression.

**Step-3 of regression**

In the third step, only four variables were included after discarding the constant term, Rain and HTU and regression was performed. The general form of the regression equation is:

Yield = *b*_{1} × *T*_{min} + *b*_{2} × *T*_{max} + *b*_{3} × *TD* + *b*_{4} × *NDVI*

The best-fitted equation is given below.

Yield = 37.66 × *T*_{min} + 25.51 × *T*_{max} + 34.21 × *TD* + 2728.5 × *NDVI*

The plot of observed and fitted crop yield is shown in Fig 8 and the coefficients with their uncertainty statistics are shown in Table 9.

Even with only four variables, the model shows an RMSE of about 49 kg/ha and an R^{2} value of 0.95. This underlines the importance of NDVI and temperature in the estimation of wheat crop yield for Saharanpur district. NDVI has the smallest p-value and hence the most significant coefficient in the regression. To further check the dependence of wheat crop yield on NDVI alone, another spectral yield model was evaluated by incorporating just the NDVI index.

**Spectral yield model**

In the spectral yield model, a linear relationship between crop yield and NDVI was assumed.

Yield = *b*_{1} + *b*_{2} × *NDVI*

The best-fitted regression equation is given below:

Yield = 2009 + 2399.56 × *NDVI*

The plot of observed and fitted crop yield is shown in Fig 9 and the coefficients with their uncertainty statistics are shown in Table 10.

We found an RMSE of about 88 kg/ha and an R^{2} of 0.808 by incorporating just NDVI. Hence, although NDVI is the most important index for crop yield estimation, the performance of the model was improved by including temperature and its indices.
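The two goodness-of-fit measures quoted for each model can be computed from the observed and fitted yields as in the following minimal sketch. The function names are hypothetical, and two conventions are assumed because the text does not specify them: the RMSE divides by the number of observations (not the residual degrees of freedom) and R^{2} is the usual centred form, 1 - SS_res / SS_tot.

```python
import math


def rmse(observed, fitted):
    """Root mean square error, dividing by the number of observations."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / n)


def r_squared(observed, fitted):
    """Centred coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

For a model fitted without a constant term, such as the four-variable model, some software reports an uncentred R^{2} instead, so values from different models are comparable only if the same convention is used throughout.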

The major conclusions of the present study are:

I. The acreage obtained from the NDVI showed the lowest relative deviation of 5.97% from actual acreage compared to other indices.

II. The stepwise regression revealed that rainfall is a non-significant predictor of wheat crop yield, likely owing to the dominant availability of groundwater sources in the district.

III. The spectral yield model with only NDVI shows an R^{2} value of 0.808 and an RMSE of 88. However, by incorporating the minimum and maximum temperature along with the temperature difference index, the R^{2} improves to 0.95 and the RMSE to 49.

The authors would like to thank the Saharanpur district headquarters for providing various agrometeorological data. We would also like to thank USGS for providing land use land cover data.

- Bazgeer, S., Mahey, R.K., Sharma, P.K., Sood, A. and Sidhu, S.S. (2006). Pre-harvest wheat yield prediction using agromet-spectral-trend-yield models for Hoshiarpur and Rupnagar districts of Punjab. Journal of the Indian Society of Remote Sensing. 34(3): 269-277.
- Bazgeer, S., Kamali, G. and Mortazavi, A. (2007). Wheat yield prediction through agrometeorological indices for Hamedan, Iran. Desert. 12(1): 33-38.
- Dubey, R.P., Ajwani, N., Kalubarme, M.H., Sridhar, V.N., Navalgund, R.R., Mahey, R.K. and Narang, R.S. (1994). Pre-harvest wheat yield and production estimation for the Punjab, India. International Journal of Remote Sensing. 15(10): 2137-2144.
- Iqbal, M.J., Ali, Z.U. and Ali, S.S. (2012). Agroclimatic modelling for estimation of wheat production in the Punjab Province, Pakistan. Proceedings of the Pakistan Academy of Sciences. 49(4): 241-249.
- Prasad, A.K., Chai, L., Singh, R.P. and Kafatos, M. (2006). Crop yield estimation model for Iowa using remote sensing and surface parameters. International Journal of Applied Earth Observation and Geoinformation. 8(1): 26-33.
- Rojas, O. (2007). Operational maize yield model development and validation based on remote sensing and agro meteorological data in Kenya. International Journal of Remote Sensing. 28(17): 3775-3793.
- Sukhatme, P.V. and Panse, V.G. (1951). Crop surveys in India-II. Journal of the Indian Society of Agricultural Statistics. 3(2): 95-168.
- Verma, U., Dabas, D.S., Hooda, R.S., Kalubarme, M.H., Yadav, M., Grewal, M.S. and Prawasi, R. (2011). Remote sensing based wheat acreage and spectral-trend-agrometeorological yield forecasting: Factor analysis approach. Statistics and Applications. 9(1): 1-13.
- Verma, U., Piepho, H.P., Goyal, A., Ogutu, J.O. and Kalubarme, M.H. (2016). Role of climatic variables and crop condition term for mustard yield prediction in Haryana. Int. J. Agric. Stat. Sci. 12: 45-51.

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
