Regional Variability of Drought-Crop Sensitivities Across Iowa Using Unsupervised Learning
1. Introduction
Drought is among the most damaging natural hazards for agriculture (Maybank et al., 1995; Sivakumar, 2005). It reduces agricultural productivity, disrupts food supply chains, and places severe economic stress on farming populations (Lesk et al., 2016; Islam et al., 2025). Agricultural drought results from prolonged periods of low precipitation and increased atmospheric demand, leading to low soil moisture during critical stages of crop development (Bodner et al., 2015; Baydaroglu et al., 2024). Whereas meteorological and hydrological droughts are defined by atmospheric or hydrological deficits, agricultural drought is defined by its impact on crop health and productivity (Liu et al., 2016; Niyogi & Mishra, 2012; Zhao, 2024). Iowa, one of the most agriculturally productive states in the Midwest, consistently ranks at or near the top nationally for corn and soybean production (Egli, 2008; Grassini et al., 2015). Still, Iowa's climate has become increasingly variable, with recurrent drought events in recent years posing a significant threat to yield stability (Islam et al., 2024).
In the Midwest, drought has long been a subject of research; however, many studies assessing its impact on crops rely on data aggregated at the state or national level (Mallya et al., 2013; Yesilkoy et al., 2024a). These broad approaches often conceal important local differences resulting from variations in soil properties, topography, farming techniques, and microclimates (Carroll, 2012; Karlen et al., 2010). As such, they cannot fully depict the spatial complexity of drought-crop interactions. A uniform understanding of drought effects is insufficient for guiding adaptation measures in a state like Iowa, which contains both highly productive agricultural land and drought-prone areas with lower water retention. Station-level or sub-regional studies offer a more realistic view of vulnerability and drought resilience (Horion et al., 2012). Climate adaptation and the development of region-specific risk management plans depend on identifying regional differences in drought responses (Prabhakar & Shaw, 2008; Pulwarty & Sivakumar, 2014; Tanir et al., 2024).
Researchers and policymakers evaluate and track drought using several meteorological and satellite-derived indices, each designed to represent a different aspect of the phenomenon (AghaKouchak et al., 2015; Islam et al., 2022) and to support data analytics and communication (Yesilkoy et al., 2024b). A commonly used measure of deviations from average precipitation over short- to long-term intervals is the Standardized Precipitation Index (SPI) (Cancelliere et al., 2007; Naresh Kumar et al., 2009). The Standardized Precipitation-Evapotranspiration Index (SPEI), by contrast, incorporates temperature by providing a metric of climatic water balance, which is particularly relevant during warming trends (Tirivarombo et al., 2018).
While the Evaporative Demand Drought Index (EDDI) evaluates atmospheric demand for moisture and helps identify fast-onset or "flash droughts," the Palmer Drought Severity Index (PDSI) reflects long-term moisture balance and is often used in agricultural settings (Qing et al., 2022; Yu et al., 2019). The Crop Moisture Index (CMI) offers an immediate evaluation of root-zone soil moisture, especially during important phenological phases, while the Normalized Difference Vegetation Index (NDVI), obtained from satellite imagery, indicates vegetation greenness and serves as an indirect measure of drought stress (Kyratzis et al., 2017; X. Wu et al., 2024). With its relevance depending on crop type, geographic location, and period inside the growing season, each measure captures different temporal and physical aspects of drought.
Despite the availability of these indices, a primary challenge is determining which drought indicators are most relevant for specific crops and regions. The main crops grown in Iowa, Corn and Soybean, have different growing seasons and physiological responses to water stress (V. Mishra & Cherkauer, 2010). Corn is often most vulnerable to moisture deficits in early to mid-summer, especially during tasseling and silking, whereas Soybean is often more sensitive to late-season conditions, notably during pod filling (VerCauteren & Hygnstrom, 2011).
Previous studies show that whereas longer-term indices, such as SPEI-6 or PDSI, may correlate more closely with Soybean yield variability, short-term indices, like SPI-3 or EDDI, may more faithfully reflect the drought conditions affecting Corn (Arshad et al., 2023; SM et al., 2025; D. Wu et al., 2015). It is currently unknown how much these relationships vary spatially across Iowa. Furthermore, most of the present studies rely on correlation or regression analyses, assuming a constant connection throughout the study area, thereby limiting their ability to expose spatial variance in crop responses to drought (Maestrini & Basso, 2018; A. K. Mishra et al., 2015).
Advances in machine learning provide novel methods for addressing disaster monitoring (Li et al., 2023), forecasting (Krajewski et al., 2021), and quantifying drought risk and exposure (Prodhan et al., 2022). Clustering algorithms, along with other unsupervised learning methods, group data points based on natural similarities without requiring labeled outcomes (Greene et al., 2008). Clustering in drought-crop modeling can identify underlying geographical patterns of drought susceptibility by locating sites with similar sensitivity to multiple drought indicators (Murthy et al., 2015). These patterns can inform regional crop insurance adjustments, guide localized adaptation strategies, and enhance the effectiveness of drought monitoring systems. Despite its potential, the application of unsupervised learning to drought-yield relationships at the sub-state level remains underutilized (Colston et al., 2018), particularly for long-term, station-based, and cloud-based remote sensing datasets that include various meteorological variables (Seo et al., 2019).
This study aims to bridge this gap by employing unsupervised learning to develop a spatially explicit framework that identifies regional clusters of drought-crop sensitivity in Iowa. We use a large dataset spanning 1998 to 2022, comprising monthly drought index values and detrended agricultural output from over 100 meteorological stations throughout the state. We compute sensitivity profiles, that is, the correlations between each drought index and the yields of Corn and Soybean, for every station. The profiles are then used as input to a k-means clustering technique that groups the stations by the similarity of their drought-yield response patterns. The resulting clusters are characterized and examined to determine how different parts of Iowa vary in their susceptibility to different types and durations of drought.
In addition, this method offers a new perspective on regional differences in drought outcomes in the Midwest. Our approach recognizes the variety of Iowa's agricultural landscape and aims to identify emerging regional crop sensitivity patterns, avoiding the presumption of homogeneous responses. This work integrates meteorological, satellite, and machine learning techniques to provide a novel regional drought risk assessment method. The insights gained can enhance decision-making in both public and private sectors, particularly in developing more targeted crop insurance policies, adaptive planting methods, and precision early warning systems (Alabbad et al., 2024; Kadiyala et al., 2024). This initiative aims to encourage climate-resilient agriculture in a highly productive yet vulnerable area of the country.
2. Materials and Methods
This study focuses on Iowa, utilizing secondary atmospheric and geographical data to examine the impact of drought on soybean and corn yields. Geographic patterns and numerical data were investigated using GIS tools and statistical analysis. Thorough explanations for data selection, preparation techniques, and analytic approaches improve transparency and repeatability.
Study Area
Iowa, situated in the north-central United States, is a vital part of the American Corn Belt and one of the most agriculturally productive states in the country (Laingen, 2012; Islam & Demir, 2025). The state covers more than 145,000 square kilometers, with over 90% of its territory used for agriculture, particularly the production of Corn and Soybean (ITS, 2020). Iowa's agricultural output is vital to national and international food systems, making it a critical region for analyzing the effects of environmental stresses on crop productivity (Al-Kaisi et al., 2013).
Iowa experiences a humid continental climate with severe winters and warm, humid summers (Takle & Gutowski, 2020). Most of the state's annual precipitation, which ranges from 750 to 1,200 mm, falls between April and September, coinciding with the primary growing season for Corn and Soybean (Nayak et al., 2016). The temporal and spatial distribution of precipitation is highly variable, and most of the state relies on rainfall rather than irrigation. Thus, even minor changes in precipitation patterns or increased evapotranspiration due to heat waves can significantly reduce soil moisture availability, resulting in agricultural drought conditions (Miralles et al., 2019).
The state's terrain is primarily uniform, marked by gently rolling plains created by glacial activity. Despite its apparent uniformity, Iowa exhibits considerable geographical diversity in soil types, water retention capabilities, drainage properties, and specific climatic conditions, resulting in varying drought responses (Daigh et al., 2014). The western and southern regions are often more vulnerable to drought due to their coarser-textured soils and reduced water retention, whereas the northeastern regions often experience higher precipitation and have finer soils (Al-Kaisi et al., 2013). The intra-state variations underscore the need for geographically precise evaluations to pinpoint locations susceptible to flood, drought, and associated risks (Alabbad et al., 2023; Cikmaz et al., 2025). Iowa's varied and data-abundant topography is ideal for examining fine-scale, station-based crop-climate interactions using advanced analytical techniques.
Data and Methods
Data Sources
This study uses an extensive, station-specific dataset that includes yearly yield data for Corn and Soybean in Iowa from 1998 to 2022, as well as meteorological drought indices. Drought measurements were taken from 126 meteorological stations throughout the state, covering a broad spectrum of local environmental conditions. The collection includes meteorological and remote-sensing drought indicators, selected for their ability to represent several temporal and physical dimensions of drought stress.
This analysis employs the following drought indices: the Standardized Precipitation Index (SPI) and the Standardized Precipitation-Evapotranspiration Index (SPEI), both assessed at 1, 3, 6, and 12-month accumulation intervals; the Palmer Drought Severity Index (PDSI), which incorporates temperature and soil moisture balance over prolonged durations; the Evaporative Demand Drought Index (EDDI), which indicates atmospheric moisture demand and is particularly adept at identifying flash droughts; the Crop Moisture Index (CMI), which evaluates short-term root-zone soil moisture; and the Normalized Difference Vegetation Index (NDVI), a satellite-derived measure of vegetation vitality and overall crop health.
The US Department of Agriculture's National Agricultural Statistics Service (USDA-NASS) provided annual yield data for Corn and Soybean (USDA-NASS, 2024). The data were directly linked to their relevant stations or spatially interpolated where necessary. Integrating long-term, high-resolution meteorological data with yield records in a station-level sensitivity analysis exposes regional patterns in drought effects.
This study employed six drought indicators to capture various aspects of drought and its impact on agricultural output. Indices were selected based on their potential to capture both short-term and long-term drought conditions. Below is a description of each index, including its limitations and rationale for inclusion.
The Standardized Precipitation Index (SPI) measures precipitation anomalies over defined intervals (1, 3, 6, and 12 months) by employing a fitted probability distribution to normalize deviations from the local historical average. SPI values were derived from Iowa Mesonet station data and generally fluctuate between −3.0 and +3.0, with negative values signifying dry conditions and positive values denoting wet intervals.
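The normalization step can be illustrated with a minimal sketch. The study does not state which probability distribution was fitted; the gamma distribution used here is the conventional choice for SPI and is an assumption, as are the function name `spi_from_totals`, the synthetic precipitation totals, and the omission of any special handling for zero-precipitation periods.

```python
import numpy as np
from scipy import stats

def spi_from_totals(totals):
    """Map accumulated precipitation totals to SPI via a fitted gamma CDF.

    Assumes strictly positive totals and a gamma distribution (an assumption,
    not a detail confirmed by the study).
    """
    totals = np.asarray(totals, dtype=float)
    # Fit the gamma shape and scale with the location fixed at zero.
    shape, _, scale = stats.gamma.fit(totals, floc=0)
    cdf = stats.gamma.cdf(totals, shape, loc=0, scale=scale)
    # Clip so that the standard-normal quantile stays finite.
    cdf = np.clip(cdf, 1e-6, 1 - 1e-6)
    return stats.norm.ppf(cdf)

# Synthetic 3-month precipitation totals in mm (illustrative only).
rng = np.random.default_rng(4)
precip_3mo = rng.gamma(shape=2.0, scale=40.0, size=300)
spi3 = spi_from_totals(precip_3mo)
```

Because the transformation is monotonic, the driest period in the input receives the most negative SPI value, which matches the sign convention described above.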
The Standardized Precipitation-Evapotranspiration Index (SPEI) extends the SPI framework by including potential evapotranspiration (PET), capturing atmospheric demand and temperature-induced drought conditions. The SPEI was computed using the SPEI R package, based on the monthly water balance ($D = P - \mathrm{PET}$, that is, precipitation minus potential evapotranspiration) at each station. The resultant values, obtained by log-logistic fitting, span from approximately −2.0 to +2.0, providing insight into both short-term and long-term droughts.
The Palmer Drought Severity Index (PDSI) assesses long-term moisture balance by analyzing precipitation, temperature, and a soil water accounting framework. Using a recursive z-index formulation, it tracks extended moisture surpluses or deficits. PDSI values often span from −4.0 (severe drought) to +4.0 (very wet), making the index effective for detecting prolonged agricultural droughts, although its response to short-term anomalies is comparatively slow.
The Evaporative Demand Drought Index (EDDI) quantifies atmospheric evaporative demand, providing early indicators of drought onset. It is based on anomalies in atmospheric evaporative demand relative to historical climatology; it resembles SPI in structure but emphasizes evaporative potential instead of precipitation. As obtained from NOAA data, EDDI values rise with atmospheric dryness and span approximately from −2.0 (wet conditions) to +2.0 (elevated evaporative stress).
The Crop Moisture Index (CMI) provides short-term assessments of soil moisture levels that affect crop growth. Based on the PDSI framework, it responds to weekly moisture variations and was developed specifically for agricultural monitoring. CMI values range from −3.0 to +3.0, where negative values indicate moisture stress and positive ones denote excessively wet conditions.
The Normalized Difference Vegetation Index (NDVI) is a remote sensing metric obtained from MODIS images, indicating vegetation greenness and photosynthetic activity. NDVI was computed using the conventional red and near-infrared (NIR) reflectance formula, $\mathrm{NDVI} = (\mathrm{NIR} - \mathrm{Red}) / (\mathrm{NIR} + \mathrm{Red})$, and underwent preprocessing through mosaicking, projection, resampling, and masking in ArcGIS. NDVI values span from −1.0 to +1.0, with higher values signifying robust vegetation. Although NDVI was derived for the study, it was omitted from the final correlation analysis because of cloud-cover interference and irregular temporal resolution. Collectively, these indices provide a comprehensive and varied perspective on drought conditions in Iowa, incorporating both meteorological factors and vegetation responses pertinent to agricultural output variability.
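As a brief illustration of the NDVI ratio, the following sketch applies the standard formula to arrays of reflectance values. The function name `ndvi` and the sample reflectances are hypothetical, and the study's own processing was carried out in ArcGIS rather than with this code.

```python
import numpy as np

def ndvi(nir, red):
    """Compute NDVI = (NIR - Red) / (NIR + Red), returning NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical reflectances: dense vegetation, bare surface, and a no-data pixel.
nir = np.array([0.50, 0.30, 0.0])
red = np.array([0.10, 0.30, 0.0])
values = ndvi(nir, red)
```

Masking zero-denominator pixels as NaN keeps no-data cells from being mistaken for valid NDVI values in later aggregation.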
Table 1
Name of the indices and data sources.
| Drought Index | Data | Data Source | Reference |
|---|---|---|---|
| SPI at different time scales (SPI-1, SPI-3, SPI-6, SPI-12) | Monthly precipitation data | Iowa Mesonet website (https://mesonet.agron.iastate.edu/rainfall/) and climate data centers (NOAA) | (McKee et al., 1993) |
| SPEI at different time scales (SPEI-1, SPEI-3, SPEI-6, SPEI-12) | Monthly precipitation and temperature data | Iowa Mesonet website (https://mesonet.agron.iastate.edu/) and climate data centers (NOAA) | (Vicente-Serrano et al., 2010) |
| PDSI | Monthly precipitation and temperature data, soil water holding capacity | Iowa Mesonet website (https://mesonet.agron.iastate.edu/) and climate data centers (NOAA) | (Palmer, 1965) |
| EDDI | Temperature, relative humidity, wind speed, and solar radiation data | Iowa Mesonet website (https://mesonet.agron.iastate.edu/) and climate data centers (NOAA) | (Hobbins et al., 2016) |
| CMI | Weekly precipitation and temperature data | Iowa Mesonet website (https://mesonet.agron.iastate.edu/) and climate data centers (NOAA) | (Palmer, 1968) |
| NDVI | Satellite imagery data (visible and near-infrared light) | Satellite data providers (NASA's MODIS) | (Rouse Jr et al., 1973) |
Data Preprocessing and Detrending
All crop yield data were linearly detrended prior to analysis to remove the influence of technological developments, improved management techniques, and genetic advances that have contributed to general yield increases across the study period. Detrending was performed independently for Corn and Soybean using ordinary least squares regression, and all subsequent analyses used the residuals as the dependent variable. This ensures that the yield changes analyzed reflect climatic and environmental variability, especially drought conditions, rather than structural agricultural advances.
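The detrending step can be sketched as follows, assuming a simple linear trend of yield on year fitted by ordinary least squares. The function name `detrend_yield` and the synthetic yield series are hypothetical and serve only to illustrate the residual calculation.

```python
import numpy as np

def detrend_yield(years, yields):
    """Return residuals from an OLS linear fit of yield on year."""
    years = np.asarray(years, dtype=float)
    yields = np.asarray(yields, dtype=float)
    # Center the years to keep the least-squares fit well conditioned.
    x = years - years.mean()
    slope, intercept = np.polyfit(x, yields, deg=1)
    return yields - (slope * x + intercept)

# Synthetic series: a steady upward trend with one hypothetical drought-year loss.
years = np.arange(1998, 2023)
yields = 140.0 + 1.8 * (years - 1998)
yields[14] -= 40.0  # index 14 corresponds to 2012
residuals = detrend_yield(years, yields)
```

In this example the residual series sums to zero, and the year with the imposed loss stands out as the most negative residual, which is the signal the sensitivity analysis relates to the drought indices.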
All drought index values were normalized to z-scores to enable comparisons between scales and indices. The study excluded stations with significant missing data or inconsistent reporting; remaining missing values were imputed using a nearest-neighbor method. Monthly drought readings were compiled seasonally to match the usual phenological cycle for Corn and Soybeans in Iowa—during planting, mid-season, and reproductive phases.
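A minimal sketch of the standardization and seasonal aggregation is given below. The study does not list the months assigned to each phenological phase, so the windows in `SEASON_MONTHS` are hypothetical placeholders, as are the function names and the synthetic data; standardizing before aggregating is also an assumption about the order of operations.

```python
import numpy as np

# Hypothetical month windows; the exact months used in the study are not stated.
SEASON_MONTHS = {"planting": (4, 5), "mid_season": (6, 7), "reproductive": (8, 9)}

def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def seasonal_means(monthly, season_months):
    """Average a (years x 12) array of monthly values within each seasonal window."""
    monthly = np.asarray(monthly, dtype=float)
    return {
        name: monthly[:, [m - 1 for m in window]].mean(axis=1)
        for name, window in season_months.items()
    }

# Synthetic index values for one station: 25 years x 12 months.
rng = np.random.default_rng(0)
monthly_index = rng.normal(loc=0.3, scale=1.5, size=(25, 12))
standardized_monthly = zscore(monthly_index)
seasonal = seasonal_means(standardized_monthly, SEASON_MONTHS)
```

Each entry of `seasonal` is a yearly series for one phase, which can then be paired with the detrended yield of the same year.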
Sensitivity Analysis
To measure the extent of crop production variation due to local drought circumstances, we performed a station-level sensitivity analysis correlating several drought indicators with detrended yield estimates for Corn and Soybean. This approach entails calculating the statistical correlation between drought severity indicators and crop performance within the same growing season, producing a quantitative profile of drought sensitivity for each station.
Let $Y^{C}_{s}$ and $Y^{S}_{s}$ represent the detrended yield residuals for Corn and Soybean, respectively, at station $s$ for a given year. These residuals, derived from the detrending process described in Section 2.2.2, reflect interannual yield variability not explained by long-term technological trends and are presumed to be primarily influenced by environmental factors, including drought. Each drought index $D^{(t)}_{i,s}$ corresponds to the standardized value of the $i$-th drought indicator at station $s$, calculated over time scale $t$, such as 1-month, 3-month, 6-month, or 12-month aggregation. For instance, $D^{(3)}_{\mathrm{SPI},s}$ denotes the 3-month Standardized Precipitation Index value at station $s$ during a specific year.
We applied the Spearman rank correlation coefficient (ρ) (Spearman, 1961), a non-parametric statistic that is robust to non-linear relationships and skewed distributions, to quantify the strength and direction of monotonic associations between drought indices and crop yields. The Spearman correlation between the drought index $D^{(t)}_{i,s}$ and the yield residual $Y^{c}_{s}$ at station $s$ is defined as follows:
$$\rho^{c}_{i,s} = \mathrm{Spearman}\left(D^{(t)}_{i,s},\; Y^{c}_{s}\right), \qquad c \in \{C, S\}$$
where $\rho^{C}_{i,s}$ and $\rho^{S}_{i,s}$ denote the relationship between the $i$-th drought index and the residuals of Corn and Soybean yields, respectively, at station $s$, computed across the years of record. The correlation values span from −1 to +1. Negative values signify an inverse relationship (e.g., for an index that increases with drought severity, such as EDDI, higher values accompany lower yields), while positive values reflect a direct relationship (e.g., for moisture-based indices such as SPI, adequate moisture accompanies better crop performance). Each meteorological station is then represented by a multidimensional drought sensitivity vector:
$$\mathbf{S}^{C}_{s} = \left[\rho^{C}_{1,s}, \rho^{C}_{2,s}, \dots, \rho^{C}_{m,s}\right], \qquad \mathbf{S}^{S}_{s} = \left[\rho^{S}_{1,s}, \rho^{S}_{2,s}, \dots, \rho^{S}_{m,s}\right]$$
Here, $m$ represents the total number of drought indices used in this study (e.g., SPI-1, SPI-3, SPEI-6, EDDI, CMI, NDVI, etc.), and $\mathbf{S}^{C}_{s}$ and $\mathbf{S}^{S}_{s}$ are the drought sensitivity profiles for Corn and Soybean yield at station $s$, respectively.
The sensitivity vectors fulfill two main functions: (1) they encapsulate the correlation between drought stress and crop yield performance at each geographical point, and (2) they supply the feature inputs for the clustering analysis detailed in subsequent sections. This strategy effectively maintains the full range of responses from diverse stations to specific drought dimensions and temporal scales, enabling the identification of latent spatial patterns through unsupervised learning.
All correlation calculations were executed using Python's scipy.stats.spearmanr() function, applied separately at each station to the annual series of every drought index and the corresponding detrended yield. Only stations with at least 10 years of complete, uninterrupted data for drought indices and detrended yield were retained to ensure statistical reliability.
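The station-level calculation can be sketched as below, using `scipy.stats.spearmanr` as named in the text. The helper `sensitivity_vector`, the index names in the example, and the synthetic series are hypothetical; the 10-year threshold is applied here to the count of non-missing year pairs, which is a simplification of the completeness rule described above.

```python
import numpy as np
from scipy.stats import spearmanr

def sensitivity_vector(index_series, yield_residuals, min_years=10):
    """Spearman rho between each drought index and detrended yield at one station.

    index_series: dict mapping index name -> 1-D array of yearly values.
    yield_residuals: 1-D array of detrended yields for the same years.
    Indices with fewer than min_years valid pairs are returned as NaN.
    """
    y = np.asarray(yield_residuals, dtype=float)
    profile = {}
    for name, values in index_series.items():
        x = np.asarray(values, dtype=float)
        valid = ~np.isnan(x) & ~np.isnan(y)
        if valid.sum() < min_years:
            profile[name] = np.nan
            continue
        rho, _ = spearmanr(x[valid], y[valid])
        profile[name] = float(rho)
    return profile

# Synthetic station with 25 years of data (illustrative only).
rng = np.random.default_rng(1)
spi3 = rng.normal(size=25)
station = {
    "SPI-3": spi3,
    "EDDI": -(spi3 ** 3),                        # decreases monotonically with SPI-3
    "short": np.r_[spi3[:5], [np.nan] * 20],     # only 5 valid years
}
yield_resid = np.exp(spi3)                       # monotonic but non-linear response
profile = sensitivity_vector(station, yield_resid)
```

In this constructed example the yield responds monotonically but non-linearly to SPI-3, so the rank correlation is exactly +1 for SPI-3 and −1 for the inverted series, while the series with too few valid years yields NaN.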
Dimensionality Reduction
The sensitivity analysis outlined in the preceding section yields a high-dimensional feature space, where each meteorological station is defined by a vector of correlation coefficients that denote its sensitivity to diverse drought indicators in terms of agricultural production. This multidimensional framework provides an extensive perspective on drought-yield correlations, although it also presents potential issues of multicollinearity, redundancy, and noise, particularly when the feature counts significantly exceed the number of observations. Furthermore, several drought indices may demonstrate significant intercorrelations, particularly those functioning across analogous temporal scales or addressing overlapping meteorological events (e.g., SPI-6 and SPEI-6).
To mitigate these issues and enhance clustering efficiency, we used Principal Component Analysis (PCA) as a method for dimensionality reduction (Jolliffe, 2002). PCA is a widely used statistical technique that transforms the original feature space into a new set of orthogonal axes, known as principal components, which are linear combinations of the original variables (Botero-Acosta et al., 2024). The components are ordered by the amount of variance they explain, enabling the representation of high-dimensional sensitivity profiles in a lower-dimensional subspace with little information loss.
Let $X \in \mathbb{R}^{n \times p}$ denote the matrix of sensitivity vectors, where $n$ is the number of stations and $p$ is the number of drought index–crop correlations computed (e.g., 12–16 features including Corn and Soybean sensitivities). Each row $\mathbf{x}_{s}$ corresponds to a single station's drought sensitivity profile:
$$\mathbf{x}_{s} = \left[x_{s,1}, x_{s,2}, \dots, x_{s,p}\right]$$
PCA seeks a transformation matrix $W \in \mathbb{R}^{p \times k}$ such that:
$$Z = XW$$
Here, $Z \in \mathbb{R}^{n \times k}$ denotes the transformed dataset in the reduced feature space, where $k$ is significantly less than $p$. The matrix $W$ is obtained by performing the eigenvalue decomposition of the covariance matrix of $X$, with the principal components (eigenvectors) arranged in descending order of their corresponding eigenvalues (explained variance). The resultant components are uncorrelated, making them appropriate for clustering and multivariate analysis.
To determine the optimal number of principal components $k$ to retain, we employed the Kaiser criterion, which selects components with eigenvalues greater than one, together with scree plot visualization, which reveals the "elbow point" where the incremental increase in explained variance levels off. Typically, the first two to four principal components accounted for more than 85% of the variance in the sensitivity profiles, allowing a concise and interpretable depiction of the fundamental patterns. The PCA-transformed features served as inputs for the clustering phase, so that clustering was less affected by redundant or noisy features while still representing the geographical structure of drought sensitivity in a robust, reduced-dimensionality format.
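The decomposition and the Kaiser rule can be sketched with NumPy as follows. The features are standardized first so that the eigenvalue-greater-than-one threshold is meaningful; this standardization, the function name `pca_kaiser`, and the synthetic sensitivity matrix are assumptions made for illustration.

```python
import numpy as np

def pca_kaiser(X):
    """PCA via eigendecomposition; keep components with eigenvalue > 1.

    Features are standardized first (an assumption), so the covariance matrix
    of the standardized data equals the correlation matrix of X.
    """
    X = np.asarray(X, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    cov = np.cov(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = max(1, int((eigvals > 1.0).sum()))        # Kaiser criterion
    W = eigvecs[:, :k]                            # p x k transformation matrix
    Z = Xs @ W                                    # n x k component scores
    explained = eigvals / eigvals.sum()           # explained-variance ratios
    return Z, W, explained

# Synthetic matrix: 120 stations x 8 correlated features driven by 2 latent factors.
rng = np.random.default_rng(2)
latent = rng.normal(size=(120, 2))
mixing = rng.normal(size=(2, 8))
X = latent @ mixing + 0.1 * rng.normal(size=(120, 8))
Z, W, explained = pca_kaiser(X)
```

The cumulative sum of `explained` gives the values that would be plotted on a scree plot, and the columns of `Z` are the uncorrelated scores passed on to clustering.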
Clustering Analysis
We employed k-means clustering, an unsupervised learning technique (Li & Demir, 2022), to analyze the principal component scores derived from the station-level sensitivity profiles and identify geographical patterns in drought-yield sensitivity across Iowa. By using internal similarity to identify latent groupings within the dataset, clustering provides a data-driven method for classifying areas with similar drought impacts on agricultural production. The approach serves the study's primary objective: identifying agro-climatic zones based on functional similarities in environmental response rather than predetermined administrative boundaries.
K-means is a partition clustering approach that allocates $n$ observations into $k$ distinct clusters, $C_{1}, C_{2}, \dots, C_{k}$, with each observation assigned to the cluster corresponding to the closest centroid. The approach seeks to minimize the overall within-cluster sum of squares (WCSS), which is defined as:
$$\mathrm{WCSS} = \sum_{j=1}^{k} \sum_{\mathbf{x}_{i} \in C_{j}} \left\lVert \mathbf{x}_{i} - \boldsymbol{\mu}_{j} \right\rVert^{2}$$
where $\mathbf{x}_{i}$ represents the feature vector of the $i$-th station in PCA space, and $\boldsymbol{\mu}_{j}$ is the centroid of cluster $C_{j}$. The process is iterative, commencing with the random initialization of cluster centroids or with the k-means++ method, and thereafter adjusting assignments and centroids until convergence is achieved. Before executing the clustering process, all principal components were normalized using z-scores to guarantee uniform weight across dimensions. The ideal number of clusters $k$ was established using a combination of the Elbow Method and the Silhouette Score. The Elbow Method entails plotting the WCSS over various $k$ values and identifying the point beyond which additional clusters produce only marginal improvements. The Silhouette Score $s(i)$ offers a quantitative assessment of clustering cohesiveness and separation, computed for each data point as follows:
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$
where $a(i)$ is the mean distance between point $i$ and all other points within the same cluster, whereas $b(i)$ represents the minimum average distance between point $i$ and the points in the closest adjacent cluster. Scores approaching 1 signify well-defined clusters, whereas scores around 0 imply uncertainty in cluster allocation. Upon ascertaining the ideal $k$, each station was allocated to a cluster according to its drought sensitivity profile. The assignments were used to examine geographical distribution and evaluate regional drought susceptibility in the Results section. All clustering operations were performed using Python's scikit-learn library, with cluster stability assessed over many iterations employing varying initialization seeds.
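The selection of the number of clusters can be sketched with scikit-learn as follows. The helper `choose_k`, the candidate range of 2 to 6 clusters, and the synthetic component scores are assumptions for illustration; the study's actual settings are not reported beyond the use of the Elbow Method and the Silhouette Score.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def choose_k(Z, k_values=range(2, 7), seed=0):
    """Fit k-means for each candidate k; return WCSS, mean silhouette, and best k."""
    wcss, sil = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=seed)
        labels = model.fit_predict(Z)
        wcss[k] = model.inertia_                 # within-cluster sum of squares
        sil[k] = silhouette_score(Z, labels)     # mean s(i) over all points
    best_k = max(sil, key=sil.get)
    return wcss, sil, best_k

# Synthetic PCA scores: three well-separated groups of 40 stations each.
rng = np.random.default_rng(3)
centers = np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, 6.0]])
Z = np.vstack([c + 0.5 * rng.normal(size=(40, 2)) for c in centers])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)         # z-score each component

wcss, sil, best_k = choose_k(Z)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(Z)
```

Plotting `wcss` against the candidate values gives the elbow curve, while `sil` provides the complementary separation measure; repeating the fit with different `seed` values is one simple way to check cluster stability.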
Spatial Visualization and Interpretation
After identifying drought-sensitivity clusters using k-means clustering, the resultant assignments were incorporated into a Geographic Information System (GIS) framework for spatial analysis and mapping. Each meteorological station in the dataset was georeferenced utilizing its latitude and longitude coordinates, and the designated cluster membership was spatially integrated with these locations. Geospatial mapping techniques were utilized as a visualization instrument and methodological enhancement of the investigation, facilitating spatial comparison with the underlying environmental conditions throughout Iowa.
Static and interactive maps were created as components of the spatial interpretation procedure. Static maps were produced using Python's GeoPandas and Matplotlib libraries to superimpose clustered stations over the official shapefile of Iowa's county borders. Each cluster was represented by a unique color code, facilitating the detection of geographic coherence or dispersion among functionally analogous drought-response zones. Furthermore, interactive maps were developed using the Folium library, facilitating an in-depth examination of station-level characteristics, including pop-up information and drought sensitivity profiles. These online maps are dynamic instruments for conveying results to technical and non-technical audiences.
To enhance geographic analysis, supplementary overlays were conducted in ArcGIS Pro, utilizing GIS techniques to assess the physical and environmental context of each cluster. This encompassed the integration of layers, including USDA soil texture classes, average annual precipitation isohyets, topography variations, and land cover maps. These geographical overlays facilitated visual analysis and quantitative zonal assessments of drought-affected areas. The objective was to evaluate whether clusters corresponded with environmental limits or displayed regional gradients in crop sensitivity to drought indices.
All maps created throughout this approach utilized a uniform coordinate reference system (NAD83 Iowa State Plane North or UTM Zone 15N, depending on the tool), ensuring spatial coherence among datasets. Section 3.4 presents the outcomes of spatial clustering and GIS-based analysis, including the geographic distribution of clusters, environmental attributes, and potential ramifications for localized drought monitoring and response strategies.
3. Results
Drought-Yield Sensitivity Profiles at the Station Level
Spearman correlations between detrended Corn and Soybean yields and seven selected drought indices (SPI-1, SPI-3, SPI-6, SPEI-3, PDSI, EDDI, and CMI) were computed to assess the relationship between drought conditions and variability in agricultural yields throughout Iowa. These indicators characterize various drought processes, including short-term precipitation deficiencies (SPI-1, SPI-3), cumulative soil moisture stress (SPI-6, SPEI-3, PDSI), atmospheric evaporative demand (EDDI), and root-zone conditions (CMI). Although satellite-derived vegetation indicators, such as NDVI, can provide useful insights into plant health and drought response, NDVI was excluded from this study due to recognized limitations in temporal consistency and data quality resulting from cloud cover. Furthermore, NDVI primarily reflects the impacts of drought on vegetation rather than acting as a direct cause of production variability, so it is better suited to predictive or remote sensing-oriented research than to drought impact analysis.
Correlation patterns indicated substantial differences between the responses of Corn and Soybean. Corn yields showed their strongest negative correlation with EDDI (ρ = −0.458), highlighting the crop's susceptibility to atmospheric aridity. Soybean yields exhibited robust and persistent relationships across various indices, with significant positive correlations with PDSI (ρ = 0.480) and SPI-1 (ρ = 0.418), as well as substantial negative correlations with EDDI (ρ = −0.554) and CMI (ρ = −0.450). These findings indicate that Soybean yields are more sensitive than Corn yields to atmospheric water demand and prolonged moisture conditions. Overall, Corn and Soybean exhibit distinct drought sensitivities, with Corn responding mainly to short-term atmospheric stress, whereas Soybean also reacts to cumulative hydrological and soil moisture deficiencies. This highlights the need for crop-specific and temporally sensitive drought monitoring tools to enhance Iowa's agricultural resilience and management approaches.
Principal Component Analysis (PCA) of Drought-Yield Sensitivities
Principal Component Analysis (PCA) was performed independently on the Corn and Soybean datasets to reduce the dimensionality of the drought indices and better understand their combined impact on crop production. The first two principal components (PC1 and PC2) accounted for nearly 75% of the overall variation for both crops, as indicated by the Scree plot (Fig. 5), which supports the use of a two-dimensional structure in further investigations. The PCA results validate the preceding correlation results by emphasizing that Soybean is more vulnerable to continuous hydrological drought, whereas Corn is more sensitive to acute atmospheric dryness. These findings support the necessity of crop-specific drought monitoring and management techniques.
Heatmaps of the PCA loadings (Fig. 6) show different drought-yield sensitivity profiles for Soybean and Corn. For Corn, strong positive loadings of SPI-1, SPI-3, and EDDI on PC1 indicate that short-term precipitation shortfalls and atmospheric demand are the leading drivers of yield fluctuations. For Soybean, the loading patterns on PC1 and PC2 showed stronger associations with SPI-6, PDSI, and CMI, emphasizing the importance of cumulative soil moisture availability over extended periods.
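A PCA of a station-by-index correlation matrix can be sketched with a singular value decomposition. The example below is a hedged illustration only: the number of stations (40), the column standardization, and the synthetic mixing matrix used to generate the data are assumptions, and `pca_svd` is a hypothetical helper rather than the authors' implementation.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """PCA via SVD on column-standardized data.

    Returns station scores, index loadings, and the explained-variance
    ratio of every component (the quantity shown in a scree plot).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained_ratio = S**2 / np.sum(S**2)
    loadings = Vt[:n_components].T   # shape: (n_indices, n_components)
    scores = Z @ loadings            # shape: (n_stations, n_components)
    return scores, loadings, explained_ratio

# Synthetic drought-yield correlations driven by two latent factors
# (a "short-term" and a "long-term" sensitivity); values are illustrative.
index_names = ["SPI-1", "SPI-3", "SPI-6", "SPEI-3", "PDSI", "EDDI", "CMI"]
rng = np.random.default_rng(1)
n_stations = 40
latent = rng.normal(size=(n_stations, 2))
mix = np.array([[1.0, 0.0], [0.8, 0.3], [0.2, 0.9], [0.4, 0.7],
                [0.1, 1.0], [-0.9, -0.1], [-0.2, -0.8]])
X = latent @ mix.T + 0.3 * rng.normal(size=(n_stations, len(index_names)))

scores, loadings, explained_ratio = pca_svd(X)
print("Explained variance ratio:", np.round(explained_ratio, 3))
for name, row in zip(index_names, loadings):
    print(f"{name:7s} PC1={row[0]: .2f} PC2={row[1]: .2f}")
```

The printed loadings correspond to the kind of values a loading heatmap displays, and the first two entries of `explained_ratio` are what a two-component cut-off would be judged on.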
Identification of Drought Sensitivity Clusters
K-means clustering was applied to the scores on the first two principal components (PC1 and PC2) obtained from the drought-yield correlation matrices for Corn and Soybean in order to group stations with similar drought-yield sensitivity profiles. Three clusters were chosen for each crop based on the elbow method and silhouette score analysis (not shown), reflecting different drought response regimes across Iowa. The cluster analysis revealed distinct variations in Corn and Soybean sensitivity. For Corn, Cluster 1 stations showed high sensitivity to short-term drought indicators, such as SPI-1 and EDDI, indicating vulnerability to acute atmospheric dryness. Cluster 2 captured areas moderately influenced by cumulative drought stress (SPI-6, PDSI), and Cluster 3 comprised highly drought-resilient stations with mixed effects from several drought indices.
For Soybean, Cluster 1 included the sites most susceptible to long-term hydrological dryness, based on strong SPI-6, PDSI, and CMI loadings. Cluster 2 showed substantial sensitivity across both short- and long-term drought indices, while Cluster 3 grouped stations that generally exhibited weak drought-yield correlations. Five separate drought sensitivity clusters (Clusters 0–4) were identified across Iowa through k-means clustering analysis. Table 2 summarizes the distinct combinations of drought-yield sensitivity properties for Corn and Soybean, each of which was clustered separately. These clusters provide a basis for the geographical analysis of drought consequences in the following sections, reflecting local variations in crop responses to short-term and long-term drought stress.
Table 2
Mean sensitivity profiles (PC1 and PC2 loadings) by cluster for Corn and Soybean.
| Cluster | PC1 Mean (Corn) | PC2 Mean (Corn) | PC1 Mean (Soybean) | PC2 Mean (Soybean) |
|---|---|---|---|---|
| Cluster 1 | -1.23 | 0.67 | -0.87 | 1.02 |
| Cluster 2 | 0.45 | -0.98 | 0.23 | -0.51 |
| Cluster 3 | 1.12 | 0.31 | 0.94 | 0.17 |
These clusters highlight the spatial variability in drought mechanisms affecting agricultural productivity in Iowa and underscore the need for region-specific drought mitigation strategies tailored separately for Corn and Soybean systems.
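The choice of the number of clusters from the elbow and silhouette criteria can be illustrated with a small, self-contained sketch. Everything here is an assumption made for the example: the synthetic points are drawn around the Corn cluster means of Table 2 purely for illustration, the spread (0.1), the 20 stations per group, and the plain Lloyd's-algorithm `kmeans` and `silhouette_score` helpers are hypothetical and are not the study's implementation.

```python
import numpy as np

def kmeans(points, k, n_init=10, n_iter=100, seed=0):
    """Lloyd's algorithm with several random restarts; keeps the best run."""
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(n_init):
        centers = points[rng.choice(len(points), size=k, replace=False)]
        for _ in range(n_iter):
            d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            new_centers = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        inertia = float(np.sum(d[np.arange(len(points)), labels] ** 2))
        if inertia < best[0]:
            best = (inertia, labels, centers)
    return best  # (inertia, labels, centers)

def silhouette_score(points, labels):
    """Mean silhouette width over all points (singletons score 0)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    widths = []
    for i in range(len(points)):
        same = labels == labels[i]
        if same.sum() == 1:
            widths.append(0.0)
            continue
        a = d[i, same].sum() / (same.sum() - 1)  # mean distance within own cluster
        b = min(d[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        widths.append((b - a) / max(a, b))
    return float(np.mean(widths))

# Synthetic (PC1, PC2) scores: 20 stations around each illustrative centre.
rng = np.random.default_rng(2)
centres = np.array([[-1.23, 0.67], [0.45, -0.98], [1.12, 0.31]])
points = np.vstack([c + 0.1 * rng.normal(size=(20, 2)) for c in centres])

inertias, silhouettes = {}, {}
for k in range(2, 6):
    inertia, labels, _ = kmeans(points, k)
    inertias[k] = inertia                 # inspect for an "elbow"
    silhouettes[k] = silhouette_score(points, labels)
    if k == 3:
        labels3 = labels

best_k = max(silhouettes, key=silhouettes.get)
print("Inertia by k:", {k: round(v, 2) for k, v in inertias.items()})
print("Silhouette by k:", {k: round(v, 3) for k, v in silhouettes.items()})
print("k with highest silhouette:", best_k)
```

On real station scores the groups are rarely this well separated, so the elbow and silhouette curves would typically be read together rather than relying on a single maximum.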
Spatial Distribution of Drought Sensitivity Clusters
We visualized the station-level cluster assignments obtained from the k-means clustering analysis to show how the drought sensitivity patterns are distributed spatially across Iowa. The stations were arranged into five drought sensitivity groups (Cluster 0 to Cluster 4) based on the k-means clustering and its evaluation criteria. Each cluster reflects differences in how local drought conditions affect Corn and Soybean output and therefore represents a distinct drought-yield sensitivity profile. Cluster membership indicates the primary mix of drought processes, such as short-term atmospheric dryness, persistent soil moisture stress, or root-zone conditions, that most influences yield variability at each site. Spatial mapping of the clusters thus provides important insight into regional drought vulnerabilities and supports the development of targeted adaptation and mitigation strategies.
The spatial distribution reveals distinct regional patterns in drought sensitivity across the state. Stations in Northwest and west-central Iowa mostly belong to Cluster 0 and exhibit little sensitivity to short-term drought indices, such as SPI-1 and EDDI. Cluster 1, predominantly represented by southeast and east-central Iowa stations, indicates greater vulnerability to cumulative soil moisture deficits, especially as captured by SPI-6, PDSI, and CMI. Cluster 2 stations, although dispersed, typically occupy transitional zones, implying a heterogeneous sensitivity regime shaped by both short-term atmospheric stress and longer-term moisture anomalies. These patterns are consistent with established agro-climatic gradients across Iowa, where precipitation, soil type, and cropping practices all influence the effects of drought.
Spatially explicit mapping of drought sensitivity enables the identification of areas where early drought monitoring or crop insurance adjustments might be prioritized, depending on the dominant yield drivers. This geographical approach offers a crop-specific, outcome-oriented perspective on agricultural vulnerability that complements conventional drought monitoring. As shown in Fig. 7, the spatial interpolation of drought sensitivity clusters in Iowa highlights different regional patterns in agricultural production responses to drought stress. Inverse Distance Weighted (IDW) interpolation was used to map the clusters produced by k-means clustering of the principal component scores, which were in turn derived from Spearman correlations between drought indices and detrended yields.
The resulting surface delineates five primary zones (Cluster 0 through Cluster 4), each with a distinct sensitivity profile to the drought indicators. Cluster 0, found primarily in the Northwest and portions of the West Central region, is where yield declines, especially for Corn, are linked with short-term drought indicators such as SPI-1 and EDDI. This suggests that rapid-onset atmospheric aridity is more likely to limit yields in these areas. Cluster 1, found primarily in Central and South-Central Iowa, shows an intermediate sensitivity profile for both crops, possibly reflecting a combination of climatic and edaphic conditions. Cluster 2, found primarily in the East Central and Southeastern regions, exhibits strong correlations with indices of cumulative moisture stress, including PDSI and CMI, which affect Soybean yields.
Cluster 3, observed in North Central and Northeast Iowa, shows relatively weak links between drought and yield, suggesting either greater resilience or the influence of non-climatic yield determinants. Lastly, Cluster 4, located in the Southwest, exhibits vulnerability to prolonged drought events, as indicated by its sensitivity to longer-timescale indices such as SPI-6 and SPEI-3. These geographical clusters align with identified agro-climatic zones, underscoring the need for region-specific drought risk assessments and guiding spatially targeted adaptation and mitigation measures.
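The IDW step behind an interpolated surface such as Fig. 7 can be sketched as below. This is an illustrative sketch under stated assumptions: the text does not report the distance exponent or search neighbourhood, so `power=2.0` and the use of all stations are assumptions, and `idw_interpolate`, the four toy stations, and the grid are hypothetical.

```python
import numpy as np

def idw_interpolate(station_xy, station_values, query_xy, power=2.0):
    """Inverse Distance Weighted estimate at each query location.

    Each estimate is a weighted mean of all station values with weights
    1 / distance**power; a query that coincides with a station returns
    that station's value exactly.
    """
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    d = np.linalg.norm(query_xy[:, None, :] - station_xy[None, :, :], axis=2)
    weights = 1.0 / np.maximum(d, 1e-12) ** power
    estimates = (weights @ station_values) / weights.sum(axis=1)
    on_station = d.min(axis=1) < 1e-12
    estimates[on_station] = station_values[d[on_station].argmin(axis=1)]
    return estimates

# Toy example: four stations on the corners of a unit square.
station_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
station_values = np.array([0.0, 1.0, 1.0, 2.0])  # e.g. a sensitivity score

# Regular grid of query points covering the square.
gx, gy = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])
surface = idw_interpolate(station_xy, station_values, grid_xy).reshape(gx.shape)
centre_value = idw_interpolate(station_xy, station_values, [[0.5, 0.5]])[0]
print(np.round(surface, 2))
```

Because IDW produces a weighted average, interpolating integer cluster labels yields intermediate values between stations; how such values are converted back into discrete zones is a mapping choice that the sketch does not attempt to reproduce.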
Figure 9 illustrates Iowa Corn and Soybean drought sensitivity clusters identified with Local Indicators of Spatial Association (LISA) statistics. The analysis used the first principal component (PC1) scores derived from the PCA of the drought-yield correlations. For each crop, the PC1 score combines the drought indicators into a single measure of the primary pattern of drought response across stations. LISA was then used to identify geographic clusters and outliers with statistically significant differences in drought sensitivity across Iowa's agricultural landscape.
The Corn LISA map shows significant High-High (HH) clusters in South Central and Southeast Iowa. These clusters form spatially contiguous zones in which highly drought-sensitive stations are surrounded by other highly sensitive stations. Several Low-Low (LL) clusters in the East Central and North Central regions indicate regionally consistent low drought sensitivity for Corn. Several High-Low (HL) outliers in Northeast Iowa, where stations with high drought sensitivity are located near stations with low sensitivity, indicate localized extremes that may result from soil variability, management, or microclimatic conditions.
The Soybean LISA map is more scattered but still shows several significant features. High-high clusters again appear in the South-Central area, partially coinciding with those for Corn and indicating shared drought sensitivity. In contrast, Soybean shows more pronounced low-low clustering in the North Central and Northwest regions, suggesting greater drought tolerance there. High-low and low-high outliers in several subregions imply that Soybean susceptibility to drought varies more regionally than that of Corn, presumably in response to root depth, planting density, or transient moisture stress.
Table 3 summarizes the LISA drought sensitivity clusters by agricultural region. The table highlights the geographical coherence of drought-prone and drought-resistant areas in Iowa, as well as crop-specific vulnerabilities, and illustrates the value of spatially explicit approaches to agricultural risk assessment. Coincident high-high clusters for Corn and Soybean in South Central Iowa indicate that shared agronomic or environmental factors are affecting multiple cropping systems. Where low-low clusters differ between crops, so that only one crop is resilient in a given area, crop-specific drought mitigation strategies are needed. These results can inform policymakers, agronomists, and water resource planners seeking to enhance agricultural climate resilience.
Table 3
Summary of the clustered regions identified by the LISA method.
| Crop | Cluster Type | Regions | Interpretation |
|---|---|---|---|
| Corn | High-High | South Central, Southeast | Joint drought-sensitive zones |
| Soybean | High-High | South Central | Overlaps with Corn (joint vulnerability) |
| Corn | Low-Low | North Central, East Central | Corn shows regional drought resilience |
| Soybean | Low-Low | North Central, Northwest | Soybean-specific drought resilience |
The histograms of PC1 scores for Corn and Soybean (Fig. 9, top) show nearly symmetric distributions along the predominant direction of variance in drought sensitivity. The Corn distribution has a minor right skew (StdDev: 0.42), whereas the Soybean scores exhibit larger dispersion (StdDev: 0.50), indicating greater regional variability. Moran's scatterplots (Fig. 9, bottom) confirmed positive spatial autocorrelation in PC1_Corn and PC1_Soybean, with R² values of 0.25 and 0.26, respectively. These findings support the use of Local Indicators of Spatial Association (LISA) to detect regional clusters of drought susceptibility.
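The quantities behind a Moran scatterplot and a LISA cluster map can be sketched as follows. This is a minimal illustration under assumptions not stated in the text: the spatial weights are taken as row-standardized k-nearest-neighbour weights with `k=5`, the station coordinates and PC1 scores are synthetic, and the permutation-based significance test that a full LISA analysis requires is omitted. The helper names are hypothetical.

```python
import numpy as np

def knn_weights(coords, k=5):
    """Row-standardized k-nearest-neighbour spatial weights matrix."""
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)               # a station is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    W = np.zeros((len(coords), len(coords)))
    rows = np.repeat(np.arange(len(coords)), k)
    W[rows, neighbours.ravel()] = 1.0 / k     # each row sums to 1
    return W

def morans_i(values, W):
    """Global Moran's I for row-standardized weights."""
    z = np.asarray(values, dtype=float) - np.mean(values)
    return float(z @ W @ z / (z @ z))

def local_morans(values, W):
    """Local Moran statistics and Moran-scatterplot quadrant per station."""
    z = np.asarray(values, dtype=float) - np.mean(values)
    lag = W @ z                               # mean deviation of the neighbours
    local_i = z * lag / ((z @ z) / len(z))
    quadrant = np.where(z >= 0,
                        np.where(lag >= 0, "HH", "HL"),
                        np.where(lag >= 0, "LH", "LL"))
    return local_i, quadrant

# Synthetic stations with a west-east gradient in PC1 score plus noise.
rng = np.random.default_rng(3)
coords = rng.uniform(0.0, 1.0, size=(60, 2))
pc1 = coords[:, 0] + 0.1 * rng.normal(size=60)

W = knn_weights(coords, k=5)
global_i = morans_i(pc1, W)
local_i, quadrant = local_morans(pc1, W)
labels, counts = np.unique(quadrant, return_counts=True)
print(f"Global Moran's I: {global_i:.3f}")
print("Quadrant counts:", dict(zip(labels.tolist(), counts.tolist())))
```

With row-standardized weights, the mean of the local statistics equals the global Moran's I, which is also the slope of the Moran scatterplot; HH and LL quadrants correspond to clusters and HL and LH quadrants to spatial outliers, once significance has been assessed.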
4. Conclusion
This study presents a comprehensive, multi-year investigation into drought-crop sensitivity in Iowa, combining station-level meteorological and yield data for Corn and Soybean from 1998 to 2022. We used Spearman correlation analysis to identify distinct drought response patterns for each crop. Corn yields exhibit increased susceptibility to short-term atmospheric drought, as indicated by indices such as SPI-1 and EDDI, suggesting that Corn is particularly vulnerable to rapid-onset droughts during critical growth phases. Conversely, Soybean yields exhibited greater sensitivity to prolonged cumulative moisture deficits, with more robust correlations identified for SPI-6, PDSI, and CMI, indicating Soybean's dependence on stable soil moisture during its growth period.
We performed Principal Component Analysis (PCA) to reduce dimensionality and identify the predominant patterns in drought-yield interactions. The first principal component (PC1) aggregated multivariate drought sensitivity patterns into a single interpretable score for each location. Spatial analysis using Local Indicators of Spatial Association (LISA) identified statistically significant clusters of drought sensitivity. For Corn, High-High (HH) clusters, which are areas exhibiting consistently high drought sensitivity, were primarily located in South Central and Southeastern Iowa, whereas Low-Low (LL) clusters, which signify drought resilience, were identified in North Central and East Central Iowa. The geographical distribution for Soybean differed: HH clusters were present in South Central Iowa, but LL clusters were more prominent in the North Central and Northwestern regions, indicating that Soybean may exhibit more drought resilience than Corn in those locations.
The LISA results were supported by Moran's scatterplots and PC1 histograms, which indicated moderate positive spatial autocorrelation and a nearly symmetric distribution of drought sensitivity scores. These diagnostics strengthened the statistical validity of the geographical clustering outcomes and confirmed that the PC1 values were suitable for localized spatial analysis. Using k-means clustering, stations were grouped into regional sensitivity groups based on their drought-yield correlation patterns, reinforcing the geographical gradients identified using LISA. This dual clustering methodology, which combines unsupervised and spatial techniques, emphasizes the need to integrate statistical structure with geographic context when assessing agricultural risk.
From a policy and agricultural management standpoint, these findings are directly actionable. The clear identification of jointly susceptible regions (e.g., South Central Iowa for both crops) and crop-specific resilient zones (e.g., Northwest Iowa for Soybean, East Central Iowa for Corn) highlights the need for customized adaptation strategies. Generic drought approaches may fail to address these complex risk patterns adequately. Incorporating this regional sensitivity information into early warning systems, specialized crop insurance models, and drought preparedness strategies can markedly enhance the efficacy of interventions.
This study makes several contributions. It shows that drought sensitivity in Iowa agriculture depends on the crop, the chosen index, and the location. It also presents a scalable framework for identifying and mapping drought susceptibility by combining PCA with spatial clustering algorithms. These techniques enhance conventional drought monitoring by shifting the emphasis from meteorological anomalies alone to yield-based impact modeling. Subsequent studies should expand upon this approach by integrating real-time soil moisture monitoring, management factors (such as irrigation and tillage), and satellite-derived drought indices. Transitioning to a forecast-oriented or early warning framework might improve strategic planning for climate-resilient agriculture in the U.S. Corn Belt and beyond.