From Hazards to Settlement Planning in Nepal: Using Machine Learning and GIS to Assess the Impact of Critical Flood and Landslide Overlays

Narayan Thapa 1

Kabir Uddin 1

Rajesh Bahadur Thapa 1

Erica Udas 1 ✉ Email: erica.udas@icimod.org

International Centre for Integrated Mountain Development Kathmandu Nepal

Corresponding author: Erica Udas erica.udas@icimod.org

Abstract

Surrounded by active seismic zones and influenced by the Himalayas, one of the most dynamic climatic systems on Earth, Nepal faces increasing threats from cascading natural hazards. Despite the frequent co-occurrence of floods and landslides, national-scale assessments that explicitly quantify their spatial interactions are largely lacking. This study presents the first machine learning and Geographic Information System (GIS) based approach to map country-wide cascading hazard zones for Nepal.

The model is built on topographic, climatic, environmental, and hydrological datasets and yields strong predictive accuracy (Area Under the Curve: 0.84 for floods and 0.85 for landslides). The results indicate that 19% of Nepal's total area, mostly in the lowlands, has medium to very high flood susceptibility, threatening 900 thousand people and over 3.4 million infrastructures, while 40% has medium to very high landslide susceptibility, threatening 200 thousand people and 600 thousand infrastructures. A high-threshold interaction analysis (flood susceptibility above 60% and landslide susceptibility above 80%) produces four zones: low hazard (81% of the area), flood only (9%), landslide only (5%), and critical cascading hotspots (5%).

We further traced cascading landslide and flood flows downstream to the nearest settlements. These critical zones cover approximately 7588 km² and could affect 88 km² of built-up area (12% of the total built-up area) and 1722 km² of cropland. The results reveal that while valley regions show lower probabilities of co-occurring hazards, high mountain areas, especially around glacial lakes and steep slopes, are prone to compound impacts, highlighting potential downstream risk to settlements. These insights provide actionable guidance for sustainable human settlement plans, infrastructure planning, and land use management. Importantly, the study emphasizes that establishing resilient settlements requires considering all potential risk scenarios and mitigation strategies in future infrastructure projects.

Keywords: Risk zonation, susceptibility mapping, land use planning, human settlement, Google Earth Engine

Introduction

The Hindu Kush Himalaya (HKH) is one of the most disaster-prone regions, where active seismic zones, extreme topographic gradients, seasonal monsoon-driven precipitation, and fragile geological formations intersect, creating a highly dynamic risk landscape [1]. Nepal, which lies in the HKH region, frequently experiences cascading natural hazards, particularly floods and landslides, that often interact and trigger chain reactions from upstream to downstream. The underlying causes of these events include young mountainous terrain, unplanned urbanization, poor watershed management, and frequent earthquakes, which together create conditions for natural disasters.

The cascading effects, however, can be triggered by several factors, which may be time-dependent or occur simultaneously. For instance, intense monsoon rainfall can trigger flooding and landslides concurrently. Time-dependent causes, on the other hand, include wildfires that destabilize slopes, often resulting in debris flows [2], and earthquakes that can directly trigger landslides or Glacial Lake Outburst Floods [3]. The devastating impact of the Melamchi flood (2021) was mainly due to a cascading chain reaction [4]. These examples highlight that disasters are no longer defined by natural extremes alone but by the complex interactions among multiple hazards, social vulnerability, and systemic governance gaps [5]. Since floods and landslides can occur in the same region due to the same triggering factor, this study seeks to identify their overlay and flow zones. Decision makers today face the challenge not only of mitigating single hazards but also of responding to chains of cascading events, which requires an understanding of systemic interactions and interrelationships. The concept of cascading effects is relatively new in natural risk governance, with limited methodologies and field applications, yet it is critical for understanding cumulative damage, where multiple hazards affect the same elements at risk (people, infrastructure, and economy). In simple terms, cascading events can be understood as a snowball process in which an initial trigger evolves into a series of adverse outcomes; each branch of the event tree represents a cause-and-effect chain that amplifies vulnerability [6].

Despite this evolving risk landscape, Nepal lacks an integrated national-scale understanding of where floods and landslides intersect as cascading hazards, which limits long-term planning for infrastructure, settlements, agriculture, and emergency preparedness. Mountain settlements in Nepal are particularly vulnerable because of their location on steep slopes, proximity to rivers and glacial lakes, and dependence on limited infrastructure. Identifying zones where floods and landslides interact is therefore critical for ensuring the safety and resilience of these communities, guiding sustainable infrastructure development, and informing risk-informed land use strategies. It also directly supports global frameworks, including the Sendai Framework for Disaster Risk Reduction and the Sustainable Development Goals (SDG 11), which emphasize understanding risk, mitigating risk, evidence-based planning, early warning systems, and resilient infrastructure development [7, 8] required for sustainable human settlement. Although Nepal has advanced national risk management strategies [9], the cascading hazard dimension remains largely unexplored.

Previous studies in Nepal have primarily employed Geographic Information System (GIS)-based multi-criteria analysis (MCA) to assess individual hazards, relying on expert-defined weights for variables such as slope, land cover, elevation, and other parameters [10–12]. Although these methods suffer from subjectivity and limited transferability, they were the only available solution for mapping susceptibility. With advances in machine learning, several methodologies now capture both linear and non-linear relationships among the parameters. This study applied the Random Forest model, which offers data-driven weighting and reduces human bias by automatically ranking variable importance [13, 14], to generate national-scale susceptibility maps for both floods and landslides using recognized drivers, including topographical, hydrological, land, and vegetation characteristics [15, 16]. It also applied unsupervised machine learning to identify clusters of flood and landslide susceptibility. To our knowledge, this is the first national-level assessment in Nepal that combines machine learning based susceptibility modeling for floods and landslides to identify their cascading interaction zones and assess risks to mountain settlements.

The objectives of this study are therefore to:

Apply a Random Forest model within Google Earth Engine (GEE) to generate national-scale flood and landslide susceptibility maps.

Identify four zones (low hazard, flood only, landslide only, and cascading interaction) by integrating the susceptibility outputs using unsupervised machine learning, and calculate the cascading flow and interaction using GIS.

Fill a crucial national-scale data gap to support data-driven decision making for resilient human settlement and infrastructure development in Nepal.

Study Area

Nepal, with its diverse topography, complex geography, and highly variable climate, provides an ideal study ground for risk susceptibility mapping. Globally, Nepal is ranked as the 4th most climate-vulnerable nation and is among the top 20 countries that face multiple hazards, both natural and human-induced [17]. Loss and damage figures for the past 50 years show that, on average, two people died and more than 300 families were affected by all kinds of disasters [9]. The analysis of the past fifty years shows that floods and landslides are the major risk concerns in the Nepalese context. Landslides cause an average of 113 deaths each year [18]. Floods account for 9.33% of total disaster-related deaths and 61.60% of affected families [9].

Fig. 1

Study area showing places and the river network across varying elevations.

Data used and methodology

Data used

We used freely available datasets to develop the model for cascading zones, starting with flood and landslide susceptibility mapping in Nepal. Historical flood and landslide inventories document past hazard occurrences for model development. The topographical datasets, which include the DEM, slope, aspect, TRI, TWI, HAND, and landforms, capture terrain characteristics relevant to water flow and slope stability. Land cover and NDVI provide information on vegetation and soil exposure. Hydrological datasets (distance from rivers, soil properties, hydraulic conductivity) describe water movement and infiltration. Climate data, such as precipitation, represent rainfall-driven hazard potential. OpenStreetMap data (buildings, roads) are used to identify exposure. These parameters have been widely used in past flood and landslide susceptibility modeling studies.

Table 1
Data sources for flood and landslide modeling (see detailed maps in Supplementary 1).
Datasets	Time	Data sources	Resolution	Type	Remarks
Nepal administrative boundary data		Survey Department		Administrative	Digitization and vector data processing within a Geographic Information System (GIS)
Flood data	2019–2023	[19, 20]	10 meters	Hazards inventory	Based on flood inundation mapping from Copernicus data using thresholding.
Landslide	1960–2017	[21–23]		Hazards inventory	Digitization of high-resolution satellite imagery from Google Earth.
Copernicus Digital Elevation Model	2021		30 meters	Topographical	Developed from radar data collected by the TanDEM-X satellite mission.
HAND	2016		30 meters	Topographical	Terrain analysis algorithm that measures the vertical distance above the nearest water feature.
Landform	2006–2011	[24]	30 meters	Topographical	Derived from a Digital Elevation Model using the multi-scale Topographic Position Index and the Continuous Heat-Insolation Load Index.
OpenStreetMap data (Road, building, River network)	2022	OpenStreetMap		Exposure and topographical	Created by global volunteers who digitize high-resolution imagery using a participatory approach.
Precipitation	2018–2023	[25]	Resampled to 30 meters	Climate dataset	Developed by combining satellite infrared imagery with rain gauge data.
Silt and Sand	2020	[26]	30 meters	Hydrological datasets	Developed from field survey data and resampled to produce the final product.
Land cover	2021	[27]	30 meters		Developed from Landsat data using a random forest classifier.
Hydraulic conductivity	2015	[28]	30 meters	Hydrological datasets	Estimated using predictive models, such as pedotransfer functions based on soil properties.
Slope	2022		30 meters	Topographical	Measures the maximum rate of change of elevation using a 3 × 3 moving window on the DEM.
Aspect	2022		30 meters	Topographical	Determined from the DEM as the downslope direction of the steepest slope for each cell, measured in degrees clockwise from north.
Flow accumulation	2022		30 meters	Hydrological datasets	Represents the number of upslope cells contributing flow to a given downslope cell, calculated from a pre-computed flow direction raster.
Flow direction	2022		30 meters	Hydrological datasets	Computed from the DEM by identifying the path of steepest descent from each cell to one of its neighbors using the D8 flow direction method.
Terrain Wetness Index	2022		30 meters	Topographical	Derived from the DEM as the natural logarithm of the ratio of the upslope contributing area to the tangent of the local slope.
Terrain Roughness Index	2022		30 meters	Topographical	Calculated as the standard deviation of elevation within a 3 × 3 moving window.
Normalized Difference Vegetation Index (NDVI) from Landsat	2022		30 meters	Vegetation	Calculated from the near-infrared and red bands.

Methodology

The methodology is divided into two parts. First, flood susceptibility maps were generated using a Random Forest regression model with various flood-related variables. Landslide susceptibility was modeled in the same way using landslide-related variables, and both were validated against ground truth samples and existing inventories. Second, an unsupervised machine learning model was employed to cluster the flood and landslide susceptibility data. Finally, a detailed impact assessment was conducted for single hazards and for the cascading overlay and flow areas, as shown in Fig. 2.

Fig. 2

Methodological flow chart for mapping interaction of flood and landslide

Sampling techniques

After compiling all the required data, a stratified random sampling method was used, which divides the population into distinct subgroups to ensure a more accurate representation of the entire population, thereby reducing overfitting and spatial autocorrelation [29]. The sample size was designed to provide a 99.9% confidence level, using Eq. 1 and Eq. 2 [30].

$s = z^{2}\,\frac{p(1-p)}{m^{2}}$

Eq. 1

Here $s$ is the sample size, $z$ is the z-score corresponding to the confidence level, $p$ is the population proportion to be used, and $m$ is the margin of error. The adjusted sample size formula is used when the population size is known:

$s_{adj} = \frac{s\,N}{s + N - 1}$

Eq. 2

Here $s$ is the sample size calculated from Eq. 1, $N$ is the population size, and $s_{adj}$ is the adjusted sample size to be used.

The total population, represented by the image pixels that cover the whole of Nepal at a resolution of 30 meters, is approximately thirty-seven million. Using Eq. 1 and Eq. 2, the total number of samples required was found to be 1600, of which 70% was used for training and 30% for testing the models.
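The sample-size calculation in Eq. 1 and Eq. 2 can be illustrated with a short Python sketch. The text does not report the proportion or the margin of error, so the values below (p = 0.5 and a 4% margin) are assumptions chosen only for illustration; with them the formulas give a sample size of the same order as the 1600 samples used in the study.

```python
def cochran_sample_size(z: float, p: float, m: float) -> float:
    """Eq. 1: s = z^2 * p(1 - p) / m^2."""
    return z ** 2 * p * (1.0 - p) / m ** 2


def adjusted_sample_size(s: float, n_population: int) -> float:
    """Eq. 2: s_adj = s * N / (s + N - 1)."""
    return s * n_population / (s + n_population - 1)


# z-score for a two-sided 99.9% confidence level.
Z_999 = 3.2905
# Assumed inputs (not reported in the text): p = 0.5 and a 4% margin of error.
s = cochran_sample_size(Z_999, p=0.5, m=0.04)
# Population of about thirty-seven million pixels, as stated in the text.
s_adj = adjusted_sample_size(s, n_population=37_000_000)

# 70/30 split of the 1600 samples reported in the study.
n_train = round(0.7 * 1600)
n_test = 1600 - n_train
print(round(s), round(s_adj), n_train, n_test)
```

Because the population is very large compared with the sample, the finite-population adjustment in Eq. 2 changes the result only marginally.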

The flood and landslide inventories were then transformed into point datasets. For the stratified random sampling, we created a grid with a resolution of 400 m × 400 m so that the samples were spread over the whole study area at least 400 meters apart, which minimizes spatial autocorrelation.
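One simple way to realise such grid-based spreading is to keep a single inventory point per grid cell. The sketch below is a hypothetical illustration, not the authors' GEE workflow: it assumes projected coordinates in meters, and the function name `thin_points_by_grid` is introduced here for illustration. Note that keeping one point per cell limits point density but does not by itself guarantee a 400 m separation between points in neighbouring cells.

```python
import random


def thin_points_by_grid(points, cell_size=400.0, seed=0):
    """Keep one randomly chosen point per cell_size x cell_size grid cell."""
    rng = random.Random(seed)
    cells = {}
    for x, y in points:
        # Integer index of the grid cell that contains the point.
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append((x, y))
    return [rng.choice(members) for members in cells.values()]


# Hypothetical inventory points in projected coordinates (meters).
inventory = [(10.0, 10.0), (20.0, 30.0), (450.0, 10.0), (900.0, 900.0)]
thinned = thin_points_by_grid(inventory)
print(len(thinned))
```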

Model

This study used a Random Forest regression model in GEE for susceptibility mapping of floods and landslides. A small sample was used to tune the number of trees over the range 400–600, and 600 trees gave the best model accuracy. The model takes binary data (flooded and non-flooded points, or landslide and non-landslide points) and uses bootstrapping to build multiple decision trees [13]. The predictions from individual trees are averaged, enabling the model to capture the relationship between the input variables and the target phenomenon. The model produces continuous outputs ranging from 0 to 1, representing susceptibility scores, using Eq. 3.

Mathematically, it is expressed as:

$f(x) = \frac{1}{B}\sum_{b=1}^{B} f_{b}(x)$

Eq. 3

Here, $f(x)$ is the estimated probability, $B$ is the number of trees, and $f_{b}(x)$ is the prediction of the $b$-th individual tree.

We integrated various environmental parameters (features) for flood and landslide susceptibility mapping. The features were used to train the model, in which multiple decision trees were constructed from bootstrap samples of the training data. At each node of a tree, a random subset of features was selected, and the feature giving the best split among them was used. This randomization reduces the correlation between trees and decreases the model’s variance.
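The bagging-and-averaging idea behind Eq. 3 can be shown with a deliberately simplified Python sketch. It is not the GEE implementation used in the study: each "tree" is a one-split stump, the random feature subset has size one, and the two features (slope and an uninformative noise column) and all function names are hypothetical.

```python
import random


def fit_stump(rows, feature):
    """Fit a one-split regression stump on a single feature."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        left = [y for x, y in rows if x[feature] <= threshold]
        right = [y for x, y in rows if x[feature] > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # the feature is constant in this bootstrap sample
        mean = sum(y for _, y in rows) / len(rows)
        return feature, float("inf"), mean, mean
    return feature, best[1], best[2], best[3]


def fit_forest(rows, n_trees=50, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # bootstrap sample
        feature = rng.randrange(n_features)         # random feature subset (size 1)
        forest.append(fit_stump(sample, feature))
    return forest


def predict(forest, x):
    """Eq. 3: average the B individual tree predictions."""
    votes = [lm if x[f] <= t else rm for f, t, lm, rm in forest]
    return sum(votes) / len(votes)


# Toy training data: (slope in degrees, noise) -> 1.0 for event, 0.0 for non-event.
rows = [((float(s), float(i % 2)), 1.0 if s > 30 else 0.0)
        for i, s in enumerate(range(5, 65, 5))]
forest = fit_forest(rows)
steep = predict(forest, (50.0, 0.0))
gentle = predict(forest, (10.0, 0.0))
print(gentle, steep)
```

Because every stump predicts a mean of 0/1 labels, the averaged output stays between 0 and 1 and can be read as a susceptibility score.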

The importance of each parameter in the model was determined using the permutation importance method, as shown in Eq. 4. This method measures the importance of a feature as the increase in the model’s error metric when the values of that feature are randomly shuffled [13].

Mathematically, the importance of a feature is computed as:

$importance_{j} = MSE_{shuffled} - MSE_{baseline}$

Eq. 4

Where $importance_{j}$ is the importance score of feature $j$, $MSE_{baseline}$ is the mean square error of the model, and $MSE_{shuffled}$ is the mean square error after shuffling the values of the feature.
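A minimal Python sketch of Eq. 4 is given below. It assumes a generic `model` callable and averages the error increase over several shuffles, which is a common practice rather than something stated in the text; the toy model and data are hypothetical.

```python
import random


def mse(model, X, y):
    """Mean square error of model predictions against targets y."""
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, feature, n_repeats=20, seed=0):
    """Eq. 4, averaged over shuffles: MSE_shuffled - MSE_baseline."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [x[feature] for x in X]
        rng.shuffle(column)
        # Rebuild every row with the shuffled value in place of the original.
        X_shuffled = [x[:feature] + (v,) + x[feature + 1:]
                      for x, v in zip(X, column)]
        increases.append(mse(model, X_shuffled, y) - baseline)
    return sum(increases) / n_repeats


def toy_model(x):
    """Hypothetical model that depends on slope (feature 0) only."""
    return 1.0 if x[0] > 30 else 0.0


X = [(float(s), float(i % 2)) for i, s in enumerate(range(5, 65, 5))]
y = [1.0 if x[0] > 30 else 0.0 for x in X]
slope_importance = permutation_importance(toy_model, X, y, feature=0)
noise_importance = permutation_importance(toy_model, X, y, feature=1)
print(slope_importance, noise_importance)
```

Shuffling a feature that the model ignores leaves the error unchanged, so its importance is zero, while shuffling an informative feature raises the error.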

Accuracy Assessment

The accuracy of the model was evaluated using the Receiver Operating Characteristic (ROC) curve, a graphical representation of the model’s ability to distinguish between actual events and non-events [31]. The curve plots the true positive rate (TPR), also known as sensitivity, against the false positive rate (FPR), which are calculated as:

$TPR = \frac{TP}{TP + FN}$

Eq. 5

$FPR = \frac{FP}{FP + TN}$

Eq. 6

In these formulas, TP represents True positives, FN represents False Negatives, FP represents False Positives, and TN represents True Negatives [32]. The Area Under the Curve (AUC) is a single number summary of the overall performance of the ROC curve, ranging from 0 to 1, where 1 denotes a perfect classifier and a value of 0.5 denotes a worthless classifier.
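As an illustration of how the ROC curve and AUC follow from Eq. 5 and Eq. 6, the sketch below sweeps a threshold over a set of susceptibility scores and integrates the curve with the trapezoidal rule. The scores and labels are made-up toy values, not data from the study.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs (Eq. 6, Eq. 5) for every score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy susceptibility scores with 1 = event and 0 = non-event.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(roc_points(scores, labels))
print(round(toy_auc, 3))
```

In this toy case eight of the nine event/non-event pairs are ranked correctly, so the AUC is 8/9.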

Published data were used for validation. The susceptibility results were compared with the published data using the Root Mean Square Error (RMSE), bias, and index of agreement (see Supplementary 2).

The RMSE provides the standard deviation of the residuals, indicating how far the susceptibility product deviates from the validation data.

$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(p_{i} - o_{i})^{2}}$

Eq. 7

The bias measures the average error between the two results and helps to identify whether the results are overestimated or underestimated.

$b = \frac{1}{n}\sum_{i=1}^{n}(p_{i} - o_{i})$

Eq. 8

The index of agreement ($D$) compares the similarity between the two datasets, indicating reliability.

$D = 1 - \frac{\sum_{i=1}^{n}(p_{i} - o_{i})^{2}}{\sum_{i=1}^{n}\left(|p_{i} - \bar{o}| + |o_{i} - \bar{o}|\right)^{2}}$

Eq. 9

Here $n$ is the total number of observations, $p_{i}$ is the susceptibility value, $o_{i}$ is the validation value from a different source, and $\bar{o}$ is the mean of the observed data.
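The three validation metrics can be computed as in the sketch below. It assumes the standard (Willmott) form of the index of agreement for Eq. 9, and the predicted and observed values are toy numbers chosen for illustration.

```python
import math


def rmse(p, o):
    """Eq. 7: root mean square error between predictions p and observations o."""
    return math.sqrt(sum((pi - oi) ** 2 for pi, oi in zip(p, o)) / len(o))


def bias(p, o):
    """Eq. 8: mean of (p - o); positive values mean p is higher than o."""
    return sum(pi - oi for pi, oi in zip(p, o)) / len(o)


def index_of_agreement(p, o):
    """Eq. 9 in the standard Willmott form (an assumption of this sketch)."""
    o_mean = sum(o) / len(o)
    num = sum((pi - oi) ** 2 for pi, oi in zip(p, o))
    den = sum((abs(pi - o_mean) + abs(oi - o_mean)) ** 2 for pi, oi in zip(p, o))
    return 1.0 - num / den


observed = [0.2, 0.5, 0.9]
predicted = [0.3, 0.6, 1.0]   # toy values, uniformly 0.1 higher than observed
print(rmse(predicted, observed), bias(predicted, observed),
      index_of_agreement(predicted, observed))
```

A positive bias therefore means that the susceptibility values are on average higher than the reference data, and a negative bias means they are lower.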

Unsupervised machine learning involves algorithms that find patterns in unlabeled data without a target outcome. The goal is to discover the inherent data structure through methods such as clustering or dimensionality reduction. Thresholds of flood susceptibility above 0.6 and landslide susceptibility above 0.8 were applied to identify clusters where flood and landslide events might occur separately or together. In this study, K-means clustering was applied, which minimizes an objective function representing the within-cluster sum of squares:

$J = \sum_{k=1}^{K}\sum_{x_{i} \in C_{k}} \Vert x_{i} - \mu_{k} \Vert^{2}$

Eq. 10

This equation sums the squared Euclidean distance between each grid cell and its assigned cluster centroid, optimally grouping areas with similar hazard susceptibility [33]. This highlights three zones: safe zone, flood risk zone, and landslide risk zone. The landslide risk zone was further classified into two zones, one where landslides occur alone and one where floods and landslides interact, using the equation below:

Let $A$ be the area with flood susceptibility above 60% and $B$ the set of areas with landslide susceptibility above 80%. Then

$\text{cascading zone} = \{\, b \in B \mid b \cap A = \varnothing \ \text{and} \ \partial b \cap \partial A \neq \varnothing \,\}$

Eq. 11

Here $b \cap A = \varnothing$ indicates that the landslide susceptibility area $b$ does not overlap with any flood susceptibility area $A$, and $\partial b \cap \partial A \neq \varnothing$ means that the boundaries of the landslide susceptibility area $b$ and the flood susceptibility area $A$ intersect, that is, the two areas are spatially adjacent.
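As an illustration of the clustering step in Eq. 10, the following pure-Python sketch runs Lloyd's k-means algorithm on a handful of (flood susceptibility, landslide susceptibility) pairs. The toy cells, the deterministic initialisation, and the function names are assumptions made for this example and are not taken from the study.

```python
def sq_dist(a, b):
    """Squared Euclidean distance, the term summed in Eq. 10."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, centroids):
    """Assign every point to its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm for k >= 2 clusters."""
    ordered = sorted(points)
    # Deterministic initialisation: k points spread evenly through the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(n_iter):
        clusters = assign(points, centroids)
        updated = [tuple(sum(c) / len(m) for c in zip(*m)) if m else centroids[j]
                   for j, m in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Toy grid cells given as (flood susceptibility, landslide susceptibility).
low = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.2)]
flood_only = [(0.8, 0.1), (0.9, 0.2), (0.85, 0.15)]
slide_only = [(0.1, 0.9), (0.2, 0.85), (0.15, 0.95)]
centroids, clusters = kmeans(low + flood_only + slide_only, k=3)
print(centroids)
```

With three clusters the toy cells separate into a low hazard group, a flood group, and a landslide group, mirroring the three zones described in the text.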

The susceptibility thresholds for defining the high-risk flood and landslide classes were selected based on an empirical comparison with historical hazard event datasets. Multiple threshold values were tested and overlaid with past events. A flood susceptibility value of > 0.60 showed the highest spatial correspondence with known floods involving cascading events. Similarly, a landslide susceptibility threshold of > 0.80 provided the best agreement with mapped landslide scars (Supplementary S15).
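On a raster, the adjacency rule of Eq. 11 can be approximated by labelling connected patches of high landslide susceptibility and keeping those that touch a high flood susceptibility cell. The sketch below is one possible raster interpretation using 4-neighbour connectivity, which is an assumption of this example; the grids are toy values.

```python
def cascading_patches(flood, landslide, flood_thr=0.6, slide_thr=0.8):
    """Return landslide patches that do not overlap but touch flood cells."""
    rows, cols = len(flood), len(flood[0])
    is_flood = [[flood[r][c] > flood_thr for c in range(cols)] for r in range(rows)]
    # Landslide cells exclude flood cells so that b and A never overlap.
    is_slide = [[landslide[r][c] > slide_thr and not is_flood[r][c]
                 for c in range(cols)] for r in range(rows)]
    seen, cascading = set(), []
    for r in range(rows):
        for c in range(cols):
            if not is_slide[r][c] or (r, c) in seen:
                continue
            patch, touches_flood, stack = [], False, [(r, c)]
            seen.add((r, c))
            while stack:  # flood fill over one connected landslide patch
                i, j = stack.pop()
                patch.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if not (0 <= ni < rows and 0 <= nj < cols):
                        continue
                    if is_flood[ni][nj]:
                        touches_flood = True
                    elif is_slide[ni][nj] and (ni, nj) not in seen:
                        seen.add((ni, nj))
                        stack.append((ni, nj))
            if touches_flood:
                cascading.append(sorted(patch))
    return cascading


flood_grid = [[0.9, 0.1, 0.1, 0.1],
              [0.9, 0.1, 0.1, 0.1],
              [0.9, 0.1, 0.1, 0.1]]
slide_grid = [[0.1, 0.9, 0.1, 0.9],
              [0.1, 0.9, 0.1, 0.1],
              [0.1, 0.1, 0.1, 0.1]]
zones = cascading_patches(flood_grid, slide_grid)
print(zones)
```

Here the two-cell landslide patch next to the flooded column is flagged as a cascading zone, while the isolated landslide cell in the last column remains a landslide-only area.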

Results

Flood susceptibility was assessed using a Random Forest model incorporating eleven key conditioning factors. The parameters, ranked by importance, were accessibility > flow accumulation > landform > land cover > slope > hydraulic conductivity > flow direction > terrain roughness index > elevation > terrain wetness index > HAND, as shown in Supplementary S9 a). The flood susceptibility model achieved an AUC of 0.84 on the validation dataset, as shown in the ROC curve in Supplementary S8 a). The susceptibility data were further validated against a hydrological model dataset, METROR’s 100-year return period flood data, giving an index of agreement of 74%, an RMSE of 0.48, and a positive bias of 0.18, as shown in Supplementary S10 a). The positive bias indicates that the susceptibility data give higher values than the METROR dataset. This might be because the flooded and non-flooded training data capture the latest incidents, whereas the METROR dataset uses only DEM, rainfall, and land cover as primary input parameters.

Approximately 19% of Nepal falls into medium to very high susceptibility zones, predominantly in the southern lowland regions (Fig. 3a). Exposure analysis reveals significant vulnerability: nearly 900 thousand people and over 3.4 million infrastructures are in the medium to very high-risk zones, as shown in Supplementary ST1.

Fig. 3

a) Flood risk susceptibility map and b) landslide risk susceptibility map

Landslide susceptibility was assessed using a Random Forest model incorporating twelve key conditioning factors. These variables, ordered by their importance for prediction (slope > aspect > NDVI > DEM > sand > silt > distance from roads and rivers > TWI > landform > precipitation > land cover), are recognized indicators of slope stability, as shown in Supplementary S9 b). The model achieved robust performance with an Area Under the Curve (AUC) value of 0.85, as shown in the ROC curve in Supplementary S8 b). The susceptibility data showed strong agreement with a manually digitized inventory of 2015 earthquake-induced landslides, with an index of agreement of 81%, a small negative bias of -0.18, and an RMSE of 0.43, as shown in Supplementary S10 b). The negative bias may be attributed to the inclusion of NDVI as a predictor, which reflects vegetation cover. Nevertheless, the overall agreement with observed landslides and the robust discrimination ability indicate that the model provides reliable and conservative estimates of landslide susceptibility.

Approximately 40% of the total area falls in medium to very high-risk zones, predominantly in the northern part of Nepal (Fig. 3b). Exposure analysis indicates that two hundred thousand individuals and six hundred thousand infrastructures fall within medium to very high landslide susceptibility zones, revealing localized but critical vulnerabilities, particularly in the mountainous and hilly regions of Nepal, as shown in Supplementary ST2.

Fig. 4

A) Flood and landslide clusters in hilly regions, B) flood and landslide clusters in a valley, C) flood and landslide clusters in the southern part of Nepal, and D) flood and landslide clusters in the northern part of Nepal

Both flood and landslide susceptibility data were used as the primary parameters to develop the clustered areas, which highlight the low hazard zone, flood zone, and landslide zone. The flood risk zones were observed mostly in the southern part of Nepal, as shown in Fig. 4D), whereas the landslide zones were observed predominantly in the northern part of Nepal (Fig. 4C). Clusters of both flood and landslide were observed in the middle part of Nepal, as shown in Fig. 4A) and B). At the country scale, low hazard zones cover 81% of the total area, the flood risk zone 9%, and the landslide risk zone 10%.

Fig. 5

A) National-scale flood and landslide overlay flow, B) and D) detailed regional views illustrating landslide-river interaction in northern mountainous Nepal, C) flood and landslide interaction in the middle-hills area, E) the valley, and F) a mountainous region featuring a glacial lake

To understand the complex interactions between floods and landslides, we classified the cluster analysis results into four hazard zones using Eq. 11. Of the total area, 81% was found to be low hazard zone, 9% high flood-only risk zone, 5% cascading zone, and 5% high landslide-only risk zone (Supplementary S18). The low hazard zone exhibits minimal susceptibility to both hazards (flood susceptibility < 0.6, landslide susceptibility < 0.8) and spans most of Nepal. High flood zones occur primarily in the southern regions and river valleys, while landslide-only zones are spatially isolated but often parallel to flood-prone areas. Finally, cascading zones are defined as areas of high landslide probability immediately adjacent to flood-prone areas, highlighting locations where landslides could trigger secondary flooding by blocking river channels.

To delineate the impact of cascading landslide hazards, we derived the cascading flood hazard zone, defined as the flood-prone area that intersects with high-probability landslide zones. The runoff from these cascading landslide zones was modeled to extend downstream to the nearest settlement, capturing the potential impact on communities. If the cascading flood encounters another landslide-prone area along the way, the flow is propagated further to the next downstream settlement, iteratively accounting for compounded hazard interactions. With this approach, the cascading flood zone was found to cover 7588 km², affecting 88 km² of built-up area (12% of the total) and 1722 km² of cropland, underscoring the exposure of settlements and agricultural land to compounding hazards. The spatial patterns show minimal cascading effects in the central region (Fig. 5E), where stable geomorphic conditions and gentle slopes limit simultaneous hazard occurrence. In contrast, steep terrain with dense river networks and fragile land cover (Fig. 5B, C, D, F) exhibits a higher likelihood of compounding events.
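The downstream propagation rule can be sketched as a walk along a flow-direction grid. The code below is one possible reading of the rule, not the authors' GIS procedure: flow directions are stored as hypothetical (row, column) offsets per cell, the trace stops at the first settlement, and crossing another landslide-prone cell extends it to the following settlement.

```python
def trace_downstream(start, flow_dir, settlements, landslide_prone, max_steps=10_000):
    """Follow flow directions from start until the relevant settlement is reached."""
    path = [start]
    extend = False   # True once the flow has crossed another landslide-prone cell
    cell = start
    for _ in range(max_steps):
        step = flow_dir.get(cell)
        if step is None:          # outlet or edge of the mapped area
            break
        cell = (cell[0] + step[0], cell[1] + step[1])
        path.append(cell)
        if cell in landslide_prone:
            extend = True
        elif cell in settlements:
            if not extend:
                break
            extend = False        # continue once, then stop at the next settlement
    return path


# Toy river channel: a single row of cells that all drain to the east.
flow_dir = {(0, c): (0, 1) for c in range(6)}
settlements = {(0, 2), (0, 5)}

simple_path = trace_downstream((0, 0), flow_dir, settlements, landslide_prone=set())
compound_path = trace_downstream((0, 0), flow_dir, settlements, landslide_prone={(0, 1)})
print(simple_path[-1], compound_path[-1])
```

The cells visited by each trace can then be overlaid with built-up and cropland layers to estimate the exposed area.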

Discussion

This study provides the first national-scale, machine learning based cascading flood-landslide zonation in Nepal. By integrating a Random Forest model with large geospatial datasets and unsupervised clustering, the analysis presents a more robust, objective, and reproducible understanding of the country’s multi-hazard landscape than traditional, expert-driven methods.

The RF model demonstrated high predictive accuracy, benefiting from the hyperparameter tuning and large-scale computation offered by Google Earth Engine (GEE). Compared to traditional approaches such as Multi-Criteria Decision Analysis (MCDA) or the Analytical Hierarchy Process (AHP), which are limited in handling numerous parameters and complex pairwise comparisons [34], the machine learning approach allowed the incorporation of 11 variables for floods and 12 for landslides. This increased the robustness, objectivity, and generalizability of hazard predictions, a trend corroborated by other geospatial applications such as wetland mapping and temperature estimation [35]. The RF model's automated variable importance ranking reduces human bias and improves transferability to other regions within Nepal or the broader Himalayan region.

The susceptibility maps reveal distinct spatial patterns of flood and landslide risks in Nepal. High flow accumulation areas significantly increase flood risk in the southern part of Nepal by concentrating runoff [36], increasing threats to both population and infrastructure, whereas areas of high hydraulic conductivity exhibit lower flood susceptibility because soils and vegetation improve water absorption. Steep slopes, loose terrain, and weathering processes make the northern hilly and mountainous regions prone to landslides.

Anthropogenic activities such as road construction, land use and land cover change, slope modification, riverbank encroachment, and deforestation further intensify susceptibility [37, 38]. Expanding settlements into flood plains and agricultural intensification in the southern part of Nepal increase exposure, while hill cutting and road expansion in the central and northern mountainous regions weaken slope stability. These anthropogenic drivers highlight the urgency of integrating geospatial evidence into land use decisions.

The susceptibility data were further validated against a hydrological model dataset and an earthquake-induced landslide inventory, which revealed positive and negative biases, respectively. This highlights a crucial implication for disaster management: models should be updated with recent inventories to provide more accurate and timely risk assessments.

Unsupervised machine learning algorithms such as k-means clustering identify natural groupings in this multidimensional data space. The landslide and flood susceptibility data were combined based on their characteristics, resulting in distinct flood susceptible, landslide susceptible, and low hazard zones. This provides a clear, data-driven visualization of the risk distribution, which is a powerful tool for sustainable mountain settlement planning. Decision makers can use the low hazard or no-interaction zones for developing new settlements and infrastructure, as these areas inherently possess combinations of factors that correlate with low risk. The flood risk zone can be targeted for specific mitigation strategies such as implementing structural defenses, enforcing building codes, or focusing on environmental measures. Similarly, the landslide risk zone can be targeted for specific mitigation strategies such as afforestation of sloping areas.

A key advancement of this study is the identification of cascading hazard zones, where flood and landslide susceptibility co-occur or are spatially adjacent. Such areas are susceptible to multi-hazard chains, including landslide-dammed lakes, sudden outburst floods, and slope failures triggered by prolonged inundation. While previous studies in the region typically addressed these hazards in isolation [12, 39–41], our methodology provides a data-driven framework for spatially explicit interaction mapping. These areas represent potential “snowball” processes [42], in which an initial hazard triggers subsequent events. For example, a landslide may block a river, forming a temporary dam that can fail suddenly and generate an outburst flood [43]. Such a cluster was identified in the northeast-to-northwest part of Nepal. Conversely, intense flooding can saturate slopes, increasing landslide probability. Such interactions are particularly relevant in Nepal’s monsoon season, when tectonics, land-use changes, prolonged droughts, and forest fires can amplify cascading risks [44, 45].

The cascading hazard maps have direct relevance for Nepal’s development planning, climate adaptation, and disaster risk reduction agendas. Low-hazard zones offer opportunities for safer settlement expansion and infrastructure placement, whereas highly flood-susceptible zones require stricter zoning, flood-resistant designs, and improved watershed and wetland management. Landslide-dominant areas necessitate slope stabilization, bioengineering, and controlled infrastructure development. Cascading hotspots are particularly important for hydropower planning, road alignment, and community-based early warning systems. In these hotspots, early warning thresholds need to be redetermined to account for possible river damming. Incorporating this map into local government planning processes can help avoid high-risk areas, enhance preparedness, and support evidence-based investment aligned with the Sendai Framework and the Sustainable Development Goal on sustainable cities and communities. In flood-prone southern regions, adherence to flood-resistant building codes, such as FEMA’s BRIC principles, can enhance infrastructure resilience [46]. In northern areas, which are vulnerable to landslides, policies promoting reforestation and restricting hill cutting can stabilize slopes.

Furthermore, the maps guide risk-sensitive land use planning. Sustainable practices such as terracing, contour farming, wetland restoration, and controlled levees (Kasumi-tei) can mitigate cascading hazard impacts [47–50]. Restricting new construction in high-risk zones or relocating settlements can transform hazard-prone areas into safer land uses, such as forests or terraced fields. Integrating susceptibility maps with policy tools allows evidence-based decision-making to enhance resilience and reduce socio-economic losses.

The methodology developed in this study is easily transferable to other multi-hazard contexts because it relies primarily on freely available datasets and is implemented using open-source tools. This makes the framework adaptable for researchers aiming to investigate additional hazard interactions such as wildfire–landslide, drought–flood, earthquake–landslide, or avalanche–GLOF cascading processes. However, applying this workflow to new regions or hazard types requires careful calibration of susceptibility thresholds based on local environmental conditions, hazard characteristics, and inventory quality. Developing region-specific and process-specific thresholds will be essential to ensure accurate, context-driven multi-hazard assessments.

Although this methodology provides valuable national-scale cascading zone mapping and is suitable for high-level strategic planning, it does not incorporate temporal dynamics, which are essential for scenario-based multi-hazard modeling. Because it is based on susceptibility mapping, the current methodology identifies where interactions are likely to occur but does not predict when or how they will evolve over time. Therefore, while the national-scale data guide broad policy, future studies should focus on integrating temporal dynamics and developing specific landslide hazard scenarios. This work could involve incorporating time-series rainfall data, citizen-reported inventories, geomorphological field validation, and dynamic simulation models to generate scenario-based cascading hazard forecasts, including potential one-day to multi-day river blockage and breach sequences.

Conclusion

For the first time, this study presents a high-resolution, national-scale map of cascading flood and landslide susceptibility in Nepal, produced with machine learning on the cloud-based Google Earth Engine platform using open datasets. By integrating diverse geospatial, climatic, soil, and geological datasets along with historical hazard inventories, the Random Forest model demonstrated strong predictive performance while efficiently processing large and complex data. The study identified distinct spatial patterns, with flood risk primarily concentrated in the southern plains and landslide susceptibility highest in the northern hilly and mountainous regions, reflecting differential impacts across Nepal’s varied terrain. Importantly, it identifies significant cascading hazard zones where flood and landslide risks intersect and interact, highlighting the critical need for multi-hazard approaches in disaster risk management. The cascading high-risk zones provide valuable inputs for advanced hydrodynamic and dam-breach modeling, early warning systems, and disaster preparedness planning to design resilient infrastructure and human settlements. Despite these advances, challenges remain in estimating landslide volumes and modeling secondary impacts such as sediment transport and flood wave dynamics; these aspects need further research. Future directions include dynamic multi-hazard simulation frameworks, integration of real-time hydrometeorological data, and open-access databases of cascading hazard events. This framework moves beyond isolated hazard assessments toward integrated risk management, directly supporting climate adaptation, disaster resilience, and planned settlement development within Nepal, and could be scaled out to other Hindu Kush Himalaya countries and beyond.

Acknowledgement

We would like to express our sincere appreciation to the editors and the anonymous reviewers for their constructive comments that helped improve the manuscript. We also wish to acknowledge the contributions of the SERVIR-HKH initiative and the Disaster Risk Reduction team at ICIMOD, whose efforts and collaboration have been invaluable. Additionally, we extend our gratitude to all organizations and individuals who have supported this research.

Funding

The study was funded from ICIMOD’s core funds supported by the governments of Afghanistan, Australia, Austria, Bangladesh, Bhutan, China, India, Myanmar, Nepal, Norway, Pakistan, Sweden, and Switzerland.

Data Availability

Upon publication of this paper, the authors agree to share the processed data used in this study on request.

AI Disclosure Statement

During the preparation of this manuscript, the authors used ChatGPT-4 to improve the language clarity and readability of the text. After using the tool, the authors carefully reviewed and edited the content as necessary and take full responsibility for the final content of the publication. The use of this technology was limited to language enhancement and did not influence the intellectual content, data analysis, or conclusions of the study.

Declarations

Ethics approval and consent to participate

Not Applicable

Consent for publication

Not Applicable

Competing interest

The authors declare no competing interests.

Author Contribution Statement

Kabir Uddin, Erica Udas, and Narayan Thapa contributed to the conceptualization and methodology, and Kabir Uddin and Erica Udas supervised the study. Narayan Thapa conducted a formal analysis with inputs from Kabir Uddin and Rajesh Bahadur Thapa. Narayan Thapa prepared the original draft of the manuscript. All authors participated in writing, reviewing, and editing the manuscript.

References

1. Wester P, Mishra A, Mukherji A, Shrestha AB. The Hindu Kush Himalaya assessment: mountains, climate change, sustainability and people. Springer Nature; 2019.

2. Cannon SH, Gartner JE. Wildfire-related debris flow from a hazards perspective. Debris-Flow Hazards Relat Phenom. 2005;363–85. https://doi.org/10.1007/3-540-27129-5_15.

3. Fan X, Scaringi G, Korup O, West AJ, van Westen CJ, Tanyas H, Hovius N, Hales TC, Jibson RW, Allstadt KE, Zhang L, Evans SG, Xu C, Li G, Pei X, Xu Q, Huang R. Earthquake-Induced Chains of Geologic Hazards: Patterns, Mechanisms, and Impacts. Rev Geophys. 2019;57:421–503. https://doi.org/10.1029/2018RG000626.

4. Sudan Bikash Maharjan JFS, Arun Bhakta SN, Shrestha A, Maharjan GR. Mandira Singh Shrestha, Birendra Bajracharya, and N.G. Maxim Shrestha, Miriam Jackson, The Melamchi flood disaster, (2021) 1–21.

5. Alcántara-Ayala I. Cascading hazards and compound disasters. Npj Nat Hazards. 2025;2. https://doi.org/10.1038/s44304-025-00111-5.

6. Zuccaro G, De Gregorio D, Leone MF. Theoretical model for cascading effects analyses. Int J Disaster Risk Reduct. 2018;30:199–215. https://doi.org/10.1016/J.IJDRR.2018.04.019.

7. United Nations. Goal 11: Sustainable cities and communities | Sustainable Development Goals | United Nations Development Programme, (2015). https://www.undp.org/sustainable-development-goals/sustainable-cities-and-communities (accessed February 26, 2024).

8. United Nations. Sendai Framework for Disaster Risk Reduction 2015–2030, 2015.

9. Sharma AP, Fu X, Kattel GR. Is there a progressive flood risk management in Nepal? A synthesis based on the perspective of a half-century (1971–2020) flood outlook. Nat Hazards. 2023;118:903–23. https://doi.org/10.1007/S11069-023-06035-5.

10. Dev Acharya T, Yang I, Lee D, Dev T, Tae I, Ha D. GIS-based Landslide Susceptibility Mapping of Bhotang, Nepal using Frequency Ratio and Statistical Index Methods. J Korean Soc Surveying Geodesy Photogrammetry Cartography. 2017;35:357–64. https://doi.org/10.7848/ksgpc.2017.35.5.357.

11. Lee S. Current and Future Status of GIS-based Landslide Susceptibility Mapping: A Literature Review. Korean J Remote Sens. 2019;35:179–93. https://doi.org/10.7780/KJRS.2019.35.1.12.

12. Swain KC, Singha C, Nayak L. Flood susceptibility mapping through the GIS-AHP technique using the cloud. ISPRS Int J Geoinf. 2020;9. https://doi.org/10.3390/ijgi9120720.

13. Breiman L. Random Forests, 2001.

14. Belgiu M, Drăguţ L. Random forest in remote sensing: A review of applications and future directions. ISPRS J Photogrammetry Remote Sens. 2016;114:24–31. https://doi.org/10.1016/J.ISPRSJPRS.2016.01.011.

15. Mosavi A, Ozturk P, Chau KW. Flood prediction using machine learning models: Literature review. Water (Switzerland). 2018;10:1–40. https://doi.org/10.3390/w10111536.

16. Kaya CM, Derin L. Parameters and methods used in flood susceptibility mapping: a review. J Water Clim Change. 2023;14:1935–60. https://doi.org/10.2166/WCC.2023.035.

17. UNDRR. Disaster risk reduction in Nepal: Status report 2019. Disaster Risk Reduct Nepal (2019) 1–28. https://www.preventionweb.net/files/68257_682306nepaldrmstatusreport.pdf

18. Adhikari BR, Gautam S. A Review of Policies and Institutions for Landslide Risk Management in Nepal. Nepal Public Policy Rev. 2022;2:93–112. https://doi.org/10.3126/nppr.v2i1.48397.

19. Franz J, Meyer ICIMOD. (2022). https://rds.icimod.org/Home/DataDetail?metadataId=1973127 (accessed May 17, 2024).

20. Thapa N, Nepali S, Shrestha R, Sanjel S. Time series flood mapping using the Copernicus dataset in Google Earth Engine of the Mountainous Region, Data Brief (2025) 112010. https://doi.org/10.1016/J.DIB.2025.112010

21. Alberto Muñoz-Torrero Manchado, Multi-temporal Landslide Inventory for the Far-Western region of Nepal, Zenodo. (2020). https://doi.org/10.5281/zenodo.4290099

22. ICIMOD, Landslide data of 14 earthquake affected districts of Nepal, ICIMOD. (2016). https://doi.org/10.26066/RDS.31016

23. ICIMOD. Landslide data of Koshi basin (within Nepal) of 1960 developed through remote sensing approach, ICIMOD (2017). https://doi.org/10.26066/RDS.34424

24. Theobald DM, Harrison-Atlas D, Monahan WB, Albano CM. Ecologically-Relevant Maps of Landforms and Physiographic Diversity for Climate Adaptation Planning. PLoS ONE. 2015;10:e0143619. https://doi.org/10.1371/JOURNAL.PONE.0143619.

25. Funk C, Peterson P, Landsfeld M, Pedreros D, Verdin J, Shukla S, Husak G, Rowland J, Harrison L, Hoell A, Michaelsen J. The climate hazards infrared precipitation with stations - A new environmental record for monitoring extremes. Sci Data. 2015. https://doi.org/10.1038/sdata.2015.66.

26. NARC, Get digital soil, crop data | NARC, NARC (2020). https://soil.narc.gov.np/getdata (accessed May 21, 2024).

27. Uddin K, Matin MA, Khanal N, Maharjan S, Bajracharya B, Tenneson K, Poortinga A, Quyen NH, Aryal RR, Saah D, Lee Ellenburg W, Potapov P, Flores-Anderson A, Chishtie F, Aung KS, Mayer T, Pradhan S, Markert A. Regional Land Cover Monitoring System for Hindu Kush Himalaya. Earth Observation Science and Applications for Risk Reduction and Enhanced Resilience in Hindu Kush Himalaya Region. Springer International Publishing; 2021. pp. 103–25. https://doi.org/10.1007/978-3-030-73569-2_6.

28. Boer de Froukje HHS. A High Resolution Soil Map of Hydraulic Properties Version 1.1, 2015. www.futurewater.nl

29. Latpate R, Kshirsagar J, Gupta VK, Chandra G. Adv Sampl Methods. 2021. https://doi.org/10.1007/978-981-16-0622-9.

30. Zarkovich SS, Murthy MN. Sampl Theory Methods. 1969. https://doi.org/10.2307/1402105.

31. Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978;8:283–98. https://doi.org/10.1016/S0001-2998(78)80014-2.

32. Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett. 2006;27:861–74. https://doi.org/10.1016/J.PATREC.2005.10.010.

33. Greene D, Cunningham P, Mayer R. Unsupervised Learning and Clustering. Cogn Technol. 2008;51–90. https://doi.org/10.1007/978-3-540-75171-7_3.

34. Munier N, Hontoria E. Uses and Limitations of the AHP Method, (2021). https://doi.org/10.1007/978-3-030-60392-2

35. Hird JN, DeLancey ER, McDermid GJ, Kariyeva J. Google earth engine, open-access satellite data, and machine learning in support of large-area probabilistic wetland mapping. Remote Sens (Basel). 2017;9. https://doi.org/10.3390/rs9121315.

36. Stromberg JC, Beauchamp VB, Dixon MD, Lite SJ, Paradzick C. Importance of low-flow and high-flow characteristics to restoration of riparian vegetation along rivers in arid south-western United States. Freshw Biol. 2007;52:651–79. https://doi.org/10.1111/J.1365-2427.2006.01713.X.

37. Uddin K, Matin MA. Potential flood hazard zonation and flood shelter suitability mapping for disaster risk mitigation in Bangladesh using geospatial technology. Progress Disaster Sci. 2021;11:100185. https://doi.org/10.1016/j.pdisas.2021.100185.

38. Thapa N, Pant P, Prasai R, Mahata A, Dulal S. Sustainable land use planning in developing countries using GIS and multi-criteria analysis: a case study of Lalitpur district, Nepal. City and Built Environment. 2025;3:1–20. https://doi.org/10.1007/S44213-025-00050-X

39. Hammami S, Zouhri L, Souissi D, Souei A, Zghibi A, Marzougui A, Dlala M. Application of the GIS based multi-criteria decision analysis and analytical hierarchy process (AHP) in the flood susceptibility mapping (Tunisia). Arab J Geosci. 2019;12. https://doi.org/10.1007/s12517-019-4754-9.

40. Dahal RK. Rainfall-induced Landslides in Nepal. Int J Eros Control Eng. 2012;5:1–8. https://doi.org/10.13101/ijece.5.1.

41. Cheng G, Wang Z, Huang C, Yang Y, Hu J, Yan X, Tan Y, Liao L, Zhou X, Li Y, Hussain S, Faisal M, Li H. Advances in Deep Learning Recognition of Landslides Based on Remote Sensing Images. Remote Sens (Basel). 2024;16:1787. https://doi.org/10.3390/RS16101787

42. Marzocchi W, Garcia-Aristizabal A, Gasparini P, Mastellone ML, Di Ruocco A. Basic principles of multi-risk assessment: A case study in Italy. Nat Hazards. 2012;62:551–73. https://doi.org/10.1007/S11069-012-0092-X.

43. Mackey BH, Roering JJ, Lamb MP. Landslide-dammed paleolake perturbs marine sedimentation and drives genetic change in anadromous fish. Proc Natl Acad Sci U S A. 2011;108:18905–9. https://doi.org/10.1073/PNAS.1110445108.

44. Forbes K, Broadhead J. Forests and landslides: the role of trees and forests in the prevention of landslides and rehabilitation of landslide-affected areas in Asia., (2013). https://doi.org/10.5555/20133328364

45. Davies T, Rosser N. Landslide hazards, risks and disasters: introduction, Landslide Hazards, Risks, and Disasters (2022) 1–12. https://doi.org/10.1016/B978-0-12-818464-6.00017-2

46. Horn DP, Pre-Disaster FEMA, Mitigation. The Building Resilient Infrastructure and Communities (BRIC) Program, Congressional Research Service, Washington, DC, 2022. https://link.gale.com/apps/doc/A731350336/HRCA?u=anon~f4ab24bd&sid=googleScholar&xid=ed8ff988

47. Hey DL, Philippi NS. Flood reduction through wetland restoration: the Upper Mississippi River Basin as a case history. Restor Ecol. 1995;3:4–17.

48. Keesstra SD, Slovenia SW. Earth Surf Processes Landforms: J Br Geomorphological Res Group. 2007;32:49–65.

49. Nakileza BR, Nedala S. Topographic influence on landslides characteristics and implication for risk management in upper Manafwa catchment, Mt Elgon Uganda. Geoenvironmental Disasters. 2020;7:1–13.

50. Teramura J, Shimatani Y. Advantages of the open levee (Kasumi-tei), a traditional japanese river technology on the matsuura river, from an ecosystem-based disaster risk reduction perspective. Water (Switzerland). 2021;13. https://doi.org/10.3390/w13040480.
