Abstract
Harmful algal blooms (HABs) pose a growing ecological and economic risk in tropical shelf seas, yet their drivers remain poorly constrained along the Somali coast, one of the least monitored upwelling systems of the western Indian Ocean. This study quantifies the spatio-temporal dynamics of chlorophyll-a and associated biogeochemical variables in the Bosaso coastal waters (48.5°–51.9° E; 8.8°–12.7° N) from January 2023 to October 2025 using Copernicus Marine Environment Monitoring Service (CMEMS) reanalysis data. Daily surface fields of chlorophyll-a, nitrate (NO₃), phosphate (PO₄), silicate (Si), and surface partial pressure of CO₂ (spCO₂) were harmonized, regridded, and merged into a unified dataset. Bloom events were identified using a 90th-percentile chlorophyll-a threshold (1.293 mg m⁻³), while a logistic regression model evaluated bloom probability from nutrient–carbon interactions. Results reveal strong seasonal and interannual variability linked to monsoon-driven upwelling, with elevated nutrients and reduced spCO₂ coinciding with bloom peaks. Chlorophyll-a exhibited significant correlations with NO₃ (r = 0.85), PO₄ (r = 0.73), and Si (r = 0.62), consistent with nutrient enrichment as the primary driver of bloom initiation. The logistic model achieved high predictive performance (AUC = 0.97), indicating that simple nutrient and CO₂ metrics can serve as effective early-warning indicators. These findings provide the first quantitative HAB baseline for Puntland’s coastal waters and demonstrate a reproducible computational workflow for regional HAB monitoring and management.
Keywords
Harmful algal blooms
chlorophyll-a
nutrients
monsoon upwelling
Somali coast
Copernicus Marine Service
Bosaso
logistic regression
spCO₂
Arabian Sea
phytoplankton dynamics
climate variability
coastal productivity
biogeochemical reanalysis
early-warning system
1. Introduction
Harmful algal blooms (HABs) — rapid, high-biomass phytoplankton events that may produce toxins or cause ecological disruption — represent a critical risk to coastal marine ecosystems, fisheries and human health [1]. Recent studies indicate an increasing frequency and geographic expansion of HABs, attributed to combined effects of nutrient enrichment, climatic forcing and changing ocean dynamics [2, 3]. The driving mechanisms of HABs in tropical shelf systems remain less well understood compared with temperate zones, and data limitations hamper predictive monitoring in many under-studied regions. In particular, the coastal waters of the Gulf of Aden and northern Indian Ocean margin, including Puntland (Somalia), represent a gap in HAB risk and process knowledge.
Coastal upwelling systems deliver subsurface nutrients to the euphotic zone, enhancing phytoplankton productivity, and are often associated with elevated HAB risk [4]. For instance, Gu et al. (2023) showed that atmospheric high-pressure systems induced mixed layer restratification that triggered large chlorophyll-a blooms in the northern Arabian Sea [5]. In upwelling systems, the coupling of nutrient influx with shallow mixed layers and high light availability creates favourable conditions for bloom growth [6]. Moreover, the timing and magnitude of nutrient pulses modulate bloom onset and intensity [7]. Studies across global upwelling zones show that nitrate and phosphate concentrations, together with silica when diatoms dominate, are strong predictors of bloom initiation [8, 9].
In recent work, Lan et al. (2024) emphasised that coastal HABs are especially sensitive to combined nutrient and hydrodynamic forcing rather than single-factor triggers [10]. This is reinforced by regional investigations demonstrating that nutrient–CO₂ dynamics, rather than solely temperature or light, drive bloom events in tropical waters [11, 12]. In shelf zones influenced by monsoon forcing, like the Somali coast, seasonal reversal of winds drives both vertical mixing and lateral advection of nutrient-rich water, yet HAB occurrence remains largely undocumented [13, 14]. Notably, the Somali Current and associated upwelling during the southwest monsoon deliver nutrient-rich waters that support high productivity, but the linkage to HAB events remains poorly quantified [15].
Efficient monitoring and forecasting of HABs in data-scarce regions require the development of threshold criteria and predictive models. Global satellite mapping shows coastal bloom incidence has risen over recent decades, with more than 120 countries now experiencing regular blooms [16]. Simultaneously, new modelling frameworks reveal “rate-induced” tipping behaviour in plankton systems: rapid changes in forcing can precipitate bloom transitions [17]. For tropical coastal shelves, the key challenge is to identify regional bloom thresholds and the minimal data needed for reliable early-warning systems.
In this study, we address this gap for the Bosaso coastal region of Puntland by using three years of high-resolution biogeochemical and physical reanalysis data to: (i) quantify the spatio-temporal dynamics of chlorophyll-a and nutrient–carbon interplay; (ii) establish a regional chlorophyll-a threshold for bloom identification; and (iii) test a simple predictive model of bloom probability based on nutrient and CO₂ drivers. By doing so, we provide foundational monitoring criteria and a reproducible computational workflow applicable to similar under-observed tropical shelf settings.
2. Materials and Methods
This study was conducted along the Bosaso coastal waters of Puntland, Somalia, located between 48.5°–51.9°E and 8.8°–12.7°N in the Gulf of Aden. The analysis covered the period from January 2023 to October 2025, capturing three consecutive monsoon cycles to evaluate the variability and drivers of chlorophyll-a as an indicator of harmful algal blooms (HABs). The data were obtained from the Copernicus Marine Environment Monitoring Service (CMEMS), which provides global reanalysis and forecast ocean products. The variables used included chlorophyll-a (Chl), nitrate (NO₃), phosphate (PO₄), silicate (Si), and surface partial pressure of CO₂ (spCO₂). These parameters were extracted at approximately 0.49 m depth, representing surface-layer processes most relevant to primary productivity.
All data were accessed programmatically using the CopernicusMarine Python API and downloaded as NetCDF files. The workflow automatically opened each file, harmonized coordinate systems and temporal resolution, regridded physical variables (originally at 0.083°) to match the 0.25° biogeochemical grid, and merged all datasets into a single multi-dimensional array. Temporal alignment ensured a common daily time series between 2023-01-01 and 2025-10-31, and spatial alignment ensured consistency in longitude (converted to the 0–360° domain) and latitude spacing. The merged dataset was exported in both Parquet and CSV formats to facilitate efficient analysis and reproducibility.
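The longitude conversion and regridding steps can be illustrated with a minimal sketch. It assumes bilinear interpolation (the method named in the workflow description) and uses plain NumPy on synthetic fields rather than the study's xarray pipeline; the function names `to_0_360` and `regrid_bilinear`, the grid vectors, and the test field are hypothetical.

```python
import numpy as np


def to_0_360(lon):
    """Map longitudes from the -180..180 convention to the 0..360 domain."""
    return np.mod(lon, 360.0)


def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinear interpolation of a 2-D (lat, lon) field onto another regular grid.

    On a rectilinear grid, bilinear interpolation is separable, so it can be
    done as two passes of 1-D linear interpolation (np.interp).
    """
    # Pass 1: interpolate along longitude for every source latitude row.
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Pass 2: interpolate along latitude for every target longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T


# Hypothetical grids inside the study box (illustrative only).
src_lat = np.arange(8.8, 12.7 + 1e-9, 1 / 12)    # ~0.083 degree physical grid
src_lon = np.arange(48.5, 51.9 + 1e-9, 1 / 12)
dst_lat = np.arange(9.0, 12.5 + 1e-9, 0.25)      # 0.25 degree biogeochemical grid
dst_lon = np.arange(48.5, 51.75 + 1e-9, 0.25)

# Synthetic field that is linear in lat and lon, so bilinear interpolation is exact.
lon2d, lat2d = np.meshgrid(src_lon, src_lat)
physical_field = 20.0 + 0.5 * lat2d - 0.2 * lon2d

coarse = regrid_bilinear(physical_field, src_lat, src_lon, dst_lat, dst_lon)
print(coarse.shape)
```

In an xarray-based workflow the same operation would typically be a single interpolation call on the dataset; the two-pass version here only makes the arithmetic explicit.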
Data cleaning included the removal of anomalous or missing values, followed by linear temporal interpolation. The area-weighted daily mean for each variable was computed to obtain a regional time series using:
$$\bar{X}(t) = \frac{\sum_{i} \cos(\phi_i)\, X_i(t)}{\sum_{i} \cos(\phi_i)}$$
where $\bar{X}(t)$ represents the average over all grid cells $i$ weighted by the cosine of latitude $\phi_i$.
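A minimal NumPy sketch of the cosine-latitude weighting is given here for illustration. The function name `area_weighted_mean`, the array layout (time, lat, lon), and the toy values are assumptions, not the study's code; missing cells are excluded from both the numerator and the denominator.

```python
import numpy as np


def area_weighted_mean(field, lat):
    """Cosine-latitude weighted mean of a (time, lat, lon) array, ignoring NaNs."""
    weights = np.cos(np.deg2rad(lat))[None, :, None]   # broadcast over time and lon
    weights = np.broadcast_to(weights, field.shape)
    valid = np.isfinite(field)
    numerator = np.where(valid, field * weights, 0.0).sum(axis=(1, 2))
    denominator = np.where(valid, weights, 0.0).sum(axis=(1, 2))
    return numerator / denominator


# Toy example: one time step, two latitudes (0 and 60 degrees), two longitudes.
lat = np.array([0.0, 60.0])
field = np.array([[[1.0, 1.0], [3.0, 3.0]]])
regional_mean = area_weighted_mean(field, lat)
print(regional_mean)
```

Because cos(60°) = 0.5, the higher-latitude row counts half as much as the equatorial row, so the weighted mean sits closer to 1 than the unweighted mean of 2.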
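Bloom events were defined using the 90th-percentile threshold of chlorophyll-a ($\mathrm{Chl}_{90} = 1.293$ mg m⁻³), such that a day was classified as a bloom if:
$$\overline{\mathrm{Chl}}(t) \geq \mathrm{Chl}_{90}$$
where $\overline{\mathrm{Chl}}(t)$ is the regional mean chlorophyll-a on day $t$. This binary classification enabled subsequent modeling of bloom likelihood.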
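The percentile-based labelling can be sketched as follows. The chlorophyll series is synthetic (log-normal, fixed seed) and the variable names are hypothetical; only the thresholding logic mirrors the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily regional-mean chlorophyll-a (mg m^-3); not the study data.
chl = rng.lognormal(mean=-0.5, sigma=0.4, size=1000)

# Threshold at the 90th percentile of the series, then label each day.
threshold = np.percentile(chl, 90)
bloom = (chl >= threshold).astype(int)   # 1 = bloom day, 0 = non-bloom day

print(round(float(threshold), 3), int(bloom.sum()))
```

When the percentile is taken over the same series it classifies, roughly one day in ten is flagged by construction; a different share implies that the threshold and the classified series were drawn from different samples.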
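Relationships among variables were analyzed using the Pearson correlation coefficient:
$$r = \frac{\sum_{t} (x_t - \mu_x)(y_t - \mu_y)}{\sqrt{\sum_{t} (x_t - \mu_x)^2 \, \sum_{t} (y_t - \mu_y)^2}}$$
where $x_t$ and $y_t$ are the daily values of two variables and $\mu_x$, $\mu_y$ are their temporal means. This quantified the strength of linear associations between chlorophyll-a, nutrients, and spCO₂. To predict bloom events, a logistic regression model was applied, estimating the probability of bloom occurrence as:
$$P(\text{bloom}) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \beta_1\,\mathrm{NO_3} + \beta_2\,\mathrm{PO_4} + \beta_3\,\mathrm{Si} + \beta_4\,\mathrm{spCO_2}\right)\right]}$$
where $\beta_0$ is the intercept and $\beta_1$ to $\beta_4$ are the fitted coefficients.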
Model performance was evaluated using accuracy, precision, recall, and the area under the ROC curve (AUC). Spatial maps of mean chlorophyll-a were then generated to identify potential bloom hotspots across the Bosaso region. All analyses were implemented in Python using xarray, pandas, and scikit-learn, ensuring a transparent, automated, and fully reproducible workflow.
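A minimal scikit-learn sketch of the modelling and evaluation step is shown here. The predictors and labels are synthetic, and the 70/30 stratified split, the feature standardization, and the 0.5 decision threshold are assumptions, since the text does not specify them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_days = 1000

# Synthetic stand-ins for NO3, PO4, Si and spCO2 (not real data).
X = rng.normal(size=(n_days, 4))
true_logit = -2.5 + 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 2] - 0.5 * X[:, 3]
y = (rng.random(n_days) < 1.0 / (1.0 + np.exp(-true_logit))).astype(int)

# Assumed 70/30 stratified split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

prob = model.predict_proba(X_test)[:, 1]   # P(bloom) for each held-out day
pred = (prob >= 0.5).astype(int)           # assumed 0.5 decision threshold

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "precision": precision_score(y_test, pred, zero_division=0),
    "recall": recall_score(y_test, pred, zero_division=0),
    "auc": roc_auc_score(y_test, prob),
}
print(metrics)
```

AUC is computed from the predicted probabilities, whereas accuracy, precision, and recall depend on the chosen decision threshold, so the two groups of metrics can diverge.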
Data and Materials Availability
All datasets used in this study were obtained from the Copernicus Marine Environment Monitoring Service (CMEMS), which provides open-access ocean reanalysis and forecast products. The specific datasets analyzed include Global Ocean Physics Analysis and Forecast (dataset ID: cmems_mod_glo_phy_anfc_0.083deg_P1D-m), Global Ocean Biogeochemistry CO₂, Nutrients, and Phytoplankton Analysis and Forecast products (dataset IDs: cmems_mod_glo_bgc-co2_anfc_0.25deg_P1D-m, cmems_mod_glo_bgc-nut_anfc_0.25deg_P1D-m, and cmems_mod_glo_bgc-pft_anfc_0.25deg_P1D-m).
All files were accessed programmatically using the CopernicusMarine Python API, harmonized in time and space, and exported as tidy Parquet and CSV datasets to ensure transparency and reproducibility. The processed datasets and analysis scripts can be made available by the corresponding author upon reasonable request.
Computational Reproducibility Statement
All computational steps in this study were performed using open-source software to ensure full reproducibility. Data preprocessing, harmonization, regridding, and analysis were carried out in Python 3.12, employing core scientific libraries including xarray, pandas, numpy, matplotlib, and scikit-learn. The workflow was designed to automatically open all NetCDF files, harmonize coordinates and time dimensions, interpolate physical parameters from the 0.083° grid to the 0.25° biogeochemical grid, merge all variables into a unified dataset, and export results as tidy Parquet and CSV tables. All scripts were executed in a controlled environment with fixed random seeds, ensuring identical results upon rerun. The complete codebase, including the data-processing scripts and model implementation, can be provided upon reasonable request to the corresponding author.
Study Workflow Description (Figure M1)
Figure M1 illustrates the complete analytical pipeline followed in this study, from data acquisition to interpretation. The workflow begins with the download of multi-source ocean datasets from the Copernicus Marine Environment Monitoring Service (CMEMS), including physical (0.083°) and biogeochemical (0.25°) products. Each dataset was opened programmatically using the CopernicusMarine Python API, and all spatial and temporal coordinates were harmonized to ensure alignment across sources. Physical variables were interpolated from the 0.083° grid to the 0.25° biogeochemical grid through bilinear regridding, allowing seamless integration with nutrient and chlorophyll data.
After interpolation, the datasets were merged into a unified daily surface-layer cube covering the Bosaso coastal domain (48.5°–51.9°E; 8.8°–12.7°N) for the period 2023–2025. Quality control and linear interpolation were applied to fill minor data gaps, and the harmonized dataset was exported as tidy Parquet and CSV tables for efficient downstream analysis. The analytical phase included statistical correlation between chlorophyll and nutrient–CO₂ dynamics, definition of bloom thresholds, and logistic-regression modeling of bloom probability. Finally, spatial averaging and mapping were conducted to visualize mean chlorophyll distribution and identify recurrent bloom hotspots along the Bosaso shelf.
Figure M1. Schematic representation of the study workflow showing the sequential steps of data acquisition, coordinate harmonization, regridding, dataset merging, quality control, model training, and spatial mapping for harmful algal bloom analysis in the Bosaso region, Puntland.
3. Results
The three-year time series (January 2023 – October 2025) revealed pronounced temporal and spatial variability in chlorophyll-a and associated environmental parameters within the Bosaso coastal region. Chlorophyll-a concentrations fluctuated between 0.15 and 1.88 mg m⁻³, with an average of 0.72 mg m⁻³. Distinct bloom peaks were observed in early 2023, February–April 2024, and mid-2025, corresponding to enhanced nutrient availability and reduced surface CO₂ levels.
Nutrient concentrations followed coherent seasonal cycles. Nitrate and phosphate displayed rapid enrichment preceding bloom maxima, while silicate peaked concurrently with chlorophyll-a, suggesting a strong diatom contribution during high-productivity phases. The inverse response of spCO₂ confirmed active biological drawdown during bloom periods. Monthly averaging emphasized this coupling, indicating that nutrient-driven productivity was strongest during the southwest-monsoon upwelling season.
Correlation analysis demonstrated significant positive relationships between chlorophyll-a and major nutrients (r = 0.85 for NO₃; r = 0.73 for PO₄; r = 0.62 for Si) and a moderate negative correlation with spCO₂ (r = −0.34). These results are consistent with nutrient enrichment and carbon uptake jointly regulating bloom intensity in the Bosaso coastal system.
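For illustration, the Pearson coefficient behind these values can be computed directly from its definition. The series below are synthetic with hypothetical coefficients, so the printed values are not the study's results; the helper name `pearson_r` is also an assumption.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


rng = np.random.default_rng(1)

# Synthetic daily series: chlorophyll rises with nitrate, spCO2 falls with chlorophyll.
no3 = rng.gamma(shape=2.0, scale=1.5, size=500)
chl = 0.2 + 0.15 * no3 + rng.normal(scale=0.1, size=500)
spco2 = 400.0 - 20.0 * chl + rng.normal(scale=15.0, size=500)

r_no3 = pearson_r(chl, no3)
r_co2 = pearson_r(chl, spco2)
print(round(r_no3, 2), round(r_co2, 2))
```

The coefficient measures linear association only, so a strong r between chlorophyll-a and a nutrient does not by itself establish the direction of causation.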
Applying the 90th-percentile bloom threshold (1.293 mg m⁻³) identified approximately 15% of days as bloom events. Bloom periods aligned closely with nutrient pulses and CO₂ minima, validating the threshold as an effective indicator of anomalous phytoplankton proliferation.
The logistic-regression model constructed from nitrate, phosphate, silicate, and spCO₂ achieved strong discriminative capability (accuracy = 0.90; AUC = 0.966). A precision of 1.00 indicated that every day the model flagged as a bloom was a true bloom day, while the low recall (0.33) showed that only about one-third of bloom days were detected, reflecting the model’s conservative tendency to capture only large-magnitude events. These metrics indicate that nutrient and CO₂ variability rank bloom likelihood well within the study region, although many smaller bloom days are missed at the decision threshold used.
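The relationship between the reported precision, recall, and accuracy can be made concrete with a small worked example. The counts below are hypothetical, chosen only so that 15% of days are bloom days and one-third of them are detected with no false alarms; they are not the study's confusion matrix.

```python
import numpy as np


def classification_summary(y_true, y_pred):
    """Precision, recall and accuracy from binary labels (1 = bloom)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(((y_true == 1) & (y_pred == 1)).sum())
    fp = int(((y_true == 0) & (y_pred == 1)).sum())
    fn = int(((y_true == 1) & (y_pred == 0)).sum())
    tn = int(((y_true == 0) & (y_pred == 0)).sum())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / y_true.size
    return precision, recall, accuracy


# Hypothetical 100-day record: 15 bloom days, of which the model flags only 5.
y_true = np.array([1] * 15 + [0] * 85)
y_pred = np.array([1] * 5 + [0] * 95)

precision, recall, accuracy = classification_summary(y_true, y_pred)
print(precision, round(recall, 2), accuracy)
```

With these counts, a model that never raises a false alarm but misses two-thirds of bloom days still reaches 90% accuracy, because non-bloom days dominate the record.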
Spatially, the mean chlorophyll-a distribution exhibited marked heterogeneity. The highest values (≈ 1.4 mg m⁻³) occurred near 10°–11° N and ≈ 51° E, directly off Bosaso and Ras Qayla, corresponding to persistent upwelling and coastal eddy activity. Offshore waters beyond 12° N were comparatively oligotrophic (< 0.4 mg m⁻³). This gradient underscores the influence of localized shelf processes in sustaining regional productivity.
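Locating a hotspot on the time-mean map can be sketched as follows. The data cube is synthetic, with a bump placed at a hypothetical location (10.5° N, 51.0° E) on an assumed 0.25° grid; none of the values are taken from the study.

```python
import numpy as np

# Assumed 0.25 degree grid inside the study box.
lat = np.arange(9.0, 12.5 + 1e-9, 0.25)
lon = np.arange(48.5, 51.75 + 1e-9, 0.25)
lon2d, lat2d = np.meshgrid(lon, lat)

# Synthetic chlorophyll pattern with a bump at a hypothetical hotspot.
base = 0.3 + 1.1 * np.exp(-((lat2d - 10.5) ** 2 + (lon2d - 51.0) ** 2) / 0.5)
days = np.arange(365)
seasonal = 1.0 + 0.1 * np.sin(2 * np.pi * days / 365)
cube = seasonal[:, None, None] * base[None, :, :]   # shape (time, lat, lon)
cube[10, 0, 0] = np.nan                              # one gap, to show NaN handling

# Time-mean map, then the grid cell with the highest mean value.
mean_map = np.nanmean(cube, axis=0)
i, j = np.unravel_index(np.nanargmax(mean_map), mean_map.shape)
hotspot = (float(lat[i]), float(lon[j]), float(mean_map[i, j]))
print(hotspot)
```

The same two steps (averaging over the time axis, then taking the arg-max over space) apply to any gridded variable in the merged dataset.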
Overall, the results demonstrate that harmful algal bloom intensity and frequency in Puntland waters are primarily governed by nutrient enrichment associated with monsoon-driven upwelling and that the established 1.293 mg m⁻³ chlorophyll-a threshold provides a robust criterion for bloom identification in this ecosystem.
4. Conclusion
This study provides a comprehensive analysis of harmful algal bloom (HAB) dynamics along the Bosaso coast of Puntland using three years (2023–2025) of Copernicus Marine biogeochemical and physical reanalysis data. The results demonstrate that chlorophyll-a variability, used as a proxy for phytoplankton biomass, is predominantly governed by nutrient enrichment, particularly nitrate and phosphate, and is strongly influenced by monsoon-driven upwelling processes. The inverse relationship between chlorophyll-a and surface CO₂ further confirms that bloom events coincide with periods of intense biological carbon uptake.
The 90th-percentile chlorophyll-a threshold (1.293 mg m⁻³) proved to be a reliable criterion for identifying bloom conditions, capturing the seasonal and episodic nature of phytoplankton proliferation in the study region. The logistic regression model achieved high predictive performance (AUC = 0.97), establishing nutrient and CO₂ dynamics as effective predictors of bloom occurrence. Spatial mapping revealed persistent productivity hotspots along the Bosaso shelf, corresponding to zones of enhanced mixing and nutrient inflow.
These findings establish a valuable baseline for HAB monitoring and forecasting in Somali coastal waters. The analytical workflow developed in this study—encompassing automated data harmonization, regridding, and model integration—can be replicated for regional monitoring and early-warning applications. Future research integrating satellite chlorophyll observations, temperature profiles, and in-situ sampling will further strengthen operational HAB management and coastal resource planning in Puntland.
5. Data and Materials Availability
All datasets were obtained from the Copernicus Marine Environment Monitoring Service (CMEMS) and include the Global Ocean Physics and Biogeochemistry products. Data were accessed via the CopernicusMarine Python API and processed into tidy Parquet and CSV outputs for transparency. Processed datasets and analysis scripts are available upon reasonable request from the corresponding author.
6. Computational Reproducibility Statement
All computations were performed in Python 3.12 using xarray, pandas, numpy, matplotlib, and scikit-learn. The workflow automated data opening, coordinate harmonization, regridding, merging, and export. Random seeds and version control were fixed to ensure reproducible results. All code and preprocessing scripts can be provided upon request.
Author Contribution
Abdulahi Jimale Said conceptualized the study and conducted data acquisition, analysis, visualization, and manuscript preparation.
9. Bibliography
1. Dai Y, et al. Coastal phytoplankton blooms expand and intensify in the 21st century. Nature. Mar. 2023;615:280–4. https://doi.org/10.1038/s41586-023-05760-y.
2. Hammond ML, Jebri F, Srokosz M, Popova E. Automated detection of coastal upwelling in the Western Indian Ocean: Towards an operational ‘Upwelling Watch’ system. Front Mar Sci. Aug. 2022;9. https://doi.org/10.3389/fmars.2022.950733.
3. Yang M, et al. Two-decade variability and trend of chlorophyll-a in the Arabian Sea and Persian Gulf based on reconstructed satellite data. Front Mar Sci. Dec. 2024;11. https://doi.org/10.3389/fmars.2024.1520775.
4. Roy R, Lotliker AA, Baliarsingh SK, Jayaram C. Water column properties associated with massive algal bloom of green Noctiluca scintillans in the Arabian Sea. Mar Pollut Bull. Dec. 2023;198:115913. https://doi.org/10.1016/j.marpolbul.2023.115913.
5. Zahir M, Su Y, Shahzad MI, Ayub G, Rehman SU, Ijaz J. A review on monitoring, forecasting, and early warning of harmful algal bloom. Aquaculture. Jul. 2024;741351. https://doi.org/10.1016/j.aquaculture.2024.741351.
6. Silva E, et al. Warming and freshening coastal waters impact harmful algal bloom frequency in high latitudes. Commun Earth Environ. Jun. 2025;6(1). https://doi.org/10.1038/s43247-025-02421-y.
7. Zhang Z, Ma W, Chai F. Dynamical Response of the Arabian Sea Oxygen Minimum Zone to the Extreme Indian Ocean Dipole Events in 2016 and 2019. Geophys Res Lett. Oct. 2023;50(21). https://doi.org/10.1029/2023gl104226.
8. Twinkle, Sathish, et al. Observed evidence for the impact of coastal currents on the recurrent Noctiluca scintillans blooms in the northwest Indian Ocean coast. Mar Pollut Bull. Aug. 2023;194:115426. https://doi.org/10.1016/j.marpolbul.2023.115426.
9. Oh J-W, Pushparaj SSC, Muthu M, Gopal J. Review of Harmful Algal Blooms (HABs) Causing Marine Fish Kills: Toxicity and Mitigation. Plants. Jan. 2023;12(23):3936. https://doi.org/10.3390/plants12233936.
10. Chen Y, Zhao H, Shen C. Summer Chlorophyll-a Increase Induced by Upwelling off the Northeastern Coast of Hainan Island, South China Sea. Water. Jul. 2023;15(15):2770. https://doi.org/10.3390/w15152770.
11. Haëck C, Lévy M, Mangolte I, Bopp L. Satellite data reveal earlier and stronger phytoplankton blooms over fronts in the Gulf Stream region. Biogeosciences. May 2023;20(9):1741–58. https://doi.org/10.5194/bg-20-1741-2023.
12. Jebri F, Srokosz M, Raitsos DE, Jacobs ZL, Sanchez-Franks A, Popova E. Absence of the Great Whirl giant ocean vortex abates productivity in the Somali upwelling region. Commun Earth Environ. Jan. 2024;5(1). https://doi.org/10.1038/s43247-023-01183-9.
13. Kumar J, Ratheesh S, Agarwal N, Sharma R. Study of Upwelling and mixing process in the Somali Coastal Region using satellite and numerical model observations: A Lagrangian Approach. Deep Sea Res Part II Top Stud Oceanogr. May 2024;105381. https://doi.org/10.1016/j.dsr2.2024.105381.
14. Garg S, Gauns M, Bhaskar TVSU. Dynamics of subsurface chlorophyll maxima in the northern Indian Ocean. Mar Pollut Bull. Oct. 2024;207:116891. https://doi.org/10.1016/j.marpolbul.2024.116891.
15. Sadhvi K, et al. Intrinsic Versus Wind-Forced Great Whirl Non‐Seasonal Variability. J Geophys Res Oceans. Feb. 2024;129(2). https://doi.org/10.1029/2023jc020077.
16. Pateraki M, Raitsos DE, Krokos G, Theodorou I, Hoteit I. Unravelling the influence of mixed layer depth on chlorophyll-a dynamics in the Red Sea. PLoS ONE. Mar. 2025;20(3):e0318214. https://doi.org/10.1371/journal.pone.0318214.
17. Brenckman CM, Parameswarappa Jayalakshmamma M, Pennock WH, Ashraf F, Borgaonkar AD. A Review of Harmful Algal Blooms: Causes, Effects, Monitoring, and Prevention Methods. Water. Jul. 2025;17(13):1980. https://doi.org/10.3390/w17131980.