Article
Present Address:
James
E.
Dare
1,2,3✉
Emailjd227@students.waikato.ac.nz
Deniz
Özkundakci
1
Richard
W.
McDowell
4,5
1
School of Science
The University of Waikato
Hamilton
New Zealand
2
Bay of Plenty Regional Council
Tauranga
New Zealand
3
AquaWatch Solutions
Auckland
New Zealand
4
Faculty of Agriculture and Life Sciences
Lincoln University
Lincoln
New Zealand
5
Lincoln Science Centre
AgResearch
Lincoln
New Zealand
James E. Dare*1,2,3, Deniz Özkundakci1, Richard W. McDowell4,5
1
School of Science, The University of Waikato, Hamilton, New Zealand.
2
Bay of Plenty Regional Council, Tauranga, New Zealand.
3
AquaWatch Solutions, Auckland, New Zealand.
4
Faculty of Agriculture and Life Sciences, Lincoln University, Lincoln, New Zealand.
5
AgResearch, Lincoln Science Centre, Lincoln, New Zealand.
*
Corresponding author: James Dare, jd227@students.waikato.ac.nz
Abstract
Quantifying contaminant loads to estuaries is essential for setting effective limits on resource use and safeguarding ecological values. Traditional monitoring programmes often rely on infrequent sampling, which can substantially underestimate loads and obscure key transport processes. While commercial high-frequency sampling stations are prohibitively expensive, open-source, do-it-yourself technologies now offer affordable alternatives for continuous monitoring. Here, we deployed ten low-cost, in situ monitoring stations equipped with research-grade sensors across an intensively farmed catchment draining to a sensitive estuarine environment in the Bay of Plenty, New Zealand. Using artificial neural network models trained on concurrent grab samples, we converted sensor measurements into reliable 15-min estimates of contaminant concentrations. High-frequency load calculations revealed nitrogen, phosphorus, and sediment exports to be 6–87% greater than estimates derived from traditional monthly sampling. Moreover, time-series outputs uncovered distinct sub-catchment contaminant mobilisation and transport dynamics that would otherwise remain undetected. These findings demonstrate that open-source, high-frequency monitoring can substantially improve contaminant load quantification, provide new insights into catchment processes, and inform the development of land management and policy strategies that reflect the unique spatio-temporal patterns of contaminant export.
Introduction
Agricultural intensification and urbanisation has led to widespread degradation of freshwater quality globally, with increased contaminant losses and declining water quality in many rivers, lakes, and estuaries1–4. Effective management of these systems requires accurate monitoring of water quality, which is essential for understanding long-term changes, estimating contaminant loads, and identifying the drivers of ecological decline in affected catchments5. In many regions, local authorities use a combination of policy tools, regulatory processes, and on-the-ground action to manage freshwater resources. However, traditional water quality monitoring programmes with monthly or greater sampling frequency, often fail to capture rapid hydrochemical processes and changes that occur on shorter time scales6,7. Kirchner6 likens this to ‘trying to understand a symphony by listening to only one note every minute or two’. However, this symphony occurs across an entire catchment, where discrete moments in space and time are known to contribute disproportionately relative to background conditions. While datasets from monthly sampling are suitable for some analytical purposes such as trend analysis8 or the calculation of crude contaminant loads, the lack of temporal resolution limits the ability to identify key processes where contaminant mobilisation is most acute, resulting in an incomplete understanding of water quality dynamics6.
Recent advances in the concept of 'hot-spots and hot-moments' (HSHM) have highlighted the importance of temporal and spatial variability in biogeochemical processes and contaminant mobilisation9,10. HSHM refers to specific locations and times periods within a catchment where biogeochemical reactions and contaminant fluxes are disproportionately high. Understanding these phenomena is critical for improving water quality management, as areas with elevated biogeochemical process rates (e.g., wetlands, or riparian zones) can be protected, enhanced, or created to limit the extent of problems such as eutrophication10. Resource managers also need to understand HSHM as sources of contaminants to be able to priortise the location of management interventions’, as well as understanding the relative importance of ‘hot-moments’10.
Traditional monthly water quality monitoring does not adequately characterise ‘hot-moments’ which can lead to underestimation of contaminant loads due to the disproportionate contribution of short-lived storm events11. For example, Abell et al.12 showed that annual load estimates of total nitrogen (TN) and total phosphorus (TP) in two New Zealand streams were 19% and 40% less when estimated via monthly samples versus samples supplemented with targeted storm monitoring. Furthermore, McDowell et al.13 found that monthly sampling resulted in a ~ 30% error in the estimation of TP and nitrate-nitrogen loads compared to sampling every 30 minutes.
Knowledge of HSHM is particularly important for implementing environmental policy in New Zealand (the focus areas of this study) within the context of New Zealand’s National Policy Statement for Freshwater Management14. This policy mandates regional authorities to set contaminant limits to prevent further water quality decline, and to achieve outcomes desired by their local community. However, underestimation of contaminant loads from traditional monthly sampling could lead environmental managers to make incorrect conclusions about the severity or location of contaminant issues, leading to weak or ineffective management actions that will never achieve the intended objective13. By enhancing monitoring efforts to capture HSHM, environmental managers could more accurately quantify contaminant loads and develop the most appropriate management strategy to match contaminant mobilisation nuances observed in discrete areas of the catchment15.
The expansion of in-situ water quality sensors provides an opportunity to capture HSHM at discrete sites within a monitoring network. However, commercially available monitoring stations are expensive which means that regional authorities are often limited to a small number of sites in large river network catchments16,17. An emerging solution comes via the ‘open-source movement’, which aims to make equipment and software available for modification with a comparably low-cost barrier to entry18. This has led to the development of ‘do-it-yourself’ (DIY) environmental monitoring options that are able to control an array of research grade sensors for a fraction of the price of commercial monitoring stations. One such option is the EnviroDIY Mayfly board, which has been designed for low-cost, solar-powered, wireless applications in environmental science16. Users can programme the board themselves using a range of different sensor-specific code templates written within the Arduino interactive development environment (IDE) and shared via an active GitHub community, providing similar functionality and reliability to commercially available systems.
Environmental practitioners often use in-situ sensor data to develop a relationship with hydrochemical or physical parameters measured via lab-samples, in a process referred to as a surrogate relationship19. Surrogate relationships provide a practical, cost-effective method to understand high-frequency, hydrochemical dynamics with more accuracy than can be achieved using discrete water quality samples alone19,20. Surrogate relationships have evolved from stepwise regression models to more computationally intensive machine learning models such as random forest or artificial neural networks (ANN)21,22. The use of ANN methods have been a popular choice for water quality studies due to their robustness, ability to resolve nonlinear problems, absence of statistical assumptions, and suitability to smaller datasets compared with other machine learning techniques21,23,24. Examples of ANN applications in the field of water quality research include: the prediction of water quality index scores from hydrochemical measurements in South African river basins25; estimation of microcystin concentrations in the Binhu river network of China26; prediction of TN and TP concentrations in US lakes from in-situ pH, conductivity, and turbidity measurements27; the prediction of turbidity in the Nalón river basin (Spain) from in-situ sensor measurements28; and detection of spills and discharges into the Potomac River Basin in Virginia, USA29.
The main objectives of this study are twofold: firstly, to use DIY water quality stations, in combination with ANN-based surrogate models to enhance the information obtained from a traditional water quality monitoring network, so that contaminant HSHM and more accurate contaminant loads are elucidated; and secondly, to interpret, translate, and provide recommendations as to how observed HSHM information can be applied to improve resource management. It is hypothesised that the enhanced network will provide a pragmatic way to elucidate contaminant HSHM and increase the accuracy of contaminant load estimation over existing methods.
Results
Routine monitoring overview
A
The number of routine water quality samples collected at each site between November 2021 and June 2024 ranged from 24 to 29. This did not include targeted storm samples or opportunistic samples and therefore represents ‘traditional monitoring’ initiated by the regional authority, Bay of Plenty Regional Council (BOPRC). Summary statistics for TN, TP, and total suspended solids (TSS) for each network site are available in the supplementary information (Table
S1).
Median TN concentrations ranged from 0.99 mg L− 1 (Pokopoko at Allport Rd) to 2.37 mg L− 1 (Wharere at Maniatutu Rd). ‘Puanene at SH2’ and ‘Mangatoetoe at end of Black Rd’ were the most variable sites for TN (stdev ~ 0.60 mg L− 1compared to mean stdev of 0.32 mg L− 1). Median TP ranged from 0.091 mg L− 1 (Pokopoko at Allport Rd) to 0.133 mg L− 1 (Puanene at SH2; Pongakawa trib at Rotoehu Rd). ‘Puanene at SH2’ had a maximum TP concentration that was 51 times higher than the median, compared to other sites where the maximum value averaged 4.4 times higher. Median TSS ranged from 3.8 mg L− 1 (Wharere at Maniatutu Rd) to 55.9 mg L− 1 (Pokopoko at Allport Rd). Four sites had maximum TSS values that were over 30 times their median value (Pokopoko at Allport Rd; Oeutehuehue at Farm Box Culvert; Pokopoko at Old Coach Rd; Pongakawa Tributary at Old Coach Road). Three of these sites were located within the Pokopoko sub-catchment, while ‘Pongakawa Tributary at Old Coach Road’ was located in the Pongakawa sub-catchment.
Storm event monitoring
An example of targeted storm event monitoring at ‘Mangatoetoe at end of Black Rd’ is shown in Fig. 1. Discharge peaked at 2.02 m3 s− 1 at 20:30 on June 14 2024, reflected by a drop in specific conductivity from a pre-event baseline of approximately 130 µS cm− 1 to a minimum of 60 µS cm− 1. Turbidity increased with discharge to approximately 400 NTU, with isolated peaks as high as 1250 NTU, before returning to pre-event levels as the discharge receded. Water temperature was less affected by the storm event, and more controlled by diurnal patterns, increasing to a peak of 14.5°C in the afternoon before reducing into the evening.
An ISCO autosampler was deployed throughout this event to characterise contaminant concentrations. The first sample was collected at 04:00 on June 14 2024, continuing until 24 samples had been collected at 14:30 on June 15 2024. The results for TN are displayed in the bottom panel of Fig. 1 and show a pre-event concentration of 1.64 mg L− 1, increasing to a peak of 3.21 mg L− 1 which occurred slightly before the discharge peak, reducing to a post-event concentration of 1.93 mg L− 1.
The dashed line in the bottom panel represents modelled TN concentrations, where discharge, conductivity, and turbidity were used as predictor variables. The model was able to accurately represent measured TN concentrations throughout this event, with the exception of the two highest values that occurred just before the discharge peak, and one value that occurred just before 06:00 on June 15.
Surrogate relationships
The overall performance of ANN models was good, with 53% of models across all parameters having a Nash-Sutcliffe efficiency (NSE) value > 0.8, and 76% of models having a NSE > 0.7. TSS was the best performing contaminant model across all sites (median NSE = 0.90), followed by TP (median NSE = 0.80) and TN (median NSE = 0.75) (Table 1).
Model performance was generally poorer at sites that were influenced by groundwater, such as Wharere at Maniatutu Rd (mean NSE = -0.34 across all parameters), and Pongakawa at Old Coach Rd (mean NSE = 0.77). All contaminant models for Wharere at Maniatutu Rd, and Pongakawa Tributary at Old Coach Rd, were removed from further analysis due to poor diagnostic performance (NSE < 0.5).
Table 1
Model performance for each site and parameter across the study period (November 2021 - June 2024). The coefficient of determination (R2), Nash-Sutcliffe efficiency (NSE), percent bias (PBIAS), and root mean squared error (RMSE) values are shown as indicators of model performance. ‘Storm Events’ shows the number of individual storm events that were included in the model dataset at each site, and ‘Storm Sample Composition’ provides the percentage of storm event samples relative to the entire dataset for that parameter.
|
Site
|
Parameter
|
Storm Events
|
Storm Sample Composition
|
R2
|
NSE
|
PBIAS
|
RMSE
|
|
Mangatoetoe at end of Black Rd
|
TN
|
2
|
55%
|
0.74
|
0.72
|
-2.7
|
0.04
|
| |
TP
|
2
|
55%
|
0.81
|
0.79
|
0.5
|
0.10
|
| |
TSS
|
1
|
44%
|
0.92
|
0.92
|
2.0
|
0.18
|
|
Oeutehuehue at Farm Box Culvert
|
TN
|
2
|
62%
|
0.88
|
0.83
|
-7.9
|
0.09
|
| |
TP
|
2
|
62%
|
0.81
|
0.80
|
3.5
|
0.17
|
| |
TSS
|
2
|
67%
|
0.95
|
0.94
|
1.5
|
0.14
|
|
Pokopoko at Allport Rd
|
TN
|
2
|
54%
|
0.80
|
0.79
|
7.5
|
0.06
|
| |
TP
|
2
|
54%
|
0.87
|
0.85
|
-5.2
|
0.11
|
| |
TSS
|
2
|
58%
|
0.87
|
0.85
|
-1.5
|
0.17
|
|
Pokopoko at Old Coach Rd
|
TN
|
2
|
59%
|
0.86
|
0.83
|
3.0
|
0.04
|
| |
TP
|
2
|
59%
|
0.83
|
0.82
|
0.2
|
0.18
|
| |
TSS
|
2
|
67%
|
0.98
|
0.98
|
-0.1
|
0.12
|
|
Pongakawa at Old Coach Road
|
TN
|
3
|
56%
|
0.68
|
0.66
|
2.1
|
0.02
|
| |
TP
|
3
|
60%
|
0.71
|
0.71
|
-0.3
|
0.06
|
| |
TSS
|
1
|
44%
|
0.97
|
0.94
|
-3.7
|
0.10
|
|
Pongakawa Tributary at Old Coach Rd
|
TN2
|
2
|
42%
|
0.55
|
0.45
|
2.4
|
0.06
|
| |
TP2
|
2
|
42%
|
0.91
|
0.82
|
4.1
|
0.07
|
| |
TSS2
|
1
|
45%
|
0.84
|
0.84
|
-6.3
|
0.15
|
|
Pongakawa Tributary at Rotoehu Rd
|
TN
|
2
|
56%
|
0.66
|
0.65
|
-1.1
|
0.03
|
| |
TP
|
2
|
56%
|
0.76
|
0.74
|
0.8
|
0.94
|
| |
TSS
|
2
|
63%
|
0.87
|
0.79
|
1.6
|
0.10
|
|
Puanene at SH2
|
TN
|
2
|
56%
|
0.84
|
0.83
|
-0.7
|
0.07
|
| |
TP
|
2
|
56%
|
0.71
|
0.59
|
-2.0
|
0.23
|
| |
TSS
|
1
|
43%
|
0.95
|
0.87
|
-7.8
|
0.22
|
|
Wharere at Maniatutu Rd
|
TN1
|
2
|
46%
|
0.36
|
-0.23
|
-0.7
|
0.03
|
| |
TP1
|
2
|
46%
|
0.01
|
-0.98
|
6.7
|
0.22
|
| |
TSS1
|
2
|
50%
|
0.20
|
0.19
|
-2.7
|
0.28
|
|
Wharere at Old Coach Rd
|
TN3
|
2
|
54%
|
0.79
|
0.78
|
-0.9
|
0.04
|
| |
TP
|
2
|
54%
|
0.93
|
0.90
|
4.1
|
0.09
|
| |
TSS
|
2
|
59%
|
0.97
|
0.96
|
2.7
|
0.10
|
1 Model was deemed unacceptable because NSE < 0.5.
2 Model was deemed unacceptable based on visual inspection.
3 Model was deemed acceptable but is considered to be of lower quality after visual inspection, despite acceptable model performance indicators.
Figure 2 provides examples of modelled TN for three of the ten network sites. The shaded bands represent upper and lower 90% prediction intervals, and the red points represent observed TN concentrations from laboratory samples. All three models provide a good fit between predicted and observed data, with NSE values of 0.83, 0.65, and 0.78 (top to bottom). However, the bottom model was considered lower quality than the others due to data steps and reduced certainty (wider prediction intervals), despite having a NSE value of 0.78. These examples show the importance of visually inspecting model output in addition to statistical output.
Total load export
The summed mean annual load export from the most downstream site on each river branch (henceforth ‘export sites’) was 550 t TN yr− 1, 56 t TP yr− 1, and 36,300 t TSS yr− 1.
Pongakawa River at Old Coach Rd had the highest mean TN and TP load export, contributing 58% (319 t yr− 1) of the total TN load, and 49% (27.8 t yr− 1) of the total TP load exported (Fig. 3). Pokopoko at Old Coach Rd contributed 25% of the TN (136 t yr− 1) and 34% of the TP (19.1 t yr− 1) load. TN and TP export at Pokopoko at Old Coach Rd was 4% and 10% greater than the combined loads from upstream sites (Pokopoko at Allport Rd, Oeutehuehue at Farm Box Culvert) for TN and TP, respectively. The combined load export from other sites within the catchment made up approximately 17% of the total load export for both TN and TP.
Sediment load export was dominated by the Pokopoko sub-catchment, with Pokopoko at Old Coach Rd exporting 62% (22,570 t yr− 1) of the total annual network load. The combined export from the two upstream branches of the Pokopoko Stream equated to 23% of the total network load, implying that 39% of the annual network load is generated in the small area between these sites. Pongakawa at Old Coach Rd site contributed 28% (10,313 t yr− 1), and all other river branches contributed a combined annual load ~ 10%.
Contaminant yields
Total nitrogen yields were highest in the Pongakawa sub-catchment (32.0 kg ha− 1 yr− 1), followed by the Mangatoetoe (25.9 kg ha− 1 yr− 1) and the upper western branch of the Pokopoko River (18.4 kg ha− 1 yr− 1) (Fig. 4). The Mangatoetoe expressed the largest TP yield (2.9 kg ha− 1 yr− 1), followed by the Pongakawa River (2.8 kg ha− 1 yr− 1), and the western branch of the Pokopoko Stream (2.4 kg ha− 1 yr− 1). Sediment yields were largest for the lower Pokopoko Stream (2,385 kg ha− 1 yr− 1) and could be traced upstream to the Pokopoko at Allport Rd Site (1,911 kg ha− 1 yr− 1). Other significant contributors of sediment were the Mangatoetoe (1,284 kg ha− 1 yr− 1) and Pongakawa sub-catchments (1,034 kg ha− 1 yr− 1).
The Pongakawa Tributary at Rotoehu Rd site had the lowest yield across all three contaminants.
Baseflow load export
The percentage of TN load exported during baseflow conditions varied from 76% to 97% (Fig. 5). The most baseflow dominant sites were Pongakawa at Old Coach Rd (97%), and the two upper Pokopoko catchment sites (Oeutehuehue at Farm Box Culvert = 92%; Pokopoko at Allport Rd = 89%). The least baseflow dominant sites were in the lower catchment (Mangatoetoe at end of Black Rd = 76%; Puanene at SH2 = 79%; Pokopoko at Old Coach Rd = 82%).
Baseflow TP export was more variable between sub-catchments than TN, ranging from 49% to 96%, and more controlled by non-baseflow events. Sites with the largest baseflow TP delivery were located in the Pongakawa or upper Pokopoko sub-catchments. Similar to TN, catchments closer to the estuary were more controlled by event-flow delivery (e.g., Mangatoetoe at end of Black Rd = 49%; Puanene at SH2 = 58%).
TSS was the most variable contaminant across sites for baseflow export (40%-95%), and the most event-driven. Two sites in the Pongakawa sub-catchment (Pongakawa at Old Coach Rd; Pongakawa Trib at Rotoehu Rd), delivered the greatest percentage of TSS load during baseflow conditions, with 95% and 86%, respectively. Conversely, Mangatoetoe at end of Black Rd and Puanene at SH2 delivered the highest percentage of TSS load during storm events, with baseflow delivery percentages of 40% and 47%, respectively.
Comparison of load estimation methods
Figure 6 shows a comparison of load estimates calculated from ANN models and discrete monthly samples (henceforth the ‘traditional’ method). TN load estimates were approximately equal between the two methods, however TP and TSS estimates diverged, particularly for sites with larger estimated loads. ANN model estimates were on average 5% (-24.3% to 36.5%), 29% (-30% to 101%), and 61% (-6% to 315%) larger than traditional estimates, for TN, TP, and TSS, respectively. Export sites had total combined loads for TN, TP and TSS that were 6%, 32%, and 85% larger for ANN model estimates than the traditional method.
The width of confidence intervals (CI’s) was contaminant dependent, with the narrowest CI’s calculated across both methods for TN (average width = 56% of estimate), and the widest CI’s calculated for TSS (average width = 284% of estimate). Confidence intervals were on average 2% (-66% to 56%), and 74% (-61% to 482%) wider for ANN models compared to the traditional method for TN and TP, and 8% (-65% to 104%) narrower for TSS.
Discussion
The current study is among the first to use DIY sensor technology and surrogate modelling across multiple sites within a catchment, providing cost-effective, spatio-temporal insights into contaminant mobilisation. Our results show that integrating traditional monitoring with targeted storm-event sampling and low-cost sensors yields critical understanding of catchment function, particularly the timing and magnitude of nutrient delivery during high flows.
Surrogate water quality models have been widely applied in other contexts, with performance comparable to our findings. For example, Harrison et al.30 pooled data across multiple River Thames sites using random forest models to predict TN and TP, achieving NSE values of 0.85 and 0.74 (compared with averages of 0.76 and 0.78 in this study). Shyu et al.31 developed surrogate models for wastewater treatment constituents, reporting R² values of 0.46–0.97 (0.84–0.97 in this study), while Holmberg et al.32 used ANN models in boreal streams to predict TN and TP with R² values of 0.83 and 0.78 (0.76 and 0.82 in this study).
Our findings also highlight the disproportionate timing and location of contaminant delivery. For instance, 50% of TN, TP, and TSS loads occurred during just 36%, 19%, and 2% of monitored time, respectively, reflecting strong spatio-temporal variability in supply and transport. This pattern supports the ‘hot spot–hot moment’ (HSHM) concept10, which describes uneven biogeochemical processing and contaminant movement in time and space. At the sub-catchment scale, Pokopoko at Old Coach Rd contributed 62% of TSS loads but only 28% of flow, while a single six-day storm cluster accounted for 5% of annual TN, 14% of TP, and 32% of TSS loads across all sites. Such concentrated export events are difficult to capture with monthly sampling, reinforcing the value of high-frequency monitoring.
Comparison of ANN-derived load estimates with those based on monthly sampling revealed that ANN loads were 6%, 32%, and 85% greater for TN, TP, and TSS, respectively. These differences reflect the ability of high-frequency models to capture in-stream changes that occur between discrete samples. Similar discrepancies have been reported elsewhere, with many studies concluding that monthly sampling underestimates true loads and increases uncertainty12,33–35. For example, McDowell et al.13 showed that 48% of annual TN and 66% of annual TP loads in New Zealand occur during high flows—events that are often missed in monthly programmes. By capturing these dynamics, the method developed here improves understanding of the magnitude and source of contaminant loads to Waihi Estuary, providing a stronger foundation for setting achievable management targets.
Finally, our findings can inform catchment-scale modelling by enhancing the simulation of spatio-temporal nutrient fluxes. For example, ANN-derived contaminant signatures can be interpreted in relation to established biogeochemical concepts such as the four ‘control points’ described by Bernhardt et al. 9 (permanent, activated, export, and transport). Yulianti et al. 36 used a similar approach to improve SWAT + model performance in Lake Ōkaro, calibrating streamflow sources and nitrogen transport dynamics using water and nitrogen isotopes. In our study, ANN outputs suggest that TN reaches the mainstem river in the Pongakawa sub-catchment primarily via low-threshold export (vertical leaching) and efficient shallow aquifer transport, resulting in persistently elevated baseflow concentrations. In contrast, the Pokopoko sub-catchment exhibited higher-threshold export, particularly for TP and TSS, where contaminants accumulated and were episodically delivered to the river during overland flow events.
Translating results into management actions
Our results show that ANN models can identify contaminant hot spots and hot moments (HSHM) and resolve transport pathways at the catchment scale, while simultaneously capturing storm-event dynamics that drive load estimates. To inform management, these insights must be translated into actions that align with catchment-wide goals—in this case, achieving sustainable contaminant load targets for the Waihi Estuary. This process involves both prioritising and characterising model outputs.
Prioritisation is essential, as land managers must allocate limited resources where they will have the greatest impact. Sub-catchments with yields exceeding load targets per unit area are prime candidates for intervention, but gross contaminant export must also be considered to ensure meaningful reductions at the estuary scale. For example, Pongakawa and Pokopoko are high-priority areas for TN, TP, and TSS management due to both high yields and high total export. In contrast, Mangatoetoe, Puanene, and Wharere exceed yield thresholds but contribute relatively little gross export; improvements here would have limited influence on overall estuary loads and are therefore better suited for long-term rather than immediate action.
Characterisation of contaminant pathways ensures that management actions are tailored to the dominant transport processes within each sub-catchment. In Pongakawa and Pokopoko, elevated baseflow TN export indicates vertical leaching of NO₃ to groundwater as the key pathway. This is typical of intensively farmed New Zealand landscapes with porous volcanic or alluvial soils37. Here, soils act as export control points, accumulating nitrate until rainfall triggers leaching, while aquifers act as transport control points, conveying nitrate to streams. Because these pathways are largely controlled by inherent soil and geological properties, direct in-stream management is limited. The most effective strategies are those that reduce nitrogen accumulation and water movement at the source38. Recommended practices include limiting stocking rates, aligning fertiliser use with crop uptake and hydrology38,40, using organic nitrogen sources39, managing land-use sensitivities39, restricting tillage38,40, and modifying land surfaces to reduce permeability or slow overland flow39. Importantly, hydrological lag times mean that improvements may take years to be reflected in-stream; McDowell et al. 41 estimate a median lag of 4.5 years for NO₃ in New Zealand, with detectable improvements often requiring up to a decade post-intervention.
Sub-catchment TP dynamics fall into two categories: baseflow-driven (e.g., Pongakawa, upper Pokopoko) and storm-event-driven (e.g., lower Pokopoko, Mangatoetoe, Puanene, Wharere). Elevated baseflow TP is largely attributable to the dissolution of P-rich volcanic geology42, which is difficult to manage without chemical interventions such as alum dosing43. Consequently, management is better directed toward storm-event-driven sub-catchments, where TP is predominantly particulate or sediment-bound. In these areas, contaminants are generally mobilised when rainfall generates sufficient overland flow, creating higher thresholds for export compared to nitrate in baseflow-dominated areas. Such dynamics present an opportunity for interceptive, co-management strategies that retain water and allow sediment to settle before reaching the river. Effective measures include retention bunds44, constructed wetlands45, protection of natural floodplains46, and riparian buffer strips45,47. At the same time, the most cost-effective way to address anthropogenic phosphorus is at its source—through fertiliser management48, identification of critical source areas49, and restriction of stock access49. Additional gains can be made by stabilising erosion-prone gully heads through prevention measures (e.g., topsoil reinforcement, vegetation barriers) or control methods (e.g., runoff diversion, gully reshaping, check dams, revegetation) 50.
Limitations and improvements
Development and knowledge requirements
A
A
A
A key limitation of this approach is the need for users to build and program each sensor station, which requires more technical expertise than most off-the-shelf (OTS) systems. However, these skills can be readily acquired through online resources designed to support community groups with little prior experience (Table S2). The main advantage of this method remains its cost: equipment costs are up to ten times lower than OTS systems, allowing wider spatial coverage across a catchment. In our study, however, additional requirements such as field technicians, autosamplers, and storm-event laboratory analyses meant that total costs were approximately 2.8 times higher than those of monthly sampling (Tables S3–S5). These expenses were primarily driven by event sample processing (64%), followed by sensor station equipment (30%), maintenance labour (4%), and autosampler deployment labour (2%). Despite higher overall costs, the added investment is minor when weighed against the risks of poorly informed land management actions or generic policies based on limited contaminant data.
Model performance
Model performance was generally acceptable, though sites influenced by groundwater performed worse than mixed-flow sites. This limitation likely reflects the difficulty of detecting pulses of soluble contaminants from groundwater using the available sensors. Future improvements could involve deploying additional probes capable of capturing subtle groundwater signals, such as pH or redox sensors.
Paepae et al. 51 propose accuracy targets for surrogate models, with R² values of 0.95–1.00 considered desirable, 0.90–0.95 acceptable, 0.85–0.90 tolerable, and 0.80–0.85 poor. By these standards, TN models in our study ranged from below poor to tolerable, TP from poor to acceptable, and TSS from poor to desirable. However, these thresholds were developed for commercial nitrate sensors, and Paepae et al. note that accuracy requirements depend on both environmental context and intended application. Given that our study used indirect sensor combinations to estimate concentrations, we argue that the spatio-temporal and hydrochemical insights gained outweigh the modest reduction in accuracy—particularly in applied contexts such as the Waihi Estuary, where outputs improve policy advice beyond what is possible with traditional monitoring.
Another challenge was the development of hydrological rating curves at soft-bottomed sites, which we addressed by developing flow relationships with nearby hydrological stations. This introduced variation in discharge estimation methods across the network. Future studies could reduce this limitation by employing hydrodynamic river models to standardise discharge estimates and improve comparability across sites.
Autosampling
A major strength of this study was the establishment of ten monitoring sites across the catchment, which improved spatial resolution of contaminant signatures. However, equipment availability limited the number of storm events sampled per site. Future work could focus on fewer, strategically selected locations to increase storm-event coverage and enhance model performance. In particular, emphasis should be placed on the Pongakawa River during major storm events, as this sub-catchment only responded to very high rainfall accumulation.
Conclusions
This study demonstrates that traditional water quality monitoring can be substantially enhanced through the use of cost-effective DIY sensor stations combined with ANN models. High-frequency surrogate records captured hydrochemically significant storm events missed by monthly sampling, improving load estimates—particularly for TP and TSS.
Integrating model outputs within the HSHM framework provided several key benefits for water quality management. First, it identified sub-catchments and event timings where export and transport control points combined to create disproportionate contaminant risks. Second, it enabled a triage-based approach to management, directing immediate interventions to high-export sub-catchments while reserving lower-priority areas for long-term planning. Third, it supported efficient investment by showing that the majority of contaminant exports originated from a small number of sub-catchments, highlighting the value of targeted mitigation. Finally, the framework demonstrated the need for adaptive management, as shifting contaminant signatures under changing climatic and land-use conditions demand flexible prioritisation.
By coupling ANN modelling with the HSHM framework, this study presents a cost-effective and practical method for disentangling complex contaminant transport dynamics. The approach not only provides more accurate load estimates for sensitive estuarine environments such as Waihi Estuary but also generates actionable insights that can inform spatially explicit, tailored land management strategies. While this requires greater resources than monthly sampling, the improvements in accuracy, certainty, and management relevance strongly justify the investment, and the framework is readily transferable to other agricultural catchments worldwide.
Methods
Study area
A
This study was carried out in the Waihi Estuary catchment, near the township of Pukehina, in the Bay of Plenty region of New Zealand (Fig.
7). The Waihi Estuary catchment is predominantly agricultural and covers approximately 34,256 hectares at the north-western edge of the Taupo Volcanic Zone
52. Soil types are predominantly well-drained pumice soils overlying ignimbrite in the upper and mid catchment, transitioning to poorly drained gley soils overlying swamp deposits near the estuary
53,54. Major land uses include, dairy farming (35%), exotic forestry (20%, mainly
Pinus radiata), sheep and beef farming (14%), and orchards (7%, mainly kiwi fruit). The catchment has undergone extensive hydraulic modification, particularly in lowland areas, where wetlands fringing Waihi Estuary have been drained, and riverine inputs have been channalised, to make way for agricultural intensification.
The catchment drains into Waihi Estuary, a shallow, eutrophic, intertidal estuary that is currently experiencing nuisance blooms of macroalgae due to excess nutrient and fine sediment loading from the contributing catchment55. A catchment modelling study carried out in 2014 estimated that TN, TP, and total suspended solids (TSS) loads are currently 524%, 124%, and 232% of their natural state55. Furthermore, it has been estimated that TN and TSS loads need to be reduced by approximately 40% and 65%, respectively to achieve a state that supports biodiversity, ecological functioning, and provides for cultural and community values56. Estimated TP load reductions are more complex; work by Atkinson and Smith57 implies that an N:P ratio of 30:1 is an ideal standard for assessing nutrient limitation in estuaries, which translates to a target of 18.2 t yr− 1, or an approximate TP reduction of 30% from the 2014 load55. However, natural state’ modelling implies that the natural N:P ratio was historically around 6.4:1 due to natural, volcanic, sources of P within the catchment, which means that the ideal ratio of 30:1 may be unrealistic55. Total phosphorus remains a focus in this study given that opportunities to manage anthropogenic sources are still abundant and important for managers to understand.
Traditional water quality monitoring investigation
Bay of Plenty Regional Council (BOPRC), the regional authority that is responsible for managing water resources in the Bay of Plenty, established a routine water quality monitoring network in the Waihi Estuary catchment in 2021. This traditional monitoring network is part of BOPRC’s ‘focus catchment programme’, which aims to identify high-risk areas within a deteriorating catchment by increasing the spatial resolution of water quality monitoring sites. Monthly water quality samples were collected by trained professionals at 10 representative sites throughout the catchment (Fig. 7), and analysed for a suite of constituents, including TN, TP, and TSS. This dataset was used as the foundation for the current study.
Continuous monitoring
For the current study, each of the 10 sites in the BOPRC investigation was supplemented with a ‘Mayfly’ sensor station to better characterise riverine conditions between sampling dates. Sensor stations were built using an EnviroDIY Mayfly data logger board, which is an arduino-based microcontroller that has been developed for low-cost environmental research58. Each Mayfly board was housed within a 2.4 litre Pelican™ box and attached to a 1.5m pole on the riverbank. Sensor stations were equipped with a 3.7V 3800mAh LiPo battery pack and a 6V 3.5W Voltaic solar panel providing an indefinite power source.
Two factory calibrated water quality sensors were attached to each sensor station: a HYDROS-21 conductivity, temperature, and depth sensor (CTD) from METER Group, and a Y11-A nephelometer with wiper from Yosemite Technologies. This sensor combination is recommended for detecting changes in dissolved contaminants (e.g., nitrate-nitrite-N [NNN], dissolved reactive P [DRP]) as well as particulate contaminants (e.g., TN, TP, TSS)16, while being price competitive, reliable, and compatible with the Mayfly system. CTD and turbidity data for each site were captured at 15-minute intervals and stored on a removable SD memory card.
Sensors were maintained via monthly site visits which involved removal of debris, thorough cleaning, and repairing of any damaged wipers. Data was downloaded at six monthly intervals and uploaded to a water management database. This database was designed for hydrological and water quality datasets and contains numerous in-built tools that enable data QA/QC processes. The four continuous data records from each monitoring site were assessed by a qualified environmental data specialist following upload to the database system. The data specialist was tasked with removing erroneous spikes, cross checking abnormalities with field notes, comparing data records to discrete field validation measurements, and adjusting datasets where appropriate (e.g., where sensor drift had occurred).
Storm event sampling
Stormflow sampling was used to capture contaminant dynamics during high-flow periods, which are known to contribute disproportionately to annual nutrient and sediment loads. Stormflow samples were collected using one of two available Teledyne ISCO 6715 autosamplers which were deployed opportunistically in response to incoming weather events. The intake hose of each autosampler was positioned next to the sensor station at each site to ensure that samples were representative of the conditions recorded by the sensors. Autosamplers were programmed to collect samples at a set time interval, commencing in pre-event conditions and, ideally, finishing when the hydrograph returned to baseflow. This generally equated to one sample every 90 minutes throughout an event.
A total of 21 storm events were captured across all 10 monitoring sites, with each site being sampled during at least two weather events. Storm event sampling lasted an average of 35 hours and represented rainfall accumulations of 27mm to 116mm.
Laboratory analysis
All hydrochemical analyses for the routine monitoring programme, as well as additional storm event samples, were carried out by the BOPRC water quality laboratory using IANZ accredited methodology. TN and TP were measured by the persulphate method APHA 4500-P J59 using a Flow Injection Analyser. TSS was measured using a standard glass fibre filter, according to APHA 2540 D60.
Discharge measurements
High-frequency discharge measurements were required at each site to allow for calculation of instantaneous loads. Puanene at SH2 met this requirement as it was a permanent hydrological monitoring site maintained by BOPRC, however discharge records at the other nine sites required development. This process involved manually measuring stream discharge on at least five dates over the latter half of 2022, using either a SonTek FlowTracker2 Acoustic Doppler Velocimeter, a SonTek-RS5 ADCP, or a SonTek M9. These data were combined with sensor water level readings and analysed by qualified BOPRC hydrologists to develop water level-discharge rating curves for each site. Four of the nine rating curves were unreliable due to inconsistent cross-sectional areas. This was addressed at three of the four sites by developing linear relationships between the discharge measurements and continuous discharge measurements from Puanene at SH2. Discharge at Pongakawa at Old Coach Rd was estimated through a multi parameter model that incorporated discharge from a permanent hydrology site in a neighbouring catchment (Waitahanui at Otamarakau Valley Rd, approximately 1 km away from the study catchment), water level from a nearby lake (Lake Rotoiti, approximately 7.5km away from the study catchment), and calendar year. Discharge records were summarised to 15-minute intervals to match the frequency of data from each Mayfly station. Details for each flow relationship are shown in Table 2.
Table 2
Discharge relationships developed for each of the ten monitoring sites.
|
Site
|
Discharge Model Type
|
Predictor Variables
|
Performance
|
|
Mangatoetoe at end of Black Rd
|
Linear relationship
|
● Stream discharge at Puanene at SH2
|
● P < 0.01
● R² = 0.84
|
|
Oeutehuehue at Farm Box Culvert
|
Rating curve
|
● Water level
|
● P < 0.05
● R² = 0.73
|
|
Pokopoko at Allport Rd
|
Rating curve
|
● Water level
|
● P < 0.001
● R² = 0.99
|
|
Pokopoko at Old Coach Rd
|
Linear relationship
|
● Stream discharge at Puanene at SH2
|
● P < 0.001
● R² = 0.93
|
|
Pongakawa at Old Coach Road
|
Artificial Neural Network
|
● Stream discharge at Waitahanui at Otamarakau Valley Rd
● Lake level at Rotoma at Whangaroa
● Calendar year
|
● RMSE = 0.023
● R² = 0.86
● NSE = 0.84
● PBIAS = -0.2
|
|
Pongakawa Tributary at Old Coach Rd
|
Linear relationship
|
● Stream discharge at Puanene at SH2
|
● P < 0.01
● R² = 0.81
|
|
Pongakawa Tributary at Rotoehu Rd
|
Rating curve
|
● Water level
|
● P < 0.05
● R² = 0.96
|
|
Puanene at SH2
|
Rated hydrological site
|
● Water level
|
● P < 0.001
● R² = 0.99
|
|
Wharere at Maniatutu Rd
|
Rating curve
|
● Water level
|
● P < 0.001
● R² = 0.95
|
|
Wharere at Old Coach Rd
|
Rating curve
|
● Water level
|
● P < 0.05
● R² = 0.79
|
Data analysis
TN, TP, and TSS concentrations from BOPRC’s routine sampling programme, and the targeted storm event monitoring programme, were extracted for each site and combined with mean-hourly or mean half-hourly sensor values. Missing sensor values were imputed by linear interpolation for gaps smaller than four hours using the ‘na.approx’ function in the ‘Zoo’ package61, and laboratory measurements that were below the 5th or beyond the 95th percentile were treated as outliers and removed from the dataset. Final datasets ranged from 64 to 101 observations, depending on the site and parameter.
Datasets for each site-parameter combination were used to develop real-time, surrogate, contaminant data records using feed-forward back-propagation ANN machine learning models27. These models used sensor output (conductivity, temperature, depth, turbidity) as predictor variables, and either TN, TP, or TSS concentration as the response variable.
The general modelling procedure for each site-contaminant combination was as follows. Datasets were visually inspected using the ‘ggplot2’ package62 to determine the most appropriate sensor parameters to use, and statistical transformations to apply. They were then split into training and test datasets with a percentage ratio of 70:30, ensuring that an equal proportion of ‘event’ samples were present within each dataset. ANN models were developed using sensor measurements and either TN, TP, or TSS, using the ‘brulee’ package63 of the ‘tidymodels’ ecosystem64, within the R Computing Environment65. Ten-fold cross-validation was applied to the training dataset, and a grid search was used to tune the following hyperparameters: number of epochs, number of hidden units, penalty value, and learning rate.
Finalised models were applied to the isolated test dataset, and four performance metrics were calculated: the coefficient of determination (R2), Nash-Sutcliffe efficiency (NSE), percent bias (PBIAS), and root mean squared error (RMSE). Fifteen-minute resolution, surrogate models of TN, TP, and TSS concentrations were computed by applying each site-specific ANN model to the continuous sensor output. Upper and lower 90% prediction intervals were calculated for each modelled datapoint using split conformal inference methods within the ‘probably’ package66. Model performance was assessed by interpreting performance metrics and visually inspecting the predicted vs observed time series data. Models that had a NSE < 0.5 were deemed not fit for further analysis.
Flow partitioning
Modelled discharge for each site was separated into baseflow and event-flow using a Lyne Hollick recursive baseflow filter67 within the ‘grwat’ package68. This baseflow separation method is widely accepted to be simple, robust and repeatable, and was the preferred method for the New Zealand baseflow index characterisation project and is well suited for streams where a substantial proportion of flow originates from groundwater69. A filter parameter of 0.99 was selected for all sites by comparing the saturated hydraulic connectivity of volcanic soils in Alavani and Tomer70 with optimal filter parameter values in Li et al.71.
Contaminant load calculation
Contaminant loads for TN, TP, and TSS were calculated for each site according to Eq. 1, derived from the numeric integration method in Meals et al.11, where ci is the ith modelled concentration value, qi is the corresponding modelled flow, and ti is the time interval between measurements (fifteen minutes). Sensor data gaps greater than four hours were removed from the dataset to avoid bias.
The same equation was also applied to upper and lower 90% prediction interval datasets. Mean annual baseflow loads were calculated in the same way with the exception of using scaled flow output from the recursive baseflow filter.
Comparison with traditional load estimation
Mean annual load estimates calculated from ANN models were compared to estimates calculated from monthly samples for each site-contaminant combination. Datasets for the traditional method required removal of autosampler and opportunistic samples, leaving 24–29 discrete monthly samples at each site.
Symmetric 90th percentile confidence intervals were calculated for the traditional method using Eq. 2
8, where
is the true population mean,
and
are critical t values with
,
= sample size,
= mean load estimate, and
= the standard deviation of
.
Spatial analysis
Spatial analysis was carried out using the ‘sf’ package72 within the R Statistical Computing Environment. Contributing sub-catchments for each monitoring site were delineated by combining upstream watersheds from the River Environment Classification (REC) digital river network73. This information was used to calculate the annual areal load for each sub-catchment (kg ha− 1 yr− 1).
A
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.