Resolving the Relationship Between Microbial Growth and Soil Organic Carbon via a Principle of Energy Conservation and Maintenance Priority
Authors: Shumiao Shu1,2, Yuelin Wang3, Xiaoxiang Zhao4, Xiaodan Wang2*, Wanze Zhu2*
Affiliations:
1Tuojiang River Basin High-Quality Development Research Center, Neijiang Normal University, Neijiang 641000, China
2Institute of Mountain Hazard and Environment, Chinese Academy of Sciences, Chengdu 610041, China
3College of Geography and Planning, Chengdu University of Technology, Chengdu 610059, China
4College of Life Science and Environmental Sciences, Central South University of Forestry and Technology, Changsha, 410018, Hunan, China
*Corresponding author. Email: wxd@imde.ac.cn and wzzhu@imde.ac.cn
Abstract
Microbial carbon use efficiency (CUE) is often regarded as a key determinant of soil organic carbon (SOC) accumulation, yet this view faces a paradox: SOC continues to accumulate even as rising microbial biomass and associated respiration reduce CUE. We propose a new model, the Iterative Growth Model for Microbial communities (IGMM), grounded in microbial metabolic growth principles, which reveals a power-law relationship between SOC and the ratio of maximum microbial biomass (MICmax) to baseline decomposition rate (BDR). This single ratio explains most global SOC variation (R2 = 0.82). Notably, MICmax and BDR show a "triangular boundary pattern": BDR initially rises with MICmax, declines once MICmax reaches ~0.4 g C/kg, and SOC then accumulates stably. Unlike the CUE-centric view, which is indirect, MICmax provides an explicit and quantitative basis for SOC dynamics. By establishing MICmax as a unified mechanistic axis, the IGMM opens a holistic theoretical pathway, rooted in the metabolic growth mechanism, for understanding soil carbon stabilization processes.
Key words:
Soil organic carbon
Iterative Growth Model for Microbial communities
Maximum microbial biomass
Baseline decomposition rate
Microbial carbon use efficiency
Global soil organic carbon variation
Soil organic carbon (SOC) governs terrestrial carbon cycling, climate feedbacks, and ecosystem functioning worldwide. A major driver of SOC dynamics is soil microbial activity. Microbes play a dual role: they decompose organic matter through extracellular enzymes, releasing carbon, while simultaneously channeling a fraction into persistent pools such as new biomass, metabolic by-products, and necromass (Bradford et al. 2016, Kallenbach et al. 2016, Malik et al. 2018, Woolf and Lehmann 2019, Wang et al. 2021b). This duality is shaped by the rules of microbial metabolic growth rather than passive carbon fluxes.
Microbial carbon use efficiency (CUE), the fraction of substrate allocated to growth rather than respiration, reflects microbial metabolic balance and links growth dynamics to SOC storage (Tao et al. 2023). High CUE, which has been identified as more influential than carbon input or decomposition, promotes microbial biomass accumulation and SOC formation. However, CUE is an emergent indicator of metabolic growth rather than its underlying mechanism. This limitation is illustrated by a paradox. As microbial biomass increases, maintenance respiration (e.g., for macromolecule repair and ion balance) rises; likewise, microbes that invest heavily in extracellular enzymes (e.g., cellulases, phenol oxidases) to degrade recalcitrant substrates incur additional respiration costs. Both processes lower CUE, yet SOC continues to accumulate through microbial residues and enzymatic products stabilized by aggregation or mineral association (He et al. 2024). Even when combined with other factors, CUE-based models explain only ~54% of global SOC variation (Tao et al. 2023), highlighting the risk of interpreting CUE as a driver rather than a manifestation of underlying metabolic constraints.
Microbial metabolic growth follows two key principles: maintenance first and energy conservation. Maintenance first prioritizes energy for essential cellular functions (Thornley and Cannell 2000, Shu et al. 2021), while energy conservation links growth, biomass production, and respiration. Based on these principles, we propose an Iterative Growth Model for Microbial communities (IGMM), which shows that the maximum microbial biomass (MICmax), rather than CUE, primarily determines SOC and predicts SOC stocks as a power-law function of the MICmax to baseline decomposition rate (BDR) ratio.
Microbial Metabolic Growth Model (IGMM)
The core of biological growth depends on the coordinated operation and directional allocation of two energy flows: (i) anabolic processes that synthesize new biomass, and (ii) catabolic processes that generate energy. Anabolic pathways use ATP produced by catabolism to build macromolecules such as proteins, lipids, and polysaccharides, retaining most of the chemical energy in the new tissues. In contrast, catabolic pathways release energy through oxidation of organic substrates, producing ATP along with heat and CO₂ as byproducts (Clarke 2019). Thus, autotrophic or heterotrophic respiration (R) can be functionally partitioned into two components: maintenance respiration (Rm), which sustains the functioning of existing tissues, and growth respiration (Rg), which fuels the synthesis of new biomass (i.e., Growth–Maintenance Respiration Paradigm, GMRP).
Typically, Rg scales with the newly synthesized biomass, denoted as f(m), whereas Rm increases with both time (t) and the existing tissue biomass (m). Thus, total respiration can be expressed as:
$$R = R_g + R_m = g_r\,f(m) + m_r\,m\,t \tag{1}$$
where gr denotes the constant amount of respiratory energy required to synthesize one unit of new tissue — i.e., the growth respiration cost per unit biomass — and mr denotes the maintenance respiration rate per unit existing tissue, which may vary over time or with physiological state. Dividing both sides by t yields the respiration rate (r):
$$r = g_r\,g + m_r\,m \tag{2}$$
where g represents the growth rate (f(m)/t), and gr·g and mr·m denote the growth respiration rate (rg) and maintenance respiration rate (rm), respectively. Because growth respiration is not a continuously required process, we assume that, under the current state, microbial biomass attains its maximum value (mmax) when rm accounts for its maximum fraction of respiration, at which point respiration is dominated by mr·mmax:
$$r = m_r\,m_{\max} \tag{3}$$
Combining Eqs. 2 and 3 yields an expression linking g with the ratio mr/gr and m:
$$g = \frac{m_r}{g_r}\,\left(m_{\max} - m\right) \tag{4}$$
Scaling this to microbial communities (replacing m with total community biomass MIC, mmax with maximum community biomass MICmax, and g with community growth rate Gmic), we get the core IGMM:
$$G_{mic} = \frac{m_r}{g_r}\,\left(\mathrm{MIC}_{\max} - \mathrm{MIC}\right) \tag{5}$$
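As a minimal sketch of how the core IGMM relation can be used in both directions, the Python snippet below assumes the form Gmic = (mr/gr) × (MICmax − MIC), which follows from the later statement that MICmax comprises MIC plus the biomass-equivalent term Gmic × gr/mr. The function names and the numerical values are hypothetical and chosen only for illustration; units are arbitrary but must be mutually consistent.

```python
def growth_rate(mic, mic_max, mr, gr):
    """Community growth rate, assuming Gmic = (mr / gr) * (MICmax - MIC)."""
    if gr <= 0:
        raise ValueError("gr must be positive")
    return (mr / gr) * (mic_max - mic)


def max_biomass(mic, g_mic, mr, gr):
    """Rearranged form of the same relation: MICmax = MIC + Gmic * gr / mr."""
    if mr <= 0:
        raise ValueError("mr must be positive")
    return mic + g_mic * gr / mr


# Made-up illustrative values (not taken from the dataset).
mic, mic_max = 0.2, 0.4   # current and maximum community biomass
mr, gr = 0.05, 0.5        # maintenance and growth respiration coefficients

g_mic = growth_rate(mic, mic_max, mr, gr)
recovered = max_biomass(mic, g_mic, mr, gr)
print(g_mic, recovered)
```

Under this form, growth falls linearly to zero as MIC approaches MICmax, and the two functions are exact inverses of each other for fixed mr and gr.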
We hypothesize that MICmax is determined by three functionally coupled factors, dissolved organic carbon (DOC), particulate organic carbon (POC), and extracellular enzyme activity (ENZ), which regulate MICmax via a "substrate supply–conversion–utilization" chain. First, DOC provides a directly assimilable carbon source, but its contribution depends on microbial carbon partitioning according to the IGMM "maintenance-first" rule; as more DOC is allocated to maintenance respiration, MIC approaches MICmax. Second, POC serves as a potential carbon source but is only accessible after ENZ-driven conversion to DOC. Third, ENZ activity mediates the POC-to-DOC conversion rate and efficiency, thus regulating bioavailable DOC supply. Together, these factors exert a synergistic, nonlinear effect on MICmax, which we formalize as a multiplicative power-law function: MICmax = a × DOC^α × POC^β × ENZ^γ, where a is a scaling constant and α, β, and γ represent sensitivities to DOC (including carbon partitioning), POC (as a potential DOC pool), and ENZ (as the conversion driver), respectively. This multiplicative form avoids explicit interaction terms while capturing the underlying synergies.
Subsequently, substituting this into Eq. 5 yields:
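$$G_{mic} = \frac{m_r}{g_r}\,\left(a\,\mathrm{DOC}^{\alpha}\,\mathrm{POC}^{\beta}\,\mathrm{ENZ}^{\gamma} - \mathrm{MIC}\right) \tag{6}$$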
We tested Eq. 6 with a global microbial dataset derived from soil profile observations processed by the PRODA framework (microbial-model MCMC fusion + global deep learning upscaling) (Tao et al. 2023). Because mr/gr is highly sensitive to environmental changes (Thornley and Cannell 2000, Zuo et al. 2012) and exhibits large variation (~ 2.7 orders of magnitude), we retained only its temperature-dependent component (modeled as a power-law function of temperature) to improve fitting robustness, i.e., to reduce the influence of inaccuracies in mr/gr on model performance, and accordingly reformulated Eq. 6, yielding:
7
This treatment simplifies its response to the environment, while the effects of other environmental factors that may influence mr/gr are effectively forced into the remaining model parameters and captured during the fitting process. Given the power-law structure of the remaining components in Eq. 7, we expected that this equation would still capture the majority of the variation in Gmic. As anticipated, the fit confirmed this (Fig. 1A; R2 = 0.987, RMSE = 0.012), with all parameters statistically significant. Repeated random splits (50 repeats, 70% training / 30% test) further confirmed the model’s robust predictive performance (Fig. S1; RMSE = 0.0120 ± 0.00019, R2 = 0.986 ± 0.00038). Although the high R² may arise from compensatory effects among parameters such as a, α, β, and λ, which are moderately to strongly correlated (r ≥ 0.64) due to the structure of the power-law product model, these results nonetheless provide preliminary support for the theoretical soundness of the IGMM framework.
Fig. S1
Distribution of R² (A) and RMSE (B) across 50 random training/test splits of Eq. 7
After establishing the overall validity of the model, we examined whether MICmax, calculated from Gmic, mr/gr, and MIC (Eq. 5), could be largely explained by DOC, POC, and ENZ. Regressing MICmax against these variables showed substantial explanatory power despite the large variation in mr/gr (Fig. 1B; R2 = 0.94, p < 0.01, RMSE = 0.10). Parameter perturbation tests indicated that DOC, POC, and ENZ acted independently within the regression framework: perturbing any coefficient by 10–30% could not be compensated by adjusting the others, and forced compensation reduced R2 by up to 0.08 (Fig. S2). To rule out chance associations, we conducted a 500-iteration permutation test. After randomly shuffling the log-transformed MICmax data and refitting the model, the actual model's F-statistic (156813.84) far exceeded the maximum permuted value (5.28) (p < 0.01). These results indicate that the model's explanatory power reflects genuine, independent contributions from these variables rather than statistical chance.
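Because the assumed relation MICmax = a × DOC^α × POC^β × ENZ^γ is multiplicative, taking natural logarithms makes it linear in its parameters, which is what allows a log-transformed least-squares fit. The short derivation below is a sketch of that step and of how the exponents can be read as elasticities; it is not reproduced from the source.

```latex
\[
\begin{aligned}
\ln \mathrm{MIC}_{\max} &= \ln a + \alpha \ln \mathrm{DOC} + \beta \ln \mathrm{POC} + \gamma \ln \mathrm{ENZ},\\[4pt]
\frac{\partial \ln \mathrm{MIC}_{\max}}{\partial \ln \mathrm{DOC}} &= \alpha,\qquad
\frac{\partial \ln \mathrm{MIC}_{\max}}{\partial \ln \mathrm{POC}} = \beta,\qquad
\frac{\partial \ln \mathrm{MIC}_{\max}}{\partial \ln \mathrm{ENZ}} = \gamma .
\end{aligned}
\]
```

Read this way, the reported estimate α = −0.73 would mean that a 1% increase in DOC, with POC and ENZ held fixed, corresponds to roughly a 0.73% decrease in MICmax.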
Fig. 1
Validation of the IGMM using microbial growth rate, Gmic (A) and maximum microbial biomass, MICmax (B)
The red dashed lines in A and B were y = 0.99x + 0.01 (R2 = 0.99, p < 0.01) and y = 0.94x − 0.01 (R2 = 0.94, p < 0.01), respectively. Model parameters in A were p = − 0.0008 ± 0.00, q = − 2.86 ± 0.00, a = 42.81 ± 0.36, α = 0.14 ± 0.00, β = −0.13 ± 0.00, and γ = 0.07 ± 0.00; in B, parameters were a = 2.25 ± 0.025, α = − 0.73 ± 0.002, β = 0.69 ± 0.002, and γ = 0.92 ± 0.002.
SOC Determined by the Maximum Microbial Biomass to Baseline Decomposition Rate Ratio
We first hypothesize MICmax exerts a positive driving effect on SOC accumulation. Mathematically, MICmax comprises two components: MIC and biomass-equivalent Gmic × gr/mr, both supporting SOC accumulation via two core pathways. MIC continuously replenishes POC and DOC pools (and thus promotes SOC input) through necromass turnover (Wang et al. 2021a) and metabolic by-products (e.g., small organic acids); Gmic is closely associated with the secretion of extracellular polymeric substances (EPS) and ENZ. These substances collectively promote the sequestration of DOC/POC into stable SOC by enhancing the physical protection of POC and improving the efficiency of substrate conversion to DOC. Meanwhile, the baseline decomposition rate (BDR) reflects the intrinsic decomposability of substrates (e.g., POC, DOC) and their inherent carbon loss potential, indirectly influencing SOC turnover and stability. The opposing effects of MICmax and BDR can be integrated as the MICmax/BDR ratio, providing a mechanistic predictor for SOC dynamics. Notably, environmental sensitivity is incorporated via mr, rather than BDR itself.
We therefore hypothesize that SOC can be expressed as a power-law function of MICmax/BDR, linking microbial carbon regulation capacity and substrate decomposition constraints to predict SOC dynamics:
$$\mathrm{SOC} = \varepsilon \left(\frac{\mathrm{MIC}_{\max}}{\mathrm{BDR}}\right)^{\theta} \tag{8}$$
Eq. 8 explained 82% of global SOC variation (Fig. 2A; p < 0.01, RMSE = 0.15), with parameters ε and θ estimated at 2.89 ± 0.012 and 0.75 ± 0.002, respectively. The differences and ratios between fitted and observed ln SOC were approximately normally distributed (Fig. S3), indicating that the observed–predicted relationship is largely isometric.
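To make the scaling behaviour of this power law concrete, the following Python sketch evaluates it with the reported estimates. It assumes the form SOC = ε × (MICmax/BDR)^θ with ε as the multiplicative prefactor; the function name and the input values are hypothetical, and the units of the output depend on those used in the original fit, which are not restated here.

```python
import math

EPSILON = 2.89  # reported estimate of the prefactor (assumed multiplicative)
THETA = 0.75    # reported estimate of the exponent


def predict_soc(mic_max, bdr, epsilon=EPSILON, theta=THETA):
    """Evaluate the assumed power law SOC = epsilon * (MICmax / BDR) ** theta."""
    if mic_max <= 0 or bdr <= 0:
        raise ValueError("mic_max and bdr must be positive")
    return epsilon * (mic_max / bdr) ** theta


# Made-up inputs: doubling the MICmax/BDR ratio at fixed BDR.
base = predict_soc(0.4, 0.2)
doubled = predict_soc(0.8, 0.2)
scaling = doubled / base          # equals 2 ** THETA, about 1.68
ln_slope = math.log(scaling) / math.log(2.0)
print(base, doubled, scaling, ln_slope)
```

Because θ < 1, the relationship is sublinear: doubling the ratio raises predicted SOC by about 68% rather than 100%.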
Fig. S3
Distribution and normality check of ln SOC residuals and ratios
A and C: Blue curve shows the observed density; red curve is the normal distribution reference based on the observed mean and standard deviation.
Further analysis revealed a "triangular boundary pattern" of SOC accumulation, consistently observed across different climatic zones (Fig. 2B), with high SOC largely concentrated along the lower boundary, indicating primary control by MICmax. Even when MICmax is low, a high MICmax/BDR ratio can sustain SOC. Increasing MICmax or BDR is accompanied by convergence of BDR or MICmax, reflecting a regulated dynamic between microbial capacity and background decomposition. Mechanistically, MICmax is constrained by substrate decomposability. However, as MICmax increases, the secretion of polysaccharides and residues enhances aggregate formation and mineral binding, providing physical and chemical protection to organic matter, reducing its effective BDR, and thereby promoting SOC stabilization (Schimel and Schaeffer 2012, Tao et al. 2023, He et al. 2024). The balance between these processes shifts, and the microbial role in SOC gradually transitions from decomposition to protection, with the transition point at MICmax ≈ 0.40 g C/kg (Fig. S4).
Fig. S4
Triangular distribution and boundary relationship between maximum microbial biomass (MICₘₐₓ) and baseline decomposition rate
Red, blue, and green lines indicate the triangle edges with vertices at (0,0), (0.4,0.45), and (4,0.1); grey points are within the triangle, encompassing over 99% of samples.
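The triangular envelope described in this caption can be checked for any sample with a standard point-in-triangle test. The Python sketch below uses the three vertices given above and assumes the first coordinate is MICmax and the second is BDR; the function names and the test points are hypothetical.

```python
# Vertices from the caption, assumed to be (MICmax, BDR) pairs.
TRIANGLE = ((0.0, 0.0), (0.4, 0.45), (4.0, 0.1))


def _cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(point, triangle=TRIANGLE):
    """Return True if point lies inside or on the edges of the triangle."""
    a, b, c = triangle
    d1 = _cross(a, b, point)
    d2 = _cross(b, c, point)
    d3 = _cross(c, a, point)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    # Inside (or on an edge) when all non-zero signs agree.
    return not (has_negative and has_positive)


def fraction_inside(points, triangle=TRIANGLE):
    """Share of samples that fall within the triangular envelope."""
    return sum(in_triangle(p, triangle) for p in points) / len(points)


samples = [(1.0, 0.2), (1.0, 0.5), (0.1, 0.3), (0.4, 0.45)]
print([in_triangle(p) for p in samples], fraction_inside(samples))
```

Applied to the full dataset, `fraction_inside` would give the kind of coverage figure quoted in the caption, although the exact procedure used by the authors is not described.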
Finally, based on this triangular pattern and assuming a random distribution of MIC/MICmax between 0 and 1, we predicted the distribution of the MIC–SOC relationship. This prediction was supported by more than 1,000 global measurements encompassing a wide range of terrestrial environments (Fig. 3), providing empirical validation for the proposed theoretical framework.
Fig. 2
Explaining SOC by the MICmax-to-BDR ratio (A) and SOC distribution along MICmax and BDR gradients across climatic zones (B)
The red dashed line in A was y = 0.82x + 0.50 (R2 = 0.82, p < 0.01).
Fig. 3
Measured MIC–SOC nested within the predicted MIC–SOC relationship
Discussion
Our central finding is clear: MICmax — not CUE — emerges as the fundamental microbial regulator of soil organic carbon (SOC) accumulation. Unlike CUE, which merely partitions substrate between biomass and respiration, MICmax integrates current biomass with its growth potential, directly capturing the microbial lever that drives SOC storage. In other words, what CUE hints at, MICmax reveals explicitly and quantitatively.
Under the CUE framework, high SOC in boreal soils is attributed to decomposition suppression by low temperatures, despite low CUE, whereas lower SOC in tropical soils is attributed to accelerated decomposition under high temperatures, despite high CUE (Tao et al. 2023). This explanation merely reconciles CUE–SOC patterns and amounts to redundant attribution: CUE is an emergent property of climate, substrate, and microbial metabolism, not an independent mechanistic driver. Environmental effects are internalized by microbial metabolism; for example, microbes respond to environmental acceleration of decomposition through metabolic adjustments that also modulate CUE (He et al. 2024), so CUE cannot independently explain SOC accumulation.
In contrast, Eq. 8 identifies MICmax as the microbial basis of SOC accumulation, jointly determined by MIC and its growth rate–equivalent biomass (Gmic × gr/mr). In boreal soils, high MIC and Gmic × gr/mr (Fig. S5) sustain continuous residue and enzyme inputs, enabling SOC accumulation even under elevated decomposition. The predicted triangular boundary pattern suggests a functional transition in microbial metabolic growth influence from decomposition-dominated to stabilization-dominated pathways (Fig. 3). By integrating environmental influences and the effects of microbial secretions and residues on SOC, MICmax serves as a unifying mechanistic axis explaining SOC patterns across climates.
Fig. S5
Comparison of ln MIC and ln MIC·gr/mr across different climatic zones
There are significant differences in the mean values among the comparisons of various climatic zones, with p < 0.01
Variance decomposition within the model (Eq. 5) shows that mr/gr contributes 89.1% to MICmax variation — far exceeding Gmic (0.14%) and MIC (10.7%) — identifying it as the primary driver of MICmax fluctuations. Notably, gr/mr not only reflects the intimate coupling of environment and microbial physiology but also scales mathematically with the time for MIC to reach MICmax (Shu et al. 2021). This implies the environment regulates SOC accumulation more prominently by modulating microbial growth time: although growth time and growth rate are negatively correlated (r = − 0.54, p < 0.01), the greater variance in growth time translates to a stronger impact on MICmax. Consistent with this mechanism, the IGMM fully accounts for both systems: low boreal temperatures reduce growth rate (low CUE), prolong growth time, and increase MICmax, elevating SOC; high tropical temperatures accelerate growth (high CUE), shorten time, and decrease MICmax, reducing SOC.
In summary, the IGMM, by quantifying how microbial growth dynamics shape MICmax, transcends the CUE paradigm and grounds SOC control in the two universal principles of life: maintenance priority and energy conservation. This first-principles structure governs the core metabolic trade-off (mr/gr) that determines the final capacity and exhibits universality—it not only underpins our microbial model but has also been applied in previous studies to explain forest carbon sinks (Shu and Wang 2024, Yao et al. 2025). Our study holds strong potential to open new avenues for integrating above- and belowground carbon processes within a unified mechanistic framework.
References
Bradford MA, Wieder WR, Bonan GB, Fierer N, Raymond PA, Crowther TW (2016) Managing uncertainty in soil carbon feedbacks to climate change. Nat Clim Change 6:751–758
Clarke A (2019) Energy Flow in Growth and Production. Trends Ecol Evol 34:502–509
He X, Abs E, Allison SD, Tao F, Huang Y, Manzoni S, Abramoff R, Bruni E, Bowring SPK, Chakrawal A, Ciais P, Elsgaard L, Friedlingstein P, Georgiou K, Hugelius G, Holm LB, Li W, Luo Y, Marmasse G, Nunan N, Qiu C, Sitch S, Wang Y-P, Goll DS (2024) Emerging multiscale insights on microbial carbon use efficiency in the land carbon cycle. Nat Commun 15:8010
Kallenbach CM, Frey SD, Grandy AS (2016) Direct evidence for microbial-derived soil organic matter formation and its ecophysiological controls. Nat Commun 7:13630
Malik AA, Puissant J, Buckeridge KM, Goodall T, Jehmlich N, Chowdhury S, Gweon HS, Peyton JM, Mason KE, van Agtmaal M, Blaud A, Clark IM, Whitaker J, Pywell RF, Ostle N, Gleixner G, Griffiths RI (2018) Land use driven change in soil pH affects microbial carbon cycling processes. Nat Commun 9:3591
Schimel J, Schaeffer SM (2012) Microbial control over carbon cycling in soil. Front Microbiol 3:348
Shu S-m, Zhu W-z, Kontsevich G, Zhao Y-y, Wang W-z, Zhao X-x, Wang X-d (2021) A discrete model of ontogenetic growth. Ecol Model 460:109752
Shu S, Wang X (2024) Metabolic growth mechanisms and theoretical growth potential of global woody plant communities. bioRxiv 2024.10.02.616230
Sun T, Wang Y, Hui D, Jing X, Feng W (2020) Vertical distributions of soil microbial biomass carbon: a global dataset. Data Brief 32:106147
Tao F, Huang Y, Hungate BA, Manzoni S, Frey SD, Schmidt MWI, Reichstein M, Carvalhais N, Ciais P, Jiang L, Lehmann J, Wang Y-P, Houlton BZ, Ahrens B, Mishra U, Hugelius G, Hocking TD, Lu X, Shi Z, Viatkin K, Vargas R, Yigini Y, Omuto C, Malik AA, Peralta G, Cuevas-Corona R, Di Paolo LE, Luotto I, Liao C, Liang Y-S, Saynes VS, Huang X, Luo Y (2023) Microbial carbon use efficiency promotes global soil carbon storage. Nature 618:981–985
Thornley JHM, Cannell MGR (2000) Modelling the Components of Plant Respiration: Representation and Realism. Ann Botany 85:55–67
Wang B, An S, Liang C, Liu Y, Kuzyakov Y (2021a) Microbial necromass as the source of soil organic carbon in global ecosystems. Soil Biol Biochem 162:108422
Wang C, Qu L, Yang L, Liu D, Morrissey E, Miao R, Liu Z, Wang Q, Fang Y, Bai E (2021b) Large-scale importance of microbial carbon use efficiency and necromass to soil organic carbon. Glob Change Biol 27:2039–2048
Woolf D, Lehmann J (2019) Microbial models with minimal mineral protection can explain long-term soil organic carbon persistence. Sci Rep 9:6522
Yao Y, Shu S-M, Feng J, Wang P, Jiang H, Wang X-D, Zhang S (2025) Convergence and differentiation of tree radial growth in the Northern Hemisphere. Agric For Meteorol 360:110300
Zuo W, Moses ME, West GB, Hou C, Brown JH (2012) A general model for effects of temperature on ectotherm ontogenetic growth and development. Proc R Soc B 279:1840–1846
Acknowledgments:
We thank Mr. Zhiqiang Xiao for important advice concerning the specific manuscript and all the researchers who provided accessible data.
Funding:
This work was supported by the National Natural Science Foundation of China (32201374).
Author contributions:
Conceptualization: Shumiao Shu
Methodology: Shumiao Shu, Xiaoxiang Zhao
Visualization: Shumiao Shu, Yuelin Wang
Funding acquisition: Xiaodan Wang
Project administration: Xiaodan Wang, Wanze Zhu
Supervision: Xiaodan Wang, Wanze Zhu
Writing – original draft: Shumiao Shu
Writing – review & editing: Shumiao Shu, Xiaodan Wang, Wanze Zhu, Xiaoxiang Zhao, Yuelin Wang
Competing interests:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data Availability Statement
All the data that support the findings of this study are available at https://zenodo.org/records/7676632 and https://zenodo.org/records/3970867.
Code will be made available on request.
Methods
Data sources
Our model was driven by data reported in Tao et al. (2023) (https://doi.org/10.5281/zenodo.7676632), rooted in observational soil profile data. The raw profiles (n = 113,926) were obtained from two sources: (1) the World Soil Information Service (WoSIS) 2019 snapshot (111,380 profiles across 173 countries); and (2) supplementary permafrost-region datasets (2,546 profiles from North America, northern Eurasia, and the Qinghai–Tibet Plateau). Stepwise preprocessing excluded shallow profiles (≤ 2 layers or depth ≤ 50 cm; 72,377 retained), removed inconsistent Markov Chain Monte Carlo (MCMC) results (Gelman–Rubin statistic > 1.05; 59,476 retained), and discarded poor fits (Nash–Sutcliffe efficiency < 0.0), yielding 57,267 high-quality vertical SOC profiles. These were analyzed with the Process-guided deep learning and Data-driven modelling (PRODA) framework: site-level parameters were estimated by fusing each profile with a microbial-process explicit model via MCMC data assimilation (with the empirical constraint that steady-state microbial biomass does not exceed 10% of total SOC), then upscaled to 0.5° global raster maps using a fully connected deep learning model (4 hidden layers, 256–512 nodes, ReLU activation, Adadelta optimizer, L1×L2 loss) with 60 environmental covariates. Uncertainty was assessed through 200 bootstrapped runs (90% training / 10% validation) to derive 2σ confidence intervals.
We performed targeted selection on the above dataset to obtain variables matching the study’s theoretical model (IGMM) and SOC-related indicators. For model operation, we directly extracted core indicators including microbial biomass (MIC), microbial growth rate (Gmic), microbial total respiration rate (r), and maintenance respiration rate (rm), and further calculated two key derived coefficients from these variables: the maintenance respiration coefficient (mr), defined as rm divided by MIC (mr = rm/MIC), and the growth respiration coefficient (gr), calculated as growth respiration (r minus rm) divided by Gmic (gr = (r − rm)/Gmic). For SOC-related indicators, we extracted soil organic carbon (SOC), dissolved organic carbon (DOC), particulate organic carbon (POC), extracellular enzyme activity (ENZ), and baseline decomposition rate (BDR). To ensure data quality, we excluded outliers of the derived mr, gr, and their ratio using the interquartile range (IQR) method with a threshold of 1.5 × IQR, retaining over 35,000 valid data entries for subsequent analyses. Mean annual temperature (MAT), the only other environmental factor used, is consistent with the PRODA framework’s input datasets (SoilGrids, CLM5 forcing data).
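The derivation of the two coefficients and the outlier screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and sample values are hypothetical, the quartiles use Python's "inclusive" interpolation (which corresponds to the default quantile definition in R), and the original analysis was carried out in R.

```python
import statistics


def respiration_coefficients(mic, g_mic, r, rm):
    """Return (mr, gr) with mr = rm / MIC and gr = (r - rm) / Gmic."""
    if mic <= 0 or g_mic <= 0:
        raise ValueError("MIC and Gmic must be positive")
    return rm / mic, (r - rm) / g_mic


def iqr_keep_mask(values, k=1.5):
    """True for values within [Q1 - k*IQR, Q3 + k*IQR], False for outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= v <= high for v in values]


# Made-up example: one site, then a toy vector of mr/gr ratios.
mr, gr = respiration_coefficients(mic=0.5, g_mic=0.1, r=0.08, rm=0.05)
ratios = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
mask = iqr_keep_mask(ratios)
kept = [v for v, keep in zip(ratios, mask) if keep]
print(mr, gr, kept)
```

In practice the same mask would be computed separately for mr, gr, and mr/gr, and a record would be retained only if it passes all three screens; whether the authors combined the masks in exactly this way is not stated.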
In addition to the PRODA outputs, we incorporated two independent measured datasets to validate our analyses. The first is a compilation by Tao et al. (2023), which includes over 50 paired MIC–SOC observations from 46 terrestrial sites worldwide. The second is the VDMBC dataset compiled by Sun et al. (2020), comprising 1,040 MIC–SOC observations from 289 soil profiles across five continents. From both datasets, we extracted paired MIC and SOC values to complement and validate the model-based results.
Statistical analyses
The relationship among Gmic, MAT, MIC, and MICmax (Eq. 7, where MICmax is assumed to be a function of DOC, POC, and ENZ) was evaluated using weighted nonlinear regression incorporating kernel density-based weights (WNLS-KD) in R (Fig. 1A). To further evaluate the robustness of Eq. 7, we repeatedly (50 times) split the dataset into training (70%) and test (30%) subsets, fitted the model using WNLS-KD in R, and quantified predictive performance on the held-out test data. Model performance was summarized using the RMSE and R² distributions across the repeated splits and visualized in Fig. S1.
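The repeated-split procedure can be illustrated with a deliberately simplified Python sketch. Here an ordinary least-squares line on synthetic data stands in for the kernel density-weighted nonlinear fit of the growth-rate model, so only the resampling logic (50 repeats, 70% training, 30% test, RMSE and R² on held-out data) mirrors the description above; all names and data are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def evaluate(x, y, slope, intercept):
    """RMSE and R^2 of the fitted line on the given (held-out) data."""
    pred = [intercept + slope * xi for xi in x]
    my = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return math.sqrt(ss_res / len(y)), 1.0 - ss_res / ss_tot


def repeated_splits(x, y, repeats=50, train_frac=0.7, seed=0):
    """Fit on a random training subset and score on the rest, repeatedly."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    n_train = int(round(train_frac * len(x)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        scores.append(evaluate([x[i] for i in test], [y[i] for i in test],
                               slope, intercept))
    return scores


# Synthetic log-scale data: y = 1.0 + 0.75 * x + noise.
data_rng = random.Random(42)
x = [data_rng.uniform(-2.0, 2.0) for _ in range(200)]
y = [1.0 + 0.75 * xi + data_rng.gauss(0.0, 0.1) for xi in x]
scores = repeated_splits(x, y)
mean_rmse = sum(s[0] for s in scores) / len(scores)
mean_r2 = sum(s[1] for s in scores) / len(scores)
print(mean_rmse, mean_r2)
```

A narrow spread of test-set RMSE and R² across splits, as reported in Fig. S1, indicates that the fit does not depend on any particular subset of the data.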
Once Eq. 7 proved valid, we estimated MICmax from the calculated mr/gr, MIC, and Gmic (Eq. 5) and subsequently evaluated its functional dependence on DOC, POC, and ENZ (i.e., MICmax = a × DOC^α × POC^β × ENZ^γ) using log-transformed kernel density-weighted least squares (WLS-KD) in R (Fig. 1B). To assess the sensitivity of and potential compensation among the predictor variables, we performed a Monte Carlo-based perturbation analysis in R. Specifically, the coefficients associated with DOC, POC, and ENZ were systematically perturbed by 10–30%, and compensatory adjustments were applied to one of the other variables at 0%, 50%, or 100% of the perturbation magnitude. For each disturbance–compensation combination, 1,000 simulations were conducted to calculate the resulting R2 relative to the original model. Three variable pairs were examined: DOC disturbance with POC compensation, DOC disturbance with ENZ compensation, and ENZ disturbance with DOC compensation. The mean R2 across simulations was used to quantify the effect of perturbations and compensatory adjustments, and the results were visualized to illustrate the robustness of model predictions to parameter variability (Fig. S2).
To rule out chance-driven associations, we conducted a 500-iteration permutation test consistent with our WLS-KD framework. For each iteration, we randomly shuffled the log-transformed MICmax data to break its link with the predictors, refit the weighted linear model, and extracted the F-statistic from this permuted model. The statistical significance of the actual model was determined by comparing its F-statistic with the distribution of F-statistics from the 500 permuted models. The p-value was calculated as the proportion of permuted F-statistics equal to or greater than the actual model’s F-statistic. This approach tested whether the explanatory power of our model reflects a genuine relationship or a random data structure.
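The logic of the permutation test can be sketched in Python as follows. As a loudly stated simplification, the example uses a single predictor and an unweighted linear model on synthetic data rather than the three-predictor kernel density-weighted model; the function names and data are hypothetical, and only the shuffle, refit, and compare steps follow the description above.

```python
import random


def f_statistic(x, y):
    """F-statistic of a one-predictor linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)
    # One numerator degree of freedom, n - 2 residual degrees of freedom.
    return r2 * (n - 2) / (1.0 - r2)


def permutation_test(x, y, iterations=500, seed=0):
    """Compare the actual F-statistic with F-statistics from shuffled y."""
    rng = random.Random(seed)
    actual = f_statistic(x, y)
    shuffled = list(y)
    permuted = []
    for _ in range(iterations):
        rng.shuffle(shuffled)          # breaks the link between x and y
        permuted.append(f_statistic(x, shuffled))
    p_value = sum(f >= actual for f in permuted) / iterations
    return actual, max(permuted), p_value


# Synthetic log-scale data with a genuine relationship.
data_rng = random.Random(7)
x = [data_rng.uniform(0.0, 3.0) for _ in range(100)]
y = [0.5 + 0.9 * xi + data_rng.gauss(0.0, 0.2) for xi in x]
actual_f, max_permuted_f, p_value = permutation_test(x, y)
print(actual_f, max_permuted_f, p_value)
```

With 500 iterations the smallest non-zero p-value is 1/500, so a proportion of zero should be reported as p < 0.002 rather than as exactly zero.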
To test Eq. 8, we fitted this power-law function to the observed SOC data using log-transformed WLS-KD in R (Fig. 2A) and compared the distributions of the differences and ratios between the predicted and observed values on the ln scale (Fig. S3). We also show the distribution of SOC with respect to MICmax and BDR across different climatic zones (Fig. 2B). Based on this distribution (Fig. S4) and assuming a random distribution of MIC/MICmax between 0 and 1, we generated the predicted MIC–SOC relationship, which was then compared with over 1,000 measured MIC–SOC relationships to evaluate the robustness and applicability of the model predictions across different environmental conditions (Fig. 3).
Finally, we compared microbial MIC and Gmic × gr/mr across different climatic zones to examine potential variation in microbial growth characteristics. For comparisons that do not assume a specific data distribution, we applied a nonparametric test (Kruskal–Wallis test for multiple groups) in R (Fig. S5). For MICmax, we quantified the relative contribution of each variable (Gmic, MIC, and mr/gr) to its total variance. We used the first-order error propagation theory (Delta Method), which approximates the variance contribution of each variable based on the model's partial derivatives and the variable's own variance. This allowed for a direct estimation of the percentage contribution of each variable to MICmax fluctuations.
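A first-order (Delta Method) variance decomposition of this kind can be sketched as below. The example assumes MICmax = MIC + Gmic / k with k = mr/gr, evaluates the partial derivatives at the sample means, and ignores covariance terms between the inputs; these choices, the function name, and the toy data are assumptions made for illustration and may differ from the authors' implementation in R.

```python
import statistics


def delta_method_shares(mic, g_mic, k):
    """Percentage contribution of MIC, Gmic, and k = mr/gr to Var(MICmax).

    Uses MICmax = MIC + Gmic / k, with partial derivatives
    d/dMIC = 1, d/dGmic = 1/k, d/dk = -Gmic / k**2, evaluated at the means.
    """
    mean_g = statistics.fmean(g_mic)
    mean_k = statistics.fmean(k)
    gradients = {
        "MIC": 1.0,
        "Gmic": 1.0 / mean_k,
        "mr/gr": -mean_g / mean_k ** 2,
    }
    variances = {
        "MIC": statistics.variance(mic),
        "Gmic": statistics.variance(g_mic),
        "mr/gr": statistics.variance(k),
    }
    contributions = {name: gradients[name] ** 2 * variances[name]
                     for name in gradients}
    total = sum(contributions.values())
    return {name: 100.0 * value / total for name, value in contributions.items()}


# Made-up toy inputs (three sites).
shares = delta_method_shares(
    mic=[0.1, 0.2, 0.3],
    g_mic=[0.01, 0.02, 0.03],
    k=[0.1, 0.2, 0.3],
)
print(shares)
```

Because the sensitivity to k scales with 1/k², small values of mr/gr combined with large relative variation in that ratio can dominate the decomposition, which is the qualitative pattern reported in the Discussion.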
All statistical analyses and modeling were performed using R (version 4.2.0; R Core Team).
Supplementary Materials
Resolving the Relationship Between Microbial Growth and Soil SOC via a Principle of Energy Conservation and Maintenance Priority