The Perception of the Impact of Staffing and Study Design on Performance Outcomes in Clinical Trials

• Vincenzina Mora¹ – vincenzina.mora@policlinicogemelli.it

• Daniele Napolitano¹ (Corresponding Author) – daniele.napolitano@policlinicogemelli.it

• Mattia Bozzetti² – mattia.bozzetti@asst-cremona.it

• Antonio Gasbarrini¹,³ – antonio.gasbarrini@unicatt.it

Affiliations:

1. Clinical Trial Office, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy

2. Direction of Health Professions, ASST Cremona, Cremona, Italy

3. Università Cattolica del Sacro Cuore, Rome, Italy

Corresponding author:

Daniele Napolitano

Email: daniele.napolitano@policlinicogemelli.it

Affiliation: Clinical Trial Office, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy

Address: Largo Francesco Vito 1, 00168 Rome, Italy

Abstract

Background

Protocol complexity in clinical trials has increased substantially, yet empirical evidence linking staffing configurations, complexity, and operational performance remains limited. This study examined how team composition and study design relate to perceived performance using a validated site-performance instrument and a standardized complexity classifier.

Methods

A cross-sectional study was conducted in a high-volume academic clinical trials unit (January–March 2025). For all actively recruiting or ongoing trials, contextual characteristics, protocol complexity (Ontario Protocol Assessment Level, OPAL), and staffing (Study Nurses, Study Coordinators, Data Managers) were recorded. Site performance was assessed using the Clinical Trial Site Performance Measure (CT-SPM), completed jointly by each trial team. Non-parametric tests and canonical discriminant analysis evaluated differences across phases and designs; univariate analyses of variance examined associations between staffing, complexity, and performance.

Results

A total of 362 trials were included, predominantly oncology (67.7%). Mean performance scores indicated overall good functioning. Participant retention and adverse event reporting differed across phases, with Phase I outperforming later phases. Study design showed larger effects: observational studies exceeded randomized controlled trials in retention, data quality, protocol compliance, and overall performance. Staffing demonstrated strong associations with performance: Study Nurses showed very large effects on retention (η²p = 0.86) and substantial effects on data quality and compliance; Study Coordinators contributed moderate-to-large effects; Data Managers showed smaller, domain-specific contributions. OPAL-defined complexity did not explain additional variance once staffing was considered.

Conclusions

Perceived site performance is influenced more by staffing composition than by protocol complexity. Study Nurses and Study Coordinators—positioned at the clinical–operational interface—appear central in sustaining participant-facing and data-facing processes. Integrating standardized complexity assessments with structured performance measures may support a plan-and-prove approach to staffing and operational oversight.

Trial registration

Not applicable.

KEYWORDS:

Clinical trial operations

Study Nurse

Study Coordinator

Data Manager

Protocol complexity

OPAL

CT-SPM

Site performance

Workload

Trial methodology

Introduction

Clinical trials have become increasingly complex due to the development of sophisticated protocol designs, the introduction of new technologies, and the growing demand for more extensive data collection. Recent evidence shows that trial sites view protocol complexity as a greater challenge than staff turnover or limited resources (1, 2). The Tufts Center for the Study of Drug Development (CSDD) likewise reports that oncology trials, marked by global scope, molecularly driven designs, and restrictive eligibility criteria, are especially burdensome, with more endpoints, amendments, and intensive procedures (3–5). Broader analyses confirm a decade-long increase in procedural volume and complexity, resulting in longer development timelines and operational inefficiencies (6, 7).

The operational impact of this trend is substantial. Complex designs require strong infrastructure, multidisciplinary expertise, and institutional support to avoid overwhelming staff (8). Insufficient resources and poor workload management disproportionately affect frontline professionals, including Study Nurses (SNs), Study Coordinators (SCs), and Data Managers (DMs) (9). These roles cover patient recruitment, follow-up, data quality, safety reporting, protocol compliance, and regulatory readiness, all of which are key determinants of trial feasibility (10, 11). Evidence shows that SCs act as vital links between protocols and patient care (12), while SNs ensure ethical and safe procedures, and DMs safeguard data integrity and traceability (13).

Healthcare research consistently demonstrates that increased nursing workloads are associated with burnout, reduced performance, and adverse patient outcomes (14, 15). Comparable mechanisms, such as cognitive overload, prioritization trade-offs, and reduced vigilance, are likely to operate within research settings, threatening both data integrity and protocol compliance (16, 17). Surveys of clinical research staff confirm that unbalanced workload is a significant determinant of job dissatisfaction, turnover intention, and declining site performance (10, 12, 18, 19).

To address these challenges, several workload assessment approaches have been developed to align protocol complexity with staff capacity. However, their integration with validated site performance indicators remains limited (9, 20), leaving a critical gap in optimizing workforce allocation and sustaining research quality. The Clinical Trial Site Performance Measure (CT-SPM) (21) offers an opportunity to bridge this gap by systematically linking staffing demands with measurable trial outcomes, supporting both the sustainability and the quality of clinical research delivery.

Methods

Aims

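This study aimed to examine how team composition and study design relate to perceived site performance, using the CT-SPM as the performance measure and OPAL as the complexity classifier.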

Setting

A cross-sectional study (January–March 2025) was conducted in [blinded].

Inclusion Criteria

All protocols that were actively recruiting or ongoing at the clinical trial unit during the study period were eligible.

Study Variables

Contextual descriptors

Study-level information included therapeutic area, trial phase (I–IV), translational orientation (clinical vs. translational), study design (e.g., randomized controlled, basket, platform, or cluster trial), and treatment type (single vs. multiple contact). Additional items were recorded to indicate whether special procedures (e.g., GCP training, informed consent documentation, adverse event reporting, complex data management) or centralized processes (randomization, monitoring, data management) were applied.

Clinical Trial Site Performance Measure

The CT-SPM (21) was initially conceived as a framework to identify and validate practical indicators for monitoring the operational conduct of clinical trials. Its validation reported good fit and reliability (ω = 0.65–0.82; ωh = 0.70) and sufficient discriminative accuracy (AUC = 0.628) to identify underperforming sites. In this study, however, it was employed to capture staff perceptions of site performance, thereby integrating subjective experience into a structured evaluation. To minimize single-rater bias, the questionnaire was completed jointly by all members of each trial team, so that responses reflected shared operational expertise rather than individual views.

The CT-SPM is a behaviorally anchored questionnaire with items scored on a 5-point Likert scale (1 = Not frequent to 5 = Highly frequent). The full version comprises 18 items, organized into four domains: Participant Retention and Consent (F1) – dropout, consent, withdrawal; Data Completeness and Timeliness (F2) – CRF timeliness, missing data, query management; Adverse Event Reporting (F3) – accuracy and frequency of AE/SAE documentation; Protocol Compliance (F4) – deviations, violations, adherence to eligibility and procedures.

Workload Measures

Workload was assessed using the Ontario Protocol Assessment Level (OPAL) (22). OPAL is a protocol complexity rating tool that stratifies clinical trials along an 8-level hierarchical scale (Levels 1–8), ranging from non-interventional studies to highly complex early-phase interventional trials. Optional modifiers (+0.5 each) allow further granularity by accounting for features such as translational design, CRO/industry sponsorship, intensive monitoring, or extended study duration, yielding a maximum score of 10.
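As a rough illustration of the scoring rule described above, the sketch below adds 0.5 per optional modifier to a base level of 1–8 and caps the result at 10. The function name, the modifier labels, and the input validation are illustrative assumptions and are not part of the published OPAL tool.

```python
# Hypothetical helper: names, labels, and validation are assumptions for illustration.
OPAL_MAX_SCORE = 10.0
MODIFIER_INCREMENT = 0.5


def opal_score(base_level: int, modifiers: set[str]) -> float:
    """Return the base level (1-8) plus 0.5 per optional modifier, capped at 10."""
    if not 1 <= base_level <= 8:
        raise ValueError("OPAL base level must be between 1 and 8")
    score = base_level + MODIFIER_INCREMENT * len(modifiers)
    return min(score, OPAL_MAX_SCORE)


# A Level 8 protocol with the four modifier types named in the text reaches the maximum.
all_modifiers = {
    "translational_design",
    "cro_industry_sponsorship",
    "intensive_monitoring",
    "extended_duration",
}
print(opal_score(8, all_modifiers))             # 10.0
print(opal_score(3, {"translational_design"}))  # 3.5
```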

Staffing Variables

For each clinical trial included in the analysis, we documented the number of SCs, SNs, and DMs formally assigned to protocol execution.

Data Analysis

Descriptive statistics were first computed for all study and staffing variables, summarized as means and standard deviations or as counts and percentages, as appropriate. To examine differences in site performance across study characteristics (e.g., phase, design, sponsorship), we applied non-parametric tests. Specifically, the Kruskal–Wallis test was used for multi-group comparisons, followed by Dunn’s post hoc tests with Holm correction to adjust for multiple comparisons. Effect sizes were estimated using η²[H] for omnibus tests and Cliff’s delta (δ) for pairwise contrasts, with 95% confidence intervals provided.
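For readers less familiar with these effect sizes, a minimal Python sketch of the three quantities named above follows. It is not the authors' code, which was written in R, and it omits confidence intervals. It assumes the usual definition η²[H] = (H − k + 1) / (n − k), which reproduces the phase-level values reported in the Results when k = 6 groups (implied by the 5 degrees of freedom) and n = 362. The toy inputs for Cliff's delta and the Holm adjustment are hypothetical.

```python
from itertools import product


def eta_squared_h(h_stat: float, n_groups: int, n_obs: int) -> float:
    """Eta-squared based on the Kruskal-Wallis H statistic: (H - k + 1) / (n - k)."""
    return (h_stat - n_groups + 1) / (n_obs - n_groups)


def cliffs_delta(x: list[float], y: list[float]) -> float:
    """Cliff's delta: share of pairs with x > y minus share of pairs with x < y."""
    greater = sum(1 for a, b in product(x, y) if a > b)
    less = sum(1 for a, b in product(x, y) if a < b)
    return (greater - less) / (len(x) * len(y))


def holm_adjust(p_values: list[float]) -> list[float]:
    """Holm step-down adjusted p-values, returned in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Reported phase-level omnibus test for Participant Retention: H = 50.000, df = 5.
print(round(eta_squared_h(50.0, n_groups=6, n_obs=362), 3))  # 0.126

# Hypothetical scores and p-values, for illustration only.
delta = cliffs_delta([4, 5, 5], [2, 3, 4])
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

Cliff's delta ranges from −1 to 1, so a value close to −1 (as in the RCT vs. Observational contrasts) means that almost every score in the first group lies below almost every score in the second.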

To further explore the multivariate structure underlying performance outcomes, we performed a Canonical Discriminant Analysis (CDA). Standardized canonical coefficients and group centroids were used to interpret discriminant functions and assess their contribution to distinguishing between trial phases and designs.

The association between staffing composition and site performance was investigated through univariate analyses of variance (ANOVA), with performance domains (CT-SPM factors F1–F4 and the four-item Mokken short scale) as dependent variables and the numbers of SNs, SCs, and DMs as predictors. Partial eta squared (η²p) values were reported to quantify effect sizes.
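Partial eta squared can be recovered from an F statistic as η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to three F values from Table 2. The degrees of freedom are not reported in the text, so the values used here (1 for each predictor and 357 for error, i.e. 362 trials minus four predictors and an intercept) are assumptions chosen for illustration.

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# Assumed degrees of freedom (not reported in the source).
DF_EFFECT = 1    # each staffing count entered as a single numeric predictor
DF_ERROR = 357   # 362 trials - 4 predictors - 1 intercept

# F values taken from Table 2.
examples = [
    ("Study Nurse, F1 Participant Retention", 2216.39),
    ("Study Coordinator, F1 Participant Retention", 79.99),
    ("Study Nurse, F2 Data Quality", 106.67),
]
for label, f_stat in examples:
    print(label, round(partial_eta_squared(f_stat, DF_EFFECT, DF_ERROR), 2))
# Study Nurse, F1 Participant Retention 0.86
# Study Coordinator, F1 Participant Retention 0.18
# Study Nurse, F2 Data Quality 0.23
```

Under these assumed degrees of freedom the rounded results agree with the η²p values reported in Table 2.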

Finally, we assessed the role of trial complexity (OPAL) as a covariate to test its contribution to variance in performance outcomes. All analyses were conducted using R version 4.5.0 (23), with significance set at p < 0.05 (two-tailed).

Results

A total of 362 trials were included. Most were conducted in oncology (67.68%) and enrolled < 50 subjects (96.41%). Trial characteristics are reported in Table 1.

Table 1

Characteristics of included trials
Characteristic	n (%)
Phase
Phase I	49 (13.5%)
Phase II	99 (27.3%)
Phase III	178 (49.2%)
Phase IV	12 (3.3%)
Non-Phase	22 (6.1%)
Study design
Other	86 (23.8%)
Observational	161 (44.5%)
RCT	115 (31.8%)
CRO involvement
No	13 (3.59%)
Yes	349 (96.41%)
Sponsor
No	15 (4.14%)
Yes	347 (95.86%)
Translational
No	130 (35.91%)
Yes	231 (63.81%)

Performance Indicators

Mean scores were 3.10 (SD = 0.87) for Participant Retention (F1), 3.11 (SD = 0.36) for Data Quality (F2), 2.97 (SD = 0.40) for Adverse Events (F3), and 3.05 (SD = 0.35) for Protocol Compliance (F4). The overall performance score was comparatively higher, with a mean of 3.40 (SD = 0.56).

Differences in Performance

The Kruskal–Wallis test revealed significant differences between phases for Participant Retention (F1) (χ²(5) = 50.000, p < 0.001, η²[H] = 0.126, moderate), Adverse Events (F3) (χ²(5) = 16.900, p = 0.005, η²[H] = 0.033, small), and Overall (χ²(5) = 25.400, p < 0.001, η²[H] = 0.057, small) (Fig. 1). For study design, large effect sizes were found for Participant Retention (F1) (η²[H] = 0.689) and Data Quality (F2) (η²[H] = 0.254), with moderate effects for Protocol Compliance (F4) (η²[H] = 0.095) and Overall (η²[H] = 0.100). Post hoc analyses with Holm adjustment confirmed that, within trial phase, Participant Retention (F1) scores in Phase I were significantly higher than in Phases II, III, IV, and non-Phase (all p < 0.006). Observational studies scored considerably higher than other study types in Participant Retention (F1) and Data Quality (F2). Large effects were observed in key contrasts, such as Phase I vs. Not applicable for Participant Retention (F1) (δ = 0.838, 95% CI [0.603, 0.939]) and RCT vs. Observational (δ = −0.951, 95% CI [−0.981, −0.876]).

Differences by study design were more pronounced. Observational studies achieved the highest Retention and Data Quality, clearly outperforming RCTs. In Retention, RCTs scored significantly lower than Observational studies, with a large effect (δ = −0.867, 95% CI [−0.918, −0.788], p < 0.001), and lower than other studies as well (δ = −0.558, 95% CI [−0.680, −0.406], p < 0.001). Data Quality was also higher in Observational studies than in RCTs, with a medium effect (δ = −0.400, 95% CI [−0.515, −0.272], p < 0.001). Protocol Compliance favored Observational studies over RCTs with a small effect (δ = −0.243, 95% CI [−0.362, −0.117], p < 0.001). The overall performance index was also higher in Observational studies compared to RCTs, with a medium effect (δ = −0.382, 95% CI [−0.498, −0.252], p < 0.001). For AEs, no significant difference was found between RCTs and Observational studies (δ = 0.000, 95% CI [−0.137, 0.137], p = 0.999), while RCTs reported slightly more events than other studies, with a small effect (δ = 0.201, 95% CI [0.046, 0.347], p = 0.037).

Canonical Discriminant Analysis

The canonical discriminant analysis yielded two significant dimensions. The first canonical variate (Can1), which explains 65.9% of the variance, was strongly defined by participant retention (standardized coefficient = 1.17) and, to a lesser extent, adverse events (0.44). This function clearly distinguished Phase I trials (M = 1.09) from all later phases, which showed substantially lower centroids. The second canonical variate (Can2), accounting for an additional 30.2% of the variance, was characterized by positive loadings on data quality (0.51) and negative loadings on adverse events (−0.68). This function separated Phase IV (M = −1.11) and non-classifiable trials (M = −0.74) from Phases II and III, which clustered near the positive end of the axis. Together, the two canonical functions captured over 96% of discriminative variance, underscoring the robustness of phase-related differences across performance outcomes.

Staffing impact on outcomes

The effect size estimates indicated that the number of SNs was very strongly associated with trial performance, particularly participant retention (η²p = 0.86, large), data quality (η²p = 0.23, large), and protocol compliance (η²p = 0.07, medium), as shown in Table 2. SCs also showed meaningful contributions, with small-to-large effects across multiple outcomes, including participant retention (η²p = 0.18), data quality (η²p = 0.08), and adverse event reporting (η²p = 0.03). In contrast, DMs had only minor effects, limited mainly to adverse event outcomes (η²p = 0.02). Trial complexity did not explain meaningful variance in any domain, with consistently negligible effect sizes (all η²p < 0.01).

Table 2

Staffing predictors of performance outcomes in clinical trials
Predictor	F1 – Participant Retention F (p) [η²p]	F2 – Data Quality F (p) [η²p]	F3 – Adverse Events F (p) [η²p]	F4 – Protocol Compliance F (p) [η²p]	Mokken Short F (p) [η²p]
Study Nurse	2216.39 (< 0.001) *** [η²p = 0.86]	106.67 (< 0.001) *** [η²p = 0.23]	2.44 (0.119) [η²p = 0.01]	28.78 (< 0.001) *** [η²p = 0.07]	60.74 (< 0.001) *** [η²p = 0.15]
Study Coordinator	79.99 (< 0.001) *** [η²p = 0.18]	30.49 (< 0.001) *** [η²p = 0.08]	9.42 (0.002) ** [η²p = 0.03]	3.62 (0.058) [η²p = 0.01]	19.98 (< 0.001) *** [η²p = 0.05]
Data Manager	0.74 (0.389) [η²p < 0.01]	3.21 (0.074) [η²p = 0.01]	5.69 (0.018) * [η²p = 0.02]	0.08 (0.777) [η²p < 0.01]	0.49 (0.484) [η²p < 0.01]
Trial Complexity	1.55 (0.214) [η²p < 0.01]	1.68 (0.196) [η²p < 0.01]	0.01 (0.914) [η²p < 0.01]	0.46 (0.499) [η²p < 0.01]	0.17 (0.678) [η²p < 0.01]
Notes: Bold values are statistically significant. Significance codes: *** p < 0.001; ** p < 0.01; * p < 0.05

Discussion

This study examined how clinical trial staff perceive the impact of team composition and study design on site-level operational performance. Using a validated, behaviorally anchored instrument and a structured complexity classifier, we triangulated staffing patterns with domain-specific performance outcomes across a heterogeneous set of protocols. Taken together, the findings argue for a dual, complementary approach to trial operations: plan capacity proportionate to protocol demands (via OPAL) and verify performance through recurrent, domain-specific signals (via CT-SPM).

Across CT-SPM domains, mean values generally indicated good performance, with overall scores exceeding those of individual domains, consistent with teams synthesizing multiple practices into a positive overall appraisal. However, between-group contrasts were informative. Differences by phase showed that early-phase studies clustered with stronger participant-facing performance (Retention), while design contrasts were more pronounced: observational studies outperformed RCTs on Retention and Data Quality, with medium-to-large effects, and showed modest advantages on Protocol Compliance; AE reporting differences were minimal between observational studies and RCTs. This pattern aligns with operational realities: RCTs concentrate risk at the intersection of protocol intensity, endpoint burden, and multi-site coordination, which can compromise timeliness and increase queries, even when scientific rigor is higher. Observational designs, in contrast, involve fewer invasive procedures and narrower safety constraints, allowing teams to preserve data completeness and timeliness, as well as patient workflow continuity.

The canonical discriminant analysis adds a multivariate perspective: a first function dominated by Retention (with some AE contribution) separates early-phase studies from later phases, while a second function contrasting Data Quality (positive) against AEs (negative) discriminates Phase IV and “Non-Phase” from Phases II–III. This suggests that perceived performance coalesces around two recognizable planes (participant-facing and data-facing), which is consistent with the hierarchical structure supported by CT-SPM psychometrics (F1–F4 nested within higher-order dimensions). From a Quality-by-Design perspective, these planes are actionable: if a protocol’s features are expected to stress participant interfaces (e.g., frequent safety assessments), staffing and processes should prioritize protecting Retention and AE handling; if the design elevates endpoint density and documentation, attention should shift to data-facing workflows.

The staffing analysis points to a clear, modifiable driver: the number of SNs was strongly associated with higher Retention, Data Quality, and Protocol Compliance (very large to medium effects), and SCs contributed across outcomes, including AEs; DMs showed smaller, domain-specific effects. Two interpretations are plausible and not mutually exclusive.

First, SNs occupy the clinical–operational bridge, where many preventable defects originate, such as late or missed visits, consent and re-consent issues, pre-visit preparation, bedside clarification of procedures, and immediate reconciliation of safety information. By stabilizing the patient path and closing loops upstream, SNs reduce downstream burdens (queries, missingness, deviations). SCs, in turn, orchestrate logistics, calendars, and stakeholder interfaces—work that carries significant leverage on timeliness and compliance. DMs safeguard integrity and traceability, but much of the variance captured by CT-SPM domains is determined before data reach DM workflows; this may attenuate their apparent effect in perception-based models. Second, counts (rather than FTE-normalized effort or seniority) may underestimate DM contributions where one experienced DM covers multiple protocols efficiently; conversely, SN and SC effects may scale more linearly with headcount at the protocol level. Regardless, the signal is operationally valuable: the mix matters at least as much as the absolute number of staff.

Notably, OPAL-defined complexity did not explain meaningful variance in perceived performance once staffing was considered. This is not contradictory; it reflects construct divergence. OPAL encodes the expected workload/complexity from protocol features, while CT-SPM captures the realized practices as perceived by teams. A high-maturity unit with adequate SN/SC coverage and tight workflows can absorb complexity without perceiving a decline in day-to-day performance. Range restriction may also play a role: the sample was heavily oncology-weighted, with near-universal CRO involvement, which compresses variability at the higher end of complexity (18). In short, OPAL still does what it is meant to do—quantify demand—but perceived performance depends on whether capacity and process are tuned to that demand. This reinforces a plan-and-prove model: use OPAL upstream to argue for resources; use CT-SPM downstream to verify that practice patterns remain robust under the realized load.

CT-SPM is intentionally perception-based, completed jointly by the entire team to minimize single-rater bias and capture shared operational experience. This approach harnesses tacit knowledge that rarely appears in hard KPIs but predicts where defects tend to accumulate. Its validated structure (four domains nested within two higher-order dimensions) and the presence of a scalable short form enable both depth (domain diagnostics) and breadth (lightweight screening) in routine oversight. Still, perception brings limits: shared biases, optimism/pessimism, and context effects may influence ratings, and some domains (e.g., Protocol Compliance) showed moderate discriminative accuracy, suggesting benefits from triangulation with objective, automated KPIs (e.g., CRF timeliness from EDC, query burden per subject-visit, adjudicated deviation counts).

Staff to the risk profile, not just the headcount. Where Retention lags or AE handling is fragile, grow SN capacity and standardize bedside workflows (visit scripts, re-consent triggers, pre-visit checklists). Where timeliness/missingness drive risk, strengthen SC-led logistics (calendar control, source prep, pre-query huddles) and pre-source checks before CRF entry; DMs should focus on traceability and reconciliation protocols for SAE data and primary outcomes. Use CT-SPM subscales to target the lever with the highest marginal return.

Use OPAL at feasibility, CT-SPM in conduct. Integrate OPAL into start-up checklists to make resource requests explicit and defensible; then embed CT-SPM (full or short form) in periodic reviews to confirm that practices occur with the required frequency. This sequencing operationalizes QbD and RBM—specifying risks ex-ante and monitoring the right behaviors in-process.

Prefer domain-aware dashboards. Visualize Retention, Data Quality, AE Reporting, and Protocol Compliance separately. Mixing them into a single index obscures the trade-offs inherent in different designs: RCTs will often carry data-facing stress; observational studies may excel in this regard but require vigilance on consent continuity when recruitment is diffuse. The bifactor architecture of CT-SPM supports exactly this domain-aware view.

Make the short form your “tripwire.” The four-item Mokken scale (queries, SAE accuracy, outcome-data queries, protocol violations) is well suited as a low-burden trigger in central monitoring to flag sites for focused review, especially between visits.

Strengths and limitations

A key strength is the integration of a validated perception instrument with a logically coherent complexity classifier, enabling separation of demand from performance. The team-consensus completion further mitigates idiosyncratic bias and aligns with how site operations are delivered—collectively, not by isolated roles. Finally, the domain-specific analysis respects real operational trade-offs across trial genres.

Limitations merit caution. First, the cross-sectional design prevents causal inference: better staffing may drive better performance, but high-performing units could also attract resources. Second, staffing was captured as counts, not FTE-normalized effort or seniority; future work should account for role expertise and turnover. Third, the sample’s oncology predominance and high CRO involvement may compress OPAL variability and limit generalizability beyond similar environments. Fourth, because CT-SPM is perception-based, common-method variance is possible; triangulation with objective KPIs is needed to solidify construct validity and to calibrate context-sensitive thresholds (e.g., via control charts over time).
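The control-chart idea mentioned above could be sketched, under stated assumptions, as a Shewhart individuals (XmR) chart applied to a periodically collected KPI or CT-SPM domain score. The choice of chart type, the function names, and all scores in the example are hypothetical and serve only to show how a context-sensitive threshold might be derived from a site's own baseline.

```python
from statistics import mean


def individuals_chart_limits(values: list[float]) -> tuple[float, float, float]:
    """Return (lower, center, upper) limits for a Shewhart individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two baseline observations")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two consecutive points.
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(new_values: list[float], baseline: list[float]) -> list[int]:
    """Indices of new observations falling outside the baseline control limits."""
    lower, _, upper = individuals_chart_limits(baseline)
    return [i for i, v in enumerate(new_values) if v < lower or v > upper]


# Hypothetical monthly Data Quality (F2) scores for one site.
baseline_scores = [3.1, 3.0, 3.2, 3.1, 3.0, 3.2]
follow_up_scores = [3.1, 2.6, 3.3]

lower, center, upper = individuals_chart_limits(baseline_scores)
flagged = out_of_control(follow_up_scores, baseline_scores)
print(flagged)  # [1]
```

In this toy series only the second follow-up score (2.6) falls below the lower limit of roughly 2.73, so it would be the one flagged for review.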

Future directions

Future research should expand beyond cross-sectional perceptions to establish causal and longitudinal evidence linking staffing configurations, workload dynamics, and site performance. Multi-center, mixed-methods studies integrating objective metrics (e.g., CRF timeliness, deviation rates, AE reconciliation time) with perception-based measures such as the CT-SPM would enhance construct validity and help delineate the pathways through which team composition affects data quality and patient-centered outcomes. Experimental or quasi-experimental designs could test targeted staffing interventions (for instance, increasing SN or SC coverage) while monitoring their downstream effects on recruitment, retention, and regulatory compliance (24, 25).

Additionally, psychometric refinement of the CT-SPM should continue, including cross-cultural validation and the establishment of performance benchmarks stratified by therapeutic area and trial phase. Integrating the CT-SPM within centralized monitoring systems or electronic dashboards would operationalize real-time oversight, enabling early detection of risk signals and data drift. Future iterations of the OPAL–CT-SPM integration could also incorporate machine learning models to predict staffing needs or performance decline based on protocol complexity and operational history.

Finally, workforce sustainability should become a core endpoint. Understanding how workload equity, training, and professional recognition influence retention and well-being among SNs, SCs, and DMs will be critical to maintaining quality and continuity in an increasingly complex research environment.

Conclusion

Perceived performance in clinical trials is not solely determined by complexity. Instead, how teams are staffed—especially the clinical–operational bridge provided by SNs and the coordinating leverage of SCs—appears pivotal in translating protocol demands into consistent, compliant, and timely operations. CT-SPM offers a structured lens on these behaviors, while OPAL ensures that resourcing remains proportionate to the task. Using both in tandem supports a plan-and-prove paradigm: plan staffing and oversight based on expected complexity and prove, continuously, that essential practices occur with the frequency needed to protect participant welfare and internal validity.

Declarations

Ethics approval and consent to participate

This study did not involve human participants or patient-level data. It relied exclusively on anonymized, protocol-level operational information collected during routine trial management activities. According to institutional policy, ethical approval was waived. No consent to participate was required.

Consent for publication

Not applicable.

Data Availability

The dataset supporting the conclusions of this article is available from the corresponding author upon reasonable request.

Competing interests

The authors declare that they have no competing interests.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author Contribution

DN conceptualized the study, designed the methodology, performed the data analysis, and drafted the manuscript.

VM coordinated data collection, curated trial-level information, and contributed to interpretation.

MB contributed to statistical analysis, methodological refinement, and manuscript revision.

AG provided supervision, clinical and methodological oversight, and critical revision of intellectual content.

All authors read and approved the final manuscript.

Acknowledgement

The authors thank Fondazione Roma for its continued support of scientific research.

References

Hardman TC, Aitchison R, Scaife R, Edwards J, Slater G. The future of clinical trials and drug development: 2050. Drugs Context. 8 giugno 2023;12:2023-2-2.

Markey N, Howitt B, El-Mansouri I, Schwartzenberg C, Kotova O, Meier C. Clinical trials are becoming more complex: a machine learning analysis of data from over 16,000 trials. Sci Rep. 12 febbraio 2024;14:3514.

Botto E, Smith Z, Getz K. New Benchmarks on Protocol Amendment Experience in Oncology Clinical Trials. Ther Innov Regul Sci. luglio 2024;58(4):645–54.

Getz K, Smith Z, Kravet M. Protocol Design and Performance Benchmarks by Phase and by Oncology and Rare Disease Subgroups. Ther Innov Regul Sci. gennaio 2023;57(1):49–56.

Getz KA, Stergiopoulos S, Short M, Surgeon L, Krauss R, Pretorius S, et al. The Impact of Protocol Amendments on Clinical Trial Performance and Cost. Ther Innov Regul Sci. luglio 2016;50(4):436–41.

Getz KA, Campo RA. Trends in clinical trial design complexity. Nature Reviews Drug Discovery. 1 maggio 2017;16(5):307–307.

Yorke-Edwards V, Diaz-Montana C, Murray ML, Sydes MR, Love SB. Monitoring metrics over time: Why clinical trialists need to systematically collect site performance metrics. Res Methods Med Health Sci. settembre 2023;4(4):124–35.

Snowdon C, Kernaghan S, Moretti L, Turner NC, Ring A, Wilkinson K, et al. Operational complexity versus design efficiency: challenges of implementing a phase IIa multiple parallel cohort targeted treatment platform trial in advanced breast cancer. Trials. 7 maggio 2022;23(1):372.

Bozzetti M, Soncini S, Bassi MC, Guberti M. Assessment of Nursing Workload and Complexity Associated with Oncology Clinical Trials: A Scoping Review. Semin Oncol Nurs. ottobre 2024;40(5):151711.

10.

Durden K, Hurley P, Butler DL, Farner A, Shriver SP, Fleury ME. Provider motivations and barriers to cancer clinical trial screening, referral, and operations: Findings from a survey. Cancer. 2024;130(1):68–76.

11.

Jones CT, Griffith CA, Fisher CA, Grinke KA, Keller R, Lee H, et al. Nurses in clinical trials: perceptions of impact on the research enterprise. J Res Nurs. marzo 2022;27(1–2):50–65.

12.

Rico-Villademoros F, Hernando T, Sanz JL, López-Alonso A, Salamanca O, Camps C, et al. The role of the clinical research coordinator–data manager–in oncology clinical trials. BMC Med Res Methodol. 25 marzo 2004;4:6.

13.

Bozzetti M, Guberti M, Lo Cascio A, Privitera D, Genna C, Rodelli S, et al. Uncovering the Professional Landscape of Clinical Research Nursing: A Scoping Review with Data Mining Approach. Nursing Reports [Internet]. agosto 2025 [citato 24 luglio 2025]; Disponibile su: https://www.mdpi.com/2039-4403/15/8/266

14.

Ball JE, Bruyneel L, Aiken LH, Sermeus W, Sloane DM, Rafferty AM, et al. Post-operative mortality, missed care and nurse staffing in nine countries: A cross-sectional study. Int J Nurs Stud. February 2018;78:10–5.

15.

Dall’Ora C, Saville C, Rubbo B, Turner L, Jones J, Griffiths P. Nurse staffing levels and patient outcomes: A systematic review of longitudinal studies. Int J Nurs Stud. October 2022;134:104311.

16.

Asgari E, Kaur J, Nuredini G, Balloch J, Taylor AM, Sebire N, et al. Impact of Electronic Health Record Use on Cognitive Load and Burnout Among Clinicians: Narrative Review. JMIR Med Inform. 12 April 2024;12:e55499.

17.

Schäfer H, Lajmi N, Valente P, Pedrioli A, Cigoianu D, Hoehne B, et al. The Value of Clinical Decision Support in Healthcare: A Focus on Screening and Early Detection. Diagnostics (Basel). 6 March 2025;15(5):648.

18.

Yedro S, Tinari E, Napolitano D, Wlderk G, Ribaudi E, Giannone L, et al. DISYNCRO: Perceived roles of clinical study coordinators and data managers: results from a web-based survey of professionals from contract research organizations. Contemp Clin Trials Commun. October 2025;47:101533.

19.

Napolitano D, Amato S, Creta E, Profeta F, Foscarini E, Ribaudi E, et al. The roles and professional competencies of clinical study coordinators and data managers in clinical trials: A systematic review. Clin Trials. 20 November 2025;17407745251387952.

20.

Tyson K, Harvey J, Forney L, Brinton D. Resource management and capacity planning for clinical trial sites. J Clin Transl Res. 2024;10(4):229–36.

21.

Bozzetti M, Lo Cascio A, Napolitano D, Orgiana N, Mora V, Fiorini S, et al. Measuring What Matters in Trial Operations: Development and Validation of the Clinical Trial Site Performance Measure. J Clin Med. 26 September 2025;14(19):6839.

22.

Smuck B, Bettello P, Berghout K, Hanna T, Kowaleski B, Phippard L, et al. Ontario protocol assessment level: clinical trial complexity rating tool for workload planning in oncology clinical trials. J Oncol Pract. March 2011;7(2):80–4.

23.

R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2023. Available from: https://www.R-project.org/

24.

Mora V, Colantuono S, Fanali C, Leonetti A, Wlderk G, Pirro MA, et al. Clinical research coordinators: Key components of an efficient clinical trial unit. Contemp Clin Trials Commun. April 2023;32:101057.

25.

Napolitano D, Lo Cascio A, Bozzetti M, Guberti M. Implementing research, improving practice: synergizing the clinical research nurse and the nurse researcher. Minerva Gastroenterol [Internet]. July 2025 [cited 7 July 2025]; Available from: https://www.minervamedica.it/index2.php?show=R08Y9999N00A25070301

Figure legend

Fig. 1

Performance Indicators Across Trial Phases and Study Designs

Violin plots of performance indicators across trial phases (top row) and study designs (bottom row). Horizontal bars denote significant pairwise differences based on post hoc Dunn tests with Holm correction. Significance codes: *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.

Table 2. Staffing predictors of performance outcomes in clinical trials

Predictor	F1 – Participant Retention F (p) [η²p]	F2 – Data Quality F (p) [η²p]	F3 – Adverse Events F (p) [η²p]	F4 – Protocol Compliance F (p) [η²p]	Mokken Short F (p) [η²p]
Study Nurse	2216.39 (< 0.001) *** [η²p = 0.86]	106.67 (< 0.001) *** [η²p = 0.23]	2.44 (0.119) [η²p = 0.01]	28.78 (< 0.001) *** [η²p = 0.07]	60.74 (< 0.001) *** [η²p = 0.15]
Study Coordinator	79.99 (< 0.001) *** [η²p = 0.18]	30.49 (< 0.001) *** [η²p = 0.08]	9.42 (0.002) ** [η²p = 0.03]	3.62 (0.058) [η²p = 0.01]	19.98 (< 0.001) *** [η²p = 0.05]
Data Manager	0.74 (0.389) [η²p < 0.01]	3.21 (0.074) [η²p = 0.01]	5.69 (0.018) * [η²p = 0.02]	0.08 (0.777) [η²p < 0.01]	0.49 (0.484) [η²p < 0.01]
Trial Complexity	1.55 (0.214) [η²p < 0.01]	1.68 (0.196) [η²p < 0.01]	0.03 (0.914) [η²p < 0.01]	0.46 (0.499) [η²p < 0.01]	0.17 (0.678) [η²p < 0.01]

Notes: Bold values are statistically significant. Significance codes: *** p < 0.001; ** p < 0.01; * p < 0.05.
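
As a reading aid for Table 2, partial eta squared can be recovered from an F statistic through the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the Study Nurse row. The degrees of freedom are not reported in the table: df_effect = 1 and df_error = 357 (362 trials minus four predictors and an intercept) are assumptions made only for illustration, and the function name is hypothetical. It is not the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# ASSUMPTION: the degrees of freedom below are not reported in Table 2.
# They are illustrative values (one df per predictor, 362 - 5 error df).
DF_EFFECT = 1
DF_ERROR = 357

# F values copied from the Study Nurse row of Table 2.
study_nurse_f = {
    "F1 - Participant Retention": 2216.39,
    "F2 - Data Quality": 106.67,
    "F3 - Adverse Events": 2.44,
    "F4 - Protocol Compliance": 28.78,
    "Mokken Short": 60.74,
}

implied = {
    outcome: round(partial_eta_squared(f, DF_EFFECT, DF_ERROR), 2)
    for outcome, f in study_nurse_f.items()
}

for outcome, eta in implied.items():
    print(f"{outcome}: implied partial eta squared = {eta:.2f}")
```

Under these assumed degrees of freedom, the implied values agree with the tabulated effect sizes for the Study Nurse row to two decimals, which also shows why an F of 2216.39 corresponds to an effect size as large as 0.86.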

Table 1. Characteristics of included trials

Characteristic	n (%)
Phase
Phase I	49 (13.5%)
Phase II	99 (27.3%)
Phase III	178 (49.2%)
Phase IV	12 (3.3%)
Non Phase	22 (6.1%)
Study design
Other	86 (23.8%)
Observational	161 (44.5%)
RCT	115 (31.8%)
CRO Involvement
No	13 (3.59%)
Yes	349 (96.41%)
Sponsor
No	15 (4.14%)
Yes	347 (95.86%)
Translational
No	130 (35.91%)
Yes	231 (63.81%)
