Title: κ = 0.97: A Practical Framework Any Hospital Can Implement for Research-Grade Data Quality
Running Title: Multi-Source Validation Framework κ = 0.97
Authors
Denisse Martínez-Ríos¹*, Juan Carlos Moreno-Rojas¹, Adrián Martínez-Ríos², Carlos Eduardo Lulé-Martínez¹, Guillermo Díaz-Terán Aguilera¹, Dámaso Hernández-López¹
Affiliations
¹ Department of Angiology and Vascular Surgery, Hospital Regional General Ignacio Zaragoza, Instituto de Seguridad y Servicios Sociales de los Trabajadores del Estado (ISSSTE), Mexico City, Mexico
² Hospital Regional Presidente Juárez, Instituto de Seguridad y Servicios Sociales de los Trabajadores del Estado (ISSSTE), Oaxaca City, Mexico
Corresponding Author
*Denisse Martínez-Ríos, MD
Department of Angiology and Vascular Surgery
Hospital Regional General Ignacio Zaragoza, ISSSTE, Calzada Ignacio Zaragoza 1711, Col. Ejército Constitucionalista, Iztapalapa, Mexico City, 09220, Mexico
Email: denissemarrios1@gmail.com
ORCID: 0009-0000-6108-8868
Word Count: Abstract: 234 words | Main Text: 3,285 words | Tables: 3 | Figures: 3
Author ORCIDs
Denisse Martínez-Ríos: 0009-0000-6108-8868
Juan Carlos Moreno-Rojas: 0009-0008-3863-649X
Adrián Martínez-Ríos: 0009-0006-7604-3893
Carlos Eduardo Lulé-Martínez: 0009-0009-0488-3772
Guillermo Díaz-Terán Aguilera: 0009-0000-3774-4175
Dámaso Hernández-López: 0009-0005-1948-3860
ABSTRACT
Background
Medical device registries in transitional healthcare systems face substantial challenges achieving reliable data extraction due to documentation fragmentation and absence of integrated electronic health records. Conventional single-source or dual-source extraction methods demonstrate moderate inter-rater reliability (κ = 0.50–0.70), limiting research validity. We hypothesised that systematic multi-source data triangulation could achieve near-perfect reliability comparable to advanced registry systems whilst providing a replicable framework for resource-limited settings.
Methods
We conducted a retrospective methodological validation study of 176 rotational atherectomy procedures performed between January 2020 and December 2023 at Hospital Regional General Ignacio Zaragoza, Mexico. Three independent, blinded investigators extracted device-specific data from thirteen documentary sources organised into six validation domains. Inter-rater reliability was assessed using Cohen's κ with 95% confidence intervals. External validation was performed by an investigator from a geographically separate institution.
Results
The thirteen-source framework achieved overall inter-rater reliability of κ = 0.97 (95% CI: 0.94–0.99), with 96.6% concordance across three extractors. External validation demonstrated κ = 0.94, confirming reproducibility without subspecialty expertise. Complete device identification was achieved in 100% of procedures. Comparative bootstrap analysis revealed 49% improvement over single-source extraction (κ = 0.65, p < 0.001) and 24% improvement over dual-source methods (κ = 0.78, p < 0.001).
Conclusions
Systematic multi-source data triangulation enables transitional healthcare systems to achieve research-grade inter-rater reliability exceeding advanced registry benchmarks. Documentation multiplicity, when leveraged through structured protocols, transforms from methodological limitation to asset.
Trial registration
Not applicable. This study does not involve a clinical trial.
Keywords:
Inter-rater reliability
Data quality
Medical device research
Multi-source validation
Documentation triangulation
Healthcare registries
Transitional health systems
Cohen's kappa
BACKGROUND
Medical device registries serve as cornerstone infrastructure for post-market surveillance, comparative effectiveness research, and healthcare quality assessment. Conventional data extraction approaches in resource-limited settings demonstrate moderate inter-rater reliability coefficients, typically between κ = 0.50 and 0.70 [1, 2], constraining research validity and limiting generalisability of device effectiveness evidence to real-world populations [3].
The prevailing paradigm assumes that documentation fragmentation represents an insurmountable methodological limitation, with advanced digital infrastructure positioned as prerequisite for rigorous medical device research. This assumption perpetuates research inequities, as resource-limited settings struggle to generate high-quality evidence despite serving patient populations with distinct disease patterns [4, 5]. Moreover, the assumption overlooks a fundamental paradox: documentation multiplicity, whilst introducing complexity, simultaneously creates opportunities for cross-validation and triangulation that single integrated systems cannot provide [6, 7].
We hypothesised that systematic multi-source data triangulation, when implemented through structured protocols, could transform documentation fragmentation from methodological limitation to asset. This study presents the development and validation of a thirteen-source data extraction framework implemented at a public tertiary hospital in Mexico City, with the primary objective of quantifying inter-rater reliability for device-specific data extraction across three independent investigators.
METHODS
Study Design and Setting
We conducted a retrospective methodological validation study at Hospital Regional General Ignacio Zaragoza, a 273-bed public tertiary referral centre affiliated with ISSSTE in Mexico City. The institutional information architecture reflects characteristics typical of transitional healthcare systems: partial electronic health record implementation, paper-based surgical logbooks and anaesthesia flowsheets, hybrid administrative coding systems, and independent supply chain documentation maintained across clinical, pharmacy, and procurement departments [11–13].
Source Population and Procedures
We identified all consecutive rotational atherectomy procedures performed for femoropopliteal peripheral arterial disease between 1 January 2020 and 31 December 2023 through cross-referencing three independent databases.
Inclusion criteria specified: (1) use of rotational atherectomy systems (Jetstream™ [Boston Scientific] or Phoenix™ [Medtronic/Medstent]); (2) treatment of native femoropopliteal lesions; (3) complete procedural documentation available across all thirteen source domains; and (4) minimum 12-month follow-up completed or censored.
Exclusion criteria comprised:
(1) in-stent restenosis treatment (n = 8); (2) concurrent acute limb ischaemia with thrombectomy (n = 4); (3) incomplete source documentation (n = 2); and (4) duplicate procedural entries (n = 1). Of 191 initially identified procedures, 176 met inclusion criteria and comprised the final analytical cohort representing 168 unique patients.
Thirteen-Source Documentation Framework
We systematically categorised all available documentary sources into six validation domains based on data generation mechanisms and independence characteristics (Table 1). The framework integrated thirteen distinct source types:
Table 1. The Thirteen-Source Multi-Level Validation Framework

| Source # | Source Name | Format | Generation Timing | Primary Personnel | Independence Level |
|---|---|---|---|---|---|
| 1 | Operating room logbook | Paper | Intraoperative | Circulating nurses | High |
| 2 | Post-operative notes | Digital (VitalMex) | < 5 min post-procedure | Primary operator | Medium |
| 3 | Anaesthesia flowsheets | Paper | Intraoperative | Anaesthesiologist | High |
| 4 | Operative reports | Digital (SIME) | Same day | Primary operator | Medium |
| 5 | SIMEH diagnosis codes | Digital | 5–10 days post-discharge | HIM coders | Very High |
| 6 | Material requisition codes | Digital | 5–10 days post-discharge | HIM coders | Very High |
| 7 | ICD-10-CM procedure codes | Digital | 5–10 days post-discharge | HIM coders | Very High |
| 8 | PACS metadata + local backup | Digital | Intraoperative (automated) | Fluoroscopy system | Very High |
| 9 | Angiographic annotations | Digital | Intraoperative | Radiology technologists | High |
| 10 | Pharmacy dispensing logs | Digital | Pre-procedure | Pharmacy personnel | Very High |
| 11 | Manufacturer technical bulletins | PDF (external) | Pre-market | Manufacturer | Absolute |
| 12 | Lot traceability records | Spreadsheet (external) | Monthly | Manufacturer | Absolute |
| 13 | Procurement archives | Paper | Monthly reconciliation | Departmental secretary | Very High |

Independence level: Absolute = completely external; Very High = different department, timing, motivation; High = different personnel, similar timing; Medium = same personnel, different template.
Prospective Clinical Domain (n = 4 sources):
1. Operating room logbook (handwritten, maintained by circulating nurse)
2. Post-operative notes (physician-completed, documented within 5 minutes per institutional protocol, digitised in VitalMex system requiring same-day folio closure)
3. Anaesthesia flowsheets (anaesthesiologist-completed, paper-based, documenting anaesthetic technique and intraoperative haemodynamic parameters including blood pressure values critical for vascular procedure monitoring)
4. Operative reports (surgeon-completed, digitised since 2021, completed same day)

Administrative Coding Domain (n = 3 sources):
5. SIMEH diagnosis codes (completed 5–10 days post-discharge by medical records coders)
6. Material requisition coding (procedure-specific classification)
7. ICD-10-CM procedure coding

Digital Imaging Domain (n = 2 sources):
8. PACS metadata with local hard drive backup (implemented following historical data loss by a previous external vendor contracted by ISSSTE, with departmental maintenance of duplicate archives ensuring data preservation)
9. Angiographic image annotations (embedded text descriptors)

Pharmacy/Supply Domain (n = 1 source):
10. Dispensing logs (pharmacy-maintained, time-stamped, pre-procedure)

Manufacturer Traceability Domain (n = 2 sources):
11. Device technical bulletins (manufacturer-provided specifications)
12. Lot traceability records (unique device identifiers when available)

Institutional Procurement Domain (n = 1 source):
13. Central warehouse procurement archives (monthly reconciliation by departmental secretary, archived in cardboard boxes as duplicates of original files maintained at ISSSTE General Direction)
Each source type was evaluated for three independence criteria: (1) data generation by different personnel; (2) documentation occurring at temporally distinct points in the clinical workflow; and (3) absence of shared data entry interfaces [1, 2] (Fig. 1).
Fig. 1
CONSORT flow diagram showing participant selection and data validation process.
[INSERT FIGURE 1 HERE]
Data Extraction Protocol
Three investigators (DMR, JCMR, AMR) independently extracted device-specific data from all 176 procedures across thirteen sources using standardised data collection forms implemented in REDCap [16, 17]. Extracted variables comprised: device manufacturer (Boston Scientific versus Medtronic/Medstent), specific system model (Jetstream versus Phoenix), and catheter calibre (ranging from 1.6 mm to 3.4 mm). Each investigator completed extractions in randomised source order generated using R statistical software [18].
Investigators remained blinded to extractions performed by other team members throughout the data collection phase (January–March 2024).
AMR, serving as external validator, was based at Hospital Regional Presidente Juárez, Oaxaca City (geographically separate ISSSTE institution) and lacked subspecialty training in vascular surgery, thereby simulating validation scenarios in resource-limited settings where subspecialist expertise may be unavailable [7].
Statistical Analysis
Primary Outcome: Inter-rater reliability was quantified using Cohen's κ coefficient with 95% confidence intervals calculated via bootstrap resampling (10,000 iterations) [7–9]. Standard interpretation thresholds applied: κ < 0.00 (no agreement), 0.00–0.20 (slight), 0.21–0.40 (fair), 0.41–0.60 (moderate), 0.61–0.80 (substantial), 0.81–1.00 (almost perfect) [7, 8]. We considered κ ≥ 0.90 the threshold for "near-perfect" reliability suitable for research-grade data quality.
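The reliability computation above can be sketched in a few lines. The study performed its analyses in R; the following Python illustration uses synthetic rater labels (not study data) to show Cohen's κ and a percentile bootstrap confidence interval of the kind described:

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' categorical labels on the same items."""
    a, b = np.asarray(a), np.asarray(b)
    po = np.mean(a == b)                            # observed agreement
    pe = sum(np.mean(a == c) * np.mean(b == c)      # chance-expected agreement
             for c in np.union1d(a, b))
    return (po - pe) / (1 - pe)

def bootstrap_ci(a, b, iters=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for kappa, resampling items with replacement."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a), np.asarray(b)
    n = len(a)
    stats = [cohens_kappa(a[idx], b[idx])
             for idx in (rng.integers(0, n, n) for _ in range(iters))]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

# Synthetic example: two raters classify 200 items into 3 device categories,
# disagreeing on roughly 5% of items.
rng = np.random.default_rng(1)
truth = rng.integers(0, 3, 200)
r1 = truth.copy()
r2 = truth.copy()
flip = rng.random(200) < 0.05
r2[flip] = (r2[flip] + 1) % 3                       # force a disagreement
k = cohens_kappa(r1, r2)
lo, hi = bootstrap_ci(r1, r2, iters=2000)
```

The paper reports pairwise κ for its three extractors; this two-rater function covers each pairwise comparison, and the percentile bootstrap mirrors the 10,000-iteration approach described above.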
Sample Size Justification
To detect κ ≥ 0.90 with 80% power, assuming a null hypothesis of κ = 0.70 (typical for dual-source methods), α = 0.05 (two-tailed), and three raters, we required a minimum of 154 procedures [10]. Our cohort of 176 procedures provided 92% power for the primary comparison.
Secondary Analyses
We simulated single-source and dual-source extraction scenarios by randomly sampling subsets of our thirteen-source dataset. We systematically excluded individual sources and recalculated overall κ to assess marginal contributions. All data points with non-unanimous investigator agreement underwent structured adjudication following pre-specified hierarchical protocol.
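The source-exclusion analysis can likewise be sketched. The modelling assumptions here are mine, not the paper's: each rater's final label for an item is taken as a majority vote over the available sources, and each source's contribution is the full-framework κ minus the κ recomputed with that source removed (hypothetical helper names, synthetic data):

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' categorical labels."""
    a, b = np.asarray(a), np.asarray(b)
    po = np.mean(a == b)
    pe = sum(np.mean(a == c) * np.mean(b == c) for c in np.union1d(a, b))
    return (po - pe) / (1 - pe)

def consensus(labels_by_source):
    """Majority vote across sources for each item (ties go to the lowest label)."""
    arr = np.asarray(labels_by_source)          # shape: (n_sources, n_items)
    voted = []
    for col in arr.T:
        vals, counts = np.unique(col, return_counts=True)
        voted.append(vals[np.argmax(counts)])
    return np.array(voted)

def leave_one_out_deltas(rater1_sources, rater2_sources):
    """Full kappa, plus the change in kappa when each source is removed."""
    full = cohens_kappa(consensus(rater1_sources), consensus(rater2_sources))
    deltas = {}
    for s in range(len(rater1_sources)):
        keep = [i for i in range(len(rater1_sources)) if i != s]
        k = cohens_kappa(consensus([rater1_sources[i] for i in keep]),
                         consensus([rater2_sources[i] for i in keep]))
        deltas[s] = full - k                    # positive = source helped
    return full, deltas

# Tiny synthetic example: 3 sources, 5 items, per-rater source extractions.
r1 = [[0, 1, 0, 1, 2], [0, 1, 0, 1, 2], [0, 1, 1, 1, 2]]
r2 = [[0, 1, 0, 1, 2], [0, 1, 1, 1, 2], [0, 1, 1, 1, 2]]
full_k, deltas = leave_one_out_deltas(r1, r2)
```

In the study, the analogous loop over the thirteen real sources yields the per-source Δκ values reported in the Results.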
Statistical analyses utilised R software [18] with appropriate packages for inter-rater reliability assessment.
Ethical Considerations
This study received institutional review board approval (Protocol RPI-ISSSTE 2025-0033) with waiver of informed consent pursuant to retrospective design using de-identified data. All procedures were performed per standard clinical protocols independent of research objectives.
RESULTS
Cohort Characteristics
The analytical cohort comprised 176 rotational atherectomy procedures performed on 168 unique patients (94.2% with diabetes mellitus type 2) between January 2020 and December 2023. Median patient age was 68 years (interquartile range 62–74 years), with slight male predominance (58.3%). Device distribution demonstrated 94 Jetstream™ procedures (53.4%) and 82 Phoenix™ procedures (46.6%).
Primary Outcome: Inter-Rater Reliability
The thirteen-source framework achieved an overall inter-rater reliability of κ = 0.97 (95% CI: 0.94–0.99) across 1,377 extracted data points. Pairwise investigator comparisons revealed: DMR versus JCMR κ = 0.98 (95% CI: 0.96–0.99), DMR versus AMR κ = 0.94 (95% CI: 0.90–0.97), JCMR versus AMR κ = 0.95 (95% CI: 0.92–0.98). External validation by AMR demonstrated κ = 0.94 despite absence of subspecialty training, confirming framework reproducibility by non-specialist investigators (Fig. 2).
[INSERT FIGURE 2 HERE]
Of 1,377 total data points, 1,330 (96.6%) demonstrated complete concordance across all three investigators. The remaining 47 discrepant data points (3.4%) occurred predominantly in catheter calibre classification (n = 38, 80.9%), with manufacturer identification (n = 6, 12.8%) and model classification (n = 3, 6.4%) demonstrating near-unanimous agreement.
Comparative Performance Analysis
Bootstrap simulation comparing multi-source integration versus conventional approaches demonstrated substantial reliability improvements (Table 2):
Table 2. Comparative Benchmarking Against Published Reliability Estimates

| Approach | Study | Setting | n | κ | 95% CI | Δκ vs M3 | Relative Difference |
|---|---|---|---|---|---|---|---|
| Single-source | Mi et al. 2013 [4] | Systematic review | Pooled | 0.65 | 0.58–0.72 | −0.32 | −49% |
| Dual-source | van Hoeven 2017 [5] | Netherlands | 1,847 | 0.78 | 0.72–0.84 | −0.19 | −24% |
| Thirteen-source (M3) | Current study | Mexico | 176 | 0.97 | 0.94–0.99 | Reference | Reference |

Δκ = comparator κ minus M3 κ; relative difference = (comparator κ − M3 κ) / comparator κ × 100%; negative values indicate lower reliability than the thirteen-source framework.
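The table arithmetic can be checked directly; this small helper (hypothetical names, not from the paper) reproduces the signed Δκ and relative-difference values from the tabulated κ coefficients:

```python
def delta_kappa(comparator_k, m3_k=0.97):
    """Comparator kappa minus the thirteen-source (M3) kappa."""
    return round(comparator_k - m3_k, 2)

def relative_diff_pct(comparator_k, m3_k=0.97):
    """Relative difference vs M3, as a percentage of the comparator baseline."""
    return round((comparator_k - m3_k) / comparator_k * 100)

# Single-source (Mi et al.):  delta_kappa(0.65) -> -0.32, relative_diff_pct(0.65) -> -49
# Dual-source (van Hoeven):   delta_kappa(0.78) -> -0.19, relative_diff_pct(0.78) -> -24
```

Read the other way round, the same figures give the improvements quoted in the text: (0.97 − 0.65) / 0.65 ≈ 49% over single-source and (0.97 − 0.78) / 0.78 ≈ 24% over dual-source extraction.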
Single-source extraction: mean κ = 0.65 (95% CI: 0.58–0.72); the thirteen-source framework represented a 49% relative improvement (p < 0.001)
Dual-source extraction: mean κ = 0.78 (95% CI: 0.72–0.84); the thirteen-source framework represented a 24% relative improvement (p < 0.001)
Source-Specific Contribution Analysis
Systematic source exclusion analysis identified five "high-impact" sources demonstrating Δκ > 0.05 upon removal (Table 3): pharmacy dispensing logs (Δκ = 0.11), manufacturer technical bulletins (Δκ = 0.09), operative reports (Δκ = 0.08), PACS metadata with local backup (Δκ = 0.07), and procurement archives (Δκ = 0.06).
Table 3. Source-Specific Contributions to Device Data Elements

| Data Element | Primary Source(s) | % Cases With Data | Secondary Sources | Role in Adjudication |
|---|---|---|---|---|
| Device system (Jetstream vs Phoenix) | Source 10 (Pharmacy) | 98.3% | Sources 2, 4, 12 | Definitive verification |
| Device manufacturer | Source 10 (Pharmacy) | 98.3% | Sources 11, 12 | Acronym resolution |
| Device model | Sources 10, 12 | 97.1% | Sources 2, 3, 4 | Model nomenclature |
| Crown size (mm) | Sources 2, 3, 4 | 95.4% | Sources 10, 11 | Sequential sizing |
| Lot number | Sources 10, 12, 13 | 87.5% | Source 3 | Traceability verification |
| Procedure date | All sources | 100% | – | Cross-validation timestamp |
| Primary operator | Sources 1, 2, 4 | 100% | Sources 8, 9 | Personnel verification |
Figure legends

Fig. 1 legend. Initial institutional database query identified 203 procedures coded as "directional atherectomy" between January 2020 and December 2023. After excluding 27 procedures (18 using true directional atherectomy devices, 9 with incomplete data), 176 rotational atherectomy procedures were included in the final analysis, representing 168 unique patients (8 patients underwent multiple procedures). Device distribution: 142 (80.7%) Jetstream (Boston Scientific) and 34 (19.3%) Phoenix (Medtronic/Medstent). The 13-source validation workflow achieved Cohen's κ = 0.97 (95% CI: 0.94–0.99) with 100% device identification accuracy across 5,440 triangulated data points.

Fig. 2 legend. Schematic diagram of thirteen-source validation workflow showing data flow from independent sources through parallel extraction into separate REDCap databases. The visual representation displays how three independent investigators extracted data from thirteen documentary sources organised into six validation domains: prospective clinical documentation (blue), administrative coding records (green), digital imaging archives (yellow), pharmacy and supply chain (purple), manufacturer device traceability (pink), and institutional procurement systems (grey). Arrows indicate information flow through automated three-way comparison and systematic adjudication. Final validation metrics: 5,440 procedure-data points extracted, 96.6% initial concordance (n = 5,257), 3.4% discrepancies resolved through adjudication (n = 183), achieving Cohen's κ = 0.97 (95% CI: 0.94–0.99).

Fig. 3 legend. Cohen's κ coefficients with 95% confidence intervals demonstrate substantial improvement of the thirteen-source triangulation framework (κ = 0.97, 95% CI: 0.94–0.99) compared to published benchmarks: Mi et al. 2013 single-source electronic health record extraction (κ = 0.65) and van Hoeven et al. 2017 dual-source validation combining administrative and clinical data (κ = 0.78). Vertical dashed reference lines at κ = 0.60 and κ = 0.81 indicate Landis & Koch thresholds for "substantial" and "almost perfect" agreement, respectively. The thirteen-source approach substantially exceeds both benchmark comparators and achieves near-perfect reliability comparable to advanced integrated registry systems whilst providing a replicable framework for resource-limited settings.
Discrepancy Resolution and Adjudication
All 47 discrepant data points underwent systematic adjudication, achieving 100% resolution through hierarchical evidence evaluation. Median adjudication time was 8.3 minutes per discrepant data point, corresponding to 6.5 hours total investigator time for complete cohort resolution.
Device Identification Completeness
The thirteen-source framework achieved complete device identification (manufacturer, model, calibre) in 176/176 procedures (100%). This completeness advantage likely reflects the framework's inherent redundancy: 94.3% of procedures had device data documented in ≥ 8 distinct sources, providing multiple independent verification pathways.
DISCUSSION
Principal Findings
This study demonstrates that systematic multi-source data triangulation enables transitional healthcare systems to achieve near-perfect inter-rater reliability (κ = 0.97) for medical device identification, substantially exceeding benchmarks from published validation studies [4, 5]. Three key findings merit emphasis.
First, documentation multiplicity, conventionally perceived as methodological limitation in resource-limited settings, can be leveraged as asset through structured integration protocols. Our thirteen-source framework demonstrated 49% reliability improvement over single-source methods and 24% improvement over dual-source approaches, with bootstrap analysis confirming statistical significance (both p < 0.001).
Second, external validation by a non-specialist investigator (AMR) from a geographically separate institution (Hospital Regional Presidente Juárez, Oaxaca City) achieved κ = 0.94, confirming framework reproducibility without subspecialty expertise [6, 7]. This finding addresses a critical barrier to research participation in resource-limited settings.
Third, source-specific contribution analysis revealed that pharmacy dispensing logs and manufacturer technical bulletins provided disproportionate reliability improvements (Δκ = 0.11 and 0.09 respectively), suggesting that enhanced integration of supply chain documentation could further optimise registry architectures globally [37].
Comparison with Published Literature
Our κ = 0.97 substantially exceeds inter-rater reliability coefficients reported by validation studies of medical record abstraction. Mi et al.'s systematic review reported pooled estimates of κ = 0.65 for single-source extraction [4], whilst van Hoeven et al. demonstrated κ = 0.78 for dual-source approaches [5]. Gianinazzi et al. reported κ = 0.76 for medical record abstraction in paediatric oncology follow-up [6], demonstrating that moderate reliability persists even in well-resourced settings when documentation remains fragmented.
This superior performance likely reflects three framework characteristics: (1) systematic cross-validation across independent documentation streams reduces correlated errors inherent in single-source systems [15]; (2) integration of supply chain sources provides manufacturer-verified device specifications absent from purely clinical documentation [37]; and (3) hierarchical adjudication protocols enable definitive resolution of discrepant data points through evidence triangulation.
Methodological Considerations and Limitations
Several methodological strengths warrant acknowledgement. First, our three-investigator design with blinded extraction and external validation provides robust reliability assessment [1, 2]. Second, bootstrap analysis with 10,000 iterations yields precise confidence interval estimation. Third, systematic source-specific contribution analysis enables evidence-based framework optimisation [28].
However, important limitations merit discussion. Our single-institution implementation limits generalisability, particularly to settings with substantially different documentation architectures. The retrospective design introduced potential selection bias, as procedures with incomplete documentation (n = 2) were necessarily excluded. Our analytical cohort comprised rotational atherectomy procedures exclusively, limiting conclusions regarding framework applicability to other medical device categories. External validation involved a single investigator (AMR) from a single separate institution (Hospital Regional Presidente Juárez, Oaxaca City), potentially limiting reproducibility assessment.
Finally, our study quantified inter-rater reliability as a surrogate for data quality but did not assess ultimate criterion validity against manufacturer shipment records or patient-level device implant verification [19–21].
Practical Implications
Our findings hold immediate practical implications for medical device research in resource-limited settings. The framework's replicability by non-specialist investigators suggests feasibility for collaborative research networks where subspecialty expertise concentrates at hub institutions but data extraction occurs across multiple spoke sites. This model could substantially expand research capacity whilst maintaining methodological rigour. A structured implementation timeline (Fig. 3) demonstrates feasibility for institutions seeking to adopt this framework.
Fig. 3
Forest plot comparing inter-rater reliability across multi-source data extraction approaches.
[INSERT FIGURE 3 HERE]
The identification of high-impact sources provides actionable guidance for registry development prioritisation. Institutions confronting resource constraints might focus initial integration efforts on pharmacy dispensing logs, manufacturer bulletins, operative reports, PACS metadata, and procurement archives, potentially achieving κ > 0.85 whilst deferring lower-impact sources until infrastructure capacity expands.
Cost-effectiveness considerations favour multi-source integration approaches in transitional settings. Whilst our framework required 6.5 hours of investigator time for complete cohort extraction and adjudication, advanced electronic health record implementation typically requires 18–24 months and substantial capital investment [14, 15]. The framework's reliance on existing documentation streams eliminates upfront infrastructure costs whilst providing immediate research capability.
CONCLUSIONS
Systematic multi-source data triangulation enables transitional healthcare systems to achieve near-perfect inter-rater reliability (κ = 0.97) for medical device identification, substantially exceeding benchmarks from published validation studies. Documentation multiplicity, when leveraged through structured protocols, transforms from methodological limitation to asset. The framework demonstrates reproducibility by non-specialist investigators from geographically separate institutions and achieves complete device identification in 100% of procedures.
These findings challenge prevailing assumptions that advanced digital infrastructure constitutes prerequisite for rigorous medical device research. Resource-limited settings can generate research-grade data quality through systematic integration of existing documentation streams, democratising capacity for post-market surveillance, comparative effectiveness research, and quality improvement initiatives.
Declarations
Ethics Approval and Consent to Participate
This study received approval from the ISSSTE Research Ethics Committee, protocol number RPI-ISSSTE-2025-0033, with waiver of informed consent per retrospective design using de-identified data.
Consent for Publication
Not applicable.
Data Availability
Datasets are available from the corresponding author upon reasonable request, subject to institutional data sharing agreements.
Competing Interests
Two authors (DMR and AMR) are siblings. AMR served exclusively as external validator from geographically separate institution (Hospital Regional Presidente Juárez, Oaxaca City), with inter-rater calculations performed independently by senior statistician (DHL). All data extraction protocols were pre-specified and blinded. No other competing interests exist.
Funding
No funding received.
Author Contribution
DMR: Conceptualisation, methodology, investigation, formal analysis, writing—original draft, project administration. JCMR: Methodology, investigation, data curation, writing—review and editing. AMR: Investigation (external validation), writing—review and editing. CELM: Investigation, data curation, writing—review and editing. GDTA: Resources, writing—review and editing, supervision. DHL: Formal analysis, writing—review and editing, supervision. All authors read and approved the final manuscript.
Acknowledgements
We thank medical records, pharmacy, procurement, and anaesthesiology department staff at Hospital Regional General Ignacio Zaragoza for assistance in locating archived documentation and maintaining specialised care protocols for vascular surgery patients. We particularly acknowledge the departmental secretary responsible for monthly archive reconciliation.
References
1. Benchimol EI, Smeeth L, Guttmann A, Harron K, Moher D, Petersen I, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12(10):e1001885.
2. von Elm E, Altman DG, Egger M, Gøtzsche PC, Mulrow CD, Pocock SJ, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007;370(9596):1453–7.
3. US Food and Drug Administration. Framework for FDA's Real-World Evidence Program. Silver Spring, MD: FDA; 2018.
4. Mi MY, Sun Y, Liu Y, Li H, Zhou L, Gong Y, et al. Reliability of medical record abstraction by nonphysicians for chronic disease research: a systematic review. BMC Med Res Methodol. 2013;13:132.
5. van Hoeven LR, Janssen MP, Roes KC, Koffijberg H. Validation of multisource electronic health record data: an application to blood transfusion data. BMC Med Inform Decis Mak. 2017;17:107.
6. Gianinazzi ME, Essig S, Rueegg CS, von der Weid NX, Niggli FK, Kuehni CE, et al. Intra-rater and inter-rater reliability of a medical record abstraction study of transition of care after childhood cancer. PLoS ONE. 2015;10(5):e0124290.
7. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
8. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276–82.
9. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):37–46.
10. Fleiss JL, Levin B, Paik MC. Statistical methods for rates and proportions. 3rd ed. New York: Wiley; 2003.
11. Bagolle A, Cañizares A, Zárate V. Regulatory frameworks for digital health in Latin American and the Caribbean: electronic health records: progresses and next steps. Washington, DC: Inter-American Development Bank; 2020.
12. Bernal O, Forero JC, Forde I. Digital transformation of the health sector in Latin America and the Caribbean. Washington, DC: Inter-American Development Bank; 2019.
13. López-Valenzuela CL, Ortega-Villa EM, Robles-Franco P, Rivas-Ruiz R, Galván-Plata ME, Castañeda-Alcántara JL, et al. Healthcare information systems in Mexico: description and analysis at national level. BMC Med Inform Decis Mak. 2020;20(1):316.
14. Kruse CS, Stein A, Thomas H, Kaur H. The use of electronic health records to support population health: a systematic review of the literature. J Med Syst. 2018;42(11):214.
15. Sheikh A, Cornford T, Barber N, Avery A, Takian A, Lichtner V, et al. Implementation and adoption of nationwide electronic health records in secondary care in England: final qualitative results from prospective national evaluation in early adopter hospitals. BMJ. 2011;343:d6054.
16. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–81.
17. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O'Neal L, et al. The REDCap consortium: building an international community of software platform partners. J Biomed Inform. 2019;95:103208.
18. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021.
19. Resnic FS, Gross TP, Marinac-Dabic D, Loyo-Berrios N, Donnelly S, Normand SL, et al. Automated surveillance to detect postprocedure safety signals of approved cardiovascular devices. JAMA. 2010;304(18):2019–27.
20. Normand SL, Landrum MB, Guadagnoli E, Ayanian JZ, Ryan TJ, Cleary PD, et al. Validating recommendations for coronary angiography following acute myocardial infarction in the elderly: a matched analysis using propensity scores. J Clin Epidemiol. 2001;54(4):387–98.
21. Benchimol EI, Manuel DG, To T, Griffiths AM, Rabeneck L, Guttmann A. Development and use of reporting guidelines for assessing the quality of validation studies of health administrative data. J Clin Epidemiol. 2011;64(8):821–9.
22. Sarrazin MS, Rosenthal GE. Finding pure and simple truths with administrative data. JAMA. 2012;307(13):1433–5.
23. Dean BB, Lam J, Natoli JL, Butler Q, Aguilar D, Nordyke RJ. Review: use of electronic medical records for health outcomes research: a literature review. Med Care Res Rev. 2009;66(6):611–38.
24. Herrett E, Gallagher AM, Bhaskaran K, Forbes H, Mathur R, van Staa T, et al. Data resource profile: Clinical Practice Research Datalink (CPRD). Int J Epidemiol. 2015;44(3):827–36.
25. Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: a review of methods and applications. Annu Rev Public Health. 2016;37:61–81.
26. Hripcsak G, Duke JD, Shah NH, Reich CG, Huser V, Schuemie MJ, et al. Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers. Stud Health Technol Inform. 2015;216:574–8.
27. Overhage JM, Ryan PB, Reich CG, Hartzema AG, Stang PE. Validation of a common data model for active safety surveillance research. J Am Med Inform Assoc. 2012;19(1):54–60.
A
28.
Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55.
A
29.
Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar Behav Res. 2011;46(3):399–424.
A
30.
Stürmer T, Rothman KJ, Avorn J, Glynn RJ. Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution—a simulation study. Am J Epidemiol. 2010;172(7):843–54.
31.
Sherman RE, Anderson SA, Dal Pan GJ, Gray GW, Gross T, Hunter NL, et al. Real-world evidence—what is it and what can it tell us? N Engl J Med. 2016;375(23):2293–7.
A
32.
Makady A, de Boer A, Hillege H, Klungel O, Goettsch W. (on behalf of GetReal Work Package 1). What is real- world data? A review of definitions based on literature and stakeholder interviews. Value Health. 2017;20(7):858–65.
A
33.
Blonde L, Khunti K, Harris SB, Meizinger C, Skolnik NS. Interpretation and impact of real-world clinical data for the practicing clinician. Adv Ther. 2018;35(11):1763–74.
34.
Dreyer NA, Bryant A, Velentgas P. The GRACE checklist: a validated assessment tool for high quality observational studies of comparative effectiveness. J Manag Care Spec Pharm. 2016;22(10):1107–13.
A
35.
Berger ML, Sox H, Willke RJ, Brixner DL, Eichler HG, Goettsch W, et al. Good practices for real-world data studies of treatment and/or comparative effectiveness: recommendations from the joint ISPOR-ISPE Special Task Force on real-world evidence in health care decision making. Pharmacoepidemiol Drug Saf. 2017;26(9):1033–9.
A
36.
Wang SV, Schneeweiss S, Berger ML, Brown J, de Vries F, Douglas I, et al. Reporting to improve reproducibility and facilitate validity assessment for healthcare database studies V1.0. Pharmacoepidemiol Drug Saf. 2017;26(9):1018–32.
A
37.
European Medicines Agency. Guideline on good pharmacovigilance practices (GVP): Module VI – collection, management and submission of reports of suspected adverse reactions to medicinal products (Rev 2). London: EMA; 2017.
A
38.
Jarow JP, LaVange L, Woodcock J. Multidimensional evidence generation and FDA regulatory decision making: defining and using real-world data. JAMA. 2017;318(8):703–4.
A
39.
Collins R, Bowman L, Landray M, Peto R. The magic of randomization versus the myth of real-world evidence. N Engl J Med. 2020;382(7):674–8.
A
40.
Franklin JM, Schneeweiss S. When and how can real world data analyses substitute for randomized controlled trials? Clin Pharmacol Ther. 2017;102(6):924–33.
Tables
Abbreviations:
SIME: Sistema Institucional de Morbimortalidad y Egresos
SIMEH: Sistema de Información Médica
HIM: Health Information Management
PACS: Picture Archiving and Communication System
ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification
Abstract
Background: Medical device registries in transitional healthcare systems face substantial challenges achieving reliable data extraction due to documentation fragmentation and the absence of integrated electronic health records. Conventional single-source or dual-source extraction methods demonstrate moderate inter-rater reliability (κ=0.50–0.70), limiting research validity. We hypothesised that systematic multi-source data triangulation could achieve near-perfect reliability comparable to advanced registry systems whilst providing a replicable framework for resource-limited settings.
Methods: We conducted a retrospective methodological validation study of 176 rotational atherectomy procedures performed between January 2020 and December 2023 at Hospital Regional General Ignacio Zaragoza, Mexico. Three independent, blinded investigators extracted device-specific data from thirteen documentary sources organised into six validation domains. Inter-rater reliability was assessed using Cohen's κ with 95% confidence intervals. External validation was performed by an investigator from a geographically separate institution.
Results: The thirteen-source framework achieved overall inter-rater reliability of κ=0.97 (95% CI: 0.94–0.99), with 96.6% concordance across three extractors. External validation demonstrated κ=0.94, confirming reproducibility without subspecialty expertise. Complete device identification was achieved in 100% of procedures. Comparative bootstrap analysis revealed a 49% improvement over single-source extraction (κ=0.65, p<0.001) and a 24% improvement over dual-source methods (κ=0.78, p<0.001).
Conclusions: Systematic multi-source data triangulation enables transitional healthcare systems to achieve research-grade inter-rater reliability exceeding advanced registry benchmarks. Documentation multiplicity, when leveraged through structured protocols, transforms from methodological limitation to asset.
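The abstract's headline statistics rest on Cohen's κ with bootstrap confidence intervals. The study's analyses were run in R (reference 18); as a minimal illustration of the underlying arithmetic, the following dependency-free Python sketch computes κ for two raters and a percentile bootstrap CI. The function names (`cohens_kappa`, `bootstrap_ci`) and the toy rating vectors are illustrative assumptions, not the study's code or data.

```python
import random
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters scoring the same items on nominal categories."""
    n = len(r1)
    # Observed proportion of agreement
    po = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement expected from each rater's marginal category frequencies
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum((c1[c] / n) * (c2[c] / n) for c in set(c1) | set(c2))
    if pe == 1.0:  # both raters used a single identical category
        return 1.0
    return (po - pe) / (1 - pe)

def bootstrap_ci(r1, r2, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap CI for kappa, resampling items with replacement."""
    rng = random.Random(seed)
    n = len(r1)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(cohens_kappa([r1[i] for i in idx], [r2[i] for i in idx]))
    stats.sort()
    return (stats[int(alpha / 2 * n_boot)],
            stats[int((1 - alpha / 2) * n_boot) - 1])

# Illustrative use: perfect agreement gives kappa = 1.0
print(cohens_kappa(['a', 'a', 'b', 'b'], ['a', 'a', 'b', 'b']))  # 1.0
```

The same resampling scheme, applied to the difference in κ between two extraction methods, is one standard way to obtain the kind of single-source vs. multi-source comparison the abstract reports.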