3. Results
3.1. Descriptive statistics
Among the 303 participants, about two-thirds were male (68.0%) and 32.0% were female. The mean age was 54.4 ± 9.0 years (range: 29–77). Nearly half of the sample (47.5%) was asymptomatic with respect to chest pain, whereas 28.4% experienced non-anginal pain, 16.5% atypical angina, and only 7.6% typical angina.
Most participants had normal fasting blood sugar levels (85.1%), with 14.9% showing elevated levels (> 120 mg/dl). Resting electrocardiographic results were almost equally distributed between normal findings (49.8%) and left ventricular hypertrophy (48.8%), while ST-T abnormalities were uncommon (1.3%).
The mean resting systolic blood pressure was 131.7 ± 17.6 mmHg (range: 94–200). Categorically, 44.6% had normal systolic pressure (< 130 mmHg), 23.1% had stage 1 hypertension (130–139 mmHg), and 32.3% had stage 2 hypertension (≥ 140 mmHg). Mean serum cholesterol was 246.7 ± 51.8 mg/dl, with 16.2% below 200 mg/dl, 32.3% at 200–239 mg/dl, and 51.5% at ≥ 240 mg/dl. Maximum heart rate averaged 149.6 ± 22.9 bpm (range: 71–202). The mean ST-segment depression during exercise was 1.04 ± 1.16 mm, with 32.7% showing no depression, 22.1% showing more than 0 but less than 1 mm, and 45.2% showing ≥ 1 mm depression.
Exercise-induced angina was reported in about one-third of patients (32.7%), while the majority did not experience angina on exertion (67.3%). The slope of the peak exercise ST segment was either upsloping (46.9%) or flat (46.2%) in most patients, with only 6.9% demonstrating the clinically adverse down-sloping pattern.
Fluoroscopy revealed no major vessels colored in 179 patients (59.1%), while one, two, and three vessels were colored in 66 (21.8%), 38 (12.5%), and 20 (6.6%) patients, respectively.
Thalassemia-related results showed that 55.1% of participants were classified as normal, 38.9% exhibited a reversible defect, and 5.9% a fixed defect.
Regarding coronary heart disease (CHD) outcomes, 54.1% had no angiographic evidence of disease, while 45.9% had some degree of disease. On the ordinal severity scale, 18.2% had mild, 11.9% moderate, 11.6% significant, and 4.3% severe disease (Tables 1, 2).
Table 1
Descriptive statistics of categorical variables in the Cleveland Heart Disease dataset (N = 303)
| Variable | Category | Frequency (n) | Percent (%) |
|---|---|---|---|
| Sex | Female | 97 | 32.0 |
| | Male | 206 | 68.0 |
| Chest pain type | Typical angina | 23 | 7.6 |
| | Atypical angina | 50 | 16.5 |
| | Non-anginal pain | 86 | 28.4 |
| | Asymptomatic | 144 | 47.5 |
| Fasting blood sugar | Normal (≤ 120 mg/dl) | 258 | 85.1 |
| | Elevated (> 120 mg/dl) | 45 | 14.9 |
| Resting ECG results | Normal | 151 | 49.8 |
| | ST-T abnormality | 4 | 1.3 |
| | Left ventricular hypertrophy | 148 | 48.8 |
| Resting systolic blood pressure | Normal (< 130 mmHg) | 135 | 44.6 |
| | Stage 1 hypertension (130–139 mmHg) | 70 | 23.1 |
| | Stage 2 hypertension (≥ 140 mmHg) | 98 | 32.3 |
| Serum cholesterol | Normal (< 200 mg/dl) | 49 | 16.2 |
| | Elevated (200–239 mg/dl) | 98 | 32.3 |
| | High (≥ 240 mg/dl) | 156 | 51.5 |
| ST depression | No ST depression | 99 | 32.7 |
| | > 0 to < 1 mm | 67 | 22.1 |
| | ≥ 1 mm | 137 | 45.2 |
| Exercise-induced angina | No | 204 | 67.3 |
| | Yes | 99 | 32.7 |
| Slope of ST segment | Upsloping (normal) | 142 | 46.9 |
| | Flat | 140 | 46.2 |
| | Down-sloping | 21 | 6.9 |
| Thalassemia status | Normal | 167 | 55.1 |
| | Fixed defect | 18 | 5.9 |
| | Reversible defect | 118 | 38.9 |
| No. of major vessels (fluoroscopy) | 0 | 179 | 59.1 |
| | 1 | 66 | 21.8 |
| | 2 | 38 | 12.5 |
| | 3 | 20 | 6.6 |
| CHD severity (0–4 scale) | No disease (0) | 164 | 54.1 |
| | Mild (1 vessel) | 55 | 18.2 |
| | Moderate (2 vessels) | 36 | 11.9 |
| | Significant (3 vessels) | 35 | 11.6 |
| | Severe (4 vessels) | 13 | 4.3 |
| CHD presence (binary) | Absent | 164 | 54.1 |
| | Present | 139 | 45.9 |
Table 2
Descriptive statistics of continuous variables
| Variable | Mean ± SD | Median | Minimum | Maximum |
|---|---|---|---|---|
| Age (years) | 54.44 ± 9.04 | 56.00 | 29 | 77 |
| Resting systolic blood pressure (mmHg) | 131.69 ± 17.60 | 130 | 94 | 200 |
| Serum cholesterol (mg/dl) | 246.69 ± 51.78 | 241 | 126 | 564 |
| Maximum heart rate achieved (bpm) | 149.61 ± 22.88 | 153 | 71 | 202 |
| ST depression (mm) | 1.04 ± 1.16 | 0.8 | 0 | 6.2 |
| No. of major vessels (fluoroscopy) | 0.67 ± 0.93 | 0 | 0 | 3 |
3.2. Logistic regression
3.2.1. Assumption checks
Prior to regression analysis, data were screened for suitability (Schober & Vetter, 2021). All categorical predictors and dependent variables were pre-coded in SPSS v25, and no missing values were present in the dataset. Independence of observations was assumed, as each case represented a unique patient.
For continuous variables (age, resting blood pressure, serum cholesterol, maximum heart rate, and ST-segment depression), the linearity of the logit was assessed using the Box-Tidwell approach. None of the interaction terms were statistically significant (p > 0.05), indicating that the assumption of linearity was met.
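As a minimal sketch of how the Box-Tidwell terms are typically constructed, each continuous predictor x is paired with an added term x·ln(x), and the significance of that term in the logistic model is then tested. The function name `box_tidwell_term` and the `shift` argument are hypothetical; the shift is an assumption introduced here because ST-segment depression includes zeros, and the source does not state how zeros were handled.

```python
import math


def box_tidwell_term(values, shift=0.0):
    """Return x * ln(x) for each value, after adding an optional shift.

    The shift is an illustrative assumption for predictors that contain
    zeros; ln(x) is undefined for x <= 0.
    """
    terms = []
    for x in values:
        shifted = x + shift
        if shifted <= 0:
            raise ValueError("Box-Tidwell terms need strictly positive values")
        terms.append(shifted * math.log(shifted))
    return terms


# Example inputs spanning the ranges in Table 2 (not individual patient records).
age_terms = box_tidwell_term([29, 56, 77])
# ST depression has a minimum of 0, so a shift of 1.0 (assumed) keeps ln() defined.
oldpeak_terms = box_tidwell_term([0.0, 0.8, 6.2], shift=1.0)
```

Each resulting term would be entered alongside its original predictor; a non-significant coefficient for the term is what supports linearity of the logit.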
Multicollinearity among predictors was assessed using variance inflation factors (VIF) derived from an auxiliary linear regression. All predictors showed tolerance values greater than 0.4 and VIF values below 2.5, confirming the absence of problematic multicollinearity among predictors.
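The two thresholds quoted above are the same criterion expressed in two ways, since tolerance is 1 − R² from the auxiliary regression and VIF is its reciprocal. The short sketch below illustrates that relationship; the function name and the example R² value are illustrative assumptions, not figures from the study.

```python
def tolerance_and_vif(r_squared):
    """Convert the R-squared of an auxiliary regression (one predictor
    regressed on all the others) into tolerance and VIF."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    tolerance = 1.0 - r_squared
    vif = 1.0 / tolerance
    return tolerance, vif


# Hypothetical auxiliary R-squared of 0.6, chosen to land on the thresholds.
tolerance, vif = tolerance_and_vif(0.6)
```

An auxiliary R² of 0.6 gives a tolerance of 0.4 and a VIF of 2.5, so a tolerance above 0.4 and a VIF below 2.5 describe the same condition.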
3.2.2. Binary logistic regression
The binary logistic regression model fit the data well. The omnibus test of model coefficients indicated that the model significantly improved prediction compared to the null model (χ²(18) = 224.197, p < 0.001), indicating that the included predictors collectively contributed to explaining the presence of coronary heart disease. The Cox & Snell R² was 0.523 and the Nagelkerke R² was 0.699, suggesting that the predictors explained approximately 52% to 70% of the variance in disease status. Model calibration was also acceptable, as evidenced by the Hosmer-Lemeshow test (χ²(8) = 7.093, p = 0.527), which indicated no evidence of poor fit. The classification table demonstrated strong classification performance, with an overall accuracy of 87.5%, sensitivity of 82.7%, and specificity of 91.5%.
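The two pseudo-R² values can be cross-checked from figures already reported: the omnibus chi-square, the sample size, and the outcome counts in Table 1. The sketch below uses the standard Cox & Snell and Nagelkerke definitions; the variable names are illustrative, and the computation is a consistency check rather than a reproduction of the SPSS output.

```python
import math

n = 303                            # sample size
n_present, n_absent = 139, 164     # CHD present / absent (Table 1)
chi2_model = 224.197               # omnibus chi-square (null vs. fitted model)

# Log-likelihood of the intercept-only (null) model.
ll_null = (n_present * math.log(n_present / n)
           + n_absent * math.log(n_absent / n))

# Cox & Snell: 1 - exp(-chi2 / n), since chi2 = 2 * (LL_model - LL_null).
cox_snell = 1 - math.exp(-chi2_model / n)

# Nagelkerke rescales Cox & Snell by its maximum attainable value.
max_cox_snell = 1 - math.exp(2 * ll_null / n)
nagelkerke = cox_snell / max_cox_snell

print(f"Cox & Snell R2 = {cox_snell:.3f}, Nagelkerke R2 = {nagelkerke:.3f}")
```

Under these definitions the computed values agree with the reported 0.523 and 0.699 to three decimal places.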
The reference groups for categorical variables were: asymptomatic (chest pain type), reversible defect (thalassemia status), left ventricular hypertrophy (resting ECG results), down-sloping (ST segment slope), yes (exercise-induced angina), high blood sugar (fasting blood sugar), and male (sex).
Significant predictors of CHD were sex, chest pain type, systolic blood pressure, number of major vessels, and thalassemia status.
Sex: females had 78.3% lower odds of CHD compared to males (B = -1.526, OR = 0.217, p = 0.004).
Chest pain: compared to asymptomatic patients, those with typical angina had 87.9% lower odds (B = -2.11, OR = 0.121, p = 0.001), and those with anginal pain had 84.7% lower odds (B = -1.876, OR = 0.153, p < 0.001).
Resting systolic blood pressure: each 1 mmHg increase was associated with a 2.5% increase in the odds of CHD (B = 0.024, OR = 1.025, p = 0.031).
Number of major vessels: each additional vessel increased the odds of CHD by 271.2% (B = 1.312, OR = 3.712, p < 0.001).
Thalassemia status: patients with normal perfusion had 74.7% lower odds of CHD compared to those with a reversible defect (B = -1.373, OR = 0.253, p = 0.001).
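The odds ratios and percentage changes listed above follow from the coefficients through OR = exp(B) and percent change = (OR − 1) × 100. A minimal sketch of that conversion is shown below; the function name and dictionary labels are illustrative. Because the reported B values are rounded, recomputed odds ratios may differ from the reported ones in the third decimal place.

```python
import math


def odds_ratio_summary(b):
    """Return (odds ratio, percent change in odds) for a logit coefficient."""
    odds_ratio = math.exp(b)
    percent_change = (odds_ratio - 1) * 100
    return odds_ratio, percent_change


# Coefficients as reported in Section 3.2.2.
reported_b = {
    "female vs. male": -1.526,
    "typical angina vs. asymptomatic": -2.11,
    "systolic BP, per 1 mmHg": 0.024,
    "major vessels, per vessel": 1.312,
    "normal perfusion vs. reversible defect": -1.373,
}

for label, b in reported_b.items():
    odds_ratio, pct = odds_ratio_summary(b)
    print(f"{label}: OR = {odds_ratio:.3f}, change in odds = {pct:+.1f}%")
```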
3.2.3. Ordinal logistic regression
An ordinal logistic regression model was fitted to identify predictors of CHD severity, which was categorized as no disease (54.1%), mild (18.2%), moderate (11.9%), significant (11.6%), and severe (4.3%). Among the clinical variables, the number of major vessels visualized by fluoroscopy (Estimate = 0.890, p < 0.001) was the strongest predictor, with higher vessel counts markedly increasing the likelihood of falling into a more severe CHD category. Chest pain type was also significantly associated with CHD severity: compared to asymptomatic patients, those presenting with typical angina (Estimate = -1.715, p = 0.002), atypical angina (Estimate = -1.090, p = 0.023), and non-anginal pain (Estimate = -1.536, p < 0.001) had significantly lower odds of being in a higher severity category. Male sex was independently associated with more severe disease (Estimate = -0.997 for females relative to males, p = 0.006). Thalassemia status showed a strong relationship with severity: a reversible perfusion defect was associated with greater odds of more severe disease compared to a normal scan (Estimate = -1.310 for normal relative to reversible defect, p < 0.001). Resting systolic blood pressure (Estimate = 0.013, p = 0.099), ST-segment depression during exercise (oldpeak, Estimate = 0.276, p = 0.054), and thalassemia with fixed defect (Estimate = -0.946, p = 0.058) approached statistical significance. Other variables, including age, serum cholesterol, fasting blood sugar, maximum heart rate, resting ECG results, exercise-induced angina, and slope of the ST segment, did not show significant associations.
3.3. Development of a scoring system from ordinal logistic regression
Based on the ordinal logistic regression model, the predictors of CHD severity that were significant or approached significance were the number of major vessels visualized by fluoroscopy, chest pain type, sex, thalassemia status, resting systolic blood pressure, and ST-segment depression during exercise. To translate these findings into a practical scoring system, regression coefficients were rescaled into integer points reflecting their relative contribution to CHD severity. Points were assigned by approximating each 0.3 increase in the log-odds (B coefficient) from the ordinal logistic regression as one point, so that stronger predictors with larger B coefficients contributed proportionally more points to the total score. For categorical variables, the lowest-risk category was assigned zero points, and the other categories were scored relative to it.
Male sex was assigned +3 points, while females received 0 points. For chest pain type, asymptomatic patients had the highest risk and were assigned +6 points, whereas typical angina was scored 0, atypical angina +1, and non-anginal pain +1. Thalassemia status was scored as +4 points for a reversible defect (highest risk), +1 for a fixed defect, and 0 for normal perfusion. The number of major vessels was weighted as +3 points for each affected vessel (range 0–9).
For resting systolic blood pressure (B = 0.013, p = 0.099), each 1 mmHg increase adds 0.013 to the log-odds, so one point (≈ 0.3 log-odds) corresponds to about 23 mmHg (0.3/0.013 ≈ 23); for simplicity, this was rounded to 20 mmHg per point. For ST-segment depression (oldpeak; B = 0.276, p = 0.054), each 1 mm increase adds 0.276 to the log-odds, which is already close to 0.3 and therefore translates to 1 point per 1 mm increase.
This produced a total score range from 0 (lowest risk: female, typical angina, normal perfusion, no vessel involvement, resting systolic blood pressure < 120 mmHg, ST-depression during exercise < 1.3 mm) to 28 (highest risk: male, asymptomatic, reversible defect, three-vessel involvement, resting systolic blood pressure > 160 mmHg, ST-depression during exercise > 3.5 mm) (Table 3).
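The rescaling rule of roughly one point per 0.3 log-odds can be illustrated with a few of the reported estimates. The sketch below is an assumption-laden illustration: the helper name `coefficient_to_points` is hypothetical, and only coefficients that compare the highest-risk and lowest-risk levels directly are shown; intermediate categories are taken from Table 3 and are not recomputed here.

```python
LOG_ODDS_PER_POINT = 0.3  # scaling stated in the text


def coefficient_to_points(estimate):
    """Round the magnitude of a log-odds estimate to whole points."""
    return round(abs(estimate) / LOG_ODDS_PER_POINT)


# Estimates reported in Section 3.2.3.
points = {
    "per major vessel": coefficient_to_points(0.890),
    "male vs. female": coefficient_to_points(-0.997),
    "asymptomatic vs. typical angina": coefficient_to_points(-1.715),
    "reversible defect vs. normal": coefficient_to_points(-1.310),
    "ST depression, per 1 mm": coefficient_to_points(0.276),
}

# Continuous predictor: mmHg of systolic pressure needed for one point.
mmhg_per_point = LOG_ODDS_PER_POINT / 0.013

print(points)
print(f"about {mmhg_per_point:.0f} mmHg per point before rounding to 20")
```

For these inputs the rule yields 3, 3, 6, 4, and 1 points respectively, and about 23 mmHg per point for systolic pressure.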
Table 3
Weighted scoring system for predictors of cardiovascular risk
| Predictor | Category | Points |
|---|---|---|
| Sex | Female | 0 |
| | Male | +3 |
| Chest pain type | Typical angina | 0 |
| | Atypical angina | +1 |
| | Non-anginal pain | +1 |
| | Asymptomatic | +6 |
| Thalassemia status | Normal perfusion | 0 |
| | Fixed defect | +1 |
| | Reversible defect | +4 |
| Number of major vessels | 0 | 0 |
| | 1 | +3 |
| | 2 | +6 |
| | 3 | +9 |
| Resting systolic blood pressure (mmHg) | < 120 | 0 |
| | 120–140 | +1 |
| | 140–160 | +2 |
| | > 160 | +3 |
| ST-depression during exercise (mm) | < 1.3 | 0 |
| | 1.3–2.3 | +1 |
| | 2.3–3.5 | +2 |
| | > 3.5 | +3 |
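The point values in Table 3 can be combined into a total score with a simple lookup, sketched below. Function and dictionary names are hypothetical. One assumption is made explicit: adjacent bands in Table 3 share boundary values (for example 140 mmHg and 2.3 mm), and the table does not say which band they belong to, so this sketch assigns a shared boundary to the higher band, except where the table uses a strict "greater than" (160 mmHg and 3.5 mm).

```python
# Point values copied from Table 3.
SEX_POINTS = {"female": 0, "male": 3}
CHEST_PAIN_POINTS = {
    "typical angina": 0,
    "atypical angina": 1,
    "non-anginal pain": 1,
    "asymptomatic": 6,
}
THAL_POINTS = {"normal": 0, "fixed defect": 1, "reversible defect": 4}
POINTS_PER_VESSEL = 3


def sbp_points(sbp_mmhg):
    # Boundary handling at 140 is an assumption (see lead-in).
    if sbp_mmhg < 120:
        return 0
    if sbp_mmhg < 140:
        return 1
    if sbp_mmhg <= 160:
        return 2
    return 3


def st_depression_points(st_mm):
    # Boundary handling at 2.3 is an assumption (see lead-in).
    if st_mm < 1.3:
        return 0
    if st_mm < 2.3:
        return 1
    if st_mm <= 3.5:
        return 2
    return 3


def total_score(sex, chest_pain, thal, vessels, sbp_mmhg, st_mm):
    """Sum the Table 3 points for one patient (range 0 to 28)."""
    if vessels not in (0, 1, 2, 3):
        raise ValueError("vessels must be 0, 1, 2, or 3")
    return (SEX_POINTS[sex]
            + CHEST_PAIN_POINTS[chest_pain]
            + THAL_POINTS[thal]
            + POINTS_PER_VESSEL * vessels
            + sbp_points(sbp_mmhg)
            + st_depression_points(st_mm))


# The lowest-risk and highest-risk profiles described in the text.
lowest = total_score("female", "typical angina", "normal", 0, 110, 0.0)
highest = total_score("male", "asymptomatic", "reversible defect", 3, 170, 4.0)
```

The two example profiles give the endpoints of the score range, 0 and 28.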
After applying this scoring system to the 303 patients, the score distribution within each diagnostic category was examined. Patients with no disease had a mean score of 6.64 and a median of 6.00 (range: 1–22). Those with mild disease had a mean of 12.58 and a median of 12.00 (range: 4–24), while moderate disease corresponded to a mean of 16.44 and a median of 17.00 (range: 9–24). For significant disease, the mean score was 17.63 and the median was 17.00 (range: 8–27), and patients with severe disease had the highest scores, with a mean of 18.82 and a median of 19.00 (range: 10–25). Using these observed distributions, cutoff points were derived to reflect the progression of CHD severity: scores of 0–9 were categorized as no disease, 10–13 as mild disease, 14–16 as moderate disease, 17–18 as significant disease, and 19–28 as severe disease. These cutoffs were therefore determined empirically from the patient data, so that the scoring system aligns with the actual distribution of disease categories in the study population (Table 4).
Table 4
Disease severity classification based on total score
| Category | Cutoff point |
|---|---|
| No disease | 0–9 |
| Mild disease | 10–13 |
| Moderate disease | 14–16 |
| Significant disease | 17–18 |
| Severe disease | 19–28 |
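The cutoffs in Table 4 map a total score to a predicted severity category. A minimal sketch of that mapping follows; the function name `classify_severity` is illustrative and assumes an integer score in the 0–28 range.

```python
# Upper bound of each band, taken from Table 4.
SEVERITY_BANDS = [
    (9, "No disease"),
    (13, "Mild disease"),
    (16, "Moderate disease"),
    (18, "Significant disease"),
    (28, "Severe disease"),
]


def classify_severity(score):
    """Return the Table 4 category for a total score between 0 and 28."""
    if not 0 <= score <= 28:
        raise ValueError("score must be between 0 and 28")
    for upper_bound, label in SEVERITY_BANDS:
        if score <= upper_bound:
            return label


print(classify_severity(12))
```

A score of 12, close to the mean reported for the mild group, falls in the mild disease band.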
The significant variables and categories have been converted into questions in this Google Form: https://docs.google.com/forms/d/e/1FAIpQLSfT1-pi_WM5lSH0OwNkzn-Iu1SHajNxaMm00kivlzpPkhbTOg/viewform. Each choice is assigned a weight or point value. Google Forms does not support weighted questions by default, so final scores are not displayed immediately after submission. However, add-ons for Google Forms, such as Formfacade, can be linked to the form to provide immediate, interactive, and user-friendly results.