Overview


Dataset statistics

Number of variables: 7
Number of observations: 7496
Missing cells: 5930
Missing cells (%): 11.3%
Total size in memory: 726.5 KiB
Average record size in memory: 99.2 B
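
The derived figures in this panel follow from the raw counts. A minimal sketch of that arithmetic, assuming "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" panel.
n_variables = 7
n_observations = 7496
missing_cells = 5930
total_size_kib = 726.5

# Assumption: the missing percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under those assumptions the results round to the 11.3% and 99.2 B shown above.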

Variable types

Categorical: 6
Numeric: 1

Alerts

gender has 569 (7.6%) missing values
age_at_diagnosis has 907 (12.1%) missing values
tumor_grade has 4454 (59.4%) missing values
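
The alert percentages are per-column missing counts divided by the number of observations. A short sketch that recomputes them and checks that the three columns account for every missing cell in the dataset; the dictionary and variable names are illustrative only:

```python
# Counts copied from the alerts and the dataset statistics.
n_observations = 7496
total_missing_cells = 5930
missing_by_column = {
    "gender": 569,
    "age_at_diagnosis": 907,
    "tumor_grade": 4454,
}

# Per-column missing rate, in percent of observations.
missing_rate = {
    column: 100 * count / n_observations
    for column, count in missing_by_column.items()
}

for column, rate in missing_rate.items():
    print(f"{column}: {rate:.1f}% missing")

# True when these three columns explain all missing cells.
accounts_for_all = sum(missing_by_column.values()) == total_missing_cells
print("Alerts cover all missing cells:", accounts_for_all)
```

Because the three counts sum to 5930, the remaining four variables have no missing values, which matches their individual panels.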

Reproduction

Analysis started: 2025-06-19 17:54:42.474571
Analysis finished: 2025-06-19 17:54:42.556073
Duration: 0.08 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json

Variables

category
Categorical

Distinct: 54
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 117.1 KiB

Blood or Bone marrow Acute myeloid leukemia: 540
Uterus Endometrioid adenocarcinoma: 362
Breast Infiltrating duct carcinoma: 352
Head and Neck Squamous cell carcinoma: 340
Thyroid gland Papillary carcinoma: 337
Other values (49): 5565

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Colon Adenocarcinoma
2nd row: Colon Adenocarcinoma
3rd row: Stomach Carcinoma
4th row: Colon Adenocarcinoma
5th row: Brain Glioblastoma

Common Values

Value | Count | Frequency (%)
Blood or Bone marrow Acute myeloid leukemia | 540 | 7.2%
Uterus Endometrioid adenocarcinoma | 362 | 4.8%
Breast Infiltrating duct carcinoma | 352 | 4.7%
Head and Neck Squamous cell carcinoma | 340 | 4.5%
Thyroid gland Papillary carcinoma | 337 | 4.5%
Lung Adenocarcinoma | 321 | 4.3%
Lung Squamous cell carcinoma | 302 | 4.0%
Kidney Renal cell carcinoma | 266 | 3.5%
Colon Adenocarcinoma | 259 | 3.5%
Skin Malignant melanoma | 237 | 3.2%
Lung Healthy | 236 | 3.1%
Liver Hepatocellular carcinoma | 231 | 3.1%
Stomach Carcinoma | 229 | 3.1%
Cervix uteri Squamous cell carcinoma | 221 | 2.9%
Prostate gland Adenocarcinoma | 221 | 2.9%
Brain Glioblastoma | 221 | 2.9%
Kidney Healthy | 216 | 2.9%
Pancreas Infiltrating duct carcinoma | 200 | 2.7%
Bladder Transitional cell carcinoma | 187 | 2.5%
Kidney Clear cell adenocarcinoma | 186 | 2.5%
Blood or Bone marrow Healthy | 158 | 2.1%
Kidney Papillary adenocarcinoma | 154 | 2.1%
Prostate gland Acinar cell carcinoma | 131 | 1.7%
Brain Oligodendroglioma | 122 | 1.6%
Breast Lobular carcinoma | 119 | 1.6%
Brain Astrocytoma | 105 | 1.4%
Adrenal gland Pheochromocytoma | 96 | 1.3%
Blood or Bone marrow Chronic lymphocytic leukemia | 86 | 1.1%
Kidney Wilms tumor | 76 | 1.0%
Ovarian Serous cancer | 66 | 0.9%
Breast Healthy | 66 | 0.9%
Uterus Serous cystadenocarcinoma | 58 | 0.8%
Head and Neck Healthy | 57 | 0.8%
Esophagus Squamous cell carcinoma | 53 | 0.7%
Pancreas Healthy | 53 | 0.7%
Bones Osteosarcoma | 50 | 0.7%
Adrenal gland Neuroblastoma | 49 | 0.7%
Adrenal gland Adrenal cortical carcinoma | 46 | 0.6%
Esophagus Adenocarcinoma | 45 | 0.6%
Thymus Thymoma | 44 | 0.6%
Testis Seminoma | 40 | 0.5%
Kidney Malignant rhabdoid tumor | 40 | 0.5%
Prostate gland Healthy | 39 | 0.5%
Thyroid gland Healthy | 37 | 0.5%
Liver Healthy | 33 | 0.4%
Blood or Bone marrow Acute lymphocytic leukemia | 30 | 0.4%
Cervix uteri Adenocarcinoma | 27 | 0.4%
Blood or Bone marrow Acute myelomonocytic leukemia | 27 | 0.4%
Colon Healthy | 26 | 0.3%
Retroperitoneum Dedifferentiated liposarcoma | 24 | 0.3%
Pleura Epithelioid mesothelioma | 22 | 0.3%
Anterior mediastinum Thymoma | 20 | 0.3%
Retroperitoneum Leiomyosarcoma | 17 | 0.2%
Uterus Healthy | 16 | 0.2%
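
Eleven of the category levels end in "Healthy". A small sketch that totals them from an excerpt of the counts above, assuming the trailing word "Healthy" marks non-tumor samples (the report does not define the label); the tumor rows in the excerpt are included only to show the filter at work:

```python
# Excerpt of the category counts: every "Healthy" level plus a few tumor levels.
category_counts = {
    "Lung Healthy": 236,
    "Kidney Healthy": 216,
    "Blood or Bone marrow Healthy": 158,
    "Breast Healthy": 66,
    "Head and Neck Healthy": 57,
    "Pancreas Healthy": 53,
    "Prostate gland Healthy": 39,
    "Thyroid gland Healthy": 37,
    "Liver Healthy": 33,
    "Colon Healthy": 26,
    "Uterus Healthy": 16,
    "Lung Adenocarcinoma": 321,
    "Brain Glioblastoma": 221,
    "Colon Adenocarcinoma": 259,
}
n_observations = 7496

# Assumption: a level ending in " Healthy" denotes a non-tumor sample.
healthy_levels = {
    name: count
    for name, count in category_counts.items()
    if name.endswith(" Healthy")
}
healthy_total = sum(healthy_levels.values())
healthy_share = 100 * healthy_total / n_observations

print(f"{len(healthy_levels)} healthy levels, {healthy_total} samples "
      f"({healthy_share:.1f}% of observations)")
```

The total agrees with the 937 observations reported for the "Healthy" level in a later variable of this report.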

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 569
Missing (%): 7.6%
Memory size: 375.2 KiB

male: 3633
female: 3294

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: female
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value | Count | Frequency (%)
male | 3633 | 48.5%
female | 3294 | 43.9%
(Missing) | 569 | 7.6%

age_at_diagnosis
Real number (ℝ)

Distinct: 4891
Distinct (%): 74.2%
Missing: 907
Missing (%): 12.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 20585.29929
Minimum: 6
Maximum: 32872
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 375.2 KiB

Quantile statistics

Minimum: 6
5th percentile: 4456
Q1: 17745
Median: 21979
Q3: 25281
95th percentile: 29082.2
Maximum: 32872
Range: 32866
Interquartile range (IQR): 7536
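
The report does not state a unit for age_at_diagnosis. The magnitudes are consistent with ages recorded in days, so the sketch below converts the quantiles to years under that assumption (365.25 days per year) and recomputes the range and IQR. The unit, the conversion constant, and the names are all assumptions made for illustration:

```python
# Quantiles copied from the report; the unit is not stated there.
# ASSUMPTION: values are ages in days.
DAYS_PER_YEAR = 365.25

quantiles_days = {
    "Minimum": 6,
    "5th percentile": 4456,
    "Q1": 17745,
    "Median": 21979,
    "Q3": 25281,
    "95th percentile": 29082.2,
    "Maximum": 32872,
}

quantiles_years = {
    name: days / DAYS_PER_YEAR for name, days in quantiles_days.items()
}

# These two differences should match the Range and IQR rows.
range_days = quantiles_days["Maximum"] - quantiles_days["Minimum"]
iqr_days = quantiles_days["Q3"] - quantiles_days["Q1"]

for name, years in quantiles_years.items():
    print(f"{name}: {years:.1f} years")
print(f"Range: {range_days} days, IQR: {iqr_days} days")
```

If the unit really is days, the median corresponds to roughly 60 years and the maximum sits just under 90 years.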

Descriptive statistics

Standard deviation: 6876.112427
Coefficient of variation (CV): 0.3340302384
Kurtosis: 1.170683759
Mean: 20585.29929
Median Absolute Deviation (MAD): 3588
Skewness: -1.149106617
Sum: 135636537
Variance: 47280922.11
Monotonicity: Not monotonic
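
Several of these statistics are tied together by simple identities: the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum divided by the mean gives the number of non-missing values. A brief sketch that checks the reported figures against those identities; the names are illustrative:

```python
# Values copied from the descriptive statistics and the dataset overview.
std_dev = 6876.112427
mean = 20585.29929
total = 135636537
n_observations = 7496
n_missing = 907

cv = std_dev / mean                  # coefficient of variation
variance = std_dev ** 2              # variance from the standard deviation
n_non_missing = round(total / mean)  # count implied by sum and mean

print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.2f}")
print(f"Non-missing values implied by sum / mean: {n_non_missing}")
print(f"Observations minus missing: {n_observations - n_missing}")
```

The implied count equals the number of observations minus the missing values, so the mean and sum were computed over non-missing entries only.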
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
32872 | 18 | 0.2%
20550 | 7 | 0.1%
21891 | 7 | 0.1%
22718 | 7 | 0.1%
23404 | 6 | 0.1%
20286 | 6 | 0.1%
21949 | 5 | 0.1%
19314 | 5 | 0.1%
20425 | 5 | 0.1%
26559 | 5 | 0.1%
22407 | 5 | 0.1%
21263 | 5 | 0.1%
32871 | 5 | 0.1%
18885 | 5 | 0.1%
25462 | 5 | 0.1%
26555 | 5 | 0.1%
28865 | 5 | 0.1%
19855 | 5 | 0.1%
23275 | 5 | 0.1%
21144 | 5 | 0.1%
22179 | 5 | 0.1%
20598 | 4 | 0.1%
25659 | 4 | 0.1%
20507 | 4 | 0.1%
17215 | 4 | 0.1%
27320 | 4 | 0.1%
22041 | 4 | 0.1%
22176 | 4 | 0.1%
29529 | 4 | 0.1%
21194 | 4 | 0.1%
23353 | 4 | 0.1%
20369 | 4 | 0.1%
24825 | 4 | 0.1%
24019 | 4 | 0.1%
23413 | 4 | 0.1%
24846 | 4 | 0.1%
25138 | 4 | 0.1%
20237 | 4 | 0.1%
22285 | 4 | 0.1%
21488 | 4 | 0.1%
22997 | 4 | 0.1%
730 | 4 | 0.1%
24663 | 4 | 0.1%
8551 | 4 | 0.1%
21429 | 4 | 0.1%
21183 | 4 | 0.1%
23458 | 4 | 0.1%
29350 | 4 | 0.1%
18932 | 4 | 0.1%
19161 | 4 | 0.1%
21464 | 4 | 0.1%
25020 | 4 | 0.1%
19500 | 4 | 0.1%
32848 | 4 | 0.1%
1757 | 4 | 0.1%
14933 | 4 | 0.1%
24055 | 4 | 0.1%
24927 | 4 | 0.1%
22558 | 4 | 0.1%
22238 | 4 | 0.1%
22433 | 4 | 0.1%
24328 | 4 | 0.1%
22057 | 4 | 0.1%
19213 | 4 | 0.1%
30462 | 4 | 0.1%
26748 | 4 | 0.1%
12822 | 4 | 0.1%
24990 | 4 | 0.1%
25495 | 4 | 0.1%
25827 | 4 | 0.1%
24311 | 4 | 0.1%
26781 | 4 | 0.1%
26268 | 4 | 0.1%
24969 | 4 | 0.1%
22056 | 3 | < 0.1%
23380 | 3 | < 0.1%
25091 | 3 | < 0.1%
26722 | 3 | < 0.1%
19107 | 3 | < 0.1%
25300 | 3 | < 0.1%
20513 | 3 | < 0.1%
21936 | 3 | < 0.1%
23953 | 3 | < 0.1%
24957 | 3 | < 0.1%
28406 | 3 | < 0.1%
23006 | 3 | < 0.1%
24388 | 3 | < 0.1%
27556 | 3 | < 0.1%
19550 | 3 | < 0.1%
18638 | 3 | < 0.1%
1159 | 3 | < 0.1%
29674 | 3 | < 0.1%
3748 | 3 | < 0.1%
21607 | 3 | < 0.1%
24773 | 3 | < 0.1%
5452 | 3 | < 0.1%
20364 | 3 | < 0.1%
23958 | 3 | < 0.1%
15065 | 3 | < 0.1%
24187 | 3 | < 0.1%
22220 | 3 | < 0.1%
18843 | 3 | < 0.1%
18723 | 3 | < 0.1%
367 | 3 | < 0.1%
25703 | 3 | < 0.1%
22045 | 3 | < 0.1%
20066 | 3 | < 0.1%
19390 | 3 | < 0.1%
20728 | 3 | < 0.1%
26582 | 3 | < 0.1%
18822 | 3 | < 0.1%
24282 | 3 | < 0.1%
24896 | 3 | < 0.1%
24660 | 3 | < 0.1%
26976 | 3 | < 0.1%
24046 | 3 | < 0.1%
23603 | 3 | < 0.1%
26840 | 3 | < 0.1%
20771 | 3 | < 0.1%
23162 | 3 | < 0.1%
23313 | 3 | < 0.1%
21749 | 3 | < 0.1%
22455 | 3 | < 0.1%
22745 | 3 | < 0.1%
22317 | 3 | < 0.1%
21252 | 3 | < 0.1%
25935 | 3 | < 0.1%
23547 | 3 | < 0.1%
21515 | 3 | < 0.1%
21730 | 3 | < 0.1%
25709 | 3 | < 0.1%
24774 | 3 | < 0.1%
25477 | 3 | < 0.1%
25891 | 3 | < 0.1%
28432 | 3 | < 0.1%
15842 | 3 | < 0.1%
28367 | 3 | < 0.1%
26021 | 3 | < 0.1%
22771 | 3 | < 0.1%
26058 | 3 | < 0.1%
23656 | 3 | < 0.1%
26409 | 3 | < 0.1%
21164 | 3 | < 0.1%
16195 | 3 | < 0.1%
16460 | 3 | < 0.1%
17168 | 3 | < 0.1%
6000 | 3 | < 0.1%
23649 | 3 | < 0.1%
27092 | 3 | < 0.1%
25039 | 3 | < 0.1%
20979 | 3 | < 0.1%
28275 | 3 | < 0.1%
4121 | 3 | < 0.1%
549 | 3 | < 0.1%
3890 | 3 | < 0.1%
24755 | 3 | < 0.1%
24932 | 3 | < 0.1%
27104 | 3 | < 0.1%
3310 | 3 | < 0.1%
17859 | 3 | < 0.1%
22652 | 3 | < 0.1%
18742 | 3 | < 0.1%
19454 | 3 | < 0.1%
20483 | 3 | < 0.1%
23849 | 3 | < 0.1%
23634 | 3 | < 0.1%
22106 | 3 | < 0.1%
26902 | 3 | < 0.1%
645 | 3 | < 0.1%
22332 | 3 | < 0.1%
5356 | 3 | < 0.1%
26044 | 3 | < 0.1%
18491 | 3 | < 0.1%
25247 | 3 | < 0.1%
23514 | 3 | < 0.1%
22475 | 3 | < 0.1%
2686 | 3 | < 0.1%
28706 | 3 | < 0.1%
5885 | 3 | < 0.1%
13671 | 3 | < 0.1%
4559 | 3 | < 0.1%
24887 | 3 | < 0.1%
17396 | 3 | < 0.1%
25223 | 3 | < 0.1%
16504 | 3 | < 0.1%
23480 | 3 | < 0.1%
24085 | 3 | < 0.1%
21374 | 3 | < 0.1%
18611 | 3 | < 0.1%
658 | 3 | < 0.1%
329 | 3 | < 0.1%
24719 | 3 | < 0.1%
23892 | 3 | < 0.1%
21803 | 3 | < 0.1%
1482 | 3 | < 0.1%
19066 | 3 | < 0.1%
16425 | 3 | < 0.1%
5830 | 3 | < 0.1%
28714 | 3 | < 0.1%
5420 | 3 | < 0.1%
21281 | 3 | < 0.1%
22863 | 3 | < 0.1%
26143 | 3 | < 0.1%
25205 | 3 | < 0.1%
22039 | 3 | < 0.1%
3399 | 3 | < 0.1%
18827 | 3 | < 0.1%
23797 | 3 | < 0.1%
664 | 3 | < 0.1%
18659 | 3 | < 0.1%
25633 | 3 | < 0.1%
24779 | 3 | < 0.1%
23360 | 3 | < 0.1%
25889 | 3 | < 0.1%
25515 | 3 | < 0.1%
3262 | 3 | < 0.1%
6631 | 3 | < 0.1%
366 | 3 | < 0.1%
21514 | 3 | < 0.1%
21093 | 3 | < 0.1%
27183 | 3 | < 0.1%
22990 | 3 | < 0.1%
13138 | 3 | < 0.1%
27684 | 3 | < 0.1%
22684 | 3 | < 0.1%
20392 | 3 | < 0.1%
21764 | 3 | < 0.1%
24731 | 3 | < 0.1%
3050 | 3 | < 0.1%
18510 | 3 | < 0.1%
24243 | 3 | < 0.1%
15137 | 3 | < 0.1%
26084 | 3 | < 0.1%
23227 | 3 | < 0.1%
22957 | 3 | < 0.1%
24623 | 3 | < 0.1%
25902 | 3 | < 0.1%
25328 | 3 | < 0.1%
18785 | 3 | < 0.1%
22199 | 3 | < 0.1%
20363 | 3 | < 0.1%
22645 | 3 | < 0.1%
22344 | 3 | < 0.1%
23955 | 3 | < 0.1%
18990 | 3 | < 0.1%
25502 | 3 | < 0.1%
20764 | 3 | < 0.1%
26136 | 3 | < 0.1%
25407 | 3 | < 0.1%
24834 | 3 | < 0.1%
Other values (4641) | 5723 | 76.3%
(Missing) | 907 | 12.1%

Minimum 5 values

Value | Count | Frequency (%)
6 | 1 | < 0.1%
11 | 1 | < 0.1%
25 | 1 | < 0.1%
31 | 1 | < 0.1%
43 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
32872 | 18 | 0.2%
32871 | 5 | 0.1%
32848 | 4 | 0.1%
32784 | 1 | < 0.1%
32731 | 1 | < 0.1%

Distinct: 25
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 375.2 KiB

Kidney: 938
Lung: 859
Blood or Bone marrow: 841
Breast: 537
Brain: 448
Other values (20): 3873

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Colon
2nd row: Colon
3rd row: Stomach
4th row: Colon
5th row: Brain

Common Values

Value | Count | Frequency (%)
Kidney | 938 | 12.5%
Lung | 859 | 11.5%
Blood or Bone marrow | 841 | 11.2%
Breast | 537 | 7.2%
Brain | 448 | 6.0%
Uterus | 436 | 5.8%
Head and Neck | 397 | 5.3%
Prostate gland | 391 | 5.2%
Thyroid gland | 374 | 5.0%
Colon | 285 | 3.8%
Liver | 264 | 3.5%
Pancreas | 253 | 3.4%
Cervix uteri | 248 | 3.3%
Skin | 237 | 3.2%
Stomach | 229 | 3.1%
Adrenal gland | 191 | 2.5%
Bladder | 187 | 2.5%
Esophagus | 98 | 1.3%
Ovarian | 66 | 0.9%
Bones | 50 | 0.7%
Thymus | 44 | 0.6%
Retroperitoneum | 41 | 0.5%
Testis | 40 | 0.5%
Pleura | 22 | 0.3%
Anterior mediastinum | 20 | 0.3%

Distinct: 38
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 375.2 KiB

Adenocarcinoma: 989
Healthy: 937
Squamous cell carcinoma: 916
Infiltrating duct carcinoma: 552
Acute myeloid leukemia: 540
Other values (33): 3562

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Adenocarcinoma
2nd row: Adenocarcinoma
3rd row: Adenocarcinoma
4th row: Adenocarcinoma
5th row: Glioblastoma

Common Values

Value | Count | Frequency (%)
Adenocarcinoma | 989 | 13.2%
Healthy | 937 | 12.5%
Squamous cell carcinoma | 916 | 12.2%
Infiltrating duct carcinoma | 552 | 7.4%
Acute myeloid leukemia | 540 | 7.2%
Papillary adenocarcinoma | 389 | 5.2%
Endometrioid adenocarcinoma | 362 | 4.8%
Renal cell carcinoma | 266 | 3.5%
Malignant melanoma | 237 | 3.2%
Hepatocellular carcinoma | 231 | 3.1%
Glioblastoma | 221 | 2.9%
Clear cell adenocarcinoma | 186 | 2.5%
Transitional cell carcinoma | 149 | 2.0%
Acinar cell carcinoma | 131 | 1.7%
Oligodendroglioma | 122 | 1.6%
Lobular carcinoma | 119 | 1.6%
Astrocytoma | 105 | 1.4%
Papillary carcinoma | 102 | 1.4%
Pheochromocytoma | 96 | 1.3%
Chronic lymphocytic leukemia | 86 | 1.1%
Wilms tumor | 76 | 1.0%
Serous cancer | 66 | 0.9%
Thymoma | 64 | 0.9%
Serous cystadenocarcinoma | 58 | 0.8%
Osteosarcoma | 50 | 0.7%
Neuroblastoma | 49 | 0.7%
Tubular adenocarcinoma | 46 | 0.6%
Adrenal cortical carcinoma | 46 | 0.6%
Seminoma | 40 | 0.5%
Malignant rhabdoid tumor | 40 | 0.5%
Papillary transitional cell carcinoma | 38 | 0.5%
Carcinoma | 37 | 0.5%
Mucinous adenocarcinoma | 30 | 0.4%
Acute lymphocytic leukemia | 30 | 0.4%
Acute myelomonocytic leukemia | 27 | 0.4%
Dedifferentiated liposarcoma | 24 | 0.3%
Epithelioid mesothelioma | 22 | 0.3%
Leiomyosarcoma | 17 | 0.2%

tumor_grade
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 4454
Missing (%): 59.4%
Memory size: 375.2 KiB

G2: 1493
G3: 1181
G1: 267
G4: 101

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: G2
2nd row: G2
3rd row: G2
4th row: G2
5th row: G1

Common Values

Value | Count | Frequency (%)
G2 | 1493 | 19.9%
G3 | 1181 | 15.8%
G1 | 267 | 3.6%
G4 | 101 | 1.3%
(Missing) | 4454 | 59.4%
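
Because 59.4% of tumor_grade is missing, the percentages above are relative to all 7496 observations. A minimal sketch that re-expresses each grade as a share of the graded samples only, which may be the more useful view when comparing grades; the names are illustrative:

```python
# Counts copied from the tumor_grade table.
n_observations = 7496
n_missing = 4454
grade_counts = {"G1": 267, "G2": 1493, "G3": 1181, "G4": 101}

# Samples that carry a grade at all.
n_graded = sum(grade_counts.values())

# Share of each grade among graded samples, in percent.
grade_share = {
    grade: 100 * count / n_graded for grade, count in grade_counts.items()
}

print(f"Graded samples: {n_graded} of {n_observations}")
for grade, share in grade_share.items():
    print(f"{grade}: {share:.1f}% of graded samples")
```

On this basis G2 and G3 together make up most of the graded samples, while G1 and G4 remain small minorities.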

platform
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 375.2 KiB

450K: 5369
EPIC: 2104
EPICv2: 23

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EPIC
2nd row: EPIC
3rd row: EPIC
4th row: EPIC
5th row: EPIC

Common Values

Value | Count | Frequency (%)
450K | 5369 | 71.6%
EPIC | 2104 | 28.1%
EPICv2 | 23 | 0.3%
