Full Title page:

Title: Transformer-Enhanced Generative Adversarial Networks for Improving MR Image Quality in Prostate Imaging

Author list

Jie Bao1,2^#, Litao Zhao3,4^#, Ying Hou5, Yueting Su6, Libiao Ji7, Junkang Shen8, Ximing Wang1,2, Hailin Shen9, Pheng-Ann Heng3,4*, Chunhong Hu1,2*, Yu-Dong Zhang5*

1. Department of Radiology, the First Affiliated Hospital of Soochow University

188#, Shizi Road, Suzhou, Jiangsu, 215006, China

2. Institute of Medical Imaging, Soochow University

1#, Shizi Road, Suzhou, Jiangsu, 215006, China

3. Department of Computer Science and Engineering, The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

4. Institute of Medical Intelligence and XR, The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

5. Department of Radiology, the First Affiliated Hospital of Nanjing Medical University

300#, Guangzhou Road, Nanjing, Jiangsu, 210029, China

6. Department of Radiology, The People's Hospital of Taizhou

210#, Yingchun Road, Taizhou, Jiangsu, 225399, China

7. Department of Radiology, Changshu NO.1 People's Hospital

1#, Shuyuan Street, Changshu, Jiangsu, 215501, China

8. Department of Radiology, the Second Affiliated Hospital of Soochow University

1055#, Sanxiang Road, Suzhou, Jiangsu, 215004, China

9. Department of Radiology, Suzhou Kowloon Hospital, Shanghai Jiaotong University School of Medicine

118#, Wanshen street, Suzhou, Jiangsu, 215028, China

*Corresponding author

^# Equal contribution

Jie Bao

1.Department of Radiology

the First Affiliated Hospital of Soochow University

188#, Shizi Road, Suzhou, 215006, China

2. Institute of Medical Imaging, Soochow University

1#, Shizi Road, Suzhou, Jiangsu, 215006, China

E-mail: baojie7346@suda.edu.cn

Litao Zhao

1. Department of Computer Science and Engineering

The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

2.Institute of Medical Intelligence and XR

The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

E-mail: litaozhao@cuhk.edu.hk

Ying Hou

Department of Radiology

the First Affiliated Hospital of Nanjing Medical University

300#, Guangzhou Road, Nanjing, Jiangsu, 210029, China

E-mail: njmu_hy@163.com

Yueting Su

Department of Radiology,

The People's Hospital of Taizhou

210# Yingchun Road, Taizhou, Jiangsu, 225399, China

E-mail: 291514032@qq.com

Libiao Ji

Department of Radiology

Changshu NO.1 People's Hospital

1# Shuyuan Street, Changshu, Jiangsu, 215501, China

E-mail: jilibiao@163.com

Junkang Shen

Department of Radiology

the Second Affiliated Hospital of Soochow University

1055# Sanxiang Road, Suzhou, Jiangsu, 215004, China

E-mail: 824415884@qq.com

Ximing Wang

1.Department of Radiology

the First Affiliated Hospital of Soochow University

188#, Shizi Road, Suzhou, 215006, China

2. Institute of Medical Imaging, Soochow University

1#, Shizi Road, Suzhou, Jiangsu, 215006, China

E-mail: wangximing1998@163.com

Hailin Shen

Department of Radiology

Suzhou Kowloon Hospital, Shanghai Jiaotong University School of Medicine

118# Wanshen street, Suzhou, Jiangsu, 215028, China

E-mail: hailinshen@163.com

Pheng-Ann Heng

1. Department of Computer Science and Engineering

The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

2.Institute of Medical Intelligence and XR

The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

E-mail: pheng@cse.cuhk.edu.hk

Chunhong Hu

1.Department of Radiology

the First Affiliated Hospital of Soochow University

188#, Shizi Road, Suzhou, 215006, China

2. Institute of Medical Imaging, Soochow University

1#, Shizi Road, Suzhou, Jiangsu, 215006, China

E-mail: sdfyyhch@163.com

Yu-Dong Zhang

Department of Radiology

the First Affiliated Hospital of Nanjing Medical University

300#, Guangzhou Road, Nanjing, Jiangsu, 210029, China

E-mail: njmu_zyd@163.com

Correspondence to:

Yu-Dong Zhang

Department of Radiology

the First Affiliated Hospital of Nanjing Medical University

300#, Guangzhou Road, Nanjing, Jiangsu, 210029, China

E-mail: njmu_zyd@163.com

Chunhong Hu

1.Department of Radiology

the First Affiliated Hospital of Soochow University

188#, Shizi Road, Suzhou, 215006, China

2. Institute of Medical Imaging, Soochow University

1#, Shizi Road, Suzhou, Jiangsu, 215006, China

E-mail: sdfyyhch@163.com

Pheng-Ann Heng

1. Department of Computer Science and Engineering

The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

2.Institute of Medical Intelligence and XR

The Chinese University of Hong Kong

Shatin, New Territories, Hong Kong, SAR, 999077, China

E-mail: pheng@cse.cuhk.edu.hk

Funding

This work was supported by the National Natural Science Foundation of China (grant 82402227, to J.B.), the Research Grants Council of the Hong Kong Special Administrative Region, China (grant T45-401/22-N, to Pheng-Ann Heng), and the National Natural Science Foundation of China (grant 82302308, to Y.H.).

Declarations

This study was approved by the local Research Ethics Board, and informed patient consent was waived.

All procedures performed in the present study involving human participants were in accordance with the 1964 Helsinki Declaration and its later amendments.

Conflict of Interest

The authors declare that they have no Conflict of Interest.

Author Contribution

We respectfully request that J.B. and L.T.Z. be considered for co-first authorship. The contributions of the two authors were interdependent, synergistic, and of equal importance to the success of this project. The study bridges a critical gap between clinical radiology and artificial intelligence; Jie Bao provided the essential clinical context and data infrastructure, while Litao Zhao provided the technical capability to build and validate the models. Their continuous collaboration throughout the project ensured that the model was clinically relevant and that the data were structured to maximize technical performance.

J.B., L.T.Z., Y.D.Z., C.H.H., and P.A.H. conceived, designed, and supervised the project; J.B., Y.T.S., L.B.J., J.K.S., X.M.W., H.L.S., Y.H., and Y.D.Z. collected and pre-processed all data and performed the research; J.B., Y.T.S., L.B.J., J.K.S., X.M.W., H.L.S., Y.H., and Y.D.Z. performed imaging data annotation and clinical data review; L.T.Z. proposed the model; J.B., L.T.Z., and Y.D.Z. drafted the paper; all authors reviewed, edited, and approved the final version of the article.

Acknowledgement

The authors thank all those who helped us during the writing of this research. We also thank the multicenter hospitals’ Department of Urology and Pathology for their valuable help and feedback.

Data Availability

The imaging studies and clinical data used for algorithm development are not publicly available because they contain private patient health information. Interested users may request access to these data, where institutional approvals, along with signed data use agreements and/or material transfer agreements may be needed/negotiated. Derived result data supporting the findings of this study are available upon reasonable request.

Ethics approval and Consent to participate

Ethics Committee approval was granted by the local institutional ethics review boards (Center 1: 2024-SR-586, with the prospective data registered at ClinicalTrials.gov under NCT06636747; Center 2: 2017-195; Center 3: JD-HG-2021-31; Center 4: 2019116; Center 5: KY2019046; Center 6: 2023-L-1343; Center 7: XJ-2017-011), and the requirement for written informed consent was waived.

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Consent for publication:

Not applicable

Abstract

Consistently high-quality and standardized prostate MRI is crucial for reliable PI-RADS interpretation and accurate prostate cancer detection, particularly given the heterogeneity in acquisition protocols across institutions. In this study, we develop and validate an innovative anatomy-aware pyramid-spatially adaptive normalization and transformer-enhanced CycleGAN (PST-CyGAN) framework for standardizing and enhancing prostate MRI quality in large-scale multicenter datasets. This study included 2,207 patients for model training and 497 for validation from 8 centers, all of whom underwent prostate MRI between January 2017 and May 2025. PST-CyGAN was built upon CycleGAN with integrated anatomical attention, pyramid-spatially adaptive normalization, and transformer enhancement. High-resolution T2-weighted imaging (HR-T2WI; resolution: 0.35 × 0.35 × 2.0 mm) with quality control was acquired from 941 patients to ensure the model’s ability to generate high-quality images. Ablation tests were conducted comparing PST-CyGAN against three state-of-the-art (SOTA) models: CycleGAN, S-CyGAN, and PS-CyGAN. For clinical validation, five blinded radiologists evaluated the image quality of PST-CyGAN-generated HR-T2WI using a 10-point PI-QUAL scoring system. Changes in PI-QUAL scores compared to standard-resolution (SR) T2WI were analyzed using Bland–Altman plots and paired Wilcoxon tests. In ablation testing, PST-CyGAN consistently outperformed all three SOTA models across all metrics: PSNR (29.3 ± 0.5), SSIM (0.558 ± 0.116), FCS (0.915 ± 0.028), and LPIPS (0.169 ± 0.056) (all P < 0.01). Improvements were most notable in initially low-quality scans (PI-QUAL < 6). In multi-reader validation, PST-CyGAN trained with quality-controlled data achieved the highest clinical benefit, upgrading PI-QUAL scores in 38% of cases and downgrading in 20%, compared to 11% upgrades and 69% downgrades without quality control (P < 0.01). 
Prospective testing demonstrated PI-QUAL score improvements of 25.5%, 20.0%, and 29.1% for voxel sizes of 1.5×1.5×3.3 mm, 0.7×0.7×3.3 mm, and 0.4×0.4×2.0 mm, respectively, confirming robust generalizability. PST-CyGAN significantly and reproducibly enhances prostate T₂WI quality across diverse centers and acquisition protocols, exceeding all SOTA models on quantitative metrics. By generating standardized, high-resolution synthetic T₂WI—especially from initially low-quality scans—it reduces inter-site variability, supports consistent PI-RADS application, and may improve prostate MRI interpretation in clinical workflows.

Keywords:

Prostate cancer

multiparametric MRI

CycleGAN

deep learning

PI-RADS

image quality standardization

Abbreviations

PST-CyGAN = pyramid-spatially-adaptive normalization and transformer-enhanced CycleGAN, GAN = generative adversarial networks, S-CyGAN = PST-CyGAN without the Pyramid and Transformer module, PS-CyGAN = PST-CyGAN without the Transformer module.

Background

Magnetic resonance imaging (MRI) is widely recognized as the primary tool for the detection and risk stratification of prostate cancer (PCa) ¹. Among various sequences, T2-weighted imaging (T2WI) stands out due to its high spatial resolution ^2–4 and fundamental role in evaluating prostate diseases within the Prostate Imaging Reporting and Data System (PI-RADS) framework. Consequently, obtaining standardized, high-quality images is essential for any MRI-based PCa diagnostic pathway, as it directly affects the accuracy of cancer detection and subsequent clinical decisions ^3,5.

Despite its clinical importance, MRI-based diagnosis of PCa—particularly clinically significant PCa—faces considerable challenges. First, substantial heterogeneity exists in image quality and acquisition parameters across medical institutions ⁶. These variations arise from differences in scanner hardware (e.g., manufacturer, model, field strength), software configurations, local scanning protocols, and regional expertise ^3,7,8. Second, limited image quality impedes diagnostic accuracy. T2WI, being a cornerstone sequence, often suffers from inconsistent resolution, low signal-to-noise ratio (SNR), and artifacts. Suboptimal image quality—manifested as excessive noise, poor spatial resolution, motion artifacts, or inadequate contrast—can hinder the precise delineation of prostate anatomy (e.g., zonal boundaries, capsule integrity) and the reliable detection and characterization of lesions according to PI-RADS criteria ^9–11. Such limitations erode diagnostic confidence, increase inter-reader variability, and may result in missed diagnoses or unnecessary procedures. Thus, achieving consistent, high-quality prostate MRI is critical for reproducible interpretation, effective treatment planning, and improved patient outcomes. There remains a pressing clinical need for standardized high-resolution MRI.

To enhance MR image quality for improved PCa diagnosis and evaluation, various methods have been proposed for image synthesis and quality enhancement. Automated techniques can efficiently generate synthetic high-resolution images from low-resolution inputs, reducing scan time and mitigating additional acquisition demands ¹². Deep learning (DL) has been successfully employed in pattern recognition and medical image computing tasks, including image synthesis ^13,14. Most DL-based MR enhancement methods rely on supervised training with paired high- and low-resolution images from the same patients, requiring either repeated scans or data rebinning to simulate corresponding image pairs. This approach can be costly and may introduce misalignment between training and testing data, potentially compromising output quality.

To address these issues, unsupervised methods have been developed. Several generative DL models facilitate image synthesis, including deep belief networks (DBNs) ¹⁵, variational autoencoders (VAEs) ¹⁶, and generative adversarial networks (GANs) ^17,18. GANs, in particular, utilize a discriminator network to preserve high-frequency details—such as tumor margins and neurovascular bundles—through adversarial training ^14,19. CycleGAN, as a weakly supervised framework, maintains a generator-discriminator architecture²⁰ and offers advantages in unpaired image-to-image translation and bidirectional domain mapping ¹⁴. It establishes feature-level correspondences between domains without requiring pixel-wise alignment, thereby circumventing the need for paired datasets.
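The cycle-consistency idea that lets CycleGAN learn from unpaired data can be illustrated with a minimal NumPy sketch. The two "generators" below are toy invertible functions, not the networks used in this study, and the L1 form of the loss follows the standard CycleGAN formulation; all function and variable names are assumptions for illustration.

```python
import numpy as np

def cycle_consistency_loss(x_a, x_b, g_ab, g_ba):
    """Sum of mean absolute round-trip errors for A -> B -> A and B -> A -> B."""
    rec_a = g_ba(g_ab(x_a))  # translate to domain B, then back to A
    rec_b = g_ab(g_ba(x_b))  # translate to domain A, then back to B
    return float(np.mean(np.abs(rec_a - x_a)) + np.mean(np.abs(rec_b - x_b)))

# Toy stand-ins for the generators (hypothetical): an intensity rescaling and its inverse.
def g_ab(img):
    return 2.0 * img + 1.0

def g_ba(img):
    return (img - 1.0) / 2.0

rng = np.random.default_rng(0)
x_a = rng.random((4, 4))  # unpaired sample from domain A (e.g., an SR-T2WI patch)
x_b = rng.random((4, 4))  # unpaired sample from domain B (e.g., an HR-T2WI patch)

# Mutually inverse generators give a near-zero loss; a mismatched pair does not.
loss_inverse = cycle_consistency_loss(x_a, x_b, g_ab, g_ba)
loss_mismatch = cycle_consistency_loss(x_a, x_b, g_ab, lambda img: img)
```

Because the loss only compares each image with its own round-trip reconstruction, no pixel-wise pairing between the two domains is needed.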

Despite its utility in unpaired image synthesis, CycleGAN still faces several limitations in capturing high-resolution features effectively. First, variations in scanner manufacturers and parameter settings across prostate MRI protocols often lead to misalignment in spatial information between high- and low-quality images, particularly within prostate gland structures. Second, CycleGAN primarily emphasizes pixel-level transformations, often overlooking semantic information such as organ position and tissue categorization, which can result in semantically inconsistent outputs. Third, conventional CycleGAN models rely heavily on convolutional neural networks (CNNs), which are limited in modeling long-range dependencies.

To overcome these challenges, we aimed to design and validate an innovative anatomy-aware pyramid-spatially-adaptive normalization and transformer-enhanced CycleGAN (PST-CyGAN) framework for standardizing and enhancing prostate MRI quality within highly heterogeneous multicenter datasets.

Results

Visual Quality and Quantitative Evaluation of Ablation Study

First, an ablation test compared PST-CyGAN with three state-of-the-art (SOTA) models: CycleGAN, S-CyGAN, and PS-CyGAN. The image quality of the reconstructed HR-T₂WI was quantitatively evaluated by comparing PSNR, SSIM, LPIPS, and FCS in the validation datasets. Overall, PST-CyGAN achieved significantly higher PSNR, SSIM, and FCS scores than CycleGAN, S-CyGAN, and PS-CyGAN (Mann-Whitney U test, all P < 0.01). In addition, PST-CyGAN produced lower LPIPS than the three SOTA models (Mann-Whitney U test, all P < 0.01), indicating higher perceptual image quality. Detailed results are presented in Fig. 1 and the supplemental materials (Table E1). Representative images reconstructed with CycleGAN, S-CyGAN, PS-CyGAN, and PST-CyGAN are shown in the supplemental material (Figure E1).
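For readers unfamiliar with the fidelity metrics, the following sketch shows the standard definition of PSNR. It is an illustration only: the function name, the assumed intensity range of 1.0, and the toy images are assumptions, and the manuscript does not specify the exact implementation or reference images used in the study.

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Toy example: a uniform error of 0.1 gives MSE = 0.01, i.e., a PSNR of 20 dB.
reference_image = np.zeros((8, 8))
degraded_image = reference_image + 0.1
psnr_value = psnr(reference_image, degraded_image)
```

Higher PSNR means a smaller mean squared error relative to the reference, which is why it is reported alongside structural (SSIM) and perceptual (LPIPS) measures rather than alone.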

Fig. 1

Dot plots show image quality evaluation in independent validation datasets. PST-CyGAN model demonstrated significantly higher mean values for PSNR, SSIM, and FCS compared to the CycleGAN, S-CyGAN, and PS-CyGAN models (all P < 0.01). Conversely, the LPIPS score of the PST-CyGAN model was significantly lower than those of the other three models (P < 0.01). All P-values were calculated using the Mann-Whitney U test.

Comparison of image semiquantitative evaluations

Synthetic HR-T₂WI images were generated with two PST-CyGAN variants: a quality-uncontrolled model (PST-CyGAN_L1) and a quality-controlled model (PST-CyGAN_L2). Each variant was evaluated at three spatial-resolution levels: (i) baseline (PST-CyGAN_L1 vs PST-CyGAN_L2), (ii) high-resolution (PST-CyGAN_L1_2 vs PST-CyGAN_L2_2), and (iii) ultra-high-resolution (PST-CyGAN_L1_3 vs PST-CyGAN_L2_3). All images in the validation datasets (one SR-T₂WI and six synthetic HR-T₂WI per case) were independently assessed by five readers at NUH and SUH_1st using the “WL-free” 10-point PI-QUAL scoring scheme. This resulted in a total of 464 validation datasets with 2,320 paired PI-QUAL assessments from Center 4 (n = 29), Center 6 (n = 46), Center 2 (n = 41), Center 2 (n = 16), TZH (n = 30), PICAI (n = 30), and Center 1 (n = 272). PI-QUAL scores showed moderate agreement among the five readers, with an inter-reader intraclass correlation coefficient (ICC) of 0.655 for SR-T₂WI and 0.511 to 0.641 for the six synthetic HR-T₂WI (supplemental Figure E2).

The absolute PI-QUAL score downgrades and upgrades of SR-T₂WI after PST-CyGAN reconstruction are plotted in Fig. 2. Across 464 validation datasets from 7 centers, 5 readers produced a total of 2,320 pairwise measurements. First, the likelihood of a downgrade decreased monotonically, and the likelihood of an upgrade increased, as the original SR-T₂WI PI-QUAL score decreased. In other words, the image quality gain afforded by synthetic PST-CyGAN reconstructions was inversely related to the baseline image quality: the largest improvements were observed for initially low-quality (PI-QUAL score < 6) SR-T₂WI, whereas the improvement was limited or even counterproductive for initially high-quality (PI-QUAL score ≥ 6) SR-T₂WI. Second, PST-CyGAN_L2 reconstruction achieved markedly more quality upgrades (L2, 28% [658/2320] vs L1, 6% [144/2320]; L2_2, 35% [802/2320] vs L1_2, 10% [241/2320]; L2_3, 38% [874/2320] vs L1_3, 11% [262/2320]; all P values < 0.01 by McNemar test) and fewer quality downgrades (L2, 25% [578/2320] vs L1, 75% [1740/2320]; L2_2, 23% [527/2320] vs L1_2, 71% [1646/2320]; L2_3, 20% [467/2320] vs L1_3, 69% [1599/2320]; all P values < 0.01) than PST-CyGAN_L1 reconstruction. In particular, PST-CyGAN_L2_3 reconstruction achieved the greatest image quality improvement among all six reconstruction modes. Details of PI-QUAL scores from 10 independent readers are summarized in the supplemental materials (Figs. 2 and 3).
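The reported upgrade and downgrade percentages follow directly from the counts given above. The short script below recomputes them from those counts; the dictionary layout and the "net benefit" ranking (upgrade rate minus downgrade rate) are assumptions introduced here for illustration, not an analysis described by the authors.

```python
TOTAL_READINGS = 2320  # 464 validation datasets x 5 readers

# Counts as reported in the text for each reconstruction mode.
upgrades = {"L1": 144, "L1_2": 241, "L1_3": 262, "L2": 658, "L2_2": 802, "L2_3": 874}
downgrades = {"L1": 1740, "L1_2": 1646, "L1_3": 1599, "L2": 578, "L2_2": 527, "L2_3": 467}

def percent(count, total=TOTAL_READINGS):
    """Percentage of paired readings, rounded to the nearest integer."""
    return round(100.0 * count / total)

# (upgrade %, downgrade %) for each mode.
summary = {mode: (percent(upgrades[mode]), percent(downgrades[mode])) for mode in upgrades}

# Illustrative ranking: the mode with the largest net benefit.
best_mode = max(summary, key=lambda mode: summary[mode][0] - summary[mode][1])

for mode, (up, down) in summary.items():
    print(f"PST-CyGAN_{mode}: {up}% upgraded, {down}% downgraded")
```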

Fig. 2

Confusion-matrix plots of PI-QUAL scores for SR-T₂WI versus PST-CyGAN reconstructions. Plots summarize the assessments of five readers in 464 datasets from 7 validation centers based on WL-free PI-QUAL scoring. The improvement was evident in initially low-quality (PI-QUAL score < 6) SR-T₂WI but was limited or even counterproductive in initially high-quality (PI-QUAL score ≥ 6) SR-T₂WI. PST-CyGAN_L2, especially the L2_3 reconstruction, produced more quality upgrades and fewer quality downgrades than PST-CyGAN_L1 reconstruction.

Fig. 3

Confusion-matrix plots of PI-QUAL scores for SR-T₂WI versus PST-CyGAN reconstructions. (A) PI-QUAL scores based on PST-CyGAN index images. (B) Bland-Altman plot of SR-T₂WI against PST-CyGAN index images; using a ± 2 score change as the threshold for a clinically relevant quality upgrade or downgrade, PST-CyGAN resulted in marked improvements in PI-QUAL score. (C) Representative cases of SR-T₂WI and the corresponding synthetic T₂WI. The left column presents the SR-T₂WI of one patient and the synthetic images reconstructed with the six PST-CyGAN modes. The right column presents the SR-T₂WI of five patients from different validation centers alongside the index synthetic images reconstructed with PST-CyGAN.

To determine the maximum capacity of PST-CyGAN for image reconstruction, the generated HR-T₂WI with the highest PI-QUAL score among the six PST-CyGAN reconstruction modes (L1, L1_2, L1_3, L2, L2_2, and L2_3) was selected as the index image for comparison with SR-T₂WI. Based on this criterion, PST-CyGAN achieved a maximum of 958 (41%) upgrades, with 355 (15%) downgrades (Fig. 3A). In addition, using a ± 2 score change as the threshold for a clinically relevant quality upgrade or downgrade, PST-CyGAN resulted in 668 (28.8%) upgrades with only 30 (1.3%) downgrades (Fig. 3B). Representative cases of PI-QUAL score upgrades or downgrades between SR-T₂WI and the corresponding synthetic T₂WI are displayed in Fig. 3C.
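The index-image selection and the thresholded upgrade/downgrade labeling can be sketched as follows. The scores are hypothetical and are not study data, the helper names are invented for illustration, and treating the ± 2 threshold as "a change of at least 2 points" is an assumption about how the threshold was applied.

```python
def index_score(mode_scores):
    """Highest PI-QUAL score among the reconstruction modes of one case."""
    return max(mode_scores.values())

def classify_change(sr_score, synthetic_score, threshold=1):
    """Label a paired reading as 'upgrade', 'downgrade', or 'unchanged'."""
    difference = synthetic_score - sr_score
    if difference >= threshold:
        return "upgrade"
    if difference <= -threshold:
        return "downgrade"
    return "unchanged"

# Hypothetical PI-QUAL readings for three cases (SR score plus six synthetic modes).
cases = [
    {"sr": 4, "modes": {"L1": 3, "L1_2": 4, "L1_3": 4, "L2": 5, "L2_2": 6, "L2_3": 7}},
    {"sr": 8, "modes": {"L1": 6, "L1_2": 6, "L1_3": 7, "L2": 7, "L2_2": 7, "L2_3": 7}},
    {"sr": 6, "modes": {"L1": 5, "L1_2": 5, "L1_3": 5, "L2": 6, "L2_2": 6, "L2_3": 6}},
]

# Any change in score versus only clinically relevant changes (at least 2 points).
any_change = [classify_change(c["sr"], index_score(c["modes"])) for c in cases]
relevant_change = [classify_change(c["sr"], index_score(c["modes"]), threshold=2) for c in cases]
```

In this toy example the second case counts as a downgrade under the 1-point criterion but not under the 2-point criterion, which mirrors why the downgrade count drops sharply when the stricter threshold is used.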

Last, the generalizability of PST-CyGAN was tested in 11 patients prospectively scanned with voxel sizes of 1.5 × 1.5 × 3.3 mm, 0.7 × 0.7 × 3.3 mm, and 0.4 × 0.4 × 2 mm. The median PI-QUAL score was 5 (95% CI, 4 to 5.2), 7 (95% CI, 6.0 to 7.2), and 6 (95% CI, 4.8 to 7.0) for SR-T₂WI at 1.5 × 1.5 × 3.3 mm, 0.7 × 0.7 × 3.3 mm, and 0.4 × 0.4 × 2 mm, respectively. After PST-CyGAN reconstruction, the PI-QUAL score was upgraded by a mean of 25.5% (95% CI, 19.2% to 31.7%), 20% (95% CI, 13.3% to 26.7%), and 29.1% (95% CI, 16.2% to 42%) for SR-T₂WI at 1.5 × 1.5 × 3.3 mm, 0.7 × 0.7 × 3.3 mm, and 0.4 × 0.4 × 2 mm, respectively (Fig. 4).

Fig. 4

Results of PST-CyGAN reconstructions in images acquired at different voxel sizes. HRDL indicates the PI-QUAL score of T₂WI acquired with deep learning-based denoising at a voxel size of 0.3 × 0.3 × 2 mm. SR Q1, Q2, and Q3 indicate conventional scans at voxel sizes of 1.5 × 1.5 × 3.3 mm, 0.7 × 0.7 × 3.3 mm, and 0.4 × 0.4 × 2 mm, respectively. GR Q1, Q2, and Q3 indicate the corresponding synthetic images reconstructed with the index PST-CyGAN mode. The right columns show images of a patient with prostate cancer (white arrowhead) acquired at three different spatial resolutions. The image quality of the primary images was clearly improved by PST-CyGAN reconstruction.

Discussion

In this multicenter study, we developed and validated an innovative anatomy-aware pyramid-spatially-adaptive normalization and transformer-enhanced CycleGAN (PST-CyGAN) framework for standardizing prostate MRI and enhancing image quality across highly heterogeneous multi-center datasets. Through comprehensive ablation studies and multi-reader evaluations, our model demonstrated superior performance compared to existing CycleGAN variants and exhibited strong generalizability. These results underscore its potential to address the persistent challenge of image quality variability in routine clinical practice.

Regarding synergistic contributions of integrated model components, our ablation experiments revealed that PST-CyGAN significantly outperformed all comparator models—including CycleGAN and its variants—across four key metrics: PSNR, SSIM, LPIPS, and FCS. This consistent advantage affirms the effectiveness of the PST-CyGAN architecture, particularly its integration of three core modules: SPADE, Pyramid, and Transformer. The SPADE module is designed to preserve anatomical structural information from input standard-resolution T2W images. The performance gain observed when integrating SPADE into CycleGAN (S-CyGAN > CycleGAN), along with PST-CyGAN’s superiority in SSIM and FCS, highlights its ability to generate images with high structural and semantic consistency relative to real high-quality references. This enhancement stems from SPADE’s use of prostate masks to adaptively modulate normalization parameters, thereby preserving spatial layout and fine details in critical anatomical regions ²¹. The Pyramid module captures multi-scale information ²², essential for contextual tasks such as whole-gland localization and fine edge delineation. The improvement seen after incorporating Pyramid into S-CyGAN (PS-CyGAN > S-CyGAN), coupled with PST-CyGAN’s strong performance in PSNR and SSIM, confirms its role in maintaining high fidelity across scales, reducing local distortions and artifacts, and aligning outputs more closely with reference images at both pixel and structural levels. The Transformer module captures long-range contextual relationships via self-attention ²³, enabling interactions between any pixel and all others in the image—critical for high perceptual quality. The improvement in LPIPS after integrating Transformer (PST-CyGAN > PS-CyGAN) demonstrates its efficacy in reducing local inconsistencies and enhancing perceptual performance. 
Finally, the stepwise performance improvement across ablation models (CycleGAN < S-CyGAN < PS-CyGAN < PST-CyGAN) confirms the complementary and indispensable contribution of each module to image quality enhancement.
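The mask-driven modulation performed by SPADE, as described above, can be sketched in NumPy. This is a simplified illustration and not the PST-CyGAN implementation: the learned convolutions that map the mask to scale and shift maps are replaced by hypothetical fixed functions, and the `(1 + gamma)` form of the scaling is a common implementation choice assumed here.

```python
import numpy as np

def spade_normalize(features, mask, gamma_fn, beta_fn, eps=1e-5):
    """Normalize features per channel, then modulate them with mask-derived maps.

    features: array of shape (C, H, W)
    mask:     array of shape (H, W), e.g., a binary prostate mask
    gamma_fn, beta_fn: callables mapping the mask to (C, H, W) scale and shift maps
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    # Spatially varying scale and shift: each pixel is modulated by its mask value.
    return normalized * (1.0 + gamma_fn(mask)) + beta_fn(mask)

NUM_CHANNELS = 2

# Hypothetical stand-ins for the small learned networks that act on the mask.
def gamma_fn(mask):
    return np.stack([0.5 * mask] * NUM_CHANNELS)

def beta_fn(mask):
    return np.stack([0.1 * mask] * NUM_CHANNELS)

rng = np.random.default_rng(0)
features = rng.normal(size=(NUM_CHANNELS, 8, 8))
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0  # toy "prostate" region

modulated = spade_normalize(features, mask, gamma_fn, beta_fn)
```

Outside the mask the features are only normalized, whereas inside the mask they are additionally rescaled and shifted, which is how the anatomical layout is injected into the generator's feature maps.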

Regarding the potential improvement in image quality, a comprehensive semiquantitative evaluation was conducted on synthetic HR-T₂WI images generated by two PST-CyGAN variants (PST-CyGAN_L1 and PST-CyGAN_L2) across three resolution levels and key observations emerged: First, the most notable improvements occurred in low-quality images (PI-QUAL < 6), which typically exhibit substantial noise, artifacts, and blur. The model effectively suppressed noise and recovered anatomical structures with high fidelity. In contrast, improvements were limited in high-quality images (PI-QUAL ≥ 6), likely because these already contain minimal noise and well-defined structures. In such cases, excessive smoothing or synthetic artifacts introduced by the model occasionally led to score reductions. Second, the PST-CyGAN_L2 variant consistently outperformed L1 across all resolution levels, highlighting the importance of curated training data in medical image generation. While the L1 model—trained on the full dataset—produced variable output quality due to learning from mixed-quality examples, the L2 model was trained exclusively on radiologist-curated high-quality (PI-QUAL Q3) images. This enabled L2 to map all inputs into a refined high-quality image space, yielding more consistent and controllable enhancements. Third, as output resolution increased (from L2 to L2_2 to L2_3), performance improved steadily: upgrade rates rose (28% → 35% → 38%) while downgrade rates declined (25% → 23% → 20%). This indicates that higher-resolution outputs better reconstruct high-frequency details while minimizing artifacts, confirming the scalability of our architectural design and establishing PST-CyGAN_L2_3 as the current optimal model.

Limitations

This study has several limitations. First, the model was developed and validated using only T₂WI. Although T₂WI offers excellent anatomical detail, incorporating functional sequences such as DWI and ADC maps could provide complementary information and improve lesion characterization within a multiparametric framework. Second, the model provided limited benefit for already high-quality images (PI-QUAL ≥ 6) and sometimes introduced unwanted alterations. Future work will focus on developing adaptive architectures that can intelligently assess input quality and apply enhancements selectively to avoid unnecessary processing of high-quality scans. Third, as the model was trained on unpaired data—with low- and high-resolution images originating from different patients—the learned feature mappings may be suboptimal. Future studies will involve prospectively collected paired datasets to train more accurate and powerful generative models.

Conclusion

In conclusion, the PST-CyGAN framework significantly and reproducibly enhances prostate T₂WI quality across diverse clinical settings. By generating standardized, high-resolution images, it offers a valuable tool that reduces inter-site variability, supports consistent PI-RADS application, and may improve the accuracy of PCa detection and characterization in clinical workflows.

Materials and Methods

Study design

This multicenter study retrospectively and prospectively collected prostate MR images from seven tertiary medical centers (Center 1 through Center 7) between January 2017 and May 2025. Additionally, data from the public Prostate Imaging: Cancer AI (PI-CAI) dataset were incorporated.

The study protocol comprised three main steps. First, to obtain high-quality source data for generative adversarial learning, DL-accelerated high-resolution T₂-weighted imaging (HR-T₂WI) was prospectively acquired from consecutive men with clinically suspected PCa at the derivation center (Center 1). These prospectively collected HR-T₂WI scans, along with standard-resolution T₂WI (SR-T₂WI) from five cohort datasets exhibiting highly heterogeneous image quality, served as paired adversarial data for developing the PST-CyGAN model. Next, a post hoc analysis of multicenter data was conducted in a blinded, randomized manner following predefined standardized criteria and a multi-reader peer-review process to evaluate the ability of PST-CyGAN to enhance the image quality of SR-T₂WI across eight cohort datasets. Finally, the generalizability of PST-CyGAN was further assessed using images with predefined varying voxel sizes. Code to run the model is available at https://github.com/LuckLT/PST-CyGAN.git.

The data inclusion criteria were: male participants over 50 years of age with clinical suspicion of PCa (e.g., elevated PSA levels or abnormal digital rectal examination) who had undergone prostate MRI. Patients with a history of prostatectomy or radiotherapy were excluded, as were those with missing or incomplete imaging data. Finally, a total of 2,704 patients meeting the inclusion criteria were included from the participating centers: 1,634 from Center 1, 129 from Center 4, 41 from Center 2, 115 from Center 3, 77 from Center 5, 47 from Center 6, 40 from Center 7, and 621 from the PI-CAI dataset. The datasets were randomly allocated into training (n = 2,207, including 941 HR-T₂WI and 1,266 SR-T₂WI), internal validation (n = 292), external validation (n = 194), and a prospective validation group (n = 11). There was no overlap between the training and testing datasets.

The study complied with ethical standards in accordance with local regulations at each center.

Given the retrospective and anonymized nature of the secondary data analysis, formal protocol approval was not required, and informed consent was waived under national guidelines and institutional review board (IRB) policies at each collaborating institution (Center 1: 2024-SR-586, with the prospective data registered at ClinicalTrials.gov as NCT06636747; Center 2: 2017-195; Center 3: JD-HG-2021-31; Center 4: 2019116; Center 5: KY2019046; Center 6: 2023-L-1343; Center 7: XJ-2017-011). For patients prospectively undergoing HR-T₂WI in addition to standard-of-care MRI at NUH, written informed consent was obtained. Figure 5 illustrates the data collection process and study flowchart.

Fig. 5

Flowchart of data collection and group separation in this multicenter study. Participating institutions were Center 1 through Center 7, together with the PI-CAI dataset. Abbreviations: PI-CAI, Prostate Imaging: Cancer AI. Q1, Q2, and Q3 denote three image-quality groups coarsely divided by the PI-QUAL scoring scheme, with Q3 indicating the highest image quality.

MRI examination

Patients at six tertiary centers underwent a standardized prostate MRI protocol on 3.0 T MR scanners.

The scanning protocols were not center-specific but complied with European Society of Urogenital Radiology guidelines. They combined transverse T1-weighted imaging; transverse, coronal, and sagittal T₂WI; and transverse DWI. The apparent diffusion coefficient (ADC) was calculated from DWI using a mono-exponential fitting model. Specifically, HR-T₂WI was acquired with a DL-accelerated TSE sequence in which the calibration data for estimating coil sensitivity maps were acquired immediately after the imaging echo trains. Image reconstruction was performed with a variational deep learning network (Var-Net), which took the k-space data, bias field, and pre-calculated coil-sensitivity maps as inputs. The network parameters for image reconstruction had previously been determined via supervised training on approximately 10,000 slices obtained with conventional TSE ^24,25. This DL-based acceleration and denoising algorithm yielded a fast (2:53 min), high-resolution scan (voxel size, 0.35–0.38 × 0.35–0.8 × 2.0 mm) without compromising the signal-to-noise ratio (SNR) relative to the conventional TSE sequence. Details of MR manufacturers, scanning protocols, and imaging parameters are summarized in the Supplemental Materials (Table E3).
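For readers unfamiliar with the mono-exponential model mentioned above, it assumes S(b) = S₀·exp(−b·ADC), so the ADC is the negative slope of ln S against b. The following is a minimal per-voxel sketch of that fit, not the scanner's implementation; the b-values and signal intensities are hypothetical, since the acquisition b-values are not stated in this section.

```python
import math

def fit_adc(b_values, signals):
    """Least-squares fit of ln(S) = ln(S0) - b * ADC; returns ADC in mm^2/s."""
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two matching b-values and signals")
    log_s = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_log = sum(log_s) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_log) for b, y in zip(b_values, log_s))
    return -sxy / sxx  # ADC is the negative slope of the log-signal line

# Synthetic voxel: hypothetical b-values (s/mm^2) and a known ADC of 1.2e-3 mm^2/s.
b_values = [0.0, 400.0, 800.0]
true_adc = 1.2e-3
signals = [1000.0 * math.exp(-b * true_adc) for b in b_values]
adc = fit_adc(b_values, signals)
```

On noise-free synthetic data the fit recovers the ADC used to generate the signals; with only two b-values it reduces to ln(S₀/S_b)/b.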

Data preprocess

To protect patient privacy, sensitive data, including identifiers and contact information, were removed from the private dataset by converting the original Digital Imaging and Communications in Medicine (DICOM) files into the Neuroimaging Informatics Technology Initiative (NIfTI) format. Subsequently, the 3D T₂WI volumes of all patients were decomposed into 2D axial slices for further processing. Cubic spline interpolation was used to resize all 2D slices to a uniform size of 256 × 256, matching the input requirements of the deep learning model. Each 2D T₂WI slice then underwent min-max intensity normalization, which rescaled pixel values to the range [0, 1] to improve training stability.
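The min-max step described above can be sketched in a few lines. This is an illustrative version using plain Python lists; the function name and the handling of constant-intensity slices are assumptions, and the cubic spline resizing is not shown.

```python
def min_max_normalize(slice_2d):
    """Rescale a 2D slice (list of rows) to [0, 1]; constant slices map to 0."""
    flat = [v for row in slice_2d for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumption: a slice with no intensity variation is set to all zeros.
        return [[0.0 for _ in row] for row in slice_2d]
    scale = hi - lo
    return [[(v - lo) / scale for v in row] for row in slice_2d]

# Toy 2 x 2 "slice" with arbitrary intensities.
toy_slice = [[100, 200], [300, 500]]
normalized = min_max_normalize(toy_slice)
```

Because the minimum and maximum are taken per slice, each slice spans the full [0, 1] range regardless of scanner-specific intensity scaling.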

To construct the HR-T₂WI generation model, unpaired SR-T₂WI and HR-T₂WI were used for GAN model development. Before PST-CyGAN training, 941 HR-T₂WI cases (31244 image slices) from NUH were coarsely divided into three quality groups using a 3-point Prostate Imaging Quality (PI-QUAL) scoring system (3) by nine radiologist readers (Reader 1–9, with 2 to 20 years of experience in prostate MRI), with anonymization and random assignment. The 3-point PI-QUAL score was assigned according to the following criteria: PI-QUAL Q1 (low quality), images with significant artifacts (e.g., motion, distortion) and/or noise severely degrading diagnostic utility; PI-QUAL Q2 (intermediate quality), images with noticeable artifacts and/or noise potentially impacting subtle feature detection; and PI-QUAL Q3 (high quality), images with minimal to no artifacts or noise and excellent anatomical detail suitable for definitive diagnosis (Figure E3). This stratification identified 487 cases (16038 images) with PI-QUAL Q3, 297 cases (8973 images) with PI-QUAL Q2, and 157 cases (4720 images) with PI-QUAL Q1, allowing the PST-CyGAN model to learn image features from “true” high-quality HR-T₂WI. SR-T₂WI were randomly selected from 5 cohorts: 390 cases (10395 slice-level images) from Center 1, 99 cases (2361 slice-level images) from Center 4, 99 cases (3103 slice-level images) from Center 3, and 40 cases (937 slice-level images) from Center 7, supplemented with 591 patient-level cases (12939 slice-level images) from the PI-CAI public dataset. Among them, an independent testing set of 6 cases (170 slice-level images) was reserved for model fine-tuning. Details of the cohort data grouping are shown in Fig. 5.

Fig. E3

Representative image cases showing the 3-point PI-QUAL score for HR-T₂WI selection and the 10-point PI-QUAL score for image quality evaluation. (A) The 3-point PI-QUAL (Q1, Q2, Q3) assessment was performed on deep-learning-accelerated HR-T₂WI only: Q1 indicates an image with significant artifact (#1, red arrow) or low SNR; Q3 indicates an image with excellent spatial resolution, tissue contrast, and high SNR (#3); and Q2 falls between Q1 and Q3. (B) The 10-point PI-QUAL assessment was performed on validation data only, to evaluate the quality of GAN-generated images. As with the 3-point score, a PI-QUAL score of 10 indicates the best image quality and a score of 1 the lowest.

Architecture of PST-CyGAN model

We propose PST-CyGAN, an anatomy-aware pyramid-spatially-adaptive normalization and transformer-enhanced CycleGAN framework, for HR-T₂WI generation from unpaired data; the network framework is shown in Fig. 6(a-d). As illustrated in Fig. 6(a), the model comprises two core components: bidirectional Generators ($G_{lr\to hr}$, $G_{hr\to lr}$) with cycle-consistent learning and dual Discriminators ($D_{1}$, $D_{2}$) for adversarial training. $G_{lr\to hr}$ converts SR-T₂WI to HR-T₂WI, and $G_{hr\to lr}$ converts HR-T₂WI to SR-T₂WI. Both generators take anatomies (masks) as additional inputs, enabling precise alignment with the prostate gland. $D_{1}$ and $D_{2}$ evaluate global image realism (real x, y vs. fake x′, y′). Cycle-consistency learning enforces both forward and backward transformations to preserve content fidelity between domains, formulated as:

Fig. 6

Framework of the Anatomy-Aware Pyramid-SPADE with Transformer Enhanced CycleGAN (PST-CyGAN) and its detailed components. (a) Framework of PST-CyGAN. (b) SPADE module. (c) Architecture of the Discriminator. (d) Architecture of the Generator.

Forward cycle:

$G_{hr\to lr}\left(G_{lr\to hr}\left(x,\ mask\right)\right)=x'\approx x$ (1)

Backward cycle:

$G_{lr\to hr}\left(G_{hr\to lr}\left(y,\ mask'\right)\right)=y'\approx y$ (2)

The generator architecture, illustrated in Fig. 6(d), integrates spatially-adaptive normalization (SPADE) for anatomy-guided spatial alignment, a multi-scale feature pyramid for fusing shallow high-resolution features with deep semantic features, and a lightweight Transformer for capturing global features. The features of SR images are first extracted by the Encoder. Pyramid-SPADE then fuses the anatomical structure information into the generation process, producing more realistic images that are aligned with the input mask. The SPADE module, detailed in Fig. 6(b), adaptively modulates the spatial semantic information of the anatomy (mask) into the feature maps (F) generated by the Encoder. First, the mask is passed through convolutional layers to generate modulation parameters γ and β that match F and are used to adjust the scale and shift of the features. Second, F is normalized ($\hat{F}$) to eliminate the deviation of the original distribution. Finally, γ and β modulate the normalized feature map in a spatially adaptive manner. Through this mechanism, the module aligns and fuses the anatomy with the feature map. The formulas are as follows, where μ and σ denote the mean and standard deviation of F:

$\hat{F}=\frac{F-\mu}{\sigma}$

$output=\gamma\cdot\hat{F}+\beta$
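The two formulas can be traced on a toy example. The sketch below is a simplified, single-channel illustration and not the released implementation: in SPADE, γ and β are produced by convolutions over the mask, whereas here they are hypothetical fixed linear functions of the mask, and the small `eps` term is an added assumption for numerical stability.

```python
import math

def spade_modulate(feature, gamma, beta, eps=1e-5):
    """Normalize one feature channel, then apply per-pixel scale and shift."""
    flat = [v for row in feature for v in row]
    mu = sum(flat) / len(flat)
    var = sum((v - mu) ** 2 for v in flat) / len(flat)
    sigma = math.sqrt(var + eps)  # eps is an assumption, not part of the formula
    return [
        [g * ((v - mu) / sigma) + b for v, g, b in zip(f_row, g_row, b_row)]
        for f_row, g_row, b_row in zip(feature, gamma, beta)
    ]

# Toy 2 x 2 feature channel and binary prostate mask (1 = gland).
feature = [[1.0, 2.0], [3.0, 4.0]]
mask = [[0, 1], [1, 1]]

# Hypothetical stand-in for the mask convolutions: fixed linear maps of the mask.
gamma = [[1.0 + 0.5 * m for m in row] for row in mask]
beta = [[0.1 * m for m in row] for row in mask]

out = spade_modulate(feature, gamma, beta)
```

Because γ and β vary per pixel, locations inside the mask are scaled and shifted differently from those outside it, which is what makes the normalization spatially adaptive.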

PST-CyGAN combines the multi-resolution SPADE module with a multi-scale feature pyramid (Pyramid-SPADE), dynamically integrating the spatial information of the anatomical structure into each feature layer. By extracting multi-scale features, it mitigates the information deficiency of the traditional single-scale block. This module constrains the output image to follow the spatial layout of the input anatomical mask while maintaining realistic tissue texture.

To further enhance the global modeling ability of PST-CyGAN, a lightweight Transformer module was introduced, in which the outputs of the Encoder and Pyramid-SPADE were fused. Spatial information was added to the fused features through positional embedding. Flattening and a multi-head attention mechanism are then used to capture global features, improving on the global context understanding of purely convolutional networks.
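The flatten, positional-embedding, and attention steps can be illustrated as follows. This is a deliberately reduced sketch: it uses a single attention head with identity query/key/value projections rather than the learned multi-head module of the paper, and the positional embeddings are hypothetical values.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def flatten_with_position(feature_map, pos_embed):
    """Flatten an H x W x C map into H*W tokens and add positional embeddings."""
    tokens = [pixel for row in feature_map for pixel in row]
    return [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, pos_embed)]

def self_attention(tokens):
    """Single-head scaled dot-product attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    output = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        weights = softmax(scores)
        output.append([sum(w * tok[c] for w, tok in zip(weights, tokens)) for c in range(dim)])
    return output

# Toy 2 x 2 feature map with 2 channels, plus hypothetical positional embeddings.
feature_map = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.5, 0.5]]]
pos_embed = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
tokens = flatten_with_position(feature_map, pos_embed)
attended = self_attention(tokens)
```

Each output token is a weighted average of all tokens in the map, which is how attention lets every spatial location draw on global context in a single step.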

The feature maps output by the Transformer module were upsampled through the transposed convolutions of the Decoder to generate high-resolution prostate images. Through the joint optimization of SPADE, Pyramid, and Transformer, this Generator aligns the generated image with the input anatomy at the pixel level.

The Discriminator of PST-CyGAN (Fig. 6c) is based on a convolutional neural network and is used to distinguish real images from generated images. It consists of four convolutional layers, each followed by a LeakyReLU activation. The second to fourth layers are normalized using InstanceNorm. Finally, the feature map is reduced to a single discriminant value through global average pooling. This structure adopts the PatchGAN design: a fully convolutional network discriminates local areas of the image, and the local results are then aggregated into an overall judgment.
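To make the patch-then-aggregate idea concrete, the sketch below tracks the spatial size of a 256 × 256 input through four convolutions and then averages a grid of patch scores. The kernel size, stride, and padding are assumptions (common PatchGAN choices), since they are not stated in the text, and the patch scores are made-up values.

```python
def conv_output_size(size, kernel=4, stride=2, padding=1):
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def global_average_pool(patch_scores):
    """Collapse a 2D grid of per-patch scores into one scalar."""
    flat = [s for row in patch_scores for s in row]
    return sum(flat) / len(flat)

# Assumed hyperparameters (not stated in the text): kernel 4, stride 2, padding 1.
size = 256
sizes = [size]
for _ in range(4):  # four convolutional layers, as described above
    size = conv_output_size(size)
    sizes.append(size)

# Toy patch-score grid standing in for the last layer's output.
patch_scores = [[0.9, 0.8], [0.7, 0.6]]
image_score = global_average_pool(patch_scores)
```

Under these assumed settings the feature map shrinks from 256 to 16 pixels per side, so each of the 16 × 16 scores judges one local region before pooling produces the image-level output.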

PST-CyGAN optimizes the generator and discriminator by combining different loss functions. The generator uses an adversarial loss (MSE loss) to promote the overall fidelity of the generated image, combined with a pixel-level constraint (L1 loss) to enforce the structural consistency of bidirectional conversion. The discriminator is trained to classify real versus generated images through an MSE loss.
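One plausible way to combine these terms, following the usual least-squares GAN and CycleGAN conventions, is sketched below. The target labels (1 for real, 0 for fake), the 0.5 factor, and the cycle weight `lambda_cyc` are assumptions, as the text does not give them; inputs are flattened lists of discriminator scores or pixel values.

```python
def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def l1(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def discriminator_loss(d_real, d_fake):
    """Least-squares loss: real scores pushed toward 1, fake scores toward 0."""
    return 0.5 * (mse(d_real, [1.0] * len(d_real)) + mse(d_fake, [0.0] * len(d_fake)))

def generator_loss(d_fake, reconstructed, original, lambda_cyc=10.0):
    """Adversarial MSE term plus L1 cycle term; lambda_cyc is a hypothetical weight."""
    adversarial = mse(d_fake, [1.0] * len(d_fake))
    cycle = l1(reconstructed, original)
    return adversarial + lambda_cyc * cycle

# Made-up discriminator scores and pixel values for illustration.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.4]
original = [0.0, 0.5, 1.0]
reconstructed = [0.1, 0.5, 0.9]

loss_d = discriminator_loss(d_real, d_fake)
loss_g = generator_loss(d_fake, reconstructed, original)
```

The generator is rewarded when the discriminator scores its output close to 1 and when the cycle-reconstructed image stays close to the original, which ties the adversarial term to Eqs. (1) and (2).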

To examine the potential effect of HR-T₂WI quality on model construction, PST-CyGAN was trained at two data-driven levels. In level 1, the image quality of the HR-T₂WI used for training was not specifically controlled, i.e., all HR-T₂WI data were used, resulting in a quality-uncontrolled model, PST-CyGAN_L1. In level 2, only the HR-T₂WI with high image quality, pre-stratified by radiologists (as described in the “Data preprocess” section), were used, resulting in a quality-controlled model, PST-CyGAN_L2.

Last, to achieve high-resolution image reconstruction, image generation was performed by PST-CyGAN at three spatial resolution levels:

Baseline resolution level: the resolution of the generated image is the same as that of the raw SR-T₂WI, generally 0.5–0.8 × 0.5–0.8 × 3.0–5.0 mm.

HR level: the resolution of the generated image is fixed to the median voxel size of all true HR-T₂WI, i.e., 0.4 × 0.4 × 2 mm.

Ultra HR level: the resolution of the generated image is fixed to the upper limit of all true HR-T₂WI, i.e., 0.3 × 0.3 × 1 mm.
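To give a sense of what these voxel sizes imply for the image grid, the sketch below computes the matrix size needed to cover the same field of view at a new voxel size. It illustrates the arithmetic only; the text does not describe how the target resolution is imposed in the pipeline, and the example SR-T₂WI geometry is hypothetical.

```python
def resampled_shape(shape, source_spacing, target_spacing):
    """Matrix size that keeps the field of view fixed at a new voxel size."""
    return tuple(
        round(n * src / tgt)
        for n, src, tgt in zip(shape, source_spacing, target_spacing)
    )

# Hypothetical SR-T2WI volume: 320 x 320 x 24 voxels at 0.6 x 0.6 x 3.0 mm.
sr_shape = (320, 320, 24)
sr_spacing = (0.6, 0.6, 3.0)

hr_shape = resampled_shape(sr_shape, sr_spacing, (0.4, 0.4, 2.0))
ultra_hr_shape = resampled_shape(sr_shape, sr_spacing, (0.3, 0.3, 1.0))
```

For this example volume, the HR level requires 2.25 times as many in-plane pixels and 1.5 times as many slices, and the ultra HR level requires 4 times the in-plane pixels and 3 times the slices.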

Ablation study

In this section, to evaluate the robustness of PST-CyGAN, an ablation test was performed by progressively removing its key components, yielding four variants: CycleGAN (baseline; PST-CyGAN without the SPADE, Pyramid, and Transformer modules), S-CyGAN (PST-CyGAN without the Pyramid and Transformer modules), PS-CyGAN (PST-CyGAN without the Transformer module), and PST-CyGAN (the complete proposed model). Specifically, the following quantitative indices were compared pairwise:

Peak Signal-to-Noise Ratio (PSNR): Measures pixel-level fidelity (higher = better).

Structural Similarity Index (SSIM): Assesses perceptual similarity in luminance, contrast, structure (range 0–1, higher = better).

Learned Perceptual Image Patch Similarity (LPIPS): Uses deep features to measure perceptual similarity (lower = better, closer to human judgment).

Feature Cosine Similarity (FCS): Measures directional alignment between feature vectors in high-dimensional space (higher = better, indicates greater semantic consistency).
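Two of the four indices have simple closed forms and can be sketched directly. The example assumes images normalized to [0, 1] and flattened to lists; the feature vectors passed to the cosine similarity are arbitrary toy values, since the feature extractor used for FCS is not specified here. SSIM and LPIPS are omitted because they require windowed statistics and a pretrained network, respectively.

```python
import math

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy flattened images in [0, 1].
reference = [0.0, 0.5, 1.0, 0.5]
generated = [0.1, 0.5, 0.9, 0.5]
score_psnr = psnr(reference, generated)

# Toy feature vectors pointing in the same direction.
features_a = [1.0, 2.0, 3.0]
features_b = [2.0, 4.0, 6.0]
score_fcs = cosine_similarity(features_a, features_b)
```

PSNR depends only on the mean squared pixel error, while the cosine similarity ignores vector magnitude, so the two parallel toy feature vectors score close to 1 even though their norms differ.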

Multicenter multi-reader validation

In real-world practice, prostate MRI is often challenged by diagnostic heterogeneity, partly attributable to variations in image quality across regions, MR scanners, and imaging parameter sets.

Therefore, to evaluate the effectiveness of PST-CyGAN in improving image quality, multicenter multi-reader assessments were performed on multiscale-reconstructed images in a blinded, randomized manner following an a priori defined, standardized peer-review process. Because the ground truth (HR-T₂WI) was unavailable in the retrospective datasets, a 10-point PI-QUAL score was used as the primary endpoint for image quality evaluation. This 10-point scoring scheme involves comprehensive side-by-side comparisons between generated and original scans, supplemented by detailed zoomed-in examination of clinically relevant anatomical regions. Scoring proceeded as follows: after image reconstruction, all true and generated images of the validation groups were randomly assigned to five readers (Reader 10–12 from NUH; Reader 13–14 from SUH1st), who rated them on the 10-point scale. As with the 3-point PI-QUAL scheme, the 10-point criteria rely mainly on the reader’s general impression of image-quality factors such as contrast, resolution, signal intensity, noise, artifact, and distortion, with a higher score indicating better image quality (Figure E3). Before the PI-QUAL assessment, all readers completed a centralized training program in which 20 representative image cases were used to explain the assessment in detail. During the assessment, all readers were blinded to patient information, MR scanner, hospital, and data origin (MR-scanned or model-generated, and the GAN model and algorithm used).

To further probe the limits of the reconstruction capability of PST-CyGAN, we performed a stress test by reconstructing T₂WI prospectively acquired at voxel sizes of 1.5 × 1.5 × 3.3 mm, 0.7 × 0.7 × 3.3 mm, and 0.4 × 0.4 × 2 mm.

Statistical analysis

PSNR, SSIM, LPIPS, and FCS measurements for the acquired and GAN-generated images were averaged over the volumetric prostate mask, excluding boundary voxels to avoid partial volume effects. Differences among images were compared using one-way analysis of variance followed by post hoc pairwise t tests with Tukey correction.

PI-QUAL results from all readers were pooled, and inter-reader agreement was assessed with Cronbach’s alpha. To test the effectiveness of PST-CyGAN in improving image quality, PI-QUAL upgrades and downgrades were compared between SR-T₂WI and generated HR-T₂WI using a Bland-Altman plot and a paired Wilcoxon test. Specifically, subgroup analyses were performed for multi-model, multicenter, and multi-reader comparisons. For each comparison, contingency tables were used to present the results and calculate the quality differences. All tests were two-sided, and a P-value less than 0.05 was considered statistically significant. All statistical analyses were performed using MedCalc software (V.15.2; 2011 MedCalc Software bvba, Mariakerke, Belgium).
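The analyses in this study were run in MedCalc; for readers who want to see how Cronbach’s alpha is obtained, the sketch below applies the standard formula α = k/(k−1) · (1 − Σ s²ᵢ / s²ₜ), where k is the number of readers, s²ᵢ the variance of each reader’s scores, and s²ₜ the variance of the per-case score totals. The scores are made-up values for illustration, not study data.

```python
from statistics import variance

def cronbach_alpha(scores_by_reader):
    """Cronbach's alpha; each inner list holds one reader's scores for all cases."""
    k = len(scores_by_reader)
    if k < 2:
        raise ValueError("need at least two readers")
    item_variance = sum(variance(reader) for reader in scores_by_reader)
    totals = [sum(case) for case in zip(*scores_by_reader)]
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Made-up 10-point scores from three readers on five cases (illustration only).
scores = [
    [7, 8, 5, 9, 6],
    [6, 8, 5, 9, 7],
    [7, 9, 4, 8, 6],
]
alpha = cronbach_alpha(scores)
```

When readers rank the cases similarly, the variance of the totals is large relative to the summed per-reader variances and alpha approaches 1.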

Electronic Supplementary Material

Below is the link to the electronic supplementary material

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Supplementary Material 4

References

1. Turkbey, B. et al. Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. European Urology 76, 340–351 (2019). https://doi.org/10.1016/j.eururo.2019.02.033

2. Barentsz, J., de Rooij, M., Villeirs, G. & Weinreb, J. Prostate Imaging-Reporting and Data System Version 2 and the Implementation of High-quality Prostate Magnetic Resonance Imaging. European Urology 72, 189–191 (2017). https://doi.org/10.1016/j.eururo.2017.02.030

3. de Rooij, M. et al. PI-QUAL version 2: an update of a standardised scoring system for the assessment of image quality of prostate MRI. European Radiology 34, 7068–7079 (2024). https://doi.org/10.1007/s00330-024-10795-4

4. Giganti, F., Allen, C., Emberton, M., Moore, C. M. & Kasivisvanathan, V. Prostate Imaging Quality (PI-QUAL): A New Quality Control Scoring System for Multiparametric Magnetic Resonance Imaging of the Prostate from the PRECISION trial. European Urology Oncology 3, 615–619 (2020). https://doi.org/10.1016/j.euo.2020.06.007

5. Barrett, T. et al. Quality checkpoints in the MRI-directed prostate cancer diagnostic pathway. Nature Reviews Urology 20, 9–22 (2023). https://doi.org/10.1038/s41585-022-00648-4

6. Giganti, F. et al. Global Variation in Magnetic Resonance Imaging Quality of the Prostate. Radiology 309, e231130 (2023). https://doi.org/10.1148/radiol.231130

7. Lin, Y., Yilmaz, E. C., Belue, M. J. & Turkbey, B. Prostate MRI and image Quality: It is time to take stock. European Journal of Radiology 161, 110757 (2023). https://doi.org/10.1016/j.ejrad.2023.110757

8. Woernle, A. et al. Picture Perfect: The Status of Image Quality in Prostate MRI. Journal of Magnetic Resonance Imaging 59, 1930–1952 (2024). https://doi.org/10.1002/jmri.29025

9. Brendel, J. M. et al. Deep learning reconstruction for accelerated high-resolution upper abdominal MRI improves lesion detection without time penalty. Diagnostic and Interventional Imaging 106, 85–92 (2025). https://doi.org/10.1016/j.diii.2024.09.008

10. Matsumoto, S. et al. Ultra-High-Resolution T2-Weighted PROPELLER MRI of the Rectum With Deep Learning Reconstruction: Assessment of Image Quality and Diagnostic Performance. Investigative Radiology 59, 479–488 (2024). https://doi.org/10.1097/rli.0000000000001047

11. Malekian, V., Rastegar, F., Shafieizargar, B. & Nasiraei-Moghaddam, A. SSFP fMRI at 3 tesla: Efficiency of polar acquisition-reconstruction technique. Magnetic Resonance Imaging 74, 171–180 (2020). https://doi.org/10.1016/j.mri.2020.09.005

12. Bischoff, L. M. et al. Deep Learning Super-Resolution Reconstruction for Fast and Motion-Robust T2-weighted Prostate MRI. Radiology 308, e230427 (2023). https://doi.org/10.1148/radiol.230427

13. Dayarathna, S. et al. Deep learning based synthesis of MRI, CT and PET: Review and analysis. Medical Image Analysis 92, 103046 (2024). https://doi.org/10.1016/j.media.2023.103046

14. Phipps, B. et al. AI image generation technology in ophthalmology: Use, misuse and future applications. Progress in Retinal and Eye Research 106, 101353 (2025). https://doi.org/10.1016/j.preteyeres.2025.101353

15. Mathews, S. M., Kambhamettu, C. & Barner, K. E. A novel application of deep learning for single-lead ECG classification. Computers in Biology and Medicine 99, 53–62 (2018). https://doi.org/10.1016/j.compbiomed.2018.05.013

16. Han, K. & Xiang, W. Inference-Reconstruction Variational Autoencoder for Light Field Image Reconstruction. IEEE Transactions on Image Processing 31, 5629–5644 (2022). https://doi.org/10.1109/tip.2022.3197976

17. Yi, X., Walia, E. & Babyn, P. Generative adversarial network in medical imaging: A review. Medical Image Analysis 58, 101552 (2019). https://doi.org/10.1016/j.media.2019.101552

18. Lucas, A. et al. Multisequence 3-T Image Synthesis from 64-mT Low-Field-Strength MRI Using Generative Adversarial Networks in Multiple Sclerosis. Radiology 315, e233529 (2025). https://doi.org/10.1148/radiol.233529

19. Alajaji, S. A. et al. Generative Adversarial Networks in Digital Histopathology: Current Applications, Limitations, Ethical Considerations, and Future Directions. Modern Pathology 37, 100369 (2024). https://doi.org/10.1016/j.modpat.2023.100369

20. Cao, R. et al. Label-free intraoperative histology of bone tissue via deep-learning-assisted ultraviolet photoacoustic microscopy. Nature Biomedical Engineering 7, 124–134 (2023). https://doi.org/10.1038/s41551-022-00940-z

21. Park, T., Liu, M. Y., Wang, T. C. & Zhu, J. Y. Semantic Image Synthesis with Spatially-Adaptive Normalization. (2019).

22. Lin, T. Y. et al. in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 936–944.

23. Wang, X. et al. Lightweight Multi-Stage Aggregation Transformer for robust medical image segmentation. Medical Image Analysis 103, 103569 (2025). https://doi.org/10.1016/j.media.2025.103569

24. Herrmann, J. et al. Feasibility and Implementation of a Deep Learning MR Reconstruction for TSE Sequences in Musculoskeletal Imaging. Diagnostics (Basel) 11 (2021). https://doi.org/10.3390/diagnostics11081484

25. Gassenmaier, S. et al. Accelerated T2-Weighted TSE Imaging of the Prostate Using Deep Learning Image Reconstruction: A Prospective Comparison with Standard T2-Weighted TSE Imaging. Cancers (Basel) 13 (2021). https://doi.org/10.3390/cancers13143593

Abbreviations: PST-CyGAN, Anatomy-Aware Pyramid-SPADE with Transformer Enhanced CycleGAN; SPADE, Spatially-Adaptive Normalization.
