1. Department of Biomedical Engineering, College of Engineering and Applied Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211
2. Department of Computer Science, College of Engineering and Applied Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211
3. Whitefish Bay High School, Whitefish Bay, WI 53217
*: Corresponding Author
Abstract
Accurate segmentation of nuclei images is essential for analyzing cellular responses to perturbation in both in vitro and in vivo experiments. Although traditional methods, including watershed, thresholding, clustering, morphological operations, and active contour models, have long been used to segment nuclei in digital images, these methods are labor-intensive and time-consuming. Current research has therefore shifted to deep learning techniques for improved nuclei segmentation. However, training deep learning models requires high-quality annotated ground-truth datasets, which are scarce and often not publicly available. In this study, we introduce the Breast Mammary Gland Dataset (BMGD), an annotated collection of DAPI-stained nuclei images of mammary organoids. The dataset contains 819 image patches with more than 9,500 manually segmented nuclei from organoids cultured under various stiffness conditions. Each original image in the BMGD is paired with a carefully annotated ground-truth mask. This dataset will enable researchers to develop and evaluate automated nuclei segmentation algorithms, particularly for studying cellular responses in breast cancer research and treatment.
Background & Summary
Confocal fluorescence microscopy, a staple in life science research for its high-quality images, often requires the labor-intensive and error-prone task of manually analyzing each nucleus. This challenge underscores the growing need for automatic approaches to analyzing biological images1. In computational biology, nuclei segmentation plays a fundamental role in analyzing morphological changes and quantifying molecular expression2. These detailed biological data can be used for cell and tissue identification, cancer diagnosis, and therapeutic assessment3,4. Researchers can further map molecular activities, organelle information, cellular phenotypes, and multicellular structures linked to cellular migration, division, and tissue development under environmental stimuli5.
Although nuclei segmentation and quantification can be performed manually, these processes are labor-intensive, time-consuming, and prone to error6,7. Even when semantic segmentation of cell nuclei is performed properly, several challenges remain: unconventional morphologies in diseased environments cannot be differentiated; a low signal-to-noise ratio and heterogeneous staining further degrade segmentation accuracy, limiting the reliability of the produced masks; and overlapping nuclei cannot be delineated, leading to fragmented or merged instance predictions8–10. Deep learning algorithms can be applied to overcome these limitations and to enhance the accuracy and efficacy of automated nuclei segmentation11.
In recent years, convolutional neural networks (CNNs) trained on large, supervised image datasets have achieved state-of-the-art results in medical image classification and segmentation. However, a shortage of well-annotated histopathology datasets limits their advancement, and among publicly available breast-tissue datasets, annotation quality varies, with only a few providing exhaustive nucleus-boundary markings12. Many laboratories are working to incorporate deep learning-based image segmentation into day-to-day use, but open-source nuclei image datasets for training models and validating their accuracy remain unavailable10,13. To address this gap, we release our dataset to benefit the broader research community and to inspire new developments in the field of nuclei segmentation.
We present the Breast Mammary Gland Dataset (BMGD), an annotated dataset for nuclei segmentation in DAPI-stained fluorescent images. The dataset includes high-quality 40X images acquired with a Zeiss LSM 710 confocal microscope, featuring mammary epithelial cell cultures exposed to different microenvironmental stiffnesses14. It contains 819 images with more than 9,500 cell nucleus boundaries, all annotated manually to ensure the precision and accuracy needed for a reliable learning process. The BMGD can be used for evaluating, training, and testing machine learning algorithms for nuclei segmentation, as well as for estimating the transferability and adaptability of previously developed nuclei segmentation methods.
Methods
Data Collection
The dataset proposed in this study originates from one of our prior research projects14. The three-dimensional volume of a single mammary colony was captured using a Zeiss LSM 710 confocal microscope equipped with a Zeiss Apochromat 40X/1.1 (0.8 mm working distance) water-immersion objective lens. Excitation filters were configured at 405 nm, while emission filters were set to detect signals between 420–480 nm. The laser intensity was maintained at 1%, and a twin-gate main beam splitter featuring two wheels, each containing 10 filter positions (resulting in 100 possible combinations), was employed to separate the excitation and emission beams. The images were taken at 12-bit resolution. The pinhole aperture was set at "1", and digital gain was adjusted to approximately ¾ of the maximum gain, ensuring a dynamic range of pixel values between 500–2000. The voxel size was set to 0.25 µm × 0.25 µm × 1 µm, yielding high-resolution capture of the cellular structures. For each colony, Z-stack images were acquired to encompass the entire volume of the cellular structure. The resulting image files were saved in Laser Scanning Microscope (.lsm) format with their corresponding metadata.
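For reference, a .lsm Z-stack of this kind can be loaded in Python as sketched below, assuming the tifffile package; the filename is hypothetical, and the metadata field names follow tifffile's parsing of the LSM header.

```python
# A minimal sketch for loading a .lsm Z-stack in Python, assuming the
# tifffile package; the filename is hypothetical.
import tifffile

path = "colony_01.lsm"  # hypothetical filename

# .lsm files are TIFF variants, so tifffile can read them directly.
stack = tifffile.imread(path)       # typically a (Z, Y, X) array
print(stack.shape, stack.dtype)     # 12-bit data is usually stored as uint16

# The acquisition metadata (voxel size, channels, etc.) travels with the file.
with tifffile.TiffFile(path) as f:
    meta = f.lsm_metadata
    print(meta["VoxelSizeX"], meta["VoxelSizeY"], meta["VoxelSizeZ"])
```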
Data Processing and Labeling
To facilitate the development and evaluation of comprehensive nuclei segmentation algorithms, we employed the Labkit extension within the FIJI platform for careful annotation of all images15. This tool provides intuitive manual and semi-automated image segmentation capabilities. Each image in our collection is paired with a hand-crafted ground-truth mask that precisely outlines cellular structures of interest. The dataset encompasses delineations of more than 9,500 nuclei perimeters, and the annotation phase alone consumed over 800 hours of work. The complete annotation workflow was designed to ensure maximum precision and reproducibility in the segmentation process, as shown in Fig. 1.
The first step is data preprocessing. A Python script was used to isolate 2D image slices from the 3D dataset13, and an intensity threshold of 1500 was applied to improve visualization. This was followed by manual filtering to remove noise, using mean filters with radii ranging from 0.5 to 2.0 and intensity subtraction to ensure optimal data quality; Gaussian blur filters were also applied to enhance edge detection. Second, foreground and background regions were separated to enable nuclei boundary detection: the annotation process involved pixel-based delineation of nuclei boundaries, with the nuclei marked as foreground in red and the background colored in blue. Third, a random forest classifier integrated into Labkit was used to generate preliminary masks, which were then manually verified against the original images16. Lastly, the images were binarized to produce the final masks, in which white pixels represent segmented nuclei and black pixels represent background, as illustrated in the sketch below.
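The sketch below compresses these preprocessing steps into a single Python pass using tifffile and SciPy; the Labkit random-forest and manual-verification stages are omitted, and the function choices and filenames are illustrative assumptions, not the exact annotation script.

```python
# A minimal sketch of the preprocessing steps (slice extraction, intensity
# threshold of 1500, mean filtering, Gaussian blur, binarization); the
# classifier-based mask generation in Labkit is not reproduced here.
import numpy as np
import tifffile
from scipy.ndimage import uniform_filter, gaussian_filter

stack = tifffile.imread("colony_01.lsm")   # hypothetical filename; (Z, Y, X)

for z, img in enumerate(stack):
    img = img.astype(np.float32)

    # Suppress background: keep only intensities of 1500 or higher.
    img[img < 1500] = 0

    # Denoise with a small mean filter (radius 0.5-2.0 in the protocol).
    img = uniform_filter(img, size=3)

    # Gaussian blur to enhance edge detection before boundary annotation.
    img = gaussian_filter(img, sigma=1.0)

    # Binarize: white (255) = nucleus foreground, black (0) = background.
    mask = (img > 0).astype(np.uint8) * 255
    tifffile.imwrite(f"slice_{z:03d}_mask.tif", mask)
```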
Data Records
The BMGD (https://github.com/zt089/Breast-Mammary-Gland-Dataset-BMGD) is now publicly available on GitHub. The dataset includes 819 DAPI-stained fluorescent microscopy images of mammary gland cells cultured under different stiffness conditions, ranging from 250Pa to 1800Pa. In total, more than 9,500 manually annotated nuclei are distributed across four stiffness conditions: 250Pa (453 images, 5,426 nuclei), 950Pa (54 images, 453 nuclei), 1200Pa (114 images, 1,538 nuclei), and 1800Pa (198 images, 2,144 nuclei). On average, each image contains between 8 and 14 nuclei, with the lowest average density observed in the 950Pa condition (8.4 nuclei per image) and the highest in the 1200Pa condition (13.49 nuclei per image). Table 1 quantifies the nuclei across the different substrate stiffness conditions. Each image is paired with a corresponding binary mask and labeled segmentation data, making the dataset suitable for both instance and semantic segmentation tasks. All images and their associated annotations were standardized to 256 × 256 pixels and retain their original 12-bit dynamic range. A minimal loading example is provided after Table 1.
Table 1
Quantification of nuclei across different substrate stiffness conditions.
| Stiffness Condition | Images | Nuclei | Avg. Nuclei per Image |
| --- | --- | --- | --- |
| 250Pa | 453 | 5,426 | 11.98 |
| 950Pa | 54 | 453 | 8.40 |
| 1200Pa | 114 | 1,538 | 13.49 |
| 1800Pa | 198 | 2,144 | 10.83 |
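The sketch below loads image-mask pairs with OpenCV and NumPy, the libraries named in the Usage Notes. The folder layout shown is an assumption; consult the repository README for the actual structure.

```python
# A minimal sketch for loading BMGD image-mask pairs; the directory layout
# (one folder per stiffness condition with "images"/"masks" subfolders) is
# a hypothetical example, not the confirmed repository structure.
import glob
import os

import cv2
import numpy as np

root = "Breast-Mammary-Gland-Dataset-BMGD/250Pa"      # hypothetical layout

pairs = []
for img_path in sorted(glob.glob(os.path.join(root, "images", "*.tif"))):
    mask_path = os.path.join(root, "masks", os.path.basename(img_path))

    # Read unchanged to preserve the original 12-bit dynamic range.
    img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED).astype(np.float32)
    img /= 4095.0                                     # normalize 12-bit to [0, 1]

    mask = (cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) > 0).astype(np.float32)
    pairs.append((img, mask))

print(len(pairs), "image-mask pairs loaded")
```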
Technical Validation
To validate the dataset's reliability for supervised segmentation tasks, we benchmarked several convolutional encoder backbones within the U-Net architecture, including ResNet50, MobileNetV2, Inception-ResNetV2, InceptionV3, DenseNet121, and VGG1913. We divided the dataset into training, validation, and testing subsets following an 80:10:10 ratio, corresponding to 655 images for training, 82 for validation, and 82 reserved for testing. Model performance was evaluated using the F1-score, Intersection over Union (IoU), and validation loss, enabling a direct comparison of pixel-level segmentation accuracy and region-level agreement. Figure 2 presents the masks and overlay images produced by our in-house segmentation algorithm with Inception-ResNetV2 as the encoder, compared with the ground-truth masks. Segmentation results across the different stiffness conditions are shown to demonstrate the generalization of the code, which is published on GitHub (https://github.com/zt089/BMGD-nuclei-segmentation).
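To make the metric definitions explicit, the minimal NumPy sketch below computes pixel-level F1 and IoU from binary masks; it illustrates the formulas rather than reproducing our exact evaluation code.

```python
# A minimal sketch of the two pixel-level evaluation metrics (F1 and IoU)
# computed from binary prediction/ground-truth masks.
import numpy as np

def f1_and_iou(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7):
    """pred and truth are binary masks of the same shape (1 = nucleus)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()     # true positives
    fp = np.logical_and(pred, ~truth).sum()    # false positives
    fn = np.logical_and(~pred, truth).sum()    # false negatives
    f1 = 2 * tp / (2 * tp + fp + fn + eps)     # F1 (Dice) score
    iou = tp / (tp + fp + fn + eps)            # Intersection over Union
    return f1, iou
```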
The performance of each model, along with its evaluation metrics, is shown in Table 2. The comparative results demonstrate consistently strong performance across all models, with F1 scores in a narrow band from 92.90% to 93.66% and IoU values between 86.73% and 88.08%. Among the tested backbones, Inception-ResNetV2 achieved the highest overall segmentation accuracy, yielding an F1 score of 93.66% and the best IoU of 88.08%, indicating the closest pixel-wise agreement with the ground-truth annotations. MobileNetV2 and DenseNet121 also performed competitively, maintaining F1 scores above 93% and IoU values exceeding 87%, while VGG19, although slightly lower in accuracy, obtained the lowest validation loss (0.06006), suggesting more stable optimization on this dataset. These results indicate that the dataset supports robust training across multiple architectures, with Inception-ResNetV2 providing the most reliable and consistent segmentation performance. The reported metrics serve as a quantitative baseline for future methodological comparisons and for assessing improvements from advanced architectures or training strategies.
Table 2
Dataset performance on different models.
| Model | F1 Score | IoU | Validation Loss |
| --- | --- | --- | --- |
| ResNet50 | 93.22% | 87.31% | 0.06650 |
| MobileNetV2 | 93.31% | 87.45% | 0.07459 |
| Inception-ResNetV2 | 93.66% | 88.08% | 0.08286 |
| InceptionV3 | 93.61% | 87.99% | 0.07875 |
| DenseNet121 | 93.55% | 87.89% | 0.07849 |
| VGG19 | 92.90% | 86.73% | 0.06006 |
The dataset was evaluated previously with EfficientNetB5, ResNet50, InceptionResNetV2, VGG19, DenseNet121, and MobileNet as U-Net backbone encoders, and EfficientNetB5 showed the most promising result, with an F1-score of 87.11% and a mean IoU of 80.89%13. The new benchmarks from this study show a clear improvement over those earlier results: all tested backbones now exceed the previous baseline, with F1-scores above 92% and IoU values over 86%. These results show that the dataset is robust and that the updated training strategy works well, yielding much stronger segmentation accuracy than previously reported. To compare the models' performance, we selected one random image; the results are presented in Fig. 3. The original DAPI image with contrast enhancement (top) and the corresponding ground-truth mask are shown as references, and predicted masks and image-mask overlays are presented for U-Net models with different encoders: ResNet50, MobileNetV2, InceptionResNetV2, InceptionV3, DenseNet121, and VGG19. Differences in segmentation quality are visible across the backbone architectures, including in nucleus shape preservation, boundary sharpness, and detection completeness.
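For readers reproducing these benchmarks, the sketch below shows how such a U-Net with a swappable encoder can be assembled, assuming the Segmentation Models Keras package referenced in the Usage Notes; the loss, optimizer, and ImageNet initialization shown are illustrative choices, not necessarily our exact training configuration.

```python
# A minimal sketch of a U-Net with a swappable encoder backbone, assuming
# the segmentation_models package; hyperparameters are illustrative.
import segmentation_models as sm

sm.set_framework("tf.keras")

BACKBONE = "inceptionresnetv2"   # or "resnet50", "mobilenetv2", "inceptionv3",
                                 # "densenet121", "vgg19"

model = sm.Unet(
    BACKBONE,
    input_shape=(256, 256, 3),   # BMGD patches are 256 x 256; grayscale
                                 # images can be stacked to 3 channels
    classes=1,
    activation="sigmoid",        # binary nucleus/background mask
    encoder_weights="imagenet",
)

model.compile(
    optimizer="adam",
    loss=sm.losses.bce_jaccard_loss,
    metrics=[sm.metrics.iou_score, sm.metrics.f1_score],
)
```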
Usage Notes
The BMGD, with the raw images and their corresponding binary and segmented masks, is accessible through our published repository on GitHub (https://github.com/zt089/Breast-Mammary-Gland-Dataset-BMGD). To ensure effective use of the dataset, we provide comprehensive documentation (README files) and supporting materials to assist researchers in maximizing the value of these resources.

When applying data augmentation, we suggest following our documented protocol, which includes horizontal flipping, random cropping, elastic transformations, and brightness/contrast adjustments; a hedged sketch follows at the end of this section. These augmentation techniques have been validated to improve model generalization without introducing artifacts that could compromise segmentation accuracy.

For initial data processing, we recommend the provided Python scripts, which standardize image dimensions and normalize intensity values. The dataset is organized for compatibility with widely used deep learning frameworks, with all images pre-processed to 256 × 256 pixels. Users should be aware that the images retain their original 12-bit dynamic range, which preserves the detailed intensity information essential for accurate nuclei segmentation.

The training and evaluation pipelines were implemented using TensorFlow and Keras, with encoders derived from the Segmentation Models library. Image augmentation was performed with Albumentations, and image input/output and preprocessing used OpenCV and NumPy. The encoder backbone can be readily exchanged among ResNet50, MobileNetV2, InceptionResNetV2, InceptionV3, DenseNet121, and VGG19 without modifying the core training loop. We recommend our Inception-ResNetV2 implementation, as it provides an optimal balance between performance and computational efficiency, particularly for researchers with limited computational resources.

The dataset can be used either independently for training, validation, and testing of segmentation algorithms, or as a complementary dataset to assess model generalization. Researchers can freely incorporate it into their own machine-learning or deep-learning pipelines for nuclei segmentation tasks, and its lightweight format enables seamless integration into custom analysis environments and different computational setups.
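As a reference for the augmentation protocol above, the snippet below sketches it with Albumentations; the probabilities, crop size, and transform magnitudes are illustrative assumptions, not the validated settings.

```python
# A minimal sketch of the documented augmentation protocol (horizontal
# flipping, random cropping, elastic transformations, brightness/contrast
# adjustment) using Albumentations; parameter values are assumptions.
import albumentations as A
import numpy as np

image = np.zeros((256, 256), dtype=np.float32)   # placeholder: normalized image
mask = np.zeros((256, 256), dtype=np.uint8)      # placeholder: binary mask

augment = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomCrop(height=224, width=224),         # crop size is an assumption
    A.ElasticTransform(alpha=1, sigma=50, p=0.3),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
])

# Applying the same transform to image and mask keeps them aligned.
augmented = augment(image=image, mask=mask)
aug_image, aug_mask = augmented["image"], augmented["mask"]
```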
Data Availability
The BMGD, with original images and corresponding binary and segmented masks, is accessible through GitHub (https://github.com/zt089/Breast-Mammary-Gland-Dataset-BMGD). Images for each stiffness condition, together with their corresponding mask images, are organized into separate folders.
Code Availability
The code implemented for this dataset can also be found on GitHub (https://github.com/zt089/BMGD-nuclei-segmentation). It includes a systematic script for training, validation, and testing on the BMGD, as well as for generating new masks.
References
1. Xu, X. et al. Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 8300–8308 (IEEE Computer Society, 2018). doi:10.1109/CVPR.2018.00866.
2. Mergenthaler, P. et al. Rapid 3D phenotypic analysis of neurons and organoids using data-driven cell segmentation-free machine learning. PLoS Comput. Biol. 17, e1008630 (2021).
3. Shi, F. et al. Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation, and Diagnosis for COVID-19. IEEE Rev. Biomed. Eng. 14, 4–15 (2021).
4. Minaee, S. et al. Image Segmentation Using Deep Learning: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 44, 3523–3542 (2022).
5. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Preprint at http://arxiv.org/abs/1505.04597 (2015).
6. Ng, H. P., Ong, S. H., Foong, K. W. C., Goh, P. S. & Nowinski, W. L. Medical image segmentation using k-means clustering and improved watershed algorithm. In Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation 61–65 (2006).
7. Chang, S. et al. Deformable multi-level feature network applied to nucleus segmentation. Front. Microbiol. 15 (2024).
8. Zhang, W. et al. Keep it accurate and robust: An enhanced nuclei analysis framework. Comput. Struct. Biotechnol. J. 24, 699–710 (2024).
9. Gabdullin, M. T. et al. Automatic cancer nuclei segmentation on histological images: comparison study of deep learning methods. Biotechnol. Bioprocess Eng. 29, 1034–1047 (2024).
10. Mahbod, A. et al. NuInsSeg: A fully annotated dataset for nuclei instance segmentation in H&E-stained histological images. Sci. Data 11 (2024).
11. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE Computer Society, 2016).
12. Lagree, A. et al. A review and comparison of breast tumor cell nuclei segmentation performances using deep convolutional neural networks. Sci. Rep. 11 (2021).
13. Shrestha, A., Bao, X., Cheng, Q. & McRoy, S. CNN-Modified Encoders in U-Net for Nuclei Segmentation and Quantification of Fluorescent Images. IEEE Access 12, 107089–107097 (2024).
14. Cheng, Q. et al. Stiffness of the microenvironment upregulates ERBB2 expression in 3D cultures of MCF10A within the range of mammographic density. Sci. Rep. 6 (2016).
15. Schindelin, J. et al. Fiji: An open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
16. Hollandi, R., Diósdi, Á., Hollandi, G., Moshkov, N. & Horváth, P. AnnotatorJ: An ImageJ plugin to ease hand annotation of cellular compartments. Mol. Biol. Cell 31, 2179–2186 (2020).
Author Contribution
Q.C. conceived the project concept and acquired microscopy data. Z.T., J.F., and A.S. performed the image annotations and created the segmentation masks. A.S. initially developed the code, with Z.T. refining it significantly. J.F. drafted the initial manuscript, after which Z.T. and Q.C. revised and edited it. J.Z. contributed to mask generation and further manuscript editing.
Acknowledgement
We greatly appreciate Dr. Bahram Parvin from the University of Nevada, Reno, for his advice over the past decade.