Dataset - Parse2022 - Grand Challenge

[Notice] Anyone using this dataset should cite the following papers, including the challenge overview paper on arXiv.

Luo, Gongning, et al. "Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge." arXiv preprint arXiv:2304.03708 (2023).

Chu, Y., Luo, G., Zhou, L. et al. Deep learning-driven pulmonary artery and vein segmentation reveals demography-associated vasculature anatomical differences. Nat Commun 16, 2262 (2025). https://doi.org/10.1038/s41467-025-56505-6

2025/09/10 The dataset can now be downloaded freely.

The training set can be downloaded from either of the following links:
Google Drive: https://drive.google.com/file/d/1_-w8kNc2k4ttHTrVnRWaEaLbjGmNWRZD/view?usp=sharing
Baidu Cloud: https://pan.baidu.com/s/1DKSIOlu0bz1_w2SaoSmw6A?pwd=aiuu

The validation set can be downloaded from either of the following links:
Google Drive: https://drive.google.com/file/d/1_svniY7cWguxxkUWC6YgWz_jSki1gUXi/view?usp=sharing
Baidu Cloud: https://pan.baidu.com/s/1DtGgBLbNb8D59OIv69yb3A?pwd=rd8f

In addition, we highly recommend the HiPaS dataset, which provides artery and vein segmentation for 250 cases, covering both non-contrast CT and CTPA. It can be accessed directly via: https://zenodo.org/records/14879605

2023/02/01 For future researchers, the training and validation sets are open and available. Remember to send the signed document to PARSE2022@hotmail.com for participation. Only after the agreement is received will the dataset download link be emailed.

Attention, please! Data access rules: Participants should click the Join button, fill out the online registration form, and send the signed document (see the Participation Rules page) to PARSE2022@hotmail.com. After that, we will send the data download link by email.

Our dataset contains 200 3D volumes with refined pulmonary artery labels. These contrast-enhanced CT pulmonary angiography (CTPA) data were acquired with a dual-source 64-slice CT scanner at Harbin Medical University, Harbin, China. Ten experts, each with more than 5 years of clinical experience, participated in the labeling work. The annotation was performed on the basis of a region-growing algorithm using MIMICS software.

The image sizes range from 512×512×228 to 512×512×376 voxels. In-plane pixel sizes range from 0.50 mm to 0.95 mm, and the slice thickness is 1 mm. The images are stored as .nii.gz files. Voxel-level segmentation labels are: 0 - Background, 1 - Pulmonary artery.
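As a quick sanity check after downloading, the geometry of a volume can be compared with the ranges quoted above. The following is only a minimal sketch using the Python standard library: it reads the fixed 348-byte NIfTI-1 header directly, and the function names and the tolerance are hypothetical choices made for illustration. It assumes the files are NIfTI-1, that the axes are ordered x, y, z, and that the slice spacing in the header equals the stated 1 mm slice thickness. In practice a dedicated library such as nibabel or SimpleITK would normally be used to load the image data itself.

```python
import gzip
import struct

# Ranges quoted in the dataset description.
EXPECTED_XY = (512, 512)
SLICE_RANGE = (228, 376)
PIXEL_MM_RANGE = (0.50, 0.95)


def read_nifti1_geometry(path):
    """Return (shape, spacing_mm) from the NIfTI-1 header of a .nii.gz file."""
    with gzip.open(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError("file too short to contain a NIfTI-1 header")
    # The first int32 (sizeof_hdr) must be 348; use it to detect byte order.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    # dim[0] is the number of dimensions, dim[1..] are the sizes.
    dim = struct.unpack(endian + "8h", header[40:56])
    # pixdim[1..] are the voxel spacings along each axis.
    pixdim = struct.unpack(endian + "8f", header[76:108])
    ndim = dim[0]
    return dim[1:1 + ndim], pixdim[1:1 + ndim]


def matches_description(shape, spacing, tol=1e-3):
    """Check a volume's geometry against the ranges stated in the text."""
    if len(shape) != 3 or len(spacing) != 3:
        return False
    x, y, z = shape
    sx, sy, sz = spacing
    lo, hi = PIXEL_MM_RANGE
    return (
        (x, y) == EXPECTED_XY
        and SLICE_RANGE[0] <= z <= SLICE_RANGE[1]
        and lo - tol <= sx <= hi + tol
        and lo - tol <= sy <= hi + tol
        and abs(sz - 1.0) <= tol
    )
```

For example, `matches_description(*read_nifti1_geometry(path))` returns True for a file whose header falls within the stated ranges, where `path` points to any downloaded .nii.gz volume. A False result is a prompt to inspect the file, not proof that it is corrupt, since the assumptions above may not hold for every case.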

The split into training, validation, and test cases is as follows:

Training cases: 100 (a relatively large number of cases is used to train a robust model).
Open validation cases: 30 (a relatively small number of cases is used to validate the participants' algorithms, to verify the evaluation code, and to ensure the fairness of the challenge. Keeping this set small also avoids disclosing the data distribution of the test set).
Closed test cases: 70 (a relatively large number of cases is used to produce a fair final leaderboard).
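The counts above can be turned into proportions with a few lines of arithmetic. This is just an illustrative sketch; the dictionary and variable names are made up for the example.

```python
# Case counts taken from the list above.
SPLIT = {"training": 100, "validation": 30, "test": 70}

total = sum(SPLIT.values())  # 200, matching the 200 volumes in the dataset
proportions = {name: count / total for name, count in SPLIT.items()}

for name, share in proportions.items():
    print(f"{name}: {SPLIT[name]} cases ({share:.0%})")
```

The split therefore corresponds to 50% training, 15% validation, and 35% test cases.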

To access the full datasets for the anatomical study, participants should fill out the online registration form and send the signed document to PARSE2022@hotmail.com. Please state the intended data usage (e.g., artery-vein segmentation and anatomical study). After that, we will send the data download link by email.