List of the 23 Dataset Sources #47

mfmezger · 2020-04-20T07:31:48Z

Hi,

could you share a list of the dataset sources that make up your 23 datasets?

Thank you in advance!

PeterMcGor · 2020-11-17T19:28:16Z

Any news on this?

mfmezger · 2020-11-25T10:02:27Z

No, not yet. @cshwhale, can you help?

gallegi · 2021-08-26T00:12:09Z

I wanna know about this too

mfmezger · 2022-01-05T13:17:41Z

I think these could be the same datasets that Fabian Isensee used in the paper "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation" (published in Nature Methods).

Quote from the data availability section:
"All 23 datasets used in this study are publicly available and can be accessed via their respective challenge websites as follows. D1–D10 Medical Segmentation Decathlon, http://medicaldecathlon.com/; D11 Beyond the Cranial Vault (BCV)-Abdomen, https://www.synapse.org/#!Synapse:syn3193805/wiki/; D12 PROMISE12, https://promise12.grand-challenge.org/; D13 ACDC, https://acdc.creatis.insa-lyon.fr/; D14 LiTS, https://competitions.codalab.org/competitions/17094; D15 MSLes, https://smart-stats-tools.org/lesion-challenge; D16 CHAOS, https://chaos.grand-challenge.org/; D17 KiTS, https://kits19.grand-challenge.org/; D18 SegTHOR, https://competitions.codalab.org/competitions/21145; D19 CREMI, https://cremi.org/; D20–D23 Cell Tracking Challenge, http://celltrackingchallenge.net/."
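
For anyone who wants the quoted list in machine-readable form, here is a minimal Python sketch. The IDs, names, and URLs are copied from the quote above; the names `NNUNET_SOURCES` and `source_for` are made up for illustration and come from neither the paper nor this repository. Note that this is the nnU-Net list, which may or may not match the 23 datasets used here.

```python
from typing import Tuple

# (dataset IDs, name, URL) as given in the quoted data availability statement.
# range(1, 11) covers D1..D10; range(20, 24) covers D20..D23.
NNUNET_SOURCES = [
    (range(1, 11), "Medical Segmentation Decathlon", "http://medicaldecathlon.com/"),
    (range(11, 12), "Beyond the Cranial Vault (BCV)-Abdomen",
     "https://www.synapse.org/#!Synapse:syn3193805/wiki/"),
    (range(12, 13), "PROMISE12", "https://promise12.grand-challenge.org/"),
    (range(13, 14), "ACDC", "https://acdc.creatis.insa-lyon.fr/"),
    (range(14, 15), "LiTS", "https://competitions.codalab.org/competitions/17094"),
    (range(15, 16), "MSLes", "https://smart-stats-tools.org/lesion-challenge"),
    (range(16, 17), "CHAOS", "https://chaos.grand-challenge.org/"),
    (range(17, 18), "KiTS", "https://kits19.grand-challenge.org/"),
    (range(18, 19), "SegTHOR", "https://competitions.codalab.org/competitions/21145"),
    (range(19, 20), "CREMI", "https://cremi.org/"),
    (range(20, 24), "Cell Tracking Challenge", "http://celltrackingchallenge.net/"),
]


def source_for(dataset_id: int) -> Tuple[str, str]:
    """Return (name, url) for a dataset number such as 14 for D14."""
    for ids, name, url in NNUNET_SOURCES:
        if dataset_id in ids:
            return name, url
    raise KeyError(f"D{dataset_id} is not in the quoted list")
```

Calling `source_for(14)` looks up D14 and returns the LiTS entry; any number outside 1 to 23 raises a `KeyError`.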

lyhyl · 2024-11-19T14:14:21Z

The article behind this repository was first posted to arXiv in 2019, whereas nnU-Net was formally published in Nature Methods in 2020. nnU-Net had already appeared on arXiv in 2018, but the datasets used at that time do not appear to have been listed explicitly.

Among the 23 datasets referenced in the nnU-Net study, D20–D23 come from the Cell Tracking Challenge, which is microscopy data. These seem less appropriate for pretraining aimed at organ segmentation. Furthermore, the arXiv paper for this work describes how the "3DSeg-8" dataset was aggregated. The "3DSeg-8" datasets are likely a subset of the 23 datasets used for the pretrained models here, although they seem to differ from the 23 datasets listed in the nnU-Net paper.

The sources cited for the "3DSeg-8" dataset in the original article:

[6] Liver tumor segmentation challenge. https://competitions.codalab.org/competitions/17094#results
[19] Bjoern H Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10):1993, 2015.
[33] Medical segmentation decathlon. http://medicaldecathlon.com/index.html
[36] Catalina Tobon-Gomez, Arjan J Geers, Jochen Peters, Jürgen Weese, Karen Pinto, Rashed Karim, Mohammed Ammar, Abdelaziz Daoudi, Jan Margeta, Zulma Sandoval, et al. Benchmark for algorithms segmenting the left atrium from 3D CT and MRI datasets. IEEE Transactions on Medical Imaging, 34(7):1460–1473, 2015.

Please point out any inaccuracies.

BTW, the dataset page for reference [36], https://github.com/catactg/lasc, states:

"The data agreement for CT datasets expired on September 2018. Therefore, we can not share these datasets anymore."

mfmezger mentioned this issue Jan 5, 2022

The dataset list of 23 datasets for the pre-trained model #65

Open

mfmezger mentioned this issue Mar 2, 2023

on which datasets the models are pretrained ? #70

Open
