List of the 23 Dataset Sources #47

Open
mfmezger opened this issue Apr 20, 2020 · 5 comments

Comments

@mfmezger

Hi,

Could you share a list of the dataset sources that make up your 23 datasets?

Thank you in advance!

@PeterMcGor

Any news on this?

@mfmezger
Author

No, not yet. @cshwhale, can you help?

@gallegi

gallegi commented Aug 26, 2021

I'd like to know about this too.

@mfmezger
Author

mfmezger commented Jan 5, 2022

I think these could be the same datasets that Fabian Isensee used in the paper "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation". Link to the paper: Nature.

Quote from the Data availability section:
"All 23 datasets used in this study are publicly available and can be accessed via their respective challenge websites as follows. D1–D10 Medical Segmentation Decathlon, http://medicaldecathlon.com/; D11 Beyond the Cranial Vault (BCV)-Abdomen, https://www.synapse.org/#!Synapse:syn3193805/wiki/; D12 PROMISE12, https://promise12.grand-challenge.org/; D13 ACDC, https://acdc.creatis.insa-lyon.fr/; D14 LiTS, https://competitions.codalab.org/competitions/17094; D15 MSLes, https://smart-stats-tools.org/lesion-challenge; D16 CHAOS, https://chaos.grand-challenge.org/; D17 KiTS, https://kits19.grand-challenge.org/; D18 SegTHOR, https://competitions.codalab.org/competitions/21145; D19 CREMI, https://cremi.org/; D20–D23 Cell Tracking Challenge, http://celltrackingchallenge.net/."

@lyhyl

lyhyl commented Nov 19, 2024

The paper for this work was first posted on arXiv in 2019, whereas nnU-Net was formally published in Nature Methods in 2020. Although nnU-Net first appeared on arXiv in 2018, the datasets used at that time do not appear to have been listed explicitly.

Among the 23 datasets referenced in the nnU-Net study, D20–D23 come from the Cell Tracking Challenge (fluorescence microscopy), which seems less suitable for pretraining aimed at organ segmentation. Furthermore, the arXiv paper for this work describes how the "3DSeg-8" dataset was aggregated. The "3DSeg-8" datasets are likely a subset of the 23 datasets mentioned, although they seem to differ from the datasets directly referenced in nnU-Net.

The sources cited for the "3DSeg-8" dataset in the original article:

[6] Liver tumor segmentation challenge. https://competitions.codalab.org/competitions/17094#results.
[19] Bjoern H Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10):1993, 2015.
[33] Medical segmentation decathlon. http://medicaldecathlon.com/index.html.
[36] Catalina Tobon-Gomez, Arjan J Geers, Jochen Peters, Jürgen Weese, Karen Pinto, Rashed Karim, Mohammed Ammar, Abdelaziz Daoudi, Jan Margeta, Zulma Sandoval, et al. Benchmark for algorithms segmenting the left atrium from 3D CT and MRI datasets. IEEE Transactions on Medical Imaging, 34(7):1460–1473, 2015.
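
To make the overlap concrete, here is a minimal Python sketch that compares the sources cited for 3DSeg-8 with the nnU-Net list quoted earlier in this thread. The dictionary names, the short labels, and the `normalize` helper are my own assumptions for illustration, not anything from either paper. References [19] and [36] are cited without a URL, so they are stored as `None`.

```python
from urllib.parse import urlsplit

# Names and URLs copied from the nnU-Net "Data availability" quote.
NNUNET_SOURCES = {
    "D1-D10 Medical Segmentation Decathlon": "http://medicaldecathlon.com/",
    "D11 BCV-Abdomen": "https://www.synapse.org/#!Synapse:syn3193805/wiki/",
    "D12 PROMISE12": "https://promise12.grand-challenge.org/",
    "D13 ACDC": "https://acdc.creatis.insa-lyon.fr/",
    "D14 LiTS": "https://competitions.codalab.org/competitions/17094",
    "D15 MSLes": "https://smart-stats-tools.org/lesion-challenge",
    "D16 CHAOS": "https://chaos.grand-challenge.org/",
    "D17 KiTS": "https://kits19.grand-challenge.org/",
    "D18 SegTHOR": "https://competitions.codalab.org/competitions/21145",
    "D19 CREMI": "https://cremi.org/",
    "D20-D23 Cell Tracking Challenge": "http://celltrackingchallenge.net/",
}

# Sources cited for 3DSeg-8; None means the citation gives no URL.
SEG8_SOURCES = {
    "[6] LiTS": "https://competitions.codalab.org/competitions/17094#results",
    "[19] BraTS": None,
    "[33] Medical Segmentation Decathlon": "http://medicaldecathlon.com/index.html",
    "[36] Left atrium benchmark": None,
}


def normalize(url):
    """Reduce a URL to host + path, ignoring scheme, fragment and index.html."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("index.html"):
        path = path[: -len("index.html")]
    return parts.netloc.lower() + path.rstrip("/")


def shared_sources(seg8, nnunet):
    """Return the 3DSeg-8 entries whose URL also appears in the nnU-Net list."""
    known = {normalize(url) for url in nnunet.values()}
    return [
        name
        for name, url in seg8.items()
        if url is not None and normalize(url) in known
    ]


if __name__ == "__main__":
    for name in shared_sources(SEG8_SOURCES, NNUNET_SOURCES):
        print("listed in both:", name)
```

By URL alone, only [6] and [33] coincide with the nnU-Net list. This check is deliberately crude: it cannot tell whether one collection redistributes data from another, so [19] and [36] being unmatched here does not prove their data is absent from the 23 datasets.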

Please point out any inaccuracies.

BTW, the dataset page for reference [36], https://github.com/catactg/lasc, states:

The data agreement for CT datasets expired on September 2018. Therefore, we can not share these datasets anymore.
