Releases: aws-samples/awsome-distributed-training
Release before the mass migration work
This release points to the old directory structure and test cases.
This release adds a new opt-in OpenZFS filesystem that serves as the home directory on SageMaker HyperPod Slurm clusters. It addresses the Lots of Small Files (LoSF) issue that frequently occurs when creating Conda environments in default home directories that reside on Lustre.
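To get a rough sense of whether a directory such as a Conda environment is a LoSF workload, one option is to count how many of its regular files fall below a size threshold. The sketch below is illustrative only and is not part of this release: the function name `count_small_files` and the 64 KiB default threshold are assumptions chosen for the example.

```python
import os
import stat
import sys


def count_small_files(root, threshold_bytes=64 * 1024):
    """Return (small, total) for regular files found under root.

    `small` counts files strictly smaller than threshold_bytes.
    The 64 KiB default is an arbitrary, illustrative cutoff.
    """
    small = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are not followed
                info = os.lstat(path)
            except OSError:
                # Skip files that vanish or cannot be read
                continue
            if not stat.S_ISREG(info.st_mode):
                # Ignore symlinks, sockets, and other non-regular entries
                continue
            total += 1
            if info.st_size < threshold_bytes:
                small += 1
    return small, total


if __name__ == "__main__":
    # Usage: python count_small_files.py /path/to/conda/env
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    small, total = count_small_files(target)
    print(f"{small} of {total} regular files are under 64 KiB in {target}")
```

A high ratio of small files to total files suggests the directory will be dominated by metadata operations, which is the access pattern the LoSF issue refers to.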