DeepSpeed Data Efficiency: A composable library that makes better use of data, increases training efficiency, and improves model quality
Partition-aware ZeRO with up to 2x reduction in communication time!
DeepSpeed was used to train what was, at the time, the world’s largest language model.