Paper Club with Peter - ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
