Large-scale training; Dropout; CNN; Transformer

We introduce a dropout-inspired scheme to train large neural networks faster with no loss in accuracy.