Large-scale training; Transformer; Protein

We introduce RITA, a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters