RNAGym: Large-scale Benchmarks for RNA Fitness and Structure Prediction

Rohit Arora, Murphy Angelo, Christian Andrew Choe, Courtney A. Shearer, Aaron W. Kollasch, Fiona Qu, Ruben Weitzman, Artem Gazizov, Sarah Gurev, Erik Xie, Debora Marks, Pascal Notin

Preprint. Large-scale benchmarks to assess models for RNA fitness and structure prediction.

Large-scale discovery, analysis, and design of protein energy landscapes

Állan J. R. Ferrari, Sugyan M. Dixit, Jane Thibeault, Mario Garcia, Scott Houliston, Robert W. Ludwig, Pascal Notin, Claire M. Phoumyvong, Cydney M. Martell, Michelle D. Jung, Kotaro Tsuboyama, Lauren Carter, Cheryl H. Arrowsmith, Miklos Guttman, Gabriel J. Rocklin

Preprint. A multiplexed experimental approach to analyze conformational fluctuations across thousands of protein domains, revealing hidden variations that affect protein cooperativity and function.

Predicting Promoter Variant Effects from Evolutionary Sequences

Courtney A. Shearer, Felix Teufel, Rose Orenbuch, Daniel Ritter, Aviv Spinner, Erik Xie, Jonathan Frazer, Mafalda Dias, Pascal Notin, Debora S. Marks

Preprint. A conditional autoregressive transformer model trained on 14.6 million mammalian promoter sequences that achieves state-of-the-art performance in predicting the effects of indels in human promoter regions.

TranceptEVE: Combining Family-specific and Family-agnostic Models of Protein Sequences for Improved Fitness Prediction

Pascal Notin, Lood van Niekerk, Aaron W Kollasch, Daniel Ritter, Yarin Gal, Debora S. Marks

NeurIPS, LMRL, 2022. A hybrid family-specific and family-agnostic model that achieves state-of-the-art performance on protein fitness prediction and human variant annotation.

RITA: a Study on Scaling Up Generative Protein Sequence Models

Daniel Hesslow, Niccolò Zanichelli, Pascal Notin, Iacopo Poli, Debora Marks

ICML, WCB, 2022. The first paper investigating scaling laws in protein language modeling.

Viral Evolution and Antibody Escape Mutations using Deep Generative Models

Nicole Thadani, Nathan Rollins, Sarah Gurev, Pascal Notin, Yarin Gal, Debora Marks

We leverage deep generative models of evolutionary sequences to predict viral escape mutations.

Improving Compute Efficacy Frontiers with SliceOut

Pascal Notin, Aidan N. Gomez, Joanna Yoo, Yarin Gal

Preprint. A memory-efficient dropout-inspired scheme to train large neural networks faster with no loss in accuracy.

Principled Uncertainty Estimation for High Dimensional Data

Pascal Notin, José Miguel Hernández-Lobato, Yarin Gal

We introduce an importance-sampling-based estimator of the epistemic uncertainty of deep learning models on high-dimensional discrete datasets.