Why say lot word when few do trick?
June 08, 2025
The Minimum Description Length (MDL) principle, Kolmogorov complexity, linear regression, data compression and learning.
Written by Liam Bai, who works on software at Ginkgo Bioworks and writes about math, AI, and biology. He's on LinkedIn and Twitter.