Understanding how evolution works using tools of statistical physics
Using analogies from statistical physics to understand emergent phenomena in evolution
Natural selection acts on phenotypes, yet variation in phenotypes arises from mutations at the genetic level. It is increasingly recognised that a crucial missing ingredient for a full understanding of evolution, even at a qualitative level, is the role of genotype-phenotype maps, that is, the mapping from sequence to function. Quantitative and predictive theories of evolution that account for this mapping will be required to understand several pressing problems, including antibiotic resistance, virus evolution, and cancer evolution. A key generic prediction is that small, drift-dominated populations adapt to the phenotypes that have the most sequences coding for them, rather than to the optimal phenotypes (Khatri, PNAS 2009, Genetics 2016). This means that smaller populations undergo more rapid speciation, because their common ancestors are more likely to be maladapted (Khatri, JTB 2015, Khatri, Genetics, 2015 & Khatri, bioRxiv123265, 2017). In addition, I have developed a theory for the evolution of phenotypes within a continuous stochastic dynamics framework (Khatri et al, JTB, 2015); this holds the promise of greatly simplified modelling of complex genotype-phenotype maps in the future.
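The trade-off between fitness and the number of coding sequences can be illustrated with a minimal sketch. It assumes, purely for illustration, a Boltzmann-like stationary distribution in which the probability of a phenotype is proportional to Omega * exp(nu * F), where Omega is the number of sequences coding for the phenotype, F is its log-fitness and nu = 2N for population size N. The two phenotypes, their values and the factor of 2 in nu are hypothetical and depend on the population model; this is not the calculation from the cited papers.

```python
import math

def phenotype_probabilities(log_fitness, degeneracy, population_size):
    """Stationary probabilities proportional to Omega * exp(nu * F).

    Assumption (illustrative only): nu = 2 * population_size.
    """
    nu = 2.0 * population_size
    # Work with log-weights and subtract the maximum to avoid overflow.
    log_weights = [math.log(omega) + nu * f
                   for f, omega in zip(log_fitness, degeneracy)]
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical two-phenotype landscape:
#   index 0: optimal phenotype, coded for by few sequences
#   index 1: slightly less fit phenotype, coded for by many sequences
LOG_FITNESS = [0.0, -0.01]
DEGENERACY = [10, 1000]

small_population = phenotype_probabilities(LOG_FITNESS, DEGENERACY, 10)
large_population = phenotype_probabilities(LOG_FITNESS, DEGENERACY, 10_000)
```

Under these assumptions the small population spends most of its time on the highly degenerate, suboptimal phenotype, while the large population is found almost exclusively on the optimal one.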
Detecting selection in longitudinal deep sequencing of viruses
My work aims to understand the basic evolutionary forces that shape the genomes of pathogens and the stochastic dynamics of variants in a host. A key quantity to calculate is the probability of observing a given change in the frequency of a variant over a fixed time interval. This is a difficult mathematical diffusion problem that has eluded accurate solution for changes over short times. The key difficulty is that the variance of allele frequencies (or diffusion constant) is not the same for all allele frequencies. Fisher's angular transformation gives a variance that is independent of the new angular frequency variable, so the dynamics become effectively simple Brownian motion. The cost is a non-linear potential, which is a manifestation of the effects of genetic drift and is probably why this transformation has gone largely unnoticed and unexploited, despite Fisher discovering it almost a century ago. To make practical use of the transformation, I developed a heuristic Gaussian-based method that gives simple and accurate solutions including genetic drift, selection and mutation. This work (Khatri, Sci. Rep. 2016) provides a foundation for analysing longitudinal deep-sequencing data. More recently, I used the same framework for a new analysis that can distinguish between transmission and intrahost fitness effects in phylogenetic trees (Khatri et al, in preparation) and for a new method to calculate the rate of fixation of rare variants (Khatri, bioRxiv123232, 2017).
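The variance-stabilising property of the angular transformation can be checked numerically. The sketch below assumes a haploid Wright-Fisher model with pure drift (binomial sampling of N individuals per generation) and the form theta = arccos(1 - 2p) for the transformation; the function names are illustrative and not taken from the cited work. It computes the exact one-generation variance of the frequency p and of the angle theta by summing over the binomial distribution.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial probability, evaluated in log space for numerical stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def angular_transform(p):
    """Assumed form of Fisher's angular transformation."""
    return math.acos(1.0 - 2.0 * p)

def one_generation_variance(p0, n, transform=lambda p: p):
    """Variance of transform(p) after one generation of drift from p0 (0 < p0 < 1)."""
    mean = 0.0
    mean_sq = 0.0
    for k in range(n + 1):
        weight = binomial_pmf(k, n, p0)
        x = transform(k / n)
        mean += weight * x
        mean_sq += weight * x * x
    return mean_sq - mean * mean

N = 1000
for p0 in (0.1, 0.3, 0.5):
    var_p = one_generation_variance(p0, N)
    var_theta = one_generation_variance(p0, N, angular_transform)
    # var_p follows p0 * (1 - p0) / N; var_theta stays close to 1 / N.
    print(f"p0={p0}: N*Var(p)={N * var_p:.4f}, N*Var(theta)={N * var_theta:.4f}")
```

In this model the variance of p changes with the starting frequency as p0(1 - p0)/N, whereas the variance of theta stays close to 1/N for all three starting frequencies, which is the sense in which the transformed variable diffuses like simple Brownian motion. The sketch does not include the non-linear potential, selection or mutation.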