Lesion segregation dominates early tumour genome evolution


For almost three years we’ve had the pleasure of participating in the Liver Cancer Evolution (LCE) consortium, along with the labs of Duncan Odom, Paul Flicek, Nuria Lopez-Bigas and Martin Taylor, and today sees the publication of the first consortium paper in Nature. Some time ago the Odom lab generated some unprecedented data, mapping genome and transcriptome changes during liver tumour evolution in several strains and species of mice, and we have all collaborated on the analysis of these data. The initial analyses revealed surprising patterns of mutational asymmetry in these model systems: huge multimegabase segments of each chromosome showed strong biases towards particular base substitutions. These patterns emerge because mutagenic DNA lesions go unrepaired over successive cell cycles, and similar patterns can appear in human cells following mutagenesis. Each new round of DNA replication on a lesion-containing strand can lead to the incorporation of a different mispaired base opposite the lesion site in the newly synthesized strand, generating cells in the evolving tumour that carry different mutations of the same base pair. Unrepaired lesion segregation is therefore an unexpected source of diversity during what is otherwise straightforward clonal evolution, and may provide fuel for adaptive evolution in early tumours.
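
To make the idea concrete, here is a toy simulation of lesion segregation. It is only a sketch under loud assumptions, not the analysis from the paper: lesions land on either strand at random, nothing is repaired, the inserted base is drawn uniformly from the three non-reference bases, and we follow only the lineage that inherits the Watson strand. The function name and parameters are made up for illustration.

```python
import random

BASES = "ACGT"

def simulate_lineage(n_sites=1000, generations=3, seed=0):
    """Toy model: follow the daughter lineage that inherits the Watson strand."""
    rng = random.Random(seed)
    # Assumption: each lesion lands on the Watson ('W') or Crick ('C') strand at random.
    lesion_strand = [rng.choice("WC") for _ in range(n_sites)]
    ref_base = [rng.choice(BASES) for _ in range(n_sites)]

    # Only lesions on the inherited strand are ever used as a template in this
    # lineage, so every mutation it carries comes from the same strand.
    templated = [i for i in range(n_sites) if lesion_strand[i] == "W"]

    # Assumption: no repair, and each round of replication over a lesion
    # inserts one of the three non-reference bases with equal probability.
    alleles = {i: set() for i in templated}
    for _ in range(generations):
        for i in templated:
            options = [b for b in BASES if b != ref_base[i]]
            alleles[i].add(rng.choice(options))

    multiallelic = sum(len(a) > 1 for a in alleles.values()) / len(alleles)
    return {
        "mutated_sites": len(templated),
        "untouched_crick_sites": n_sites - len(templated),
        "multiallelic_fraction": multiallelic,
    }

result = simulate_lineage()
print(result)
```

In this sketch the followed lineage is mutated only at Watson-strand lesion sites (complete strand asymmetry), and once a lesion has been replicated over more than once, some sites carry more than one alternative allele.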

Northern exposure


For the past couple of years we’ve been studying the unusual genetics of the Shetland islands with Jim Wilson’s group. Jim has a long history of studying the isolated populations of the Scottish Northern Isles, but we’ve just published the first study based upon whole genome sequencing (WGS), comparing Shetland (n=500) and mainland Scottish (n=1156) populations. The results are quite striking, showing an enrichment of genetic variants that are rare or ultra-rare (i.e. not yet seen elsewhere), such that around 10% of all Shetland variants “are unique to the VIKING cohort or are seen at frequencies at least ten fold higher than in more cosmopolitan control populations”. Many of these variants are predicted to alter gene function, and they are particularly enriched in promoter regions, which control gene expression patterns. This raises the possibility that gene expression may evolve relatively rapidly in isolated human populations.

Modeling the breakome


Tracy’s work building models of DNA double strand break susceptibility finally emerges from review this week in Genome Biology. She shows that it is possible to make remarkably accurate models, predicting the frequency of breakage in a given region of the genome, using a variety of underlying chromatin features. The predicted frequencies from these models can then be compared (above) to the rates of breakage seen in human tumour data, to identify regions that may be important to tumourigenesis. This work bridges the fields of genome instability, chromatin structure and cancer genomics, which is pretty cool until you attempt to find suitably eclectic reviewers! It’s also the first manuscript to come out of our ongoing collaboration with our friends in the Crosetto group at the Karolinska.

Anchors in the storm

Chromatin loop anchors seem to be a basic unit of the physical organisation of the human genome, providing stable architectural sites within the nucleus and influencing gene expression. Vera’s work exploring the strange mutational landscape at loop anchors shows that these sites are also unusually fragile, with high rates of DNA double strand breaks in vitro and elevated rates of breakage in a variety of tumours. Unexpectedly, a substantial fraction of loop anchors also coincide precisely with human recombination hotspots (HS_LAPs below), establishing these sites as foci for change during mammalian evolution as well as during tumourigenesis.


Average human recombination rates within 500 kb of recombination hotspots (HSs), the subset of LAPs overlapping HSs (HS_LAPs) and all LAPs. Recombination rates were derived from the worldwide whole genome sequencing data of the 1000 Genomes Project.
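
An average profile like this can be computed by stacking fixed-width windows around each feature and taking the mean per position. The snippet below is a minimal sketch of that idea, not the code behind the figure; the function name, the binned input and the bin size are all assumptions (for example, with 10 kb bins a 500 kb flank would be 50 bins).

```python
import numpy as np

def mean_profile(rates, centres, flank):
    """Average per-bin rates in a window of +/- flank bins around each centre.

    rates   : 1-D array of recombination rate per genomic bin (hypothetical input)
    centres : bin indices of the features (e.g. hotspots or loop anchors)
    flank   : number of bins to include on each side of a centre
    """
    rates = np.asarray(rates, dtype=float)
    windows = [
        rates[c - flank : c + flank + 1]
        for c in centres
        # Skip features whose window would run off either end of the array.
        if c - flank >= 0 and c + flank < len(rates)
    ]
    return np.mean(windows, axis=0)

# Tiny made-up example: two features that sit exactly on a rate spike.
rates = np.zeros(100)
rates[[20, 50]] = 4.0
profile = mean_profile(rates, centres=[20, 50, 98], flank=5)
print(profile)
```

Running the same function with different sets of centres (all hotspots, all loop anchors, or only the anchors overlapping hotspots) gives one curve per set, which is the kind of comparison the figure makes.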

Lab winter retreat

After a busy year, and a successful QQR for the HGU, we retreated to a chilly North Berwick to do science, beer and roaring fires.


The blind watch-breaker: regulatory evolution in cancers

We still know relatively little about the evolution of gene regulation in cancer. Vera’s study (Kaiser et al, 2016, PLOS Genet) is one of the biggest so far (~1500 tumour whole genomes) and shows that there are remarkably high mutation rates and rapid evolution at most (putatively) functional regulatory sites, a pattern she sees across many cancer types. Particularly striking contrasts are seen between functional (upper graph) and control (lower graph) CTCF binding sites. However, these patterns seem to be adequately explained simply by ‘blind’ mutational bias (i.e. neutral evolution), rather than active selection for particular alterations to regulation.

Chromatin domain trees

Collaborative work with the groups of Ana Pombo, Josee Dostie and Mario Nicodemi finally sees the light of day (Fraser et al, 2015). Using matched chromatin conformation (Hi-C) and expression (CAGE) data across neural differentiation we were able to relate the dynamics of gene expression to changes in chromatin domain organisation. The results suggest that previously known (TAD) domains on the level of ~1Mb congregate within larger multi-megabase (meta-TAD) structures, to produce a hierarchical tree of interactions up to the level of entire chromosomes. Alterations in the arrangement of branches on these trees over time influence gene expression changes.

A new approach to time series data

Stuart has developed a clever new approach to time series data (Aitken et al, 2015), fitting expression profiles over time to a series of archetypical models or ‘kinetic signatures’. Unlike the dominant forms of analysis (clustering, differential expression between time points), this allows us to detect profiles of interest even if they are entirely unique or involve lowly expressed transcripts. That turns out to be particularly handy when studying ncRNA dynamics.
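
As a rough illustration of the general idea (and emphatically not the method from the paper), one can fit each expression profile to a handful of template shapes and keep the best-scoring one. Everything in this sketch is an assumption made for the example: the four template shapes, their fixed parameters, the linear scale-and-offset fit, and the use of BIC to penalise the extra parameter.

```python
import numpy as np

# Hypothetical 'signatures' on a time axis rescaled to [0, 1]; None means constant-only.
SIGNATURES = {
    "flat": None,
    "linear": lambda t: t,
    "early_peak": lambda t: np.exp(-0.5 * ((t - 0.25) / 0.1) ** 2),
    "decay": lambda t: np.exp(-5.0 * t),
}

def fit_template(t, y, shape):
    """Least-squares fit of y ~ a * shape(t) + b; returns (rss, n_params)."""
    if shape is None:
        resid = y - y.mean()
        return float(resid @ resid), 1
    X = np.column_stack([shape(t), np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid), 2

def bic(rss, n, k):
    """Bayesian information criterion; the small constant avoids log(0) on exact fits."""
    return n * np.log(rss / n + 1e-12) + k * np.log(n)

def classify_profile(t, y):
    """Return the name of the template with the lowest BIC for one profile."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    t = (t - t.min()) / (t.max() - t.min())  # rescale time to [0, 1]
    scores = {}
    for name, shape in SIGNATURES.items():
        rss, k = fit_template(t, y, shape)
        scores[name] = bic(rss, len(y), k)
    return min(scores, key=scores.get)

t = np.linspace(0, 10, 12)
low_peak = 0.01 * np.exp(-0.5 * ((t / 10 - 0.25) / 0.1) ** 2)  # lowly expressed, early peak
print(classify_profile(t, low_peak))
```

Because each profile is scored on its own against the templates, a single unusual or very lowly expressed transcript can still be assigned a shape, without needing a cluster of similar profiles around it.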

Human epigenome divergence after duplication

James’ interesting survey of where and when (and even how) duplicated regions diverge in terms of their chromatin structure appears in Genome Biology and Evolution (Prendergast et al, 2014). I realise nobody wants to hear scientists moaning about the peer review process (yet again). But jeez – this paper went through several journals over the course of ~18 months – and is more or less unchanged as a result. Someone needs to at least try to find an alternative for the stodgy, inefficient process we’ve ended up with. Maybe this is it?