The twofold increase in translocation breakpoints at G4 DNA-forming sequences found here is likely a lower bound, since the deletions or additions of nucleotides that often accompany DSB repair may have moved a number of G4 DNA-associated breakpoints outside of the detection range. Nevertheless, our estimates are in line with determinations of mutations at non-B DNA-forming motifs in cancer genomes (Georgakopoulos-Soares et al., 2018) and strengthen the concept, supported by both genome-wide (Bacolla et al., 2016; Zhao et al., 2018) and targeted (Javadekar et al., 2018; Pannunzio and Lieber, 2018; Smida et al., 2017) studies, that non-B DNA structures contribute to mutagenesis in cancer and in genetic disease (Kamat et al., 2016).
The prevalence of G4-associated translocation breakpoints in tumor samples carrying extensive genomic alterations is consistent with reports that TP53 mutant tumors are associated with high rates of genomic instability (reviewed in Hanel and Moll, 2012). TP53 mutants have been reported to sequester DNA repair factors (Hanel and Moll, 2012), such as MRE11, away from double-strand breaks and at stalled replication forks (Roy et al., 2018), leading to an accumulation of translocations (Buis et al., 2008; Syed and Tainer, 2018). Thus, it is likely that TP53-induced instability is related to defects in homologous recombination repair and in balancing responses at stalled replication forks, including recruitment of the MRE11 nuclease, which initiates break and fork processing (Roy et al., 2018; Schlacher et al., 2011; Shibata et al., 2014). Changes in gene expression, which we show here are extensive and impact cancer mutagenesis, are also likely to influence G4 DNA structure-induced genetic instability (Day et al., 2017). We are also learning how replication and repair proteins, such as FEN1, can act in trans to greatly impact mutations, so molecular mechanisms are expected to be key to improving predictions (Tsutakawa et al., 2017). With this in mind, it will be important to elucidate the roles of the other 4 mutated genes (PTPRD, GATA3, KRAS and CTNNB1) in the susceptibility to incur strand breaks at G4 DNA-forming sequences.
As L1 and SVA transposable elements contain G4 DNA-forming sequences (Kejnovsky et al., 2015; Lexa et al., 2014; Sahakyan et al., 2017) that occur in the human genome in the thousands, they are key candidates for G4-dependent translocations. It was surprising to find that translocation breakpoints were enriched at SVA but not at L1 sites, given: 1) that the number of L1PA elements is twice that of SVAs; 2) that retrotransposition and transcription occur more robustly for L1PA than for SVA elements (Lee et al., 2012) despite the former being older elements, 7.6–18.0 Myrs for L1PA3-5 elements (Khan et al., 2006) versus 3.8–9.5 Myrs for SVA_D-F (Wang et al., 2005); and 3) that SVAs rely on ORF2p, and perhaps ORF1p, for transcription, both of which are provided by L1 elements (Raiz et al., 2012). Indeed, deletion of the G4-forming repeats, which act as entry points for transcription, is detrimental to SVA retrotransposition (Raiz et al., 2012). Hence, our findings expand the repertoire of genomic alterations attributed to SVA elements, which has included germline rearrangements (Hancks and Kazazian, 2016; Vogt et al., 2014) and chromosomal breakages leading to chromothripsis (Hancks, 2018). Thus, it is possible that some SVAs may be particularly active and a source of recurrent strand breaks, so methods to bridge from macromolecular complexes to imaging may prove important for a molecular understanding (Brosey et al., 2017).
4.2. Gene expression and somatic mutations
Despite the realization that correlation does not necessarily imply causation, we undertook a comprehensive analysis of gene expression and its relationships to mutational loads in cancer genomes with the goal of finding common trends across tumor types of possible predictive value.
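As an illustration of the kind of per-gene comparison described above, the following minimal sketch computes a rank (Spearman) correlation between one gene's expression and the mutational load across tumor samples. The choice of Spearman correlation, the function names, and the toy values are assumptions made for illustration only; they are not taken from this study and do not reproduce its method or data.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the value at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy values (NOT data from this study): expression of one
# gene and the mutation count in each of five tumor samples.
expression = [9.1, 7.4, 6.0, 4.2, 2.5]
mutation_load = [40, 55, 90, 150, 400]

rho = spearman(expression, mutation_load)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based statistic is used in this sketch because mutation counts are typically heavily skewed; a value of rho near -1 would correspond to the kind of anticorrelation discussed in the next subsection, without implying causation.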
4.2.1. Negative correlations
The extent to which many genes displayed strong correlation between their expression and mutational loads was surprising, which prompted us to focus on some of the top correlated genes to better elucidate some of the potential causative relationships. The second strongest anticorrelation was that of MLH1 in ESCA. MLH1 mediates protein-protein interactions during mismatch recognition, strand discrimination, and strand removal. In colon and rectal cancers, hypermutation has been linked in part to MLH1 hypermethylation (TCGAN, 2012), and in esophageal squamous cell carcinoma MLH1 promoter methylation correlates with weak expression and poor survival (Chen et al., 2016). Our results strengthen the role of low MLH1 mRNA levels in elevating mutation loads. Methods to examine MLH1 activities in the context of its multiple partners, such as those developed for X-ray scattering from gold nanocrystals and single molecule forceps (Wang et al., 2018), will be critical to develop a predictive mechanistic understanding of its impact on cancer biology (Hura et al., 2013; Rambo and Tainer, 2010). In fact, X-ray scattering may prove to be an enabling method for defining the many solution complexes and conformations (Rambo and Tainer, 2013) underlying outcomes to replication stress.