The elusive 8p minimal critical region
The search for 8p drivers has been a journey to genomic destinations unknown. We seek to answer the question: what do different megabase alterations to the short arm of chromosome 8 have in common?
Megabase alterations to the short arm of chromosome 8 are a plot to overthrow the organism, not the actions of a lone wolf. A whodunit orchestrated by a shadowy web of co-conspirators, double agents, and unwitting patsies. Innocent bystanders get swept up in the confusion. Everyone in a defined radius is under suspicion.
The evidence collected is often circumstantial, with no obvious smoking gun. There are too many leads to chase down, most of them dead ends, and not enough investigators assigned to the case. The manhunt to round up driver genes feels like a race against time.
While dramatic, this analogy accurately portrays the challenges of studying 8p chromosomal disorders and other complex cases of genomic imbalance that do not obey the deterministic laws of classical aneuploidy. The classical aneuploidy framework works best in cases where a single driver gene, amplified in a duplication or lost in a deletion, is solely responsible for pathophysiology.
One example is Phelan-McDermid Syndrome, which may cause 1% of autism spectrum disorders. Deletions as small as one kilobase and up to 9 megabases on the long arm of chromosome 22 — 22q13del — knock out one copy of the SHANK3 gene, a situation that more closely matches the classical aneuploidy framework.
The genes located on 8p are not the only drivers of interest because the whole genome may be in on it. In the Grand Unified Theory of Neurodevelopmental Chromosome Disorders, the Project 8p science team introduced the concept of a probabilistic description of chromosomal mechanics that seeks to explain and predict the interactions within and between chromosomes modeled as dynamically phase-transitioning polymers.
However, unlike in the atomic world where things get weirder the smaller you go, chromosome behavior gets even funkier as you go from nucleotides to nucleosomes to the megabase scale.
Returning to the driver gene manhunt analogy, coordinating among different law enforcement agencies and jurisdictions that don’t always see eye to eye is no small feat. Molecular and cell biologists tend to focus on form: chromatin structure. Geneticists tend to focus on function: effects in cis or in trans. Systems biologists like to think they unite form and function: insert trendy genomics technique here.
Given that hundreds of genes — not to mention noncoding genes and other conserved functional elements — are directly physically affected by megabase-sized 8p alterations, the problem isn’t issuing indictments for suspected driver genes but convincing the court to convict.
Booth didn’t act alone, but Stanton had to prove there was a grand conspiracy to topple the Union government that was directly ordered by Jefferson Davis and carried out by a network of co-conspirators.
Back to chromosomes. It may not be the dosage of a particular driver gene located on the short arm of chromosome 8 that is driving disease or a disease phenotype. Instead, the levels of a protein encoded by an 8p gene might affect the dosage of a gene located on, say, chromosome 7p.
The 7p gene may be activated (de-repressed) or inhibited by the 8p gene, and then this change causes a disease phenotype or phenotypes. The levels of the 7p protein go on, in turn, to affect the dosage of another gene located on, say, chromosome 6p. And so on.
Who is really driving the car? Perhaps we should refer to the driver genes that are not located on 8p as backseat drivers.
The foundational paper Okur et al., 2021 describes the first clinical cohort of 8p heroes. In that study, the authors noted six 8p hero genotypes and four minimal critical regions. Independently, two overlapping minimal critical regions in the 8p inverted duplications have been reported by Sajan et al., 2013 and Vibert et al., 2021.
The six 8p hero genotypes are shown in the figure below, which includes chromosome 8 genome coordinates:
Deletion with inverted duplication, or invdupdel (most common genotype)
Duplication
Distal deletion
Distal & proximal deletion
Proximal deletion
Proximal duplication
Almost all 8p invdupdel heroes have similar, though not identical, chromosomal breakpoints. In the figure below, notice the yellow highlighted gap segment sandwiched between the purple deleted segments and the blue duplicated segments. Two clusters of defensin genes line both sides of that gap.
Also note the brackets. The top four rows are the Okur et al minimal critical regions (#1-4). The bottom two rows are the Sajan et al minimal critical region (#5) and the Vibert et al minimal critical region (#6).
The yellow highlighted region is blown up in the next figure. The locations of defensin genes are also highlighted in yellow. A future Substack will go into greater detail about why these defensin gene clusters create a hotspot for inverted duplication events in meiosis.
For now, this genome browser resolution view provides a high-level explanation for the pattern of recurrent breakpoints:
The Supplemental Data section of Okur et al does an excellent job spotlighting potential driver genes of interest in each of the 8p minimal critical regions. In lieu of a recitation of genes located in each of the 8p minimal critical regions they identified, let’s take a closer look at the inverted duplication segment that affects almost all 8p invdupdel heroes:
The yellow-highlighted blue segment is referred to as the agenesis of the corpus callosum (ACC) minimal critical region, which is located in inverted duplications. The orange box delineates a 5.1-megabase segment of interest — a search perimeter — that includes a small portion of the Okur et al inverted duplication minimal critical region as well as the Sajan et al and Vibert et al inverted duplication minimal critical regions.
A long-suspected driver gene of interest, RHOBTB2, resides in this 5.1-megabase region, along with neighboring genes that have something in common.
Credit to Project 8p researcher Prof Stefan Pinter who made the following insightful observations about potential driver genes in this 5.1-megabase stretch. There are six genes with high triplosensitivity scores (pTS > 0.9), meaning having three copies of the gene is likely to be pathogenic: DPYSL2, PPP2R2A, EBF2, PTK2B, KCTD9, RHOBTB2. Three of those triplosensitive genes (in bold) are also CUL3 targets.
Two other CUL3 targets live a few floors down on 8p: NEFL (neurofilament light chain) and NEFM (neurofilament medium chain). CUL3 is a ubiquitin E3 ligase that regulates protein degradation. A third copy of a triplosensitive gene may overwhelm the protein degradation capacity of the cell. Consistent with that model, two of the CUL3 targets in the 8p duplicated interval (PPP2R2A & RHOBTB2) can give rise to neurodevelopmental disorders on their own.
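The double filter described above (triplosensitive and a CUL3 target) can be sketched in a few lines of Python. This is only an illustration: the pTS values below are hypothetical placeholders, since the text says only that the six named genes score above 0.9, and PLACEHOLDER_A and PLACEHOLDER_B are made-up genes added so the cutoff has something to exclude. The CUL3 target set contains only the four genes named in the text.

```python
PTS_CUTOFF = 0.9

# Hypothetical scores: the six named genes are only known to exceed 0.9.
# PLACEHOLDER_A and PLACEHOLDER_B are invented to show the cutoff at work.
PTS_SCORES = {
    "DPYSL2": 0.95,
    "PPP2R2A": 0.95,
    "EBF2": 0.95,
    "PTK2B": 0.95,
    "KCTD9": 0.95,
    "RHOBTB2": 0.95,
    "PLACEHOLDER_A": 0.40,
    "PLACEHOLDER_B": 0.72,
}

# CUL3 targets named in the text (two in the duplicated interval, two elsewhere on 8p).
CUL3_TARGETS = {"PPP2R2A", "RHOBTB2", "NEFL", "NEFM"}


def triplosensitive(scores, cutoff=PTS_CUTOFF):
    """Return genes whose pTS score is strictly above the cutoff, sorted by name."""
    return sorted(gene for gene, score in scores.items() if score > cutoff)


def double_hits(scores, targets, cutoff=PTS_CUTOFF):
    """Return triplosensitive genes that are also in the target set."""
    return sorted(set(triplosensitive(scores, cutoff)) & set(targets))


candidates = triplosensitive(PTS_SCORES)
suspects = double_hits(PTS_SCORES, CUL3_TARGETS)
print(candidates)
print(suspects)
```

With real pTS scores and a fuller CUL3 target list swapped in, the same intersection would narrow any search perimeter to genes that are flagged on both counts.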
But like any grand conspiracy, there’s always another layer.
Neurodevelopmental disorders and neuropsychiatric disorders may be two sides of the same coin because the same underlying neuronal pathways are implicated by human genetics. This is also borne out by our Reelin/RELN over-expression findings.
The RELN gene plays a neurogenic role in early brain development, and neuromaintenance and neuroprotective roles in adulthood. Project 8p is focused on neurodevelopment because most 8p heroes known to medicine are still children. As families deal with the demanding day-to-day realities of a special needs child, what will happen to the brains of 8p heroes as they age?
The 8p community can learn from the experiences of researchers who have studied a 3-megabase heterozygous deletion on the long arm of chromosome 22, at 22q11.2.
22q11.2 deletions are a risk factor not just for schizophrenia but also for autism spectrum disorders and intellectual disability. The 22q11.2 region just so happens to be the equivalent of a commuter train ride down the chromosome from 22q13.3, the location of the deletions that cause Phelan-McDermid Syndrome.
What makes 22q11.2 and 8p similar is that genetic variation outside of 22q11.2 plays a role in determining the type, severity and penetrance of neurological presentation. A paper published a few years ago — Nehme et al., 2022 — attempted to untangle the knotted web of genetic interactions between multiple haploinsufficient genes on 22q11.2 and backseat driver genes elsewhere in the genome. A quick review of their study is instructive for Project 8p.
The authors generated iPSC lines from twenty 22q11.2 deletion carriers and 29 unaffected controls. Next they performed bulk RNAseq on cells at three different time points during a neuronal differentiation protocol: undifferentiated pluripotent stem cells on Day 0; partially differentiated neural progenitor cells on Day 4; terminally differentiated excitatory neurons on Day 28.
The Venn diagram below reveals a striking result. Across the three timepoints and cell differentiation states, 90% of the differentially expressed genes (n=386) in 22q11.2 deletion cell lines are located outside of 22q11.2. Reassuringly, the 27 genes that are down-regulated in all three datasets are located in the 22q11.2 deletion region. They also reproduced the RNAseq results from the 22q11.2 deletion carrier cell lines in a critical isogenic control experiment by generating the 22q11.2 heterozygous deletion in a wild-type cell using CRISPR engineering.
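The bookkeeping behind a statement like "most differentially expressed genes sit outside the deleted region" is simple enough to sketch. The snippet below is a toy illustration, not the authors' pipeline: the interval coordinates and every gene record are invented placeholders, and only the logic of sorting genes into inside versus outside the altered segment is the point.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # position in base pairs


# Placeholder interval standing in for a deleted segment (illustrative coordinates only).
DELETION = ("chr22", 18_900_000, 21_500_000)


def in_region(gene, region):
    """True if the gene start falls inside the region on the same chromosome."""
    chrom, lo, hi = region
    return gene.chrom == chrom and lo <= gene.start <= hi


def split_inside_outside(de_genes, region):
    """Partition differentially expressed genes by location relative to the region."""
    inside = [g for g in de_genes if in_region(g, region)]
    outside = [g for g in de_genes if not in_region(g, region)]
    return inside, outside


def fraction_outside(de_genes, region):
    """Share of differentially expressed genes located outside the region."""
    if not de_genes:
        return 0.0
    _, outside = split_inside_outside(de_genes, region)
    return len(outside) / len(de_genes)


# Ten made-up differentially expressed genes: two inside the interval, eight elsewhere.
toy_de_genes = [
    Gene("INSIDE_1", "chr22", 19_000_000),
    Gene("INSIDE_2", "chr22", 20_500_000),
    Gene("OUTSIDE_1", "chr22", 40_000_000),
    Gene("OUTSIDE_2", "chr7", 5_000_000),
    Gene("OUTSIDE_3", "chr6", 31_000_000),
    Gene("OUTSIDE_4", "chr8", 12_000_000),
    Gene("OUTSIDE_5", "chr1", 150_000_000),
    Gene("OUTSIDE_6", "chr2", 47_000_000),
    Gene("OUTSIDE_7", "chr15", 23_000_000),
    Gene("OUTSIDE_8", "chrX", 70_000_000),
]

share = fraction_outside(toy_de_genes, DELETION)
print(f"{share:.0%} of toy DE genes lie outside the interval")
```

The same partition applies to an 8p dataset once the interval is replaced with the breakpoints of a given 8p alteration.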
Nehme et al went on to canvass those 386 potential backseat driver genes. How else are they connected to the 22q11.2 conspirators besides transcriptionally? What do those 386 genes have in common with each other? Are they directly associated with neurodevelopmental and/or neuropsychiatric disorders, or with the genes that give rise to said disorders?
The authors performed a number of analyses to answer those questions. They showed that differentially expressed genes in 22q11.2 pluripotent stem cells and 22q11.2 neural progenitor cells are more likely to interact genetically, and at the protein level, with a curated set of 295 genes associated with childhood neurodevelopmental disorders.
In parallel, the authors showed that differentially expressed genes in 22q11.2 neurons are more likely to interact genetically with known schizophrenia risk factor genes. If the same 22q11.2 deletion interacts with the rest of the genome in context-dependent ways, affecting specific cell lineages at discrete times in life, it’s not unreasonable to assume megabase 8p alterations will behave similarly.
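A claim that one gene list is "more likely" to connect to another usually rests on an enrichment test. One common choice is the hypergeometric tail probability, sketched below with the Python standard library. This is an assumption about the general approach, not a description of the statistics Nehme et al actually used. The reference set size of 295 comes from the text; the background size, the size of the drawn list, and the observed overlap are hypothetical numbers chosen for illustration.

```python
from math import comb


def hypergeom_tail(overlap, population, marked, drawn):
    """P(X >= overlap) when `drawn` genes are sampled without replacement
    from `population` genes, of which `marked` belong to the reference set."""
    lowest = max(overlap, 0)
    highest = min(marked, drawn)
    total = comb(population, drawn)
    favorable = sum(
        comb(marked, k) * comb(population - marked, drawn - k)
        for k in range(lowest, highest + 1)
    )
    return favorable / total


# Hypothetical inputs, apart from the 295-gene reference set mentioned in the text.
BACKGROUND_GENES = 20_000   # assumed number of expressed genes
REFERENCE_SET = 295         # curated neurodevelopmental disorder genes
DE_GENES = 200              # assumed size of a differentially expressed gene list
OBSERVED_OVERLAP = 12       # assumed number of DE genes found in the reference set

expected_overlap = DE_GENES * REFERENCE_SET / BACKGROUND_GENES
p_value = hypergeom_tail(OBSERVED_OVERLAP, BACKGROUND_GENES, REFERENCE_SET, DE_GENES)
print(f"expected overlap by chance: {expected_overlap:.2f}")
print(f"P(overlap >= {OBSERVED_OVERLAP}): {p_value:.3g}")
```

A small tail probability says the overlap is larger than chance alone would produce, which is the quantitative version of a suspect turning up at too many crime scenes.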
To that end, Project 8p has been collaborating with Prof Hiruy Meharena’s lab on creating a multi-omics 8p brain cell atlas that includes RNAseq so we can perform the types of analyses presented in Nehme et al. Based on preliminary data from the Meharena lab, the promoters of genes up-regulated in 8p neural progenitor cells are bound by members of four different families of transcription factors. The promoters of genes down-regulated in 8p neural progenitor cells are bound by members of three of those four families.
The gene expression levels of ten transcription factors belonging to those four families are shown below. Strikingly, one of the transcription factors is up-regulated, nearly doubled, in 8p neural progenitor cells relative to control neural progenitor cells. This transcription factor happens to be located on chromosome 6. Did we just discover a backseat driver of 8p?
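For readers who want the arithmetic behind "nearly doubled": expression changes are usually reported as a log2 fold change, where a value of 1 means exactly twice the control level. The sketch below uses invented expression values and an assumed pseudocount of 1; it does not use the Meharena lab data.

```python
from math import log2
from statistics import mean


def log2_fold_change(case_values, control_values, pseudocount=1.0):
    """log2 ratio of mean case expression to mean control expression.
    The pseudocount (an assumed convention) avoids division by zero."""
    case_level = mean(case_values) + pseudocount
    control_level = mean(control_values) + pseudocount
    return log2(case_level / control_level)


# Invented expression values for a single transcription factor.
control_npc = [48.0, 50.0, 52.0]
case_npc = [96.0, 98.0, 100.0]

lfc = log2_fold_change(case_npc, control_npc)
print(f"log2 fold change: {lfc:.2f}")
```

A log2 fold change just under 1 corresponds to a transcript level just under double the control.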
To put candidate frontseat or backseat drivers to the test, we’ll need functional readouts (phenotypes), ideally in a rapid-turnaround, cost-effective 2-D cell model. One potentially fruitful avenue is emerging. 8p neural progenitor cells have a proliferation defect, as shown below:
In our tireless search for 8p drivers, we have a growing appreciation for the who. But we still don’t understand the how and the when. Nehme et al speculate that 3-D genome architecture may provide an explanation.
“How exactly 22q11.2del might regulate the expression of genes in trans remains a matter of great interest. One intriguing possibility is that 22q11.2del might impact chromatin architecture, thereby regulating the expression of genes outside of the region. Indeed, a recent study using 22q11.2del lymphoblastoid cell lines revealed changes in their genome architecture. It is thus possible that 22q11.2del spatially rearranges the genome of neuronal cells, resulting in mis-regulation of genes linked to neuropsychiatric disorders.”
The Project 8p community will continue to work the problem with humility and an open mind. One day we will demystify the secret lives of chromosomes.