* denotes equal contribution
PDFs are for personal use only
|Combinatorial quantification of 5mC and 5hmC at individual CpG dyads and the transcriptome in single cells reveals modulators of DNA methylation maintenance fidelity
Chialastri A, Sarkar S, Schauer EE, Lamba S, Dey SS
Nature Structural & Molecular Biology (accepted) (2023).
Transmission of 5-methylcytosine (5mC) from one cell generation to the next plays a key role in regulating cellular identity in mammalian development and diseases. While recent work has shown that the activity of DNMT1, the protein responsible for the stable inheritance of 5mC from mother to daughter cells, is imprecise, it remains unclear how the fidelity of DNMT1 is tuned in different genomic and cell state contexts. Here we describe Dyad-seq, a method that combines enzymatic detection of modified cytosines with nucleobase conversion techniques to quantify the genome-wide methylation status of cytosines at the resolution of individual CpG dinucleotides. We find that the fidelity of DNMT1-mediated maintenance methylation is directly related to the local density of DNA methylation, and for genomic regions that are lowly methylated, histone modifications can dramatically alter the maintenance methylation activity. Further, to gain deeper insights into the methylation and demethylation turnover dynamics, we extended Dyad-seq to quantify all combinations of 5mC and 5-hydroxymethylcytosine (5hmC) at individual CpG dyads to show that TET proteins preferentially hydroxymethylate only one of the two 5mC sites in a symmetrically methylated CpG dyad rather than sequentially convert both 5mC to 5hmC. To understand how cell state transitions impact DNMT1-mediated maintenance methylation, we scaled the method down and combined it with the measurement of mRNA to simultaneously quantify genome-wide methylation levels, maintenance methylation fidelity and the transcriptome from the same cell (scDyad&T-seq).
Applying scDyad&T-seq to mouse embryonic stem cells transitioning from serum to 2i conditions, we observe dramatic and heterogeneous demethylation and the emergence of transcriptionally distinct subpopulations that are closely linked to the cell-to-cell variability in loss of DNMT1-mediated maintenance methylation activity, with regions of the genome that escape 5mC reprogramming retaining high levels of maintenance methylation fidelity. Overall, our results demonstrate that while distinct cell states can substantially impact the genome-wide activity of the DNA methylation maintenance machinery, locally there exists an intrinsic relationship between DNA methylation density, histone modifications and DNMT1-mediated maintenance methylation fidelity that is independent of cell state.
Download bioRxiv (PDF)
|Targeted rRNA depletion enables efficient mRNA sequencing in diverse bacterial species and complex co-cultures
Heom KA*, Wangsanuwat C*, Butkovich LV, Tam SC, Rowe AR, O'Malley MA (co-corresponding), Dey SS (co-corresponding)
mSystems (accepted) (2023).
Bacterial mRNA sequencing is inefficient due to the abundance of ribosomal RNA that is challenging to deplete. While commercial kits target rRNA from common bacterial species, they are frequently inefficient when applied to divergent species, including those from environmental isolates. Similarly, other methods typically employ large probe sets that tile the entire length of rRNAs; however, such approaches are infeasible when applied to many species. Therefore, we present EMBR-seq+, which requires fewer than ten oligonucleotides per rRNA by combining rRNA blocking primers with RNase H-mediated depletion to achieve rRNA removal efficiencies of up to 99% in diverse bacterial species. Further, in more complex microbial co-cultures between F. succinogenes strain UWB7 and anaerobic fungi, EMBR-seq+ depleted both bacterial and fungal rRNA, with a 4-fold improvement in bacterial rRNA depletion compared to a commercial kit, thereby demonstrating that the method can be applied to non-model microbial mixtures. Notably, for microbes with unknown rRNA sequences, EMBR-seq+ enables rapid iterations in probe design without having to start experiments from total RNA. Finally, efficient depletion of rRNA enabled systematic quantification of the reprogramming of the bacterial transcriptome when cultured in the presence of the anaerobic fungi Anaeromyces robustus or Caecomyces churrovis. We observed that F. succinogenes strain UWB7 downregulated several lignocellulose-degrading carbohydrate-active enzymes in the presence of anaerobic gut fungi, suggesting close interactions between two cellulolytic species that specialize in different aspects of biomass breakdown. Thus, EMBR-seq+ enables efficient, cost-effective and rapid quantification of the transcriptome to gain insights into non-model microbial systems.
Download bioRxiv (PDF)
|Optogenetic control of the integrated stress response reveals proportional encoding and the stress memory landscape
Batjargal T, Zappa F, Grant RJ, Piscopio RA, Chialastri A, Dey SS, Acosta-Alvear D, Wilson MZ
Cell Systems (2023).
The Integrated Stress Response (ISR) is a conserved signaling network that detects cellular damage and computes adaptive or terminal outcomes. Understanding the mechanisms that underlie these computations has been difficult because natural stress inputs activate multiple parallel signaling pathways and classical ISR inducers have pleiotropic effects. To overcome this challenge, we engineered photo-switchable control over the ISR stress sensor kinase PKR (opto-PKR), which allows virtual control of the ISR. Using controlled light inputs to activate opto-PKR, we traced information flow in the ISR both globally, in the transcriptome, and for key ISR effectors. Our analyses revealed a biphasic, input-proportional transcriptional response with two dynamic modes, transient and gradual, that correspond to adaptive and terminal ISR outcomes. Using these data, we constructed an ordinary differential equation (ODE) model of the ISR, which predicted system hysteresis dependent on prior stress durations and that stress memory encoding may lead to resilience. Our results demonstrate that the input dynamics of the ISR encode information in stress levels, durations, and the timing between stress encounters.
Download PDF | bioRxiv (PDF) | Study highlighted in Cell Systems
|Integrated single-cell sequencing reveals principles of epigenetic regulation of human gastrulation and germ cell development in a 3D organoid model
Chialastri A, Karzbrun E, Khankhel AH, Radeke MJ, Streichan SJ, Dey SS
The emergence of different cell types and the role of the epigenome in regulating transcription is a key yet understudied event during human gastrulation. Investigating these questions remains infeasible due to the lack of availability of embryos at these stages of development. Further, human gastrulation is marked by dynamic changes in cell states that are difficult to isolate at high purity, thereby making it challenging to map how epigenetic reprogramming impacts gene expression and cellular phenotypes. To overcome these limitations, we describe scMAT-seq, a high-throughput one-pot single-cell multiomics technology to simultaneously quantify DNA methylation, DNA accessibility and the transcriptome from the same cell. Applying scMAT-seq to 3D human gastruloids, we characterized the epigenetic landscape of major cell types corresponding to the germ layers and primordial germ cell-like cells (hPGCLC). As the identity of the progenitors that give rise to human PGCLCs remains unclear, we used this system to discover that the progenitors emerge from epiblast cells and show transient characteristics of both amniotic- and mesoderm-like cells, before getting specified towards hPGCLCs. Finally, as cells differentiate along different lineages during gastrulation, we surprisingly find that while changes in DNA accessibility are tightly correlated to both upregulated and downregulated genes, reorganization of gene body DNA methylation is strongly related to only genes that get downregulated, with genes that turn on displaying a lineage trajectory-dependent correlation with DNA methylation. Collectively, these results demonstrate that scMAT-seq is a high-throughput and sensitive approach to elucidate epigenetic regulation of gene expression in complex systems such as human gastrulation that are marked by rapidly transitioning cell states.
|Integrated single-cell sequencing of 5-hydroxymethylcytosine and genomic DNA using scH&G-seq
Chialastri A*, Wangsanuwat C*, Dey SS
STAR Protocols 2:101016 (2021).
The asymmetric distribution of 5-hydroxymethylcytosine (5hmC) between two DNA strands of a chromosome enables endogenous reconstruction of cellular lineages at an individual-cell-division resolution. Further, when integrated with data on genomic variants to infer clonal lineages, this combinatorial information accurately reconstructs larger lineage trees. Here, we provide a detailed protocol for single-cell 5-hydroxymethylcytosine and genomic DNA sequencing (scH&G-seq) to simultaneously quantify 5hmC and genomic DNA from the same cell to reconstruct lineage trees at a single-cell-division resolution.
|A probabilistic framework for cellular lineage reconstruction using integrated single-cell 5-hydroxymethylcytosine and genomic DNA sequencing
Wangsanuwat C, Chialastri A, Aldeguer JF, Rivron NC, Dey SS
Cell Reports Methods 1:100060 (2021).
Lineage reconstruction is central to understanding tissue development and maintenance. To overcome the limitations of current techniques that typically reconstruct clonal trees using genetically encoded reporters, we report scPECLR, a probabilistic algorithm to endogenously infer lineage trees at a single cell-division resolution using 5-hydroxymethylcytosine (5hmC). When applied to 8-cell preimplantation mouse embryos, scPECLR predicts the full lineage tree with greater than 95% accuracy. In addition, we developed scH&G-seq to sequence both 5hmC and genomic DNA from the same cell. As genomic DNA sequencing yields information on both copy-number variations and single-nucleotide polymorphisms, when combined with scPECLR, it enables more accurate lineage reconstruction of larger trees. Finally, we show that scPECLR can also be used to map chromosome strand segregation patterns during cell division, thereby providing a strategy to test the “immortal strand” hypothesis. Thus, scPECLR provides a generalized method to endogenously reconstruct lineage trees at an individual cell division resolution.
|Strand-specific single-cell methylomics reveals distinct modes of DNA demethylation dynamics during early mammalian development
Sen M*, Mooijman D*, Chialastri A*, Boisset JC, Popovic M, Heindryckx B, Chuva de Sousa Lopes SM, Dey SS (co-corresponding), van Oudenaarden A (co-corresponding)
Nature Communications 12:1286 (2021).
DNA methylation (5mC) is central to cellular identity. The global erasure of 5mC from the parental genomes during preimplantation mammalian development is critical to reset the methylome of gametes to the cells in the blastocyst. While active and passive modes of demethylation have both been suggested to play a role in this process, the relative contribution of these two mechanisms to 5mC erasure remains unclear. Here, we report a new single-cell method (scMspJI-seq) that enables strand-specific quantification of 5mC, allowing us to systematically probe the dynamics of global demethylation. When applied to mouse embryonic stem cells, we identified substantial cell-to-cell strand-specific 5mC heterogeneity, with a small group of cells displaying asymmetric levels of 5mCpG between the two DNA strands of a chromosome, suggesting loss of maintenance methylation. Next, in preimplantation mouse embryos, we discovered that methylation maintenance is active until the 16-cell stage, followed by passive demethylation in a fraction of cells within the early blastocyst at the 32-cell stage of development. Finally, human preimplantation embryos qualitatively show temporally delayed yet similar demethylation dynamics as mouse embryos. Collectively, these results demonstrate that scMspJI-seq is a sensitive and cost-effective method to map the strand-specific genome-wide patterns of 5mC in single cells.
Download PDF | bioRxiv (PDF)
|Efficient and cost-effective bacterial mRNA sequencing from low input samples through ribosomal RNA depletion
Wangsanuwat C*, Heom KA*, Liu E, O’Malley MA, Dey SS
BMC Genomics 21:717 (2020).
RNA sequencing is a powerful approach to quantify the genome-wide distribution of mRNA molecules in a population to gain a deeper understanding of cellular functions and phenotypes. However, unlike for eukaryotic cells, mRNA sequencing of bacterial samples is more challenging due to the absence of a poly-A tail that typically enables efficient capture and enrichment of mRNA from the abundant rRNA molecules in a cell. Moreover, bacterial cells frequently contain 100-fold lower quantities of RNA compared to mammalian cells, which further complicates mRNA sequencing from non-cultivable and non-model bacterial species. To overcome these limitations, we report EMBR-seq (Enrichment of mRNA by Blocked rRNA), a method that efficiently depletes 5S, 16S and 23S rRNA using blocking primers to prevent their amplification. EMBR-seq results in 90% of the sequenced RNA molecules from an E. coli culture deriving from mRNA. We demonstrate that this increased efficiency provides a deeper view of the transcriptome without introducing technical amplification-induced biases. Moreover, compared to recent methods that employ a large array of oligonucleotides to deplete rRNA, EMBR-seq uses a single or a few oligonucleotides per rRNA, thereby making this new technology significantly more cost-effective, especially when applied to varied bacterial species. Finally, compared to existing commercial kits for bacterial rRNA depletion, we show that EMBR-seq can be used to successfully quantify the transcriptome from more than 500-fold lower starting total RNA. EMBR-seq provides an efficient and cost-effective approach to quantify global gene expression profiles from low input bacterial samples.
|Simultaneous quantification of protein–DNA interactions and transcriptomes in single cells with scDam&T-seq
Markodimitraki CM, Rang F, Rooijers K, de Vries S, Chialastri A, de Luca K, Lochs SJA, Mooijman D, Dey SS (co-corresponding), Kind J (co-corresponding)
Nature Protocols 15:1922–1953 (2020).
Protein–DNA interactions are essential for establishing cell type–specific chromatin architecture and gene expression. We recently developed scDam&T-seq, a multi-omics method that can simultaneously quantify protein–DNA interactions and the transcriptome in single cells. The method effectively combines two existing methods: DNA adenine methyltransferase identification (DamID) and CEL-Seq2. DamID works through the tethering of a protein of interest (POI) to the Escherichia coli DNA adenine methyltransferase (Dam). Upon expression of this fusion protein, DNA in proximity to the POI is methylated by Dam and can be selectively digested and amplified. CEL-Seq2, in contrast, makes use of poly-dT primers to reverse transcribe mRNA, followed by linear amplification through in vitro transcription. scDam&T-seq is the first technique capable of providing a combined readout of protein–DNA contact and transcription from single-cell samples. Once suitable cell lines have been established, the protocol can be completed in 5 d, with a throughput of hundreds to thousands of cells. The processing of raw sequencing data takes an additional 1–2 d. Our method can be used to understand the transcriptional changes a cell undergoes upon the DNA binding of a POI. It can be performed in any laboratory with access to FACS, robotic and high-throughput-sequencing facilities.
|An extended culture system that supports human primordial germ cell-like cell survival and initiation of DNA methylation erasure
Gell JJ, Liu W, Sosa E, Chialastri A, Hancock G, Tao Y, Wamaitha S, Bower G, Dey SS, Clark AT
Stem Cell Reports 14:433-446 (2020).
The development of an in vitro system in which human primordial germ cell-like cells (hPGCLCs) are generated from human pluripotent stem cells (hPSCs) has been invaluable to further our understanding of human primordial germ cell (hPGC) specification. However, the means to evaluate the next fundamental steps in germ cell development have not been well established. In this study, we describe a two-dimensional extended culture system that promotes proliferation of specified hPGCLCs, without reversion to a pluripotent state. We demonstrate that hPGCLCs in extended culture undergo partial epigenetic reprogramming, mirroring events described in hPGCs in vivo, including a genome-wide reduction in DNA methylation and maintenance of depleted H3K9me2. This extended culture system provides a new approach for expanding the number of hPGCLCs for downstream technologies, including transplantation, molecular screening, or possibly the differentiation of hPGCLCs into gametes by in vitro gametogenesis.
|Control over single-cell distribution of G1 lengths by WNT governs pluripotency
Jang J, Han D, Golkaram M, Audouard M, Liu G, Bridges D, Hellander S, Chialastri A, Dey SS, Petzold LR, Kosik KS
PLoS Biology 17:e3000453 (2019).
The link between single-cell variation and population-level fate choices lacks a mechanistic explanation despite extensive observations of gene expression and epigenetic variation among individual cells. Here, we found that single human embryonic stem cells (hESCs) have different and biased differentiation potentials toward either neuroectoderm or mesendoderm depending on their G1 lengths before the onset of differentiation. Single-cell variation in G1 length operates in a dynamic equilibrium that establishes a G1 length probability distribution for a population of hESCs and predicts differentiation outcome toward neuroectoderm or mesendoderm lineages. Although sister stem cells generally share G1 lengths, a variable proportion of cells have asymmetric G1 lengths, which maintains the population dispersion. Environmental Wingless-INT (WNT) levels can control the G1 length distribution, apparently as a means of priming the fate of hESC populations once they undergo differentiation. As a downstream mechanism, global 5-hydroxymethylcytosine levels are regulated by G1 length and thereby link G1 length to differentiation outcomes of hESCs. Overall, our findings suggest that intrapopulation heterogeneity in G1 length underlies the pluripotent differentiation potential of stem cell populations.
|A probabilistic framework for cellular lineage reconstruction using single-cell 5-hydroxymethylcytosine sequencing
Wangsanuwat C, Aldeguer JF, Rivron NC, Dey SS
Lineage reconstruction is central to understanding tissue development and maintenance. While powerful tools to infer cellular relationships have been developed, these methods typically have a clonal resolution that prevents the reconstruction of lineage trees at an individual cell division resolution. Moreover, these methods require a transgene, which poses a significant barrier in the study of human tissues. To overcome these limitations, we report scPECLR, a probabilistic algorithm to endogenously infer lineage trees at a single cell-division resolution using 5-hydroxymethylcytosine. When applied to 8-cell preimplantation mouse embryos, scPECLR predicts the full lineage tree with greater than 95% accuracy. Further, scPECLR can accurately extract lineage information for a majority of cells when reconstructing larger trees. Finally, we show that scPECLR can also be used to map chromosome strand segregation patterns during cell division, thereby providing a strategy to test the “immortal strand” hypothesis in stem cell biology. Thus, scPECLR provides a generalized method to endogenously reconstruct lineage trees at an individual cell-division resolution.
|Simultaneous quantification of protein–DNA contacts and transcriptomes in single cells
Rooijers K, Markodimitraki CM, Rang F, de Vries S, Chialastri A, de Luca K, Mooijman D, Dey SS (co-corresponding), Kind J (co-corresponding)
Nature Biotechnology 37:766–772 (2019).
Protein–DNA interactions are critical to the regulation of gene expression, but it remains challenging to define how cell-to-cell heterogeneity in protein–DNA binding influences gene expression variability. Here we report a method for the simultaneous quantification of protein–DNA contacts by combining single-cell DNA adenine methyltransferase identification (DamID) with messenger RNA sequencing of the same cell (scDam&T-seq). We apply scDam&T-seq to reveal how genome–lamina contacts or chromatin accessibility correlate with gene expression in individual cells. Furthermore, we provide single-cell genome-wide interaction data on a polycomb-group protein, RING1B, and the associated transcriptome. Our results show that scDam&T-seq is sensitive enough to distinguish mouse embryonic stem cells cultured under different conditions and their different chromatin landscapes. Our method will enable the analysis of protein-mediated mechanisms that regulate cell-type-specific transcriptional programs in heterogeneous tissues.
Download PDF | bioRxiv (PDF)
|Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction
Mooijman D*, Dey SS*, Boisset JC, Crosetto N, van Oudenaarden A
Nature Biotechnology 34:852-856 (2016).
The epigenetic DNA modification 5-hydroxymethylcytosine (5hmC) has crucial roles in development and gene regulation. Quantifying the abundance of this epigenetic mark at the single-cell level could enable us to understand its roles. We present a single-cell, genome-wide and strand-specific 5hmC sequencing technology, based on 5hmC glucosylation and glucosylation-dependent digestion of DNA, that reveals pronounced cell-to-cell variability in the abundance of 5hmC on the two DNA strands of a given chromosome. We develop a mathematical model that reproduces the strand bias and use this model to make two predictions. First, the variation in strand bias should decrease when 5hmC turnover increases. Second, the strand bias of two sister cells should be strongly anti-correlated. We validate these predictions experimentally, and use our model to reconstruct lineages of two- and four-cell mouse embryos, showing that single-cell 5hmC sequencing can be used as a lineage reconstruction tool.
Download PDF | Study highlighted in Nature Reviews Genetics | Study highlighted in Nature Methods
|Genome-wide maps of nuclear lamina interactions in single human cells
Kind J, Pagie L, de Vries SS, Nahidiazar L, Dey SS, Bienko M, Zhan Y, Lajoie B, de Graaf CA, Amendola M, Fudenberg G, Imakaev M, Mirny L, Jalink K, Dekker J, van Oudenaarden A, van Steensel B
Cell 163:134-147 (2015).
Mammalian interphase chromosomes interact with the nuclear lamina (NL) through hundreds of large lamina-associated domains (LADs). We report a method to map NL contacts genome-wide in single human cells. Analysis of nearly 400 maps reveals a core architecture consisting of gene-poor LADs that contact the NL with high cell-to-cell consistency, interspersed by LADs with more variable NL interactions. The variable contacts tend to be cell-type specific and are more sensitive to changes in genome ploidy than the consistent contacts. Single-cell maps indicate that NL contacts involve multivalent interactions over hundreds of kilobases. Moreover, we observe extensive intra-chromosomal coordination of NL contacts, even over tens of megabases. Such coordinated loci exhibit preferential interactions as detected by Hi-C. Finally, the consistency of NL contacts is inversely linked to gene activity in single cells and correlates positively with the heterochromatic histone modification H3K9me3. These results highlight fundamental principles of single-cell chromatin organization.
Download PDF | Video Abstract (youtube)
|Orthogonal control of expression mean and variance by epigenetic features at different genomic loci
Dey SS*, Foley JE*, Limsirichai P, Schaffer DV, Arkin AP
Molecular Systems Biology 11:806 (2015).
While gene expression noise has been shown to drive dramatic phenotypic variations, the molecular basis for this variability in mammalian systems is not well understood. Gene expression has been shown to be regulated by promoter architecture and the associated chromatin environment. However, the exact contribution of these two factors in regulating expression noise has not been explored. Using a dual-reporter lentiviral model system, we deconvolved the influence of the promoter sequence to systematically study the contribution of the chromatin environment at different genomic locations in regulating expression noise. By integrating a large-scale analysis to quantify mRNA levels by smFISH and protein levels by flow cytometry in single cells, we found that mean expression and noise are uncorrelated across genomic locations. Furthermore, we showed that this independence could be explained by the orthogonal control of mean expression by the transcript burst size and noise by the burst frequency. Finally, we showed that genomic locations displaying higher expression noise are associated with more repressed chromatin, thereby indicating the contribution of the chromatin environment in regulating expression noise.
Download PDF | News & Views
|Integrated genome and transcriptome sequencing of the same cell
Dey SS*, Kester L*, Spanjaard B, Bienko M, van Oudenaarden A
Nature Biotechnology 33:285-289 (2015).
Single-cell genomics and single-cell transcriptomics have emerged as powerful tools to study the biology of single cells at a genome-wide scale. However, a major challenge is to sequence both genomic DNA and mRNA from the same cell, which would allow direct comparison of genomic variation and transcriptome heterogeneity. We describe a quasilinear amplification strategy to quantify genomic DNA and mRNA from the same cell without physically separating the nucleic acids before amplification. We show that the efficiency of our integrated approach is similar to existing methods for single-cell sequencing of either genomic DNA or mRNA. Further, we find that genes with high cell-to-cell variability in transcript numbers generally have lower genomic copy numbers, and vice versa, suggesting that copy number variations may drive variability in gene expression among individual cells. Applications of our integrated sequencing approach could range from gaining insights into cancer evolution and heterogeneity to understanding the transcriptional consequences of copy number variations in healthy and diseased tissues.
Download PDF | Study highlighted in Nature Methods | Study highlighted in Cell
|Quantitative evaluation and optimization of co-drugging to improve anti-HIV latency therapy
Wong VC*, Fong LE*, Adams NM, Xue Q, Dey SS, Miller-Jensen K
Cellular and Molecular Bioengineering 7:320-333 (2014).
Human immunodeficiency virus 1 (HIV) latency remains a significant obstacle to curing infected patients. One promising therapeutic strategy is to purge the latent cellular reservoir by activating latent HIV with latency-reversing agents (LRAs). In some cases, co-drugging with multiple LRAs is necessary to activate latent infections, but few studies have established quantitative criteria for determining when co-drugging is required. Here we systematically quantified drug interactions between histone deacetylase inhibitors and transcriptional activators of HIV and found that the need for co-drugging is determined by the proximity of latent infections to the chromatin-regulated viral gene activation threshold at the viral promoter. Our results suggest two classes of latent viral integrations: those far from the activation threshold that benefit from co-drugging, and those close to the threshold that are efficiently activated by a single drug. Using a primary T cell model of latency, we further demonstrated that the requirement for co-drugging was donor dependent, suggesting that the host may set the level of repression of latent infections. Finally, we showed that single drug or co-drugging doses could be optimized, via repeat stimulations, to minimize unwanted side effects while maintaining robust viral activation. Our results motivate further study of patient-specific latency-reversing strategies.
|Chromatin accessibility at the HIV-1 LTR promoter sets a threshold for NF-κB mediated viral gene expression
Miller-Jensen K*, Dey SS*, Pham N, Foley JE, Arkin AP, Schaffer DV
Integrative Biology 4:661-671 (2012).
Higher order chromatin structure in eukaryotes can lead to differential gene expression in response to the same transcription factor; however, how transcription factor inputs integrate with quantitative features of the chromatin environment to regulate gene expression is not clear. In vitro models of HIV gene regulation, in which repressive mechanisms acting locally at an integration site keep proviruses transcriptionally silent until appropriately stimulated, provide a powerful system to study gene expression regulation in different chromatin environments. Here we quantified HIV expression as a function of activating transcription factor nuclear factor-κB RelA/p65 (RelA) levels and chromatin features at a panel of viral integration sites. Variable RelA overexpression demonstrated that the viral genomic location sets a threshold RelA level necessary to induce gene expression. However, once the induction threshold is reached, gene expression increases similarly for all integration sites. Furthermore, we found that higher induction thresholds are associated with repressive histone marks and a decreased sensitivity to nuclease digestion at the LTR promoter. Increasing chromatin accessibility via inhibition of histone deacetylation or DNA methylation lowered the induction threshold, demonstrating that chromatin accessibility sets the level of RelA required to activate gene expression. Finally, a functional relationship between gene expression, RelA level, and chromatin accessibility accurately predicted synergistic HIV activation in response to combinatorial pharmacological perturbations. Different genomic environments thus set a threshold for transcription factor activation of a key viral promoter, which may point toward biological principles that underlie selective gene expression and inform strategies for combinatorial therapies to combat latent HIV.
|Mutual information analysis reveals coevolving residues in Tat that compensate for two distinct functions in HIV-1 gene expression
Dey SS, Xue Y, Joachimiak MP, Friedland GD, Burnett JC, Zhou Q, Arkin AP, Schaffer DV
Journal of Biological Chemistry 287:7945-7955 (2012).
Viral genomes are continually subjected to mutations, and functionally deleterious ones can be rescued by reversion or additional mutations that restore fitness. The error-prone nature of HIV-1 replication has resulted in highly diverse viral sequences, and it is not clear how viral proteins such as Tat, which plays a critical role in viral gene expression and replication, retain their complex functions. Although several important amino acid positions in Tat are conserved, we hypothesized that it may also harbor functionally important residues that may not be individually conserved yet appear as correlated pairs, whose analysis could yield new mechanistic insights into Tat function and evolution. To identify such sites, we combined mutual information analysis and experimentation to identify coevolving positions and found that residues 35 and 39 are strongly correlated. Mutation of either residue of this pair into amino acids that appear in numerous viral isolates yields a defective virus; however, simultaneous introduction of both mutations into the heterologous Tat sequence restores gene expression close to wild-type Tat. Furthermore, in contrast to most coevolving protein residues that contribute to the same function, structural modeling and biochemical studies showed that these two residues contribute to two mechanistically distinct steps in gene expression: binding P-TEFb and promoting P-TEFb phosphorylation of the C-terminal domain in RNAPII. Moreover, Tat variants that mimic HIV-1 subtypes B or C at sites 35 and 39 have evolved orthogonal strengths of P-TEFb binding versus RNAPII phosphorylation, suggesting that subtypes have evolved alternate transcriptional strategies to achieve similar gene expression levels.
|Varying Virulence: Epigenetic control of expression noise and disease processes
Miller-Jensen K*, Dey SS*, Schaffer DV, Arkin AP
Trends in Biotechnology 29:517-525 (2011).
Gene expression noise is a significant source of phenotypic heterogeneity in otherwise identical populations of cells. Phenotypic heterogeneity can cause reversible drug resistance in diseased cells, and thus a better understanding of its origins might improve treatment strategies. In eukaryotes, data strongly suggest that intrinsic noise arises from transcriptional bursts caused by slow, random transitions between inactive and active gene states that are mediated by chromatin remodeling. In this review, we consider how chromatin modifications might modulate gene expression noise and lead to phenotypic diversity in diseases as varied as viral infection and cancer. Additionally, we argue that this fundamental information can be applied to develop innovative therapies that counteract ‘pathogenic noise’ and sensitize all diseased cells to therapeutic intervention.
|Opportunities for chemical engineering thermodynamics in biotechnology: Some examples
Dey SS, Prausnitz JM
Industrial & Engineering Chemistry Research 50:3-15 (2011).
Because of its generality, thermodynamics is applicable to all substances, including biomacromolecules. To illustrate how thermodynamics can contribute to biotechnology, each of six examples gives a brief summary of pertinent, previously published research. Each example indicates that familiar concepts in chemical engineering thermodynamics can be applied to contribute toward solution of a practical problem. These examples are discussed here to encourage thermodynamically oriented chemical engineers to devote their talents toward helping to advance industrial biotechnology.
Mehta G, Sen S, Dey SS
Acta Crystallographica Section C 61 (Pt 6):o358-360 (2005).
In the title compound, C6H12O4.H2O, 1,4/2,5-cyclohexane-tetrol and water molecules are seen to possess twofold symmetry. All four hydroxyl groups of the tetrol participate in extensive intermolecular O—H...O hydrogen bonding to form molecular tapes propagating along the a axis. Translationally related tapes along the c axis are held together by four-coordinated water molecules.
Mehta G, Sen S, Dey SS
Acta Crystallographica Section E 61 (Pt 4):o920-922 (2005).
The title compound, C6H12O4, exists in a chair form, with three of the four OH groups equatorially disposed. All four hydroxy groups participate in extensive intermolecular O—H...O hydrogen bonding.