Over the past decade, research into chemically modified bases in DNA has reshaped molecular biology. The natural occurrence, metabolism, and functional roles of these modifications are now recognized as essential components of genomic regulation and cellular identity. Technological advances and new discoveries have driven this progress and broadened our understanding of DNA beyond its four canonical bases.
Fig. 1 Examples of modified DNA bases. (Bilyard M. K., et al. 2020)
The detection and analysis of chemically modified DNA bases have improved significantly with the development of advanced chemical tools and technologies. These tools have been pivotal to our understanding of DNA modifications and their implications.
LC-MS/MS has become the preferred method for the sensitive detection and quantification of DNA modifications. The technique uses internal calibration with stable isotope-labeled analogues of the modifications, allowing quantification in the low femtomolar range. LC-MS/MS enables accurate comparative analysis of genomes across different organisms, tissues, or cell states, revealing insights into the metabolic pathways and dynamics of modified bases. However, mass spectrometry-based methods report only nucleoside composition and provide no sequence context, so signal from contaminating external DNA cannot be told apart from signal originating in the sample genome, and such contamination must be carefully ruled out.
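The internal calibration described above can be illustrated with a minimal isotope-dilution calculation. This is a sketch under stated assumptions, not a description of any particular instrument workflow: the names `PeakPair`, `quantify`, and `per_million`, the `response_factor` parameter, and all numbers in the usage lines are hypothetical. The idea is that the ratio of the analyte peak area to the labeled-standard peak area, multiplied by the known spiked amount of standard, estimates the analyte amount.

```python
from dataclasses import dataclass


@dataclass
class PeakPair:
    """Integrated peak areas for one nucleoside and its labeled standard."""
    analyte_area: float     # peak area of the unlabeled nucleoside from the sample
    standard_area: float    # peak area of the spiked stable isotope-labeled analogue
    standard_amount: float  # known amount of labeled standard spiked in (any unit)


def quantify(peak: PeakPair, response_factor: float = 1.0) -> float:
    """Estimate the analyte amount, in the same unit as standard_amount.

    response_factor is an assumed correction for any difference in detector
    response between labeled and unlabeled forms; 1.0 treats them as identical.
    """
    if peak.standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    ratio = peak.analyte_area / peak.standard_area
    return ratio * peak.standard_amount / response_factor


def per_million(modified_amount: float, reference_amount: float) -> float:
    """Express a modified nucleoside per million reference nucleosides."""
    if reference_amount <= 0:
        raise ValueError("reference amount must be positive")
    return modified_amount / reference_amount * 1e6


# Hypothetical peak areas and spike amounts, for illustration only.
modified = PeakPair(analyte_area=1500.0, standard_area=3000.0, standard_amount=10.0)
reference = PeakPair(analyte_area=2.0e6, standard_area=1.0e6, standard_amount=50000.0)
level = per_million(quantify(modified), quantify(reference))
print(f"{level:.1f} modified nucleosides per million reference nucleosides")
```

Normalizing to a reference nucleoside is what makes levels comparable across organisms, tissues, or cell states, since the absolute amount of DNA injected differs from run to run.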
The advent of next-generation DNA sequencing (NGS) has provided unprecedented depth and scale for studying genomic features, including modified bases. Enriching genomic DNA fragments for modified bases and then sequencing them allows mapping at a resolution of 200-400 base pairs. Enrichment relies on methods such as selective covalent chemistry-based pull-down or DNA immunoprecipitation (DIP), although DIP can suffer from off-target binding and false positives. Single-base-resolution methods such as bisulfite sequencing and its variants have made it possible to map 5-methylcytosine (5mC) and its derivatives, such as 5-hydroxymethylcytosine (5hmC), at individual positions. However, bisulfite treatment can cause loss of DNA material, which has led to the development of bisulfite-free methods. Advances in single-cell sequencing and spatial analysis further enhance our understanding of cellular heterogeneity and tissue functionality.
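The principle behind bisulfite sequencing can be sketched in a few lines: unprotected cytosines are deaminated and read as T after amplification, while protected cytosines are still read as C, so comparing a read to the reference reveals the protected positions. The code below is a deliberately simplified illustration with hypothetical function names. It assumes complete conversion, a single strand, reads of the same length as the reference, and no sequencing errors. Note also that standard bisulfite treatment protects both 5mC and 5hmC, so this readout alone does not distinguish between them.

```python
def bisulfite_convert(sequence: str, protected_positions: set[int]) -> str:
    """Simulate the sequence read after bisulfite treatment and amplification.

    Cytosines at protected_positions (0-based) are read as C;
    all other cytosines are converted and read as T.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in protected_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_protected_cytosines(reference: str, read: str) -> list[int]:
    """Return 0-based reference positions where a C is still read as C."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be the same length")
    pairs = zip(reference.upper(), read.upper())
    return [
        index
        for index, (ref_base, read_base) in enumerate(pairs)
        if ref_base == "C" and read_base == "C"
    ]


# Hypothetical 8-base reference with modified cytosines at positions 1 and 5.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, protected_positions={1, 5})
print(read)
print(call_protected_cytosines(reference, read))
```

Real pipelines must additionally handle both strands, incomplete conversion, and alignment of the reduced-complexity reads, none of which is modeled here.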
Genome editing technologies like CRISPR/Cas9 have transformed the field of biology, enabling precise modifications of the genome. Recent advances such as prime editing promise fewer off-target effects and more accurate rewriting of specific genome regions. These technologies have been adapted to introduce epigenetic changes, allowing for targeted methylation or demethylation to regulate gene expression. This opens up opportunities for controlled reprogramming of gene expression and cellular identity.
Chemically modified DNA bases play crucial roles in regulating gene expression, maintaining cellular identity, and responding to environmental changes. Understanding the biological functions of these modifications is essential for elucidating their impact on gene regulation and cellular processes.
Modified cytosine bases, such as 5mC, are central to establishing an epigenetic information layer that regulates gene expression in mammalian genomes. Cytosine methylation is a well-known mechanism for transcriptional silencing and gene regulation. Ten-eleven translocation (Tet) enzymes play a critical role in active DNA demethylation by successively oxidizing 5mC to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxycytosine (5caC). The oxidized bases 5fC and 5caC can then be excised by glycosylases such as thymine-DNA glycosylase (TDG) and replaced with unmodified cytosine through base excision repair.
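The demethylation route described above can be written down as a small lookup table, which makes the order of the intermediates explicit. This is only a schematic sketch: the names `TET_OXIDATION`, `TDG_SUBSTRATES`, `oxidation_series`, and `excise_and_repair` are hypothetical, each step is treated as all-or-nothing, and the choice of 5fC and 5caC as the bases handed to TDG follows the commonly described pathway.

```python
# Stepwise Tet oxidation: 5mC -> 5hmC -> 5fC -> 5caC.
TET_OXIDATION = {"5mC": "5hmC", "5hmC": "5fC", "5fC": "5caC"}

# Bases treated in this sketch as substrates for TDG excision followed by
# base excision repair, which restores unmodified cytosine.
TDG_SUBSTRATES = {"5fC", "5caC"}


def oxidation_series(start: str) -> list[str]:
    """List the bases reached from start by repeated Tet oxidation."""
    series = [start]
    while series[-1] in TET_OXIDATION:
        series.append(TET_OXIDATION[series[-1]])
    return series


def excise_and_repair(base: str) -> str:
    """Return C if the base is a TDG substrate, otherwise leave it unchanged."""
    return "C" if base in TDG_SUBSTRATES else base


for base in oxidation_series("5mC"):
    print(base, "->", excise_and_repair(base))
```

In this simplified model 5mC and 5hmC are left in place by the repair step, while 5fC and 5caC are returned to unmodified cytosine, which is what makes the pathway a route to active demethylation.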
Recent discoveries have expanded our understanding of DNA modifications beyond 5mC. For example, Tet enzymes can convert thymine (T) to 5-hydroxymethyluracil (5hmU) and 5-formyluracil (5fU) in mammalian genomes. Additionally, certain parasites, such as Leishmania, possess Tet homologues that produce base J, and the Tet homologue from Chlamydomonas reinhardtii generates 5-glyceryl-methylcytosine (5gly-mC) using ascorbate as a co-substrate. These findings highlight the diversity of Tet homologues and their potential to produce novel modifications.
Beyond 5mC and its derivatives, other modified bases have been identified in DNA, including N6-methyladenine (6mA) and N4-methylcytosine (4mC). While 6mA is robustly identified, the presence of 4mC remains a topic of debate. Sensitive analytical methods, such as LC-MS/MS and single-molecule real-time sequencing (SMRT-seq), have recently been used to detect 6mA and 4mC in eukaryotes. However, the levels found in eukaryotes are low or undetectable, which suggests that some reports may reflect overestimation or contamination.
The association between modified DNA bases and disease has garnered significant interest, particularly in the context of cancer and neurodegenerative disorders. Understanding these connections is crucial for developing diagnostic and therapeutic strategies.
Altered DNA methylation patterns are well documented in cancer cells, with global reductions in 5hmC levels observed in cancer cell lines and primary tissues. Modified DNA bases, such as 5hmC, have potential as cancer biomarkers. For instance, 5hmC signatures in cell-free DNA from blood samples may provide insights into cancer stage and type. Similar potential exists for 5mC profiling in early-stage cancer diagnosis.
Modified DNA bases have also been linked to neurodegenerative diseases. Specific cytosine modification signatures associated with Alzheimer's disease (AD) have been identified and validated in clinical cohorts. Additionally, altered 5mC levels in mitochondrial DNA have been observed in Parkinson's disease. These findings highlight the potential of modified bases as biomarkers for neurodegenerative disorders.
Research on chemically modified DNA bases stands at the intersection of chemistry and biology, with significant progress made in recent years. The discovery of new modifications and the development of advanced analytical techniques have expanded our understanding of DNA modifications and their roles in gene regulation and disease. Continued research will likely uncover further modifications and their functions, leading to new clinical applications and therapeutic strategies. The future of this field promises exciting opportunities for both basic research and translational medicine.