Dysregulation of transcription is associated with the pathogenesis of cardiovascular diseases, including congenital heart diseases and heart failure. However, it remains unclear how transcription factors regulate transcription in the heart and which genes are associated with cardiovascular diseases in humans. The development of genome-wide analyses based on next-generation sequencing provides powerful methods to determine how these transcription factors and chromatin regulators control gene expression and to identify causative genes in cardiovascular diseases. These technologies have revealed that transcription during heart development is elaborately regulated by multiple cardiac transcription factors.
The human genome is approximately 3 Gbp in size, and next-generation sequencers can readily cover all of it, which makes them well suited to genome-wide analysis. Indeed, many methods have been developed using next-generation sequencers, such as chromatin immunoprecipitation combined with sequencing (ChIPseq) and whole-transcriptome shotgun sequencing (RNA sequencing; RNAseq). These technologies allow us to identify genomic mutations and to analyze the molecular mechanisms of chromatin regulation. Using massively parallel DNA sequencing, we can determine where transcription-regulating factors bind and how they control gene expression by modulating chromatin status, as well as which mutations are associated with cardiovascular diseases.
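To make the notion of "covering" a 3 Gbp genome concrete, the following minimal sketch computes the expected mean sequencing depth as (number of reads × read length) / genome size. Only the 3 Gbp genome size comes from the text; the 100 bp read length and the 30-fold target depth are illustrative assumptions, and the function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000  # approximately 3 Gbp, as stated in the text


def mean_coverage(num_reads, read_length_bp, genome_size_bp=HUMAN_GENOME_BP):
    """Expected mean depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, read_length_bp, genome_size_bp=HUMAN_GENOME_BP):
    """Smallest number of reads whose total length reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Assumed values for illustration only: 100 bp reads, 30-fold target depth.
    n = reads_needed(target_coverage=30, read_length_bp=100)
    print(n, mean_coverage(n, 100))
```

This is an average only: it assumes reads are spread uniformly and says nothing about regions that are difficult to map.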
These new technologies have recently revealed the molecular mechanisms of transcription regulation by cardiac transcription factors and identified several genes related to cardiovascular diseases. In this review, we summarize recent studies that explore the molecular mechanisms of transcription regulation in heart development and cardiovascular diseases.
Schematic illustration of ChIPseq and RNAseq. The procedure of ChIPseq. (1) Chromatin is fragmented by sonication or enzymatic digestion, with or without crosslinking between proteins and DNA. (2) Complexes of the protein of interest and its bound DNA are immunoprecipitated with an antibody against that protein. (3) To construct a library from the immunoprecipitated DNA, adapter DNA is ligated to the ends of the purified DNA fragments, which are then amplified using the adapter sequences. (4) DNA sequences in the library are analyzed by next-generation sequencing. Bioinformatic analysis begins with (1) mapping DNA sequences onto a reference genome, such as mm9 for the mouse genome and hg19 for the human genome. Several mapping tools are available from next-generation sequencer companies and from researchers, including LifeScope (Life Technologies), ELAND (Illumina), Bowtie, and BWA. (2) Mapped DNA sequences are then used for peak-calling to identify enriched genomic regions; because peak-calling tools use different algorithms, the results may vary with the tool and parameter settings. (3) The obtained peaks are used for further analysis, such as comparing binding regions between samples, finding specific DNA motifs, and analyzing correlation with transcription status. The procedure of RNAseq. (1) Two different methods are available for mRNA collection: purification of polyA-tailed mRNA and depletion of rRNA. Because most cellular RNA is rRNA, removing rRNA is critical for enrichment of mRNA. Purification of polyA-tailed mRNA readily removes rRNA, but non-polyadenylated mRNA, including histone mRNA, is lost. Alternatively, depletion of rRNA with sequence-specific probes enriches mRNA, including non-polyadenylated mRNA, but more reads are required for analysis because sequence-specific probes cannot completely deplete rRNA. (2, 3) Purified mRNA is fragmented and used to construct a library, which is sequenced in the same manner as ChIPseq.
Bioinformatic analysis begins with (1) mapping the obtained reads onto a reference genome. Because splicing occurs during mRNA maturation, some reads map across junctions between exons. Thus, mapping tools that take splicing into account are required, such as TopHat and the programs from next-generation sequencer companies. (2, 3) Programs such as Cufflinks, DESeq, baySeq, or edgeR can be used to compare expression levels of mapped sequences between different samples. ChIPseq, chromatin immunoprecipitation and sequencing; RNAseq, RNA sequencing; rRNA, ribosomal RNA.
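The two downstream steps named in the legend, peak-calling for ChIPseq and expression comparison for RNAseq, can be illustrated with a deliberately naive sketch. It is not the algorithm of any tool named above: real peak callers use more elaborate statistical models, and DESeq and edgeR fit negative binomial models to replicated counts. All function names, window sizes, thresholds, and the fixed-window fold-enrichment rule are assumptions made for illustration, and the inputs are assumed to be already-mapped read start positions on one chromosome and per-gene read counts.

```python
import math
from collections import Counter


def window_counts(read_starts, window_size):
    """Count mapped read start positions per fixed-size genomic window."""
    return Counter(pos // window_size for pos in read_starts)


def enriched_windows(chip_starts, control_starts, window_size=200,
                     min_fold=4.0, pseudocount=1.0):
    """Return (start, end, fold) for windows enriched in ChIP over control."""
    chip = window_counts(chip_starts, window_size)
    control = window_counts(control_starts, window_size)
    # Scale control counts to the ChIP library size before comparing.
    scale = len(chip_starts) / max(len(control_starts), 1)
    peaks = []
    for window, n_chip in sorted(chip.items()):
        expected = control.get(window, 0) * scale
        fold = (n_chip + pseudocount) / (expected + pseudocount)
        if fold >= min_fold:
            start = window * window_size
            peaks.append((start, start + window_size, fold))
    return peaks


def counts_per_million(gene_counts):
    """Normalize raw per-gene read counts by library size (assumes total > 0)."""
    total = sum(gene_counts.values())
    return {gene: count * 1_000_000 / total for gene, count in gene_counts.items()}


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """log2(sample B / sample A) on CPM values for genes present in both samples."""
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    shared = sorted(cpm_a.keys() & cpm_b.keys())
    return {g: math.log2((cpm_b[g] + pseudocount) / (cpm_a[g] + pseudocount))
            for g in shared}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    chip = [1000 + 10 * i for i in range(10)] + [50, 250]
    control = [60, 260, 1010] + [2000 + 200 * i for i in range(9)]
    print(enriched_windows(chip, control))
    print(log2_fold_changes({"geneA": 100, "geneB": 900},
                            {"geneA": 400, "geneB": 600}))
```

Changing `window_size` or `min_fold` changes which windows are reported, which mirrors the legend's point that peak-calling results depend on the tool and its parameter settings.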