|
Supplementary Material for Published Papers
Sural et al, Nature Structural and Molecular Biology, 2008
The MSL3 chromodomain directs a key targeting step for dosage compensation of the Drosophila melanogaster X chromosome
Abstract: The male-specific lethal (MSL) complex upregulates the single male X chromosome to achieve dosage compensation in Drosophila melanogaster. We have proposed that MSL recognition of specific entry sites on the X is followed by local targeting of active genes marked by histone H3 trimethylation (H3K36me3). Here we analyze the role of the MSL3 chromodomain in the second targeting step. Using ChIP-chip analysis, we find that MSL3 chromodomain mutants retain binding to chromatin entry sites but show a clear disruption in the full pattern of MSL targeting in vivo, consistent with a loss of spreading. Furthermore, when compared to wild type, chromodomain mutants lack preferential affinity for nucleosomes containing H3K36me3 in vitro. Our results support a model in which activating complexes, similarly to their silencing counterparts, use the nucleosomal binding specificity of their respective chromodomains to spread from initiation sites to flanking chromatin.
Kharchenko et al, Nature Biotechnology, 2008
Design and analysis of ChIP-seq experiments for DNA-binding proteins
Abstract: Recent progress in massively parallel sequencing platforms has enabled genome-wide characterization of DNA-associated proteins using the combination of chromatin immunoprecipitation and sequencing (ChIP-seq). Although a variety of methods exist for analysis of the established alternative ChIP microarray (ChIP-chip), few approaches have been described for processing ChIP-seq data. To fill this gap, we propose an analysis pipeline specifically designed to detect protein-binding positions with high accuracy. Using previously reported data sets for three transcription factors, we illustrate methods for improving tag alignment and correcting for background signals. We compare the sensitivity and spatial precision of three peak detection algorithms with published methods, demonstrating gains in spatial precision when an asymmetric distribution of tags on positive and negative strands is considered. We also analyze the relationship between the depth of sequencing and characteristics of the detected binding positions, and provide a method for estimating the sequencing depth necessary for a desired coverage of protein binding sites.
Alekseyenko et al, Cell, 2008
A sequence motif within chromatin entry sites directs MSL establishment on the Drosophila X chromosome.
Abstract: The Drosophila MSL complex associates with active genes specifically on the male X chromosome to acetylate histone H4 at lysine 16 and increase expression approximately 2-fold. To date, no DNA sequence has been discovered to explain the specificity of MSL binding. We hypothesized that sequence-specific targeting occurs at "chromatin entry sites," but the majority of sites are sequence independent. Here we characterize 150 potential entry sites by ChIP-chip and ChIP-seq and discover a GA-rich MSL recognition element (MRE). The motif is only slightly enriched on the X chromosome ( approximately 2-fold), but this is doubled when considering its preferential location within or 3' to active genes (>4-fold enrichment). When inserted on an autosome, a newly identified site can direct local MSL spreading to flanking active genes. These results provide strong evidence for both sequence-dependent and -independent steps in MSL targeting of dosage compensation to the male X chromosome.
Orford et al, Developmental Cell, 2008
Differential H3K4 methylation identifies developmentally poised hematopoietic genes.
Abstract: Throughout development, cell fate decisions are converted into epigenetic information that determines cellular identity. Covalent histone modifications are heritable epigenetic marks and are hypothesized to play a central role in this process. In this report, we assess the concordance of histone H3 lysine 4 dimethylation (H3K4me2) and trimethylation (H3K4me3) on a genome-wide scale in erythroid development by analyzing pluripotent, multipotent, and unipotent cell types. Although H3K4me2 and H3K4me3 are concordant at most genes, multipotential hematopoietic cells have a subset of genes that are differentially methylated (H3K4me2+/me3-). These genes are transcriptionally silent, highly enriched in lineage-specific hematopoietic genes, and uniquely susceptible to differentiation-induced H3K4 demethylation. Self-renewing embryonic stem cells, which restrict H3K4 methylation to genes that contain CpG islands (CGIs), lack H3K4me2+/me3- genes. These data reveal distinct epigenetic regulation of CGI and non-CGI genes during development and indicate an interactive relationship between DNA sequence and differential H3K4 methylation in lineage-specific differentiation.
Larschan et al, Molecular Cell, 2007
MSL complex is attracted to genes marked by H3K36 trimethylation using a sequence-independent mechanism
Abstract: Dosage compensation equalizes the levels of transcripts encoded on the X chromosome between XY males and XX females. In Drosophila, dosage compensation requires the MSL (Male Specific Lethal) complex, which associates with actively transcribed genes on the single male X chromosome to upregulate transcription approximately two-fold. MSL complex targets genes, with increased binding toward 3’ ends. To search for chromatin modifications associated with MSL binding, we mapped H3K36 trimethylation (H3K36me3) in Drosophila SL2 cells, and found that it marks transcribed genes with a 3’ bias, as in yeast and humans. On the male X chromosome, or when MSL complex is ectopically localized to an autosome, H3K36me3 is a strong predictor of MSL binding. We isolated mutants lacking Set2, the H3K36me3 methyltransferase, and found that Set2 is an essential gene in both sexes of Drosophila. In set2 mutant males, MSL complex can still associate with high affinity sites, which are proposed to mark the X chromosome through DNA sequence elements. Yet, MSL complex exhibits reduced binding to target genes, suggesting a specific role for H3K36me3 in MSL targeting. In addition, we found that recombinant MSL3 protein preferentially binds nucleosomes marked by H3K36me3 in vitro. Our results support a model in which the MSL complex recognizes many of its targets through general features of transcribed genes.
Peng et al, BMC Bioinformatics, 2007
Normalization and experimental design for ChIP-chip data.
Abstract: BACKGROUND
Chromatin immunoprecipitation on tiling arrays (ChIP-chip) has been widely used to investigate the DNA binding sites for a variety of proteins on a genome-wide scale. However, several issues in the processing and analysis of ChIP-chip data have not been resolved fully, including the effect of background (mock control) subtraction and normalization within and across arrays.
RESULTS: The binding profiles of Drosophila male-specific lethal (MSL) complex on a tiling array provide a unique opportunity for investigating these topics, as it is known to bind on the X chromosome but not on the autosomes. These large bound and control regions on the same array allow clear evaluation of analytical methods. We introduce a novel normalization scheme specifically designed for ChIP-chip data from dual-channel arrays and demonstrate that this step is critical for correcting systematic dye-bias that may exist in the data. Subtraction of the mock (non-specific antibody or no antibody) control data is generally needed to eliminate the bias, but appropriate normalization obviates the need for mock experiments and increases the correlation among replicates. The idea underlying the normalization can be used subsequently to estimate the background noise level in each array for normalization across arrays. We demonstrate the effectiveness of the methods with the MSL complex binding data and other publicly available data.
CONCLUSION: Proper normalization is essential for ChIP-chip experiments. The proposed normalization technique can correct systematic errors and compensate for the lack of mock control data, thus reducing the experimental cost and producing more accurate results.
Alekseyenko et al, Genes & Development, 2006
High-resolution ChIP-chip analysis reveals that the Drosophila MSL complex selectively identifies active genes on the male X chromosome.
Abstract: X-chromosome dosage compensation in Drosophila requires the male-specific lethal (MSL) complex, which up-regulates gene expression from the single male X chromosome. Here, we define X-chromosome-specific MSL binding at high resolution in two male cell lines and in late-stage embryos. We find that the MSL complex is highly enriched over most expressed genes, with binding biased toward the 3 end of transcription units. The binding patterns are largely similar in the distinct cell types, with ~600 genes clearly bound in all three cases. Genes identified as clearly bound in one cell type and not in another indicate that attraction of MSL complex correlates with expression state. Thus, sequence alone is not sufficient to explain MSL targeting. We propose that the MSL complex recognizes most X-linked genes, but only in the context of chromatin factors or modifications indicative of active transcription. Distinguishing expressed genes from the bulk of the genome is likely to be an important function common to many chromatin organizing and modifying activities.
Hamada et al, Genes & Development, 2005
Global regulation of X chromosomal genes by the MSL complex in Drosophila melanogaster.
Abstract: A long-standing model postulates that X-chromosome dosage compensation in Drosophila occurs by twofold up-regulation of the single male X, but previous data cannot exclude an alternative model, in which male autosomes are down-regulated to balance gene expression. To distinguish between the two models, we used RNA interference to deplete Male-Specific Lethal (MSL) complexes from male-like tissue culture cells. We found that expression of many genes from the X chromosome decreased, while expression from the autosomes was largely unchanged. We conclude that the primary role of the MSL complex is to up-regulate the male X chromosome.
Tian et al, Proc Natl Acad Sci USA, 2005
Discovering statistically significant pathways in expression profiling studies.
Abstract: Accurate and rapid identification of perturbed pathways through the analysis of genome-wide expression profiles facilitates the generation of biological hypotheses. We propose a statistical framework for determining whether a specified group of genes for a pathway has a coordinated association with a phenotype of interest. Several issues on proper hypothesis-testing procedures are clarified. In particular, it is shown that the differences in the correlation structure of each set of genes can lead to a biased comparison among gene sets unless a normalization procedure is applied. We propose statistical tests for two important but different aspects of association for each group of genes. This approach has more statistical power than currently available methods and can result in the discovery of statistically significant pathways that are not detected by other methods. This method is applied to data sets involving diabetes, inflammatory myopathies, and Alzheimer's disease, using gene sets we compiled from various public databases. In the case of inflammatory myopathies, we have correctly identified the known cytotoxic T lymphocyte-mediated autoimmunity in inclusion body myositis. Furthermore, we predicted the presence of dendritic cells in inclusion body myositis and of an IFN-alpha/beta response in dermatomyositis, neither of which was previously described. These predictions have been subsequently corroborated by immunohistochemistry.
Lai et al, Bioinformatics, 2005
Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data.
Abstract: MOTIVATION: Array Comparative Genomic Hybridization (CGH) can reveal chromosomal aberrations in the genomic DNA. These amplifications and deletions at the DNA level are important in the pathogenesis of cancer and other diseases. While a large number of approaches have been proposed for analyzing the large array CGH datasets, the relative merits of these methods in practice are not clear. RESULTS: We compare 11 different algorithms for analyzing array CGH data. These include both segment detection methods and smoothing methods, based on diverse techniques such as mixture models, Hidden Markov Models, maximum likelihood, regression, wavelets and genetic algorithms. We compute the Receiver Operating Characteristic (ROC) curves using simulated data to quantify sensitivity and specificity for various levels of signal-to-noise ratio and different sizes of abnormalities. We also characterize their performance on chromosomal regions of interest in a real dataset obtained from patients with Glioblastoma Multiforme. While comparisons of this type are difficult due to possibly sub-optimal choice of parameters in the methods, they nevertheless reveal general characteristics that are helpful to the biological investigator.
|