-
Thierry-Mieg, Danielle, Piano, Fabio, Prasad Manoharan, Arun, Rajewsky, Nikolaus, Thierry-Mieg, Jean, Chen, Kevin, MacMenamin, Philip, Kim, John, Zegar, Charles, Ting, Han, Chen, Wei, Mis, Emily, Mangone, Marco, Kohara, Yuji, Vidal, Marc, Attie, Oliver, Khivansara, Vishal, Salehi-Ashtiani, Kourosh, Gunsalus, Kristin
[
International Worm Meeting,
2009]
Three-prime untranslated regions (3'UTRs) contain sequence elements used by RNA-binding proteins and regulatory RNAs such as microRNAs (miRNAs) to influence mRNA stability, translation and localization. However, the annotation of these regions within transcripts is generally incomplete. In C. elegans, which has a well-annotated genome, about half of the ~20,000 curated genes in WS190 are not annotated with any 3'UTR and fewer than 10% are annotated with multiple or alternative 3'UTR isoforms. As part of the modENCODE project, we have developed a pipeline targeting ~7,000 genes using 3'RACE and found at least one 3'UTR for around 90% of this targeted set. To analyze these data we have used different sequencing technologies, which resulted in the identification of multiple distinct 3'UTR isoforms for over half of the targeted genes. In addition to 3'RACE, we have developed a technique to capture polyA ends followed by deep sequencing. This polyA-capture approach has resulted in over 2,000,000 polyA tags that map to ~12,000 genes across all major developmental stages and males, with the majority of these sequences mapping to full-length 3'UTRs. The depth of these data shows the remarkable complexity in the distribution of polyadenylation events in vivo. We have also manually curated 3'UTR boundaries using all available cDNAs, derived mostly from the 200,000 staged ESTs, and obtained 3'UTR boundaries for ~11,500 genes. For about half of these (~5,000 genes), the data connect 3'UTRs to specific transcripts with known trans-spliced leader sequences at the 5' end. Combining the results of these analyses has increased our knowledge of 3'UTR structures significantly, identifying novel 3'UTR isoforms for about half of all C. elegans protein-coding genes.
-
Salehi-Ashtiani, Kourosh, Vidal, Marc, Attie, Oliver, Mis, Emily, Zegar, Charles, Piano, Fabio, Mangone, Marco, MacMenamin, Philip, Gunsalus, Kris
[
International Worm Meeting,
2009]
Three-prime untranslated regions (3'UTRs) are widely recognized as important post-transcriptional regulatory portions of mRNAs. RNA-binding proteins and small non-coding RNAs such as microRNAs (miRNAs) bind to functional elements within 3'UTRs to influence mRNA stability, translation and localization. To characterize and clone C. elegans 3'UTRs, we have developed a high-throughput 3'RACE strategy and have characterized an initial target set of 7,014 genes. For each gene, we use a transcript-specific forward primer and a universal polydT(23) anchored reverse primer. The primers are designed to generate products compatible with the Promoterome and ORFeome resources. We cloned the 3'UTRs into an entry vector ready to be used for the third position in the multi-site Gateway system, suitable for downstream functional analysis. In our first cycle, we cloned 3'UTRs of ~6,000 genes and sequence-verified these as mixed bacterial transformants (or "minipools"). For half of these genes, the analysis of their 3' ends identified new 3'UTRs compared to data present in WormBase WS190 (http://wormbase.org). We subsequently separated these minipools into ~56,000 isolated colonies and re-sequenced our library by deep sequencing (Illumina/Solexa and 454/Roche). We obtained unique cloned 3'UTR isoforms for over 90% of our target set, and observed multiple 3'UTR isoforms for over half. Most of the alternative isoforms are produced by differential PAS usage. Our data and annotations are being deposited into the modENCODE website (http://www.modencode.org), and are also viewable in our 3'UTR-centric website (http://www.utrome.org). We will present the status of this project and its applications to study 3'UTR biology in C. elegans.
-
Sugano, Sumio, Suzuki, Yukata, Salehi-Ashtiani, Kourosh, Gunsalus, Kris C, Rajewsky, Nikolaus, Harkins, Tim, Prasad Manoharan, Arun, Thierry-Mieg, Danielle, Thierry-Mieg, Jean, Mis, Emily, Mackowiak, Sebastian, Han, Ting, Khivansara, Vishal, Kim, John, Piano, Fabio, Zegar, Charles, Gutwein, Michelle, Mangone, Marco, Kohara, Yuji, Buffard, Pascal
[
C. elegans: Development and Gene Expression, EMBL, Heidelberg, Germany,
2010]
Three-prime untranslated regions (3'UTRs) of metazoan mRNAs contain numerous regulatory elements, yet the structure and developmental impact of 3'UTRs remain largely uncharacterized. By integrating data from all Caenorhabditis elegans (C. elegans) developmental stages obtained by polyA capture, 3'RACE, full-length cDNAs, and RNAseq, we define ~26,000 distinct mRNA 3'UTRs for ~85% of the 18,328 experimentally supported protein-coding genes and refine the annotation of ~30% of gene models. Alternative 3'UTR isoforms display widespread differential expression during development. Surprisingly, no canonical or variant polyadenylation signal (PAS) sequence is detected for 13% of polyA sites, most frequently among shorter alternative isoforms. Comparing trans-spliced and non-trans-spliced genes reveals a strong correlation between the processing of transcript 5' and 3' ends: trans-spliced mRNAs possess longer 3'UTRs and a higher frequency of no PAS or variant PAS motifs. We also predict conserved isoform-specific microRNA (miRNA) binding sites and identify additional evolutionarily constrained sequence blocks that may mediate 3'UTR regulation. Thus, our data reveal a rich complexity of 3'UTRs genome-wide and throughout development in C. elegans.
-
[
International Worm Meeting,
2007]
Untranslated regions (UTRs) are found at the 5' and 3' flanking ends of transcribed RNAs and contain elements important for the post-transcriptional regulation of the RNA. UTRs are implicated in the control of gene expression through interaction with regulatory proteins and with small non-coding RNAs, such as miRNAs. These interactions can inhibit translation or alter the stability of the messenger RNA, resulting in a decrease of protein levels. Computational predictions suggest that each miRNA controls a network of proteins through interaction with consensus sequences in their respective 3'UTRs; thus, collectively, miRNAs likely regulate the expression of thousands of transcripts. Here we aim to begin to study the biology of 3'UTRs. We developed a high-throughput approach to clone all 3'UTRs present in C. elegans into a vector that can easily be used for downstream in vivo testing. Using a 3'RACE strategy, we amplify 3'UTRs from total RNA obtained from mixed developmental stages. We began this project by cloning the 3'UTRs found in both the ORFeome and the PROMOTERome databases. Initial results suggest that our strategy will lead to over 80% of 3'UTRs in this set being cloned and sequence-verified. Preliminary analyses also suggest that for over 20% of the mRNAs there are bona fide alternative 3'UTRs. Given the strategy we are using to identify 3'UTRs, we will have in C. elegans a collection of clones that can be used to assemble a gene locus from its three component parts (the promoter, the ORF and the 3'UTR) in a modular way, which we can use in a variety of downstream analyses.
-
[
International Worm Meeting,
2019]
MicroRNAs (miRNAs) are known to modulate gene expression, but their activity at the tissue-specific level remains largely uncharacterized. In order to study their contribution to tissue-specific gene expression, we developed novel tools to profile miRNA targets in the C. elegans intestine and body muscle. We validated many previously described interactions, and identified ~3,500 novel targets. Many of the curated miRNA targets are known to modulate the functions of their respective tissues. Within our datasets we observed a disparity in the usage of miRNA-based gene regulation between the intestine and body muscle. The intestine contained significantly more miRNA targets than the body muscle, highlighting its transcriptional complexity. We detected an unexpected enrichment of RNA-binding proteins targeted by miRNAs in both tissues, with a notable abundance of RNA splicing factors. We developed in vivo genetic tools to validate and further study three mRNA splicing factors identified as miRNA targets in our study (asd-2, hrp-2 and smu-2), and show that these factors indeed contain functional regulatory elements in their 3'UTRs that are able to repress their expression in the intestine. In addition, the alternative splicing pattern of their respective downstream targets (unc-60, unc-52, lin-10 and ret-1) is dysregulated when the miRNA pathway is disrupted. A re-annotation of transcriptome data from past studies of C. elegans strains deficient in the miRNA pathway supports and expands on our results. This study highlights an unexpected role for miRNAs in modulating tissue-specific gene isoforms, where post-transcriptional regulation of RNA splicing factors associates with tissue-specific alternative splicing.
-
[
International Worm Meeting,
2021]
The cleavage and polyadenylation of pre-mRNAs is a critical step needed for RNA transcription termination and maturation. This process is executed by a large multi-subunit complex known as the RNA cleavage and polyadenylation complex (CPC). The CPC binds to the polyadenylation signal (PAS), which is a conserved hexameric element located at the end of the transcript's 3' Untranslated Region (3'UTR). The CPC then performs the cleavage reaction at the polyadenylation (PS) site. Despite their importance, PS element locations in eukaryotic genomes are poorly characterized. Prior research from our lab revealed that the distance between the PAS and the PS elements is not constant and that a buffer region of 12-14 nt downstream of the PAS element is needed by the CPC in order to perform a successful cleavage reaction. Our lab recently identified an enrichment of adenosine nucleotides at the PS site and demonstrated that their removal alters the location of the cleavage site in vivo, suggesting an important novel role of the PS element in pre-mRNA cleavage and polyadenylation. In order to further study the involvement of this terminal adenosine located at the PS site, we developed a novel in vivo cleavage and polyadenylation assay, which we will use to determine the optimal buffer region length and further study the role of the terminal adenosine at the cleavage site. For these experiments, we have used the M03A1.3 3'UTR since it uses only one canonical PAS element (and thus does not use alternative polyadenylation) and has a buffer region of 14 nt. We have prepared a GFP reporter construct containing a mutated M03A1.3 3'UTR with no adenosines between the PAS and PS elements. This GFP reporter construct was used to prepare seven new mutants, each containing a terminal adenosine inserted at a specific location between 17 and 29 nucleotides downstream of the PAS element.
The results of this assay will provide a working framework which we will use to model PS elements in the 5,546 genes in the worm transcriptome that currently lack annotated 3'UTR data. Our work will greatly improve our understanding of pre-mRNA cleavage and polyadenylation in C. elegans and will allow us to provide a needed reference for PS elements in the worm transcriptome to the scientific community.
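The mutant design described in this abstract (an adenosine-free region between the PAS and PS elements, with a single adenosine placed at a chosen distance from the PAS) can be illustrated with a short sketch. Everything below is hypothetical: the function name `insert_adenosine`, the synthetic spacer sequence, the example offsets, and the convention that the offset is counted from the first nucleotide of the PAS hexamer are assumptions made for illustration, not the authors' constructs or the M03A1.3 sequence.

```python
def insert_adenosine(pas: str, spacer: str, offset: int) -> str:
    """Return pas + spacer with one 'A' inserted `offset` nt after the first PAS base.

    Assumptions (not from the abstract): the offset is measured from the first
    nucleotide of the PAS hexamer, and the adenosine is inserted rather than
    substituted, so the product is one nucleotide longer than pas + spacer.
    """
    if "A" in spacer.upper():
        raise ValueError("spacer must be adenosine-free")
    idx = offset - len(pas)  # insertion index within the spacer
    if not 0 <= idx <= len(spacer):
        raise ValueError("offset falls outside the spacer")
    return pas + spacer[:idx] + "A" + spacer[idx:]


if __name__ == "__main__":
    pas = "AAUAAA"          # canonical PAS hexamer
    spacer = "CUGU" * 6     # synthetic 24-nt adenosine-free spacer (hypothetical)
    for offset in (17, 23, 29):  # arbitrary example positions within 17-29 nt
        mutant = insert_adenosine(pas, spacer, offset)
        print(offset, mutant)
```

Checking that the spacer is adenosine-free before the insertion keeps the single introduced adenosine as the only candidate between the PAS and the end of the sequence, which is the property the reporter design relies on.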
-
[
International Worm Meeting,
2021]
Tissue and cell identity relies heavily on the 3' untranslated region (3'UTR) of mRNAs, which contains several regulatory elements required for proper gene expression. Mechanisms including alternative polyadenylation (APA), which produces distinct 3'UTR isoforms, and downregulation through microRNAs (miRNAs) are essential in establishing tissue identity through modulating gene expression. The networks formed between these distinct mechanisms spin a complex web around tissue and cell identity that is poorly understood in metazoans. Previous experiments in our lab used an immunoprecipitation-based approach, which showed that APA allows mRNA transcripts to evade miRNA targeting in a tissue-specific manner in the model organism C. elegans. In addition, we identified miRNA targets in the intestine and body muscle tissues, but this approach is unable to identify specific miRNA populations, which are essential pieces in the post-transcriptional regulation puzzle. Identifying tissue-specific miRNA populations will provide a better understanding of how gene regulation modulates identity. With the ultimate goal of producing a comprehensive tissue-specific miRNAome in C. elegans, we developed a novel approach in which RNA is isolated from tissue-specific nuclei using FACS and then sequenced. The miRNAs identified with this method are validated using a second, unbiased RT-qPCR-based approach. To develop strains expressing fluorescent tissue-specific nuclei, the mCherry fluorochrome was fused to the C. elegans histone H2B ortholog, his-58. Six worm strains were prepared expressing this construct specifically in the intestine (ges-1p), body muscle (myo-3p), hypodermis (dpy-7p), seam cells (grd-10p), and excitatory (nmr-1p) and GABAergic neurons (unc-47p). Briefly, the worms are homogenized in a tissue grinder with a clearance slightly wider than the diameter of the nuclei, then the lysate is sequentially filtered and centrifuged before FACS isolation. Finally, RNA is isolated and the library is prepped with the Nextera XT kit. Our initial results support the validity of this methodology, whose output will be overlaid on the tissue-specific miRNA targets and the tissue-specific 3'UTRome datasets available in our lab. This will ultimately provide the first comprehensive tissue-specific miRNA and 3'UTR Interactome in a living organism.
-
[
International Worm Meeting,
2021]
One of the most fundamental questions in RNA biology is how transcriptional termination is executed in eukaryotes, and how the location of the cleavage reaction influences mRNA stability and expression levels. The mechanism of this process is important because it determines the length of the 3' Untranslated Regions (3'UTRs), which are defined as the sequences located between the STOP codon and the polyA tail of mature mRNAs. 3'UTRs are targeted by a variety of regulatory factors, including miRNAs and RNA Binding Proteins (RBPs). Here, we have used a genomic approach to map and study 3'UTR data from 1,094 transcriptome datasets downloaded from the public SRA repository at the NCBI. These datasets correspond to the entire collection of C. elegans transcriptomes stored in this repository from 2015 to 2018, which allowed us to map 3'UTRs with unprecedented, ultra-deep coverage (the average coverage at the mRNA cleavage site is close to 220X). Given the amount of data used in this study, to our knowledge this is the most comprehensive and high-resolution analysis of 3'UTRs in a living organism performed to date. We have assigned novel 3'UTR isoforms to ~1,000 protein-coding genes, refined and updated 3'UTR boundaries for ~23,000 3'UTR isoforms, performed a detailed comparative genomic analysis of the C. elegans cleavage and polyadenylation complex (CPC), and carried out in vivo studies to probe principles of mRNA cleavage and polyadenylation. We found that the CPC in C. elegans is conserved with its human counterpart, with most of the functional domains and critical amino acids preserved. While most 3'UTRs possess a known polyadenylation signal (PAS) element (AAUAAA) located around -19 nt from the cleavage site, 3'UTRs with a non-canonical PAS have a less stringent sequence requirement but preserve the chemical nature of the element, which is RRYRRR (R, purine; Y, pyrimidine). The majority of C. elegans 3'UTRs terminate with a terminal adenosine nucleotide, which we speculate is included by RNA polymerase II during the transcription step. This adenosine nucleotide is required for proper cleavage, since its removal impacts the location of the cleavage site in vivo.
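As a rough illustration of the PAS description in this abstract, the sketch below scans the 3' end of a 3'UTR for a hexamer matching RRYRRR and labels it canonical (AAUAAA) or variant. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `find_pas`, the 40-nt search window, the preference for a canonical hit over a variant one, and the offset convention (first base of the hexamer relative to the end of the supplied sequence, taken here as the cleavage site) are all choices made for this example.

```python
import re

CANONICAL_PAS = "AAUAAA"
# IUPAC codes: R = purine (A or G), Y = pyrimidine (C or U).
# The lookahead lets overlapping hexamers be reported.
RRYRRR = re.compile(r"(?=([AG]{2}[CU][AG]{3}))")


def find_pas(utr: str, window: int = 40):
    """Return (kind, motif, offset) for a PAS-like hexamer near the 3' end, or None.

    `offset` is the position of the hexamer's first base relative to the end of
    `utr` (assumed to be the cleavage site), so it is always negative.
    Canonical AAUAAA hits are preferred; ties go to the hit nearest the 3' end.
    """
    seq = utr.upper().replace("T", "U")  # accept DNA or RNA input
    start = max(0, len(seq) - window)
    hits = [(m.start(1) + start, m.group(1)) for m in RRYRRR.finditer(seq[start:])]
    if not hits:
        return None
    canonical = [h for h in hits if h[1] == CANONICAL_PAS]
    pos, motif = (canonical or hits)[-1]
    kind = "canonical" if motif == CANONICAL_PAS else "variant"
    return kind, motif, pos - len(seq)


if __name__ == "__main__":
    # Synthetic sequences, not real C. elegans 3'UTRs.
    print(find_pas("GCGCGC" + "AAUAAA" + "C" * 13))  # canonical hexamer at -19
    print(find_pas("CCCCCC" + "AGTAAA" + "C" * 13))  # variant hexamer, DNA input
    print(find_pas("C" * 30))                         # no RRYRRR match
```

Note that AAUAAA itself fits the RRYRRR pattern, so a single regular expression covers both cases and the canonical label is assigned by exact comparison afterwards.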
-
[
International Worm Meeting,
2021]
The region downstream of the STOP codon in mRNA, referred to as the 3' Untranslated Region (3'UTR), governs the length of mature mRNA. Specifically, the cleavage site located in this region determines where mRNA cleavage will occur and where the polyadenylation reaction will begin, thus terminating mRNA transcription. The mRNA cleavage and polyadenylation machinery in C. elegans is highly conserved with its human counterpart, with most functional domains and critical amino acids preserved. Dysregulation of 3'UTR processing has been observed in many diseases, such as cancer, Alzheimer's disease, and muscular dystrophies, but unfortunately the molecular mechanisms underlying mRNA transcription termination remain elusive. Although the position of the cleavage site is not precise, our lab has identified an adenosine consistently located at the mRNA cleavage site. It is unclear if this adenosine is maintained in the mature mRNA transcripts following cleavage and/or is used as a template for the polymerization of the poly(A) tail. In order to answer this question, we developed a novel terminal adenosine RNA methyltransferase (TAM) assay that will sense the inclusion or exclusion of this terminal adenosine at the cleavage site of C. elegans transcripts by taking advantage of the human nuclear methyltransferase, METTL16. METTL16 methylates the underscored adenosine in its binding motif, "UACAGAGAA", in both mRNA and snRNA. We have cloned the human METTL16 gene, placed its RNA recognition motif at the cleavage site of the C. elegans gene M03A1.3, and co-expressed both in the pharynx tissue. Understanding this process is crucial to identifying the main mechanisms behind mRNA cleavage site determination, further advancing knowledge in gene regulation, which influences development, growth, and disease.
-
Piano, F., Gunsalus, K.C., Lucas, J.M., Mangone, M., Gutwein, M.R., Mecenas, D.
[
International Worm Meeting,
2011]
Metazoan messenger RNAs contain poorly defined cis-acting elements within their 3' untranslated regions (3'UTRs) that modulate gene expression at the post-transcriptional level. These elements play important roles in development and metabolism, and their dysfunction is associated with disease states such as autism, depression, diabetes, Alzheimer's and cancer. Using a combination of deep sequencing, high-throughput 3'RACE, and manual curation of public datasets, we have recently reported a 3'UTR encyclopedia defining ~26,000 3'UTRs for ~75% of protein-coding genes in the C. elegans WS190 genome. This study identified 3' end-processing elements, evolutionarily conserved blocks, predicted miRNA target binding sites, and alternative termination sites within thousands of 3'UTRs. The surprising complexity revealed by these data suggests that 3'UTR-mediated post-transcriptional gene regulation is prevalent on a genome-wide scale (1). To build a comprehensive resource for 3'UTR biology we have been constructing a library of 3'UTR clones using modular vectors that facilitate downstream in vivo functional analyses. We have completed our first pass of directed 3'UTR cloning, obtaining at least one 3'UTR isoform for ~15,000 genes annotated in WS190, and we are currently isolating and deep-sequencing 3'UTR isoforms from our last batch of ~6,000 3'UTR minipools. These clones will be made available to the community through distributors and through our website UTRome.org (2). In addition, we have developed a suite of destination vectors to adapt the 3'UTRome to perform large-scale RNAi screens by feeding, to study the contribution to gene expression provided by different 3'UTR isoforms in vivo using dual-color reporters, and to facilitate the introduction of 3'UTRs into worms either by injection or ballistic transformation.
Moreover, we are now preparing entry vectors that will allow the study of the contribution of 3'UTR isoforms to mRNA localization using an MS2-derived system compatible with MosSCI technology. In conclusion, we have provided the first whole transcriptome-level description of the 3'UTRome in any organism with single-nucleotide resolution. Furthermore, our 3'UTR clone library provides a powerful tool to probe 3'UTR biology at a systems-level scale. Our work will help to better understand 3'UTR biology and push forward the study of important cis-regulatory elements in the genome.
1. M. Mangone et al., The landscape of C. elegans 3'UTRs. Science 329, 432 (2010).
2. M. Mangone et al., UTRome.org: a platform for 3'UTR biology in C. elegans. Nucleic Acids Res 36, D57 (2008).