-
[
Development & Evolution Meeting,
2006]
Large-scale functional genomic analyses produce large collections of data on gene function and functional links. Interpreting and visualizing this body of data in the context of specific biological processes is challenging, and easily accessible tools to aid in this endeavor are lacking. We have developed a Java-based Web application for navigating gene networks based on different kinds of functional links between genes, such as protein-protein or protein-DNA interactions, genetic interactions, predicted miRNA-target relationships, and correlations based on co-expression or phenotypic similarity. Inspired by the Generic Genome Browser (GBrowse), our goal was to build a similarly intuitive, easy-to-use, interactive tool for navigating network neighborhoods, or a "Generic Network Browser" (N-Browse). N-Browse, available at http://www.gnetbrowse.org, operates within a web browser as a Java applet and uses a client-server system composed of a server-side MySQL database and a client-side graphical user interface (GUI). The GUI allows users to navigate functional networks by double-clicking on successive genes ("nodes") in the graph to expand the network around each node. Users can configure what kinds of functional links ("edges") to display and can retrieve a variety of functional information about selected genes or groups of genes, along with useful external links. The graphical display is dynamic and can be modified in various ways (e.g. zoom/rotate, anchor nodes, select subsets of nodes, display multiple support edges). N-Browse currently provides data for E. coli, C. elegans, D. melanogaster, and H. sapiens. For C. elegans, the current N-Browse database contains ~180,000 functional links among ~10,000 genes and miRNAs. Users can also upload their own data sets for integrated browsing with publicly available data and can save networks in various formats. By providing a convenient graphical integration of diverse types of data, N-Browse facilitates the interpretation of large-scale datasets in the context of local gene neighborhoods, which helps formulate testable hypotheses about potential biological functions and functional links between genes. We will conduct live demonstrations of N-Browse and tutorials to help users get the most out of the tool.
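The node-expansion behavior described in this abstract can be illustrated with a minimal Python sketch. It is not N-Browse code (N-Browse itself is a Java applet backed by MySQL); the edge list, gene names, edge-type labels, and the `build_index` and `expand` functions are hypothetical and serve only to show how expanding a node under an edge-type filter might work.

```python
from collections import defaultdict

# Hypothetical typed edges: (gene_a, gene_b, edge_type). Not real N-Browse data.
EDGES = [
    ("geneA", "geneB", "protein-protein"),
    ("geneA", "geneC", "genetic"),
    ("geneB", "geneD", "co-expression"),
    ("geneC", "geneD", "protein-protein"),
]

def build_index(edges):
    """Index undirected edges by node so that neighbors can be looked up quickly."""
    index = defaultdict(list)
    for a, b, kind in edges:
        index[a].append((b, kind))
        index[b].append((a, kind))
    return index

def expand(index, displayed, node, edge_types):
    """Return the displayed edges after expanding `node`, keeping only selected edge types."""
    new_edges = set(displayed)
    for neighbor, kind in index[node]:
        if kind in edge_types:
            # Store each undirected edge in a canonical order to avoid duplicates.
            new_edges.add((min(node, neighbor), max(node, neighbor), kind))
    return new_edges

index = build_index(EDGES)
selected = {"protein-protein", "genetic"}
view = expand(index, set(), "geneA", selected)   # first "double-click"
view = expand(index, view, "geneB", selected)    # second "double-click"
print(sorted(view))
```

Each call to `expand` mimics one double-click: the displayed network grows by the neighbors of the chosen node, restricted to the edge types the user has selected.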
-
[
International Worm Meeting,
2005]
The RNAi Database (RNAiDB; http://www.rnai.org) is an online resource for RNAi phenotypic data in Caenorhabditis elegans. RNAiDB contains a compendium of publicly available data and provides information about experimental methods, annotated phenotypic results, raw data in the form of images and streaming time-lapse movies, and graphical maps of chromosomal coordinates for RNAi reagents and gene models. Two main views of the data are available, which alternately show all genes potentially inhibited by a single RNAi experiment (the RNAi card), and all RNAi experiments potentially inhibiting a single gene (the Gene card). In addition to canonical gene mappings for each RNAi experiment, RNAiDB provides information about potential off-target inhibition based on sequence analysis of RNAi reagents. A new scoring system has been implemented to provide data on the relative extent of sequence similarity for potential off-target hits. In addition, a new heat map for each gene displays sequence regions that are unique or shared with other genes. RNAi results can be searched in a variety of ways, including combinatorial queries for phenotypes and the tool PhenoBlast, which ranks experiments and genes based on their overall phenotypic similarity to a single query. PhenoBlast has been extended to allow searching of different sets of digitized phenotypic signatures, including those based on detailed embryonic phenotypes or gross morphological defects, and using different distance metrics. We will present an overview of the main features of RNAiDB and new improvements to facilitate analysis of phenotypic data and data mining.
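The idea of ranking experiments by phenotypic similarity to a query, with a choice of distance metric, can be sketched as follows. This is an illustrative Python example only and not the PhenoBlast implementation; the signatures, experiment identifiers, and the two metrics shown (Hamming and Euclidean) are assumptions, since the abstract does not say which metrics PhenoBlast uses.

```python
import math

# Hypothetical digitized phenotypic signatures: 1 = defect observed, 0 = not observed.
SIGNATURES = {
    "exp1": [1, 0, 1, 1, 0],
    "exp2": [1, 0, 1, 0, 0],
    "exp3": [0, 1, 0, 0, 1],
}

def hamming(u, v):
    """Number of phenotype classes in which two signatures differ."""
    return sum(a != b for a, b in zip(u, v))

def euclidean(u, v):
    """Straight-line distance between two signatures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_by_similarity(query_id, signatures, distance=hamming):
    """Rank all other signatures by increasing distance to the query."""
    query = signatures[query_id]
    scored = [
        (distance(query, sig), other_id)
        for other_id, sig in signatures.items()
        if other_id != query_id
    ]
    return [(other_id, d) for d, other_id in sorted(scored)]

ranking = rank_by_similarity("exp1", SIGNATURES)
print(ranking)
```

Passing a different function as `distance` (for example `euclidean`) changes the metric without touching the ranking logic, which mirrors the abstract's point that different distance metrics can be applied to the same signatures.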
-
[
International Worm Meeting,
2003]
Different functional genomics efforts have been initiated in C. elegans, including RNAi phenotype, expression profile, and protein interaction mapping projects. We are developing Web-accessible database tools to facilitate the archival, distribution, navigation, and mining of these data. RNAiDB (www.rnai.org): RNAiDB is a database tool for the archival and analysis of RNAi-based results from large-scale studies. It presents a view of the genome centered around phenotypic analysis, is fully searchable, and provides a user-friendly interface for navigation with links to external database resources. RNAiDB uses the AceDB database engine and is easily integrated with WormBase through its compatible database models. We are extending the functionality of RNAiDB in several ways. To enhance flexibility for mining phenotypic data we are creating tools such as 'PhenoBlast', which ranks genes on the basis of their overall phenotypic similarities (thus providing a function analogous to that of BLAST for sequence searching). We have extended the RNAiDB data models to accommodate one-to-many mappings between RNAi experiments and potentially inhibited target genes based on sequence analyses such as ePCR or BLAST. We have also created a rich set of new data models for phenotypic scoring together with a comprehensive Web-based data entry system that allows users to record phenotypic data directly into the database as they are being scored in a controlled, systematic manner via a series of Web forms linked to the database. InFuGen: We have begun prototyping a second Web tool to facilitate the exploration of different types of functional genomics data in an integrated fashion, which we call "InFuGen" (Integrated Functional Genomics). 
The idea behind this tool is to provide the ability to visualize and navigate network relationships between genes based on correlations from a combination of different functional and genomic data such as phenotypic signatures, gene expression profiles, protein interaction maps, sequence similarity, domain composition, and functional annotations.
-
Dieterich, Christoph, Ahmed, Rina, Sar, Funda, Gunsalus, Kristin, Miska, Erik, Chang, Zisong
[
International Worm Meeting,
2011]
MicroRNA genes (miRNAs) are post-transcriptional regulators of mRNA stability and/or translation and are themselves regulated at the level of gene transcription, processing and decay. MiRNAs regulate many biological processes, and aberrant miRNA expression has been implicated in several disease states. Mirtrons are a special class of miRNAs because they originate from properly sized introns (~70 nt) of protein-coding genes. Precursor miRNAs (pre-miRNAs) of mirtrons are excised by the splicing machinery from the host gene, debranched, and directly processed by Dicer. The expression pattern of a mirtron is consequently similar if not identical to its host gene's expression pattern. Only a few mirtrons have been identified in vertebrates and invertebrates since their discovery in 2007. None of them has been linked to any phenotype so far.
We revisited the repertoire of miRNAs in Caenorhabditis elegans with a multi-platform sequencing approach (ABI SOLiD and Illumina GA II) to screen for novel miRNA gene candidates. Both platforms differ in sequencing bias, which is usually expressed in divergent normalized cross-platform read counts for any given miRNA. Consequently, both platforms complement one another in the gene discovery process. With this approach, we were able to extend the known set from four to six mirtrons. The modENCODE consortium (Chung et al., 2011) has independently confirmed our discovery.
However, one novel mirtron caught our attention and we started a functional characterization of this candidate gene. At the time of writing, we are certain that a knockout of the host gene has an embryonic lethal phenotype and shows greatly reduced levels of hatching worms. The knockout phenotype can be, at least partially, rescued by a mirtron-expressing transgene. Most surprisingly, this mirtron has been acquired recently and is not present in any of the other available Caenorhabditis genomes. We will give an update of our experimental findings at the International Worm Meeting.
References: Chung, W.J. et al. Computational and experimental identification of mirtrons in Drosophila melanogaster and Caenorhabditis elegans. Genome Res. 21, 286-300 (2011).
-
[
International Worm Meeting,
2009]
Close to 1,000 genes have been identified as necessary for proper early embryonic processes in C. elegans. Yet many more genes are known to be expressed that could contribute to these processes. To uncover the structure of the underlying genetic networks during early embryogenesis and to identify new components and pathways that participate in these key events, we set up a system to systematically test for enhancing and suppressing interactions using RNAi. We have used 24 available temperature sensitive (ts) alleles whose strong loss-of-function phenotype affects the early embryo. To select the genes to test against these mutants, we used a bioinformatics approach to enrich for genes that are likely to play a role in early embryogenesis, but have not been identified by feeding RNAi screens. In addition, we enriched our list for homologs of human disease genes. Overall, we have RNAi-tested an average of 500 genes against each ts allele. We have acquired and analyzed over 100,000 RNAi experiments and identified over 560 high confidence genetic interactions. On average, we found ~18 enhancers and ~5 suppressors per essential gene. The combined genetic network connects all the tested essential genes through at least one non-essential gene. Analyzing each set of interactions separately, the network of suppressor interactions is less connected than the enhancer network, in addition to the lower number of interactions overall. Interestingly, many of our positive interaction pairs include at least one human disease gene ortholog that had not been known to work in the embryo before, opening new ways to study these genes. To investigate the interaction network more closely, we used the network navigation tool N-Browse to help navigate these data and integrate them with other available functional data. 
These analyses revealed that most of the observed genetic interactions we identified were not previously predicted or known and, as expected, they do not overlap with the available protein-protein interaction data. The preliminary analysis of our data has revealed new functions of previously uncharacterized genes, uncovered pathways previously not known to operate in the embryo, and shown connections between pathways that are potentially buffering each other. Some of these new pathways implicate functions such as RNA turnover as potential phenotypic capacitors in the early embryo. In addition, we see peroxisome biogenesis as a critical function not previously identified in early embryogenesis. We have confirmed some of our results from high-throughput screening by detailed phenotypic analysis and quantification, and we have also made substantial progress on the automatic scoring of the images produced.
-
[
Development & Evolution Meeting,
2006]
Using combined network analysis of large-scale functional genomic data we mapped multi-protein modules required for distinct processes during early embryogenesis (Gunsalus et al. 2005). A basic question is how these molecular modules are coordinated through the mitotic cycle to ensure the proper unfolding of early developmental events. To identify proteins that could coordinate different modules we searched for proteins that bridged different modules. One such protein, C38D4.3, could be placed in either the nuclear pore complex module or in the chromosome maintenance module by the network clustering algorithm M-CODE (Bader et al. 2003). Consistent with its predicted roles at the nuclear pore and in chromosome segregation, GFP fusions and anti-C38D4.3 immunolocalizations show that C38D4.3 shuttles between the nuclear envelope and the kinetochore during the cell cycle. Functionally, C38D4.3 is required for proper nuclear envelope and chromatin maintenance. C38D4.3(RNAi) embryos, like embryos without nuclear pore components, are incapable of completely separating cytoplasm from nucleoplasm, failing to exclude microtubules and affecting the nuclear localization of PIE-1, a protein normally enriched in the P1 nucleus (Mello, 1996). Additionally, pronuclei fail to meet, and centrosomes do not remain attached to the paternal pronucleus and segregate prematurely. In these embryos, metaphase spindles are not established and chromatin neither condenses, congresses, nor segregates properly. These phenotypes are reminiscent of RNAi phenotypes of genes from the Ran GTPase cycle (Askjaer 2002). Looking for C38D4.3(RNAi) phenotypic neighbors using PhenoBlast (Gunsalus et al. 2004) or phenoclusters from large-scale RNAi analyses (Sonnichsen et al. 2005; Gunsalus et al. 2005) we identified ~25 other genes with similar defects when analyzed by time-lapse Nomarski microscopy. Of these, genes that are part of the Ran GTPase pathway (ran-1, ran-2, and npp-9) were required for proper C38D4.3 localization. Thus C38D4.3 is critical for both mitotic and interphase cell functions and is a likely target of the Ran GTPase pathway.
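The search for proteins that bridge different modules can be illustrated with a small Python sketch. It does not reproduce M-CODE or the analysis in this abstract; the module assignments, protein names, interactions, and the `bridging_proteins` function are hypothetical, and the criterion used (interaction partners in at least two modules) is only one simple way to define a bridge.

```python
from collections import defaultdict

# Hypothetical module assignments and interactions; not data from the study.
MODULES = {
    "p1": "nuclear_pore",
    "p2": "nuclear_pore",
    "p3": "chromosome_maintenance",
    "p4": "chromosome_maintenance",
}
INTERACTIONS = [("x", "p1"), ("x", "p2"), ("x", "p3"), ("y", "p3"), ("y", "p4")]

def bridging_proteins(modules, interactions, min_modules=2):
    """Return proteins whose interaction partners fall in at least `min_modules` modules."""
    touched = defaultdict(set)
    for a, b in interactions:
        # Record the module of each partner, treating interactions as undirected.
        if b in modules:
            touched[a].add(modules[b])
        if a in modules:
            touched[b].add(modules[a])
    return {p: sorted(m) for p, m in touched.items() if len(m) >= min_modules}

bridges = bridging_proteins(MODULES, INTERACTIONS)
print(bridges)
```

In this toy network, protein `x` has partners in both modules and is reported as a bridge, whereas `y` interacts only with members of one module.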
-
Polanowska, Jolanta, Cipriani, Patricia, Mecenas, Desirea, Gunsalus, Kristin, Selbach, Matthias, Chen, Jia-Xuan, Piano, Fabio
[
International Worm Meeting,
2015]
Mapping protein-protein interactions in vivo is instrumental in deciphering the molecular mechanisms underlying animal development. We have developed a new method combining in vivo expressed GFP-tagged proteins with label-free quantitative proteomics to identify protein-protein interactions in developing C. elegans embryos. This strategy is generic and can in principle be used with any GFP-tagged protein. To test our approach we focused on eight proteins involved in essential biological processes during embryogenesis and built a pilot embryo in vivo interaction map comprising 559 interactions among 472 proteins. This network captures known biology and is highly enriched in functionally relevant interactions. We further show the utility of the map by searching for new regulators of P granule formation during embryogenesis. We discovered the worm-specific protein GEI-12 as a novel interaction partner of the DYRK kinase MBK-2 and as an important regulator of P granule dynamics and germline maintenance. This leads us to propose a hypothetical model in which the phosphorylation state of GEI-12 regulates P granule assembly and disassembly during early embryogenesis. Additionally, GEI-12 also induces granule formation in mammalian cells, suggesting that a common mechanism of ribonucleoprotein granule assembly exists in worms and humans. Our results show that in vivo interactome mapping is a powerful and versatile approach that provides unique insights into animal development.
-
Piano, Fabio, West, Sean M., Polanowska, Jola, Reboul, Jerome, Gutwein, Michelle, Gunsalus, Kristin C., Mecenas, Desirea G., Bian, Wenting
[
International Worm Meeting,
2015]
Proper spatio-temporal control of gene activity is vital for C. elegans germline development and maintenance and is determined primarily by regulatory elements within 3'UTRs (Merritt et al., Curr Biol 2008). Because almost half of protein-coding genes in the genome are subject to alternative polyadenylation (Mangone et al., Science 2010; Jan et al., Nature 2011), we are investigating whether the regulatory potential of genes during germline development is controlled by alternative 3'UTR isoform expression. We have established a Low Input 3'-End Sequencing (LITE-Seq) method to simultaneously identify and quantify mRNA transcript abundance and 3'UTR isoforms from small RNA samples, and we have applied it to investigate differences in transcripts and 3'UTR isoforms expressed in oocyte- and sperm-producing germline and in three distinct developmental stages within the hermaphrodite germline (mitosis, early meiosis, and developing oocytes). We observe on a global level that 3'UTRs in sperm-producing germline tend to be shorter than those expressed in oocyte-producing germline, and that 3'UTRs become progressively longer as germ cell nuclei proliferate, enter meiosis, and differentiate into oocytes. We have identified numerous transcripts whose abundance and/or 3'UTR isoforms differ in a sex- or developmental stage-dependent manner. We also detect examples of 3'UTR isoform switching between sexes or developmental stages, including for some genes whose total transcript abundance is similar. To test the idea that individual transcripts may be subject to differential post-transcriptional regulation by selective expression and/or degradation of alternative 3'UTR isoforms at different developmental times, we developed an in vivo assay that reports on the translational regulatory potential of alternative 3'UTR isoforms in the germline.
The reporter construct enables the cloning of two distinct 3'UTR isoforms into a Gateway-compatible, two-color reporter system in which each fluorophore is subject to translational regulation by a single 3'UTR isoform. Using this reporter system, we found that protein expression for several genes identified above is altered in a 3'UTR isoform-dependent manner, and that protein levels vary in different developmental contexts. Future work to identify cis-regulatory elements within the variable regions of 3'UTRs will enable us to assay their relative contributions to specific spatio-temporal expression patterns of known developmental regulators in the germline and to ascertain their functional significance in different developmental processes.
-
Erickson, Katherine, Cipriani, Patricia G., White, Amelia, Piano, Fabio, Gunsalus, Kristin, Kao, Huey-Ling, Reboul, Jerome, Munarriz, Eliana, Lucas, Jessica, Chatterjee, Indrani
[
International Worm Meeting,
2013]
The phenotypes manifested by genetic alleles are influenced by the genetic background in which they reside. Yet, we still have a very limited understanding of how genetic interactions (GIs) influence animal development. The goal of our project is to use genome-wide screens to identify all enhancing and suppressing GIs for a set of strains harboring temperature sensitive (ts) mutations in 24 essential embryonic genes. We have completed over three million primary GI assays and secondary screening of putative suppressors, and we have archived in a database all experimental metadata and images, along with quantitative scoring results from an automated phenotypic scoring algorithm we developed (DevStaR). DevStaR combines computer vision and machine learning methods to count different developmental stages in mixed populations of animals. Using these results we have developed a quantitative phenotypic "GI score" based on the multiplicative model of independence: if the effects of perturbing two genes are independent, then their combined effects should not deviate from the product of their individual effects. GI scores for individual experimental replicates correlate positively with semi-quantitative manual estimates of interaction strength. Using manual inspection as a reference, we devised criteria to combine GI scores across replicates that reliably detect suppressing interactions. We then generated final interaction scores that reflect both strength and reproducibility, which we used to define ~800 high-confidence and ~750 intermediate-confidence suppressing interactions. Based on comparisons with manual scoring, we estimate the false discovery rates in these two sets as 2% and 10%, respectively. The resulting GI network provides the first genome-wide map of suppressing genetic interactions for the embryo based on quantitative phenotypic analysis of viability.
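The multiplicative model of independence behind the "GI score" can be made concrete with a short Python sketch. The abstract does not give the exact formula, so the score below (observed minus expected viability) is an assumption chosen for illustration; the function names and the viability values are likewise hypothetical.

```python
def expected_viability(v_mutant, v_rnai):
    """Expected double-perturbation viability if the two effects are independent."""
    return v_mutant * v_rnai

def gi_score(v_mutant, v_rnai, v_double):
    """Deviation of the observed viability from the multiplicative expectation.

    Positive values suggest suppression (more viable than expected);
    negative values suggest enhancement (less viable than expected).
    """
    return v_double - expected_viability(v_mutant, v_rnai)

# Hypothetical viability fractions for the ts mutant alone, the RNAi alone,
# and the combination of both perturbations.
score_independent = gi_score(0.5, 0.8, 0.4)
score_suppressor = gi_score(0.5, 0.8, 0.7)
score_enhancer = gi_score(0.5, 0.8, 0.1)
print(score_independent, score_suppressor, score_enhancer)
```

Under this assumed definition, a score near zero means the combined effect matches the product of the individual effects, and the sign of a non-zero score separates candidate suppressors from candidate enhancers.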
-
Sontag, Eduardo, Munarriz, Eliana, Cipriani, Patricia G, Kao, Huey-Ling, Piano, Fabio, Gunsalus, Kristin C, Paaby, Annalise, Geiger, Davi, White, Amelia G
[
International Worm Meeting,
2011]
We present an automated image analysis system (DevStaR) for quantitative phenotyping of C. elegans embryonic lethality and sterility phenotypes. Our image analysis system counts each developmental stage in an image of a C. elegans population, allowing efficient high-throughput calculation of C. elegans viability phenotypes. DevStaR is an object recognition machine comprising several hierarchical layers that build successively more sophisticated representations of the objects (developmental stages) to be classified. The algorithm segments the objects, decomposes the objects into parts, extracts features from these parts, and classifies them using a support vector machine (SVM) and global shape information. This enables correct classifications in the presence of complicated occlusions and deformations of the animals. Features of the classified objects are then used to obtain a count of each developmental stage. We are currently using this system to analyze phenotypic data from C. elegans high-throughput genetic screens, and have processed over one million images for lab users so far. Validation of DevStaR measurements will be shown by comparing DevStaR output to both manual counting of developmental stages and manual scores of quantitative phenotypes. DevStaR can provide an accurate measurement of quantitative phenotype and is comparable to manual scoring. DevStaR has been used to score a C. elegans genome-wide RNAi screen with up to 30 repeats per clone tested at up to 5 temperatures per clone. The screen consists of over 600,000 images, each scored by DevStaR. Analysis of these data illustrates the convenience of DevStaR scoring and the use of a quantitative phenotype. Our system overcomes a previous bottleneck in image analysis by achieving near real-time scoring of image data in a fully automated manner. Our system reduces the need for human evaluation of images and provides rapid quantitative output that is not feasible at high throughput by manual scoring.