[International C. elegans Meeting, 2001]
The completion of the C. elegans genome sequence has identified nearly all of the genes in the genome (19,282 genes), but the function of most of these genes remains mysterious. A scant 6% of them have been studied using classical genetic or biochemical approaches (1135 genes), and only about 53% show homology to genes in other organisms (10,303 genes). The next challenge is to develop high-throughput functional genomics procedures to study many genes in parallel in order to elucidate gene function on a global scale. One such approach is to use DNA microarrays to assay the relative expression of nearly every gene in the genome between two samples. Knowing when, where and under which conditions a particular gene is expressed can reveal the function of that gene. A consortium of laboratories has used C. elegans DNA microarrays to profile gene expression changes in a wide variety of experiments. In each experiment, RNA from one sample was used to generate Cy3-labelled cDNA, and RNA from another sample was used to prepare Cy5-labelled cDNA. The two cDNA probes were simultaneously hybridized to a single DNA microarray and the ratio of the Cy3 to Cy5 hybridization intensities was measured, revealing which genes were relatively enriched in either RNA sample. Thirty different laboratories have collectively performed 553 experiments using these C. elegans DNA microarrays, including 179 experiments with microarrays containing 11,917 genes (63% of the genome) and 370 experiments with microarrays containing 17,817 genes (94% of the genome). The experiments compare RNA between mutant and wild-type strains, or between worms grown under different conditions. Many experiments have been done to date, including experiments on wild-type development, heat shock, Ras signaling, aging, the dauer stage, sex regulation and germ line gene expression. Individual microarray experiments reveal sets of genes that change in one mutant strain or growth condition.
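The two-color comparison described above can be illustrated with a minimal sketch. It assumes background-subtracted intensities and reports the Cy3/Cy5 ratio on a log2 scale; the gene names, intensity values, and the intensity floor are hypothetical, and the abstract does not state which normalization the consortium applied.

```python
import math

def log2_ratio(cy3, cy5, floor=1.0):
    """Return log2(Cy3/Cy5) for one spot.

    Intensities are floored (an assumption of this sketch) so that
    a zero or negative background-subtracted value cannot break the log.
    """
    return math.log2(max(cy3, floor) / max(cy5, floor))

# Hypothetical background-subtracted intensities: gene -> (Cy3, Cy5).
spots = {
    "geneA": (4000.0, 1000.0),  # enriched in the Cy3-labelled sample
    "geneB": (500.0, 2000.0),   # enriched in the Cy5-labelled sample
    "geneC": (1500.0, 1500.0),  # unchanged
}

ratios = {gene: log2_ratio(cy3, cy5) for gene, (cy3, cy5) in spots.items()}
for gene, value in sorted(ratios.items()):
    print(f"{gene}\t{value:+.2f}")
```

On a log2 scale, positive values mark genes enriched in the Cy3-labelled sample and negative values mark genes enriched in the Cy5-labelled sample, so equal-sized changes in either direction are symmetric around zero.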
We combined the data from all of the experiments to assemble a gene expression database, and then used this database to group co-regulated genes together and visualize them in a three-dimensional expression map. By matching the expression profile of an unknown gene to those of genes with known functions, the expression map can be used to ascribe functions to the large fraction of genes in the genome whose functions were previously unknown.
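As a rough illustration of matching an unknown gene's expression profile to those of annotated genes, the sketch below ranks annotated genes by Pearson correlation across experiments. Pearson correlation is an assumption made here for illustration; the abstract does not specify the similarity measure or clustering procedure behind the expression map, and all gene names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / (sd_x * sd_y)

def best_match(query, annotated):
    """Return the annotated gene whose profile correlates best with the query."""
    return max(annotated, key=lambda gene: pearson(query, annotated[gene]))

# Hypothetical log2 ratios across four experiments.
annotated = {
    "heat_shock_gene": [0.1, 2.5, 0.0, -0.2],
    "germline_gene": [1.8, 0.0, -1.5, 2.0],
}
unknown = [0.0, 2.1, 0.2, -0.1]

closest = best_match(unknown, annotated)
print(closest)
```

A single best match is only a starting hypothesis; in practice a function would be inferred from the group of genes surrounding the unknown gene rather than from one neighbor.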
[International C. elegans Meeting, 2001]
mRNA transcripts that contain premature termination codons are degraded in a regulated manner by a specific decay pathway termed mRNA surveillance or nonsense-mediated mRNA decay (NMD). mRNA surveillance has been described in all eukaryotes tested, including plants, flies, yeast, nematodes, and mammals. Loss-of-function mutations affecting any of seven smg genes eliminate mRNA surveillance in C. elegans. While much progress has been made in defining the cis-acting elements and trans-acting factors necessary for mRNA surveillance, less is known about the mRNAs produced during the normal course of gene expression that serve as substrates of NMD. To systematically investigate these natural targets of NMD, we have probed high-density microarrays of C. elegans cDNA clones. These arrays, made in Stuart Kim's lab at Stanford University, consist of PCR fragments from 17,871 genes currently annotated in the C. elegans genome (Jiang et al., 2001). We performed nine different hybridizations, each comparing expression profiles of poly(A)+ mRNA from smg(-) and smg(+) samples. The microarray screen identified 402 mRNAs whose abundance is increased 2-fold or more in smg(-) mutants. Several observations suggest that microarrays are a reliable method for identifying natural targets of mRNA surveillance: 1. Control mRNAs known previously to be natural targets of mRNA surveillance (rpl-12 and srp-20) were consistently elevated in the microarray screen. 2. Both northern blots and quantitative PCR analyses confirm the microarray results. These results indicate that the microarray screen will be useful for the systematic investigation of the natural targets of NMD in C. elegans. We are presently analyzing smg(-)-affected mRNAs to establish whether they are direct or indirect targets of NMD, and to understand why their abundance is increased in smg(-) mutants. We anticipate that analysis of the ~400 candidate genes will identify broad categories of mRNAs produced in wild-type animals yet targeted for degradation by NMD and, possibly, new principles of post-transcriptional regulation of gene expression.
Jiang M, Ryu J, Kiraly M, Duke K, Reinke V, Kim SK. Genome-wide analysis of developmental and sex-regulated gene expression profiles in Caenorhabditis elegans. Proc. Natl. Acad. Sci. USA (2001) 98(1):218-23.
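The 2-fold criterion used in the screen can be sketched as a simple filter over replicate hybridizations. This is a minimal illustration that assumes log2 smg(-)/smg(+) ratios are averaged across hybridizations before thresholding; the abstract does not say how the nine hybridizations were combined, and the mRNA names and values below are hypothetical.

```python
import math
from statistics import mean

def elevated_mrnas(log2_ratios, min_fold=2.0):
    """Return mRNAs whose mean log2 smg(-)/smg(+) ratio meets the fold cutoff."""
    threshold = math.log2(min_fold)  # a 2-fold increase is 1.0 on a log2 scale
    return sorted(
        name for name, values in log2_ratios.items() if mean(values) >= threshold
    )

# Hypothetical log2 ratios from three of the replicate hybridizations.
hybridizations = {
    "mRNA_a": [1.4, 1.1, 1.6],
    "mRNA_b": [0.2, -0.1, 0.3],
    "mRNA_c": [2.0, 2.3, 1.9],
}

candidates = elevated_mrnas(hybridizations)
print(candidates)
```

Averaging on the log scale keeps one outlying hybridization from dominating the call; a stricter variant could additionally require the cutoff to be met in most individual hybridizations.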
[International Worm Meeting, 2003]
CeTwist, the product of the hlh-8 gene, is a member of the basic helix-loop-helix (bHLH) family of transcription factors and functions as a heterodimer with CeE/DA, the product of the hlh-2 gene (1). To reveal the complete pathway of CeTwist function in C. elegans mesoderm development, it will ultimately be necessary to identify all downstream target genes of hlh-8. In addition, mutations in some human genes with homologs in the CeTwist pathway are thought to cause several syndromes of human craniosynostotic disease. We predict that the CeTwist target genes found in our work might help identify the genetic defects underlying other syndromes of human craniosynostotic disease. Using cDNA microarrays, it is possible to observe global changes in the expression of nearly every mRNA during development (2). Since hlh-8 is expressed in at most 2% of cells at any given time during development, it may be difficult to obtain meaningful data using DNA microarrays by comparing mRNA from hlh-8 null animals to mRNA from wild-type animals. To circumvent this difficulty, we will overexpress both hlh-8 and hlh-2 from heat shock promoters and compare mRNA isolated from this strain to mRNA isolated from isogenic wild-type animals. We have already built the following strains: an experimental group containing pRF4, pHS::hlh-8, and pHS::hlh-2, and a control group containing pRF4 only. All of the constructs have been integrated into chromosomes by irradiation. By using RT-PCR to investigate the kinetics of the known target gene arg-1 under heat shock treatment, we have found that a 20 min heat shock at 33°C followed by a 40 min recovery period at room temperature is the optimal experimental condition for observing target gene expression. We expect that unknown targets will be expressed under these conditions as well. We are now scaling up to collect enough mRNA from synchronized L2 larvae under our experimental conditions to perform cDNA microarray experiments.
1. Harfe, B. D., Gomes, A. V., Kenyon, C., Liu, J., Krause, M. and Fire, A. (1998). Analysis of a Caenorhabditis elegans Twist homolog identifies conserved and divergent aspects of mesodermal patterning. Genes Dev 12, 2623-2635.
2. Jiang, M., Ryu, J., Kiraly, M., Duke, K., Reinke, V. and Kim, S. K. (2001). Genome-wide analysis of developmental and sex-regulated gene expression profiles in Caenorhabditis elegans. Proc Natl Acad Sci USA 98, 218-223.
[West Coast Worm Meeting, 2004]
Regulatory motifs are short sequences of DNA that regulate the level, timing, and location of gene expression. Identifying these motifs and their functions is crucial to our understanding of gene regulation and disease processes. We developed CompareProspector, a motif-finding program that takes advantage of cross-species sequence comparison to identify putative regulatory motifs from sets of co-regulated genes [1]. We applied CompareProspector to 30 sets of genes with very similar patterns of expression, identified from the C. elegans topomap [2] and individual DNA microarray experiments. The statistical significance of each candidate motif was evaluated using criteria such as motif enrichment (the ratio of the prevalence of the motif in a given set of promoters to its prevalence elsewhere in the genome) and the expression coherence of genes with the motif. We identified twelve significant regulatory motifs, three of which are supported by published evidence as true regulatory motifs. Overall, these twelve motifs are found in the upstream regulatory regions of 2970 different genes, and may be involved in gene regulation in 24 clusters of co-expressed genes. The first known motif, with the consensus TGATAA, matches the consensus of known binding sites for GATA factors. As GATA factors are known to be involved in worm intestine development [3] and hypodermis development, it is not surprising that the GATA motif is identified from a set of intestine-specific genes (F. Pauli, unpublished), mount08 of the topomap, which is enriched in genes from the intestine, and several collagen-related datasets (mount14, 17, and 35 of the topomap). We correctly identified GATA sites in the promoters of genes known to be regulated by GATA factors. Interestingly, the GATA motif is also identified from several data sets involved in the aging process.
This result parallels that of Murphy and colleagues, who independently identified this motif from their data set of DAF-16 target genes [4]. Both our result and that of Murphy and colleagues suggest that GATA factors may be involved in worm aging. Motif 2, which is identified in the two heat shock-related data sets, matches the consensus of known binding sites for heat shock factors [5]. Motif 3 matches the consensus of heat shock associated sites (HSAS), a motif that was first predicted computationally to be involved in the heat shock process [6] and later experimentally validated to be involved in the ethanol stress response (14th International C. elegans Conference abstract 1113C). We are currently validating the rest of the motifs and their individual binding sites using mutagenesis studies of promoters with predicted motifs.
1. Liu, Y., Liu, X.S., Wei, L., Altman, R.B. and Batzoglou, S. (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res. 14, 451-8.
2. Kim, S.K., Lund, J., Kiraly, M., Duke, K., Jiang, M., Stuart, J.M., Eizinger, A., Wylie, B.N. and Davidson, G.S. (2001) A gene expression map for Caenorhabditis elegans. Science 293, 2087-92.
3. Maduro, M.F. and Rothman, J.H. (2002) Making worm guts: the gene regulatory network of the Caenorhabditis elegans endoderm. Dev Biol. 246, 68-85.
4. Murphy, C.T., McCarroll, S.A., Bargmann, C.I., Fraser, A., Kamath, R.S., Ahringer, J., Li, H. and Kenyon, C. (2003) Genes that act downstream of DAF-16 to influence the lifespan of Caenorhabditis elegans. Nature 424, 277-83.
5. Amin, J., Ananthan, J. and Voellmy, R. (1988) Key features of heat shock regulatory elements. Mol Cell Biol. 8, 3761-9.
6. GuhaThakurta, D., Palomar, L., Stormo, G.D., Tedesco, P., Johnson, T.E., Walker, D.W., Lithgow, G., Kim, S. and Link, C.D. (2002) Identification of a novel cis-regulatory element involved in the heat shock response in Caenorhabditis elegans using microarray gene expression and computational methods. Genome Res. 12, 701-12.
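The motif enrichment criterion defined in this abstract, the ratio of a motif's prevalence in a set of promoters to its prevalence elsewhere in the genome, can be sketched as follows. This is only an illustration of that ratio for an exact consensus match on either strand; it is not CompareProspector's scoring, and the promoter sequences are hypothetical toy strings.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def has_motif(promoter, motif):
    """True if the motif occurs on either strand of the promoter."""
    promoter = promoter.upper()
    return motif in promoter or reverse_complement(motif) in promoter

def prevalence(promoters, motif):
    """Fraction of promoters that contain the motif at least once."""
    if not promoters:
        return 0.0
    return sum(has_motif(p, motif) for p in promoters) / len(promoters)

def motif_enrichment(cluster, background, motif):
    """Prevalence in the co-regulated set divided by prevalence elsewhere."""
    background_prevalence = prevalence(background, motif)
    if background_prevalence == 0:
        return float("inf")  # motif absent from the background set
    return prevalence(cluster, motif) / background_prevalence

# Hypothetical promoter fragments; real upstream regions are far longer.
cluster = ["ccTGATAAgg", "aaTTATCAcc", "ggggcccc"]
background = ["acgtacgtac", "ttTGATAAtt", "cccccccccc", "gagagagaga"]

enrichment = motif_enrichment(cluster, background, "TGATAA")
print(f"{enrichment:.2f}")
```

Here the second cluster promoter is counted because it carries TTATCA, the reverse complement of the GATA consensus TGATAA. A real analysis would typically score degenerate matches with a position weight matrix and attach a significance estimate to the ratio.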