-
Rose, Ann, Marra, Marco, Baillie, David, Chu, Jeffrey S C, Jones, Steven, Johnsen, Bob, Tu, Domena
[
International Worm Meeting,
2011]
We are investigating the role of essential genes in C. elegans. Several essential genes have been characterized by the C. elegans community, and the published literature shows that essential genes are of great interest, not only because of their biological importance to the development of C. elegans, but also because of their relevance to human health. Initial approaches to identifying essential genes involved screening for lethal mutations (lethals) using genetic balancers. This approach has yielded thousands of mutations defining more than 500 essential genes. We have mapped mutations to chromosomes I, III, IV and V. We aligned genetic mutations with physical coding regions in order to identify the molecular basis of the lethals. This involved positioning the lethals by three-factor mapping and complementation to deletions and duplications. Subsequently, cosmid and fosmid transgenic rescues were used to identify candidate coding regions. Finally, PCR analysis and DNA sequencing confirmed the coding region containing a lethal mutation. This is a labour-intensive, long-term project.
Currently we are using whole genome sequencing to identify the coding regions corresponding to essential genes. The facts that the mutations are precisely mapped and that there is, in most cases, more than one allele per complementation group make the identification of the mutation for a given lethal strain relatively easy. The cost of Illumina sequencing and subsequent bioinformatics analysis makes this approach very competitive. Here we present the results of our initial sequencing of chromosome I lethals balanced by sDp2 and propose that sequencing the entire essential gene collection is feasible. The sDp2 region covers the left half of chromosome I and contains approximately 1350 predicted coding regions. We have identified 237 of these as essential genes by lethal analysis. Statistical analysis predicts about 400 (Johnsen R. et al., Mol Gen Genet, 2000) and RNAi has identified 409 coding regions resulting in lethal phenotypes (WormBase WS223). This may be an upper limit on the number of essential genes in the sDp2 region because RNAi can knock out whole families of closely related genes, whereas knocking out a single member of a family may not be lethal. Correlating the lethal mutations with their corresponding coding regions would greatly increase the genetic information and tools available for analysis of the essential biology of C. elegans.
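The reasoning that precise mapping plus multiple alleles per complementation group narrows the search can be illustrated with a minimal sketch. It assumes each sequenced strain has already been reduced to a list of (position, gene) variant calls; the function name, the allele and gene names, and the coordinates are all hypothetical, and the abstract does not describe the actual analysis pipeline.

```python
def candidate_genes(variants_by_allele, interval):
    """Return genes that carry a variant inside the mapped interval in every allele.

    variants_by_allele: dict mapping allele name -> list of (position, gene) tuples.
    interval: (start, end) of the genetically mapped region, inclusive.
    """
    start, end = interval
    genes_per_allele = []
    for variants in variants_by_allele.values():
        # Keep only genes hit by a variant that falls inside the mapped interval.
        hit = {gene for pos, gene in variants if start <= pos <= end}
        genes_per_allele.append(hit)
    if not genes_per_allele:
        return set()
    # A true causal gene should be hit in every allele of the complementation group.
    return set.intersection(*genes_per_allele)


# Toy data (hypothetical): two alleles of one complementation group.
variants_by_allele = {
    "alleleA": [(1200, "geneX"), (5300, "geneY"), (9100, "geneZ")],
    "alleleB": [(1350, "geneX"), (7000, "geneW")],
}
result = candidate_genes(variants_by_allele, (1000, 8000))
print(result)
```

With a single allele the interval filter alone still leaves several candidates (geneX and geneY for alleleA here); requiring a hit in every allele is what reduces the list to one gene in this toy case.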
-
[
International Worm Meeting,
2009]
Since the publication of the C. briggsae genome annotation in 2003 [1], few improvements have been made, although accumulating evidence suggests that many gene models are inaccurately predicted or missing. In this project, we have reannotated the C. briggsae genome, exploiting the much improved C. elegans genome annotation (using WS195, compared to the WS77 annotation used for the original C. briggsae annotation), as well as a new homology-based gene finder we have developed, genBlastG. genBlastG builds on our recently published program genBlastA [2]; it takes as input query protein sequences and a genome sequence to be annotated, and produces all homologous gene models. Our analysis suggests that genBlastG outperforms GeneWise in both processing time (on average genBlastG runs ~50 times faster than GeneWise) and accuracy. We applied genBlastG to reannotate the C. briggsae genome. Our preliminary results from genBlastG produced 16,954 homologous models, with the majority (11,235) matching well with the current WormBase annotation. However, a significant number (4,828) of genBlastG models exhibit better percent identity (PID) to the C. elegans query protein sequences. Thus, genBlastG models show better homology to C. elegans models for many genes. In addition to better homology, our predictions also point out 261 WormBase models that should be split and 298 pairs of models that should be merged. As an example of models that should be merged, we found that CBG14800 and CBG14801 may actually be one gene that is orthologous to C54G7.3a. CBG14801 may only represent the shorter isoform that is orthologous to C54G7.3b. As an example of a model that should be split, we found that CBG00366 consists of orthologs of ZK550.3 and ZK550.4. In this presentation, I will summarize all improvements suggested by genBlastG. genBlastG will also be applied to predict gene models in other Caenorhabditis species.
1. Stein, L.D., et al. (2003). The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. PLoS Biol 1: E45. 2. She, R., et al. (2009). GenBlastA: enabling BLAST to identify homologous gene sequences. Genome Res 19: 143-9.
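The split and merge cases in the abstract above can be illustrated with a small sketch based only on coordinate overlap. This is an assumption made for illustration, not a description of how genBlastG works: an existing model that overlaps two or more homology-based predictions is flagged as a split candidate, and a prediction that overlaps two or more existing models flags those models as merge candidates. All model names and coordinates below are hypothetical.

```python
def overlaps(a, b):
    """True when two closed intervals (start, end) share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def split_and_merge_candidates(existing, predicted):
    """Compare two sets of gene models given as dicts of name -> (start, end).

    Returns (split, merge):
      split: existing model -> predicted models it spans (two or more).
      merge: predicted model -> existing models it spans (two or more).
    """
    split = {}
    for name, span in existing.items():
        hits = sorted(p for p, pspan in predicted.items() if overlaps(span, pspan))
        if len(hits) >= 2:
            split[name] = hits
    merge = {}
    for name, pspan in predicted.items():
        hits = sorted(e for e, span in existing.items() if overlaps(span, pspan))
        if len(hits) >= 2:
            merge[name] = hits
    return split, merge


# Toy data (hypothetical): all models on one strand of one contig.
existing = {"modelA": (100, 900), "modelB": (1500, 1900), "modelC": (2000, 2600)}
predicted = {"pred1": (100, 450), "pred2": (500, 900), "pred3": (1500, 2600)}

split, merge = split_and_merge_candidates(existing, predicted)
print(split)
print(merge)
```

A real comparison would also need to account for strand, exon structure and the query protein behind each prediction; overlap alone is only a first filter.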
-
Baillie, David, Uyar, Bora, Trinh, Joanne, Chen, Nansheng, Johnsen, Bob, Chu, Jeffrey S C, Wang, Jun, Tu, Domena, Tarailo-Graovac, Maja
[
International Worm Meeting,
2011]
RFX transcription factors play important roles in cilia biogenesis and maintenance by transcriptionally regulating ciliary genes. Their target genes have been associated with a number of disease conditions collectively called ciliopathies.
daf-19 is the only known member of the RFX family in C. elegans. RFX regulation of ciliary genes is conserved between humans and C. elegans. The DNA binding site, known as the X-box motif, is also conserved. Thus, C. elegans has been effectively used as a model organism to identify RFX target genes. Past studies have identified target genes by searching for X-box motifs in C. elegans using bioinformatics, functional genomics, and comparative genomics methods. All of the DAF-19 target genes studied to date possess a single consensus X-box motif. Some genes (bbs-2 and osm-5) were found to possess two X-box motifs. We hypothesize that tandem X-box motifs could have cooperative roles. To test this hypothesis, we compared the gene sets of 4 Caenorhabditis species to find C. elegans genes with multiple X-box motifs within a 500-bp promoter region. The C. elegans gene set is extensively curated, but this is not the case for the remaining Caenorhabditis species. In order to employ comparative genomics for X-box motif searches, we improved the gene sets for 3 Caenorhabditis species using genBlastG, a homology-based gene predictor that we recently developed. Using comparative genomics with the improved gene sets, we identified 15 genes that have conserved X-box motifs in all species and have multiple X-box motifs in C. elegans. We examined one gene, F25B4.2, in detail. Using singly integrated reporter constructs, we have shown that F25B4.2 is expressed in ciliated neurons. This expression is dependent on the two 15-bp X-box motifs as well as on DAF-19, indicating that F25B4.2 is a DAF-19 target gene. When the proximal motif is removed, expression in ciliated neurons is ablated. When the distal motif is removed, we observe elevated expression, suggesting that the distal motif has a repressive role. This is the first report of a putative repressive X-box motif in C. elegans. Our data suggest that the two X-box motifs cooperate to regulate the specific expression level of F25B4.2. We propose a model in which multiple X-box motifs in a promoter achieve a specific expression level. Our identification of X-box motifs will improve our understanding of RFX-mediated regulation in C. elegans and in other organisms, including humans.
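The search for genes with multiple motifs in a 500-bp promoter region can be sketched as a simple scan. The pattern below is a toy placeholder, not the X-box consensus, and the function names, gene names and sequences are hypothetical. The sketch also assumes each promoter sequence is given 5' to 3' and ends at the start codon, and it omits the cross-species conservation step.

```python
import re


def motif_positions(promoter, pattern):
    """Start offsets of all (possibly overlapping) matches of pattern in promoter."""
    # A lookahead lets the regex engine report matches that overlap each other.
    lookahead = re.compile(f"(?=({pattern}))")
    return [m.start() for m in lookahead.finditer(promoter.upper())]


def genes_with_multiple_motifs(promoters, pattern, window=500, min_hits=2):
    """Return gene -> motif offsets for genes with at least min_hits motifs.

    promoters: dict of gene name -> upstream sequence ending at the start codon.
    window: number of bases upstream of the start codon to scan.
    Offsets are relative to the start of the scanned window.
    """
    selected = {}
    for gene, seq in promoters.items():
        region = seq[-window:]  # the `window` bases closest to the start codon
        hits = motif_positions(region, pattern)
        if len(hits) >= min_hits:
            selected[gene] = hits
    return selected


# Toy placeholder pattern and sequences (hypothetical, for illustration only).
TOY_PATTERN = r"AAGT[CT]G"
promoters = {
    "geneA": "TTTT" + "AAGTCG" + "CCCC" + "AAGTTG" + "GG",
    "geneB": "TTTT" + "AAGTCG" + "TTTT",
}
result = genes_with_multiple_motifs(promoters, TOY_PATTERN)
print(result)
```

Recording the offsets rather than just a count keeps the proximal and distal motifs distinguishable, which matters when the two sites are later tested separately.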
-
[
International Worm Meeting,
2007]
How genes are organized in a genome is not well understood. The recent establishment of about 2,000 transgenic Caenorhabditis elegans strains that express green fluorescent protein (GFP) driven by endogenous promoters, and the subsequent identification of high-resolution expression profiles (Hunt-Newbury et al., submitted), has provided a solid platform for investigating this problem. We hypothesize that organ- and tissue-specific genes are not randomly distributed in a genome and are instead clustered into groups. Previous work using microarrays has suggested that genes expressed in muscle, intestine, and germ line are non-randomly clustered in the C. elegans genome. In this study, we explore whether such clustering is generally evident across organ systems and tissues. The advantage of using the GFP expression data for this project is the high resolution and high sensitivity of expression pattern identification across many tissues. We analyzed expression profiles at both the organ level and the tissue level. At the organ level, we found that genes expressed in almost all organ systems are significantly clustered at various intervals. Furthermore, different organs show significant clustering at different ranges of intervals, suggesting that different mechanisms may underlie clustering in different organs. At the tissue level, we found that genes expressed in 17 out of 48 tissues exhibit significant clustering at different ranges of intervals, with some tissues clustering more narrowly. In addition to the clustered distribution of organ- and tissue-specific genes in the C. elegans genome, we also found with this data set that some tissue types are correlated because of shared expressed genes. As expected, different types of muscle tissue are correlated and cluster together in a correlation tree, as do many neuronal cell types.
Unexpectedly, we also found that certain groups of muscle cells are more correlated with certain groups of neuronal cells, suggesting that the shared genes may play important roles in functionally specialized tissues. Finally, we examined the similarity of the expression profiles of different genes by digitizing the expression profile of each gene. Such similarity in gene expression profiles may be able to guide functional prediction and annotation of uncharacterized genes.
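One way to picture a digitized expression profile is as the set of tissues in which a gene's GFP reporter is observed, which allows profiles to be compared pairwise. The sketch below uses the Jaccard index as the similarity measure; this choice is an assumption, since the abstract does not state which measure was used, and the gene names and tissue assignments are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tissue sets: shared tissues / all tissues seen."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy digitized profiles (hypothetical): gene -> tissues with reporter expression.
profiles = {
    "gene1": {"body wall muscle", "pharynx"},
    "gene2": {"body wall muscle", "pharynx", "intestine"},
    "gene3": {"intestine"},
}

sim_12 = jaccard(profiles["gene1"], profiles["gene2"])
sim_13 = jaccard(profiles["gene1"], profiles["gene3"])
print(sim_12, sim_13)
```

Under this measure an uncharacterized gene would be compared against annotated genes, and the closest profiles would suggest candidate functions.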
-
[
Worm Breeder's Gazette,
1994]
Specification of vulval cell fates by sequential signaling pathways Jeffrey S. Simske and Stuart K. Kim, Dept. of Developmental Biology, Stanford University Medical Center, Stanford CA 94305
-
[
Worm Breeder's Gazette,
1994]
lin-36, a Class B Synthetic Multivulva Gene, Encodes a Novel Protein Jeffrey H. Thomas and H. Robert Horvitz, HHMI, Dept. Biology, MIT, Cambridge, MA 02139, USA
-
[
International Worm Meeting,
2013]
Mitomycin C (MMC) is a DNA crosslinking agent used clinically as a diagnostic for the genome instability syndrome Fanconi anemia, and also as a chemotherapeutic agent to treat a wide range of cancers. MMC covalently interacts with guanines in GC-rich segments of DNA, resulting in inter- and intrastrand DNA crosslinks, as well as DNA adducts. These interactions can cause DNA lesions that inhibit critical cellular processes such as replication and, if unrepaired, lead to a range of chromosomal lesions. Although MMC is widely used, the prevalence and range of genomic instability it causes have not been characterized. The nematode Caenorhabditis elegans is a well-characterized genetic model in which to study genomic damage generated by exposure to crosslinking agents. In this context, we have performed forward genetic screens, determined the forward mutation frequency at different doses of MMC, and isolated a number of lethal mutations for study. The lethal mutations were isolated and maintained as heterozygotes using hT2, a genetic balancer involving a reciprocal translocation between chromosomes I and III. We used three-factor mapping to narrow down the location of over 60 MMC-induced lethal mutations on chromosomes I and III. DNA from the genetic strains carrying the MMC-induced mutations has been sequenced using next-generation sequencing. As a consequence of the precise mapping of the lethal mutations, we can efficiently identify the physical basis of the lesions. We are in the process of characterizing the spectrum of mutagenic lesions caused by MMC, examining both their nature and frequency. Identification of the types of DNA lesions caused by MMC not only furthers our understanding of this widely used drug, but may also lend insight into the molecular mechanisms required for their repair.
-
[
International Worm Meeting,
2013]
C. elegans has a simplified FA pathway and as such has proven to be an invaluable model for the discovery and investigation of conserved FA-associated genes. We have previously identified DOG-1 as the functional ortholog of the human FANCJ helicase. Animals mutant for a null allele, dog-1(gk10), share many of the hallmarks of FA cells, such as increased genomic instability and sensitivity to interstrand crosslinks (ICLs). Although our understanding of DOG-1/FANCJ is growing, its precise function in ICL repair remains elusive. We are exploiting C. elegans genetics in combination with whole-genome sequencing (WGS) to investigate the relationship between genome stability and ICL sensitivity in DOG-1/FANCJ-deficient animals. We have sequenced genomes derived from a sequential line of animals that have accrued mutations over a defined number of generations under standard laboratory conditions. From this analysis we are able to identify copy number variations (CNVs), including small deletions, and single nucleotide variations (SNVs). In the dog-1(gk10) genomes we find an accumulation of deletions in G-rich DNA, as expected. We are currently expanding on this analysis to assess the genomes of dog-1 mutants that have been treated with a variety of mutagenic agents, including those known to induce ICLs. We have demonstrated the feasibility of using WGS to assess genomic instability in dog-1 genomes at high resolution. Assessing the extent and type of genomic damage in ICL-treated genomes will lead to a better understanding of how the function of dog-1/FANCJ in maintaining G-rich DNA is correlated with its ability to prevent sensitivity to ICL-inducing agents. Our findings have potential significance for the informed use of chemotherapeutic agents against cancers arising in FANCJ patients.
-
[
MicroPubl Biol,
2021]
During the process of cell differentiation, specific cytoskeletal proteins can sequentially assemble into a wide variety of diverse molecular superstructures. Nematode spermatogenesis provides a powerful system for studying these transitions since sperm-specific transcription ceases prior to the meiotic divisions and translation ceases shortly thereafter (Chu and Shakes, 2013). Therefore, structural transitions that follow the meiotic divisions must be carried out by the remodeling of already synthesized proteins. The Major Sperm Protein (MSP) is a nematode-specific cytoskeletal element whose polymerization dynamics drive the pseudopod-based motility of the activated sperm (Roberts, 2005). In C. elegans, MSP additionally functions as the extracellular signaling molecule for triggering both ovulation and oocyte maturation (Miller et al., 2003). MSP is highly abundant in sperm, where it reaches 10-15% of total and 40% of soluble cellular protein (Roberts, 2005). Within developing spermatocytes, MSP is packaged into fibrous body-membranous organelle (FB-MO) complexes (Fig. 1A; Roberts et al., 1986). By assembling into paracrystalline FBs, MSP is both sequestered away from the critical meiotic processes of chromosome segregation and cytokinesis and packaged for efficient segregation into spermatids during the post-meiotic partitioning process (Chu and Shakes, 2013; Nishimura and L'Hernault, 2010; Price et al., 2021). Following the meiotic divisions and sperm individualization, FBs disassemble, and MSP disperses as dimers throughout the spermatid cytoplasm (Fig. 1A). When sperm activate to form motile spermatozoa, MSP polymerization within the pseudopod drives the motility of the crawling sperm (Chu and Shakes, 2013).
Thus, MSP exists in at least three distinct molecular states: 1) in highly organized paracrystalline FBs within developing spermatocytes, 2) as unpolymerized dimers within spermatids, and 3) in dynamically polymerizing filaments and fibers within crawling spermatozoa.
-
[
International Worm Meeting,
2013]
The growth of an organism is the result of a complex and highly regulated developmental program, and changes in environmental conditions can alter the expression of key genes, thereby altering developmental pathways and resulting in differences in phenotype. For example, when faced with overcrowding, high temperatures or starvation, late L2 C. elegans larvae develop into an alternate L3 stage known as the dauer diapause, which has altered morphology, prolonged life span and increased stress resistance. In addition to dauer formation, the free-living nematode C. elegans utilizes several strategies to deal with a fluctuating food supply in the soil, such as Adult Reproductive Diapause and L1-diapause. Specifically, L1-diapause is induced when embryos hatch in the absence of food. These L1 larvae do not appear to undergo any morphogenetic changes, and are capable of re-entering the developmental pathway when placed back on food. In this study, our aim was to identify essential genes that are required for starvation-induced L1-diapause animals to fully recover from starvation and re-enter the developmental pathway. Essential genes are those required for proper growth into a fertile adult, and mutations in them lead to a range of lethal phenotypes such as embryonic lethality (emb), larval arrest, adult sterility and maternal-effect lethality (mel). Hence, this study was carried out via a genetic screen of lethal mutants to identify those in which L1-diapause followed by recovery on food (i) had no effect on the arrest stage, (ii) resulted in an earlier arrest stage, indicating premature arrest, or (iii) resulted in a later arrest stage, indicating extended survival. In addition, to further understand their roles in L1-diapause recovery, whole-genome sequencing of the heterozygous, genetically balanced strains was carried out to determine their molecular identities, and in 59/75 strains we were able to ascertain the specific molecular lesions.