[WormBook, 2006]
Throughout the C. elegans sequencing project, Genefinder was the primary protein-coding gene prediction program. Its initial predictions were manually reviewed by curators as part of a "first-pass annotation" and are now actively curated by WormBase staff using a variety of data and information. In the WormBase data release WS133 there are 22,227 protein-coding genes, including 2,575 alternatively spliced forms. Twenty-eight percent of these have every base of every exon confirmed by transcription evidence, while an additional 51% have some bases confirmed. Most of the genes are relatively small, covering a genomic region of about 3 kb. The average gene contains 6.4 coding exons, and coding exons together account for about 26% of the genome. Most exons are small and separated by small introns. The median size of exons is 123 bases, while the most common size for introns is 47 bases. Protein-coding genes are denser on the autosomes than on chromosome X, and denser in the central region of the autosomes than on the arms. There are only 561 annotated pseudogenes, but several estimates put the true number much higher.
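The confirmation percentages above can be turned into approximate gene counts. The following is a minimal sketch in Python, not part of the original chapter. It assumes that both percentages apply to the full set of 22,227 protein-coding genes and that the remaining 21% have no bases confirmed by transcription evidence. The function and variable names are illustrative only.

```python
# Figures quoted in the abstract above (WormBase release WS133).
TOTAL_PROTEIN_CODING = 22_227   # includes alternatively spliced forms
FULLY_CONFIRMED_PCT = 28        # every base of every exon confirmed
PARTLY_CONFIRMED_PCT = 51       # some bases confirmed


def confirmation_breakdown(total, full_pct, partial_pct):
    """Return approximate gene counts per confirmation class.

    Assumption: whatever is neither fully nor partly confirmed
    has no transcript confirmation at all.
    """
    none_pct = 100 - full_pct - partial_pct
    return {
        "fully_confirmed": round(total * full_pct / 100),
        "partly_confirmed": round(total * partial_pct / 100),
        "unconfirmed": round(total * none_pct / 100),
    }


breakdown = confirmation_breakdown(
    TOTAL_PROTEIN_CODING, FULLY_CONFIRMED_PCT, PARTLY_CONFIRMED_PCT
)
for label, count in breakdown.items():
    print(f"{label}: ~{count}")
```

Because the published percentages are rounded, the three counts are estimates and need not sum exactly to the total.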
[WormBook, 2005]
The normal karyotype of Caenorhabditis elegans, with its five pairs of autosomes and a single pair of X chromosomes, is described. General features of chromosomes and global differences between different chromosomal regions are discussed. Abnormal karyotypes, including duplications, deficiencies, inversions, translocations and chromosome fusions, are reviewed. The effects of varying ploidy and of varying gene dosage are summarized. Dosage-sensitive genes seem to be rare in C. elegans, and the organism is able to tolerate substantial levels of aneuploidy. However, autosomal hemizygosity for more than about 3% of the total genome may be incompatible with viability.
[WormBook, 2005]
Understanding the biology of C. elegans relies on identification and analysis of essential genes, that is, genes required for growth to a fertile adult. Approaches for identifying essential genes include several types of classical forward genetic screens, genome-wide RNA interference screens and systematic targeted gene knockout. Based on most estimates made from screening results thus far, 15-30% of C. elegans genes appear to be essential. Genetic redundancy masks some essential functions, and pleiotropy of many essential genes poses a challenge for a full understanding of their functions. Temperature-sensitive mutations are valuable tools for studies of essential genes, but our ability to analyze essential genes would benefit from development of new tools for conditional inactivation or activation of specific genes.
[WormBook, 2005]
Evolutionary innovation requires genetic raw materials upon which selection can act. The duplication of genes is of fundamental importance in providing such raw materials. Gene duplications are very widespread in C. elegans and appear to arise more frequently than in either Drosophila or yeast. It has been proposed that the rate of duplication of a gene is of the same order of magnitude as the rate of mutation per nucleotide site, emphasising the enormous potential that gene duplication has for generating substrates for evolutionary change. The fate of duplicated genes is discussed. Complete functional redundancy seems unstable in the long term. Most models require that equality amongst duplicated genes be disrupted if they are to be preserved. There are various ways of achieving inequality: nonfunctionalization of one copy; acquisition by one copy of some novel, beneficial function; or partial compromise of both copies, so that both are required to provide the overall function previously provided by the single ancestral gene. Examples of C. elegans gene duplications that appear to have followed each of these pathways are considered.