[Worm Breeder's Gazette, 1995]
Six-Cysteine Motifs in Nematode Proteins

Mark Blaxter, Wellcome Research Centre for Parasitic Infections, Dept. Biology, Imperial College, London SW7 2BB UK
email m.blaxter@ic.ac.uk

I am interested in the evolution of function in nematode surface proteins and have been comparing genes isolated from parasitic species with the C. elegans genome project sequences. This has identified two cysteine-rich domains which appear to be nematode specific and associated with surface-bound proteins. Each domain has 6 cys residues with characteristic spacing, suggesting that they are involved in three disulphide bridges. Such 6-cys domains are common in vertebrate and invertebrate proteins (eg EGF repeats, von Willebrand factor repeats, trefoil repeats) but the nematode ones are distinct from these.

The first cysteine-rich domain, termed NC6#1 (Nematode Cys 6), was found by David Gems in two surface coat proteins of Toxocara canis, an ascarid parasite of dogs (D. Gems and R. Maizels). It is 36 amino acids long and is found in two copies in each of the T. canis proteins. In C. elegans there are three genefinder-predicted genes with such 36 amino acid NC6#1 domains: one is composed of 5 such domains head to tail.

The second, NC6#2, is 48-53 aa long and has been identified in three species:

1: Adults of Syngamus trachea, a strongylid parasite of the airways of birds, renowned for their huge amphidial glands (~50% of an adult body length of 10 mm), express an abundant 650 bp trans-spliced transcript encoding the cuticular globin. While cloning this I isolated a trans-spliced cDNA (St8.4) which has two NC6#2 repeats separated by 50 aa rich in G, S, Y and P (total 185 aa). Its function is unknown.

2: This gene (St8.4) identifies three ORFs in the C. elegans genome sequence (chr III): B0280.5 (544 aa), R02F2.4 (458 aa) and C07G2.1 (558 aa). Each gene has NC6#2 repeats: B0280 and R02F2 have six each, separated by regions made up of 4 aa (EGSG or ESAG) repeats.
C07G2 has three NC6#2 domains separated by P/T rich subrepetitive regions. None of these genes appears to have been identified by mutation and their function is unknown.

3: Using an alignment of the S. trachea and C. elegans sequences I searched the databases and identified another nematode protein with a single NC6#2 domain. Brugia malayi, the causative agent of brugian filariasis (elephantiasis) in humans, has ensheathed microfilariae (mf; L1 stage). The sheath is derived from the eggshell and is retained by the mf in the bloodstream: it is shed on uptake by the mosquito vector. Juliet Fuhrman identified and cloned a mf surface protein which by sequence and activity is a chitinase. It is activated on uptake by the mosquito and may either effect escape from the sheath or entry through the gut wall. The chitinase domain of the protein is followed by a T/P rich region and an NC6#2 domain. The T/P region is the site of O-linked glycosylation in vivo. J. Fuhrman has identified another (insect) chitinase with a related C6 domain (Manduca sexta, Swissprot:S64757, pers. comm.). Neither the S. trachea nor the C. elegans genes have extensive non-NC6#2 regions which could be enzymatic domains.

Speculations: Given the association of the chitinase NC6#2 domain with glycosylation and the presence of Ser and Thr residues in the non-NC6#2 regions of the other genes, it is tempting to speculate that these too are O-glycosylated and that the NC6#2 domains are in some way involved in specifying modification. However, of the NC6#2 genes, only the B. malayi chitinase has a secretory leader peptide, suggesting that the C. elegans and S. trachea proteins are not secreted and thus unlikely to be glycosylated. One of the T. canis genes carrying the NC6#1 domains is extensively O-glycosylated and has a secretory leader (D. Gems and R. Maizels). The role of the NC6#2 domains might lie in protein-protein interaction by analogy with the other six-cysteine repeats such as the EGF domain.
In this model the inter-domain segments may either be structural spacers or effector regions.
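
For readers who want to look for candidate six-cysteine stretches in their own predicted proteins, a minimal Python sketch follows. It is not the alignment-based search described above. It assumes, purely for illustration, that a candidate is any run of six consecutive Cys whose first-to-last span falls within a chosen length range. The function names, the length limits and the toy sequence are all hypothetical, and the characteristic Cys spacings of NC6#1 and NC6#2 are not given in this note, so they are not encoded here.

```python
def six_cys_windows(seq, min_len, max_len, n_cys=6):
    """Return (start, end) spans (0-based, end exclusive) that begin and end
    on a Cys, contain exactly n_cys Cys residues, and whose length lies
    within [min_len, max_len]. The length limits are user-chosen assumptions."""
    seq = seq.upper()
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    spans = []
    # Slide over every run of n_cys consecutive Cys positions.
    for k in range(len(cys) - n_cys + 1):
        start, end = cys[k], cys[k + n_cys - 1] + 1
        if min_len <= end - start <= max_len:
            spans.append((start, end))
    return spans


def spacing(seq, span):
    """Number of residues between successive Cys within a span."""
    start, end = span
    pos = [i for i in range(start, end) if seq[i].upper() == "C"]
    return [b - a - 1 for a, b in zip(pos, pos[1:])]


# Synthetic toy sequence (NOT a real protein): six Cys separated by
# 5, 6, 4, 7 and 5 filler residues, flanked by three residues on each side.
toy = ("MKT" + "C" + "A" * 5 + "C" + "G" * 6 + "C" + "S" * 4
       + "C" + "T" * 7 + "C" + "P" * 5 + "C" + "KLV")

hits = six_cys_windows(toy, min_len=20, max_len=36)
for span in hits:
    print(span, spacing(toy, span))
```

Comparing the spacing lists returned for different hits is one simple way to check whether candidate domains share a characteristic Cys spacing. A real search would normally start from an alignment of known domains, as was done for NC6#2.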