Comparison of the 5' flanking sequences of the C. elegans vitellogenin genes revealed two highly conserved, repeated heptameric sequence elements. One of these, VPE2, has the consensus sequence CTGATAA. It is found at -98 and again at -150 in the vit-2 promoter. We have recently demonstrated that both of these VPE2 elements must remain intact for high-level vit-2 transcription (MacMorris, Broverman, Greenspoon, Blumenthal and Spieth, 1989 C. elegans Meeting Abstracts, p.195, and unpublished observations). The VPE2 sequence is quite similar to the recognition sequence, (A/T)GATA(A/G), for an erythroid tissue-specific transcription factor identified in chickens, mice and humans. This transcription factor, called Eryf1 (or GF-1), is restricted to erythroid cells and is required for activation of a variety of erythroid genes including those coding for all globins, elastase and several other erythrocyte-specific enzymes (Plumb et al., 1989, NAR 17, 73). The gene for Eryf1 has recently been cloned; the protein has an unusual 'finger' DNA binding domain composed of two 'fingers', each of 17 amino acids between paired cysteine residues. While the chicken and mammalian genes are nearly identical within the DNA binding domain, they appear almost unrelated in other regions of the protein (Tsai et al., 1989, Nature 339, 446; Evans and Felsenfeld, 1989, Cell 58, 877; Trainor et al., 1990, Nature 343, 92; Zon et al., 1990, PNAS 87, 668). Because of the similarity between the Eryf1 recognition sequence and VPE2, we decided to seek a relative of Eryf1 in C. elegans. Using a 33-base oligonucleotide corresponding to the highly conserved 'finger' region (but with optimal C. elegans codons), we cloned a gene that encodes the worm version of this protein. We are calling it elt-1, for erythrocyte-like transcription factor.
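
The similarity between VPE2 and the Eryf1 recognition sequence can be checked mechanically. The following is only an illustrative sketch, not part of the work described here: it treats the recognition sequence quoted above, (A/T)GATA(A/G), as a simple pattern and scans both strands of a DNA string for it. The function names and the second test sequence are hypothetical; only the VPE2 consensus CTGATAA comes from the text.

```python
import re

# Eryf1 (GF-1) recognition sequence as quoted in the text: (A/T)GATA(A/G).
# The lookahead lets overlapping sites be reported as well.
ERYF1_SITE = re.compile(r"(?=([AT]GATA[AG]))")
SITE_LENGTH = 6

VPE2 = "CTGATAA"  # consensus given in the text


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq):
    """Return sorted (start, strand, hexamer) tuples for both strands.

    'start' is the 0-based position of the leftmost base of the site
    on the forward strand, whichever strand the site reads on.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in ERYF1_SITE.finditer(seq)]
    for m in ERYF1_SITE.finditer(reverse_complement(seq)):
        start = len(seq) - (m.start() + SITE_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


print(find_sites(VPE2))          # [(1, '+', 'TGATAA')]
print(find_sites("GGTTATCAGG"))  # [(2, '-', 'TGATAA')], made-up sequence
```

In this sketch the last six bases of VPE2, TGATAA, match the pattern. A match of this kind only flags a candidate site; whether the protein actually binds there is an experimental question.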
elt-1 encodes the same highly conserved DNA binding domain: two 'finger' motifs followed by regions rich in basic residues. Strikingly, the four introns we have localized so far in the C. elegans gene are at the same locations as introns in the mouse gene. Outside of the DNA binding domain the protein is rich in Ser, Thr and Asn but does not resemble Eryf1 or other known transcription factors. The mRNA is about 1.7 kb, which is similar to the length found in mouse. Potential recognition sites for this protein exist upstream of all vit genes and also just upstream of the TATA box in the msp genes. John and Alan have located the gene at the left end of a cluster of msp genes on chromosome IV. Experiments are now underway to determine whether the protein recognizes VPE2, the site upstream of the msp genes, both, or some other sequence. In any case, it is remarkable that worms contain such a highly conserved homolog of an erythrocyte-specific transcription factor, especially since worms lack the homologous tissue altogether.