FB2024_03, released June 25, 2024
Reference Report
Reference
Citation
Kassis, J.A., Poole, S.J., Wright, D.K., O'Farrell, P.H. (1986). Sequence conservation in the protein coding and intron regions of the engrailed transcription unit. EMBO J. 5: 3583--3589.
FlyBase ID
FBrf0044246
Publication Type
Research paper
Abstract
Engrailed (en) is a gene involved in proper segmentation of the Drosophila embryo. The predicted en protein contains a homeodomain and regions rich in polyalanine, polyglutamine, polyglutamate/aspartate and serine. We have taken an evolutionary approach to define which regions may be of fundamental importance by examining the D. virilis genomic sequence homologous to the D. melanogaster en primary transcription unit. Sequence homology begins at the first ATG of a long open reading frame yielding proteins of 584 and 552 amino acids for the D. virilis and D. melanogaster proteins, respectively. The predicted amino acid sequence can be divided into conserved and non-conserved domains. The C-terminal 30% of the protein (which includes the homeodomain) is completely conserved. In the N-terminal 70% of the protein, the overall conservation is 71%, but non-conservative amino acid changes occur in clusters and there are short stretches of highly conserved sequence. A region rich in glutamate and aspartate is conserved and has homology to an 18-amino acid sequence present in members of the myc family of proteins. Major differences in the size of the two proteins occur in regions of non-conserved repeated sequences. In the introns of the engrailed transcription units there are long stretches of conservation, suggesting this DNA may be of functional importance.
PubMed ID
PubMed Central ID
PMC1167397
DOI
Associated Information
Comments
Associated Files
Other Information
Secondary IDs
    Language of Publication
    English
    Additional Languages of Abstract
    Parent Publication
    Publication Type
    Journal
    Abbreviation
    EMBO J.
    Title
    The EMBO Journal
    Publication Year
    1982-
    ISBN/ISSN
    0261-4189
    Data From Reference
    Genes (1)