Inteins are protein-splicing elements that exist as in-frame fusions
with flanking protein sequences called exteins. Inteins are self-splicing
at the protein level, with their excision being coupled to extein
ligation. Most of the inteins that have been described are in the
400- to 500-aa range with little absolute sequence conservation
among the elements. However, Cys or Ser residues are required at
the amino termini of both the intein and the second extein, and
a His and Asn are present at the carboxy terminus of the intein.
Most inteins contain eight conserved sequence blocks (A-H), two
of these being the LAGLIDADG motifs (blocks C and E) that define
a family of intron-homing endonucleases. Consistent with the occurrence
of these motifs, several inteins have been shown to have site-specific
endonuclease activity, and PI-SceI, the VMA1 intein of Saccharomyces
cerevisiae, is capable of homing into a cognate inteinless allele.
The sporadic distribution of inteins in all three biological kingdoms
is consistent with their being mobile elements.
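The junction residues described above can be expressed as a simple check. The following is a minimal sketch, not a validated intein predictor: it tests only the three positional rules stated in the text (Cys or Ser at the amino terminus of the intein, Cys or Ser at the amino terminus of the second extein, and His-Asn at the carboxy terminus of the intein). The function name and the example sequences are hypothetical and chosen only for illustration.

```python
# Sketch only: encodes the three junction rules stated in the text.
# One-letter codes: C = Cys, S = Ser, H = His, N = Asn.
NUCLEOPHILES = {"C", "S"}


def check_splice_junctions(intein, c_extein):
    """Report which of the stated junction rules a candidate satisfies.

    intein   -- amino acid sequence of the putative intein
    c_extein -- sequence of the second (C-terminal) extein
    """
    intein = intein.upper()
    c_extein = c_extein.upper()
    return {
        # Cys or Ser as the first residue of the intein
        "intein_n_terminus": intein[:1] in NUCLEOPHILES,
        # His-Asn as the last two residues of the intein
        "intein_c_terminus": intein.endswith("HN"),
        # Cys or Ser as the first residue of the second extein
        "c_extein_n_terminus": c_extein[:1] in NUCLEOPHILES,
    }


if __name__ == "__main__":
    # Toy sequences, invented for this example (not real inteins).
    print(check_splice_junctions("CLAEGTRHN", "CSPEI"))
    print(check_splice_junctions("ALAEGTRHN", "GSPEI"))
```

Passing all three checks does not imply that a sequence splices; the rules are necessary features noted in the text, and a real search would also consider the conserved sequence blocks.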
Endonuclease genes have been assumed to be invasive genetic elements
that colonized group I introns, converting them into mobile genetic
elements. Similarly, mobile inteins appear to be derived from invasive
endonuclease genes. Recent structural studies indeed suggest that
the protein-splicing and endonuclease domains are separate and that
their two activities may have evolved independently. First, the
crystal structure of PI-SceI has recently been solved. This 454-aa
protein is folded into two distinct structural domains. Second,
hidden Markov models have been used to define two conserved functional
domains of inteins, corresponding to independent endonuclease and
splicing modules, separated by nonconserved spacer regions of variable
lengths. Third, three putative inteins have recently been reported
that are in the 150-aa size range and lack endonuclease motifs,
although it is not clear whether these smaller elements retain splicing
function. Finally, a newly identified Synechocystis intein does
not contain a LAGLIDADG endonuclease but instead contains a member
of the H-N-H family of group I intron endonucleases.
Site-directed mutagenesis experiments have shown that endonuclease
activity is not required for protein-splicing function, and deletion
of a region encompassing the LAGLIDADG motifs of PI-SceI has confirmed
this conclusion. However, despite the apparent structural autonomy
of the protein-splicing and endonuclease domains of PI-SceI, they
do appear to collaborate in interacting with the homing site DNA.
Therefore, it is important to determine whether the bipartite structure
of inteins is mirrored by the functional independence of their two
components. We tested the prediction that the entire endonuclease
domain and spacer sequences between the domains can be deleted from
a protein-splicing element to generate a mini-intein that is splicing
proficient. To this end, we used the 440-aa intein from the Mycobacterium
tuberculosis recA gene expressed in Escherichia coli. The Mtu recA
intein contains a conventional LAGLIDADG endonuclease domain, although
endonuclease activity has not yet been demonstrated. Guided by junctions
inferred from structure models, a series of mini-intein derivatives
was tested in two genetic systems developed to screen for splicing
activity in vivo and in vitro. A number of mini-inteins deleted
for the entire endonuclease domain were shown to be capable of protein
splicing in both contexts, consistent with structure predictions.
These results support the model that homing inteins evolved through
an endonuclease gene invading a DNA sequence encoding a functional
mini-intein.
(Victoria Derbyshire, David W. Wood, Wei Wu, John T. Dansereau,
Jacob Z. Dalgaard, and Marlene Belfort:
Genetic definition of a protein-splicing domain: Functional mini-inteins
support structure predictions and a model for intein evolution:
Vol. 94, pp. 11466-11471, October 1997)
Intein proteins contain a number of conserved sequence motifs (blocks).
The motifs can be grouped in three domains according to their location
and inferred function. Intein structures show that the intein's
protein-splicing and endonuclease active sites are formed from
conserved motifs.
The intein's domain organization, deduced by sequence analysis,
exactly corresponds to the structural domains.
Domain structure of a typical intein
with a LAGLIDADG type endonuclease domain
N domain EN domain (optional) C domain
==-=----==----==----==--------==-==---==--------===
scale: each - is 8 amino acids; = marks a motif region
(Pietrokovski S. (2001). Inteins - Protein
Introns. http://bioinfo.weizmann.ac.il/~pietro/inteins)
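The domain diagram above can be read programmatically. This is a rough sketch under one stated assumption: every character of the diagram, whether - or =, spans about 8 amino acids (the scale line gives the 8-aa figure only for -). The constant and function names are hypothetical, and the resulting coordinates are approximate positions in a schematic, not measurements of any particular intein.

```python
import re

# Diagram string copied from the text above.
DIAGRAM = "==-=----==----==----==--------==-==---==--------==="

# Assumption: each character ('-' or '=') covers roughly 8 amino acids.
AA_PER_CHAR = 8


def approx_length(diagram, aa_per_char=AA_PER_CHAR):
    """Approximate total length in amino acids implied by the diagram."""
    return len(diagram) * aa_per_char


def motif_regions(diagram, aa_per_char=AA_PER_CHAR):
    """Approximate 1-based inclusive residue ranges for each run of '='."""
    return [
        (m.start() * aa_per_char + 1, m.end() * aa_per_char)
        for m in re.finditer(r"=+", diagram)
    ]


if __name__ == "__main__":
    print("approximate length (aa):", approx_length(DIAGRAM))
    for start, end in motif_regions(DIAGRAM):
        print(f"motif region: residues ~{start}-{end}")
```

Under that assumption the 51-character diagram corresponds to roughly 408 aa, which falls within the 400- to 500-aa range quoted earlier for most described inteins.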