Inteins are named after the organism and gene in which they are
found. The organism name follows the same convention as restriction
enzymes: a 3-letter genus + species abbreviation, followed
by a strain designation if necessary. The organism name is followed
by an abbreviation of the extein name. If more than 1 intein is
present in an extein gene, the inteins are given a numerical suffix,
assigned from 5' to 3' or in order of their identification.
For example, the Pyrococcus furiosus ribonucleoside-diphosphate
reductase alpha subunit gene contains 2 inteins. The organism is
abbreviated as 'Pfu'. Since the gene has been called the 'RIR1'
gene, the inteins are named using this gene name. Thus, these 2
inteins are called the Pfu RIR1-1 intein, inserted after Gly301
in the Pfu RIR1 precursor protein, and the Pfu RIR1-2 intein, inserted
after Pro914 in the Pfu RIR1 precursor protein.
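
The naming rule above can be sketched in a few lines of Python. This is
only an illustration: the function names are hypothetical, the
abbreviation rule (first letter of the genus plus the first two letters
of the species) is assumed from the usual restriction enzyme convention,
and strain designations are not handled.

```python
def organism_abbreviation(genus: str, species: str) -> str:
    """Assumed rule: first letter of the genus + first two letters of the species."""
    return genus[0].upper() + species[:2].lower()


def intein_name(genus: str, species: str, extein_gene: str, index=None) -> str:
    """Build an intein name such as 'Pfu RIR1-1 intein'.

    index is the numerical suffix, used only when the extein gene
    contains more than one intein.
    """
    name = f"{organism_abbreviation(genus, species)} {extein_gene}"
    if index is not None:
        name += f"-{index}"
    return f"{name} intein"


# The two inteins in the Pyrococcus furiosus RIR1 gene
for i in (1, 2):
    print(intein_name("Pyrococcus", "furiosus", "RIR1", i))
```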
Note that an intein name, such as the Pfu RIR1-1 intein, refers
to both the intein gene and the intein protein. In many publications,
the convention is to italicize the gene name and to capitalize the
first letter of the protein name.
As described below, some inteins are bifunctional proteins that
also have endonuclease activity. When endonuclease activity has
been demonstrated, the intein is also given a second name that follows
the endonuclease naming conventions (Belfort 1997). This name includes
the prefix 'PI-', the 3-letter organism abbreviation and a Roman
numeral indicating the order of identification of the intein endonuclease
in that organism. The endonuclease names for the Pfu RIR1-1 and Pfu
RIR1-2 inteins are PI-PfuI and PI-PfuII, respectively.
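
As a rough sketch of how the endonuclease name is assembled, the
following Python builds the 'PI-' name from the organism abbreviation
and the order of identification. The helper names are hypothetical and
the code covers only the pattern described above.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer to a Roman numeral."""
    if n < 1:
        raise ValueError("order of identification must be a positive integer")
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def endonuclease_name(organism_abbrev: str, order: int) -> str:
    """Prefix 'PI-' + organism abbreviation + Roman numeral of the order."""
    return f"PI-{organism_abbrev}{to_roman(order)}"


print(endonuclease_name("Pfu", 1))
print(endonuclease_name("Pfu", 2))
```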
There is also a convention for numbering amino acids in inteins.
Although we often number the residues in the precursor as a single
protein, as when intein insertion site locations are given, a second
numbering scheme is often used to assist thinking about inteins
in heterologous or foreign exteins. The intein amino acids are numbered
from N-terminal to C-terminal beginning with the first residue of
the intein and ending with the last residue of the intein. The amino
acids in the N-extein: (a) start with the number 1, (b) include
a minus sign prefix and (c) are counted from right to left (beginning
with the last N-extein residue and going towards the N-terminus).
The amino acid preceding the intein is the -1 amino acid. The amino
acids in the C-extein: (a) are numbered beginning at the C-terminal
splice junction, (b) include a plus sign prefix and (c) are counted
from the N-terminus towards the C-terminus.
The first residue following the intein is the mechanistically essential
+1 amino acid, which is not technically part of the intein since
the intein is defined as the intervening sequence that is spliced
out of the precursor.
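
The second numbering scheme can be illustrated with a short Python
sketch that converts a position in the precursor into an intein-relative
label. The function name and the toy coordinates are hypothetical;
positions are assumed to be 1-based precursor residue numbers, with
intein_start and intein_end giving the first and last intein residues.

```python
def splice_label(pos: int, intein_start: int, intein_end: int):
    """Return (region, label) for a 1-based precursor position."""
    if pos < intein_start:
        # N-extein: counted right to left, so the residue just before
        # the intein is -1
        return ("N-extein", str(pos - intein_start))
    if pos > intein_end:
        # C-extein: counted left to right, so the residue just after
        # the intein is +1
        return ("C-extein", f"+{pos - intein_end}")
    # Intein: numbered from its own first residue
    return ("intein", str(pos - intein_start + 1))


# Toy 10-residue precursor with an intein occupying positions 4 to 8
for pos in range(1, 11):
    print(pos, *splice_label(pos, intein_start=4, intein_end=8))
```

With these toy coordinates, precursor position 3 is the -1 residue,
position 4 is intein residue 1 and position 9 is the +1 residue.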
(Perler, F. B. (2002). InBase, the Intein
Database. Nucleic Acids Res. 30, 383-384)