Transmembrane Helix Prediction
TMHMM is a method for predicting transmembrane helices based on
a hidden Markov model, developed by Anders Krogh and
Erik Sonnhammer.
Data sets
Membrane proteins
Our set of 160 membrane proteins was split into ten parts, as
used for cross-validation.
Each entry consists of three lines:
1. The Swiss-Prot identifier (fasta style)
2. The protein sequence (one long line)
3. The assignment preceded by `#',
`i' for inside (cytoplasmic side), `M' for helix,
and `o' for outside (non-cytoplasmic side).
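The three-line layout described above can be read with a short parser. The following is only a minimal sketch: it assumes that the identifier line starts with `>' (as in fasta files), that the annotation holds exactly one label per residue, and that blank lines may be skipped. The names Entry, parse_entries and helix_segments are hypothetical and not part of TMHMM.

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Entry(NamedTuple):
    identifier: str  # text after the leading '>' (assumed fasta style)
    sequence: str    # amino acid sequence, one long line
    labels: str      # one of 'i', 'M', 'o' per residue (assumed)


def parse_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield entries from lines grouped as identifier, sequence, annotation."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % 3 != 0:
        raise ValueError("expected three lines per entry")
    for i in range(0, len(cleaned), 3):
        header, sequence, annotation = cleaned[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"identifier line must start with '>': {header!r}")
        if not annotation.startswith("#"):
            raise ValueError(f"annotation line must start with '#': {annotation!r}")
        labels = annotation[1:].strip()
        if len(labels) != len(sequence):
            raise ValueError("annotation and sequence differ in length")
        if set(labels) - set("iMo"):
            raise ValueError("annotation may only contain 'i', 'M' and 'o'")
        yield Entry(header[1:].strip(), sequence, labels)


def helix_segments(labels: str) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, 0-based and end-exclusive, for runs of 'M'."""
    segments = []
    start = None
    for pos, label in enumerate(labels):
        if label == "M" and start is None:
            start = pos
        elif label != "M" and start is not None:
            segments.append((start, pos))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


if __name__ == "__main__":
    # Made-up toy entry, not taken from the data set.
    toy = [">TOY_EXAMPLE", "MKTAYIAKQR", "#iiiMMMMooo"]
    for entry in parse_entries(toy):
        print(entry.identifier, helix_segments(entry.labels))
```

Reporting helices as end-exclusive ranges lets them be used directly as Python slices of the sequence.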
The whole set or cross-validation partition 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
Proteins with known structure
This set of 645 proteins from PDB has been
used as a negative set (non-membrane) to test the discriminative power
of TMHMM (submitted).
Models
For the server (TMHMM 1.0) we use a model
trained on the complete set of 160 proteins.
Stuff from ISMB paper
Press here to see the predictions
on the 160 proteins.
These are cross-validated, i.e., the model used to predict the
structure of a given protein was NOT trained on the partition
of the data containing that protein.
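The cross-validation scheme can be sketched as a loop over the partitions. This is an illustration only: the train and predict arguments are hypothetical placeholders supplied by the caller and do not correspond to any TMHMM program or interface.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

Protein = TypeVar("Protein")
Model = TypeVar("Model")
Prediction = TypeVar("Prediction")


def cross_validate(
    partitions: Sequence[Sequence[Protein]],
    train: Callable[[List[Protein]], Model],         # hypothetical trainer
    predict: Callable[[Model, Protein], Prediction]  # hypothetical predictor
) -> Dict[int, List[Prediction]]:
    """For each partition k, train on all other partitions and predict on k."""
    results = {}
    for k, held_out in enumerate(partitions):
        # The training data excludes every protein in the held-out partition.
        training = [
            protein
            for j, part in enumerate(partitions)
            if j != k
            for protein in part
        ]
        model = train(training)
        results[k] = [predict(model, protein) for protein in held_out]
    return results
```

With ten partitions this trains ten models, and each protein is predicted exactly once, by the one model that did not see it during training.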
The format is like the sequence format above, except that lines
are split. Lines preceded by `#' give the correct annotation and those
preceded by `?0' give the prediction.
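To compare prediction and annotation, the split lines have to be joined again. The sketch below assumes that, within one entry, concatenating the `#' lines in order gives the full annotation and concatenating the `?0' lines in order gives the full prediction; the exact layout of the files may differ. The function names are hypothetical.

```python
from typing import Iterable, Tuple


def join_annotations(entry_lines: Iterable[str]) -> Tuple[str, str]:
    """Return (annotation, prediction) rebuilt from the split lines of one entry."""
    truth, predicted = [], []
    for line in entry_lines:
        line = line.strip()
        if line.startswith("?0"):
            predicted.append(line[2:].strip())
        elif line.startswith("#"):
            truth.append(line[1:].strip())
        # Identifier and sequence lines are ignored here.
    return "".join(truth), "".join(predicted)


def per_residue_agreement(truth: str, predicted: str) -> float:
    """Fraction of positions at which prediction and annotation agree."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("annotation and prediction must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(truth, predicted))
    return matches / len(truth)
```

Per-residue agreement is only one possible summary; counting correctly predicted helices or topologies would need a different comparison.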
Cross-validation models:
The models trained on the set of 160 proteins (used for the above predictions),
one model for each test set.
Model 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
In the ISMB paper we used one more set of 83 membrane proteins:
partition 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
Last updated July 7 2000
by Anders Krogh, krogh@cbs.dtu.dk