SecretomeP - 1.0

Prediction of non-classical protein secretion

The SecretomeP 1.0f server produces ab initio predictions of non-classical (i.e. not signal peptide triggered) protein secretion in eukaryotes. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, and integrates this information into the final secretion prediction.

SUBMISSION

Paste a single sequence or several sequences in FASTA format into the field below:

Submit a file in FASTA format directly from your local disk:

Sort the output by prediction score   

Restrictions:
At most 500 sequences and 200,000 amino acids per submission; each sequence not less than 15 and not more than 4,000 amino acids.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

Feature based prediction of non-classical and leaderless protein secretion.
J. Dyrløv Bendtsen, L. Juhl Jensen, N. Blom, G. von Heijne and S. Brunak.
Protein Eng. Des. Sel., 17(4):349-356, 2004


PORTABLE VERSION

Would you prefer to run SecretomeP at your own site? SecretomeP 1.0 is available as a package under a commercial license. Currently available platforms include MIPS (under IRIX, Silicon Graphics) and Pentium family (under Linux and CYGWIN). Send inquiries by e-mail to software@cbs.dtu.dk.

Usage instructions



1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y

Please note that sequences containing other symbols, e.g. X (unknown), will be discarded before processing. Sequences can be input in the following two ways:

  • Paste a single sequence (just the amino acids) or a number of sequences in FASTA format into the upper window of the main server page.

  • Select a FASTA file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both ways can be employed at the same time: all the specified sequences will be processed. However, there may be no more than 10 sequences in total in one submission. Sequences shorter than 15 or longer than 4000 amino acids will be ignored.
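The input rules above can be checked locally before submitting. The following is a minimal sketch, not the server's own code: the function names `read_fasta` and `filter_sequences` are hypothetical, only FASTA-formatted input is handled (a bare sequence without a header line is skipped), and the per-submission sequence count is deliberately not checked.

```python
# Allowed one-letter amino acid codes and length limits, as stated above.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LEN, MAX_LEN = 15, 4000


def read_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].strip()
            # Use the first word of the header as the sequence name.
            name = header.split()[0] if header else "unnamed"
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_sequences(text):
    """Split the input into kept sequences and (name, reason) rejections."""
    kept, rejected = [], []
    for name, seq in read_fasta(text):
        seq = seq.upper()  # the alphabet is not case sensitive
        if set(seq) - ALLOWED:
            rejected.append((name, "symbol outside the allowed alphabet"))
        elif not MIN_LEN <= len(seq) <= MAX_LEN:
            rejected.append((name, "length outside 15-4000 amino acids"))
        else:
            kept.append((name, seq))
    return kept, rejected
```

Calling `filter_sequences(open("proteins.fasta").read())` (the file name is only an example) returns the sequences that satisfy the alphabet and length rules, together with a reason for each one that does not.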


2. Customize the run

  • By default the results are sorted by prediction score in decreasing order. If you deselect the option "Sort the output by prediction score" the sorting will be alphabetical by sequence name.

  • By default the server only shows the final predictions. If you select the option "Report the progress of the prediction process" the progress of the prediction process will be reported in detail.
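The two sort orders can be reproduced on saved results. This is a small sketch under assumptions: `sort_results` is a hypothetical helper, each result is a `(name, score)` pair, and "prediction score" is taken to mean the NN-score column of the output.

```python
def sort_results(results, by_score=True):
    """Sort (name, score) pairs.

    by_score=True  -> decreasing score (the default behaviour described above)
    by_score=False -> alphabetical by sequence name
    """
    if by_score:
        return sorted(results, key=lambda r: r[1], reverse=True)
    return sorted(results, key=lambda r: r[0])


results = [("FGF1_HUMAN", 0.847), ("FGF4_HUMAN", 0.945), ("ATRX_HUMAN", 0.093)]
print(sort_results(results))                  # highest score first
print(sort_results(results, by_score=False))  # alphabetical by name
```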

3. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.

NOTE: SecretomeP depends on a number of other programs that have to be run on the input sequences before the SecretomeP method itself is applied. Therefore, the processing of multiple sequences may be time-consuming. In case of a prolonged wait, you are advised to use the e-mail option mentioned above.

Data sets


Data sets used for training of SecretomeP 1.0

Positive training set: Download

Negative training set: Download

Non-classically secreted test set: Download

Output format



DESCRIPTION

For each input sequence the server predicts the possibility of non-classical or leaderless secretion.

For each input sequence three scores are generated by the SecretomeP server, as shown below. The first score is the neural network output score; this is the score presented in the reference paper. The second number represents the odds that the sequence is in fact secreted. The third score is an estimated posterior probability that the sequence is secreted. It is calculated by weighting the odds by a prior probability of 0.2%, and should thus not be confused with the true probability, as the prior probability in your data set may be entirely different.
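The exact weighting formula is not given here. As a hedged illustration, a plain product of the odds and the 0.2% prior reproduces the "Weighted by prior" column for the three proteins in the EXAMPLE OUTPUT section to three decimals; the function name `weight_by_prior` and the product form itself are assumptions, not the server's documented computation.

```python
PRIOR = 0.002  # the 0.2% prior probability stated above


def weight_by_prior(odds, prior=PRIOR):
    """Assumed weighting: scale the odds by the prior probability."""
    return odds * prior


# Odds values taken from the EXAMPLE OUTPUT section.
for name, odds in [("FGF1_HUMAN", 4.267), ("FGF4_HUMAN", 6.804), ("ATRX_HUMAN", 0.205)]:
    print(f"{name}  {weight_by_prior(odds):.3f}")
```

If the fraction of secreted proteins in your own data set differs from 0.2%, passing a different `prior` shows how strongly the third score depends on that choice.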

Even though SecretomeP is trained to predict non-classical or leaderless secretion, it usually gives high scores to proteins entering the classical secretory pathway (through the ER-Golgi apparatus). Therefore, for proteins in which the presence of a signal peptide is predicted by SignalP-2.0, a warning is added to the output.



EXAMPLE OUTPUT


# Name         NN-score  Odds   Weighted   Warning
#                               by prior
# ============================================================================
FGF1_HUMAN      0.847   4.267     0.009    -
FGF4_HUMAN      0.945   6.804     0.014    signal peptide predicted by SignalP
ATRX_HUMAN      0.093   0.205     0.000    -

In the example above, the first protein, FGF1_HUMAN, which is known to enter the non-classical secretory pathway, is correctly predicted as secretory. The second protein, FGF4_HUMAN, is classically secreted; it is also correctly predicted as secretory, but with a warning that a signal peptide is predicted. The third protein, ATRX_HUMAN, a known nuclear protein, receives a low score and is thus correctly classified as non-secreted.
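For downstream processing, output in the layout shown above can be read into records. The sketch below assumes that layout exactly (comment lines start with `#`, columns are whitespace separated, and `-` means no warning); the `Prediction` type and `parse_output` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    name: str
    nn_score: float
    odds: float
    weighted: float
    warning: Optional[str]  # None when the Warning column is "-"


def parse_output(text):
    """Parse SecretomeP-style output text into a list of Prediction records."""
    predictions = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and header/comment lines
        # The warning text may contain spaces, so split at most four times.
        fields = line.split(None, 4)
        name, nn_score, odds, weighted = fields[:4]
        warning = fields[4].strip() if len(fields) > 4 else "-"
        predictions.append(
            Prediction(name, float(nn_score), float(odds), float(weighted),
                       None if warning == "-" else warning)
        )
    return predictions
```

Filtering the result with `[p for p in parse_output(text) if p.warning is None]` keeps only the proteins for which no signal peptide warning was issued.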

Version history


1.0b The current version.
Possibility of sorting the output by prediction score. No changes to the method itself.

1.0 The original server (no longer available on line).


Article abstract


REFERENCE

Feature based prediction of non-classical and leaderless protein secretion.
J. Dyrløv Bendtsen (1), L. Juhl Jensen (1), N. Blom (1), G. von Heijne (2) and S. Brunak (1).
Protein Eng. Des. Sel., 17(4):349-356, 2004

(1) Center for Biological Sequence Analysis, The Technical University of Denmark, DK-2800 Lyngby, Denmark
(2) Stockholm Bioinformatics Center, Department of Biochemistry, Stockholm University, S-106 91 Stockholm, Sweden



ABSTRACT

We present a sequence based method - SecretomeP - for prediction of mammalian secretory proteins targeted to the non-classical secretory pathway, i.e. proteins without an N-terminal signal peptide. So far only a limited number of proteins have been shown experimentally to enter the non-classical secretory pathway. These are mainly fibroblast growth factors, interleukins and galectins found in the extracellular matrix. We have discovered that certain pathway independent features are shared among secreted proteins. The method presented here is also capable of predicting (signal peptide containing) secretory proteins where only the mature part of the protein has been annotated, or cases where the signal peptide remains uncleaved. By scanning the entire human proteome we identify new proteins potentially undergoing non-classical secretion.

PMID: 15115854         doi: 10.1093/protein/gzh037
