DTU Health Tech

Department of Health Technology

TargetP - 2.0

Subcellular location of proteins: mitochondrial, chloroplastic, secretory pathway, or other

The TargetP-2.0 server predicts the presence of N-terminal presequences: signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP), or thylakoid luminal transit peptide (luTP). For sequences predicted to contain an N-terminal presequence, a potential cleavage site is also predicted.

Submit data

Paste or upload protein sequence(s) in FASTA format. For an example file, click here.

Each protein sequence must be at least 10 amino acids long. A submission may contain at most 5,000 proteins.

Submit a file in FASTA format directly from your local disk.

Organism group:
  • Non-plant
  • Plant

Output format:
  • Long output
  • Short output (no figures)

Instructions

1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y and X (unknown)

Alphabetic symbols outside the allowed alphabet are converted to X before processing. Non-alphabetic symbols, including whitespace and digits, are ignored.
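The clean-up rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the server's own code; the function name normalize_sequence is hypothetical.

```python
# Allowed one-letter amino acid codes, plus X for "unknown".
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def normalize_sequence(raw: str) -> str:
    """Apply the input rules described above to a raw sequence string.

    - letters are treated case-insensitively,
    - letters outside the allowed alphabet become X,
    - non-alphabetic symbols (whitespace, digits, punctuation) are dropped.
    """
    cleaned = []
    for ch in raw:
        if not ch.isalpha():
            continue  # non-alphabetic symbols are ignored
        upper = ch.upper()
        cleaned.append(upper if upper in ALLOWED else "X")
    return "".join(cleaned)


if __name__ == "__main__":
    # B, Z and U are letters outside the allowed alphabet, so each becomes X.
    print(normalize_sequence("mkt 12b*z-U"))
```

Running a sequence through such a function before submission makes it easy to see how many residues would be replaced by X.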

The sequences can be input in the following two ways:

  • Paste a single sequence (just the amino acids) or a number of sequences in FASTA format into the upper window of the main server page.

  • Select a FASTA file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both ways can be used at the same time: all the specified sequences will be processed. However, a single submission may contain no more than 5,000 sequences.
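Before submitting a large file, it can be useful to check it locally against the limits stated on this page (at least 10 amino acids per sequence, at most 5,000 sequences per submission). The following is a minimal sketch; the names read_fasta and check_submission are hypothetical, and the parser handles FASTA input only (lines before the first ">" header are skipped).

```python
from typing import Iterable, Iterator, List, Tuple

MIN_LENGTH = 10       # minimum sequence length stated on this page
MAX_SEQUENCES = 5000  # maximum number of sequences per submission


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""  # first word of the header
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def check_submission(records: Iterable[Tuple[str, str]]) -> List[str]:
    """Return a list of human-readable problems; empty if none were found."""
    records = list(records)
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences (maximum {MAX_SEQUENCES})"
        )
    for name, seq in records:
        if len(seq) < MIN_LENGTH:
            problems.append(
                f"{name}: only {len(seq)} residues (minimum {MIN_LENGTH})"
            )
    return problems


if __name__ == "__main__":
    example = ">P1 demo\nMKTAYIAKQR\nQISFVK\n>P2\nMKT\n"
    for problem in check_submission(read_fasta(example.splitlines())):
        print(problem)
```

A file with more than 5,000 sequences would need to be split into several submissions.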

2. Customize your run

  • Organism group:
    Choose Plant for any organism with chloroplasts/plastids and Non-plant otherwise.
  • Output format:
    You can choose between two output formats:
    Long
    Shows one plot and one summary per sequence.
    Short
    Convenient if you submit lots of sequences. Shows only one line of output per sequence and no graphics.

3. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and refreshed until the job finishes, at which point the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and leave the window. Your job will continue, and you will be notified by e-mail when it has finished. The e-mail message will contain the URL under which the results are stored; the results remain on the server for 24 hours.

Example Outputs

To appear...

Training and testing data set

The dataset used for training, validating, and testing TargetP 2.0 (using nested cross-validation) can be found here.

The sequences are in FASTA format with the UniProt AC as sequence name: Download

The annotations are in a tab-separated file where each line contains three fields: the UniProt AC, the type of targeting peptide, and the length of the targeting peptide.
The type is one of:

  • "SP" for signal peptide,
  • "MT" for mitochondrial transit peptide (mTP),
  • "CH" for chloroplast transit peptide (cTP),
  • "TH" for thylakoidal lumen composite transit peptide (luTP),
  • "Other" for no targeting peptide (in this case, the length is given as 0).
Download
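
The annotation file described above can be read with the Python standard library. The sketch below assumes that the file has no header row and that the type labels are spelled exactly as listed; the names Annotation, parse_annotations and count_by_type are hypothetical, and the accessions in the example are placeholders rather than real UniProt entries.

```python
import csv
from collections import Counter
from typing import Iterable, List, NamedTuple

# Type labels as listed above (assumed to be spelled exactly like this).
KNOWN_TYPES = {"SP", "MT", "CH", "TH", "Other"}


class Annotation(NamedTuple):
    accession: str     # UniProt AC
    peptide_type: str  # one of KNOWN_TYPES
    length: int        # targeting peptide length; 0 for "Other"


def parse_annotations(lines: Iterable[str]) -> List[Annotation]:
    """Parse tab-separated lines with three fields per line."""
    annotations = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row}")
        accession, peptide_type, length = row
        if peptide_type not in KNOWN_TYPES:
            raise ValueError(f"unknown peptide type: {peptide_type}")
        annotations.append(Annotation(accession, peptide_type, int(length)))
    return annotations


def count_by_type(annotations: Iterable[Annotation]) -> Counter:
    """Count how many annotations there are of each peptide type."""
    return Counter(a.peptide_type for a in annotations)


if __name__ == "__main__":
    # Placeholder accessions, for illustration only.
    sample = "ACC0001\tSP\t22\nACC0002\tOther\t0\nACC0003\tMT\t35\n"
    print(count_by_type(parse_annotations(sample.splitlines())))
```

To read the downloaded file itself, pass an open file handle instead of the sample lines.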

Predictions on proteomes

Results from TargetP predictions on whole proteomes from UniProt (gzipped text files):

Please cite:

Detecting Sequence Signals in Targeting Peptides Using Deep Learning
José Juan Almagro Armenteros, Marco Salvatore, Ole Winther, Olof Emanuelsson, Gunnar von Heijne, Arne Elofsson, and Henrik Nielsen
Life Science Alliance 2 (5), e201900429. doi:10.26508/lsa.201900429 (Open access)
The source code for training and running TargetP 2.0 is available on GitHub under the Creative Commons CC BY-NC-SA license.

Version history


2.0 The current server. New in this version:
  • Deep learning: TargetP 2.0 is based on convolutional and recurrent (LSTM) neural networks with a multi-attention layer. The deep recurrent neural network architecture is better suited to recognizing sequence motifs of varying length, such as signal or transit peptides, than traditional feed-forward neural networks (as used in TargetP 1).
  • Thylakoid lumen proteins: TargetP 2.0 is now able to predict thylakoid luminal transit peptides (luTPs), which are composed of a chloroplast transit peptide (cTP) followed by a second peptide similar to a bacterial signal peptide.
1.1 The original server. Based on feed-forward neural networks.

Software Downloads




Getting help

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0). If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
