DTU Health Tech

Department of Health Technology


DeepLocPro - 1.0

Prediction of prokaryotic protein subcellular localization using deep learning

DeepLocPro predicts the subcellular localization of prokaryotic proteins. It distinguishes six localizations: cytoplasm, cytoplasmic membrane, periplasm, outer membrane, cell wall and surface, and extracellular space.

Eukaryotic proteins: To predict the locations of proteins in eukaryotes, use DeepLoc.


Submission


Submit data

Paste or upload protein sequence(s) in FASTA format to predict the subcellular localization. A maximum of 500 sequences is allowed. The prediction can take a few seconds per sequence depending on the model selected.


Protein sequences must be between 10 and 6000 amino acids long.
Mirror: Use DeepLocPro on BioLib if this server is heavily loaded.
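
The submission limits above can be checked locally before uploading. The following is a minimal sketch, not the server's own validation code: the function names `parse_fasta` and `check_submission` are hypothetical, and only the three limits stated on this page (500 sequences, 10 to 6000 amino acids) are taken from the source.

```python
MAX_SEQUENCES = 500   # maximum number of sequences stated on this page
MIN_LENGTH = 10       # shortest accepted sequence (amino acids)
MAX_LENGTH = 6000     # longest accepted sequence (amino acids)


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text):
    """Return a list of problems; an empty list means the input fits the limits."""
    records = parse_fasta(text)
    problems = []
    if not records:
        problems.append("no FASTA records found")
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences exceeds the limit of {MAX_SEQUENCES}"
        )
    for header, seq in records:
        if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
            problems.append(
                f"{header}: length {len(seq)} is outside {MIN_LENGTH}-{MAX_LENGTH}"
            )
    return problems
```

Calling `check_submission(open("proteins.fasta").read())` on a local file (the file name is only an example) returns an empty list when the input is within the stated limits.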






Input organism group: Any, Archaea, Gram negative, Gram positive
Output format: Long output, Short output (no figures)

Instructions/Help


The DeepLocPro 1.0 server predicts the subcellular localization of prokaryotic proteins using a neural-network-based algorithm trained on UniProt and ePSORTdb proteins with experimental evidence of subcellular localization. It uses only the sequence information to make the prediction. The importance of each sequence position for the predicted localization is also shown as an "attention" plot. Positions with a high attention value are deemed more relevant for the prediction. A high value does not mean that the particular amino acid at that position is very important by itself, but that the region around that position carries more weight in the model's final prediction.
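
Because attention is best read at the level of regions rather than single residues, one way to inspect it is to smooth the per-position values and look for the highest-scoring window. The sketch below assumes the attention values are available as a plain list of numbers, one per residue; that input format, the function names, and the window size of 9 are illustrative assumptions and are not taken from the server.

```python
def smooth_attention(attention, window=9):
    """Centered moving average over per-residue attention values.

    Windows are truncated at the sequence ends, so edge positions
    are averaged over fewer residues.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(attention)):
        lo = max(0, i - half)
        hi = min(len(attention), i + half + 1)
        smoothed.append(sum(attention[lo:hi]) / (hi - lo))
    return smoothed


def top_region(attention, window=9):
    """Return (start, end), 0-based and half-open, of the window
    centered on the position with the highest smoothed attention."""
    if not attention:
        raise ValueError("attention must not be empty")
    smoothed = smooth_attention(attention, window)
    center = max(range(len(smoothed)), key=smoothed.__getitem__)
    half = window // 2
    return max(0, center - half), min(len(attention), center + half + 1)
```

The returned interval can be used to slice the query sequence, for example `sequence[start:end]`, to see which stretch of residues the smoothed signal highlights.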

The DeepLocPro 1.0 server also accepts the organism group of the input sequences as an additional input.

  • When Archaea or Gram positive is specified, predictions for periplasm and outer membrane are suppressed and remapped to extracellular. For Any or Gram negative, no post-processing is applied.
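
The organism-group rule above can be illustrated with a small sketch. This is one possible reading of "suppressed and remapped", in which the probability assigned to periplasm and outer membrane is added to extracellular and then set to zero; the server's actual post-processing may differ. The label strings, dictionary layout, and function names are hypothetical.

```python
ORGANISM_GROUPS = {"Any", "Archaea", "Gram negative", "Gram positive"}
GROUPS_WITH_REMAP = {"Archaea", "Gram positive"}
SUPPRESSED = ("Periplasm", "Outer membrane")
TARGET = "Extracellular"


def apply_organism_group(probabilities, group):
    """Return a copy of the probability table adjusted for the organism group.

    Assumption: the mass of the suppressed classes is moved to the
    extracellular class. For "Any" and "Gram negative" nothing changes.
    """
    if group not in ORGANISM_GROUPS:
        raise ValueError(f"unknown organism group: {group!r}")
    adjusted = dict(probabilities)  # leave the caller's table untouched
    if group in GROUPS_WITH_REMAP:
        for label in SUPPRESSED:
            adjusted[TARGET] = adjusted.get(TARGET, 0.0) + adjusted.get(label, 0.0)
            adjusted[label] = 0.0
    return adjusted


def predicted_localization(probabilities):
    """Pick the localization with the highest probability."""
    return max(probabilities, key=probabilities.get)
```

With this reading, a protein whose highest raw probability is periplasm would be reported as extracellular when the group is Archaea or Gram positive, and as periplasm when the group is Any or Gram negative.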

The DeepLocPro 1.0 server requires protein sequence(s) in FASTA format and cannot handle nucleic acid sequences.
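
Since nucleic acid input is not supported, it can help to screen sequences before submission. The heuristic below is only a sketch and is not part of DeepLocPro: it flags a sequence when at least 90% of its letters are A, C, G, T, U, or N. All of these letters are also valid amino acid codes, so short or low-complexity proteins can be flagged by mistake; the function name and threshold are assumptions.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def looks_like_nucleic_acid(sequence, threshold=0.9):
    """Heuristically guess whether a sequence is DNA/RNA rather than protein.

    Returns True when the fraction of letters drawn from the nucleotide
    alphabet is at least `threshold`. Non-letter characters are ignored.
    """
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        return False
    nucleotide_fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    return nucleotide_fraction >= threshold
```

A flagged sequence is a prompt to double-check the input file, for instance to confirm that a coding sequence was translated before upload.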

Two different versions of the output can be selected before running DeepLocPro 1.0. The long output will generate an attention plot per sequence while the short output will not generate any plots.

Paste protein sequence(s) in FASTA format or upload a FASTA file.

After the server successfully finishes the job, a summary page is shown. If an error occurs during the prediction, a log will appear specifying the error.

Output format


The DeepLocPro output is composed of three main components:

  • The Predicted localization displays the subcellular localization predicted for the query protein.
  • The Probability table displays the probability assigned by the model to each of the subcellular localizations.
  • The Feature importance displays a logo-like plot of the positions in the query protein with higher importance for the prediction.


Training and testing data sets


The dataset used to train and test the DeepLocPro 1.0 server is available here:

References


Please cite:

Predicting the subcellular location of prokaryotic proteins with DeepLocPro.
Jaime Moreno, Henrik Nielsen, Ole Winther, Felix Teufel.
bioRxiv 2024.01.04.574157; doi: https://doi.org/10.1101/2024.01.04.574157

Abstract

Protein subcellular location prediction is a widely explored task in bioinformatics because of its importance in proteomics research. We propose DeepLocPro, an extension to the popular method DeepLoc, tailored specifically to archaeal and bacterial organisms. DeepLocPro is a multiclass subcellular location prediction tool for prokaryotic proteins, trained on experimentally verified data curated from UniProt and PSORTdb. DeepLocPro compares favorably to the PSORTb 3.0 ensemble method, surpassing its performance across multiple metrics on our benchmark experiment.


GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0). If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
