The DeepLocPro 1.0 server predicts the subcellular localization of prokaryotic proteins using a neural network-based
algorithm trained on UniProt and ePSORTdb proteins with experimental evidence of subcellular localization.
It uses only the protein sequence to perform the prediction.
The output also includes an "attention" plot showing the importance of each sequence position for the predicted localization.
Positions in the sequence with a high attention value are deemed more relevant for the prediction.
A high value does not mean that the single amino acid at that position is decisive; rather, the region around that position carries more weight in the model's final prediction.
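Because attention is best read at the level of regions rather than single residues, one option is to average the values over a sliding window. The sketch below is only an illustration: the function name `top_attention_window`, the window width, and the example values are assumptions and are not part of the DeepLocPro output format.

```python
def top_attention_window(attention, width=5):
    """Return (start, end, mean) of the window with the highest mean attention.

    `attention` is a list of per-residue attention values (hypothetical input).
    `start` and `end` are 0-based positions, with `end` exclusive.
    """
    if width <= 0 or width > len(attention):
        raise ValueError("width must be between 1 and len(attention)")
    window_sum = sum(attention[:width])
    best_sum, best_start = window_sum, 0
    for start in range(1, len(attention) - width + 1):
        # Slide the window one residue to the right.
        window_sum += attention[start + width - 1] - attention[start - 1]
        if window_sum > best_sum:
            best_sum, best_start = window_sum, start
    return best_start, best_start + width, best_sum / width


# Made-up attention values for a 10-residue sequence.
example = [0.01, 0.02, 0.20, 0.25, 0.22, 0.05, 0.05, 0.10, 0.05, 0.05]
start, end, mean = top_attention_window(example, width=3)
```

A wider window smooths out isolated peaks, which matches the advice to interpret neighbourhoods rather than individual positions.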
The DeepLocPro 1.0 server also accepts the organism group of the input sequences as input.
- When Archaea or Gram positive is specified, predictions for periplasm and outer membrane are suppressed and
remapped to extracellular. For Any or Gram negative, no post-processing is applied.
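The post-processing rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the server's implementation: the class labels, the function name `postprocess`, and the choice to add the suppressed probability mass to the extracellular class are all assumptions, since the exact remapping is not described here.

```python
# Class labels and group names below are assumptions for illustration.
SUPPRESSED = ("Periplasmic", "Outer Membrane")
TARGET = "Extracellular"
GROUPS_WITH_REMAPPING = ("Archaea", "Gram positive")


def postprocess(probabilities, organism_group):
    """Fold suppressed classes into the extracellular class for selected groups."""
    result = dict(probabilities)  # work on a copy
    if organism_group not in GROUPS_WITH_REMAPPING:
        return result  # "Any" and "Gram negative": no post-processing
    # Assumption: the suppressed probability mass is added to the target class.
    moved = sum(result.get(label, 0.0) for label in SUPPRESSED)
    for label in SUPPRESSED:
        if label in result:
            result[label] = 0.0
    result[TARGET] = result.get(TARGET, 0.0) + moved
    return result


# Made-up probability table for one query protein.
probs = {
    "Cytoplasmic": 0.10,
    "Cytoplasmic Membrane": 0.10,
    "Periplasmic": 0.40,
    "Outer Membrane": 0.10,
    "Cell Wall": 0.05,
    "Extracellular": 0.25,
}
gram_pos = postprocess(probs, "Gram positive")
gram_neg = postprocess(probs, "Gram negative")
```

With these made-up numbers, the Gram positive result assigns no probability to periplasm or outer membrane, while the Gram negative result is returned unchanged.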
The DeepLocPro 1.0 server requires protein sequence(s) in FASTA format and cannot handle nucleic acid sequences.
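Before submitting, it can be useful to check locally that a file parses as FASTA and does not contain nucleotide sequences by mistake. The sketch below is a rough heuristic, not the server's own validation: the function names and the 0.9 threshold are assumptions, and because A, C, G, T and N are also valid amino acid codes, a flagged sequence should be inspected rather than rejected automatically.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def looks_like_nucleotide(sequence, threshold=0.9):
    """Heuristic: flag sequences made up almost entirely of A, C, G, T, U, N."""
    if not sequence:
        return False
    hits = sum(sequence.count(base) for base in "ACGTUN")
    return hits / len(sequence) >= threshold


# Made-up input with one protein and one DNA record.
fasta_text = ">prot1\nMKKTAIAIAVALAGFATVAQA\n>dna1\nATGAAAAAGACC\nGCTATC\n"
records = read_fasta(fasta_text)
flagged = [name for name, seq in records if looks_like_nucleotide(seq)]
```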
Two different versions of the output can be selected before running DeepLocPro 1.0. The long output will generate an attention plot per sequence while the short output will not generate any plots.
Paste protein sequence(s) in FASTA format or upload a FASTA file.
After the server successfully finishes the job, a summary page is displayed.
If an error occurs during the prediction, a log appears that describes the error.
The DeepLocPro output is composed of three main components:
- The Predicted localization displays the subcellular localization predicted for the query protein.
- The Probability table displays the probability assigned by the model to each of the subcellular localizations.
- The Feature importance displays a logo-like plot of the positions in the query protein with higher importance for the prediction.
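The relationship between the first two components can be illustrated with a short sketch. It assumes that the predicted localization is simply the class with the highest probability in the table; the function name `predicted_localization`, the class labels, and the numbers are hypothetical.

```python
def predicted_localization(probabilities):
    """Return the (label, probability) pair with the highest probability."""
    if not probabilities:
        raise ValueError("empty probability table")
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Made-up probability table for one query protein.
table = {
    "Cytoplasmic": 0.05,
    "Cytoplasmic Membrane": 0.70,
    "Periplasmic": 0.10,
    "Outer Membrane": 0.05,
    "Cell Wall": 0.05,
    "Extracellular": 0.05,
}
label, probability = predicted_localization(table)
```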
Training and testing data sets
The dataset used to train and test the DeepLocPro 1.0 server is available here:
Predicting the subcellular location of prokaryotic proteins with DeepLocPro.
Jaime Moreno, Henrik Nielsen, Ole Winther, Felix Teufel.
bioRxiv 2024.01.04.574157; doi: https://doi.org/10.1101/2024.01.04.574157
Protein subcellular location prediction is a widely explored task in bioinformatics because of its importance in proteomics research. We propose DeepLocPro, an extension to the popular method DeepLoc, tailored specifically to archaeal and bacterial organisms. DeepLocPro is a multiclass subcellular location prediction tool for prokaryotic proteins, trained on experimentally verified data curated from UniProt and PSORTdb. DeepLocPro compares favorably to the PSORTb 3.0 ensemble method, surpassing its performance across multiple metrics on our benchmark experiment.