DTU Health Tech

Department of Health Technology


NetPhosBac - 1.0

Generic phosphorylation sites in bacterial proteins

The NetPhosBac 1.0 server predicts serine and threonine phosphorylation sites in bacterial proteins. This service is closely related to NetPhos, NetPhosK and NetPhosYeast.

Submission


Sequence submission: paste the sequence(s) and/or upload a local file

Paste a single sequence or several sequences in FASTA format into the field below:

Submit a file in FASTA format directly from your local disk:

Generate graphics         Output in GFF


Restrictions:
At most 2,000 sequences and 200,000 amino acids per submission; each sequence not longer than 6,000 amino acids.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

NetPhosBac - A predictor for Ser/Thr phosphorylation sites in bacterial proteins.
Martin Lee Miller, Boumediene Soufi, Carsten Jers, Nikolaj Blom, Boris Macek and Ivan Mijakovic.
Proteomics. 2008 Dec 3. PUBMED

Instructions



1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y and X (unknown)

All the other symbols will be converted to X before processing. The sequences can be input in the following two ways:

  • Paste a single sequence (just the amino acids) or a number of sequences in FASTA format into the upper window of the main server page.

  • Select a FASTA file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both ways can be used at the same time: all the specified sequences will be processed. However, one submission may contain no more than 2,000 sequences and 200,000 amino acids in total. Sequences longer than 6,000 amino acids are not allowed.
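The input rules above can be checked locally before submitting. The following is a minimal sketch, not the server's own code: the function names are hypothetical, and stripping whitespace inside sequence lines and naming a header-less sequence "Sequence" are assumptions. The alphabet, the conversion of other symbols to X, and the three limits are taken from the text above.

```python
# Allowed one-letter codes and submission limits, as stated in the instructions.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")
MAX_SEQUENCES = 2000
MAX_TOTAL_RESIDUES = 200_000
MAX_SEQUENCE_LENGTH = 6000


def parse_fasta(text):
    """Return (name, sequence) pairs; a bare sequence gets the name 'Sequence'."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None or chunks:
                records.append((name or "Sequence", "".join(chunks)))
            name, chunks = line[1:].strip() or "Sequence", []
        else:
            # Assumption: whitespace inside a sequence line is ignored.
            chunks.append("".join(line.split()))
    if name is not None or chunks:
        records.append((name or "Sequence", "".join(chunks)))
    return records


def normalize(sequence):
    """Upper-case the sequence and replace symbols outside the alphabet with X."""
    return "".join(c if c in ALLOWED else "X" for c in sequence.upper())


def check_submission(text):
    """Normalize all sequences and raise ValueError if a limit is exceeded."""
    records = [(name, normalize(seq)) for name, seq in parse_fasta(text)]
    if len(records) > MAX_SEQUENCES:
        raise ValueError("more than 2,000 sequences in one submission")
    if sum(len(seq) for _, seq in records) > MAX_TOTAL_RESIDUES:
        raise ValueError("more than 200,000 amino acids in one submission")
    for name, seq in records:
        if len(seq) > MAX_SEQUENCE_LENGTH:
            raise ValueError(f"sequence {name!r} is longer than 6,000 amino acids")
    return records
```

Calling `check_submission` on the text of a FASTA file returns the normalized records, or raises an error naming the limit that was exceeded.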

2. Customize your run

By default the server produces graphical output illustrating the predictions (in GIF). The graphs can be very valuable for locating the "hot" spots in your proteins. The generation of graphics can be disabled by un-checking the button labelled 'Generate graphics'.

If the button labelled 'Output in GFF' is checked, the text output of the server will be in GFF format.

3. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.



Output format



DESCRIPTION

For each input sequence the length and the name of the sequence are stated followed by a table with the prediction results. There is a table row for each serine or threonine residue in the sequence; the columns are:

  • sequence name, truncated to 20 characters;

  • residue position in the sequence;

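  • residue: serine (S) or threonine (T);

  • context: the residue together with its flanking residues in the sequence (nine residues in the example output);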

  • score, a number between 0 and 1; when the score is above 0.5 the residue is a predicted phosphorylation site;

  • kinase: the current version of NetPhosBac does not make kinase-specific predictions;

  • answer: either the word "YES" (abbreviated to "Y" in the example output) or a dot ("."), reflecting the score.

After the table, the whole sequence is printed alongside a summary of the predicted phosphorylation sites and their positions.

Finally, if the 'Generate graphics' button has been checked, the server displays a figure in GIF showing a plot of the score for each serine or threonine residue against the sequence position of that residue.
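The table rows described above can be read back into structured records for further filtering. The sketch below is one possible way to do this, assuming the default text output layout shown in the example output (rows starting with "#" and seven whitespace-separated fields) and sequence names without embedded whitespace. The names `Site` and `parse_predictions` are hypothetical and not part of NetPhosBac.

```python
from typing import NamedTuple


class Site(NamedTuple):
    name: str
    position: int
    residue: str
    context: str
    score: float
    kinase: str
    answer: str


def parse_predictions(output):
    """Collect one Site per table row; header and separator lines are skipped."""
    sites = []
    for line in output.splitlines():
        if not line.startswith("#"):
            continue
        fields = line.lstrip("#").split()
        if len(fields) != 7:
            continue
        name, pos, residue, context, score, kinase, answer = fields
        # Data rows have a numeric position and an S or T residue.
        if not pos.isdigit() or residue not in ("S", "T"):
            continue
        try:
            value = float(score)
        except ValueError:
            continue
        sites.append(Site(name, int(pos), residue, context, value, kinase, answer))
    return sites


# Three rows copied from the example output on this page.
sample = """\
# sp_P0A6C7_ARGA_ECO57     7 T   KERKTELVE   0.319   main        .
# sp_P0A6C7_ARGA_ECO57    16 S   GFRHSVPYI   0.591   main        Y
# sp_P0A6C7_ARGA_ECO57    44 S   ENFSSIVND   0.498   main        .
"""

# A residue is a predicted site when its score is above 0.5.
predicted = [site for site in parse_predictions(sample) if site.score > 0.5]
```

Filtering on the score rather than on the answer column keeps the threshold explicit and makes it easy to apply a stricter cutoff.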


EXAMPLE OUTPUT

The example below shows the output for the UniProt entry P0A6C7 (amino-acid acetyltransferase):
>sp_P0A6C7_ARGA_ECO57	443 amino acids
#
# netphosbac-1.0a prediction results
#
# Sequence		   # x   Context     Score   Kinase    Answer
# -------------------------------------------------------------------
# sp_P0A6C7_ARGA_ECO57     7 T   KERKTELVE   0.319   main        .
# sp_P0A6C7_ARGA_ECO57    16 S   GFRHSVPYI   0.591   main        Y
# sp_P0A6C7_ARGA_ECO57    22 T   PYINTHRGK   0.482   main        .
# sp_P0A6C7_ARGA_ECO57    27 T   HRGKTFVIM   0.474   main        .
# sp_P0A6C7_ARGA_ECO57    43 S   HENFSSIVN   0.461   main        .
# sp_P0A6C7_ARGA_ECO57    44 S   ENFSSIVND   0.498   main        .
# sp_P0A6C7_ARGA_ECO57    54 S   GLLHSLGIR   0.673   main        Y
# sp_P0A6C7_ARGA_ECO57    89 T   NIRVTDAKT   0.199   main        .
# sp_P0A6C7_ARGA_ECO57    93 T   TDAKTLELV   0.420   main        .
# sp_P0A6C7_ARGA_ECO57   103 T   QAAGTLQLD   0.171   main        .
# sp_P0A6C7_ARGA_ECO57   109 T   QLDITARLS   0.452   main        .
# sp_P0A6C7_ARGA_ECO57   113 S   TARLSMSLN   0.271   main        .
# sp_P0A6C7_ARGA_ECO57   115 S   RLSMSLNNT   0.306   main        .
# sp_P0A6C7_ARGA_ECO57   119 T   SLNNTPLQG   0.160   main        .
# sp_P0A6C7_ARGA_ECO57   130 S   INVVSGNFI   0.439   main        .
# sp_P0A6C7_ARGA_ECO57   150 S   DYCHSGRIR   0.839   main        Y
# sp_P0A6C7_ARGA_ECO57   167 S   RQLDSGAIV   0.488   main        .
# sp_P0A6C7_ARGA_ECO57   179 S   PVAVSVTGE   0.305   main        .
# sp_P0A6C7_ARGA_ECO57   181 T   AVSVTGESF   0.285   main        .
# sp_P0A6C7_ARGA_ECO57   184 S   VTGESFNLT   0.499   main        .
# sp_P0A6C7_ARGA_ECO57   188 T   SFNLTSEEI   0.202   main        .
# sp_P0A6C7_ARGA_ECO57   189 S   FNLTSEEIA   0.418   main        .
# sp_P0A6C7_ARGA_ECO57   194 T   EEIATQLAI   0.228   main        .
# sp_P0A6C7_ARGA_ECO57   210 S   IGFCSSQGV   0.476   main        .
# sp_P0A6C7_ARGA_ECO57   211 S   GFCSSQGVT   0.413   main        .
# sp_P0A6C7_ARGA_ECO57   215 T   SQGVTNDDG   0.184   main        .
# sp_P0A6C7_ARGA_ECO57   223 S   GDIVSELFP   0.342   main        .
# sp_P0A6C7_ARGA_ECO57   245 S   GDYNSGTVR   0.533   main        Y
# sp_P0A6C7_ARGA_ECO57   247 T   YNSGTVRFL   0.359   main        .
# sp_P0A6C7_ARGA_ECO57   260 S   KACRSGVRR   0.776   main        Y
# sp_P0A6C7_ARGA_ECO57   269 S   CHLISYQED   0.312   main        .
# sp_P0A6C7_ARGA_ECO57   282 S   QELFSRDGI   0.319   main        .
# sp_P0A6C7_ARGA_ECO57   288 T   DGIGTQIVM   0.264   main        .
# sp_P0A6C7_ARGA_ECO57   294 S   IVMESAEQI   0.516   main        Y
# sp_P0A6C7_ARGA_ECO57   302 T   IRRATINDI   0.318   main        .
# sp_P0A6C7_ARGA_ECO57   326 S   LVRRSREQL   0.456   main        .
# sp_P0A6C7_ARGA_ECO57   338 T   IDKFTIIQR   0.451   main        .
# sp_P0A6C7_ARGA_ECO57   345 T   QRDNTTIAC   0.442   main        .
# sp_P0A6C7_ARGA_ECO57   346 T   RDNTTIACA   0.309   main        .
# sp_P0A6C7_ARGA_ECO57   374 S   PDYRSSSRG   0.287   main        .
# sp_P0A6C7_ARGA_ECO57   375 S   DYRSSSRGE   0.771   main        Y
# sp_P0A6C7_ARGA_ECO57   376 S   YRSSSRGEV   0.701   main        Y
# sp_P0A6C7_ARGA_ECO57   392 S   QAKQSGLSK   0.288   main        .
# sp_P0A6C7_ARGA_ECO57   395 S   QSGLSKLFV   0.581   main        Y
# sp_P0A6C7_ARGA_ECO57   401 T   LFVLTTRSI   0.432   main        .
# sp_P0A6C7_ARGA_ECO57   402 T   FVLTTRSIH   0.362   main        .
# sp_P0A6C7_ARGA_ECO57   404 S   LTTRSIHWF   0.729   main        Y
# sp_P0A6C7_ARGA_ECO57   414 T   ERGFTPVDI   0.359   main        .
# sp_P0A6C7_ARGA_ECO57   424 S   LLPESKKQL   0.294   main        .
# sp_P0A6C7_ARGA_ECO57   435 S   YQRKSKVLM   0.548   main        Y
#
    MVKERKTELVEGFRHSVPYINTHRGKTFVIMLGGEAIEHENFSSIVNDIG   #     50
    LLHSLGIRLVVVYGARPQIDANLAAHHHEPLYHKNIRVTDAKTLELVKQA   #    100
    AGTLQLDITARLSMSLNNTPLQGAHINVVSGNFIIAQPLGVDDGVDYCHS   #    150
    GRIRRIDEDAIHRQLDSGAIVLMGPVAVSVTGESFNLTSEEIATQLAIKL   #    200
    KAEKMIGFCSSQGVTNDDGDIVSELFPNEAQARVEAQEEKGDYNSGTVRF   #    250
    LRGAVKACRSGVRRCHLISYQEDGALLQELFSRDGIGTQIVMESAEQIRR   #    300
    ATINDIGGILELIRPLEQQGILVRRSREQLEMEIDKFTIIQRDNTTIACA   #    350
    ALYPFPEEKIGEMACVAVHPDYRSSSRGEVLLERIAAQAKQSGLSKLFVL   #    400
    TTRSIHWFQERGFTPVDIDLLPESKKQLYNYQRKSKVLMADLG          #    450
%1  ...............S..................................   #     50
%1  ...S..............................................   #    100
%1  .................................................S   #    150
%1  ..................................................   #    200
%1  ............................................S.....   #    250
%1  .........S.................................S......   #    300
%1  ..................................................   #    350
%1  ........................SS..................S.....   #    400
%1  ...S..............................S........
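The "%1" summary lines above mark each predicted site with its residue letter and every other position with a dot. As an illustration only, the sketch below rebuilds such a mask from a sequence and a list of predicted positions. The function names are hypothetical, and the code does not reproduce the server's exact spacing or the position counters at the end of each line.

```python
def site_mask(sequence, positions):
    """Return the residue letter at each predicted position and '.' elsewhere.

    Positions are 1-based, as in the prediction table.
    """
    predicted = set(positions)
    return "".join(
        residue if index in predicted else "."
        for index, residue in enumerate(sequence, start=1)
    )


def wrap(text, width=50):
    """Split text into consecutive chunks of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]


# First 50 residues of the example sequence; position 16 scored above 0.5.
first_block = "MVKERKTELVEGFRHSVPYINTHRGKTFVIMLGGEAIEHENFSSIVNDIG"
mask = site_mask(first_block, [16])
```

For a full-length sequence, `wrap(site_mask(sequence, positions))` yields one mask string per 50-residue block, matching the block size used in the example output.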


References



NetPhosBac - A predictor for Ser/Thr phosphorylation sites in bacterial proteins.
Martin Lee Miller (1,4), Boumediene Soufi (2,4), Carsten Jers (2), Nikolaj Blom (1), Boris Macek (3) and Ivan Mijakovic (2).
Proteomics. 2008 Dec 3. PUBMED

(1) Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Lyngby, Denmark
(2) Center for Microbial Biotechnology, Department of Systems Biology, Technical University of Denmark, DK-2800 Lyngby, Denmark
(3) Department of Proteomics and Signal Transduction, Max-Planck-Institute for Biochemistry, DE-82152 Martinsried, Germany

(4) These authors contributed equally to this work.


Abstract

There is ample evidence for the involvement of protein phosphorylation on serine/threonine/tyrosine in bacterial signaling and regulation, but very few exact phosphorylation sites have been experimentally determined. Recently, gel-free high accuracy MS studies reported over 150 phosphorylation sites in two bacterial model organisms Bacillus subtilis and Escherichia coli. Interestingly, the analysis of these phosphorylation sites revealed that most of them are not characteristic for eukaryotic-type protein kinases, which explains the poor performance of eukaryotic data-trained phosphorylation predictors on bacterial systems. We used these large bacterial datasets and neural network algorithms to create the first bacteria-specific protein phosphorylation predictor: NetPhosBac. With respect to predicting bacterial phosphorylation sites, NetPhosBac significantly outperformed all benchmark predictors. Moreover, NetPhosBac predictions of phosphorylation sites in E. coli proteins were experimentally verified on protein and site-specific levels. In conclusion, NetPhosBac clearly illustrates the advantage of taxa-specific predictors and we hope it will provide a useful asset to the microbiological community.




GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0). If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.

Correspondence: Technical Support: