DTU Health Tech

Department of Health Technology

NetPhospan - 1.0

Prediction of phosphorylation using convolutional neural networks (CNNs).

The NetPhospan server predicts phosphorylation by any human kinase of known sequence using convolutional neural networks (CNNs). The method is trained on more than 8,700 reported phosphorylation sites for 120 different human protein kinases. The server also accepts custom kinases provided as full-length sequences in FASTA format.

Predictions can be made for peptides of length 21.
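
As an illustration of what a 21-residue peptide looks like in practice, the sketch below cuts 21-mers out of a protein sequence. It assumes, based only on the rows in the output example on this page, that each peptide is centered on a serine, threonine or tyrosine; the function name `candidate_windows` is hypothetical, and residues too close to either end are simply skipped because edge handling is not described here.

```python
def candidate_windows(sequence, length=21, targets="STY"):
    """Yield (1-based position, peptide) for each full window centered on a target residue.

    Assumption: windows are centered on S/T/Y, as the example output suggests.
    """
    flank = length // 2
    seq = sequence.upper()
    for i, residue in enumerate(seq):
        # Keep only residues with enough sequence on both sides for a full window.
        if residue in targets and i >= flank and i + flank < len(seq):
            yield i + 1, seq[i - flank:i + flank + 1]


if __name__ == "__main__":
    for pos, peptide in candidate_windows("QGRNLADLRRSQSRGTFTISTTL"):
        print(pos, peptide)
```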

Link to a tab-separated table describing the training data: Training data table

SUBMISSION


Type of input

Paste a single sequence or several sequences in FASTA format into the field below:

or submit a file in FASTA format directly from your local disk:

Method Selection
Pan-specific method
Generic method

Select kinase group

Select kinases from the list, or type kinase gene names (e.g. AKT1) separated by commas and no spaces (max 20 kinases per submission).

For the list of allowed kinase names, see: List of Kinase gene names.

or paste a single full length kinase domain protein sequence in FASTA format into the field below:

or submit a file containing a full kinase domain protein sequence in FASTA format directly from your local disk:



Sort by predicted score 

Save predictions to XLS file 

Restrictions:
At most 5000 sequences per submission; each sequence must be between 8 and 20,000 amino acids long. At most 20 kinase domains per submission.
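
A submission can be checked against these limits before uploading. The following is a minimal sketch, not part of the server: the function name `check_submission` and the dictionary layout are hypothetical, and the constants simply mirror the Restrictions text so they are easy to adjust.

```python
MAX_SEQUENCES = 5000   # sequences per submission
MIN_LENGTH = 8         # shortest accepted sequence (amino acids)
MAX_LENGTH = 20000     # longest accepted sequence (amino acids)
MAX_KINASES = 20       # kinase domains per submission


def check_submission(sequences, kinases):
    """Return a list of problems; an empty list means the stated limits are met.

    sequences: dict mapping identifier -> amino acid sequence
    kinases:   list of kinase names or kinase domain sequences
    """
    problems = []
    if len(sequences) > MAX_SEQUENCES:
        problems.append(f"too many sequences: {len(sequences)} > {MAX_SEQUENCES}")
    for name, seq in sequences.items():
        if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
            problems.append(
                f"{name}: length {len(seq)} outside {MIN_LENGTH}-{MAX_LENGTH}"
            )
    if len(kinases) > MAX_KINASES:
        problems.append(f"too many kinases: {len(kinases)} > {MAX_KINASES}")
    return problems
```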

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

A generic Deep Convolutional Neural Network framework for prediction of Receptor-ligand Interactions. NetPhosPan; Application to Kinase Phosphorylation prediction.
Emilio Fenoy, Jose M. G. Izarzugaza, Vanessa Jurtz, Søren Brunak and Morten Nielsen.
Bioinformatics (2018).
Full text  


DATA RESOURCES

Kinase domain sequences were obtained from

  • KinBase database. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science. 2002 Dec 6;298(5600):1912-34.

Phosphorylated sequences were obtained from

  • Phospho.ELM database. Craveur P, Rebehmed J, de Brevern AG; PTM-SD: a database of structurally resolved and annotated posttranslational modifications in proteins. Database (Oxford) 2014; 2014 bau041. doi: 10.1093/database/bau041.
  • PhosphoSitePlus database. Hornbeck PV, Zhang B, Murray B, Kornhauser JM, Latham V, Skrzypek E; PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res 2015; 43 (D1): D512-D520. doi: 10.1093/nar/gku1267

Usage instructions



1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y and X (unknown)

All the other symbols will be converted to X before processing.
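
The conversion rule can be reproduced locally to preview what the server will see. This is a minimal sketch of the rule as stated above; the function name `normalize_sequence` is hypothetical, and dropping whitespace before conversion is an assumption rather than documented server behaviour.

```python
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def normalize_sequence(sequence):
    """Upper-case the sequence and replace every symbol outside the alphabet with X."""
    # Assumption: whitespace (e.g. line breaks inside a FASTA record) is dropped first.
    compact = "".join(sequence.split())
    return "".join(c if c in ALLOWED else "X" for c in compact.upper())


if __name__ == "__main__":
    print(normalize_sequence("acdBZ*u"))
```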

The server allows for input in either FASTA or PEPTIDE format.

Note that for PEPTIDE input, all peptides MUST be of equal length. Note also that you must check the box Click if input is PEPTIDE format if the input is in peptide format.
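
The equal-length requirement is easy to verify before submitting a peptide list. A minimal sketch, with the hypothetical helper name `check_peptide_lengths`:

```python
def check_peptide_lengths(peptides):
    """Return the common peptide length, or raise ValueError if lengths differ."""
    lengths = {len(p) for p in peptides}
    if len(lengths) > 1:
        raise ValueError(f"peptides have different lengths: {sorted(lengths)}")
    # An empty list has no length to report.
    return lengths.pop() if lengths else 0
```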

The sequences can be input in the following two ways:

  • Paste a single sequence (just the amino acids) or a number of sequences in FASTA format or a list of peptides into the upper window of the main server page.

  • Select a FASTA or PEPTIDE file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both ways can be used at the same time: all the specified sequences will be processed. However, there may be no more than 10 sequences in total in one submission.
Sequences shorter than 15 or longer than 10000 amino acids will be ignored.


2. Customize your run

Use Method Selection to choose between the Pan-specific predictor and the Generic predictor. The Pan-specific method was trained on both peptide and kinase sequences and predicts phosphorylation for the selected (or provided) kinases. The Generic method was trained on peptide sequences only, with no information about the kinase.

If you selected the Pan-specific method, select the kinase(s) you want to make predictions for from the scroll-down menu, or type in the kinase names separated by commas (without blank spaces).
If the kinase you are looking for is not in the list, a full-length protein kinase domain sequence can be submitted instead.
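
When the kinase names come from a script or spreadsheet, the comma-separated string can be built programmatically. The sketch below only enforces the formatting rules stated on this page (commas, no spaces, at most 20 names); the function name `format_kinase_list` is hypothetical, and it does not check names against the server's list of allowed kinases.

```python
def format_kinase_list(names, max_kinases=20):
    """Join kinase gene names with commas and no spaces, as the form expects."""
    cleaned = [n.strip() for n in names if n.strip()]
    if len(cleaned) > max_kinases:
        raise ValueError(f"{len(cleaned)} kinases given, at most {max_kinases} allowed")
    for name in cleaned:
        # A space or comma inside a name would break the comma-separated format.
        if " " in name or "," in name:
            raise ValueError(f"invalid kinase name: {name!r}")
    return ",".join(cleaned)


if __name__ == "__main__":
    print(format_kinase_list(["AKT1", " PKACA "]))
```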

Select an option under Sort by predicted score to have the output sorted by score in descending order.

Check the box Save predictions to XLS file to save the raw prediction output to an Excel file. This file will be available at the bottom of the results output.


3. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated.
The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.

Output format



DESCRIPTION

The prediction output consists of 6 columns.

  • Residue number
  • Kinase
  • Peptide sequence
  • Protein identifier
  • Prediction score
  • Flag - YES/NO decision, only available for kinases in the training set



  • OUTPUT EXAMPLE

    
    
    
    # NetPhospan version 1.0
    
    # Tmpdir made /scratch/netPhospanq4riRd
    # Input is in FSA format
    
    # Peptide length 21
    
    -----------------------------------------------------------------------------------------------------
     pos          kin         peptide               Identity      Pred     Flag
    -----------------------------------------------------------------------------------------------------
        0        PKACA GEIYDALDMLTRENVALKVES           TTBK2      0.028      0
        0        PKACA TRENVALKVESAQQPKQVLKM           TTBK2      0.192      0
        0        PKACA QGRNLADLRRSQSRGTFTIST           TTBK2      0.755      1
        0        PKACA RNLADLRRSQSRGTFTISTTL           TTBK2      0.847      1
        0        PKACA ADLRRSQSRGTFTISTTLRLG           TTBK2      0.602      1
        0        PKACA LRRSQSRGTFTISTTLRLGRQ           TTBK2      0.306      0
        0        PKACA RSQSRGTFTISTTLRLGRQIL           TTBK2      0.305      0
    -----------------------------------------------------------------------------------------------------
    
    Protein TTBK2. Kinase PKACA. Number of peptides 24
    
    -----------------------------------------------------------------------------------------------------
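
For downstream analysis, the text output can be read into structured records. The following is a minimal sketch that assumes the six whitespace-separated columns shown in the example above; the names `Prediction` and `parse_predictions` are hypothetical, and the flag is kept as the raw string since the example prints it as 0/1.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    pos: int
    kinase: str
    peptide: str
    identity: str
    score: float
    flag: str  # kept as printed by the server (0/1 in the example)


def parse_predictions(text):
    """Extract data rows from the plain-text output, skipping everything else."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # separators, summary lines and most comment lines
        try:
            rows.append(Prediction(int(fields[0]), fields[1], fields[2],
                                   fields[3], float(fields[4]), fields[5]))
        except ValueError:
            continue  # column header or a comment that happens to have six words
    return rows
```

Rows above a chosen score can then be selected with an ordinary list comprehension, for example `[r for r in rows if r.score >= 0.5]`; the 0.5 cut-off here is only illustrative.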
    

    Software Downloads




    GETTING HELP

    If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

    If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.

    Correspondence: Technical Support: