DTU Health Tech

Department of Health Technology

NetMHCII - 2.3

Binding of peptides to MHC class II molecules

The NetMHCII 2.3 server predicts binding of peptides to HLA-DR, HLA-DQ, HLA-DP and mouse MHC class II alleles using artificial neural networks.

Predictions can be obtained for 25 HLA-DR alleles, 20 HLA-DQ, 9 HLA-DP, and 7 mouse H2 class II alleles.

Predictions are reported as IC50 values in nM and as a %Rank relative to a set of 1,000,000 random natural peptides. Strong and weak binding peptides are indicated in the output.

Note: if you download the stand-alone version of the tool, obtain the required data.tar.gz file from data.Linux.tar.gz (Linux) or data.Darwin.tar.gz (Mac).

Submission


Type of input

Paste a single sequence or several sequences in FASTA format into the field below:

or submit a file in FASTA format directly from your local disk:

Peptide length  

Select Loci

Select Allele (max 15 per submission) or type allele names (i.e. DRBX_XXXX) separated by commas (max 15 per submission)  


Threshold  
Threshold for strong binder (% Rank)  
Threshold for weak binder (% Rank)  
Sort by affinity 

Restrictions:
At most 5000 sequences per submission; each sequence must be between 9 and 20,000 amino acids long.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

  • Improved methods for predicting peptide binding affinity to MHC class II molecules.
    Jensen KK, Andreatta M, Marcatili P, Buus S, Greenbaum JA, Yan Z, Sette A, Peters B, and Nielsen M.
    PMID: 29315598

Instructions


1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y and X

All other letters will be converted to X before processing, and all non-letter characters will be removed. The sequences can be input in the following two ways:

  • Paste a single sequence (just the amino acids) or a number of sequences in FASTA format into the upper window of the main server page.

  • Select a FASTA file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both ways can be employed at the same time: all the specified sequences will be processed. However, there may be no more than 10 sequences in total in one submission. Sequences shorter than 15 or longer than 4000 amino acids will be ignored.
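
The clean-up rules described above can be sketched in a few lines of Python. This is only an illustration, not the server's own code: the function name sanitize_sequence is hypothetical, and it assumes that "non-characters" means anything that is not a letter (digits, whitespace, punctuation).

```python
# Allowed one-letter amino acid codes, plus X for unknown residues.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def sanitize_sequence(raw: str) -> str:
    """Uppercase the input, drop non-letters, and map other letters to X.

    Assumption: "non-characters" is read here as "non-letter characters".
    """
    cleaned = []
    for ch in raw.upper():
        if not ch.isalpha():
            continue  # digits, whitespace and punctuation are removed
        cleaned.append(ch if ch in ALLOWED else "X")
    return "".join(cleaned)


if __name__ == "__main__":
    print(sanitize_sequence("acd-e 1B*z"))
```

Running such a check locally before submission makes it easier to see which residues would be replaced by X.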


2. Customize your run

Select the allele(s) you want to make predictions for.

To limit the amount of output data produced, select a Threshold value. Only predictions with a score greater than the threshold value will be displayed.


3. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.



Article abstract


Improved methods for predicting peptide binding affinity to MHC class II molecules.
Jensen KK, Andreatta M, Marcatili P, Buus S, Greenbaum JA, Yan Z, Sette A, Peters B, Nielsen M
Immunology. 2018 Jan 6. doi: 10.1111/imm.12889

Major histocompatibility complex class II (MHC-II) molecules are expressed on the surface of professional antigen-presenting cells where they display peptides to T helper cells, which orchestrate the onset and outcome of many host immune responses. Understanding which peptides will be presented by the MHC-II molecule is therefore important for understanding the activation of T helper cells and can be used to identify T-cell epitopes. We here present updated versions of two MHC-II-peptide binding affinity prediction methods, NetMHCII and NetMHCIIpan. These were constructed using an extended data set of quantitative MHC-peptide binding affinity data obtained from the Immune Epitope Database covering HLA-DR, HLA-DQ, HLA-DP and H-2 mouse molecules. We show that training with this extended data set improved the performance for peptide binding predictions for both methods.

PMID: 29315598


Output format



DESCRIPTION

The header of the output indicates which prediction method and HLA allele were selected, as well as the two threshold values defining high binding and weak binding peptides. High binding peptides have an IC50 value below 50 nM, and weak binding peptides an IC50 value below 500 nM. For artificial neural network (ANN) predictions the output consists of 6 columns; for weight matrix predictions it consists of 5 columns:
  • Residue number.
  • Peptide sequence (9mer)
  • Prediction score (called 1-log50K(aff) for ANN predictions)
  • Affinity as IC50 value in nM (only for ANN predictions)
  • Bind Level (WB for weak binder, SB for strong binder)
  • Sequence name

    The predictions for each protein are summarized with a line stating the number of high and weak binding peptides identified.




EXAMPLE OUTPUT

    
    NetMHCII version 2.2. 
    
    Strong binder threshold  50.00. Weak binder threshold 500.00.
    
    -----------------------------------------------------------------------------------------------
          Allele  pos          peptide       core 1-log50k(aff) affinity(nM) Bind Level  %Random     Identity
    ------------------------------------------------------------------------------------------------
    HLA-DRB10301    0  ASQKRPSQRHGSKYL  SQKRPSQRH        0.0444      30932.9               50.00   seq2 optio
    HLA-DRB10301    1  SQKRPSQRHGSKYLA  SQRHGSKYL        0.0456      30519.7               50.00   seq2 optio
    HLA-DRB10301    2  QKRPSQRHGSKYLAT  SQRHGSKYL        0.0492      29375.3               50.00   seq2 optio
    HLA-DRB10301    3  KRPSQRHGSKYLATA  SQRHGSKYL        0.0581      26676.2               50.00   seq2 optio
    HLA-DRB10301    4  RPSQRHGSKYLATAS  SQRHGSKYL        0.0528      28231.7               50.00   seq2 optio
    ...
    ------------------------------------------------------------------------------------------------
    
    Allele: HLA-DRB10301. Number of high binders 0. Number of weak binders 5. Number of peptides 138
    
    ------------------------------------------------------------------------------------------------
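
For downstream analysis, rows like those in the example output can be split into named fields. The sketch below is an assumption about the layout based only on the example shown here: the names Prediction and parse_row are hypothetical, the Bind Level column is assumed to be either empty or one of SB/WB, and everything after %Random is treated as the sequence identity.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    allele: str
    pos: int
    peptide: str
    core: str
    score: float        # 1-log50k(aff)
    affinity_nm: float  # IC50 in nM
    bind_level: str     # "SB", "WB" or "" when the column is empty
    pct_random: float
    identity: str


def parse_row(line: str) -> Prediction:
    """Parse one whitespace-separated prediction row (layout assumed from the example)."""
    fields = line.split()
    allele, pos, peptide, core, score, affinity = fields[:6]
    rest = fields[6:]
    bind_level = ""
    if rest and rest[0] in ("SB", "WB"):
        bind_level = rest.pop(0)  # the Bind Level column is optional
    pct_random = float(rest[0])
    identity = " ".join(rest[1:])  # the identity may contain spaces
    return Prediction(allele, int(pos), peptide, core, float(score),
                      float(affinity), bind_level, pct_random, identity)


if __name__ == "__main__":
    row = ("HLA-DRB10301    0  ASQKRPSQRHGSKYL  SQKRPSQRH        0.0444"
           "      30932.9               50.00   seq2 optio")
    print(parse_row(row))
```

Header, separator and summary lines do not follow this layout and would need to be skipped before calling parse_row.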
    

    Training and Evaluation Data


    NN-align. A neural network-based alignment algorithm for MHC class II peptide binding prediction.

    Here, you will find the data set used for training and evaluation of the NN-align method. Fourteen HLA-DR and four mouse class II alleles are included in the benchmark. Following the links below you will be directed to a directory containing the data for each allele. Each directory contains 6 data files. The files c000, c001, c002, c003, and c004 contain the split data file used for cross validation. If, for instance, the file c004 is used as the evaluation set, the other four files c000, c001, c002, and c003 are used as training data. The file all contains all data (i.e. cat c00?).

    The format for each of the files (c00?, all) is

    ACRVKHDSMAEPKTVY 0.227054
    AKRVVRDPQGIRAWV 0.024247
    AQFMWIIRKRIQLP 0.803966
    ATSTKKLHKEPATLIKAIDG 0.000000
    AWVAWRNRCK 0.340978
    CYVSGFHPSDIEVDLL 0.047212
    DGKTPRAVNACGIN 0.000000
    ERAEAWRQKLHGRL 0.614743
    

    where the first column gives the peptide sequence, and the second column the log50k transformed binding affinity (i.e. 1 - log50k( aff nM)).
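
The five-way cross validation described above can be reproduced from the c00? files roughly as follows. This is a minimal sketch under stated assumptions: the function names read_partition and cross_validation_splits are hypothetical, and each file is assumed to hold one whitespace-separated peptide/value pair per line, as in the listing above.

```python
from pathlib import Path

PARTITIONS = ["c000", "c001", "c002", "c003", "c004"]


def read_partition(path):
    """Read (peptide, value) pairs from one data file, skipping blank lines."""
    pairs = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            peptide, value = line.split()
            pairs.append((peptide, float(value)))
    return pairs


def cross_validation_splits(allele_dir):
    """Yield (held_out_name, training, evaluation) once per partition."""
    data = {name: read_partition(Path(allele_dir) / name) for name in PARTITIONS}
    for held_out in PARTITIONS:
        # The four partitions that are not held out form the training set.
        training = [pair
                    for name in PARTITIONS if name != held_out
                    for pair in data[name]]
        yield held_out, training, data[held_out]
```

Each partition serves exactly once as the evaluation set, so every peptide in the file all is evaluated by a model that did not see it during training.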

    When classifying the peptides into binders and non-binders, a threshold of 500 nM is used. This means that peptides with log50k transformed binding affinity values greater than 0.426 are classified as binders.
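
The transformation and the 500 nM cut-off can be checked with a short calculation. The sketch below reads log50k as the logarithm to base 50,000; the function names are hypothetical, and the clipping to the range [0, 1] is an assumption suggested by the 0.000000 values in the listing above rather than something stated on this page.

```python
import math


def log50k(aff_nm: float) -> float:
    """Return 1 - log50k(aff), with aff given as IC50 in nM.

    Assumption: results are clipped to [0, 1].
    """
    score = 1.0 - math.log(aff_nm) / math.log(50000.0)
    return min(1.0, max(0.0, score))


def affinity_from_score(score: float) -> float:
    """Invert the transformation: IC50 in nM for a given score."""
    return 50000.0 ** (1.0 - score)


# Score corresponding to the 500 nM binder threshold (about 0.426).
BINDER_THRESHOLD = log50k(500.0)


def is_binder(score: float) -> bool:
    """Classify a log50k-transformed value as binder (True) or non-binder."""
    return score > BINDER_THRESHOLD


if __name__ == "__main__":
    print(round(BINDER_THRESHOLD, 3))
    print(is_binder(0.803966), is_binder(0.227054))
```

The same formula reproduces the first row of the example output, where an affinity of 30932.9 nM corresponds to a score of about 0.0444.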

    DRB1*0101 datasets
    DRB1*0301 datasets
    DRB1*0401 datasets
    DRB1*0404 datasets
    DRB1*0405 datasets
    DRB1*0701 datasets
    DRB1*0802 datasets
    DRB1*0901 datasets
    DRB1*1101 datasets
    DRB1*1302 datasets
    DRB1*1501 datasets
    DRB3*0101 datasets
    DRB4*0101 datasets
    DRB5*0101 datasets
    H2-IAb datasets
    H2-IAd datasets
    H2-IAs datasets

    References

    Nielsen M and Lund O. NN-align. A neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC Bioinformatics. 2009 Sep 18;10:296.

    Version history


    2.3 The current server (online since Sept, 2017). New in this version:
    • The method has been retrained on an extended data set
    2.2 Online since April 14, 2010. New in this version:
    • The method has been retrained on an extended data set including six HLA-DQ and six HLA-DP alleles.
    2.1 Online since 22 November 2009. New in this version:
    • The method has been retrained on an extended data set including four HLA-DQ and four HLA-DP alleles.
    2.0 Online since 30 September 2009. New in this version:
    • The NN-align method was applied to training the predictor for 14 HLA-DR and three mouse MHC class II alleles.
    • Publication: A neural network-based alignment algorithm for MHC class II peptide binding prediction. Nielsen M and Lund O. BMC Bioinformatics. 2009 Sep 18;10:296.
    1.1 Online since 20 August 2009. New in this version:
    • The method has been retrained on an extended data set including four HLA-DQ and four HLA-DP alleles.
    1.0a Retrained version trained on binding data from Wang et al. PLoS Comput Biol. 2008 Apr 4;4(4).
    • For comparison, the SMM-align method has been retrained on the Wang benchmark data set using 10-fold cross validation. The predictive performance values are available online at Wang Benchmark.
    1.0 Original version (online version until August 2009):

    Main publication:

    • Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method.
      Morten Nielsen*, Claus Lundegaard, and Ole Lund.
      BMC Bioinformatics: 8: 238, 2007.
      View the article at BMC Bioinformatics: Full text

      NOTE! The published version of the manuscript has an inconsistent reference listing. For a correct reference listing click here


    GETTING HELP

    If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

    If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
