The NetMHCpan server predicts binding of peptides to any known MHC molecule using artificial neural networks (ANNs). The method is trained on more than 150,000 quantitative binding data points covering more than 150 different MHC molecules. Predictions can be made for HLA-A, B, C, E and G alleles, as well as for non-human primate, mouse, cattle and pig alleles. Further, the user can upload full-length MHC protein sequences and have the server predict MHC-restricted peptides from any given protein of interest.
Version 2.8 has been retrained on an extended data set including 10 prevalent HLA-C and 7 prevalent BoLA MHC-I molecules.
Predictions can be made for 8-14mer peptides. Note that all non-9mer predictions are made using approximations. Most HLA molecules have a strong preference for binding 9mers.
The prediction values are given as IC50 values in nM and as a %-Rank relative to a set of 200,000 random natural peptides. For alleles distant from the MHC molecules included in the training of the method, only the Rank score is provided.
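To make the %-Rank concrete, the following minimal sketch computes the percentage of a background set of predicted affinities that are at least as strong as a query affinity. The function name `percent_rank` and the ten-value background list are illustrative assumptions; the exact ranking procedure used by the server is not described on this page and may differ.

```python
from bisect import bisect_right

def percent_rank(affinity_nm, background_nm):
    """Percentage of background affinities that are <= the query affinity.

    A lower IC50 (nM) means stronger binding, so a small %-Rank marks a
    peptide predicted to bind more strongly than most background peptides.
    """
    ordered = sorted(background_nm)
    stronger_or_equal = bisect_right(ordered, affinity_nm)
    return 100.0 * stronger_or_equal / len(ordered)

# Hypothetical background of predicted affinities in nM. The server ranks
# against 200,000 random natural peptides; ten values are used here only
# to keep the example readable.
background = [12.0, 85.0, 430.0, 2100.0, 9800.0,
              15000.0, 27000.0, 33000.0, 41000.0, 48000.0]

rank = percent_rank(430.0, background)
print(f"%-Rank: {rank:.2f}")
```

Because the rank is relative to a fixed background, it can be compared across alleles even when the absolute nM values are not directly comparable.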
The project is a collaboration between CBS, IMMI at Copenhagen University and LIAI.
Link to a tab-separated table describing the training data: Training data table
As of July 8th, the nomenclature for BoLA-I has been updated to follow IPD Release 1.3.
For publication of results, please cite:
Data resources used to develop this server were obtained from
All the other symbols will be converted to X before processing.
The server allows for input in either FASTA or PEPTIDE format.
Note that for PEPTIDE input, all peptides MUST be of equal length. Note also that you must tick the box "Click if input is PEPTIDE format" if the input is in peptide format.
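The equal-length requirement for PEPTIDE input can be checked before submission. The sketch below is one way to do so; it assumes one peptide per line (the page does not state the exact layout), and the function name `check_peptide_input` is hypothetical.

```python
def check_peptide_input(text):
    """Return the peptides from PEPTIDE-format text (assumed one per line).

    Raises ValueError if the peptides do not all have the same length.
    """
    peptides = [line.strip().upper() for line in text.splitlines() if line.strip()]
    lengths = {len(peptide) for peptide in peptides}
    if len(lengths) > 1:
        raise ValueError(f"peptides have mixed lengths: {sorted(lengths)}")
    return peptides

# Three 9mers taken from the example output on this page.
peptides = check_peptide_input("ASQKRPSQR\nSQKRPSQRH\nQKRPSQRHG\n")
print(peptides)
```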
The sequences can be input in the following two ways:
Both ways can be employed at the same time: all the specified sequences will be processed. However, there may be no more than 10 sequences in total in one submission. Sequences shorter than 15 or longer than 10000 amino acids will be ignored.
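The submission limits above can be applied locally before uploading. The following sketch parses FASTA text, enforces the limit of 10 sequences, and drops sequences outside the accepted length range. The function names are hypothetical, and raising an error on more than 10 sequences is an assumption; the page does not say how the server itself reacts.

```python
MAX_SEQUENCES = 10   # maximum number of sequences per submission
MIN_LENGTH = 15      # shorter sequences are ignored
MAX_LENGTH = 10000   # longer sequences are ignored

def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def prepare_submission(text):
    """Return the records that fall within the accepted length range."""
    records = read_fasta(text)
    if len(records) > MAX_SEQUENCES:
        # Assumption: reject the whole submission rather than truncate it.
        raise ValueError(f"{len(records)} sequences given, at most {MAX_SEQUENCES} allowed")
    return [(h, s) for h, s in records if MIN_LENGTH <= len(s) <= MAX_LENGTH]

example = ">seq1\n" + "A" * 20 + "\n>seq2\nACDEF\n"
kept = prepare_submission(example)
print([header for header, _ in kept])
```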
Select the allele(s) you want to make predictions for from the scroll-down menu (select multiple alleles using the Ctrl key), or type in the allele names separated by commas (without blank spaces).
Give the threshold value for binding values to be displayed.
Click the box "Sort by affinity" to have the output sorted by descending predicted binding affinity.
Click the box "save prediction to xls file" to save the raw prediction output to an Excel file. This file will be available at the bottom of the results output file.
At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.
# NetMHCpan version 2.8
# Input is in FSA format
HLA-A0101 : Estimated prediction accuracy 0.811 (using nearest neighbor HLA-A0101)
# Threshold for Strong binding peptides 50.000
# Threshold for Weak binding peptides 500.000
-----------------------------------------------------------------------------------
pos         HLA    peptide         Identity  1-log50k(aff)  Affinity(nM)  %Random  BindLevel
-----------------------------------------------------------------------------------
  0  HLA-A*0101  ASQKRPSQR  seq2_optional_c          0.063      25230.98    32.00
  1  HLA-A*0101  SQKRPSQRH  seq2_optional_c          0.023      38824.58    50.00
  2  HLA-A*0101  QKRPSQRHG  seq2_optional_c          0.003      48254.07    50.00
  3  HLA-A*0101  KRPSQRHGS  seq2_optional_c          0.009      45287.57    50.00
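Output in the layout shown above can be read with a few lines of code. The sketch below assumes one result row per line with whitespace-separated columns, and it recomputes a binding level from the 50 nM and 500 nM thresholds in the output header. The labels "strong" and "weak", the inclusive comparison, and the function names are illustrative assumptions, not the server's own BindLevel notation.

```python
STRONG_NM = 50.0   # threshold for strong binders, from the output header
WEAK_NM = 500.0    # threshold for weak binders, from the output header

def bind_level(affinity_nm):
    """Classify an affinity (nM) against the two thresholds."""
    if affinity_nm <= STRONG_NM:
        return "strong"
    if affinity_nm <= WEAK_NM:
        return "weak"
    return None

def parse_output(text):
    """Return one dict per result row; header and comment lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Result rows start with an integer position and have >= 7 columns.
        if len(fields) >= 7 and fields[0].isdigit():
            affinity = float(fields[5])
            rows.append({
                "pos": int(fields[0]),
                "allele": fields[1],
                "peptide": fields[2],
                "identity": fields[3],
                "score": float(fields[4]),
                "affinity_nm": affinity,
                "rank": float(fields[6]),
                "level": bind_level(affinity),
            })
    return rows

sample = """# NetMHCpan version 2.8
pos HLA peptide Identity 1-log50k(aff) Affinity(nM) %Random BindLevel
0 HLA-A*0101 ASQKRPSQR seq2_optional_c 0.063 25230.98 32.00
1 HLA-A*0101 SQKRPSQRH seq2_optional_c 0.023 38824.58 50.00
"""
rows = parse_output(sample)
print(len(rows), rows[0]["peptide"], rows[0]["level"])
```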
Main reference:
NetMHCpan - MHC class I binding prediction beyond humans
Hoof I(1), Peters B(3), Sidney J(3), Pedersen LE(2), Lund O(1), Buus S(2), Nielsen M(1).
Immunogenetics. 2009 Jan;61(1):1-13.
(1) Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
(2) Division of Experimental Immunology, Institute of Medical Microbiology and Immunology, University of Copenhagen, Denmark
(3) La Jolla Institute for Allergy and Immunology, San Diego, California, United States of America
Binding of peptides to major histocompatibility complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC genomic region (called HLA) is extremely polymorphic comprising several thousand alleles, each encoding a distinct MHC molecule. The potentially unique specificity of the majority of HLA alleles that have been identified to date remains uncharacterized. Likewise, only a limited number of chimpanzee and rhesus macaque MHC class I molecules have been characterized experimentally. Here, we present NetMHCpan-2.0, a method that generates quantitative predictions of the affinity of any peptide-MHC class I interaction. NetMHCpan-2.0 has been trained on the hitherto largest set of quantitative MHC binding data available, covering HLA-A and HLA-B, as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class I molecules. We show that the NetMHCpan-2.0 method can accurately predict binding to uncharacterized HLA molecules, including HLA-C and HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict peptide binding to chimpanzee and macaque MHC class I molecules. The power of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune responses in large out-bred populations is demonstrated. Further, we used NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class I molecule SLA-1*0401. Ninety-three percent of the predicted peptides were demonstrated to bind stronger than 500 nM. The high performance of NetMHCpan-2.0 for non-human primates documents the method's ability to provide broad allelic coverage also beyond human MHC molecules. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.
PMID: 19002680
Here, you will find the data sets used for evaluation in the above paper. The data fall into five parts: a) non-human primates, b) HLA-A and HLA-B ligands, c) HLA-E ligands, d) SYFPEITHI HLA-C ligands, and e) SYFPEITHI HLA-G ligands.
a) Non-human primates. The format for data is
Allele    Peptide    log50k
Mamu-A01  YPPMMCYFL  1.0
Mamu-A01  NSPLHCYTM  1.0
Mamu-A01  ITPQPVPTA  0.482918
Mamu-A01  LTPIFSDLL  0.790002
Mamu-A01  GSPTNLEFI  0.634812
Mamu-A01  DSPHYVPIL  0.682619
Mamu-A01  TLPELNLSL  0.787187
Mamu-A01  ASPRIGDQL  0.945674
Mamu-A01  FSPFKLNLI  1.0
Mamu-A01  MIPLLFILF  0.911688
where the first column gives the allele, the second column gives the peptide, and the last column gives the log50k-transformed binding affinity, i.e. 1 - log50k(aff), where aff is the affinity in nM and log50k is the logarithm to base 50,000.
When classifying the peptides into binders and non-binders, a threshold of 500 nM is used. This means that peptides with log50k transformed binding affinity values greater than 0.426 are classified as binders.
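The transformation and the binder threshold can be reproduced with a short calculation. In the sketch below the function names are hypothetical, and clipping the score to the range 0 to 1 is an assumption suggested by the values of exactly 1.0 in the example data; it is not stated on this page.

```python
import math

def to_log50k(affinity_nm):
    """Transform an affinity in nM to 1 - log(aff) / log(50000)."""
    score = 1.0 - math.log(affinity_nm) / math.log(50000.0)
    # Assumption: scores are clipped to the interval [0, 1].
    return min(1.0, max(0.0, score))

def from_log50k(score):
    """Invert the transformation: affinity in nM for a given score."""
    return 50000.0 ** (1.0 - score)

# A 500 nM threshold corresponds to a transformed value of about 0.426.
BINDER_THRESHOLD = to_log50k(500.0)

def is_binder(score):
    """True if the transformed affinity is stronger than 500 nM."""
    return score > BINDER_THRESHOLD

print(round(BINDER_THRESHOLD, 3))       # threshold on the transformed scale
print(is_binder(0.482918))              # a value from the example data
print(round(from_log50k(0.482918), 1))  # the same value expressed in nM
```

Working on the transformed scale keeps all values between 0 and 1, with higher values meaning stronger binding.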
b) HLA-A and HLA-B ligands. The file contains 596 HLA-A and HLA-B ligands downloaded from the SYFPEITHI database. The FASTA header for each entry has the format
>uniprot|A8KA43 227 KRFGKAYNL B2705
where the first column is the protein identifier, the second column is the location of the HLA ligand in the protein sequence, the third column is the HLA ligand, and the last column is the HLA restriction.
c) HLA-E ligands. The file contains seven HLA-E ligands downloaded from the IEDB database. All ligands are from the same source protein, and the file contains all 9mer peptides from the source protein (P0A1D4), with the ligands annotated with the value 1 and all other peptides with the value 0. The format of the data is
MAAKDVKFG 0
AAKDVKFGN 0
AKDVKFGND 0
KDVKFGNDA 0
DVKFGNDAR 0
VKFGNDARV 0
KFGNDARVK 0
FGNDARVKM 0
GNDARVKML 0
NDARVKMLR 0
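A file in this layout can be generated by sliding a 9-residue window along the source protein and marking the windows that match a known ligand. The sketch below illustrates this; the function name `label_ninemers` is hypothetical, the 18-residue fragment is reconstructed from the overlapping peptides shown above, and no ligands are supplied because the seven ligand sequences are not listed on this page.

```python
def label_ninemers(protein, ligands, length=9):
    """Return (peptide, label) pairs for every window of the given length.

    The label is 1 if the peptide is one of the known ligands, else 0.
    """
    ligand_set = set(ligands)
    rows = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        rows.append((peptide, 1 if peptide in ligand_set else 0))
    return rows

# N-terminal fragment reconstructed from the ten overlapping 9mers above.
fragment = "MAAKDVKFGNDARVKMLR"
rows = label_ninemers(fragment, ligands=[])

for peptide, label in rows:
    print(peptide, label)
```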
d) HLA-C ligands. The file contains the source proteins for 77 HLA-C ligands from the SYFPEITHI database in FASTA format. The FASTA header for each entry has the format
>gnl|BL_ORD_ID|54508 244 FAPYNKPSL Cw0102
where the first column is the protein identifier, the second column is the location of the HLA ligand in the protein sequence, the third column is the HLA ligand, and the last column is the HLA restriction.
e) HLA-G (HLA-G*0101) ligands. The file contains the source proteins for 11 HLA-G ligands from the SYFPEITHI database in FASTA format. The FASTA header for each entry has the format
>sp|P49327|FAS_HUMAN 751 HVPEHAVVL
where the first column is the protein identifier, the second column is the location of the HLA ligand in the protein sequence, and the third column is the HLA ligand.
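The FASTA headers described in b), d), and e) share one layout, with the HLA restriction present in b) and d) and absent in e). The sketch below parses all three; the function name and the dictionary keys are hypothetical, and the position is returned as given, since the page does not say whether it is counted from 0 or 1.

```python
def parse_ligand_header(header):
    """Split a ligand FASTA header into its whitespace-separated fields.

    Expected layout: >protein_id position ligand [restriction]
    """
    fields = header.lstrip(">").split()
    if len(fields) not in (3, 4):
        raise ValueError(f"unexpected header layout: {header!r}")
    return {
        "protein_id": fields[0],
        "position": int(fields[1]),
        "ligand": fields[2],
        # The HLA-G file omits the restriction column.
        "restriction": fields[3] if len(fields) == 4 else None,
    }

for header in (">uniprot|A8KA43 227 KRFGKAYNL B2705",
               ">gnl|BL_ORD_ID|54508 244 FAPYNKPSL Cw0102",
               ">sp|P49327|FAS_HUMAN 751 HVPEHAVVL"):
    print(parse_ligand_header(header))
```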
a) Non-human primates. NOTE: This data set was updated on Aug 12, 2009, so that it now corresponds to the data presented in the NetMHCpan-2.0 publication.
b) HLA-A and HLA-B ligands
c) HLA-E ligands
d) HLA-C SYFPEITHI ligands
e) HLA-G SYFPEITHI ligands
Please click on the version number to activate the corresponding server.
2.8
The current server (online since 26 Feb 2013). New in this version:
2.4
Online since 18 Dec 2010. New in this version:
2.3
Online since 08 Sept 2010. New in this version:
2.2
Online since 01 Sept 2009. New in this version:
2.1
Online since 06 April 2009. New in this version:
2.0
Online since 24 June 2008. New in this version:
Main publication:
1.1
Online from March 2007 until 24 June 2008. New in this version:
1.0
Original version (online version until March 2008, 2006):
Main publication:
If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0). If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).
If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
Correspondence:
Technical Support: