DTU Health Tech
Department of Health Technology
The NetMHCpan-4.2 server predicts binding of peptides to any MHC molecule of known sequence using artificial neural networks (ANNs). The method is trained on a combination of more than 1,000,000 quantitative Binding Affinity (BA) and Mass-Spectrometry Eluted Ligands (EL) peptides. The BA data covers 171 MHC molecules from human (HLA-A, B, C, E), mouse (H-2), cattle (BoLA), primates (Patr, Mamu, Gogo), swine (SLA) and equine (Eqca). The EL data covers 201 MHC molecules from human (HLA-A, B, C, G), mouse (H-2), cattle (BoLA), primates (Mamu) and dog (DLA). Furthermore, the user can obtain predictions for any custom MHC class I molecule by uploading a full-length MHC protein sequence. Predictions can be made for peptides of any length.
The server returns as default the likelihood of a peptide being a natural ligand of the selected MHC(s). If selected, the predicted binding affinity is also reported.
New in this version: NetMHCpan-4.2 is trained on an extended set of BA and EL data including both single-allelic (SA) and multi-allelic (MA) datapoints. The method is based on an updated version of the NNAlign_MA algorithm, which incorporates new features related to amino acid deletions. Further, the method is fine-tuned on ~43,000 pathogen-derived epitopes from the IEDB and ~5,200 neoepitopes from CEDAR. These fine-tuned methods can be selected instead of the default antigen presentation method in order to predict immunogenicity.
View the version history of this server. All previous versions are available online, for comparison and reference.
The project is a collaboration between DTU Bioinformatics and LIAI.
Updated August 12 2025: A bug related to BA rank scores in the Excel output has been fixed. Packages have been updated (4.2b).
For publication of results, please cite:
Data resources used to develop this server were obtained from
Would you prefer to run NetMHCpan at your own site? NetMHCpan v. 4.2 is available as a stand-alone software package, with the same functionality as the service above. Ready-to-ship packages exist for the most common UNIX platforms. There is a download tab for academic users; other users are requested to contact CBS Software Package Manager at health-software@dtu.dk.
>sp|P01308|INS_HUMAN Insulin OS=Homo sapiens OX=9606 GN=INS PE=1 SV=1
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
LQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN

With parameters:
# NetMHCpan version 4.2b
# Tmpdir made /var/www/services/services/NetMHCpan-4.2/tmp/netMHCpan_JA3ENY
# Input is in FASTA format
# Peptide length 8,9,10,11,12
# Prediction Mode: EL
# HLA-A02:01 : Distance to training data 0.000 (using nearest neighbor HLA-A02:01)
# Allele: HLA-A02:01
# Rank Threshold for Strong prediction 0.500
# Rank Threshold for Weak prediction 2.000
------------------------------------------------------------------------------------------------------------------------
Pos  MHC          Peptide       Core       Of  Gp  Gl  Ip  Il  Icore         Identity         Score     %Rank  BindLevel
------------------------------------------------------------------------------------------------------------------------
34   HLA-A*02:01  HLVEALYLV     HLVEALYLV  0   0   0   0   0   HLVEALYLV     sp_P01308_INS_H  0.939284  0.030  <= SB
6    HLA-A*02:01  RLLPLLALL     RLLPLLALL  0   0   0   0   0   RLLPLLALL     sp_P01308_INS_H  0.855630  0.081  <= SB
15   HLA-A*02:01  ALWGPDPAAA    ALWPDPAAA  0   3   1   0   0   ALWGPDPAAA    sp_P01308_INS_H  0.854039  0.082  <= SB
15   HLA-A*02:01  ALWGPDPAA     ALWGPDPAA  0   0   0   0   0   ALWGPDPAA     sp_P01308_INS_H  0.673127  0.234  <= SB
15   HLA-A*02:01  ALWGPDPAAAFV  ALWPAAAFV  0   3   3   0   0   ALWGPDPAAAFV  sp_P01308_INS_H  0.498366  0.440  <= SB
76   HLA-A*02:01  SLQPLALEGSL   SLQPLALSL  0   7   2   0   0   SLQPLALEGSL   sp_P01308_INS_H  0.285093  0.884  <= WB
15   HLA-A*02:01  ALWGPDPAAAF   ALWDPAAAF  0   3   2   0   0   ALWGPDPAAAF   sp_P01308_INS_H  0.284544  0.886  <= WB
6    HLA-A*02:01  RLLPLLALLA    RLLPLLLLA  0   6   1   0   0   RLLPLLALLA    sp_P01308_INS_H  0.252655  1.004  <= WB
2    HLA-A*02:01  ALWMRLLPL     ALWMRLLPL  0   0   0   0   0   ALWMRLLPL     sp_P01308_INS_H  0.226048  1.101  <= WB
32   HLA-A*02:01  GSHLVEALYLV   GLVEALYLV  0   1   2   0   0   GSHLVEALYLV   sp_P01308_INS_H  0.216789  1.147  <= WB
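For downstream processing, result rows like the ones above can be read with a few lines of Python. The following is a minimal sketch, not part of the NetMHCpan package: the names Prediction, parse_row and bind_level are hypothetical, the field order is taken from the header line of the sample output (EL mode), and the thresholds are the two values printed in the output header. Treating the thresholds as inclusive is an assumption based on the "<=" marker in the BindLevel column.

```python
from typing import NamedTuple, Optional

# Thresholds copied from the sample header ("Rank Threshold for Strong/Weak prediction").
STRONG_RANK = 0.5
WEAK_RANK = 2.0


class Prediction(NamedTuple):
    """One result row; hypothetical container, not a NetMHCpan type."""
    pos: int
    mhc: str
    peptide: str
    core: str
    icore: str
    identity: str
    score: float
    rank: float


def parse_row(line: str) -> Prediction:
    """Parse one whitespace-separated result row in the EL-mode layout shown above."""
    f = line.split()
    # f[4:9] hold the Of, Gp, Gl, Ip and Il columns, which are skipped here.
    return Prediction(
        pos=int(f[0]), mhc=f[1], peptide=f[2], core=f[3],
        icore=f[9], identity=f[10], score=float(f[11]), rank=float(f[12]),
    )


def bind_level(rank: float) -> Optional[str]:
    """Return 'SB', 'WB' or None, assuming inclusive %Rank thresholds."""
    if rank <= STRONG_RANK:
        return "SB"
    if rank <= WEAK_RANK:
        return "WB"
    return None
```

Applied to the sample, the row with %Rank 0.440 is classified as a strong binder and the row with %Rank 0.884 as a weak binder, which agrees with the BindLevel column.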
The prediction output for each molecule consists of the following columns:
Three amino acid sequences are reported for each row of predictions:
The Peptide is the complete amino acid sequence evaluated by NetMHCpan. Peptides are either the full sequences submitted as a peptide list or the result of digestion of source proteins (FASTA submission).
The iCore is a substring of Peptide, encompassing all residues between P1 and P-omega of the MHC. For all intents and purposes, this is the minimal candidate ligand/epitope that should be considered for further validation.
The Core is always 9 amino acids long and is a construct used for sequence alignment and identification of binding anchors.
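The relation between these sequences can be illustrated on the insulin example. The sketch below makes two assumptions that are inferred from the sample output rather than stated in the text: the Pos column is the 1-based start of the peptide in the source protein, and for peptides longer than 9 residues the Core is obtained by deleting Gl residues starting at 0-based index Gp of the Peptide. Only rows with Of = 0 and no insertion (Ip = Il = 0) occur in the sample, so other cases are not covered. The function names digest and core_with_deletion are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def digest(protein: str, lengths: Iterable[int] = (8, 9, 10, 11, 12)) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, peptide) for every overlapping peptide."""
    for k in lengths:
        for start in range(len(protein) - k + 1):
            yield start + 1, protein[start:start + k]


def core_with_deletion(peptide: str, gap_pos: int, gap_len: int) -> str:
    """Remove gap_len residues starting at 0-based index gap_pos.

    Assumes Of = 0 and no insertion, as in every row of the sample output.
    """
    return peptide[:gap_pos] + peptide[gap_pos + gap_len:]
```

For instance, the sample row with Peptide ALWGPDPAAA, Gp 3 and Gl 1 has Core ALWPDPAAA, which is the peptide with the G at index 3 removed.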
MAIN REFERENCE
NetMHCpan-4.2: Improved prediction of CD8+ epitopes by use of transfer learning and structural features
Jonas Birkelund Nilsson, Jason Greenbaum, Bjoern Peters and Morten Nielsen
Frontiers in Immunology, 07 August 2025, https://doi.org/10.3389/fimmu.2025.1616113
Identification of CD8+ T cell epitopes is crucial for advancing vaccine development and immunotherapy strategies. Traditional methods for predicting T cell epitopes primarily focus on MHC presentation, leveraging immunopeptidome data. Recent advancements however suggest significant performance improvements through transfer learning and refinement using epitope data. To further investigate this, we here develop an enhanced MHC class I (MHC-I) antigen presentation predictor by integrating newly curated binding affinity and eluted ligand datasets, expanding MHC allele coverage, and incorporating novel input features related to the structural constraints of the MHC-I peptide-binding cleft. We next apply transfer learning using experimentally validated pathogen- and cancer-derived epitopes from public databases to refine our prediction method, ensuring comprehensive data partitioning to prevent performance overestimation. Our findings indicate that fine-tuning on epitope data only yields a minor accuracy boost. Moreover, the transferability between cancer and pathogen-derived epitopes is limited, suggesting distinct properties between these data types. In conclusion, while transfer learning can enhance T cell epitope prediction, the performance gains are modest and data type specific. Our final NetMHCpan-4.2 model is publicly accessible at https://services.healthtech.dtu.dk/services/NetMHCpan-4.2, providing a valuable resource for immunological research and therapeutic development.
Birkir Reynisson, Bruno Alvarez, Sinu Paul, Bjoern Peters and Morten Nielsen
Nucleic Acids Research, Volume 48, Issue W1, 02 July 2020, Pages W449–W454; DOI: 10.1093/nar/gkaa379
Major histocompatibility complex (MHC) molecules are expressed on the cell surface, where they present peptides to T cells, which gives them a key role in the development of T-cell immune responses. MHC molecules come in two main variants: MHC Class I (MHC-I) and MHC Class II (MHC-II). MHC-I predominantly present peptides derived from intracellular proteins, whereas MHC-II predominantly presents peptides from extracellular proteins. In both cases, the binding between MHC and antigenic peptides is the most selective step in the antigen presentation pathway. Therefore, the prediction of peptide binding to MHC is a powerful utility to predict the possible specificity of a T-cell immune response. Commonly MHC binding prediction tools are trained on binding affinity or mass spectrometry-eluted ligands. Recent studies have however demonstrated how the integration of both data types can boost predictive performances. Inspired by this, we here present NetMHCpan-4.1 and NetMHCIIpan-4.0, two web servers created to predict binding between peptides and MHC-I and MHC-II, respectively. Both methods exploit tailored machine learning strategies to integrate different training data types, resulting in state-of-the-art performance and outperforming their competitors. The servers are available at http://www.cbs.dtu.dk/services/NetMHCpan-4.1/ and http://www.cbs.dtu.dk/services/NetMHCIIpan-4.0/.
Vanessa Jurtz1, Sinu Paul2, Massimo Andreatta3, Paolo Marcatili1, Bjoern Peters2 and Morten Nielsen1,3
The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893
1Department of Bio and Health Informatics, Technical University of Denmark, DK-2800 Lyngby, Denmark
2Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, CA92037 La Jolla, USA
3Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
Cytotoxic T cells are of central importance in the immune system’s response to disease.
They recognize defective cells by binding to peptides presented on the cell surface by MHC class I molecules.
Peptide binding to MHC molecules is the single most selective step in the Ag-presentation pathway.
Therefore, in the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has attracted widespread attention.
In the past, predictors of peptide–MHC interactions have primarily been trained on binding affinity data.
Recently, an increasing number of MHC-presented peptides identified by mass spectrometry have been reported containing information about
peptide-processing steps in the presentation pathway and the length distribution of naturally presented peptides.
In this article, we present NetMHCpan-4.0, a method trained on binding affinity and eluted ligand data
leveraging the information from both data types. Large-scale benchmarking of the method demonstrates an
increase in predictive performance compared with state-of-the-art methods when it comes to identification of
naturally processed ligands, cancer neoantigens, and T cell epitopes.
NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length data sets
Morten Nielsen1,2 and Massimo Andreatta1
Genome Medicine (2016): 8:33
1Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
2Center for Biological Sequence Analysis,
Technical University of Denmark,
DK-2800 Lyngby, Denmark
Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space.
NetMHCpan - MHC class I binding prediction beyond humans
Hoof I1, Peters B3, Sidney J3, Pedersen LE2, Lund O1, Buus S2, Nielsen M1
Immunogenetics. (2009) Jan;61(1):1-13.
1Center for Biological Sequence Analysis,
Technical University of Denmark,
DK-2800 Lyngby, Denmark
2Division of Experimental Immunology,
Institute of Medical Microbiology and Immunology,
University of Copenhagen, Denmark
3La Jolla Institute for Allergy and Immunology, San Diego, California, United States of America
Binding of peptides to major histocompatibility complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC genomic region (called HLA) is extremely polymorphic comprising several thousand alleles, each encoding a distinct MHC molecule. The potentially unique specificity of the majority of HLA alleles that have been identified to date remains uncharacterized. Likewise, only a limited number of chimpanzee and rhesus macaque MHC class I molecules have been characterized experimentally. Here, we present NetMHCpan-2.0, a method that generates quantitative predictions of the affinity of any peptide-MHC class I interaction. NetMHCpan-2.0 has been trained on the hitherto largest set of quantitative MHC binding data available, covering HLA-A and HLA-B, as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class I molecules. We show that the NetMHCpan-2.0 method can accurately predict binding to uncharacterized HLA molecules, including HLA-C and HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict peptide binding to chimpanzee and macaque MHC class I molecules. The power of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune responses in large out-bred populations is demonstrated. Further, we used NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class I molecule SLA-1*0401. Ninety-three percent of the predicted peptides were demonstrated to bind stronger than 500 nM. The high performance of NetMHCpan-2.0 for non-human primates documents the method's ability to provide broad allelic coverage also beyond human MHC molecules. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.
PMID: 19002680
Here, you will find the data set used for training and evaluation of NetMHCpan-4.2.
Download the file and untar the content using
cat NetMHCpan_train.tar.gz | tar xzf -
This will create a directory called NetMHCpan_train containing 22 files. Twenty files (c00?_ba, c00?_el, c00?_iedb, c00?_cedar) hold the partitions of binding affinity (ba), eluted ligand (el), pathogen epitope (iedb) and neoepitope (cedar) data. The format of each file is as follows (here shown for an el file):
AEQNRKDAEAW  1  Sarkizova_2020__A0202  EQLAEQEAWFNE
AEQRGELAIKD  1  Sarkizova_2020__A0202  IADAEQIKDANA
AIFDRVLTEL   1  Sarkizova_2020__A0202  GVGAIFTELVSK
AKNKLNDLED   1  Sarkizova_2020__A0202  LKDAKNLEDALQ
ALADGVVSQA   1  Sarkizova_2020__A0202  DTGALASQAVKE
ALADVAYYTM   1  Sarkizova_2020__A0202  HVFALAYTMLRK
ALADVMSQL    1  Sarkizova_2020__A0202  GSWALASQLKKK
ALAEKLDRL    1  Sarkizova_2020__A0202  LGAALADRLATA
ALAELSESL    1  Sarkizova_2020__A0202  NAEALAESLRNR
ALAERQQLI    1  Sarkizova_2020__A0202  KKGALAQLIPEL

where the columns are peptide, target value, MHC molecule or cell-line ID, and context. In cases where the third column is a cell-line ID, the MHC molecules expressed in the cell line are listed in the allelelist file.
The allelelist file contains the alleles expressed in each EL data set, and the pseudoseqs file contains the MHC pseudo sequence for each MHC molecule.
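A partition file in this four-column layout can be read as sketched below. The names TrainingRecord, parse_training_line and context_matches are hypothetical and not part of the distributed data or software. The context check encodes an observation from the ten el rows shown above, not a documented rule: the 12-residue context appears to consist of three upstream flanking residues, the first three and last three residues of the peptide, and three downstream flanking residues. Other files may deviate from this pattern.

```python
from typing import NamedTuple


class TrainingRecord(NamedTuple):
    """One line of a partition file; hypothetical container."""
    peptide: str
    target: float
    mhc_or_cell_line: str
    context: str


def parse_training_line(line: str) -> TrainingRecord:
    """Split a whitespace-separated line into the four documented columns."""
    peptide, target, source, context = line.split()
    return TrainingRecord(peptide, float(target), source, context)


def context_matches(rec: TrainingRecord, flank: int = 3) -> bool:
    """Check the pattern observed in the sample rows (an assumption):
    the middle of the context equals the peptide's first and last residues."""
    inner = rec.context[flank:-flank]
    return inner == rec.peptide[:flank] + rec.peptide[-flank:]
```

For the first sample row, the context EQLAEQEAWFNE contains AEQ and EAW, the first and last three residues of AEQNRKDAEAW, so the check passes.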
Download the file and untar the content using
cat NetMHCpan_eval.tar.gz | tar xzf -
This will create the directory called NetMHCpan_eval. In this directory you will find 12 files. 10 files (c00?_iedb, c00?_cedar) with training partitions with pathogen epitopes (iedb) or neoepitopes (cedar), and two files (iedb_test, cedar_test) containing evaluation IEDB and CEDAR test data. The training partitions were constructed such that there was no 8-mer overlap between the train and test data.
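The 8-mer overlap criterion can be verified on the downloaded files with a simple set comparison. The sketch below is one possible way to do this and is not the procedure used to build the partitions, which is not described here; the function names kmers and shared_kmers are hypothetical.

```python
from typing import Iterable, Set


def kmers(peptide: str, k: int = 8) -> Set[str]:
    """Return all contiguous substrings of length k (empty if the peptide is shorter)."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def shared_kmers(train: Iterable[str], test: Iterable[str], k: int = 8) -> Set[str]:
    """Return every k-mer that occurs in both a training and a test peptide."""
    train_kmers: Set[str] = set()
    for peptide in train:
        train_kmers |= kmers(peptide, k)
    shared: Set[str] = set()
    for peptide in test:
        shared |= kmers(peptide, k) & train_kmers
    return shared
```

An empty result for the training and test peptide lists would indicate that the two sets share no 8-mer.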
Note: the final NetMHCpan-4.2 method available on this webserver was trained on the full IEDB and CEDAR training partitions (included in NetMHCpan_train.tar.gz).
Please click on the version number to activate the corresponding server (if available).
4.2 | The current version (online since 15/02/2025).
4.1 | Online since 10 Dec 2019.
4.0 | Online since 5 Sep 2017.
3.0 | Online since 10 Feb 2016.
2.8 | Online since 26 Feb 2013.
2.4 | Online since 18 Dec 2010.
2.3 | Online since 08 Sept 2010.
2.2 | Online since 01 Sept 2009.
2.1 | Online since 06 April 2009.
2.0 | Online since 24 June 2008.
1.1 | Online from March 2007 until 24 June 2008.
1.0 | Original version (online until March 2007).
If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).
If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
Correspondence:
Technical Support: