DTU Health Tech
Department of Health Technology
The NetMHCpan-4.2 server predicts binding of peptides to any MHC molecule of known sequence using artificial neural networks (ANNs). The method is trained on a combination of more than 1,000,000 quantitative Binding Affinity (BA) and Mass-Spectrometry Eluted Ligand (EL) peptides. The BA data covers 171 MHC molecules from human (HLA-A, B, C, E), mouse (H-2), cattle (BoLA), primates (Patr, Mamu, Gogo), swine (SLA) and equine (Eqca). The EL data covers 201 MHC molecules from human (HLA-A, B, C, G), mouse (H-2), cattle (BoLA), primates (Mamu) and dog (DLA). Furthermore, the user can obtain predictions for any custom MHC class I molecule by uploading a full-length MHC protein sequence. Predictions can be made for peptides of any length.
The server returns as default the likelihood of a peptide being a natural ligand of the selected MHC(s). If selected, the predicted binding affinity is also reported.
New in this version: NetMHCpan-4.2 is trained on an extended set of BA and EL data including both single-allelic (SA) and multi-allelic (MA) datapoints. The method is based on an updated version of the NNAlign_MA algorithm, which incorporates new features related to amino acid deletions. Further, the method is fine-tuned on ~43,000 pathogen-derived epitopes from the IEDB and ~5,200 neoepitopes from CEDAR. An option to include predictions for pathogen epitopes and/or neoepitopes in the output is available.
View the version history of this server. All previous versions are available online, for comparison and reference.
The project is a collaboration between DTU Bioinformatics and LIAI.
Updated 24/3/2026: Added options to use peptide-MHC input and to include pathogen epitope and/or neoepitope predictions in the output. Package updated to version 4.2c
For publication of results, please cite:
Data resources used to develop this server were obtained from
Would you prefer to run NetMHCpan at your own site? NetMHCpan v. 4.2 is available as a stand-alone software package, with the same functionality as the service above. Ready-to-ship packages exist for the most common UNIX platforms. There is a download tab for academic users; other users are requested to contact CBS Software Package Manager at health-software@dtu.dk.
INPUT DATA
In this section, the user must define the input for the prediction server following these steps:
1) Select the desired INPUT TYPE using the radio buttons:
A C D E F G H I K L M N P Q R S T V W Y and X (unknown)
Any other symbol will be converted to X before processing. At most 500 sequences are allowed per submission; each sequence must be no more than 20,000 and no fewer than 8 amino acids long.
4) If FASTA was selected as input type, the user must select the peptide length(s) the prediction server will work with. NetMHCpan-4.2 will "chop" the input FASTA sequence into overlapping peptides of the selected length(s) and predict binding for all of them. By default, input proteins are digested into 9-mer peptides. Note that if Peptide or Peptide-MHC was selected as input type, this step is unnecessary and the peptide length selector will not appear in the interface.
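The input rules above (allowed alphabet, size limits, and chopping a protein into overlapping peptides) can be sketched in a few lines of Python. This is an illustration only, not the server's own code: the function names are hypothetical, and upper-casing the input before checking the alphabet is an assumption, since the text does not say how lower-case letters are treated.

```python
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")
MAX_SEQUENCES = 500
MIN_LENGTH, MAX_LENGTH = 8, 20_000


def sanitize(sequence):
    """Replace every symbol outside the allowed alphabet with X."""
    # Assumption of this sketch: input is upper-cased before the check.
    return "".join(ch if ch in ALLOWED else "X" for ch in sequence.upper())


def check_limits(sequences):
    """Raise ValueError if a submission exceeds the documented limits."""
    if len(sequences) > MAX_SEQUENCES:
        raise ValueError("at most 500 sequences per submission")
    for seq in sequences:
        if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
            raise ValueError("each sequence must be 8 to 20,000 residues long")


def chop(sequence, lengths=(9,)):
    """Yield (position, peptide) for every overlapping window.

    Positions are 1-based, which matches the Pos column of the
    example output shown further down this page.
    """
    for k in lengths:
        for start in range(len(sequence) - k + 1):
            yield start + 1, sequence[start:start + k]
```

For a protein of n residues, each selected length k yields n - k + 1 peptides, so selecting several lengths for long proteins grows the output quickly.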
MHC SELECTION
Here, the user must define which MHC molecule(s) the input data will be predicted against:
1) First, select the HLA/MHC supertype family.
2) After selecting the MHC family, select one or more MHC molecules from the updated "Select Allele(s)" list. Alternatively, type the MHC names directly in the provided blank field (separated by commas and without blank spaces); in that case there is no need to select an MHC supertype family from the drop-down menu. Click here for a list of MHC molecule names (use the names in the first column). Please note that a maximum of 20 MHC types is allowed per submission.
3) Optionally, paste a full MHC protein sequence in the blank box, or upload it by clicking the "Choose file" button. The sequence must be in FASTA format.
Please note that steps 2) and 3) are mutually exclusive, and are only labeled this way for explanation purposes.
Note: If Peptide-MHC input is chosen, this section is not shown.
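The constraints on a typed allele list (comma-separated, no blank spaces, at most 20 names) can be checked before submission. The helper below is a hypothetical sketch; it does not verify that each name is a molecule the server actually knows.

```python
MAX_ALLELES = 20


def parse_allele_field(text):
    """Split a typed allele field and enforce the documented constraints."""
    if any(ch.isspace() for ch in text):
        raise ValueError("separate names with commas, without blank spaces")
    alleles = [name for name in text.split(",") if name]
    if not alleles:
        raise ValueError("no MHC name given")
    if len(alleles) > MAX_ALLELES:
        raise ValueError("a maximum of 20 MHC types is allowed per submission")
    return alleles
```

For example, parse_allele_field("HLA-A02:01,HLA-B07:02") returns a two-element list, while the same text with a space after the comma is rejected.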
ADDITIONAL CONFIGURATION
In this section, the user may define additional parameters to further customize the run:
1, 2) Specify thresholds for strong and weak binders. They are expressed in terms of %Rank, that is, the percentile of the predicted binding affinity compared to the distribution of affinities calculated on a set of random natural peptides. The peptide will be identified as a strong binder if it is found among the top x% predicted peptides, where x% is the specified threshold for strong binders (by default 0.5%). The peptide will be identified as a weak binder if the %Rank is above the threshold of the strong binders but below the specified threshold for the weak binders (by default 2%).
3) Specify a %Rank threshold to filter out predictions. Only sequences with a predicted %Rank value less than the specified threshold will be printed. To print all predictions, leave this value set to -99.
4) Tick this option to include Binding Affinity predictions together with the Eluted Ligand likelihood.
5) Tick this option to include pathogen epitope predictions in the output.
6) Tick this option to include neoepitope predictions in the output.
7) Tick this box to have the output sorted by descending Eluted Ligand prediction score.
8) Enable this option to export the prediction output to .XLS format (readable by most spreadsheet software, like Microsoft Excel).
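To make the %Rank logic concrete, the sketch below computes a percentile rank against a list of background scores and applies the default thresholds. It is a simplified illustration under stated assumptions: the server uses its own precomputed score distribution from random natural peptides, the function names are hypothetical, and treating a %Rank exactly equal to a threshold as inside that class is a choice made here, not something the text specifies.

```python
from bisect import bisect_right


def percent_rank(score, background_scores):
    """Percentage of background scores that are strictly higher than score."""
    ordered = sorted(background_scores)
    higher = len(ordered) - bisect_right(ordered, score)
    return 100.0 * higher / len(ordered)


def bind_level(rank, strong=0.5, weak=2.0):
    """Return "SB", "WB" or "" for a %Rank value (defaults 0.5% and 2%)."""
    if rank <= strong:  # boundary handling is an assumption of this sketch
        return "SB"
    if rank <= weak:
        return "WB"
    return ""


def is_printed(rank, threshold=-99.0):
    """Apply the output filter; -99 means that all predictions are printed."""
    return threshold == -99.0 or rank < threshold
```

A lower %Rank means a better prediction: a peptide scoring higher than all but 1 in 200 random natural peptides has a %Rank of 0.5.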
SUBMISSION
After the user has finished the "INPUT DATA", "MHC SELECTION" and "ADDITIONAL CONFIGURATION" steps, the job can be submitted. To do so, click on "Submit" to send the job to the processing server, or click on "Clear fields" to clear the page and start over.
The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window. After the server has finished running the corresponding predictions, an output page will be delivered to the user. A description of the output format can be found at output format.
At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; when it terminates you will be notified by e-mail with a URL to your results. They will be stored on the server for 24 hours.
>sp|P01308|INS_HUMAN Insulin OS=Homo sapiens OX=9606 GN=INS PE=1 SV=1
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
LQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
With parameters:
# NetMHCpan version 4.2c
# Tmpdir made /var/www/services/services/NetMHCpan-4.2/tmp/netMHCpan_glIzQM
# Input is in FASTA format
# Peptide length 8,9,10,11,12
# Prediction Mode: EL
# HLA-A02:01 : Distance to training data 0.000 (using nearest neighbor HLA-A02:01)
# Allele: HLA-A02:01
# Rank Threshold for Strong binder 0.500
# Rank Threshold for Weak binder 2.000
---------------------------------------------------------------------------------------------------------------------------
Pos  MHC          Peptide       Core       Of  Gp  Gl  Ip  Il  Icore         Identity         Score_EL  %Rank_EL  BindLevel
---------------------------------------------------------------------------------------------------------------------------
34   HLA-A*02:01  HLVEALYLV     HLVEALYLV  0   0   0   0   0   HLVEALYLV     sp_P01308_INS_H  0.939284  0.030     <= SB
6    HLA-A*02:01  RLLPLLALL     RLLPLLALL  0   0   0   0   0   RLLPLLALL     sp_P01308_INS_H  0.855630  0.081     <= SB
15   HLA-A*02:01  ALWGPDPAAA    ALWPDPAAA  0   3   1   0   0   ALWGPDPAAA    sp_P01308_INS_H  0.854039  0.082     <= SB
15   HLA-A*02:01  ALWGPDPAA     ALWGPDPAA  0   0   0   0   0   ALWGPDPAA     sp_P01308_INS_H  0.673127  0.234     <= SB
15   HLA-A*02:01  ALWGPDPAAAFV  ALWPAAAFV  0   3   3   0   0   ALWGPDPAAAFV  sp_P01308_INS_H  0.498366  0.440     <= SB
76   HLA-A*02:01  SLQPLALEGSL   SLQPLALSL  0   7   2   0   0   SLQPLALEGSL   sp_P01308_INS_H  0.285093  0.884     <= WB
15   HLA-A*02:01  ALWGPDPAAAF   ALWDPAAAF  0   3   2   0   0   ALWGPDPAAAF   sp_P01308_INS_H  0.284544  0.886     <= WB
6    HLA-A*02:01  RLLPLLALLA    RLLPLLLLA  0   6   1   0   0   RLLPLLALLA    sp_P01308_INS_H  0.252655  1.004     <= WB
2    HLA-A*02:01  ALWMRLLPL     ALWMRLLPL  0   0   0   0   0   ALWMRLLPL     sp_P01308_INS_H  0.226048  1.101     <= WB
32   HLA-A*02:01  GSHLVEALYLV   GLVEALYLV  0   1   2   0   0   GSHLVEALYLV   sp_P01308_INS_H  0.216789  1.147     <= WB
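For downstream processing, the prediction rows can be read with plain whitespace splitting. The parser below is a minimal sketch that assumes the default EL-only column layout shown in this example (Pos through BindLevel); the function name and dictionary keys are hypothetical, and runs with additional output options enabled would add columns that this sketch does not handle.

```python
def parse_prediction_line(line):
    """Parse one prediction row; return None for comments, rules and headers."""
    fields = line.split()
    # Data rows start with the integer Pos and have at least 13 columns.
    if len(fields) < 13 or not fields[0].isdigit():
        return None
    return {
        "pos": int(fields[0]),
        "mhc": fields[1],
        "peptide": fields[2],
        "core": fields[3],
        "of": int(fields[4]),
        "gp": int(fields[5]),
        "gl": int(fields[6]),
        "ip": int(fields[7]),
        "il": int(fields[8]),
        "icore": fields[9],
        "identity": fields[10],
        "score_el": float(fields[11]),
        "rank_el": float(fields[12]),
        # BindLevel is printed as "<= SB" or "<= WB" and is absent otherwise.
        "bind_level": fields[14] if len(fields) > 14 else "",
    }
```

Applying the function to every line of a saved output and discarding the None results gives one record per predicted peptide.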
The prediction output for each molecule consists of the following columns:
Three amino acid sequences are reported for each row of predictions:
The Peptide is the complete amino acid sequence evaluated by NetMHCpan. Peptides are the full sequences submitted as a peptide list, or the result of digestion of source proteins (FASTA submission).
The iCore is a substring of Peptide, encompassing all residues between P1 and P-omega of the MHC. For all intents and purposes, this is the minimal candidate ligand/epitope that should be considered for further validation.
The Core is always 9 amino acids long, and is a construct used for sequence alignment and identification of binding anchors.
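In the example output above, the 9-residue Core of every peptide longer than nine amino acids can be reproduced by removing Gl residues starting at 0-based position Gp of the Peptide. Reading Gp and Gl as deletion position and deletion length is an inference from those example rows, not a definition given in this text, and the hypothetical helper below covers only rows where Of, Ip and Il are all 0.

```python
def core_after_deletion(peptide, gp, gl):
    """Remove gl residues starting at 0-based index gp.

    Sketch for rows with Of = Ip = Il = 0 only; the offset and
    insertion columns are not handled here.
    """
    return peptide[:gp] + peptide[gp + gl:]
```

For instance, ALWGPDPAAA with Gp = 3 and Gl = 1 loses the G at index 3 and gives ALWPDPAAA, the Core reported for that row.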
MAIN REFERENCE
NetMHCpan-4.2: Improved prediction of CD8+ epitopes by use of transfer learning and structural features
Jonas Birkelund Nilsson, Jason Greenbaum, Bjoern Peters and Morten Nielsen
Frontiers in Immunology, 07 August 2025, https://doi.org/10.3389/fimmu.2025.1616113
Identification of CD8+ T cell epitopes is crucial for advancing vaccine development and immunotherapy strategies. Traditional methods for predicting T cell epitopes primarily focus on MHC presentation, leveraging immunopeptidome data. Recent advancements however suggest significant performance improvements through transfer learning and refinement using epitope data. To further investigate this, we here develop an enhanced MHC class I (MHC-I) antigen presentation predictor by integrating newly curated binding affinity and eluted ligand datasets, expanding MHC allele coverage, and incorporating novel input features related to the structural constraints of the MHC-I peptide-binding cleft. We next apply transfer learning using experimentally validated pathogen- and cancer-derived epitopes from public databases to refine our prediction method, ensuring comprehensive data partitioning to prevent performance overestimation. Our findings indicate that fine-tuning on epitope data only yields a minor accuracy boost. Moreover, the transferability between cancer and pathogen-derived epitopes is limited, suggesting distinct properties between these data types. In conclusion, while transfer learning can enhance T cell epitope prediction, the performance gains are modest and data type specific. Our final NetMHCpan-4.2 model is publicly accessible at https://services.healthtech.dtu.dk/services/NetMHCpan-4.2, providing a valuable resource for immunological research and therapeutic development.
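NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data
Birkir Reynisson, Bruno Alvarez, Sinu Paul, Bjoern Peters and Morten Nielsen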
Nucleic Acids Research, Volume 48, Issue W1, 02 July 2020, Pages W449–W454; DOI: 10.1093/nar/gkaa379
Major histocompatibility complex (MHC) molecules are expressed on the cell surface, where they present peptides to T cells, which gives them a key role in the development of T-cell immune responses. MHC molecules come in two main variants: MHC Class I (MHC-I) and MHC Class II (MHC-II). MHC-I predominantly present peptides derived from intracellular proteins, whereas MHC-II predominantly presents peptides from extracellular proteins. In both cases, the binding between MHC and antigenic peptides is the most selective step in the antigen presentation pathway. Therefore, the prediction of peptide binding to MHC is a powerful utility to predict the possible specificity of a T-cell immune response. Commonly MHC binding prediction tools are trained on binding affinity or mass spectrometry-eluted ligands. Recent studies have however demonstrated how the integration of both data types can boost predictive performances. Inspired by this, we here present NetMHCpan-4.1 and NetMHCIIpan-4.0, two web servers created to predict binding between peptides and MHC-I and MHC-II, respectively. Both methods exploit tailored machine learning strategies to integrate different training data types, resulting in state-of-the-art performance and outperforming their competitors. The servers are available at http://www.cbs.dtu.dk/services/NetMHCpan-4.1/ and http://www.cbs.dtu.dk/services/NetMHCIIpan-4.0/.
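NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data
Vanessa Jurtz1, Sinu Paul2, Massimo Andreatta3, Paolo Marcatili1, Bjoern Peters2, and Morten Nielsen1,3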
The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893
1Department of Bio and Health Informatics, Technical University of Denmark, DK-2800 Lyngby, Denmark
2Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, CA92037 La Jolla, USA
3Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
Cytotoxic T cells are of central importance in the immune system’s response to disease.
They recognize defective cells by binding to peptides presented on the cell surface by MHC class I molecules.
Peptide binding to MHC molecules is the single most selective step in the Ag-presentation pathway.
Therefore, in the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has attracted widespread attention.
In the past, predictors of peptide–MHC interactions have primarily been trained on binding affinity data.
Recently, an increasing number of MHC-presented peptides identified by mass spectrometry have been reported containing information about
peptide-processing steps in the presentation pathway and the length distribution of naturally presented peptides.
In this article, we present NetMHCpan-4.0, a method trained on binding affinity and eluted ligand data
leveraging the information from both data types. Large-scale benchmarking of the method demonstrates an
increase in predictive performance compared with state-of-the-art methods when it comes to identification of
naturally processed ligands, cancer neoantigens, and T cell epitopes.
NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length data sets
Morten Nielsen1,2 and Massimo Andreatta1
Genome Medicine (2016): 8:33
1Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
2Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space.
NetMHCpan - MHC class I binding prediction beyond humans
Hoof I1, Peters B3, Sidney J3, Pedersen LE2, Lund O1, Buus S2, Nielsen M1
Immunogenetics. (2009) Jan;61(1):1-13.
1Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
2Division of Experimental Immunology, Institute of Medical Microbiology and Immunology, University of Copenhagen, Denmark
3La Jolla Institute for Allergy and Immunology, San Diego, California, United States of America
Binding of peptides to major histocompatibility complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC genomic region (called HLA) is extremely polymorphic comprising several thousand alleles, each encoding a distinct MHC molecule. The potentially unique specificity of the majority of HLA alleles that have been identified to date remains uncharacterized. Likewise, only a limited number of chimpanzee and rhesus macaque MHC class I molecules have been characterized experimentally. Here, we present NetMHCpan-2.0, a method that generates quantitative predictions of the affinity of any peptide-MHC class I interaction. NetMHCpan-2.0 has been trained on the hitherto largest set of quantitative MHC binding data available, covering HLA-A and HLA-B, as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class I molecules. We show that the NetMHCpan-2.0 method can accurately predict binding to uncharacterized HLA molecules, including HLA-C and HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict peptide binding to chimpanzee and macaque MHC class I molecules. The power of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune responses in large out-bred populations is demonstrated. Further, we used NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class I molecule SLA-1*0401. Ninety-three percent of the predicted peptides were demonstrated to bind stronger than 500 nM. The high performance of NetMHCpan-2.0 for non-human primates documents the method's ability to provide broad allelic coverage also beyond human MHC molecules. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.
PMID: 19002680
Here, you will find the data set used for training and evaluation of NetMHCpan-4.2.
Download the file and untar the content using
cat NetMHCpan_train.tar.gz | tar xzf -
This will create the directory called NetMHCpan_train. In this directory you will find 22 files. Twenty of them (c00?_ba, c00?_el, c00?_iedb, c00?_cedar) contain the data partitions for binding affinity (ba), eluted ligands (el), pathogen epitopes (iedb) and neoepitopes (cedar). The format of each file is as follows (shown here for an el file):
AEQNRKDAEAW 1 Sarkizova_2020__A0202 EQLAEQEAWFNE
AEQRGELAIKD 1 Sarkizova_2020__A0202 IADAEQIKDANA
AIFDRVLTEL 1 Sarkizova_2020__A0202 GVGAIFTELVSK
AKNKLNDLED 1 Sarkizova_2020__A0202 LKDAKNLEDALQ
ALADGVVSQA 1 Sarkizova_2020__A0202 DTGALASQAVKE
ALADVAYYTM 1 Sarkizova_2020__A0202 HVFALAYTMLRK
ALADVMSQL 1 Sarkizova_2020__A0202 GSWALASQLKKK
ALAEKLDRL 1 Sarkizova_2020__A0202 LGAALADRLATA
ALAELSESL 1 Sarkizova_2020__A0202 NAEALAESLRNR
ALAERQQLI 1 Sarkizova_2020__A0202 KKGALAQLIPEL
where the columns are peptide, target value, MHC molecule or cell-line ID, and context. In cases where the third column is a cell-line ID, the MHC molecules expressed in the cell line are listed in the allelelist file.
The allelelist file contains the alleles expressed in each EL data set, and the pseudoseqs file contains the MHC pseudo sequence for each MHC molecule.
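The four-column partition format described above can be read with a few lines of Python. This is a minimal sketch with hypothetical names (Record, parse_partition); it assumes whitespace-separated columns exactly as in the el example and raises an error on any other layout rather than guessing.

```python
from typing import NamedTuple


class Record(NamedTuple):
    peptide: str
    target: float
    source: str   # MHC molecule name or cell-line ID
    context: str


def parse_partition(lines):
    """Yield one Record per non-empty line of a partition file."""
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        peptide, target, source, context = fields
        yield Record(peptide, float(target), source, context)
```

An open file handle can be passed directly as lines, since iterating over it yields one line at a time.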
Download the file and untar the content using
cat NetMHCpan_eval.tar.gz | tar xzf -
This will create the directory called NetMHCpan_eval. In this directory you will find 12 files. 10 files (c00?_iedb, c00?_cedar) with training partitions with pathogen epitopes (iedb) or neoepitopes (cedar), and two files (iedb_test, cedar_test) containing evaluation IEDB and CEDAR test data. The datasets were constructed such that there was no 8-mer overlap between the train and evaluation test data.
Note: the final NetMHCpan-4.2 method available on this webserver was trained on the full IEDB and CEDAR training partitions (included in NetMHCpan_train.tar.gz).
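The "no 8-mer overlap" criterion can be verified by collecting every 8-residue substring of the training peptides and checking the evaluation peptides against that set. The sketch below illustrates the check; it is not the procedure used to build the datasets, and the function names are hypothetical.

```python
def kmers(peptide, k=8):
    """Return the set of all length-k substrings of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def shared_kmers(train_peptides, test_peptides, k=8):
    """Return the k-mers that occur in both peptide collections."""
    train_set = set()
    for peptide in train_peptides:
        train_set |= kmers(peptide, k)
    return {km for peptide in test_peptides for km in kmers(peptide, k) if km in train_set}
```

An empty result means the two collections share no 8-mer. Peptides shorter than k contribute no k-mers and therefore never count as overlapping.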
Please click on the version number to activate the corresponding server (if available).
4.2: The current version (online since 15/02/2025).
4.1: Online since 10 Dec 2019.
4.0: Online since 5 Sep 2017.
3.0: Online since 10 Feb 2016.
2.8: Online since 26 Feb 2013.
2.4: Online since 18 Dec 2010.
2.3: Online since 08 Sept 2010.
2.2: Online since 01 Sept 2009.
2.1: Online since 06 April 2009.
2.0: Online since 24 June, 2008.
1.1: Online from March 2007 until 24 of June, 2008.
1.0: Original version (online version until March 2007).
If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).
If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
Correspondence:
Technical Support: