DTU Health Tech

Department of Health Technology

NetMHCpan - 4.2

Pan-specific binding of peptides to MHC class I proteins of known sequence

The NetMHCpan-4.2 server predicts binding of peptides to any MHC molecule of known sequence using artificial neural networks (ANNs). The method is trained on a combination of more than 1,000,000 quantitative Binding Affinity (BA) and Mass-Spectrometry Eluted Ligands (EL) peptides. The BA data covers 171 MHC molecules from human (HLA-A, B, C, E), mouse (H-2), cattle (BoLA), primates (Patr, Mamu, Gogo), swine (SLA) and equine (Eqca). The EL data covers 201 MHC molecules from human (HLA-A, B, C, G), mouse (H-2), cattle (BoLA), primates (Mamu) and dog (DLA). Furthermore, the user can obtain predictions to any custom MHC class I molecule by uploading a full length MHC protein sequence. Predictions can be made for peptides of any length.

The server returns as default the likelihood of a peptide being a natural ligand of the selected MHC(s). If selected, the predicted binding affinity is also reported.

New in this version: NetMHCpan-4.2 is trained on an extended set of BA and EL data including both single-allelic (SA) and multi-allelic (MA) datapoints. The method is based on an updated version of the NNAlign_MA algorithm, which incorporates new features related to amino acid deletions. Further, the method is fine-tuned on ~43,000 pathogen-derived epitopes from the IEDB and ~5,200 neoepitopes from CEDAR. These fine-tuned methods can be selected instead of the default antigen presentation method in order to predict immunogenicity.

View the version history of this server. All previous versions are available online, for comparison and reference.

The project is a collaboration between DTU Bioinformatics and LIAI.

Updated August 12 2025: A bug related to BA rank scores in the Excel output has been fixed. Packages have been updated (4.2b).

SUBMISSION

Hover the mouse cursor over the symbol for a short description of the options


Select prediction mode 


INPUT TYPE:

Paste a single sequence or several sequences in FASTA format into the field below:

... or upload a file in FASTA format directly from your local disk:

... or load some sample data:



PEPTIDE LENGTH:  

You may select multiple lengths


SELECT SPECIES/LOCI:



Select Allele(s) (max 20 per submission)


... or type Allele names (e.g. HLA-A01:01) separated by commas and without spaces (max 20 per submission): 

For a list of allowed allele names click here

... or paste a single full length MHC protein sequence in FASTA format into the field below:

... or load a file containing a full length MHC protein sequence in FASTA format directly from your local disk:



ADDITIONAL CONFIGURATION:

Threshold for strong prediction: % Rank 

Threshold for weak prediction: % Rank 

Filtering threshold for %Rank (leave -99 to print all) 

Include BA predictions  

Use context encoding

Sort by prediction score 

Save predictions to XLS file 



Restrictions:
At most 500 sequences per submission; each sequence not more than 20000 amino acids and not less than 8 amino acids. Max 20 MHC alleles per submission.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

  • NetMHCpan-4.2: Improved prediction of CD8+ epitopes by use of transfer learning and structural features
    Jonas Birkelund Nilsson, Jason Greenbaum, Bjoern Peters and Morten Nielsen
    Frontiers in Immunology, 07 August 2025, https://doi.org/10.3389/fimmu.2025.1616113
  • NetMHCpan-4.1 and NetMHCIIpan-4.0: Improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data
    Birkir Reynisson, Bruno Alvarez, Sinu Paul, Bjoern Peters and Morten Nielsen
    Nucleic Acids Research, Volume 48, Issue W1, 02 July 2020, Pages W449–W454; DOI: 10.1093/nar/gkaa379
  • NetMHCpan-4.0: Improved Peptide MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data
    Vanessa Jurtz, Sinu Paul, Massimo Andreatta, Paolo Marcatili, Bjoern Peters and Morten Nielsen
    The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893
  • NetMHCpan-3.0: improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length data sets
    Morten Nielsen and Massimo Andreatta
    Genome Medicine (2016): 8:33
  • NetMHCpan, a method for MHC class I binding prediction beyond humans
    Ilka Hoof, Bjoern Peters, John Sidney, Lasse Eggers Pedersen, Ole Lund, Soren Buus, and Morten Nielsen
    Immunogenetics 61.1 (2009): 1-13
    PMID: 19002680

DATA RESOURCES

Data resources used to develop this server were obtained from

  • IEDB database.
    • Quantitative peptide binding data were obtained from the IEDB database.
  • IMGT/HLA database. Robinson J, Malik A, Parham P, Bodmer JG, Marsh SGE: IMGT/HLA - a sequence database for the human major histocompatibility complex. Tissue Antigens (2000), 55:280-287.
    • HLA protein sequences were obtained from the IMGT/HLA database (version 3.1.0).

PORTABLE VERSION

Would you prefer to run NetMHCpan at your own site? NetMHCpan v. 4.2 is available as a stand-alone software package, with the same functionality as the service above. Ready-to-ship packages exist for the most common UNIX platforms. There is a download tab for academic users; other users are requested to contact CBS Software Package Manager at health-software@dtu.dk.

Instructions


PREDICTION MODE

The first step is to select the type of predictions to make:

- Antigen presentation: This is the default mode which predicts the likelihood of peptides being presented by a given MHC molecule.
- Pathogen epitopes: This mode uses a model fine-tuned to predict pathogen-derived epitopes.
- Cancer neoepitopes: This mode uses a model fine-tuned to predict cancer-derived neoepitopes.

INPUT DATA

In this section, the user must define the input for the prediction server following these steps:

1) Specify the desired type of input data (FASTA or PEPTIDE) using the drop down menu.

2) Provide the input data by means of pasting the data into the blank field, uploading it using the "Choose File" button or by loading sample data using the "Load Data" button. All the input sequences must be in one-letter amino acid code. The alphabet is as follows (case sensitive):

A C D E F G H I K L M N P Q R S T V W Y and X (unknown)

Any other symbol will be converted to X before processing. At most 500 sequences are allowed per submission; each sequence must be not more than 20,000 amino acids long and not less than 8 amino acids long.


3) If FASTA was selected as input type, the user must select the peptide length(s) the prediction server is going to work with. NetMHCpan-4.2 will "chop" the input FASTA sequence into overlapping peptides of the provided length(s) and will predict binding against all of them. By default, input proteins are digested into 9-mer peptides. Note that if PEPTIDE was selected as input type, this step is unnecessary and the peptide length selector will not appear in the interface.
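The input rules in steps 2) and 3) can be illustrated with a small sketch. It is not the server's code; the function names (`sanitize`, `check_limits`, `chop`) are hypothetical, and the offsets returned by `chop` are plain Python string offsets.

```python
# Illustrative sketch only; names are hypothetical, not part of NetMHCpan.
ALPHABET = set("ACDEFGHIKLMNPQRSTVWYX")  # case sensitive, as stated above
MAX_SEQUENCES = 500
MIN_LEN, MAX_LEN = 8, 20000


def sanitize(seq):
    """Replace every symbol outside the allowed alphabet with X."""
    return "".join(c if c in ALPHABET else "X" for c in seq)


def check_limits(seqs):
    """Return a list of messages for sequences that break the stated limits."""
    problems = []
    if len(seqs) > MAX_SEQUENCES:
        problems.append(f"{len(seqs)} sequences exceed the limit of {MAX_SEQUENCES}")
    for i, s in enumerate(seqs):
        if not MIN_LEN <= len(s) <= MAX_LEN:
            problems.append(f"sequence {i} has length {len(s)}")
    return problems


def chop(seq, lengths=(9,)):
    """Yield (offset, peptide) for every overlapping window of each length.

    offset is the 0-based Python string offset of the window.
    """
    for k in lengths:
        for i in range(len(seq) - k + 1):
            yield i, seq[i:i + k]
```

For a protein of length N and a peptide length k, this produces N - k + 1 peptides, so selecting several lengths multiplies the number of predictions accordingly.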




MHC SELECTION

Here, the user must define which MHC molecule(s) the input data is going to be predicted against:

1) First, select the HLA/MHC supertype family.

2) After selecting the MHC family, the user will be able to select a single or multiple MHC molecules from the updated "Select Allele(s)" list. Alternatively, the user may opt to directly type the MHC names in the provided blank field (separated by commas and without blank spaces); in this case, there is no need to select an MHC supertype family from the drop-down menu. Click here for a list of MHC molecule names (use the names in the first column). Please note that a maximum of 20 MHC types is allowed per submission.

3) Optionally, the user may choose to paste a full MHC protein sequence in the blank box, or directly upload it by clicking the "Choose file" button. Such sequence must be in FASTA format.

Please note that steps 2) and 3) are mutually exclusive, and are only labeled this way for explanation purposes.
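As a rough illustration of the constraints above (comma-separated names without spaces, at most 20 per submission, and either allele names or a custom sequence but not both), the following sketch builds the text for the allele field. The function names are hypothetical and the code does not check names against the list of allowed alleles.

```python
# Illustrative sketch only; names are hypothetical, not part of NetMHCpan.
MAX_ALLELES = 20


def allele_field(names):
    """Join allele names with commas and no spaces, enforcing the 20-allele cap."""
    cleaned = [n.strip() for n in names if n.strip()]
    if not cleaned:
        raise ValueError("at least one allele name is required")
    if len(cleaned) > MAX_ALLELES:
        raise ValueError(f"{len(cleaned)} alleles given, at most {MAX_ALLELES} allowed")
    for n in cleaned:
        if " " in n or "," in n:
            raise ValueError(f"allele name {n!r} must not contain spaces or commas")
    return ",".join(cleaned)


def choose_mhc_input(allele_names=None, mhc_fasta=None):
    """Return ('alleles', text) or ('sequence', fasta); the two are mutually exclusive."""
    if bool(allele_names) == bool(mhc_fasta):
        raise ValueError("provide either allele names or one MHC sequence, not both")
    if allele_names:
        return "alleles", allele_field(allele_names)
    return "sequence", mhc_fasta
```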


ADDITIONAL CONFIGURATION

In this section, the user may define additional parameters to further customize the run:

1, 2) Specify thresholds for strong and weak binders. They are expressed in terms of %Rank, that is, the percentile of the predicted binding affinity compared to the distribution of affinities calculated on a set of random natural peptides. The peptide will be identified as a strong binder if it is found among the top x% predicted peptides, where x% is the specified threshold for strong binders (by default 0.5%). The peptide will be identified as a weak binder if the %Rank is above the threshold of the strong binders but below the specified threshold for the weak binders (by default 2%).

3) Specify a %Rank threshold to filter out predictions. Only sequences with a predicted %Rank value less than the specified threshold will be printed. To print all predictions, leave this value set to -99.

4) Tick this option to include also Binding Affinity predictions together with Eluted Ligand likelihood. Note that this option is only available for the antigen presentation prediction mode.

5) Context encoding informs the network of the proteolytic context of the ligand. Context is automatically generated from the source protein if the user selects FASTA format. The context consists of 12 amino acids: 3 upstream of the ligand, 3 from the N-terminus, 3 from the C-terminus, and 3 downstream from the ligand. If PEPTIDE is selected, the user must specify the ligand context (see PEPTIDECONT). Note that this option is only available for the antigen presentation prediction mode.

6) Tick this box to have the output sorted by descending prediction score.

7) Enable this option to export the prediction output to .XLS format (readable by most spreadsheet software, such as Microsoft Excel).
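The threshold, filtering and context options described in this section can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the function names are hypothetical, the handling of a %Rank exactly equal to a threshold is assumed to be exclusive, and padding with X when fewer than 3 flanking residues exist is also an assumption.

```python
# Illustrative sketch only; names and edge-case behaviour are assumptions.

def bind_level(rank, strong=0.5, weak=2.0):
    """Label a prediction from its %Rank using the strong/weak thresholds."""
    if rank < strong:
        return "SB"
    if rank < weak:
        return "WB"
    return ""


def passes_filter(rank, filter_threshold=-99.0):
    """Keep everything when the threshold is left at -99, else keep rank < threshold."""
    return filter_threshold == -99.0 or rank < filter_threshold


def build_context(protein, offset, length, pad="X"):
    """Build the 12-residue context of a ligand within its source protein.

    offset is the 0-based Python offset of the ligand. The result is
    3 upstream residues + first 3 ligand residues + last 3 ligand residues
    + 3 downstream residues. Padding with X at the protein ends is an
    assumption made for this sketch.
    """
    upstream = protein[max(0, offset - 3):offset].rjust(3, pad)
    ligand = protein[offset:offset + length]
    downstream = protein[offset + length:offset + length + 3].ljust(3, pad)
    return upstream + ligand[:3] + ligand[-3:] + downstream
```

For example, for the ligand HLVEALYLV inside the fragment NQHLCGSHLVEALYLVCGERG, `build_context` returns CGSHLVYLVCGE.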


SUBMISSION

After the user has finished the "PREDICTION MODE", "INPUT DATA", "MHC SELECTION" and "ADDITIONAL CONFIGURATION" steps, the submission can now be done. To do so, the user can click on "Submit" to submit the job to the processing server, or click on "Clear fields" to clear the page and start over.

The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

After the server has finished running the corresponding predictions, an output page will be delivered to the user. A description of the output format can be found under output format.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; when it terminates you will be notified by e-mail with a URL to your results. They will be stored on the server for 24 hours.


EXAMPLE

For the following FASTA input example:

>sp|P01308|INS_HUMAN Insulin OS=Homo sapiens OX=9606 GN=INS PE=1 SV=1
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
LQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
With parameters:

Prediction mode: Antigen presentation
Peptide length: 8, 9, 10, 11, 12
Allele: HLA-A*02:01
Sort by prediction score: On



NetMHCpan-4.2 will return the following output (showing the first 10 predicted peptides):

# NetMHCpan version 4.2b

# Tmpdir made /var/www/services/services/NetMHCpan-4.2/tmp/netMHCpan_JA3ENY

# Input is in FASTA format

# Peptide length 8,9,10,11,12

# Prediction Mode: EL

# HLA-A02:01 : Distance to training data  0.000 (using nearest neighbor HLA-A02:01)

# Allele: HLA-A02:01
# Rank Threshold for Strong prediction   0.500
# Rank Threshold for Weak prediction   2.000
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Pos         MHC           Peptide      Core Of Gp Gl Ip Il             Icore        Identity    Score    %Rank BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------------------------
  34 HLA-A*02:01         HLVEALYLV HLVEALYLV  0  0  0  0  0         HLVEALYLV sp_P01308_INS_H 0.939284    0.030 <= SB
   6 HLA-A*02:01         RLLPLLALL RLLPLLALL  0  0  0  0  0         RLLPLLALL sp_P01308_INS_H 0.855630    0.081 <= SB
  15 HLA-A*02:01        ALWGPDPAAA ALWPDPAAA  0  3  1  0  0        ALWGPDPAAA sp_P01308_INS_H 0.854039    0.082 <= SB
  15 HLA-A*02:01         ALWGPDPAA ALWGPDPAA  0  0  0  0  0         ALWGPDPAA sp_P01308_INS_H 0.673127    0.234 <= SB
  15 HLA-A*02:01      ALWGPDPAAAFV ALWPAAAFV  0  3  3  0  0      ALWGPDPAAAFV sp_P01308_INS_H 0.498366    0.440 <= SB
  76 HLA-A*02:01       SLQPLALEGSL SLQPLALSL  0  7  2  0  0       SLQPLALEGSL sp_P01308_INS_H 0.285093    0.884 <= WB
  15 HLA-A*02:01       ALWGPDPAAAF ALWDPAAAF  0  3  2  0  0       ALWGPDPAAAF sp_P01308_INS_H 0.284544    0.886 <= WB
   6 HLA-A*02:01        RLLPLLALLA RLLPLLLLA  0  6  1  0  0        RLLPLLALLA sp_P01308_INS_H 0.252655    1.004 <= WB
   2 HLA-A*02:01         ALWMRLLPL ALWMRLLPL  0  0  0  0  0         ALWMRLLPL sp_P01308_INS_H 0.226048    1.101 <= WB
  32 HLA-A*02:01       GSHLVEALYLV GLVEALYLV  0  1  2  0  0       GSHLVEALYLV sp_P01308_INS_H 0.216789    1.147 <= WB
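For downstream processing, the rows of the output above can be read by splitting on whitespace. The sketch below assumes the default column layout shown in this example (without BA predictions, which are not shown here); `Row` and `parse_row` are hypothetical names.

```python
# Illustrative sketch only; assumes the column layout of the example above.
from typing import NamedTuple, Optional


class Row(NamedTuple):
    pos: int
    mhc: str
    peptide: str
    core: str
    of: int
    gp: int
    gl: int
    ip: int
    il: int
    icore: str
    identity: str
    score: float
    rank: float
    bind_level: str  # "SB", "WB" or "" when no level is printed


def parse_row(line: str) -> Optional[Row]:
    """Parse one prediction row; return None for headers, comments and rulers."""
    t = line.split()
    if len(t) < 13 or not t[0].isdigit():
        return None
    # The bind level, when present, is printed as "<= SB" or "<= WB".
    level = t[14] if len(t) >= 15 and t[13] == "<=" else ""
    return Row(int(t[0]), t[1], t[2], t[3], *map(int, t[4:9]),
               t[9], t[10], float(t[11]), float(t[12]), level)
```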


DESCRIPTION

The prediction output for each molecule consists of the following columns:

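  • Pos: Residue number of the first residue of the peptide in the protein sequence. In the example above, numbering starts from 1 (ALWMRLLPL, which begins at the second residue of the protein, is reported at Pos 2).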

  • MHC: Specified MHC molecule / Allele name.

  • Peptide: Amino acid sequence of the potential ligand.

  • Core: The minimal 9 amino acid binding core directly in contact with the MHC (i.e. excluding potential insertions).

  • Of: The starting position of the Core within the Peptide (if > 0, the method predicts an N-terminal protrusion).

  • Gp: Position of the deletion, if any.

  • Gl: Length of the deletion, if any.

  • Ip: Position of the insertion, if any.

  • Il: Length of the insertion, if any.

  • Icore: Interaction core. This is the sequence of the peptide bound and presented by the MHC.

  • Identity: Protein identifier, i.e. the name of the FASTA entry.

  • Score: The raw prediction score.

  • %Rank: Rank of the predicted binding score compared to a set of random natural peptides. This measure is not affected by inherent bias of certain molecules towards higher or lower mean predicted affinities. Strong predictions are defined as having %Rank < 0.5, and weak predictions as having %Rank < 2. We advise selecting candidate peptides based on %Rank rather than Score.

  • BindLevel: (SB: Strong Binder, WB: Weak Binder). The peptide will be identified as a strong binder if the %Rank is below the specified threshold for the strong binders (by default, 0.5%). The peptide will be identified as a weak binder if the %Rank is above the threshold of the strong binders but below the specified threshold for the weak binders (by default, 2%).
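The idea behind %Rank can be sketched as an empirical percentile against scores of random natural peptides. The text above does not describe how the server computes or stores this distribution, so the code below is only a conceptual illustration with a hypothetical function name and made-up background values.

```python
# Conceptual sketch only; not the server's procedure.
from bisect import bisect_left


def percent_rank(score, background_sorted):
    """Percentage of background scores that are >= score (lower is stronger).

    background_sorted must be sorted in ascending order.
    """
    n = len(background_sorted)
    if n == 0:
        raise ValueError("background distribution is empty")
    strictly_lower = bisect_left(background_sorted, score)
    return 100.0 * (n - strictly_lower) / n


# Made-up background scores for one MHC molecule, sorted ascending.
background = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
```

Because each MHC molecule is compared against its own background, two molecules with different typical Score levels can still be compared on the %Rank scale.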



  • NOTES

    Peptide vs. iCore vs. Core

    Three amino acid sequences are reported for each row of predictions:
    The Peptide is the complete amino acid sequence evaluated by NetMHCpan. Peptides are the full sequences submitted as a peptide list, or the result of digestion of source proteins (FASTA submission).
    The iCore is a substring of Peptide, encompassing all residues between P1 and P-omega of the MHC. For all intents and purposes, this is the minimal candidate ligand/epitope that should be considered for further validation.
    The Core is always 9 amino acids long, and is a construction used for sequence alignment and identification of binding anchors.
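For peptides longer than 9 residues, the relation between Peptide and Core in the example output can be reproduced from the Of, Gp and Gl columns. The sketch below covers deletions only and assumes that Gp is a 0-based position counted after the Of offset has been applied; all example rows have Of = 0, so that assumption is not verified here. Insertions (Ip, Il) are not handled. The function name is hypothetical.

```python
# Illustrative sketch only; handles deletions, not insertions.

def core_from_deletion(peptide, of, gp, gl, core_len=9):
    """Rebuild the 9-residue Core by skipping `of` residues and deleting gl residues at gp."""
    body = peptide[of:]
    core = body[:gp] + body[gp + gl:]
    return core[:core_len]
```

For instance, ALWGPDPAAAFV with Gp = 3 and Gl = 3 drops the residues GPD and gives the core ALWPAAAFV, matching the example output.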

    ARTICLE ABSTRACTS


    MAIN REFERENCE


    NetMHCpan-4.2: Improved prediction of CD8+ epitopes by use of transfer learning and structural features

    Jonas Birkelund Nilsson, Jason Greenbaum, Bjoern Peters and Morten Nielsen

    Frontiers in Immunology, 07 August 2025, https://doi.org/10.3389/fimmu.2025.1616113

    Identification of CD8+ T cell epitopes is crucial for advancing vaccine development and immunotherapy strategies. Traditional methods for predicting T cell epitopes primarily focus on MHC presentation, leveraging immunopeptidome data. Recent advancements however suggest significant performance improvements through transfer learning and refinement using epitope data. To further investigate this, we here develop an enhanced MHC class I (MHC-I) antigen presentation predictor by integrating newly curated binding affinity and eluted ligand datasets, expanding MHC allele coverage, and incorporating novel input features related to the structural constraints of the MHC-I peptide-binding cleft. We next apply transfer learning using experimentally validated pathogen-and cancer-derived epitopes from public databases to refine our prediction method, ensuring comprehensive data partitioning to prevent performance overestimation. Our findings indicate that fine-tuning on epitope data only yields a minor accuracy boost. Moreover, the transferability between cancer and pathogen-derived epitopes is limited, suggesting distinct properties between these data types. In conclusion, while transfer learning can enhance T cell epitope prediction, the performance gains are modest and data type specific. Our final NetMHCpan-4.2 model is publicly accessible at https://services.healthtech.dtu.dk/services/NetMHCpan-4.2, providing a valuable resource for immunological research and therapeutic development.





    EARLIER REFERENCES


    NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data

    Birkir Reynisson, Bruno Alvarez, Sinu Paul, Bjoern Peters and Morten Nielsen

    Nucleic Acids Research, Volume 48, Issue W1, 02 July 2020, Pages W449–W454; DOI: 10.1093/nar/gkaa379

    Major histocompatibility complex (MHC) molecules are expressed on the cell surface, where they present peptides to T cells, which gives them a key role in the development of T-cell immune responses. MHC molecules come in two main variants: MHC Class I (MHC-I) and MHC Class II (MHC-II). MHC-I predominantly present peptides derived from intracellular proteins, whereas MHC-II predominantly presents peptides from extracellular proteins. In both cases, the binding between MHC and antigenic peptides is the most selective step in the antigen presentation pathway. Therefore, the prediction of peptide binding to MHC is a powerful utility to predict the possible specificity of a T-cell immune response. Commonly MHC binding prediction tools are trained on binding affinity or mass spectrometry-eluted ligands. Recent studies have however demonstrated how the integration of both data types can boost predictive performances. Inspired by this, we here present NetMHCpan-4.1 and NetMHCIIpan-4.0, two web servers created to predict binding between peptides and MHC-I and MHC-II, respectively. Both methods exploit tailored machine learning strategies to integrate different training data types, resulting in state-of-the-art performance and outperforming their competitors. The servers are available at http://www.cbs.dtu.dk/services/NetMHCpan-4.1/ and http://www.cbs.dtu.dk/services/NetMHCIIpan-4.0/.




    NetMHCpan-4.0: Improved Peptide–MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data

    Vanessa Jurtz 1, Sinu Paul 2, Massimo Andreatta 3, Paolo Marcatili 1, Bjoern Peters 2, and Morten Nielsen1,3

    The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893

    1 Department of Bio and Health Informatics, Technical University of Denmark, DK-2800 Lyngby, Denmark
    2 Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, CA92037 La Jolla, USA
    3 Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina

    Cytotoxic T cells are of central importance in the immune system’s response to disease. They recognize defective cells by binding to peptides presented on the cell surface by MHC class I molecules. Peptide binding to MHC molecules is the single most selective step in the Ag-presentation pathway. Therefore, in the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has attracted widespread attention. In the past, predictors of peptide–MHC interactions have primarily been trained on binding affinity data. Recently, an increasing number of MHC-presented peptides identified by mass spectrometry have been reported containing information about peptide-processing steps in the presentation pathway and the length distribution of naturally presented peptides. In this article, we present NetMHCpan-4.0, a method trained on binding affinity and eluted ligand data leveraging the information from both data types. Large-scale benchmarking of the method demonstrates an increase in predictive performance compared with state-of-the-art methods when it comes to identification of naturally processed ligands, cancer neoantigens, and T cell epitopes.


    NetMHCpan-3.0: improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length data sets

    Morten Nielsen1,2 and Massimo Andreatta1

    Genome Medicine (2016): 8:33

    1Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
    2Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark

    Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space.

    Full text   [PDF]


    NetMHCpan - MHC class I binding prediction beyond humans

    Hoof I1, Peters B3, Sidney J3, Pedersen LE2, Lund O1, Buus S2, Nielsen M1

    Immunogenetics. (2009) Jan;61(1):1-13.

    1Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
    2Division of Experimental Immunology, Institute of Medical Microbiology and Immunology, University of Copenhagen, Denmark
    3La Jolla Institute for Allergy and Immunology, San Diego, California, United States of America

    Binding of peptides to major histocompatibility complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC genomic region (called HLA) is extremely polymorphic comprising several thousand alleles, each encoding a distinct MHC molecule. The potentially unique specificity of the majority of HLA alleles that have been identified to date remains uncharacterized. Likewise, only a limited number of chimpanzee and rhesus macaque MHC class I molecules have been characterized experimentally. Here, we present NetMHCpan-2.0, a method that generates quantitative predictions of the affinity of any peptide-MHC class I interaction. NetMHCpan-2.0 has been trained on the hitherto largest set of quantitative MHC binding data available, covering HLA-A and HLA-B, as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class I molecules. We show that the NetMHCpan-2.0 method can accurately predict binding to uncharacterized HLA molecules, including HLA-C and HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict peptide binding to chimpanzee and macaque MHC class I molecules. The power of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune responses in large out-bred populations is demonstrated. Further, we used NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class I molecule SLA-1*0401. Ninety-three percent of the predicted peptides were demonstrated to bind stronger than 500 nM. The high performance of NetMHCpan-2.0 for non-human primates documents the method's ability to provide broad allelic coverage also beyond human MHC molecules. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.

    PMID: 19002680

    Full text

    Supplementary material

    Here, you will find the data set used for training and evaluation of NetMHCpan-4.2.

    Training data

    NetMHCpan_train.tar.gz

    Download the file and untar the content using

    cat NetMHCpan_train.tar.gz | tar xzf -
    

    This will create a directory called NetMHCpan_train containing 22 files. Twenty of them (c00?_ba, c00?_el, c00?_iedb, c00?_cedar) are partitions with binding affinity data (ba), eluted ligand data (el), pathogen epitopes (iedb) or neoepitopes (cedar). The format of each file is as follows (shown here for an el file):

    AEQNRKDAEAW 1 Sarkizova_2020__A0202 EQLAEQEAWFNE
    AEQRGELAIKD 1 Sarkizova_2020__A0202 IADAEQIKDANA
    AIFDRVLTEL 1 Sarkizova_2020__A0202 GVGAIFTELVSK
    AKNKLNDLED 1 Sarkizova_2020__A0202 LKDAKNLEDALQ
    ALADGVVSQA 1 Sarkizova_2020__A0202 DTGALASQAVKE
    ALADVAYYTM 1 Sarkizova_2020__A0202 HVFALAYTMLRK
    ALADVMSQL 1 Sarkizova_2020__A0202 GSWALASQLKKK
    ALAEKLDRL 1 Sarkizova_2020__A0202 LGAALADRLATA
    ALAELSESL 1 Sarkizova_2020__A0202 NAEALAESLRNR
    ALAERQQLI 1 Sarkizova_2020__A0202 KKGALAQLIPEL
    
    where the columns are peptide, target value, MHC molecule or cell-line ID, and context. In cases where the third column is a cell-line ID, the MHC molecules expressed in the cell line are listed in the allelelist file.

    The allelelist file lists the alleles expressed in each EL data set, and the pseudoseqs file contains the MHC pseudo sequence for each MHC molecule.
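A record in the el format shown above can be read as four whitespace-separated fields. The sketch below assumes exactly that layout (other file types in the archive may differ) and uses hypothetical names. It also splits the 12-residue context into its four 3-residue parts; in the example lines, the two middle parts match the first and last three residues of the peptide.

```python
# Illustrative sketch only; assumes the four-column el layout shown above.
from typing import NamedTuple


class TrainRecord(NamedTuple):
    peptide: str
    target: float
    source: str   # MHC molecule or cell-line ID
    context: str


def parse_train_line(line):
    """Split one el record into peptide, target value, source and context."""
    peptide, target, source, context = line.split()
    return TrainRecord(peptide, float(target), source, context)


def split_context(context):
    """Return the four 3-residue parts of a 12-residue context string."""
    if len(context) != 12:
        raise ValueError(f"expected 12 residues, got {len(context)}")
    return tuple(context[i:i + 3] for i in range(0, 12, 3))
```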

    Evaluation data

    NetMHCpan_eval.tar.gz

    Download the file and untar the content using

    cat NetMHCpan_eval.tar.gz | tar xzf -
    

    This will create a directory called NetMHCpan_eval containing 12 files: 10 training partitions (c00?_iedb, c00?_cedar) with pathogen epitopes (iedb) or neoepitopes (cedar), and two files (iedb_test, cedar_test) containing the IEDB and CEDAR evaluation data. The training partitions were constructed such that there was no 8-mer overlap between the train and test data.
    Note: the final NetMHCpan-4.2 method available on this webserver was trained on the full IEDB and CEDAR training partitions (included in NetMHCpan_train.tar.gz).
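The "no 8-mer overlap" criterion can be checked with a few lines of code. This is a sketch of one way to test it on lists of peptide strings, not the procedure used to build the partitions; the function names are hypothetical.

```python
# Illustrative sketch only; not the partitioning procedure used by the authors.

def kmers(peptide, k=8):
    """Return the set of all length-k substrings of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def shared_kmers(train_peptides, test_peptides, k=8):
    """Return every k-mer that occurs in both the train and the test peptides."""
    train_set = set()
    for p in train_peptides:
        train_set |= kmers(p, k)
    test_set = set()
    for p in test_peptides:
        test_set |= kmers(p, k)
    return train_set & test_set
```

An empty result means that no test peptide shares a stretch of 8 consecutive residues with any training peptide.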


    References

    NetMHCpan-4.2: Improved prediction of CD8+ epitopes by use of transfer learning and structural features
    Jonas Birkelund Nilsson, Jason Greenbaum, Bjoern Peters and Morten Nielsen
    Frontiers in Immunology, 07 August 2025, https://doi.org/10.3389/fimmu.2025.1616113

    Version history


    Please click on the version number to activate the corresponding server (if available).

    4.2 The current version (online since 15/02/2025). New in this version:
    • NetMHCpan-4.2 is trained on an extended set of BA and EL data including both single-allelic (SA) and multi-allelic (MA) datapoints. The method is based on an updated version of the NNAlign_MA algorithm, which incorporates new features related to amino acid deletions. Further, the method is fine-tuned on ~43,000 pathogen-derived epitopes from the IEDB database and ~5,200 neoepitopes from the CEDAR database.
    • The server returns a likelihood of a peptide either being a natural ligand, pathogen-derived epitope or cancer-derived neoepitope, depending on the prediction mode chosen.

    Publication:

    • NetMHCpan-4.2: Improved prediction of CD8+ epitopes by use of transfer learning and structural features
      Jonas Birkelund Nilsson, Jason Greenbaum, Bjoern Peters and Morten Nielsen
      Frontiers in Immunology, 07 August 2025, https://doi.org/10.3389/fimmu.2025.1616113
    4.1 Online since 10 Dec 2019. New in this version:
    • NetMHCpan is now trained on Mass Spectrometry Eluted Ligands (EL) data from Single Allele (SA, peptides annotated to a single MHC) and Multi Allele (MA, peptides annotated to multiple MHCs) sources. The use of EL MA data is now possible due to an upgrade of NNAlign (called NNAlign_MA) which enables pseudo-labelling.
    • The server returns by default the likelihood of a peptide being a natural ligand and, if selected, also the predicted binding affinity.

    Publication:

    • NetMHCpan-4.1 and NetMHCIIpan-4.0: Improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data
      Birkir Reynisson, Bruno Alvarez, Sinu Paul, Bjoern Peters and Morten Nielsen
      Nucleic Acids Research, Volume 48, Issue W1, 02 July 2020, Pages W449–W454; DOI: 10.1093/nar/gkaa379
    4.0 Online since 5 Sep 2017. New in this version:
    • NetMHCpan is now trained both on affinity data and naturally eluted ligands
    • The method predicts two properties: likelihood of ligand presentation and binding affinity

    Publication:

    • NetMHCpan-4.0: Improved Peptide–MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data
      Vanessa Jurtz, Sinu Paul, Massimo Andreatta, Paolo Marcatili, Bjoern Peters and Morten Nielsen
      The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893
    3.0 Online since 10 Feb 2016. New in this version:
    • Improved algorithm implementing insertions and deletions in the alignment
    • Method trained on extended data-set including peptides of length 8 to 13
    • Length distribution of predicted binders that reflects the MHC molecules preferences

    Publication:

    • NetMHCpan-3.0: improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length data sets
      Morten Nielsen and Massimo Andreatta
      Genome Medicine (2016): 8:33
    2.8 Online since 26 Feb 2013. New in this version:
    • Method retrained on an extended data set including 10 prevalent HLA-C and 7 prevalent BoLA MHC-I molecules.
    2.4 Online since 18 Dec 2010. New in this version:
    • Method retrained on an extended data set including several HLA-C alleles and two BoLA alleles.
    2.3 Online since 08 Sept 2010. New in this version:
    • Method retrained on the version 2.2 data excluding data from the Mamu-A1*02601 allele due to data contamination. Also, the method has been updated to include the newest MHC allele releases from the IMGT/HLA and IPD-MHC databases (for non-human primates and pig). These updates include incorporation of the new nomenclature for HLA and Rhesus macaque (Mamu).
    2.2 Online since 01 Sept 2009. New in this version:
    • Method retrained on an extended data set covering more than 100 MHC alleles and more than 110,000 peptide/MHC interactions.
    2.1 Online since 06 April 2009. New in this version:
    • Predicted binding scores are shown as percent rank relative to a pool of 1,000,000 random natural 9-mer peptides. IC50 values are only shown for a set of white-listed alleles (mostly HLA-A and HLA-B alleles) where the values can be relied on.
    2.0 Online since 24 June, 2008. New in this version:
    • Binding predictions for all known MHC molecules including HLA-C, non-classical HLA (HLA-E and HLA-G), non-human primates, pig and mouse.
    • Prediction of performance accuracy

    Publication:

    • NetMHCpan - MHC class I binding prediction beyond humans
      Ilka Hoof, Bjoern Peters, John Sidney, Lasse Eggers Pedersen, Ole Lund, Soren Buus, and Morten Nielsen
      Immunogenetics 61.1 (2009): 1-13

    1.1 Online from March 2007 until 24 of June, 2008. New in this version:
    • Includes prediction of peptides of length 8-11.

    1.0 Original version (online version until March 2007):

    Publication:

    • NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence.
      Nielsen M, et al. (2007) PLoS ONE 2(8): e796. doi:10.1371/journal.pone.0000796
      View the full text version at PLoSONE: Full text, or Full text including supplementary materials: PDF_fulltext.pdf

    Software Downloads




    GETTING HELP

    If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

    If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.

    Correspondence: Technical Support: