For publication of results, please cite:
NetH2pan: A Computational Tool to Guide MHC peptide prediction on Murine Tumors
Christa I. DeVette, Massimo Andreatta, Wilfried Bardet, Steven J. Cate, Vanessa I. Jurtz, Kenneth W. Jackson, Alana L. Welm, Morten Nielsen, William H. Hildebrand
Cancer Immunology Research (2018) DOI: 10.1158/2326-6066.CIR-17-0298
NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data
Vanessa Jurtz, Sinu Paul, Massimo Andreatta, Paolo Marcatili, Bjoern Peters and Morten Nielsen
The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893
Data resources used to develop this server were obtained from:
- IEDB database.
- Quantitative peptide binding data were obtained
from the IEDB database.
- IMGT/HLA database. Robinson J, Malik A, Parham P, Bodmer JG,
Marsh SGE: IMGT/HLA - a sequence database for the human major histocompatibility complex. Tissue Antigens (2000),
- MHC protein sequences were obtained from the IMGT/HLA database (version 3.1.0).
1. Specify the input sequences
All input sequences must be in one-letter amino acid
code. The alphabet is as follows (case sensitive):
A C D E F G H I K L M N P Q R S T V W Y and X (unknown)
Any other symbol will be converted to X before processing.
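The conversion rule above can be reproduced locally before submitting, for example to see which residues the server would treat as unknown. The following is a minimal sketch, not the server's own code; the function name `normalize_sequence` is hypothetical, and treating lower-case letters as "other symbols" is an assumption based on the alphabet being described as case sensitive.

```python
# Allowed one-letter codes as listed above: the 20 standard amino acids plus X.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | {"X"}


def normalize_sequence(seq: str) -> str:
    """Return seq with every symbol outside the allowed alphabet replaced by 'X'.

    Assumption: lower-case letters are not in the alphabet (it is case
    sensitive), so they are replaced as well.
    """
    return "".join(ch if ch in ALLOWED else "X" for ch in seq)


if __name__ == "__main__":
    print(normalize_sequence("SSIEFARL"))   # unchanged, all symbols are allowed
    print(normalize_sequence("SSIEBZ*RL"))  # B, Z and * are replaced by X
```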
The server accepts input in either FASTA or peptide-list format.
Sequences can be submitted in the following two ways:
- Paste a single sequence (just the amino acids), a number of sequences in FASTA
format, or a list of peptides into the upper window of the main server page.
- Select a FASTA file on your local disk, either by typing the file name into the
lower window or by browsing the disk.
At most 5000 sequences per submission; each sequence not more than 20,000 amino acids and not less than 8 amino acids.
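These limits can be checked before uploading. The sketch below is an illustration under stated assumptions: the function name `check_submission` and the `(name, sequence)` record layout are hypothetical, and the check simply applies the three numbers quoted above to every record.

```python
MAX_SEQUENCES = 5000   # at most 5000 sequences per submission
MIN_LENGTH = 8         # each sequence at least 8 amino acids
MAX_LENGTH = 20000     # each sequence at most 20,000 amino acids


def check_submission(records):
    """Check a list of (name, sequence) tuples against the submission limits.

    Returns a list of human-readable problem messages; an empty list means
    the submission is within the stated limits.
    """
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences submitted; the limit is {MAX_SEQUENCES}"
        )
    for name, seq in records:
        if len(seq) < MIN_LENGTH:
            problems.append(f"{name}: {len(seq)} residues, minimum is {MIN_LENGTH}")
        elif len(seq) > MAX_LENGTH:
            problems.append(f"{name}: {len(seq)} residues, maximum is {MAX_LENGTH}")
    return problems


if __name__ == "__main__":
    for message in check_submission([("ok", "SSIEFARL"), ("short", "SSIEF")]):
        print(message)
```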
2. Customize your run
1. Specify peptide length (only for FASTA input). By default input proteins are digested into 9-mer peptides.
2. Select species/loci from the scroll-down menu.
3. Select allele(s) from the scroll-down menu or type in the allele names separated by commas (without blank spaces). If you choose to type in the allele names, you can consult the List of MHC molecule names; use the molecule names in the first column.
4. Optionally specify thresholds for strong and weak binders. They are expressed in terms of %Rank, that is, the percentile of the predicted binding affinity compared to the distribution of affinities calculated on a set of random natural peptides. A peptide is identified as a strong binder if it is found among the top x% of predicted peptides, where x% is the specified threshold for strong binders (by default 0.5%). A peptide is identified as a weak binder if its %Rank is above the threshold for strong binders but below the specified threshold for weak binders (by default 2%).
5. Tick the box Make BA predictions to predict binding affinity scores. By default, the method returns scores of eluted ligand likelihood.
6. Tick the box Sort by affinity to have the output sorted by descending predicted binding affinity.
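Step 1 of the list above says that FASTA input is digested into peptides of the selected length(s). How the server does this internally is not described here; the sketch below assumes a plain sliding window over the protein, with 0-based start positions. The function name `digest` is hypothetical.

```python
def digest(protein, lengths=(9,)):
    """Yield (start, peptide) for every window of each requested length.

    Assumptions: a simple sliding window and 0-based start positions.
    A protein shorter than a given length yields no peptides of that length.
    """
    for k in lengths:
        for start in range(len(protein) - k + 1):
            yield start, protein[start:start + k]


if __name__ == "__main__":
    # A 10-residue toy sequence gives two 9-mers.
    for start, peptide in digest("SSIEFARLQF"):
        print(start, peptide)
```

Under this assumption a protein of N residues yields N - k + 1 peptides of length k, so a 904-residue sequence digested at lengths 8, 9, 10 and 11 gives 897 + 896 + 895 + 894 = 3582 peptides.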
3. Submit the job
Click on the "Submit" button. The status of your job (either 'queued'
or 'running') will be displayed and constantly updated until it terminates and
the server output appears in the browser window.
At any time during the wait you may enter your e-mail address and simply leave
the window. Your job will continue; when it terminates you will be notified by
e-mail with a URL to your results. Results are stored on the server for 24 hours.
A description of the output format can be found on the output tab.
The prediction output for each molecule consists of the following columns:
Pos        Residue number (starting from 0).
HLA        Molecule/allele name.
Peptide    Amino acid sequence of the potential ligand.
Core       The minimal 9 amino acid binding core directly in contact with the MHC.
Of         The starting position of the Core within the Peptide (if > 0, the method predicts an N-terminal protrusion).
Gp         Position of the deletion, if any.
Gl         Length of the deletion.
Ip         Position of the insertion, if any.
Il         Length of the insertion.
Icore      Interaction core. This is the sequence of the binding core including any insertions or deletions.
Identity   Protein identifier, i.e. the name of the FASTA entry.
Score      The raw prediction score.
Aff(nM)    Predicted binding affinity in nanomolar units (only if binding affinity predictions are selected).
%Rank      Rank of the predicted affinity compared to a set of random natural peptides. This measure is not affected by the inherent bias of certain molecules towards higher or lower mean predicted affinities. By default, strong binders are defined as having %Rank < 0.5 and weak binders as having %Rank < 2. We advise selecting candidate binders based on %Rank rather than nM affinity.
BindLevel  SB: strong binder, WB: weak binder. A peptide is identified as a strong binder if its %Rank is below the specified threshold for strong binders (by default 0.5%). A peptide is identified as a weak binder if its %Rank is above the threshold for strong binders but below the specified threshold for weak binders (by default 2%).
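The BindLevel rule can be written as a small function. This is a sketch of the rule as worded above, not the server's implementation; the name `bind_level` is hypothetical. The text does not say how a %Rank exactly equal to a threshold is handled, so the strict comparisons here are an assumption.

```python
def bind_level(rank, strong=0.5, weak=2.0):
    """Classify a %Rank value as 'SB', 'WB' or '' (no label).

    Defaults follow the thresholds quoted above (0.5 and 2).
    Assumption: comparisons are strict, so a rank exactly equal to the
    strong threshold is labelled WB and one equal to the weak threshold
    gets no label.
    """
    if rank < strong:
        return "SB"
    if rank < weak:
        return "WB"
    return ""


if __name__ == "__main__":
    for rank in (0.0005, 1.0, 5.0):
        print(rank, bind_level(rank) or "-")
```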
- Fasta input:
>sp|P06437|GB_HHV1K Envelope glycoprotein B OS=Human herpesvirus 1 (strain KOS) GN=gB PE=1 SV=2
- Peptide length: 8, 9, 10, 11
- Allele: H-2-Kb
will return the following predictions:
# NetMHCpan version 4.0_H2
# Tmpdir made /usr/opt/www/webface/tmp/server/netmhcpan/593AB384000035D69F4553E2/netMHCpane1KUiY
# Input is in FSA format
# Peptide length 8,9,10,11
# Make Eluted ligand likelihood predictions
H-2-Kb : Distance to training data 0.000 (using nearest neighbor H2-Kb)
# Rank Threshold for Strong binding peptides 0.500
# Rank Threshold for Weak binding peptides 2.000
Pos  HLA    Peptide     Core       Of  Gp  Gl  Ip  Il  Icore       Identity         Score      %Rank   BindLevel
499  H2-Kb  SSIEFARL    SSIEF-ARL  0   0   0   5   1   SSIEFARL    sp_P06437_GB_HH  0.9981590  0.0005  <= SB
391  H2-Kb  ISTTFTTNL   ISTTFTTNL  0   0   0   0   0   ISTTFTTNL   sp_P06437_GB_HH  0.9163720  0.0420  <= SB
280  H2-Kb  SVYPYDEF    SVYP-YDEF  0   0   0   4   1   SVYPYDEF    sp_P06437_GB_HH  0.9095360  0.0447  <= SB
793  H2-Kb  FAFRYVMRL   FAFRYVMRL  0   0   0   0   0   FAFRYVMRL   sp_P06437_GB_HH  0.8978500  0.0493  <= SB
506  H2-Kb  LQFTYNHI    LQFT-YNHI  0   0   0   4   1   LQFTYNHI    sp_P06437_GB_HH  0.8668980  0.0756  <= SB
154  H2-Kb  IAPYKFKATM  IAYKFKATM  0   2   1   0   0   IAPYKFKATM  sp_P06437_GB_HH  0.8105570  0.1299  <= SB
727  H2-Kb  ANAAMFAGL   ANAAMFAGL  0   0   0   0   0   ANAAMFAGL   sp_P06437_GB_HH  0.8054410  0.1355  <= SB
146  H2-Kb  IAVVFKENI   IAVVFKENI  0   0   0   0   0   IAVVFKENI   sp_P06437_GB_HH  0.8039780  0.1371  <= SB
728  H2-Kb  NAAMFAGL    NAAM-FAGL  0   0   0   4   1   NAAMFAGL    sp_P06437_GB_HH  0.7846630  0.1562  <= SB
155  H2-Kb  APYKFKATM   APYKFKATM  0   0   0   0   0   APYKFKATM   sp_P06437_GB_HH  0.7813300  0.1590  <= SB
392  H2-Kb  STTFTTNL    ST-TFTTNL  0   0   0   2   1   STTFTTNL    sp_P06437_GB_HH  0.7616710  0.1751  <= SB
845  H2-Kb  EMIRYMAL    EMIR-YMAL  0   0   0   4   1   EMIRYMAL    sp_P06437_GB_HH  0.7440680  0.1871  <= SB
794  H2-Kb  AFRYVMRL    A-FRYVMRL  0   0   0   1   1   AFRYVMRL    sp_P06437_GB_HH  0.7372430  0.1918  <= SB
..
Protein sp_P06437_GB_HH. Allele H2-Kb. Number of high binders 26. Number of weak binders 52. Number of peptides 3582
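Result rows like those in the example can be read into a dictionary for downstream filtering. The sketch below assumes the whitespace-separated eluted-ligand layout shown in this example (no Aff(nM) column) and a trailing "<= SB" or "<= WB" marker on labelled rows; the function name `parse_row` is hypothetical, and output produced with binding affinity predictions enabled would need an extra column.

```python
# Column names as they appear in the example header (BindLevel handled separately).
COLUMNS = ["Pos", "HLA", "Peptide", "Core", "Of", "Gp", "Gl", "Ip", "Il",
           "Icore", "Identity", "Score", "%Rank"]
INT_COLUMNS = ("Pos", "Of", "Gp", "Gl", "Ip", "Il")
FLOAT_COLUMNS = ("Score", "%Rank")


def parse_row(line):
    """Parse one whitespace-separated prediction row into a dict.

    Assumption: the row follows the eluted-ligand layout shown in the example.
    """
    fields = line.split()
    if len(fields) < len(COLUMNS):
        raise ValueError(f"expected at least {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    for name in INT_COLUMNS:
        row[name] = int(row[name])
    for name in FLOAT_COLUMNS:
        row[name] = float(row[name])
    # Labelled rows end with "<= SB" or "<= WB"; unlabelled rows have no extra fields.
    extra = fields[len(COLUMNS):]
    row["BindLevel"] = extra[1] if len(extra) >= 2 and extra[0] == "<=" else ""
    return row


if __name__ == "__main__":
    example = ("499 H2-Kb SSIEFARL SSIEF-ARL 0 0 0 5 1 SSIEFARL "
               "sp_P06437_GB_HH 0.9981590 0.0005 <= SB")
    print(parse_row(example))
```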
NetH2pan: A Computational Tool to Guide MHC peptide prediction on Murine Tumors
Christa I. DeVette 1, Massimo Andreatta 2, Wilfried Bardet 1, Steven J. Cate 1, Vanessa I. Jurtz 3, Kenneth W. Jackson 1, Alana L. Welm 4, Morten Nielsen 2,3, and William H. Hildebrand 1
Cancer Immunology Research (2018)
1 University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma, USA.
2 Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina.
3 Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby, Denmark.
4 Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah, USA.
With the advancement of personalized cancer immunotherapies, new tools are needed for
identifying tumor antigens and evaluating T cell responses in model systems, specifically
those that exhibit clinically relevant tumor progression. Key transgenic mouse models of
breast cancer are generated and maintained on the FVB genetic background, and one such
model is the MMTV-PyMT mouse – an immunocompetent transgenic mouse that exhibits spontaneous
mammary tumor development and metastasis with high penetrance. Backcrossing the MMTV-PyMT
mouse from the FVB strain onto a B6 genetic background, in order to leverage well-developed
B6 immunological tools, results in delayed tumor development and variable metastatic phenotypes.
Therefore, we initiated characterization of the FVB MHC Class I H-2-q haplotype to establish
useful immunological tools for evaluating antigen specificity in the murine FVB strain.
Our study provides the first detailed molecular and immunoproteomic characterization of
the FVB H-2-q MHC Class I alleles, including >8500 unique peptide ligands, a multi-allele
murine MHC peptide prediction tool, and in vivo validation of these data using MMTV-PyMT
primary tumors. This work allows researchers to rapidly predict H-2 peptide ligands for
immune testing, including, but not limited to, the MMTV-PyMT model for metastatic breast
cancer.
NetMHCpan-4.0: Improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data
Vanessa Jurtz 1, Sinu Paul 2, Massimo Andreatta 3, Paolo Marcatili 1, Bjoern Peters 2, and Morten Nielsen 1,3
The Journal of Immunology (2017)
1 Department of Bio and Health Informatics, Technical University of Denmark, DK-2800 Lyngby, Denmark
2 Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037, USA
3 Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
Cytotoxic T cells are of central importance in the immune system's response to disease. They recognize defective cells by binding to peptides presented on the cell surface by MHC (major histocompatibility complex) class I molecules. Peptide binding to MHC molecules is the single most selective step in the antigen presentation pathway. On the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has therefore attracted large attention.
In the past, predictors of peptide-MHC interaction have in most cases been trained on binding affinity data. Recently an increasing amount of MHC presented peptides identified by mass spectrometry has been published containing information about peptide processing steps in the presentation pathway and the length distribution of naturally presented peptides. Here, we present NetMHCpan-4.0, a method trained on both binding affinity and eluted ligand data leveraging the information from both data types. Large-scale benchmarking of the method demonstrates an increased predictive performance compared to state-of-the-art both when it comes to identification of naturally processed ligands, cancer neoantigens, and T cell epitopes.
NetMHCpan - MHC class I binding prediction beyond humans
Immunogenetics. 2009 Jan;61(1):1-13.
1 Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
2 Division of Experimental Immunology, Institute of Medical Microbiology and Immunology, University of Copenhagen, Denmark
3 La Jolla Institute for Allergy and Immunology, San Diego, California, United States of America
Binding of peptides to major histocompatibility complex (MHC) molecules
is the single most selective step in the recognition of pathogens by
the cellular immune system. The human MHC genomic region (called HLA)
is extremely polymorphic comprising several thousand alleles, each
encoding a distinct MHC molecule. The potentially unique specificity
of the majority of HLA alleles that have been identified to date
remains uncharacterized. Likewise, only a limited number of chimpanzee
and rhesus macaque MHC class I molecules have been characterized
experimentally. Here, we present NetMHCpan-2.0, a method that generates
quantitative predictions of the affinity of any peptide-MHC class I
interaction. NetMHCpan-2.0 has been trained on the hitherto largest set
of quantitative MHC binding data available, covering HLA-A and HLA-B,
as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class
I molecules. We show that the NetMHCpan-2.0 method can accurately
predict binding to uncharacterized HLA molecules, including HLA-C and
HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict
peptide binding to chimpanzee and macaque MHC class I molecules. The power
of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune
responses in large out-bred populations is demonstrated. Further, we used
NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class
I molecule SLA-1*0401. Ninety-three percent of the predicted peptides
were demonstrated to bind stronger than 500 nM. The high performance
of NetMHCpan-2.0 for non-human primates documents the method's ability
to provide broad allelic coverage also beyond human MHC molecules. The
method is available at http://www.cbs.dtu.dk/services/NetMHCpan.