NetMHCpan-4.1 and NetMHCIIpan-4.0: Improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data
Birkir Reynisson1*, Bruno Alvarez2*, and Morten Nielsen1,3
Accepted for publication, NAR webserver issue 2020
Major Histocompatibility Complex (MHC) molecules are expressed on the cell
surface, where they present peptides to T cells, which gives them a key
role in the development of T cell immune responses. MHC molecules come in
two main variants: MHC Class I (MHC-I) and MHC Class II (MHC-II). MHC-I
predominantly presents peptides derived from intracellular proteins,
whereas MHC-II predominantly presents peptides from extracellular
proteins. In both cases, the binding between MHC and antigenic peptides
is the most selective step in the antigen presentation pathway. Therefore,
the prediction of peptide binding to MHC is a powerful utility to predict
the possible specificity of a T cell immune response. MHC binding
prediction tools are commonly trained on binding affinity or mass
spectrometry eluted ligand data. Recent studies have, however, demonstrated
that integrating both data types can boost predictive performance. Inspired
by this, we here present NetMHCpan-4.1 and NetMHCIIpan-4.0, two
web-servers created to predict binding between peptides and MHC-I and
MHC-II, respectively. Both methods exploit tailored machine learning
strategies to integrate different training data types, resulting in
state-of-the-art performance and outperforming their competitors. The
servers are available at http://www.cbs.dtu.dk/services/NetMHCpan-4.1/
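NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data
Vanessa Jurtz1, Sinu Paul2, Massimo Andreatta3, Paolo Marcatili1, Bjoern Peters2, and Morten Nielsen1,3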
The Journal of Immunology (2017) ji1700893; DOI: 10.4049/jimmunol.1700893
1Department of Bio and Health Informatics, Technical University of Denmark, DK-2800 Lyngby, Denmark
2Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037, USA
3Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
Cytotoxic T cells are of central importance in the immune system’s response to disease.
They recognize defective cells by binding to peptides presented on the cell surface by MHC class I molecules.
Peptide binding to MHC molecules is the single most selective step in the Ag-presentation pathway.
Therefore, in the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has attracted widespread attention.
In the past, predictors of peptide–MHC interactions have primarily been trained on binding affinity data.
Recently, an increasing number of MHC-presented peptides identified by mass spectrometry have been reported containing information about
peptide-processing steps in the presentation pathway and the length distribution of naturally presented peptides.
In this article, we present NetMHCpan-4.0, a method trained on binding affinity and eluted ligand data
leveraging the information from both data types. Large-scale benchmarking of the method demonstrates an
increase in predictive performance compared with state-of-the-art methods when it comes to identification of
naturally processed ligands, cancer neoantigens, and T cell epitopes.
NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length data sets
Morten Nielsen1,2 and Massimo Andreatta1
Genome Medicine (2016) 8:33
1Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, Buenos Aires, Argentina
2Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space.
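The percentile-rank idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that a rank is obtained by comparing a peptide's predicted score with the scores of a background set of random peptides for the same MHC molecule; the function name `percentile_rank` and the toy score lists are hypothetical and are not taken from the NetMHCpan software.

```python
from bisect import bisect_left


def percentile_rank(score, background_scores):
    """Return the percentage of background peptides scoring >= `score`.

    A low value means the peptide scores better than most background
    peptides for this MHC molecule (higher score = stronger predicted binder).
    """
    ordered = sorted(background_scores)
    # bisect_left gives the number of background scores strictly below `score`
    n_at_least = len(ordered) - bisect_left(ordered, score)
    return 100.0 * n_at_least / len(ordered)


# Toy background distributions for two hypothetical MHC molecules.
# Molecule A tends to produce high scores, molecule B low scores.
background_a = [0.30, 0.40, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90]
background_b = [0.05, 0.10, 0.12, 0.15, 0.20, 0.22, 0.25, 0.30, 0.35, 0.40]

# The same raw score corresponds to very different ranks per molecule.
rank_a = percentile_rank(0.45, background_a)
rank_b = percentile_rank(0.45, background_b)
print(rank_a, rank_b)
```

In this toy setting a single score cutoff would select many peptides for molecule A and none for molecule B, whereas a rank cutoff selects a comparable fraction for each, which is the intuition behind preferring percentile ranks for ligand identification.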
NetMHCpan - MHC class I binding prediction beyond humans
Hoof I1, Peters B3, Sidney J3, Pedersen LE2, Lund O1, Buus S2, Nielsen M1
Immunogenetics. (2009) Jan;61(1):1-13.
1Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
2Division of Experimental Immunology, Institute of Medical Microbiology and Immunology, University of Copenhagen, Denmark
3La Jolla Institute for Allergy and Immunology, San Diego, California, United States of America
Binding of peptides to major histocompatibility complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC genomic region (called HLA) is extremely polymorphic comprising several thousand alleles, each encoding a distinct MHC molecule. The potentially unique specificity of the majority of HLA alleles that have been identified to date remains uncharacterized. Likewise, only a limited number of chimpanzee and rhesus macaque MHC class I molecules have been characterized experimentally. Here, we present NetMHCpan-2.0, a method that generates quantitative predictions of the affinity of any peptide-MHC class I interaction. NetMHCpan-2.0 has been trained on the hitherto largest set of quantitative MHC binding data available, covering HLA-A and HLA-B, as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class I molecules. We show that the NetMHCpan-2.0 method can accurately predict binding to uncharacterized HLA molecules, including HLA-C and HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict peptide binding to chimpanzee and macaque MHC class I molecules. The power of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune responses in large out-bred populations is demonstrated. Further, we used NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class I molecule SLA-1*0401. Ninety-three percent of the predicted peptides were demonstrated to bind stronger than 500 nM. The high performance of NetMHCpan-2.0 for non-human primates documents the method's ability to provide broad allelic coverage also beyond human MHC molecules. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.
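As a rough illustration of the 500 nM binder threshold quoted above, the sketch below converts between a 0-1 prediction score and an affinity in nM, and counts the fraction of peptides below a threshold. It assumes the log-transform score = 1 - log(affinity)/log(50000) commonly used in this family of methods; the abstract itself does not state the transform, and the function names and example values are hypothetical.

```python
import math

# Assumed upper bound of the affinity scale (nM) used by the log-transform.
MAX_AFFINITY_NM = 50000.0


def score_to_nm(score):
    """Convert a 0-1 prediction score to an affinity in nM (assumed transform)."""
    return MAX_AFFINITY_NM ** (1.0 - score)


def nm_to_score(affinity_nm):
    """Convert an affinity in nM to a 0-1 score (inverse of score_to_nm)."""
    return 1.0 - math.log(affinity_nm) / math.log(MAX_AFFINITY_NM)


def fraction_binders(affinities_nm, threshold_nm=500.0):
    """Fraction of peptides binding stronger than (i.e. below) the threshold."""
    return sum(a < threshold_nm for a in affinities_nm) / len(affinities_nm)


# Made-up measured affinities (nM) for four peptides, for illustration only.
measured = [100.0, 400.0, 600.0, 50.0]
print(fraction_binders(measured))
print(round(nm_to_score(500.0), 3))
```

Under this assumed transform, 500 nM corresponds to a score of roughly 0.43, so "binds stronger than 500 nM" and "scores above about 0.43" describe the same cutoff.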