NetMHCIIpan - 2.1
Pan-specific binding of peptides to MHC class II alleles of known sequence
The NetMHCIIpan server predicts binding of peptides to more than 500 HLA-DR alleles using artificial neural networks (ANNs). The prediction values are given as IC50 values in nM and as %-Rank relative to a set of 200,000 random natural peptides. The project is a collaboration between CBS and IMMI.
New in version 2.1: users can upload a full-length MHC class II beta chain sequence and have the server predict MHC-restricted peptides from any given protein of interest.
View the version history of this server. All previous versions are available online for comparison and reference.
1. Specify the input sequences
All input sequences must be in one-letter amino acid code; the alphabet is not case sensitive. Any symbol outside the allowed alphabet will be converted to X before processing.
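The conversion rule can be illustrated with a short Python sketch. This is not the server's code: it assumes the allowed alphabet is the 20 standard amino acid letters plus X, it expects a sequence without whitespace, and the names `ALLOWED` and `sanitize_sequence` are hypothetical.

```python
# Assumption: the allowed alphabet is the 20 standard amino acids plus X.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def sanitize_sequence(seq: str) -> str:
    """Upper-case the sequence and replace every symbol outside ALLOWED with X."""
    return "".join(ch if ch in ALLOWED else "X" for ch in seq.upper())
```

For example, `sanitize_sequence("acdB*")` upper-cases the first three letters and replaces `B` and `*` with `X`.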
The server allows for input in either FASTA or PEPTIDE format.
The sequences can be input in the following two ways:
- Paste a single sequence (just the amino acids), a number of sequences in FASTA format, or a list of peptides into the upper window of the main server page.
- Select a FASTA or PEPTIDE file on your local disk, either by typing the file name into the lower window or by browsing the disk.
Both ways can be employed at the same time: all the specified sequences will be processed.
2. Customize your run
For FASTA input, select the length of the peptides. The FASTA input is divided into overlapping peptides of the given length.
Select the allele(s) you want to make predictions for from the scroll-down menu (select multiple alleles using the Ctrl key), or type in the allele names separated by commas (without blank spaces).
Give a threshold value for the binding values to be displayed.
Click the box "Sort by affinity" to have the output sorted by descending predicted binding affinity.
Click the box "save prediction to xls file" to save the raw prediction output to an Excel file. This file will be available at the bottom of the results output.
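The division of FASTA input into overlapping peptides, mentioned in step 2, can be sketched in Python as follows. The sketch assumes a step of one residue and 0-based start positions, which matches the consecutive 15-mers in the example output on this page; the function name `overlapping_peptides` and the default length of 15 are illustrative choices, not part of the server.

```python
def overlapping_peptides(sequence: str, length: int = 15) -> list[tuple[int, str]]:
    """Return (start, peptide) pairs for every window of `length` residues.

    Assumption: windows advance by one residue and positions are 0-based.
    A sequence shorter than `length` yields an empty list.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [
        (start, sequence[start:start + length])
        for start in range(len(sequence) - length + 1)
    ]
```

A 16-residue sequence therefore gives two 15-mers, starting at positions 0 and 1.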
3. Submit the job
Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.
At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.
# Input is in FSA format
# Threshold for Strong binding peptides 50.000
# Threshold for Weak binding peptides 500.000
-----------------------------------------------------------------------------------------------------------------
pos  HLA        peptide          Identity         Pos  Core       1-log50k(aff)  Affinity(nM)  %Rank  BindLevel
-----------------------------------------------------------------------------------------------------------------
  0  DRB1*0401  ASQKRPSQRHGSKYL  seq2_optional_c    6  SQRHGSKYL          0.041      10132.07  50.00
  1  DRB1*0401  SQKRPSQRHGSKYLA  seq2_optional_c    5  SQRHGSKYL          0.060       8394.55  50.00
  2  DRB1*0401  QKRPSQRHGSKYLAT  seq2_optional_c    4  SQRHGSKYL          0.077       7144.65  50.00
  3  DRB1*0401  KRPSQRHGSKYLATA  seq2_optional_c    4  QRHGSKYLA          0.098       5824.18  50.00
  4  DRB1*0401  RPSQRHGSKYLATAS  seq2_optional_c    3  QRHGSKYLA          0.121       4663.47  50.00
  5  DRB1*0401  PSQRHGSKYLATAST  seq2_optional_c    6  SKYLATAST          0.167       3012.29  50.00
  6  DRB1*0401  SQRHGSKYLATASTM  seq2_optional_c    6  KYLATASTM          0.324        665.84  32.00
  7  DRB1*0401  QRHGSKYLATASTMD  seq2_optional_c    6  YLATASTMD          0.476        153.62   8.00  <= WB
  8  DRB1*0401  RHGSKYLATASTMDH  seq2_optional_c    5  YLATASTMD          0.595         48.95   2.00  <= SB
  9  DRB1*0401  HGSKYLATASTMDHA  seq2_optional_c    4  YLATASTMD          0.695         18.83   0.40  <= SB
 10  DRB1*0401  GSKYLATASTMDHAR  seq2_optional_c    3  YLATASTMD          0.740         12.24   0.15  <= SB
 11  DRB1*0401  SKYLATASTMDHARH  seq2_optional_c    2  YLATASTMD          0.722         14.55   0.20  <= SB
 12  DRB1*0401  KYLATASTMDHARHG  seq2_optional_c    1  YLATASTMD          0.663         25.44   0.70  <= SB
 13  DRB1*0401  YLATASTMDHARHGF  seq2_optional_c    0  YLATASTMD          0.480        148.99   7.00  <= WB
 14  DRB1*0401  LATASTMDHARHGFL  seq2_optional_c    0  LATASTMDH          0.225       1727.75  50.00
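Two columns of the example output can be reproduced with a few lines of Python. The sketch below assumes that the BindLevel flag follows from comparing Affinity(nM) with the two thresholds in the output header (50 nM for strong, 500 nM for weak binders) and that the second Pos column is the 0-based offset of the 9-residue core within the peptide; both readings are consistent with the rows shown, but whether the server's comparison is inclusive at the boundary is not stated, and the function names are hypothetical.

```python
STRONG_NM = 50.0   # "Threshold for Strong binding peptides" in the output header
WEAK_NM = 500.0    # "Threshold for Weak binding peptides" in the output header


def bind_level(affinity_nm: float, strong: float = STRONG_NM, weak: float = WEAK_NM) -> str:
    """Return "SB", "WB" or "" for a predicted affinity in nM.

    Assumption: the thresholds are inclusive upper bounds.
    """
    if affinity_nm <= strong:
        return "SB"
    if affinity_nm <= weak:
        return "WB"
    return ""


def core_at(peptide: str, offset: int, core_length: int = 9) -> str:
    """Return the binding core that starts at the 0-based `offset` in `peptide`."""
    return peptide[offset:offset + core_length]
```

Applied to the row at pos 8, `bind_level(48.95)` gives `"SB"` and `core_at("RHGSKYLATASTMDH", 5)` gives `"YLATASTMD"`, as in the table.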
NetMHCIIpan-2.0 - Improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure
Nielsen M¹, Lundegaard C¹, Justesen S², Lund O¹, and Buus S²
Immunome Res. 2010 Nov 13;6(1):9.
¹Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
²Division of Experimental Immunology, Institute of Medical Microbiology and Immunology, University of Copenhagen, Denmark
BACKGROUND: Binding of peptides to Major Histocompatibility class II (MHC-II) molecules plays a central role in governing responses of the adaptive immune system. MHC-II molecules sample peptides from the extracellular space allowing the immune system to detect the presence of foreign microbes from this compartment. Predicting which peptides bind to an MHC-II molecule is therefore of pivotal importance for understanding the immune response and its effect on host-pathogen interactions. The experimental cost associated with characterizing the binding motif of an MHC-II molecule is significant and large efforts have therefore been placed in developing accurate computer methods capable of predicting this binding event. Prediction of peptide binding to MHC-II is complicated by the open binding cleft of the MHC-II molecule, allowing binding of peptides extending out of the binding groove. Moreover, the genes encoding the MHC molecules are immensely diverse leading to a large set of different MHC molecules each potentially binding a unique set of peptides. Characterizing each MHC-II molecule using peptide-screening binding assays is hence not a viable option.
RESULTS: Here, we present an MHC-II binding prediction algorithm aiming at dealing with these challenges. The method is a pan-specific version of the earlier published allele-specific NN-align algorithm and does not require any pre-alignment of the input data. This allows the method to benefit also from information from alleles covered by limited binding data. The method is evaluated on a large and diverse set of benchmark data, and is shown to significantly out-perform state-of-the-art MHC-II prediction methods. In particular, the method is found to boost the performance for alleles characterized by limited binding data where conventional allele-specific methods tend to achieve poor prediction accuracy.
CONCLUSIONS: The method thus shows great potential for efficiently boosting the accuracy of MHC-II binding prediction, as accurate predictions can be obtained for novel alleles at highly reduced experimental costs. Pan-specific binding predictions can be obtained for all alleles with known protein sequence and the method can benefit by including data in the training from alleles even where only few binders are known. The method and benchmark data are available at www.cbs.dtu.dk/services/NetMHCIIpan-2.0.
Here, you will find the data set used for training and testing, as well as the T cell epitope data used for evaluation of the NetMHCIIpan-3.2 method.
The training binding data are partitioned into 5 train/test file pairs to be used for cross-validation. For instance, the train1 file contains the training data and the test1 file the test data for the first cross-validation partition. It is critical that this data partitioning is maintained.
The format for each of the files is:
AAAGAEAGKATTEEQ 0.190842 DRB1_0101
AAAGAEAGKATTEEQ 0.006301 DRB1_0301
AAAGAEAGKATTEEQ 0.066851 DRB1_0401
AAAGAEAGKATTEEQ 0.006344 DRB1_0405
AAAGAEAGKATTEEQ 0.035130 DRB1_0701
AAAGAEAGKATTEEQ 0.006288 DRB1_0802
AAAGAEAGKATTEEQ 0.176268 DRB1_0901
AAAGAEAGKATTEEQ 0.042555 DRB1_1101
AAAGAEAGKATTEEQ 0.114855 DRB1_1302
AAAGAEAGKATTEEQ 0.006377 DRB1_1501
where the first column gives the peptide, the second column the log50k-transformed binding affinity (i.e. 1 - log50k(aff nM)), and the last column the class II allele.
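A minimal Python sketch for reading one line of this three-column format is given here. It assumes the columns are separated by whitespace; the names `BindingRecord` and `parse_binding_line` are hypothetical and not part of the data distribution.

```python
from typing import NamedTuple


class BindingRecord(NamedTuple):
    peptide: str   # column 1: peptide sequence
    score: float   # column 2: log50k-transformed binding affinity
    allele: str    # column 3: class II allele name


def parse_binding_line(line: str) -> BindingRecord:
    """Parse one whitespace-separated data line into a BindingRecord."""
    peptide, score, allele = line.split()
    return BindingRecord(peptide, float(score), allele)
```

A line with fewer or more than three fields raises a `ValueError` from the tuple unpacking, which makes malformed input easy to spot.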
When classifying the peptides into binders and non-binders, for instance for calculation of AUC values, a threshold of 500 nM is used. This means that peptides with log50k-transformed binding affinity values greater than 0.426 are classified as binders.
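The relation between the 500 nM threshold and the value 0.426 can be checked with a short Python sketch. It assumes that log50k denotes the logarithm to base 50,000, so that the transformed value is 1 - ln(aff)/ln(50000); this reading reproduces 0.426 for 500 nM. The function names are illustrative only.

```python
import math

LOG_BASE = 50000.0  # assumption: "log50k" is the logarithm to base 50,000


def to_log50k_score(affinity_nm: float) -> float:
    """Transform an affinity in nM to 1 - log50k(affinity)."""
    return 1.0 - math.log(affinity_nm) / math.log(LOG_BASE)


def to_affinity_nm(score: float) -> float:
    """Invert the transform: affinity = 50000 ** (1 - score)."""
    return LOG_BASE ** (1.0 - score)


# Transformed value of the 500 nM threshold; it rounds to 0.426.
BINDER_THRESHOLD = to_log50k_score(500.0)


def is_binder(score: float) -> bool:
    """Classify a transformed value as binder if it exceeds the threshold."""
    return score > BINDER_THRESHOLD
```

With this definition a transformed value of 0.190842, as in the first data line of the file format example, corresponds to a non-binder.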
train1 (Train data) test1 (Test data)
train2 (Train data) test2 (Test data)
train3 (Train data) test3 (Test data)
train4 (Train data) test4 (Test data)
train5 (Train data) test5 (Test data)
T cell evaluation data
The format is:
>0705172A=AAHAEINEA=H2-IAb 385 gi|223299|prf||0705172A
GSIGAASMEFCFDVFKELKVHHANENIFYCPIAIMSALAMVYLGAKDSTRTQINKVVRFD
KLPGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEERYPILPEYLQC
VKELYRGGLEPINFQTAADQARELINSWVESQTNGIIRNVLQPSSVDSQTAMVLVNAIVF
KGLWEKAFKDEDTQAMPFRVTEQESKPVQMMYQIGLFRVASMASEKMKILELPFASGTMS
MLVLLPDEVSGLEQLESIINFEKLTEWTSSNVMEERKIKVYLPRMKMEEKYNLTSVLMAM
GITDVFSSSANLSGISSAESLKISQAVHAAHAEINEAGREVVGSAEAGVDAASVSEEFRA
DHPFLFCIKHIATNAVLFFGRCVSP
where the first part of the FASTA header contains the protein ID (0705172A), the epitope (AAHAEINEA), and the MHC restriction (H2-IAb).
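The three header fields can be extracted with a small Python sketch. It assumes that the first whitespace-delimited token of the header always has the form proteinID=epitope=restriction, as in the example; the function name `parse_epitope_header` is hypothetical.

```python
def parse_epitope_header(header: str) -> tuple[str, str, str]:
    """Split the first header token into (protein_id, epitope, restriction).

    Assumption: the token has exactly three "="-separated fields.
    """
    first_token = header.lstrip(">").split()[0]
    protein_id, epitope, restriction = first_token.split("=")
    return protein_id, epitope, restriction
```

For the example header this returns `("0705172A", "AAHAEINEA", "H2-IAb")`; the epitope can then be located in the sequence with `str.find`.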
References
Improved methods for predicting peptide binding affinity to MHC class II molecules.
Jensen KK, Andreatta M, Marcatili P, Buus S, Greenbaum JA, Yan Z, Sette A, Peters B, Nielsen M.
Immunology. 2018 Jan 6. doi: 10.1111/imm.12889.