DTU Health Tech
Department of Health Technology
NetTCR-2.2 predicts the binding probability between a T-cell receptor (TCR) and MHC-I peptides. In contrast to NetTCR-2.1, NetTCR-2.2 has a pre-trained component, which was trained on 26 different MHC-I peptides before further models specific to each of these peptides were trained.
NetTCR-2.2 thus contains a model specific for predictions on each of the following peptides:
GILGFVFTL, RAKFKQLL, KLGGALQAK, AVFDRKSDAK, ELAGIGILTV, NLVPMVATV, IVTDFSVIK, LLWNGPMAV, CINGVCWTV, GLCTLVAML, SPRWYFYYL, ATDALMTGF, DATYQRTRALVR, KSKRTPMGF, YLQPRTFLL, HPVTKYIM, RFPLTFGWCF, GPRLGVRAT, CTELKLSDY, RLRAEAQVK, RLPGVLPRA, SLFNTVATLY, RPPIFIRRL, FEDLRLLSF, VLFGLGFAI and FEDLRVLSF
To further improve performance, the model predictions are scaled by similarity to known binders, which is calculated with the TCRbase tool.
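As a rough illustration of what scaling a prediction by similarity can look like, the sketch below multiplies a model score by a similarity score raised to a power. The functional form, the function name `scale_by_similarity`, and the parameter `alpha` are assumptions made for this example only; they are not taken from this page, and the actual NetTCR-2.2 and TCRbase implementation may differ.

```python
def scale_by_similarity(model_score: float, similarity: float, alpha: float = 1.0) -> float:
    """Scale a binding prediction by similarity to known binders.

    Assumption: both inputs lie in [0, 1] and the scaling is multiplicative,
    score * similarity**alpha. This form is hypothetical and only illustrates
    the idea of down-weighting TCRs that are dissimilar to known binders.
    """
    if not 0.0 <= model_score <= 1.0:
        raise ValueError("model_score must be in [0, 1]")
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be in [0, 1]")
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    # A similarity of 1 leaves the score unchanged; lower similarity shrinks it.
    return model_score * similarity ** alpha


if __name__ == "__main__":
    # Same model score, two different similarities to known binders.
    for sim in (1.0, 0.5):
        print(sim, scale_by_similarity(0.8, sim, alpha=2.0))
```

With this form, a larger `alpha` penalizes dissimilar TCRs more strongly, while `alpha = 0` disables the scaling.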
While NetTCR-2.2 primarily attempts to use the pre-trained models, which have the best performance, pan-specific predictions can also be carried out for peptides other than those listed above.
Note, however, that performance varies considerably between peptides and is generally poor for peptides that are not highly similar (>95% kernel similarity) to the peptides in the training data, so such predictions should be used with caution. Predictions for these peptides are also not scaled via TCRbase.
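The selection logic described above can be sketched as a simple lookup. The peptide list is copied from this page; the function name `select_model` and the returned dictionary layout are hypothetical and do not reflect the actual NetTCR-2.2 code or command-line interface.

```python
# The 26 peptides with a dedicated model, as listed on this page.
PRETRAINED_PEPTIDES = frozenset({
    "GILGFVFTL", "RAKFKQLL", "KLGGALQAK", "AVFDRKSDAK", "ELAGIGILTV",
    "NLVPMVATV", "IVTDFSVIK", "LLWNGPMAV", "CINGVCWTV", "GLCTLVAML",
    "SPRWYFYYL", "ATDALMTGF", "DATYQRTRALVR", "KSKRTPMGF", "YLQPRTFLL",
    "HPVTKYIM", "RFPLTFGWCF", "GPRLGVRAT", "CTELKLSDY", "RLRAEAQVK",
    "RLPGVLPRA", "SLFNTVATLY", "RPPIFIRRL", "FEDLRLLSF", "VLFGLGFAI",
    "FEDLRVLSF",
})


def select_model(peptide: str) -> dict:
    """Report which kind of prediction applies to a peptide (hypothetical helper)."""
    peptide = peptide.strip().upper()
    if peptide in PRETRAINED_PEPTIDES:
        # Listed peptide: dedicated model, predictions scaled via TCRbase.
        return {"peptide": peptide, "model": "peptide-specific", "tcrbase_scaling": True}
    # Any other peptide: pan-specific prediction, no TCRbase scaling.
    return {"peptide": peptide, "model": "pan-specific", "tcrbase_scaling": False}


if __name__ == "__main__":
    print(select_model("GILGFVFTL"))
    print(select_model("AAAAAAAAA"))
```

A peptide that falls into the pan-specific branch should be treated with the caution noted above, since this sketch does not check kernel similarity to the training peptides.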
The ability to predict binding between peptides presented by the Major Histocompatibility Complex (MHC) class I molecules and T-cell receptors (TCR) is of great interest in areas of vaccine development, cancer treatment and treatment of autoimmune diseases. However, the scarcity of paired-chain data, combined with the bias towards a few well-studied epitopes, has challenged the development of pan-specific machine-learning (ML) models with accurate predictive power towards peptides characterized by little or no TCR data. To deal with this, we here benefit from a larger paired-chain peptide-TCR dataset and explore different ML model architectures and training strategies to better deal with imbalanced data. We show that while simple changes to the architecture and training result in greatly improved performance, particularly for peptides with little available data, predictions on unseen peptides remain challenging, especially for peptides distant to the training peptides. We also demonstrate that ML models can be used to detect potential outliers, and that the removal of such outliers from training further improves the overall performance. Furthermore, we show that a model combining the properties of pan-specific and peptide-specific models achieves improved performance, and that performance can be further improved by integrating similarity-based predictions, especially when a low false positive rate is desirable. Moreover, in the context of the IMMREP benchmark, this updated modeling framework achieved state-of-the-art performance. Finally, we show that combining all these approaches results in acceptable predictive accuracy for peptides characterized with as little as 15 positive TCRs. This observation thus places great promise on rapidly expanding the peptide coverage of the current models for predicting TCR specificity.
The final NetTCR-2.2 models are available at https://github.com/mnielLab/NetTCR-2.2, and as a web server at https://services.healthtech.dtu.dk/services/NetTCR-2.2/.
If you need help with technical issues (e.g. errors or missing results), contact Technical Support. Please include the name and version of the service (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please also include the JOB ID (the long code that you see while the job is running).
If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
Correspondence:
Technical Support: