DTU Health Tech

Department of Health Technology

NetTCR - 2.1

Sequence-based prediction of peptide-TCR binding.

NetTCR-2.1 predicts the binding probability between T-cell receptor (TCR) CDR loops and MHC-I peptides.

Submit data


Paste in CDR sequences (all six CDRs or only the CDR3 α and β loops). One TCR per line is required. For each TCR, the different CDR sequences should be comma-separated.
Alternatively, load an example input or upload a file from your local machine.

Only amino acid input is accepted. For detailed instructions, see the Instructions tab above.

For an overview of the method and citation information, see the Abstract tab.

Sequence submission

Paste the sequence(s):

or load some sample data:
or upload a local file:

Select CDR loops

CDR3 loops   All CDRs

Select one or more peptides


Cite

Montemurro, A., Jessen, L. E., & Nielsen, M. (2022). NetTCR-2.1: Lessons and guidance on how to develop models for TCR specificity predictions. Frontiers in Immunology, 13. https://doi.org/10.3389/fimmu.2022.1055151

Instructions for NetTCR-2.1

Input format

  • The server accepts only amino acid sequences. Enter one TCR per line; within each line, the CDR sequences should be comma-separated. (Use Load Example on the Submission page for an illustration of the format);
  • Each sequence should be at most 30 amino acids long and should contain only uppercase standard amino acid letters;
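
The format rules above can be checked locally before submitting. The following is a minimal sketch, not part of the server: the function name `validate_tcr_line` and the example sequences are hypothetical, and the check assumes that "standard amino acid" means the 20 canonical one-letter codes.

```python
# Assumption: "standard amino acids" are the 20 canonical one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_CDR_LENGTH = 30  # maximum length stated in the instructions


def validate_tcr_line(line, expected_columns):
    """Return a list of problems found in one comma-separated TCR line.

    expected_columns is 2 for the CDR3 option and 6 for the All CDRs option.
    An empty list means the line passed every check.
    """
    problems = []
    cdrs = [field.strip() for field in line.split(",")]
    if len(cdrs) != expected_columns:
        problems.append(
            f"expected {expected_columns} CDR sequences, found {len(cdrs)}"
        )
    for index, cdr in enumerate(cdrs, start=1):
        if not cdr:
            problems.append(f"CDR {index} is empty")
            continue
        if len(cdr) > MAX_CDR_LENGTH:
            problems.append(f"CDR {index} is longer than {MAX_CDR_LENGTH} residues")
        # Lowercase letters are not in the set, so they are reported here too.
        invalid = sorted(set(cdr) - STANDARD_AMINO_ACIDS)
        if invalid:
            problems.append(f"CDR {index} has invalid characters: {''.join(invalid)}")
    return problems


if __name__ == "__main__":
    # Illustrative sequences only, not taken from the server's example data.
    for line in ["CAVRDSNYQLIW,CASSLGQAYEQYF", "cavrdsnyqliw,CASSLGQAYEQYF"]:
        print(line, "->", validate_tcr_line(line, expected_columns=2) or "ok")
```

Returning a list of problems rather than raising on the first one makes it possible to report every issue in a line at once.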

Submission

  1. Paste CDR sequence(s) into the box, or load an example file, or upload a file from your local machine. If CDR3 is selected, two columns are expected in the input file; for the "All CDRs" option, 6 columns are expected. The input file should be a text or .csv file with no column headers.
  2. Select the desired CDR loops to use (CDR3 loops or All CDRs);
  3. Select the peptide(s) to pair the CDR sequences with.
Click the submit button once the sequences are entered.
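
An upload file of the kind described in step 1 can be produced with a few lines of Python. This is a rough sketch under stated assumptions: `write_input_file` is a hypothetical helper, the sequences are illustrative, and the column order is not specified here, so it should follow the server's example input.

```python
import csv


def write_input_file(tcrs, path, all_cdrs=False):
    """Write TCRs to a header-less, comma-separated file, one TCR per line.

    tcrs is a list of rows; each row holds 2 CDR sequences (CDR3 option)
    or 6 CDR sequences (All CDRs option). Returns the number of rows written.
    """
    expected = 6 if all_cdrs else 2
    # Check every row before touching the file, so a bad row leaves no partial output.
    for row_number, cdrs in enumerate(tcrs, start=1):
        if len(cdrs) != expected:
            raise ValueError(
                f"row {row_number}: expected {expected} columns, found {len(cdrs)}"
            )
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerows(tcrs)  # no header row is written
    return len(tcrs)


if __name__ == "__main__":
    # Illustrative CDR3 pairs only; the file name is arbitrary.
    example = [
        ["CAVRDSNYQLIW", "CASSLGQAYEQYF"],
        ["CAASGGSYIPTF", "CASSIRSSYEQYF"],
    ]
    write_input_file(example, "tcr_input.csv")
```

The peptide selection in step 3 is made on the submission form and is not part of this file.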

NetTCR-2.1: Lessons and guidance on how to develop models for TCR specificity predictions

Alessandro Montemurro, Leon Eyrich Jessen, and Morten Nielsen
Frontiers in Immunology, 13. https://doi.org/10.3389/fimmu.2022.1055151

Abstract

T cell receptors (TCR) define the specificity of T cells and are responsible for their interaction with peptide antigen targets presented in complex with major histocompatibility complex (MHC) molecules. Understanding the rules underlying this interaction hence forms the foundation for our understanding of basic adaptive immunology.

Over the last decade, efforts have been dedicated to developing assays for high throughput identification of peptide-specific TCRs. Based on such data, several computational methods have been proposed for predicting the TCR-pMHC interaction. The general conclusion from these studies is that the prediction of TCR interactions with MHC-peptide complexes remains highly challenging. Several reasons form the basis for this including scarcity and quality of data, and ill-defined modeling objectives imposed by the high redundancy of the available data.

In this work, we propose a framework for dealing with this redundancy, allowing us to address essential questions related to the modeling of TCR specificity including the use of peptide- versus pan-specific models, how to best define negative data, and the performance impact of integrating CDR1 and 2 loops. Further, we illustrate how and why it is strongly recommended to include simple similarity-based modeling approaches when validating an improved predictive power of machine learning models, and that such validation should include a performance evaluation as a function of "distance" to the training data, to quantify the potential for generalization of the proposed model.

The conclusion of the work is that, given current data, TCR specificity is best modeled using peptide-specific approaches, integrating information from all 6 CDR loops, and with negative data constructed from a combination of true and mislabeled negatives. Comparing such machine learning models to similarity-based approaches demonstrated an increased performance gain of the former as the "distance" to the training data was increased; thus demonstrating an improved generalization ability of the machine learning-based approaches. We believe these results demonstrate that the outlined modeling framework and proposed evaluation strategy form a solid basis for investigating the modeling of TCR specificities and that adhering to such a framework will allow for faster progress within the field.

The final developed model, NetTCR-2.1, is available at https://services.healthtech.dtu.dk/service.php?NetTCR-2.1.



GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
