NetTCR-2.0
Sequence-based prediction of peptide-TCR binding.
The NetTCR-2.0 server predicts the binding probability between a T-cell receptor CDR3 protein sequence and an MHC-I peptide binding to HLA-A*02:01.
Instructions for NetTCR-2.0
Input format
- The server only accepts amino acid sequences, entered as newline-separated TCR CDR3 sequences. For paired αβ data, CDR3α and CDR3β should be comma-separated (use Load Example on the Submission page for an illustration of the format);
- Each sequence should be at most 30 amino acids long and should contain only uppercase standard amino acid letters;
- Input sequences should be in a trimmed format, i.e., without the leading cysteine ('C') and the ending phenylalanine or tryptophan ('F/W').
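The format rules above can be checked locally before submitting. The following is a minimal sketch, not part of the server: the function names `trim_cdr3` and `validate_line` are hypothetical, and the example sequences in the usage note are illustrative placeholders. `trim_cdr3` is meant only for full-length CDR3 sequences that still carry the flanking residues; applying it to an already trimmed sequence could remove genuine residues.

```python
from typing import List

# The twenty standard amino acids as uppercase one-letter codes.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LEN = 30  # maximum sequence length stated in the instructions


def trim_cdr3(seq: str) -> str:
    """Drop a leading 'C' and a trailing 'F' or 'W', if present.

    Intended for untrimmed CDR3 sequences only.
    """
    if seq.startswith("C"):
        seq = seq[1:]
    if seq.endswith(("F", "W")):
        seq = seq[:-1]
    return seq


def validate_line(line: str, paired: bool) -> List[str]:
    """Return the problems found in one input line (empty list if none)."""
    fields = [field.strip() for field in line.strip().split(",")]
    expected = 2 if paired else 1  # CDR3α,CDR3β for paired data, else one column
    if len(fields) != expected:
        return [f"expected {expected} column(s), found {len(fields)}"]
    problems = []
    for seq in fields:
        if not seq:
            problems.append("empty sequence")
        elif len(seq) > MAX_LEN:
            problems.append(f"{seq}: longer than {MAX_LEN} residues")
        elif not set(seq) <= STANDARD_AA:
            problems.append(f"{seq}: non-standard or lowercase characters")
    return problems
```

For example, `validate_line("AVSQGGSEKLV,ASSIRSSYEQY", paired=True)` returns an empty list, while the same call with a single-column line reports a column-count problem.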
Submission
- Paste TCR CDR3 sequence(s) into the box;
- Alternatively, load an example input or upload a file from your local machine. Note that when the α+β chains are selected, the input file must have two columns (in the order CDR3α, CDR3β); when α or β is selected, the input should have one column. The input file should be a text or .csv file with no column headers;
- Select the desired chain(s) to use;
- Select the peptide(s) to pair the CDR3 sequences with.
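An upload file in the layout described above (no header row, one column for a single chain, two columns in the order CDR3α, CDR3β for paired data) could be produced as sketched below. The helper name `write_input_file` and the choice of `\n` line endings are assumptions made for illustration, not server requirements.

```python
import csv
from typing import Optional, Sequence


def write_input_file(path: str,
                     alpha: Optional[Sequence[str]] = None,
                     beta: Optional[Sequence[str]] = None) -> int:
    """Write a headerless CSV of CDR3 sequences and return the row count.

    With both chains given, each row is "CDR3α,CDR3β"; with one chain
    given, each row holds a single sequence.
    """
    if alpha is not None and beta is not None:
        if len(alpha) != len(beta):
            raise ValueError("paired input needs one CDR3α for every CDR3β")
        rows = list(zip(alpha, beta))
    else:
        single = alpha if alpha is not None else beta
        if single is None:
            raise ValueError("provide at least one chain")
        rows = [(seq,) for seq in single]
    with open(path, "w", newline="") as handle:
        # No header row is written, matching the instructions.
        csv.writer(handle, lineterminator="\n").writerows(rows)
    return len(rows)
```

Calling `write_input_file("paired.csv", alpha=[...], beta=[...])` yields the two-column file for the α+β setting; passing only `beta=[...]` yields the one-column file for the β setting.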
Output
After the server successfully finishes the job, a Server Output page is shown. If an error occurs during prediction, a log appears specifying the error. Computation time can range from a couple of seconds to several minutes, depending on the queue and the sample size.
You can download a .csv file with the predictions.
An example of the output page is shown below.
Abstract
Prediction of T cell receptor (TCR) interactions with MHC-peptide complexes remains highly challenging. This challenge is primarily due to three dominant factors: data accuracy, data scarceness, and problem complexity.
Here, we showcase that "shallow" convolutional neural network (CNN) architectures are adequate to deal with the problem complexity imposed by the length variations of TCRs.
We demonstrate that current public bulk CDR3β-pMHC binding data overall is of low quality and that the development of accurate prediction models is contingent on paired α/β TCR sequence data corresponding to at least 150 distinct pairs for each investigated pMHC.
In comparison, models trained on CDR3α or CDR3β data demonstrated a variable and pMHC specific relative performance drop. Together these findings support that T cell specificity is predictable given the availability of accurate and sufficient paired TCR sequence data.