We present a predictive method that simulates an essential step of antigen presentation in higher vertebrates: the proteasomal degradation of polypeptides into fragments that have the potential to bind to MHC Class I molecules. Proteasomal cleavage prediction algorithms published so far were trained on data from in vitro digestion experiments with constitutive proteasomes. As a result, they did not take into account the characteristics of the structurally modified proteasomes (often called immunoproteasomes) found in cells stimulated by gamma-interferon under physiological conditions. Our algorithm has been trained not only on in vitro data, but also on MHC Class I ligand data, which reflect a combination of immunoproteasome and constitutive proteasome specificity. This feature, together with the use of neural networks, a non-linear classification technique, makes the prediction of MHC Class I ligand boundaries more accurate: 65% of the cleavage sites and 85% of the non-cleavage sites are correctly determined. Moreover, we show that the neural networks trained on the constitutive proteasome data learn a specificity that differs from that of the networks trained on MHC Class I ligands, i.e. the specificity of the immunoproteasome differs from that of the constitutive proteasome. The tools developed in this study, in combination with a predictor of MHC and TAP binding capacity, should give a more complete prediction of the generation and presentation of peptides on MHC Class I molecules. Here we demonstrate that such an approach produces an accurate prediction of the CTL epitopes in HIV Nef. The method is available at www.cbs.dtu.dk/services/NetChop/.
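The two accuracy figures quoted above are per-class rates: the fraction of true cleavage sites predicted as cleaved (sensitivity) and the fraction of non-cleavage sites predicted as uncleaved (specificity). The following is a minimal sketch of how such rates can be computed from per-residue labels. The function name and the toy labels are hypothetical illustrations, not data or code from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary per-residue labels.

    Labels are 1 for "cleavage site" and 0 for "non-cleavage site".
    This is the standard definition of the two rates, not the
    evaluation code used in the study.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1  # cleavage site correctly predicted
        elif truth == 1 and pred == 0:
            fn += 1  # cleavage site missed
        elif truth == 0 and pred == 0:
            tn += 1  # non-cleavage site correctly predicted
        else:
            fp += 1  # non-cleavage site predicted as cleaved
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy example with made-up labels (hypothetical, for illustration only).
truth = [1, 1, 0, 0, 0, 1]
preds = [1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(truth, preds)
```

Reporting the two rates separately matters here because non-cleavage sites typically far outnumber cleavage sites, so a single overall accuracy would be dominated by the majority class.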
Cytotoxic T cells (CTLs) perceive the world through small peptides that are eight to ten amino acids long. These peptides (epitopes) are initially generated by the proteasome, a multi-subunit protease that is responsible for the majority of intracellular protein degradation. The proteasome generates the exact C-terminus of CTL epitopes, and the N-terminus with a possible extension. CTL responses may diminish if the epitopes are destroyed by the proteasome. Therefore, predicting proteasomal cleavage sites is important for identifying potentially immunogenic regions in the proteomes of pathogenic microorganisms (or humans). We have recently shown that NetChop, a neural network-based prediction method, is the best method currently available for such predictions; however, its performance is still lower than desired. Here, we use novel sequence encoding methods and show that the new version of NetChop predicts approximately 10% more of the cleavage sites correctly while lowering the number of false positives by close to 15%. With this more reliable prediction tool, we study two important questions concerning the function of the proteasome. First, we estimate the N-terminal extension of epitopes after proteasomal cleavage and find that the average extension is relatively short. However, more than 30% of the peptides have N-terminal extensions of three amino acids or more, and thus N-terminal trimming might play an important role in the presentation of a substantial fraction of the epitopes. Second, we show that good TAP ligands have an increased chance of being cleaved by the proteasome, i.e., the specificity of TAP has evolved to fit the specificity of the proteasome. This evolutionary relationship allows for more efficient antigen presentation.
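The N-terminal extension discussed above can be illustrated with a short sketch. The abstract does not describe how the extensions were estimated, so the following is only one plausible reading, under stated assumptions: a cleavage "after residue i" separates residues i and i+1 (0-based), the extension is the number of residues between the nearest upstream predicted cleavage and the epitope's first residue, and the protein's own N-terminus counts as a boundary when no upstream cleavage exists. All function names and example positions are hypothetical.

```python
from bisect import bisect_left


def n_terminal_extension(cleavage_after, epitope_start):
    """Number of extra residues upstream of the epitope's N-terminus.

    cleavage_after: 0-based residue indices i where a cleavage is
        predicted between residue i and residue i + 1 (assumed convention).
    epitope_start: 0-based index of the epitope's first residue.
    """
    sites = sorted(cleavage_after)
    # Count predicted sites strictly upstream of the epitope start.
    upstream = bisect_left(sites, epitope_start)
    if upstream == 0:
        # Assumption: with no upstream cleavage, the fragment starts
        # at the protein's own N-terminus.
        return epitope_start
    fragment_start = sites[upstream - 1] + 1
    return epitope_start - fragment_start


def fraction_with_long_extension(extensions, min_length=3):
    """Fraction of extensions that are at least min_length residues."""
    if not extensions:
        return 0.0
    return sum(e >= min_length for e in extensions) / len(extensions)


# Toy example with made-up positions (hypothetical, for illustration only).
predicted_sites = [4, 9, 20]
extensions = [n_terminal_extension(predicted_sites, s) for s in (5, 10, 12, 14)]
share_needing_trimming = fraction_with_long_extension(extensions)
```

An extension of zero means the proteasome is predicted to generate the epitope's exact N-terminus, whereas longer extensions would require N-terminal trimming before presentation.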