Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach.
Nielsen M, Lundegaard C, Worning P, Hvid CS, Lamberth K, Buus S, Brunak S, Lund O.
Bioinformatics. 2004 20:1388-97
MOTIVATION: Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an
important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a
broad length distribution complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying
the core of an MHC class II binding motif. In this context, we wish to describe a novel Gibbs motif sampler method ideally
suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method, and it incorporates
novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding
motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be
applied to identify potential MHC binding peptides effectively and to guide the process of rational vaccine design.
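
As an illustration of how a weight-matrix can characterize a motif and then score candidate peptides, the following is a minimal sketch and not the authors' implementation. It assumes the binding cores have already been aligned to a common width, uses a uniform background frequency and a simple additive pseudocount, and scores a longer peptide by its best-scoring window. The function names and the example sequences are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_weight_matrix(cores, pseudocount=1.0):
    """Build a log-odds weight matrix from equal-length, pre-aligned cores.

    Assumptions (not from the paper): uniform background frequencies and
    a flat additive pseudocount for unseen residues.
    """
    width = len(cores[0])
    if any(len(core) != width for core in cores):
        raise ValueError("all cores must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(cores) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(width):
        counts = Counter(core[pos] for core in cores)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        matrix.append(column)
    return matrix


def score_peptide(matrix, peptide):
    """Return (best_score, start) over all windows of the matrix width."""
    width = len(matrix)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the motif width")
    best = None
    for start in range(len(peptide) - width + 1):
        window = peptide[start:start + width]
        score = sum(matrix[i][aa] for i, aa in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical aligned cores, for illustration only.
example_cores = ["YAAAAAAAA", "YCAAAAAAA", "YAAAAAAAA"]
example_matrix = build_weight_matrix(example_cores)
example_score, example_start = score_peptide(example_matrix, "GGYAAAAAAAAGG")
```

Taking the maximum over windows is one simple way to handle peptides longer than the motif; it selects the putative core position along with the score.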
RESULTS: We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is
amino acid peptide sequences extracted from the public databases of SYFPEITHI and MHCPEP and known to bind to the MHC
class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is
shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble
average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark
calculation, the performance is quantified using relative operating characteristic (ROC) curves, and we make a detailed
comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional
alignment algorithm of ClustalW. The calculation demonstrates that the predictive performance of the Gibbs sampler is higher
than that of ClustalW and in most cases also higher than that of the TEPITOPE method.
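
To make the ROC-based comparison concrete, the sketch below computes the area under the ROC curve from prediction scores and binary binder labels, using the equivalent pairwise-ranking definition (the fraction of binder/non-binder pairs in which the binder scores higher, with ties counted as one half). This is a generic illustration under assumed inputs, not the benchmark procedure used in the paper; the function name and the example numbers are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: truthy for binders, falsy for non-binders.
    """
    positives = [s for s, is_binder in zip(scores, labels) if is_binder]
    negatives = [s for s, is_binder in zip(scores, labels) if not is_binder]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores from two predictors on the same four peptides.
labels = [1, 1, 0, 0]
auc_a = roc_auc([0.9, 0.8, 0.3, 0.1], labels)
auc_b = roc_auc([0.9, 0.1, 0.8, 0.3], labels)
```

An area of 1.0 corresponds to perfect separation of binders from non-binders and 0.5 to random ranking, so two methods evaluated on the same peptide set can be compared directly by this value.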