DTU Health Tech

Department of Health Technology

BepiPred - 2.0

Prediction of potential linear B-cell epitopes

Paste or upload protein sequence(s) in FASTA format to predict potential B-cell epitopes. Prediction can take a few minutes per sequence.


Submit data


At most 50 sequences and 300,000 amino acids per submission; each sequence not less than 10 and not more than 6000 amino acids.
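The limits above can be checked locally before submitting. The following is a minimal sketch, not the server's own validation: the function names `parse_fasta` and `check_submission` are hypothetical, and only the four numeric limits come from this page.

```python
# Submission limits as stated on this page.
MAX_SEQUENCES = 50
MAX_TOTAL_RESIDUES = 300_000
MIN_LENGTH = 10
MAX_LENGTH = 6000


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(records):
    """Return a list of human-readable problems; empty if within limits."""
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceeds {MAX_SEQUENCES}")
    total = sum(len(seq) for _, seq in records)
    if total > MAX_TOTAL_RESIDUES:
        problems.append(f"{total} residues exceeds {MAX_TOTAL_RESIDUES}")
    for header, seq in records:
        if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
            problems.append(
                f"{header}: length {len(seq)} outside {MIN_LENGTH}-{MAX_LENGTH}"
            )
    return problems
```

An empty list from `check_submission(parse_fasta(text))` means the input respects the stated limits; it says nothing about other checks the server may perform.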





The BepiPred-2.0 server predicts B-cell epitopes from a protein sequence, using a Random Forest algorithm trained on epitope and non-epitope amino acids determined from crystal structures. The per-residue predictions are then smoothed along the sequence.
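To illustrate what smoothing per-residue predictions along a sequence can look like, here is a minimal sketch using a centered moving average. This page does not specify the smoothing procedure BepiPred-2.0 actually uses, so the technique, the function name `smooth`, and the default window size are all assumptions made for illustration.

```python
def smooth(scores, window=9):
    """Centered moving average over per-residue scores.

    Assumption: a moving average stands in for the server's
    unspecified smoothing step; `window` is a hypothetical default.
    Windows are truncated at the sequence ends.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        neighbours = scores[max(0, i - half): i + half + 1]
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed
```

Smoothing of this kind spreads an isolated high score over its neighbours, so predicted epitopes tend to form contiguous stretches instead of single residues.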

Instructions

The BepiPred-2.0 server requires protein sequence(s) in FASTA format and cannot handle nucleic acid sequences.
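Because nucleic acid input is not supported, it can help to screen sequences before submitting them. The sketch below is a simple heuristic, not the server's own check: the function name `looks_like_nucleic_acid` and the 0.9 cutoff are hypothetical.

```python
# Letters typical of DNA/RNA sequences (N = unknown base).
NUCLEOTIDE_LETTERS = set("ACGTUN")


def looks_like_nucleic_acid(sequence, min_fraction=0.9):
    """Heuristic: True if nearly all letters are nucleotide codes.

    Assumption: `min_fraction` is an arbitrary cutoff chosen for
    illustration. A, C, G, T and N are also valid amino acid codes,
    so an unusual protein could be flagged by mistake.
    """
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        return False
    hits = sum(c in NUCLEOTIDE_LETTERS for c in residues)
    return hits / len(residues) >= min_fraction
```

A flagged sequence is best inspected by hand before deciding whether to translate it to protein or drop it.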

Paste protein sequence(s) in FASTA format into the field marked by arrow A, or upload a FASTA file using the field marked by arrow B. Once the protein sequences have been entered, click the submit button marked by arrow C.

After the server successfully finishes the job, a summary page is shown. If an error occurs during modelling, a log will appear specifying the error.

Use the navigation bar (arrow A) to flip through the various output pages. The default page is the summary, which shows a scrollable table (arrow D) listing the sequences, each with a sequence markup showing the B-cell epitope predictions. Hover over a sequence name to reveal the description of the sequence taken from the FASTA header.

Use the Epitope Threshold slider (arrow C) to interactively change which residues are classified as epitope residues (E for epitopes and . for non-epitopes). For guidance on which Epitope Threshold is suitable for you, click the icon on the slider. Use the dropdown download button (arrow B) to download the output as JSON or CSV, or select "All Downloads" to get a short description of the download files.
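The effect of the threshold on the E/. markup can be reproduced on downloaded scores with a few lines of code. This is a minimal sketch: the function name `epitope_markup`, the default threshold of 0.5, and the choice to treat a score equal to the threshold as an epitope are all assumptions, since this page does not state them.

```python
def epitope_markup(scores, threshold=0.5):
    """Turn per-residue scores into an E/. markup string.

    Assumptions: `threshold` is a placeholder default, and scores
    equal to the threshold are counted as epitope residues.
    """
    return "".join("E" if score >= threshold else "." for score in scores)
```

Raising the threshold marks fewer residues as E, which trades sensitivity for specificity; lowering it does the opposite.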

A more advanced output visualization is available by clicking the "Advanced Output is Off" button.

Advanced Output Format

When the advanced output format is activated, predictions from NetSurfP are additionally shown as sequence markup for each sequence, and a description of the markup types is revealed (arrow A).

Two new gradients are added to make it easy to distinguish between secondary structure types. Coils, relative surface accessibility and BepiPred-2.0 epitope predictions all carry the same gradient, as epitopes are often found in coils and need to be exposed. This makes it easy to spot regions where all three score highly.


Here, one can download the data used for training, testing and evaluating this method.

IEDB Linear Epitopes

File Format: Fasta
Header: <Positive/Negative>ID_<IEDB_Epitope_ID>
The header explains whether the sequence has a negative or positive epitope sequence mapped onto it.

Epitope: Uppercased
Non-Epitope: Lowercased

Download: IEDB Linear Epitope Dataset
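Since epitope residues are uppercase and non-epitope residues lowercase, the annotation can be recovered from the letter case alone. The sketch below shows one way to do this; the function name `epitope_segments` and the 0-based, end-exclusive coordinates are choices made here for illustration, not part of the dataset.

```python
from itertools import groupby


def epitope_segments(sequence):
    """Return (start, end, residues) for each uppercase (epitope) run.

    Coordinates are 0-based and end-exclusive (an assumption made
    for this sketch, matching Python slicing).
    """
    segments = []
    position = 0
    # groupby splits the sequence into runs of same-case letters.
    for is_epitope, group in groupby(sequence, key=str.isupper):
        run = "".join(group)
        if is_epitope:
            segments.append((position, position + len(run), run))
        position += len(run)
    return segments
```

For a made-up sequence such as `mktAYIAkq`, this yields a single segment covering `AYIA`.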

PDB Structural Epitopes

File Format: Fasta
Header: <PDBID>_<Heavy_Chain_ID><Light_Chain_ID> <Antigen_Chain_ID> <Partition>
The header identifies which antibody chains and antigen chain from a given PDB entry have been used. The partition states which of the 5 randomly split partitions the entry belonged to; the partition EVAL marks the PDB entries that were left out completely.

Epitope: Uppercased
Non-Epitope: Lowercased

Download: PDB Combined Epitopes Dataset
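The header layout described above can be split into its fields with a regular expression. This is a sketch under stated assumptions: PDB IDs are taken to be four alphanumeric characters, heavy and light chain IDs one character each, and the fields whitespace-separated. The names `PdbHeader` and `parse_pdb_header` are hypothetical, and the example header in the usage note is invented for illustration.

```python
import re
from typing import NamedTuple


class PdbHeader(NamedTuple):
    pdb_id: str
    heavy_chain: str
    light_chain: str
    antigen_chain: str
    partition: str  # partition label as a string, or "EVAL"


# Assumed layout: <PDBID>_<Heavy><Light> <Antigen> <Partition>
HEADER_RE = re.compile(
    r"^>?([A-Za-z0-9]{4})_([A-Za-z0-9])([A-Za-z0-9])\s+(\S+)\s+(\S+)$"
)


def parse_pdb_header(line):
    """Parse one FASTA header line; raise ValueError if it does not fit."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {line!r}")
    return PdbHeader(*match.groups())
```

For example, the invented header `>1ABC_HL A EVAL` would parse into PDB ID `1ABC`, heavy chain `H`, light chain `L`, antigen chain `A` and partition `EVAL`. Headers in the real file that deviate from the assumed layout raise a `ValueError` instead of being silently misread.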

Please cite:

Jespersen MC, Peters B, Nielsen M, Marcatili P. BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res 2017 (Web Server issue). doi: 10.1093/nar/gkx352

Abstract

Antibodies have become an indispensable tool for many biotechnological and clinical applications. They bind their molecular target (antigen) by recognizing a portion of its structure (epitope) in a highly specific manner. The ability to predict epitopes from antigen sequences alone is a complex task. Despite substantial effort, limited advancement has been achieved over the last decade in the accuracy of epitope prediction methods, especially for those that rely on the sequence of the antigen only. Here, we present BepiPred-2.0, a web server for predicting B-cell epitopes from antigen sequences. BepiPred-2.0 is based on a random forest algorithm trained on epitopes annotated from antibody-antigen protein structures. This new method was found to outperform other available tools for sequence-based epitope prediction both on epitope data derived from solved 3D structures, and on a large collection of linear epitopes downloaded from the IEDB database. The method displays results in a user-friendly and informative way, both for computer-savvy and non-expert users. We believe that BepiPred-2.0 will be a valuable tool for the bioinformatics and immunology community.





GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
