Cofactory - 1.0

Identification of Rossmann folds and prediction of FAD, NAD and NADP specificity

Cofactory is a server that identifies Rossmann fold sequence domains and predicts their specificity for the cofactors FAD, NAD or NADP.

Submission

Restrictions:
At most 2,000 sequences and 200,000 amino acids per submission; each sequence not more than 6,000 amino acids.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.

CITATIONS

For publication of results, please cite:

Cofactory: Sequence-based prediction of cofactor specificity of Rossmann folds.
Geertz-Hansen HM, Blom N, Feist AM, Brunak S, Petersen TN.
Proteins. 2014 Feb 13. doi: 10.1002/prot.24536

Instructions

1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y and X (unknown)

Any alphabetic symbol outside the allowed alphabet is converted to X before processing. Non-alphabetic symbols, including white space and digits, are ignored.
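To check locally what a sequence will look like after these rules are applied, the normalization can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not the server's own code; the names ALLOWED and clean_sequence are hypothetical.

```python
# Allowed one-letter amino acid codes, plus X for "unknown".
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def clean_sequence(raw: str) -> str:
    """Normalize a raw sequence following the input rules described above.

    Hypothetical helper: it mirrors the documented behavior, but it is
    not taken from the Cofactory implementation.
    """
    cleaned = []
    for ch in raw:
        if not ch.isalpha():
            # White space, digits and other non-alphabetic symbols are ignored.
            continue
        upper = ch.upper()  # the alphabet is not case sensitive
        # Letters outside the allowed alphabet become X.
        cleaned.append(upper if upper in ALLOWED else "X")
    return "".join(cleaned)
```

For example, clean_sequence("mk v1B*z") keeps m, k and v (upper-cased), drops the space, the digit and the asterisk, and turns B and z into X.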

The sequences can be input in the following two ways:

Paste a single sequence (just the amino acids) or a number of sequences in FASTA format into the upper window of the main server page.
Select a FASTA file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both methods can be used at the same time; all the specified sequences will be processed. However, a single submission may contain at most 2,000 sequences and 200,000 amino acids in total, and no sequence may be longer than 6,000 amino acids.
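Before uploading a large FASTA file it can be convenient to check it against these limits locally. The following is a minimal sketch under stated assumptions: the function names and the simple FASTA reader are hypothetical, and counting only alphabetic characters as amino acids is an assumption based on the input rules above, since the exact way the server counts is not documented here.

```python
MAX_SEQUENCES = 2_000
MAX_TOTAL_RESIDUES = 200_000
MAX_SEQUENCE_LENGTH = 6_000


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def residue_count(sequence):
    """Count alphabetic characters only (assumed to match the server's count)."""
    return sum(1 for ch in sequence if ch.isalpha())


def check_limits(records):
    """Return a list of problems; an empty list means the limits are met."""
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceed the limit of {MAX_SEQUENCES}")
    total = sum(residue_count(seq) for _, seq in records)
    if total > MAX_TOTAL_RESIDUES:
        problems.append(f"{total} amino acids exceed the limit of {MAX_TOTAL_RESIDUES}")
    for header, seq in records:
        if residue_count(seq) > MAX_SEQUENCE_LENGTH:
            problems.append(f"sequence '{header}' is longer than {MAX_SEQUENCE_LENGTH}")
    return problems
```

A typical use would be check_limits(parse_fasta(open("input.fasta").read())), where input.fasta is a placeholder file name; any returned message points to a limit that the submission would violate.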

2. Submit the job

Click on the "Submit" button. The status of your job (either "queued" or "running") will be displayed and updated continuously until the job terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.

Output format

Short description:

Each identified Rossmann fold sequence domain is associated with three neural network scores, one for each of the cofactors FAD, NAD and NADP. A score above 0.5 indicates that the domain is predicted to be specific for that cofactor. The prediction scores are followed by a summary of the predicted cofactor specificity; if multiple specificities are predicted, the identifiers are separated by a slash, e.g. NAD/NADP. The approximate Rossmann fold sequence boundaries (From, To) follow the summary, and the amino acid sequence of the domain comes last. If no Rossmann fold sequence domain is identified in an input sequence, the domain count is 0 and no scores or domain boundaries are reported.

Example:

# SEQUENCE ID             Domain FAD   NAD   NADP  Cofactor(s)  From  To    Sequence
Input_1                   1      0.670 0.387 0.585 FAD/NADP     5     46    SQKRVVVLGSGVIGLSSALILARKGYSVHILARDLPEDVSSQ
Input_2                   1      0.765 0.258 0.073 FAD          1     48    MRVVVIGAGVIGLSTALCIHERYHSVLQPLDVKVYADRFTPFTTTDVA
Input_3                   1      0.939 0.228 0.169 FAD          1     43    MKVIVLGSSHGGYEAVEELLNLHPDAEIQWYEKGDFISFLSGM
Input_3                   2      0.502 0.837 0.063 FAD/NAD      147   185   EVNNVVVIGSGYIGIEAAEAFAKAGKKVTVIDILDRPLG
Input_4                   0      -     -     -     -            -     -     -
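For downstream processing, the output lines can be read with a small parser. The sketch below assumes that the columns are separated by white space and that sequence identifiers contain no white space, as in the example above; the names DomainHit, parse_line and summarize are hypothetical. The summarize function re-derives the Cofactor(s) column from the three scores using the 0.5 threshold; returning "-" when no score is above the threshold is an assumption, since that case is not shown in the example.

```python
from typing import NamedTuple, Optional

COFACTORS = ("FAD", "NAD", "NADP")
THRESHOLD = 0.5  # a score above 0.5 means predicted specificity


class DomainHit(NamedTuple):
    seq_id: str
    domain: int               # 0 means no Rossmann fold domain was found
    scores: dict              # cofactor name -> neural network score
    cofactors: str            # summary column, e.g. "FAD/NADP"
    start: Optional[int]
    end: Optional[int]
    sequence: Optional[str]


def parse_line(line):
    """Parse one non-comment output line into a DomainHit."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seq_id, domain = fields[0], int(fields[1])
    if domain == 0:
        # No domain identified: the remaining columns are all "-".
        return DomainHit(seq_id, 0, {}, "-", None, None, None)
    scores = {name: float(value) for name, value in zip(COFACTORS, fields[2:5])}
    return DomainHit(seq_id, domain, scores, fields[5],
                     int(fields[6]), int(fields[7]), fields[8])


def summarize(scores):
    """Join the cofactors whose score is above the threshold with a slash."""
    predicted = [name for name in COFACTORS if scores[name] > THRESHOLD]
    return "/".join(predicted) if predicted else "-"
```

Applied to the second Input_3 line of the example, summarize gives FAD/NAD, in agreement with the Cofactor(s) column, because the FAD score (0.502) and the NAD score (0.837) are both above 0.5.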

Paper abstract

REFERENCE

Cofactory: A sequence-based prediction method of cofactor specificity of Rossmann folds.
Henrik M. Geertz-Hansen, Nikolaj Blom, Adam Feist, Søren Brunak and Thomas Nordahl Petersen¹.
Proteins. 2014 Feb. 13. doi: 10.1002/prot.24536.

¹to whom correspondence should be addressed, e-mail: tnp@cbs.dtu.dk

Center for Biological Sequence Analysis, CBS, Department of Systems Biology.
The Technical University of Denmark, DK-2800 Lyngby, Denmark.

ABSTRACT

Suboptimal cofactor usage is a frequent bottleneck in metabolically engineered microbial production strains. To facilitate identification of heterologous enzymes with altered cofactor requirements, we have developed Cofactory, a method for prediction of enzyme cofactor requirements solely from amino acid sequence information. Given an input of protein sequences, the algorithm identifies potential cofactor binding Rossmann folds and predicts the specificity for the cofactors FAD(H2), NAD(H) and NADP(H). The Rossmann fold sequence search is carried out using hidden Markov models whereas artificial neural networks do the specificity prediction. The training was carried out using experimental data from protein-cofactor structure complexes. The overall performance was benchmarked against an independent evaluation set obtaining Matthews correlation coefficients of 0.94, 0.79 and 0.65 for FAD(H2), NAD(H) and NADP(H), respectively.


Software Downloads

Version 1.0

Linux

GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
