NetCorona - 1.0

Coronavirus 3C-like proteinase cleavage sites in proteins


NetCorona predicts coronavirus 3C-like proteinase (or protease) cleavage sites in amino acid sequences using artificial neural networks. Every potential site is scored, and the scores are reported both as a list and as a graphical representation. Refer to the publication for more detailed information and performance values.

Submission


Sequence submission: paste the sequence(s) and/or upload a local file

Paste a single sequence or several sequences in FASTA format into the field below:

Submit a file in FASTA format directly from your local disk:

Generate graphics   


Restrictions:
At most 10 sequences and 50,000 amino acids per submission; each sequence not more than 10,000 amino acids.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS
For publication of results, please cite:

Coronavirus 3CL-pro proteinase cleavage sites: Possible relevance to SARS virus pathology
L. Kiemer, O. Lund, S. Brunak, and N. Blom
BMC Bioinformatics 2004, 5:72


Instructions


1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y

Note that sequences containing other symbols, e.g. X (unknown), will be discarded before processing. Sequences can be submitted in the following two ways:

  • Paste a single sequence (just the amino acids) or a number of sequences in FASTA format into the upper window of the main server page.

  • Select a FASTA file on your local disk, either by typing the file name into the lower window or by browsing the disk.

Both ways can be employed at the same time: all the specified sequences will be processed. However, there may be no more than 10 sequences in total in one submission. Sequences exceeding 10,000 amino acids will be ignored.
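The input rules above can be checked locally before submitting. The following is a minimal sketch, not part of NetCorona itself: the function names `read_fasta` and `check_submission` are hypothetical, and the limits are taken from the restrictions stated on this page (10 sequences, 50,000 amino acids per submission, 10,000 amino acids per sequence). Where the server silently discards or ignores a sequence, this sketch reports it instead.

```python
# Hypothetical pre-submission check; names are illustrative, not a NetCorona API.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY")   # allowed one-letter alphabet
MAX_SEQUENCES = 10                      # per submission
MAX_TOTAL_RESIDUES = 50_000             # per submission
MAX_SEQUENCE_LENGTH = 10_000            # per sequence


def read_fasta(text):
    """Yield (name, sequence) pairs from FASTA text or a bare pasted sequence."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                name = "unnamed"        # sequence pasted without a header line
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def check_submission(text):
    """Return (accepted, problems) for a prospective submission."""
    accepted, problems = [], []
    for name, seq in read_fasta(text):
        seq = seq.upper()               # the alphabet is not case sensitive
        bad = sorted(set(seq) - ALLOWED)
        if not seq:
            problems.append(f"{name}: empty sequence")
        elif bad:
            problems.append(f"{name}: disallowed symbols {''.join(bad)}")
        elif len(seq) > MAX_SEQUENCE_LENGTH:
            problems.append(f"{name}: longer than {MAX_SEQUENCE_LENGTH} residues")
        else:
            accepted.append((name, seq))
    if len(accepted) > MAX_SEQUENCES:
        problems.append(f"more than {MAX_SEQUENCES} sequences")
    if sum(len(seq) for _, seq in accepted) > MAX_TOTAL_RESIDUES:
        problems.append(f"more than {MAX_TOTAL_RESIDUES} residues in total")
    return accepted, problems
```

Calling `check_submission` on the pasted text returns the sequences that satisfy the rules together with a list of human-readable problems, so an oversized or malformed submission can be corrected before it reaches the server.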


2. Customize your run

Graphics are generated by default. Use the "Generate graphics" option to disable them; if it is disabled, only the text output is shown.


3. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.


Output format


DESCRIPTION

The output with graphics enabled consists of three parts.

The first part is output in HOW format. The first line contains the number of residues and the sequence name. The amino acid sequence follows, and below it is a string with one character per amino acid giving the corresponding prediction ('C' for a cleavage site and '.' otherwise).

The second part lists the examined residues (glutamines), their scores and, where the score exceeds the 0.5 threshold, the amino acids around the cleavage site.

The third part is a graphical representation of the table in the second part of the output.

Multiple sequences are separated with '//'.
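The score table described above is simple enough to read with a short script. The sketch below assumes the layout shown in the example output on this page (three whitespace-separated columns, `none` when no cleavage context is reported, and `//` on its own line between sequences); the names `Site`, `parse_score_table` and `parse_output` are hypothetical and not part of NetCorona.

```python
from typing import List, NamedTuple, Optional


class Site(NamedTuple):
    position: int            # 1-based position of the examined glutamine
    score: float             # network score for this position
    context: Optional[str]   # residues around the site, or None if 'none'


def parse_score_table(text: str) -> List[Site]:
    """Collect table rows; header, separator and sequence lines are skipped."""
    sites = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        try:
            score = float(fields[1])
        except ValueError:
            continue
        context = None if fields[2] == "none" else fields[2]
        sites.append(Site(int(fields[0]), score, context))
    return sites


def parse_output(text: str) -> List[List[Site]]:
    """Split the output at '//' lines and parse one table per sequence."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "//":
            records.append(parse_score_table("\n".join(current)))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):   # output without a final '//'
        records.append(parse_score_table("\n".join(current)))
    return records
```

Predicted cleavage sites for the first sequence can then be selected with `[s for s in parse_output(text)[0] if s.score > 0.5]`, which applies the same 0.5 threshold as the server.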




EXAMPLE OUTPUT




   703 AT6B_HUMAN
MAELMLLSEIADPTRFFTDNLLSPEDWGLQNSTLYSGLDEVAEEQTQLFRCPEQDVPFDGSSLDVGMDVSPSEPPWELLP      80
IFPDLQVKSEPSSPCSSSSLSSESSRLSTEPSSEALGVGEVLHVKTESLAPPLCLLGDDPTSSFETVQINVIPTSDDSSD     160
VQTKIEPVSPCSSVNSEASLLSADSSSQAFIGEEVLEVKTESLSPSGCLLWDVPAPSLGAVQISMGPSLDGSSGKALPTR     240
KPPLQPKPVVLTTVPMPSRAVPPSTTVLLQSLVQPPPVSPVVLIQGAIRVQPEGPAPSLPRPERKSIVPAPMPGNSCPPE     320
VDAKLLKRQQRMIKNRESACQSRRKKKEYLQGLEARLQAVLADNQQLRRENAALRRRLEALLAENSELKLGSGNRKVVCI     400
MVFLLFIAFNFGPVSISEPPSAPISPRMNKGEPQPRRHLLGFSEQEPVQGVEPLQGSSQGPKEPQPSPTDQPSFSNLTAF     480
PGGAKELLLRDLDQLFLSSDCRHFNRTESLRLADELSGWVQRHQRGRRKIPQRAQERQKSQPRKKSPPVKAVPIQPPGPP     560
ERDSVGQLQLYRHPDRSQPAFLDAIDRREDTFYVVSFRRDHLLLPAISHNKTSRPKMSLVMPAMAPNETLSGRGAPGDYE     640
EMMQIECEVMDTRVIHIKTSTVPPSLRKQPSPTPGNATGGPLPVSAASQAHQASHQPLYLNHP
................................................................................      80
................................................................................     160
................................................................................     240
................................................................................     320
.....................................C..........................................     400
................................................................................     480
................................................................................     560
................................................................................     640
...............................................................

 

Pos     Score   Cleavage
___________________________
30      0.251   none
45      0.067   none
47      0.063   none
54      0.067   none
86      0.235   none
148     0.069   none
162     0.153   none
188     0.169   none
222     0.063   none
245     0.462   none
270     0.393   none
274     0.075   none
285     0.108   none
291     0.102   none
329     0.065   none
330     0.066   none
341     0.281   none
351     0.333   none
358     0.916   EARLQ^AVLAD
365     0.067   none
366     0.062   none
434     0.066   none
445     0.061   none
449     0.100   none
455     0.196   none
459     0.064   none
465     0.064   none
471     0.125   none
494     0.068   none
521     0.073   none
524     0.080   none
532     0.071   none
535     0.071   none
538     0.063   none
541     0.082   none
555     0.073   none
567     0.083   none
569     0.085   none
578     0.071   none
644     0.076   none
669     0.072   none
689     0.150   none
692     0.097   none
696     0.079   none
___________________________







//
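In the example above, every listed position is a glutamine, and the one site scoring above 0.5 (position 358) is shown as `EARLQ^AVLAD`, i.e. five residues on either side of the predicted cleavage point. The sketch below reproduces that bookkeeping for a sequence fragment. It only illustrates how positions and context strings relate to the sequence; it does not compute scores, the function names are hypothetical, and the five-residue flank is an assumption read off this single example.

```python
def glutamine_positions(seq):
    """Return the 1-based positions of all glutamines (Q) in a sequence."""
    return [i for i, aa in enumerate(seq.upper(), start=1) if aa == "Q"]


def cleavage_context(seq, pos, flank=5):
    """Return the residues around the bond following 1-based position `pos`.

    The '^' marks the cleavage point; `flank` residues are shown on each
    side (fewer near the sequence ends). flank=5 is assumed from the example.
    """
    left = seq[max(0, pos - flank):pos]   # up to and including the glutamine
    right = seq[pos:pos + flank]          # residues after the cleavage point
    return f"{left}^{right}"


# Residues 348-366 of AT6B_HUMAN from the example output.
fragment = "EYLQGLEARLQAVLADNQQ"
print(glutamine_positions(fragment))      # [4, 11, 18, 19]
print(cleavage_context(fragment, 11))     # EARLQ^AVLAD
```

Position 11 of the fragment corresponds to position 358 of the full sequence, so the second call reproduces the context string reported in the table.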





References



Coronavirus 3CL-pro proteinase cleavage sites: Possible relevance to SARS virus pathology
Lars Kiemer, Ole Lund, Søren Brunak, and Nikolaj Blom
BMC Bioinformatics 2004, 5:72

PMID: 15180906         doi: 10.1186/1471-2105-5-72

Abstract

Background: Despite the passing of a year since the first outbreak of SARS, efficient counter-measures are still few and many believe that the reappearance of SARS, which is caused by a coronavirus, is not unlikely. For other virus families like the picornaviruses it is known that pathology is related to proteolytic cleavage of host proteins by viral proteinases. Furthermore, several studies indicate that virus proliferation can be arrested using specific proteinase inhibitors, supporting the belief that proteinases are indeed important during infection. Prompted by this, we set out to analyse and predict cleavage by the coronavirus main proteinase using computational methods.

Results: We retrieved sequence data on seven fully sequenced coronaviruses and identified the main 3CL proteinase cleavage sites in polyproteins using alignments. A neural network was trained to recognise the cleavage sites in the genomes obtaining a sensitivity of 87.0% and a specificity of 99.0%. Several proteins known to be cleaved by other viruses were submitted to prediction as well as proteins suspected relevant in coronavirus pathology. Cleavage sites were predicted in proteins such as the cystic fibrosis transmembrane conductance regulator (CFTR), transcription factors CREB-RP and OCT-1, and components of the ubiquitin pathway.

Conclusion: Our prediction method NetCorona predicts coronavirus cleavage sites with high specificity and several potential cleavage candidates were identified which might be important to elucidate coronavirus pathology. Furthermore the method might assist in design of proteinase inhibitors for treatment of SARS and possible future diseases caused by coronaviruses.


