Submission


Paste in one or more GenBank file(s)

Upload file containing one or more GenBank entries


View example GenBank file
Notice: Multiple GenBank format files can be concatenated. A comprehensive source for GenBank files is the NCBI website: http://www.ncbi.nlm.nih.gov/.
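
Each GenBank entry ends with a line containing only `//`, so a concatenated file can be split back into individual entries. The following is a minimal Python sketch of that idea. The function name `split_genbank_entries` is hypothetical and is not part of FeatureExtract.

```python
def split_genbank_entries(text):
    """Split concatenated GenBank text into a list of single-entry strings."""
    entries = []
    current = []
    for line in text.splitlines():
        # Skip blank lines that sit between two entries.
        if not current and not line.strip():
            continue
        current.append(line)
        # A line consisting solely of "//" terminates a GenBank entry.
        if line.strip() == "//":
            entries.append("\n".join(current) + "\n")
            current = []
    # Keep a trailing entry that lacks its terminator, if it has any content.
    if current:
        entries.append("\n".join(current) + "\n")
    return entries
```

Concatenating is then just the reverse: writing the entries one after another into a single file or paste.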

Mar 13th, 2017: Light version (automatic look-up of GenBank IDs disabled).


Instructions: Basic usage - Paste in or upload a set of GenBank format files and hit submit. By default, the FeatureExtract server then extracts all protein-coding genes with full intron/exon annotation.

Please read the DTU Health Tech access policies for information about limits on the daily number of submissions. For processing large datasets (e.g. the human genome builds from NCBI), it is recommended to download the command-line version of FeatureExtract from the "Software download" page and run the processing locally.


Basic options

Select type of features to extract

Alternatively, enter the desired feature type(s) below:

Example: CDS,rRNA,tRNA

Include intergenic regions.
[details]

Naming preferences

1) Gene name
2) Systematic name
3) EntryId + distance

If the desired type of naming is not available, fall back to the level below: 1 -> 2 -> 3.
[details]
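
The fallback order 1 -> 2 -> 3 can be sketched in Python as follows. This is an illustration only, not the FeatureExtract implementation: the function name `choose_name`, its parameters, and the `EntryId_distance` output format are assumptions made for the example.

```python
def choose_name(gene_name, systematic_name, entry_id, distance, preferred_level=1):
    """Return a feature name, falling back 1 -> 2 -> 3 from preferred_level."""
    if preferred_level not in (1, 2, 3):
        raise ValueError("preferred_level must be 1, 2 or 3")
    # Candidates ordered by level; None or "" means "not available".
    candidates = {
        1: gene_name,
        2: systematic_name,
        # Assumed format for "EntryId + distance"; the real format may differ.
        3: f"{entry_id}_{distance}",
    }
    for level in range(preferred_level, 4):
        name = candidates[level]
        if name:
            return name
```

Level 3 can always be built from the entry itself, so the function always returns a name.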

Flanking regions

bp : Upstream (5')
bp : Downstream (3')

Optional: Define flanking regions
[details]


Advanced options

Frameshifts

(bp): Frameshift cutoff

"Introns" shorter than this length are considered annotated frameshifts
[details]
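
As an illustration of the cutoff rule, the Python sketch below labels each gap between consecutive exons as either a frameshift or an intron. The function name `classify_gaps` and the coordinate convention (1-based, inclusive, sorted by start) are assumptions made for the example; the server's internal logic may differ.

```python
def classify_gaps(exons, frameshift_cutoff):
    """Label each gap between consecutive exons as 'frameshift' or 'intron'.

    exons: list of (start, end) tuples, 1-based inclusive, sorted by start.
    """
    labels = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        # Number of bases strictly between the two exons.
        # Overlapping exons give a negative value and count as frameshifts here.
        gap_length = next_start - prev_end - 1
        if gap_length < frameshift_cutoff:
            labels.append("frameshift")
        else:
            labels.append("intron")
    return labels
```

With a cutoff of 10 bp, a 1 bp gap is reported as a frameshift, while a gap of exactly 10 bp is not shorter than the cutoff and is reported as an intron.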

Custom defined annotation

Example: snRNA=(N),promoter={P},unknown=QQQ
[details]

Splicing (new in 1.2)

Splice all intron-containing sequences
Full-length sequences are kept in the comments field

Only output intron-containing sequences
Can be used in combination with the "splice all..." option

[details]

Feature types to annotate in flanking regions

Alternatively, enter the desired feature type(s) below:

Example: MOST,polyA
[details]

Flanking region annotation scheme

Full annotation
Uppercase = same strand, Lowercase = opposite strand.
Presence/absence annotation
+ = same strand, - = opposite strand, # = overlapping
[details]
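
The presence/absence scheme can be sketched in Python as below. Two assumptions are made that the page does not confirm: `#` is taken to mean that features on both strands cover the same position, and `.` is used as a placeholder for positions with no feature. The function name `presence_absence` and the coordinate convention are also hypothetical.

```python
def presence_absence(length, features, target_strand):
    """Build a presence/absence string for a flanking region.

    features: list of (start, end, strand), 0-based half-open coordinates
    within the flanking region; strand is +1 or -1.
    target_strand: strand of the extracted feature (+1 or -1).
    """
    same = [False] * length
    opposite = [False] * length
    for start, end, strand in features:
        track = same if strand == target_strand else opposite
        # Clip the feature to the flanking region before marking it.
        for i in range(max(start, 0), min(end, length)):
            track[i] = True
    symbols = []
    for on_same, on_opposite in zip(same, opposite):
        if on_same and on_opposite:
            symbols.append("#")  # assumed meaning of "overlapping"
        elif on_same:
            symbols.append("+")
        elif on_opposite:
            symbols.append("-")
        else:
            symbols.append(".")  # assumed placeholder for "no feature"
    return "".join(symbols)
```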

Troubleshooting

Produce verbose information

Verbose: Output additional information about the contents of the GenBank files and the general progress of the extraction.
[details]

Restrictions: A maximum of 100 MB of GenBank files will be processed in each run.

Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

FeatureExtract - extraction of sequence annotation made easy.
Rasmus Wernersson.
Nucleic Acids Research, 2005, Vol. 33, Web Server issue W567-W569


PORTABLE VERSION

The command-line version of FeatureExtract is open source software (GPL license) and can be downloaded from the "Software download" page.

If you require FeatureExtract under a commercial license, please contact software@cbs.dtu.dk.