Output format



DESCRIPTION

An example of output is found below. The output is composed of the following sections:
  1. Run identifiers and settings
    The parameters specified by the user are reported here, together with the number of sequences loaded as input.

  2. Barplot of KLD vs number of clusters
    For each initial number of clusters, the information content of the alignments is shown as a barplot. The relative size of each block within a bar is proportional to the size of the corresponding cluster. In the example below, the run with 3 initial clusters produced one empty cluster, so only 2 boxes are depicted in the third column of the barplot.

  3. Sequence logos of the optimal solution
    The sequence motifs identified by the Gibbs clustering are shown to the right of the barplot. This is the optimal solution over the range of initial numbers of clusters. Hovering the cursor over the logos shows the KLD of each cluster in the solution.

  4. Complete results for all initial numbers of clusters
    If a range of initial numbers of clusters was specified (1 to 3 in the example below), the results for each case are listed in succession. The optimal number of clusters shown above is just a suggestion and depends on the specified parameters, so the user is encouraged to inspect all solutions with different numbers of clusters.

    It is possible to customize the logos further by clicking on the LOGO button. This transfers the data to the Seq2Logo server, which allows plotting several different kinds of sequence logo.

    Inspect the complete Clustering Report and the formatted Clustering Solution for a tabular version of the results. Format of the Clustering Solution files:

    Remember that results are only stored on the CBS server for about 24 hours. Save your results to disk by clicking on the DOWNLOAD link at the bottom of the results page.
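
The relationship between the barplot and the suggested solution can be illustrated with a short Python sketch. It uses the Final Average KLD values and cluster sizes reported in the example output. The selection rule (highest average KLD wins) is an assumption that happens to agree with the example, not a statement of how the server decides; the function names are hypothetical.

```python
# Final Average KLD per initial number of clusters, copied from the example output.
average_kld = {1: 9.466122, 2: 10.917174, 3: 10.907671}


def best_cluster_count(kld_by_count):
    """Return the number of clusters with the highest average KLD.

    Assumption: the suggested solution is the one that maximizes this value.
    """
    return max(kld_by_count, key=kld_by_count.get)


def block_fractions(cluster_sizes):
    """Relative size of each non-empty cluster, i.e. the blocks of one bar."""
    total = sum(cluster_sizes)
    if total == 0:
        return []
    # Empty clusters are skipped, which is why a bar can show fewer blocks
    # than the initial number of clusters.
    return [size / total for size in cluster_sizes if size > 0]


best = best_cluster_count(average_kld)
fractions = block_fractions([101, 99, 0])  # sizes from the run with 3 clusters
print(best, fractions)
```

With the example values, the run with 2 clusters is selected, and the bar for 3 initial clusters contains only two blocks because the third cluster is empty.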



EXAMPLE OUTPUT



   

GibbsCluster Server - Results

Technical University of Denmark





Version: 2.0
Run ID: 27911
Run name: gibbs_27911
Platform: Linux x86_64


Read 200 unique sequences from file

Settings:
No shift moves, cluster moves at every iteration
Number of clusters: 1 - 3
Motif length: 9
Initial MC temperature: 0.8
Number of temperature steps: 20
Number of iterations x Sequence x Tstep: 100
Max insertion length: 1
Max deletion length: 5
Interval between Indel moves: 10
Number of initial seeds: 3
Penalty lambda: 0.8
Weight on small clusters: 10
Sequence weighting type: 0
Background model: Uniprot pre-calculated
Use trash cluster to remove outliers: 1
Threshold for trash cluster: 0
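
If the settings block is saved as plain text, it can be read back into a dictionary for record keeping. The following Python sketch assumes that each setting sits on its own line in the form "name: value", as in the listing above; the function name parse_settings is hypothetical and not part of the server.

```python
def parse_settings(text):
    """Collect 'name: value' lines into a dictionary of strings."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon and headers such as "Settings:".
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings


# A few lines copied from the example settings block.
example = """\
Settings:
No shift moves, cluster moves at every iteration
Number of clusters: 1 - 3
Motif length: 9
Penalty lambda: 0.8
Use trash cluster to remove outliers: 1
"""

settings = parse_settings(example)
# The cluster range is written as "low - high".
low, high = (int(part) for part in settings["Number of clusters"].split("-"))
print(settings["Motif length"], low, high)
```

All values are kept as strings, so numeric settings need an explicit conversion, as shown for the cluster range.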

KLD vs. Number of clusters with λ = 0.8
Identified 2 sequence motifs



RESULTS for 1 CLUSTERS

Final Average KLD: 9.466122

 Group      Size   KLD      Seq2Logo   Matrix
 1          197    9.466               Mat_1.1
 Outliers   3

Raw Clustering Report
Formatted Clustering Solution
Clustered Alignment Cores

RESULTS for 2 CLUSTERS

Final Average KLD: 10.917174

 Group      Size   KLD      Seq2Logo   Matrix
 1          100    9.372               Mat_1.2
 2          100    12.462              Mat_2.2
 Outliers   0

Raw Clustering Report
Formatted Clustering Solution
Clustered Alignment Cores

RESULTS for 3 CLUSTERS

Final Average KLD: 10.907671

 Group      Size   KLD      Seq2Logo   Matrix
 1          101    12.348              Mat_1.3
 2          99     9.438               Mat_2.3
 3          0      0.000
 Outliers   0

Raw Clustering Report
Formatted Clustering Solution
Clustered Alignment Cores
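
To compare the runs outside the browser, the per-run summaries can be extracted from a plain-text copy of the results page. The Python sketch below assumes that the page text contains the "RESULTS for N CLUSTERS" and "Final Average KLD" lines exactly as printed above; the function name parse_average_kld is hypothetical.

```python
import re

RUN_HEADER = re.compile(r"RESULTS for (\d+) CLUSTERS")
AVERAGE_KLD = re.compile(r"Final Average KLD:\s*([0-9.]+)")


def parse_average_kld(text):
    """Map each initial number of clusters to its Final Average KLD."""
    results = {}
    current = None  # number of clusters of the run being read
    for line in text.splitlines():
        header = RUN_HEADER.search(line)
        if header:
            current = int(header.group(1))
            continue
        kld = AVERAGE_KLD.search(line)
        if kld and current is not None:
            results[current] = float(kld.group(1))
    return results


# Lines copied from the example results page.
page_text = """\
RESULTS for 1 CLUSTERS

Final Average KLD: 9.466122

RESULTS for 2 CLUSTERS

Final Average KLD: 10.917174

RESULTS for 3 CLUSTERS

Final Average KLD: 10.907671
"""

kld_by_run = parse_average_kld(page_text)
for n_clusters, kld in sorted(kld_by_run.items()):
    print(n_clusters, kld)
```

The same approach can be extended to the per-group rows, but their layout in a saved copy of the page may differ from what is shown here, so it is worth checking a real file first.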



See the Activity log for this job