Revision as of 14:45, 30 January 2017

Introduction

Detonate provides quality evaluation of a de-novo transcriptome assembly and comes from the creators of RSEM (Deweylab). It was designed to address the shortcomings of the N50 score typically used to evaluate assemblies. It is a much more comprehensive quality evaluation, requiring as input not only the de-novo assembly but also all the raw reads that were used to assemble it.

Layout

The program has two main components, rsem-eval and ref-eval. The executables associated with each are listed below.

rsem-eval
1. rsem-eval-calculate-score
2. rsem-eval-estimate-transcript-length-distribution
3. rsem-plot-model
4. rsem-build-read-index
5. rsem-eval-run-em
6. rsem-extract-reference-transcripts
7. rsem-parse-alignments
8. rsem-preref
9. rsem-sam-validator
10. rsem-scan-for-paired-end-reads
11. rsem-simulate-reads
12. rsem-synthesis-reference-transcripts

ref-eval
1. ref-eval
2. ref-eval-estimate-true-assembly

Usage

The detonate module must be loaded beforehand

module load detonate

After de-novo assembly of your transcriptome, the first step is

rsem-eval-estimate-transcript-length-distribution <contigs.fasta> <outputld.txt>

Explanation:

rsem-eval-estimate-transcript-length-distribution, a perl script
<contigs.fasta>, your de-novo assembly
<outputld.txt>, your chosen name for the output text file, which will hold the mean and SD of the contig length distribution.
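
To make the "mean and SD of the contig lengths" concrete, here is a minimal sketch that computes those two numbers from a FASTA file with awk. It is not the Detonate perl script: the helper name contig_length_stats, the use of the population SD, and the "mean SD" output layout are all assumptions made for illustration, so its output should not be fed to rsem-eval-calculate-score in place of the real parameter file.

```shell
# Sketch only: contig_length_stats is a hypothetical helper, not part of Detonate.
# Prints "<mean> <sd>" of the sequence lengths in a FASTA file.
# Assumptions: population SD (divide by n); records with no sequence are skipped.
contig_length_stats() {
    awk '
        # A header line closes the previous record, if it had any sequence.
        /^>/ {
            if (len > 0) { n++; sum += len; sumsq += len * len }
            len = 0
            next
        }
        # Sequence line: strip whitespace and carriage returns, then add its length.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END {
            # Close the last record.
            if (len > 0) { n++; sum += len; sumsq += len * len }
            if (n == 0) { print "no sequences found" > "/dev/stderr"; exit 1 }
            mean = sum / n
            var = sumsq / n - mean * mean
            if (var < 0) var = 0   # guard against rounding below zero
            printf "%.2f %.2f\n", mean, sqrt(var)
        }
    ' "$1"
}
```

Calling contig_length_stats contigs.fasta gives a quick sanity check on the assembly before running the real estimation step.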

Next the RSEM-EVAL score can be calculated. There is one executable for this, and it has various options. Executing

rsem-eval-calculate-score --help

will allow you to view them.

The standard usage example is:

rsem-eval-calculate-score -p 8 --transcript-length-parameters human.txt /data/reads.fq assembly1.fa assembly1_rsem_eval 76

Explanation: Two options are given by name, -p and --transcript-length-parameters. The rest are all positional.

-p 8, the number of threads the run will use, in this case 8.
--transcript-length-parameters, a filename containing the output of the previous rsem-eval-estimate-transcript-length-distribution command.
The next option is a comma-separated list of the raw read files used to assemble the de-novo transcriptome in the first place. It's probably best to build a list of these files beforehand (one filename per line) and pass them via command substitution with
```
$(cat fq.lst |tr '\n' ',')
```

The next option is the de-novo assembled FASTA file.
The penultimate option in this example is the output prefix, a name of the user's choice which will be used to prefix the output files.
The final option is a number representing the length of the raw reads.

It's best to wrap this command in a job submission script as follows:

#!/bin/bash
#$ -V
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -q highmemory.q
#$ -pe multi 8
module load detonate
rsem-eval-calculate-score -p $NSLOTS --transcript-length-parameters outputld.txt $(cat fq.lst |tr '\n' ',') <denovoassemblyname.fa> <output_prefix> <length_of_short_reads>
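
Two of the positional arguments above, the comma-separated read list and the read length, are easy to get wrong by hand. The following is a minimal sketch of two helpers that could be defined in the job script. Both join_read_files and first_read_length are hypothetical names introduced here, not Detonate commands. The sketch assumes fq.lst holds one filename per line, and that the reads are uncompressed FASTQ with the sequence on a single line.

```shell
# Sketch only: both helpers are hypothetical, not part of Detonate.

# join_read_files: print the non-empty lines of a list file joined by commas,
# with no trailing comma. Returns non-zero if a listed file does not exist.
join_read_files() {
    list=$1
    out=""
    # The "|| [ -n "$f" ]" part also handles a last line without a newline.
    while IFS= read -r f || [ -n "$f" ]; do
        [ -z "$f" ] && continue
        if [ ! -f "$f" ]; then
            echo "missing read file: $f" >&2
            return 1
        fi
        out="${out:+$out,}$f"
    done < "$list"
    printf '%s\n' "$out"
}

# first_read_length: length of the sequence line of the first FASTQ record,
# which is line 2 of the file. Only representative if all reads share a length.
first_read_length() {
    awk 'NR == 2 { sub(/\r$/, ""); print length($0); exit }' "$1"
}
```

With these defined, "$(join_read_files fq.lst)" can stand in for the tr-based substitution. The tr form leaves a trailing comma after the last filename when fq.lst ends with a newline, while this variant omits it and stops early if a listed file is missing. If the reads have been trimmed to varying lengths, the value from first_read_length is only a rough guide.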

Links

The published paper describing Detonate: http://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0553-5