Detonate
Latest revision as of 14:57, 30 January 2017
Introduction
Quality evaluation of a de-novo transcriptome assembly, from the creators of RSEM (Deweylab). It was designed to address the shortcomings of the N50 score typically used to evaluate assemblies. It is a much more comprehensive quality evaluation, requiring as input not only the de-novo assembly but also all the raw reads that were used to assemble it.
Layout
The program has two main components, rsem-eval and ref-eval. The executables associated with each are listed below:
- rsem-eval
  - rsem-eval-calculate-score
  - rsem-eval-estimate-transcript-length-distribution
  - rsem-plot-model
  - rsem-build-read-index
  - rsem-eval-run-em
  - rsem-extract-reference-transcripts
  - rsem-parse-alignments
  - rsem-preref
  - rsem-sam-validator
  - rsem-scan-for-paired-end-reads
  - rsem-simulate-reads
  - rsem-synthesis-reference-transcripts
- ref-eval
  - ref-eval
  - ref-eval-estimate-true-assembly
Usage
The detonate module must be loaded beforehand:
module load detonate
After de-novo assembly of your transcriptome, the first step is to estimate the transcript length distribution:
rsem-eval-estimate-transcript-length-distribution <contigs.fasta> <outputld.txt>
Explanation:
- rsem-eval-estimate-transcript-length-distribution, a perl script
- <contigs.fasta>, your de-novo assembly
- <outputld.txt>, your chosen name for the output text file, which will hold the mean and SD of the contig length distribution.
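As an independent sanity check on the contig lengths, the mean and standard deviation can also be computed directly from the FASTA file. The sketch below is a hypothetical helper, not part of Detonate, and it is not claimed to reproduce the method or output format of rsem-eval-estimate-transcript-length-distribution. It assumes an uncompressed FASTA file and reports the population (not sample) standard deviation.

```shell
#!/bin/bash
# Hypothetical helper (NOT part of Detonate): print the number of sequences,
# the mean sequence length and the population standard deviation of the
# lengths found in an uncompressed FASTA file.
fasta_length_stats() {
    awk '
        # A header line closes the previous record (if any) and starts a new one.
        /^>/ { if (len > 0) { n++; sum += len; sumsq += len * len }; len = 0; next }
        # Sequence line: strip whitespace and add its length to the current record.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END {
            # Close the last record.
            if (len > 0) { n++; sum += len; sumsq += len * len }
            if (n == 0) { print "no sequences found" > "/dev/stderr"; exit 1 }
            mean = sum / n
            var = sumsq / n - mean * mean
            if (var < 0) var = 0   # guard against rounding error
            printf "%d %.2f %.2f\n", n, mean, sqrt(var)
        }
    ' "$1"
}
```

Calling `fasta_length_stats contigs.fasta` prints three space-separated fields: sequence count, mean length and SD. A large disagreement with the values in the parameter file would suggest that the wrong input file was used.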
Next the RSEM-EVAL score can be calculated. There is one executable for this, and it has various options. Executing
rsem-eval-calculate-score --help
will allow you to view them.
The standard usage example is:
rsem-eval-calculate-score -p 8 --transcript-length-parameters human.txt /data/reads.fq assembly1.fa assembly1_rsem_eval 76
Explanation: There are two options with an associated flag: -p and --transcript-length-parameters. The rest are all positional arguments.
- -p 8: the number of threads the program will use, in this case 8.
- --transcript-length-parameters: the name of the file containing the output of the previous rsem-eval-estimate-transcript-length-distribution command.
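- The next argument is a comma-separated list of the raw read files used to assemble the de-novo transcriptome in the first place. It's probably best to build a list of these files beforehand (one per line, here in fq.lst) and pass them via command substitution. Using paste joins the lines with commas and, unlike piping through tr '\n' ',', leaves no trailing comma:
$(paste -s -d ',' fq.lst)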
- The next argument is the de-novo assembled FASTA file.
- The penultimate argument in this example is the output prefix, a name (of the user's choice) which will be used to prefix the output files.
- The final argument is a number representing the length of the raw reads.
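The argument order described above can be checked with a dry run before anything is submitted. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the file names (fq.lst, outputld.txt, assembly1.fa) are the placeholders used on this page, and the command is only printed, never executed.

```shell
#!/bin/bash
# Hypothetical helper: join the non-empty lines of a list file with commas,
# producing no leading or trailing comma.
reads_list() {
    grep -v '^[[:space:]]*$' "$1" | paste -s -d ',' -
}

# Dry run: print the rsem-eval-calculate-score command instead of running it.
# Arguments: <list_file> <assembly.fa> <output_prefix> <read_length>
# The thread count (8) and parameter file name (outputld.txt) are the
# example values from this page, not requirements.
print_score_command() {
    local list_file=$1 assembly=$2 prefix=$3 read_len=$4
    echo rsem-eval-calculate-score -p 8 \
        --transcript-length-parameters outputld.txt \
        "$(reads_list "$list_file")" "$assembly" "$prefix" "$read_len"
}
```

For example, `print_score_command fq.lst assembly1.fa assembly1_rsem_eval 76` prints the full command line so the reads list and positional arguments can be inspected by eye.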
It's best to wrap this command in a job submission script as follows:
#!/bin/bash
#$ -V
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -q highmemory.q
#$ -pe multi 8
module load detonate
rsem-eval-calculate-score -p $NSLOTS --transcript-length-parameters outputld.txt $(paste -s -d ',' fq.lst) <denovoassemblyname.fa> <output_prefix> <length_of_short_reads>
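If the read length needed for the `<length_of_short_reads>` placeholder is not known, it can be measured from one of the read files. The sketch below is a hypothetical helper that assumes an uncompressed FASTQ file with exactly four lines per record; it prints the mean read length rounded to the nearest integer. Whether a mean is the right value for your data (for instance with paired-end reads) is something to confirm against `rsem-eval-calculate-score --help`.

```shell
#!/bin/bash
# Hypothetical helper: mean read length of an uncompressed FASTQ file that
# uses exactly four lines per record (header, sequence, '+', qualities).
mean_read_length() {
    awk '
        # The sequence is the second line of every four-line record.
        NR % 4 == 2 { sub(/\r$/, ""); total += length($0); n++ }
        END {
            if (n == 0) exit 1
            # Adding 0.5 before integer conversion rounds to the nearest integer.
            printf "%d\n", total / n + 0.5
        }
    ' "$1"
}
```

The result can then be substituted for the final positional argument, for example `L=$(mean_read_length reads.fq)`, where reads.fq is a placeholder name.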
Links
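- The published paper describing Detonate: http://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0553-5
- Source code webpage with some usage instructions: https://github.com/deweylab/detonate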