Thor
Revision as of 09:18, 6 June 2017
Introduction
This is one of the newer differential peak callers from Costalab in Aachen.
It has a Google Groups page, albeit a low-traffic one, at: https://groups.google.com/forum/#!categories/rgtusers/odinthor
Notes:
- Thor follows on from the ODIN tool, which it largely supersedes.
- For normalization it can use either the TMM method widely used in RNA-Seq (edgeR) or a BED-format list of housekeeping genes.
- There is no verbose option to see what might be going wrong, though the developers are working on it.
- There is no parallelism in this tool, so it can run quite slowly. For a big sample set, reserve as much as two days.
- Neither Thor nor ODIN takes account of paired reads in the BAM file. The developers recommend filtering discordant pairs out of the BAM file.
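One way to follow that last recommendation is to keep only the reads the aligner flagged as properly paired (SAM flag 0x2), which drops discordant pairs along with unpaired reads. The sketch below assumes samtools is installed; the helper names and the .pp.bam suffix are made up for illustration and are not part of Thor or RGT.

```shell
# Hypothetical helpers, not part of Thor or RGT.

# pp_name IN.bam: print the name used for the filtered copy (x.bam -> x.pp.bam).
pp_name() {
    printf '%s\n' "${1%.bam}.pp.bam"
}

# filter_proper_pairs IN.bam: write a copy holding only properly paired reads.
# "-f 2" keeps reads whose SAM flag has the "proper pair" bit (0x2) set.
filter_proper_pairs() {
    samtools view -b -f 2 -o "$(pp_name "$1")" "$1"
}
```

After running filter_proper_pairs on each BAM file, the config file would need to point at the .pp.bam copies instead of the originals.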
Housekeeping genes
The rationale for this approach is that housekeeping genes have a stable expression pattern, and so can be used to normalise all the other genes, which don't.
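The housekeeping list is handed to Thor as a BED file. A rough sketch of how such a file might be pulled out of a larger annotation, assuming a BED file whose fourth column holds the gene name. Here genes.bed is a placeholder file name and select_genes is a made-up helper, neither comes from Thor.

```shell
# select_genes ANNOTATION.bed NAME...
# Print the BED records whose 4th column equals one of the names (case-insensitive).
# Hypothetical helper, not part of Thor or RGT.
select_genes() {
    bed=$1
    shift
    # Join the requested names into one "|"-separated, lower-case string for awk.
    names=$(printf '%s\n' "$@" | tr '[:upper:]' '[:lower:]' | paste -sd'|' -)
    awk -v names="$names" '
        BEGIN { n = split(names, a, "|"); for (i = 1; i <= n; i++) want[a[i]] = 1 }
        (tolower($4) in want)
    ' "$bed"
}
```

For the four yeast genes used further down this page, the call would look like:
select_genes genes.bed cdc19 act1 tdh3 fba1 > fourhk_yeast.bed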
Usage
A first run
The first few runs are usually error-prone. Thor is no exception:
rgt-THOR first.config --report --no-correction --output-dir ./thorfirst -n THOR_DCfirst
Explanation:
- first.config: a configuration file, described below.
- --output-dir: the program will fail to run if this directory already exists.
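Because of the --output-dir behaviour, repeated attempts each need a directory name that is not yet taken. A minimal sketch of one way to get one; fresh_outdir is a made-up helper, not a Thor option.

```shell
# fresh_outdir BASE
# Print BASE if nothing with that name exists yet, otherwise BASE_1, BASE_2, ...
# Hypothetical helper, not part of Thor or RGT.
fresh_outdir() {
    base=$1
    candidate=$base
    i=1
    while [ -e "$candidate" ]; do
        candidate="${base}_$i"
        i=$((i + 1))
    done
    printf '%s\n' "$candidate"
}
```

It can be dropped straight into the command line:
rgt-THOR first.config --report --no-correction --output-dir "$(fresh_outdir ./thorfirst)" -n THOR_DCfirst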
The contents of first.config are as follows:
#rep1
WTIP_S5_L001/WTIP_S5_L001_srtd.bam
WTIP_S5_L002/WTIP_S5_L002_srtd.bam
#rep2
USL1IP_S7_L001/USL1IP_S7_L001_srtd.bam
USL1IP_S7_L002/USL1IP_S7_L002_srtd.bam
#genome
/storage/home/users/as363/w303_genome/w303.fa
#chrom_sizes
/storage/home/users/as363/w303_genome/w303_.sizes
#inputs1
WTINP_S4_L001/WTINP_S4_L001_srtd.bam
WTINP_S4_L002/WTINP_S4_L002_srtd.bam
#inputs2
ULS1INP_S6_L001/ULS1INP_S6_L001_srtd.bam
ULS1INP_S6_L002/ULS1INP_S6_L002_srtd.bam
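With this many paths in one file, a mistyped file name is an easy way to lose a run. A small pre-flight check can be sketched as follows, assuming the config holds one entry per line with section headers starting with #. check_config is a made-up helper and relative paths are resolved from the directory it is run in.

```shell
# check_config FILE
# Report every path listed in a Thor-style config that does not exist.
# Returns success only if nothing is missing. Hypothetical helper.
check_config() {
    missing=0
    # "read" without IFS= trims leading and trailing whitespace from each line.
    while read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;   # skip blank lines and "#section" headers
        esac
        if [ ! -e "$line" ]; then
            printf 'missing: %s\n' "$line" >&2
            missing=$((missing + 1))
        fi
    done < "$1"
    [ "$missing" -eq 0 ]
}
```

Run it as check_config first.config from the same directory in which rgt-THOR will be started.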
Output from this run
Call DPs on whole genome.
Computing read extension sizes for ChIP-seq profiles
Compute GC-content
[fai_load] build FASTA index.
Compute factors
Normalize input of Signal 0, Rep 0 with factor 0.644
Normalize input of Signal 0, Rep 1 with factor 0.645
Normalize input of Signal 1, Rep 0 with factor 0.647
Normalize input of Signal 1, Rep 1 with factor 0.647
Use global TMM approach
TMM normalization not successfully performed, do not normalize data
TMM normalization not successfully performed, do not normalize data
TMM normalization not successfully performed, do not normalize data
TMM normalization not successfully performed, do not normalize data
Compute GC-content
Compute factors
Normalize input of Signal 0, Rep 0 with factor 0.644
Normalize input of Signal 0, Rep 1 with factor 0.645
Normalize input of Signal 1, Rep 0 with factor 0.647
Normalize input of Signal 1, Rep 1 with factor 0.647
Use global TMM approach
Compute HMM's training set
No differential peaks detected
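With no verbose option, the log is about the only diagnostic there is, so it is worth saving and skimming for the two messages that matter in the run above. A minimal sketch; scan_thor_log is a made-up helper and the two patterns are simply the messages seen on this page, not a complete list of Thor warnings.

```shell
# scan_thor_log LOGFILE
# Print each distinct warning-like message once, prefixed by how often it occurred.
# Hypothetical helper; the patterns are taken from the output shown on this page.
scan_thor_log() {
    grep -E 'not successfully performed|No differential peaks detected' "$1" \
        | sort | uniq -c
}
```

To save the log in the first place, the run can be piped through tee (2>&1 is used here only because it has not been checked which stream Thor writes to):
rgt-THOR first.config --report --no-correction --output-dir ./thorfirst -n THOR_DCfirst 2>&1 | tee thorfirst.log
scan_thor_log thorfirst.log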
A second run, with four housekeeping genes
rgt-THOR first.config --report --housekeeping-genes fourhk_yeast.bed --no-correction --output-dir ./thorsec -n THOR_DCsec
Call DPs on whole genome.
Computing read extension sizes for ChIP-seq profiles
Compute GC-content
Compute factors
Normalize input of Signal 0, Rep 0 with factor 0.644
Normalize input of Signal 0, Rep 1 with factor 0.645
Normalize input of Signal 1, Rep 0 with factor 0.647
Normalize input of Signal 1, Rep 1 with factor 0.647
Use housekeeping gene approach
-Housekeeping gene matrix (columns-genes, rows-samples)
[[ 5655. 3364. 4323. 4224.]
 [ 5569. 2860. 4550. 4301.]
 [ 6860. 3415. 4752. 4436.]
 [ 6176. 3574. 5364. 4361.]]

-gene (column) wise evaluation
cdc19 0.00021061043602
act1 0.000309042887452
tdh3 0.000258788890782
fba1 0.000205695296173

-sample (row) wise evaluation
WTIP_S5_L001_srtd 0.000177121339107
WTIP_S5_L002_srtd 0.000458559742218
USL1IP_S7_L001_srtd 0.000271335438277
USL1IP_S7_L002_srtd 0.000422582780881

Compute GC-content
Compute factors
Normalize input of Signal 0, Rep 0 with factor 0.644
Normalize input of Signal 0, Rep 1 with factor 0.645
Normalize input of Signal 1, Rep 0 with factor 0.647
Normalize input of Signal 1, Rep 1 with factor 0.647
Use housekeeping gene approach
-Housekeeping gene matrix (columns-genes, rows-samples)
[[ 5655. 3364. 4323. 4224.]
 [ 5569. 2860. 4550. 4301.]
 [ 6860. 3415. 4752. 4436.]
 [ 6176. 3574. 5364. 4361.]]

-gene (column) wise evaluation
cdc19 0.00021061043602
act1 0.000309042887452
tdh3 0.000258788890782
fba1 0.000205695296173

-sample (row) wise evaluation
WTIP_S5_L001_srtd 0.000177121339107
WTIP_S5_L002_srtd 0.000458559742218
USL1IP_S7_L001_srtd 0.000271335438277
USL1IP_S7_L002_srtd 0.000422582780881

Compute HMM's training set
No differential peaks detected
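The housekeeping gene matrix in this log holds one column per gene and one row per sample. For a rough feel of how steady each gene is across the samples, the per-column mean and the relative spread, (max - min) / mean, can be computed directly from the bracketed rows. This is only an illustrative sketch: matrix_summary is a made-up helper, and its numbers are not the statistic Thor prints under "gene (column) wise evaluation", whose definition has not been checked here.

```shell
# matrix_summary
# Read a bracketed matrix (as printed in the Thor log) on stdin and print,
# for each column: column index, mean, and (max - min) / mean.
# Hypothetical helper, not part of Thor or RGT.
matrix_summary() {
    sed 's/[][]//g' | awk '
        NF == 0 { next }
        {
            for (i = 1; i <= NF; i++) {
                v = $i + 0
                sum[i] += v
                if (rows == 0 || v < min[i]) min[i] = v
                if (rows == 0 || v > max[i]) max[i] = v
            }
            if (NF > ncol) ncol = NF
            rows++
        }
        END {
            for (i = 1; i <= ncol; i++) {
                mean = sum[i] / rows
                spread = (mean == 0) ? 0 : (max[i] - min[i]) / mean
                printf "%d %.2f %.3f\n", i, mean, spread
            }
        }'
}
```

Paste the four matrix rows into a file and run matrix_summary < matrix.txt; the columns come out in the same order as the gene list (cdc19, act1, tdh3, fba1).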