Difference between revisions of "Estimating Gene Count Talk"
Revision as of 12:33, 9 May 2017
Estimating gene count
Contents
- 1 Estimating Gene Count
- 2 Multi mapping reads
- 3 Transcripts/Genes
- 4 HTSeq-count
- 5 Probabilistic approach
Estimating Gene Count
How many reads overlap each genomic feature? Or, put differently: can we confidently assign each read to a feature/transcript/gene? Not so simple.
We also have to deal with:
- Multi mapping reads
- Overlapping genes/transcripts
Two approaches:
- Focus on what’s known with certainty
- Probabilistic
Multi mapping reads
- Unsolved problem:
– Can account for 10-30% of reads (e.g. a read that maps equally well to GeneA on chr11 and to GeneB on chr5)
– Ignore them … (decreases sensitivity)
– Weighted assignment
- Solution is to use longer reads
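The two ways of handling multi-mapped reads listed above (ignore them, or use weighted assignment) can be sketched in a few lines of Python. This is only an illustration: the function name `count_reads`, the toy `alignments` dictionary and the equal 1/n weighting are assumptions made for the example, not the behaviour of any particular tool.

```python
from collections import defaultdict

def count_reads(alignments, strategy="ignore"):
    """Count reads per gene.

    alignments: dict mapping a read id to the list of genes it maps to
                (hypothetical toy input, not a real aligner format).
    strategy:   "ignore"   drops every multi-mapped read,
                "weighted" splits each multi-mapped read equally (1/n)
                           between the n genes it maps to.
    """
    if strategy not in ("ignore", "weighted"):
        raise ValueError(f"unknown strategy: {strategy}")
    counts = defaultdict(float)
    for genes in alignments.values():
        if not genes:
            continue  # unmapped read, nothing to count
        if len(genes) == 1:
            counts[genes[0]] += 1.0  # uniquely mapped read
        elif strategy == "weighted":
            share = 1.0 / len(genes)
            for gene in genes:
                counts[gene] += share
        # strategy == "ignore": multi-mapped reads are simply skipped
    return dict(counts)

# Toy data: two unique reads and two reads mapping to both genes.
alignments = {
    "read1": ["GeneA"],
    "read2": ["GeneA", "GeneB"],
    "read3": ["GeneB"],
    "read4": ["GeneA", "GeneB"],
}
ignored = count_reads(alignments, "ignore")
weighted = count_reads(alignments, "weighted")
```

With this toy input, ignoring multi-mapped reads leaves one read per gene, while the weighted scheme gives each gene a count of 2.0: half of the reads would otherwise be lost, which is the sensitivity cost mentioned above.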
Transcripts/Genes
- Transcripts/Isoforms or Genes
[Figure: GeneA and its transcripts T1, T2 and T3]
- Gene level is aggregating transcripts
- Transcript level needs longer reads
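As a small illustration of "gene level is aggregating transcripts", the sketch below sums per-transcript counts into a per-gene count. The transcript-to-gene mapping follows the figure (T1, T2 and T3 belong to GeneA); the count values and the function name `aggregate_to_gene` are invented for the example.

```python
from collections import defaultdict

# Mapping taken from the slide figure: three transcripts of one gene.
transcript_to_gene = {"T1": "GeneA", "T2": "GeneA", "T3": "GeneA"}

# Invented per-transcript counts, for illustration only.
transcript_counts = {"T1": 120, "T2": 30, "T3": 50}

def aggregate_to_gene(transcript_counts, transcript_to_gene):
    """Sum transcript-level counts into gene-level counts."""
    gene_counts = defaultdict(int)
    for transcript, count in transcript_counts.items():
        gene_counts[transcript_to_gene[transcript]] += count
    return dict(gene_counts)

gene_counts = aggregate_to_gene(transcript_counts, transcript_to_gene)
```

The gene-level count is insensitive to how reads were split between T1, T2 and T3, which is why it is the easier quantity to estimate from short reads.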
HTSeq-count
- Designed for RNA-Seq counting
- Simple to use (especially since v0.6.0)
- Works at gene level
- Removes multi-mapped reads
- Several modes to resolve remaining uncertainty
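The modes mentioned above are, in htseq-count, `union`, `intersection-strict` and `intersection-nonempty`. The sketch below is a simplified re-implementation of the idea, not HTSeq code: it assumes each read is given as a list of sets, one per read position, holding the genes that overlap that position. The function name `assign_read` and this per-position input are assumptions made for the illustration.

```python
def assign_read(position_sets, mode="union"):
    """Decide which gene a read is counted for.

    position_sets: list of sets; element i holds the genes overlapping
                   position i of the read (simplified toy input).
    Returns a gene name, "__no_feature" or "__ambiguous".
    """
    if mode == "union":
        # every gene touched anywhere along the read
        candidates = set().union(*position_sets)
    elif mode == "intersection-strict":
        # genes covering every position of the read
        candidates = set.intersection(*position_sets) if position_sets else set()
    elif mode == "intersection-nonempty":
        # like strict, but positions covered by no gene are skipped
        nonempty = [s for s in position_sets if s]
        candidates = set.intersection(*nonempty) if nonempty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")

    if not candidates:
        return "__no_feature"
    if len(candidates) > 1:
        return "__ambiguous"
    return next(iter(candidates))

# Read hanging off the end of GeneA (last position overlaps nothing).
overhang = [{"GeneA"}, {"GeneA"}, set()]
# Read fully inside GeneA, partly inside an overlapping GeneB.
overlap = [{"GeneA"}, {"GeneA", "GeneB"}]

results = {
    mode: (assign_read(overhang, mode), assign_read(overlap, mode))
    for mode in ("union", "intersection-strict", "intersection-nonempty")
}
```

In this toy case `union` counts the overhanging read for GeneA but calls the overlapping read ambiguous, `intersection-strict` does the opposite (no feature for the overhang, GeneA for the overlap), and `intersection-nonempty` assigns both reads to GeneA.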
Probabilistic approach
Cufflinks: Reconstruct the transcripts from the data and annotation
Cuffdiff: Assign each read/fragment to a transcript with a probability (maximum likelihood)
Pros:
- Better methodology
- Integrated package (ease of use)
Cons:
- Does not support alternative experimental designs
- History of heterogeneous results/versions
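To make "assign each read/fragment to a transcript with a probability" more concrete, here is a minimal expectation-maximisation sketch of the general idea. It is not the Cufflinks/Cuffdiff implementation: it ignores transcript length, fragment-size distributions and sequencing bias, and the function name `em_abundance` and the toy reads are assumptions made for the example.

```python
def em_abundance(compatibility, transcripts, n_iter=100):
    """Estimate relative transcript abundances by EM.

    compatibility: list with one entry per read, each a list of the
                   transcripts that read is compatible with.
    Returns a dict transcript -> estimated share of reads.
    """
    # Start from equal abundances.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        expected = {t: 0.0 for t in transcripts}
        # E-step: split each read between its candidate transcripts
        # in proportion to the current abundance estimates.
        for candidates in compatibility:
            total = sum(theta[t] for t in candidates)
            if total == 0:
                continue  # read compatible with nothing that has support
            for t in candidates:
                expected[t] += theta[t] / total
        # M-step: new abundance = share of the expected read counts.
        n_assigned = sum(expected.values())
        if n_assigned == 0:
            break
        theta = {t: expected[t] / n_assigned for t in transcripts}
    return theta

# Toy data: three reads unique to T1, one unique to T2, two shared.
reads = [["T1"], ["T1"], ["T1"], ["T2"], ["T1", "T2"], ["T1", "T2"]]
theta = em_abundance(reads, ["T1", "T2"])
```

On this toy input the estimates settle at about 0.75 for T1 and 0.25 for T2: the two shared reads are not discarded but divided in the 3:1 ratio suggested by the uniquely assigned reads.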