Revision as of 13:14, 9 May 2017

1 Estimating Gene Count
2 Multi mapping reads
3 One transcript, one set of reads
4 Two transcripts, another set of reads
5 Aggregation to Gene-level 1
6 Third transcript, another set of reads
7 Aggregation to Gene-level 2
8 HTSeq-count
9 HTSeq-count
10 Probabilistic approach
11 Probabilistic approach
12 Probabilistic approach
13 Probabilistic approach

Estimating Gene Count

How many reads overlap each genomic feature? Put another way: can we confidently assign each read to a feature, transcript, or gene? It is not that simple.

We also have:

Multi mapping reads
Overlapping genes/transcripts

Two approaches:

Focus on what’s known with certainty
Probabilistic

Multi mapping reads

Unsolved problem:

- multi-mapping reads can account for 10-30% of all reads

Possible workarounds:

Ignore them, at the cost of reduced sensitivity
Weighted assignment

Of course, longer reads would solve this problem.
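The two options for multi-mapping reads can be contrasted with a minimal Python sketch. Everything here is an illustrative assumption: the function name `count_reads`, the read ids, and the equal 1/n split used for the weighted case (other weighting schemes exist). GeneA and GeneB stand for two genes that share reads.

```python
from collections import defaultdict


def count_reads(alignments, strategy="ignore"):
    """Count reads per gene.

    alignments maps a read id to the list of genes it aligns to.
    strategy="ignore"   drops every read that aligns to more than one gene.
    strategy="weighted" splits such a read equally (1/n) among its n genes.
    """
    if strategy not in ("ignore", "weighted"):
        raise ValueError(f"unknown strategy: {strategy}")
    counts = defaultdict(float)
    for genes in alignments.values():
        unique_genes = set(genes)
        if len(unique_genes) == 1:
            # Uniquely mapped read: counted in full under both strategies.
            counts[next(iter(unique_genes))] += 1.0
        elif strategy == "weighted":
            # Multi-mapped read: each gene receives an equal fraction.
            for gene in unique_genes:
                counts[gene] += 1.0 / len(unique_genes)
    return dict(counts)


# Hypothetical toy data: r2 and r4 map to both genes.
alignments = {
    "r1": ["GeneA"],
    "r2": ["GeneA", "GeneB"],
    "r3": ["GeneB"],
    "r4": ["GeneA", "GeneB"],
}
ignored = count_reads(alignments, "ignore")
weighted = count_reads(alignments, "weighted")
```

With this toy input, ignoring multi-mapped reads discards half of the data, while the weighted strategy keeps every read but spreads it over both genes.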

One transcript, one set of reads

Two transcripts, another set of reads

[[File:t1t2.png]]

Aggregation to Gene-level 1

File:Tt1t2aggreg.png

Third transcript, another set of reads

Aggregation to Gene-level 2

File:T1t2tt3aggreg.png
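A small Python sketch of why aggregating to the gene level helps: a read that is compatible with several transcripts is ambiguous at the transcript level, but unambiguous at the gene level as long as all of those transcripts belong to the same gene. The function name, read ids, and compatibility lists are hypothetical and not taken from the figures; T1, T2, T3 and GeneA are used as placeholder names.

```python
def gene_level_counts(read_to_transcripts, transcript_to_gene):
    """Count a read for a gene if all its compatible transcripts share that gene."""
    counts = {}
    for transcripts in read_to_transcripts.values():
        genes = {transcript_to_gene[t] for t in transcripts}
        if len(genes) == 1:
            gene = next(iter(genes))
            counts[gene] = counts.get(gene, 0) + 1
    return counts


# Hypothetical toy data: three isoforms of one gene.
transcript_to_gene = {"T1": "GeneA", "T2": "GeneA", "T3": "GeneA"}
read_to_transcripts = {
    "r1": ["T1"],              # unambiguous at transcript level
    "r2": ["T1", "T2"],        # ambiguous between two isoforms
    "r3": ["T2", "T3"],
    "r4": ["T1", "T2", "T3"],  # shared by all isoforms
}
gene_counts = gene_level_counts(read_to_transcripts, transcript_to_gene)
```

Only r1 could be assigned at the transcript level, yet all four reads count toward GeneA once the transcripts are aggregated.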

HTSeq-count

Designed for RNA-Seq counting
Simple to use (especially since v0.6.0)
Works at the gene level
Removes multi-mapped reads
Offers several modes to resolve the remaining uncertainty
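The overlap-resolution modes can be sketched in Python as set operations. This is a simplified illustration of the logic behind the `union`, `intersection-strict`, and `intersection-nonempty` modes, not HTSeq's own code: the function `resolve` and its input format (one set of gene ids per aligned position of a read) are assumptions made for this example.

```python
def resolve(position_sets, mode="union"):
    """Return the gene a read is counted for, or a special category.

    position_sets holds, for each aligned position of the read, the set of
    genes annotated at that position (an empty set where there is no gene).
    """
    if mode == "union":
        candidates = set().union(*position_sets)
    elif mode == "intersection-strict":
        candidates = set.intersection(*position_sets) if position_sets else set()
    elif mode == "intersection-nonempty":
        non_empty = [s for s in position_sets if s]
        candidates = set.intersection(*non_empty) if non_empty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not candidates:
        return "__no_feature"
    if len(candidates) > 1:
        return "__ambiguous"
    return next(iter(candidates))


# Hypothetical reads:
overhang = [{"gene_A"}, {"gene_A"}, set()]         # read extends past gene_A
partial_overlap = [{"gene_A"}, {"gene_A", "gene_B"}]  # read enters gene_B
both = [{"gene_A", "gene_B"}, {"gene_A", "gene_B"}]   # read inside both genes
```

For example, `resolve(overhang, "union")` assigns the read to gene_A, whereas `intersection-strict` reports it as `__no_feature`. The read lying entirely inside two overlapping genes stays `__ambiguous` in all three modes.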

HTSeq-count

Probabilistic approach

Cufflinks

Cuffdiff

Probabilistic approach

Cufflinks: Reconstruct the transcripts from the data and annotation

Probabilistic approach

Cufflinks: Reconstruct the transcripts from the data and annotation

Cuffdiff: Assign each read/fragment to a transcript probabilistically, by maximum likelihood.

Probabilistic approach

Cufflinks: Reconstruct the transcripts from the data and annotation

Cuffdiff: Assign each read/fragment to a transcript probabilistically, by maximum likelihood.

Pros:
- Better methodology
- Integrated package (ease of use)

Cons:
- Does not support alternative experimental designs
- History of heterogeneous results/versions
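The idea of assigning reads to transcripts by maximum likelihood can be illustrated with a generic expectation-maximization (EM) loop in Python. This is a textbook-style sketch under strong simplifying assumptions (all transcripts have the same effective length, every read is compatible with at least one transcript, no sequencing bias); it is not the Cufflinks or Cuffdiff implementation, and the function name and toy data are hypothetical.

```python
def em_abundances(read_compat, n_iter=100):
    """Estimate relative transcript abundances by EM.

    read_compat is a list with one entry per read: the transcripts that
    read is compatible with (at least one per read).
    """
    transcripts = sorted({t for compat in read_compat for t in compat})
    # Start from equal abundances.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        expected = {t: 0.0 for t in transcripts}
        # E-step: split each read among its compatible transcripts in
        # proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[t] for t in compat)
            for t in compat:
                expected[t] += theta[t] / total
        # M-step: new abundances are the expected read fractions.
        theta = {t: expected[t] / len(read_compat) for t in transcripts}
    return theta


# Hypothetical toy data: 6 reads unique to T1, 2 unique to T2, 4 shared.
reads = [["T1"]] * 6 + [["T2"]] * 2 + [["T1", "T2"]] * 4
theta = em_abundances(reads)
```

For this input the estimates converge to 0.75 for T1 and 0.25 for T2, so each shared read is assigned to T1 with probability 0.75 rather than being discarded or split evenly.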

@@ Line 1: / Line 1: @@
-Estimating gene count
 = Estimating Gene Count =
 How many reads are overlapping genomic features?
@@ Line 17: / Line 15: @@
 = Multi mapping reads =
 * Unsolved problem:
-– Can account for 10-30% of reads
+:- this can account for 10-30% of reads
-GeneA – chr11
-GeneB – chr5
-– Ignore them … (decrease sensitivity)
+[[File:unsolved.png]]
-– Weighted assignment
-= Multi mapping reads =
+* Ignore them, but then again this decreases sensitivity
-* Unsolved problem:
+* Weighted assignment
-– Can account for 10-30% of reads
-GeneA – chr11
-GeneB – chr5
-Solution is to use longer reads
+Of course, longer reads would solve this problem.
-– Ignore them … (decrease sensitivity)
-– Weighted assignment
-= Transcripts/Genes =
+= One transcript, one set of reads =
-* Transcripts/Isoforms or Genes
-T1
-GeneA
-= Transcripts/Genes =
+[[File:t1.png]]
-* Transcripts/Isoforms or Genes
-T1
-T2
-GeneA
+= Two transcripts, another set of reads =
-T3
+[[File: t1t2.png
-= Transcripts/Genes =
+= Aggregation to Gene-level 1 =
-* Transcripts/Isoforms or Genes
-T1
-T2
-GeneA
+[[File:tt1t2aggreg.png]]
-= Transcripts/Genes =
+= Third transcript, another set of reads =
-* Transcripts/Isoforms or Genes
-T1
-T2
-GeneA
+[[File:t1t2t3.png]]
-T3
+= Aggregation to Gene-level 2 =
-= Transcripts/Genes =
+[[File:t1t2tt3aggreg.png]]
-* Transcripts/Isoforms or Genes
-T1
-T2
-GeneA
+= HTSeq-count =
-T3
-= Transcripts/Genes =
-* Transcripts/Isoforms or Genes
-T1
-T2
-GeneA
-T3
-= Transcripts/Genes =
-* Transcripts/Isoforms or Genes
-T1
-T2
-GeneA
-T3
+[[File:htseq.png]]
-Gene level is aggregating transcripts
-Transcript level needs longer reads
-= HTSeq-count =
 * Designed for RNA-Seq counting
@@ Line 103: / Line 55: @@
 = HTSeq-count =
+[[File:htcats.png]]
 = Probabilistic approach =

Difference between revisions of "Estimating Gene Count Talk"
