Revision as of 13:21, 9 May 2017

1 Estimating Gene Count
2 Multi mapping reads
3 One transcript, one set of reads
4 Two transcripts, another set of reads
5 Aggregation to Gene-level 1
6 Third transcript, another set of reads
7 Aggregation to Gene-level 2
8 HTSeq-count
9 HTSeq-count modes
10 Probabilistic approach
11 Probabilistic approach
12 Probabilistic approach
13 Probabilistic approach

Estimating Gene Count

How many reads overlap each genomic feature? Put differently: can we confidently assign each read to a feature, transcript, or gene? It is not that simple.

We also have:

Multi mapping reads
Overlapping genes/transcripts

Two approaches:

Focus on what’s known with certainty
Probabilistic

Multi mapping reads

An unsolved problem: multi-mapping reads can account for 10-30% of reads.

Options:

Ignore them, at the cost of reduced sensitivity
Weighted assignment

Of course, longer reads would solve this problem.
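
The two options above can be contrasted with a minimal sketch. It assumes the simplest possible weighting, an equal split of each read across the genes it maps to; real tools may weight differently. The function names and the `hits` data are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def unique_only_counts(read_hits):
    """Count only reads that map to exactly one gene (multi-mappers ignored)."""
    counts = defaultdict(float)
    for genes in read_hits.values():
        unique = set(genes)
        if len(unique) == 1:
            counts[next(iter(unique))] += 1.0
    return dict(counts)


def weighted_counts(read_hits):
    """Split each read equally across every gene it maps to.

    Assumption: a uniform 1/n weight per candidate gene.
    """
    counts = defaultdict(float)
    for genes in read_hits.values():
        unique = sorted(set(genes))
        if not unique:
            continue  # unmapped read, nothing to assign
        share = 1.0 / len(unique)
        for gene in unique:
            counts[gene] += share
    return dict(counts)


# Hypothetical toy input: read ID -> genes the read aligns to.
hits = {
    "r1": ["geneA"],
    "r2": ["geneA", "geneB"],
    "r3": ["geneB"],
    "r4": ["geneA", "geneB"],
}

ignored = unique_only_counts(hits)
weighted = weighted_counts(hits)
```

In this toy input, ignoring multi-mappers discards half of the reads, while the weighted version keeps the total number of counted reads equal to the number of mapped reads.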

One transcript, one set of reads

Two transcripts, another set of reads

Aggregation to Gene-level 1

Third transcript, another set of reads

Aggregation to Gene-level 2

HTSeq-count

Designed for RNA-Seq counting
Works at the gene level
Removes multi-mapped reads
Offers several modes to resolve the remaining uncertainty

HTSeq-count modes
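
As a rough illustration of how the overlap-resolution modes differ, the sketch below follows the rules described in the HTSeq documentation for the `union`, `intersection-strict` and `intersection-nonempty` modes. It is not HTSeq's own code: the `resolve` function and its input format (one set of gene IDs per aligned base of a read) are assumptions made for this example.

```python
def resolve(position_sets, mode="union"):
    """Decide which gene a read is counted for.

    position_sets: list with one set of gene IDs per aligned base
    (hypothetical input format for this sketch).
    """
    if mode == "union":
        # All genes touched anywhere along the read.
        features = set().union(*position_sets)
    elif mode == "intersection-strict":
        # Genes covering every base of the read.
        features = set.intersection(*position_sets) if position_sets else set()
    elif mode == "intersection-nonempty":
        # Genes covering every base that lies in at least one gene.
        nonempty = [s for s in position_sets if s]
        features = set.intersection(*nonempty) if nonempty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")

    if not features:
        return "__no_feature"
    if len(features) > 1:
        return "__ambiguous"
    return next(iter(features))


# Read hanging off the end of geneA into intergenic sequence.
overhang = [{"geneA"}, {"geneA"}, set()]
# Read starting in geneA and ending where geneA and geneB overlap.
overlap = [{"geneA"}, {"geneA", "geneB"}]

modes = ("union", "intersection-strict", "intersection-nonempty")
overhang_calls = [resolve(overhang, m) for m in modes]
overlap_calls = [resolve(overlap, m) for m in modes]
```

The two toy reads show the trade-off: `union` keeps the overhanging read but calls the overlapping one ambiguous, while `intersection-strict` does the opposite.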

Probabilistic approach

Cufflinks

Cuffdiff

Probabilistic approach

Cufflinks: Reconstruct the transcripts from the data and annotation

Probabilistic approach

Cufflinks: Reconstruct the transcripts from the data and annotation

Cuffdiff: Assign each read/fragment to a transcript probabilistically, by maximum likelihood.

Probabilistic approach

Cufflinks: Reconstruct the transcripts from the data and annotation

Cuffdiff: Assign each read/fragment to a transcript probabilistically, by maximum likelihood.

Pros:

- Better methodology
- Integrated package (ease of use)

Cons:

- Does not support alternative experimental designs
- History of heterogeneous results/versions
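
To make the idea of maximum-likelihood read assignment concrete, here is a minimal expectation-maximization sketch. It is a generic textbook-style EM under strong simplifications (no transcript lengths, no fragment-size or bias model) and is not the algorithm as implemented in Cufflinks or Cuffdiff. The function `em_abundances` and the toy read sets are hypothetical.

```python
def em_abundances(compat, n_iter=200):
    """Estimate relative transcript abundances by EM.

    compat: list with one set of compatible transcript IDs per read.
    Assumption: every transcript has the same effective length.
    """
    reads = [c for c in compat if c]  # drop reads with no compatible transcript
    transcripts = sorted(set().union(*reads))
    theta = {t: 1.0 / len(transcripts) for t in transcripts}

    for _ in range(n_iter):
        expected = {t: 0.0 for t in transcripts}
        # E-step: split each read across its transcripts in proportion to theta.
        for c in reads:
            total = sum(theta[t] for t in c)
            for t in c:
                expected[t] += theta[t] / total
        # M-step: new abundance is the expected share of reads.
        theta = {t: expected[t] / len(reads) for t in transcripts}
    return theta


# Toy data: 6 reads unique to t1, 2 unique to t2, 4 compatible with both.
toy_reads = [{"t1"}] * 6 + [{"t2"}] * 2 + [{"t1", "t2"}] * 4
theta = em_abundances(toy_reads)
```

With these toy reads the shared fragments end up split 3:1 in favour of `t1`, mirroring the ratio of uniquely assigned reads, rather than being split evenly or discarded.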


Difference between revisions of "Estimating Gene Count Talk"
