Difference between revisions of "Bedtools"

Revision as of 17:17, 1 December 2016

Introduction

bedtools is a general-purpose bioinformatics toolkit built, and actively maintained, by Aaron Quinlan at the University of Utah.

The focus of bedtools is genomic features, which can be variously defined.

Usage

bedtools is made up of various subcommands, which can be dealt with separately.

genomecov

The example in the documentation (bedtools genomecov docs: http://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html) isn't very well explained, so let's go over it. Note that the input in this case is a BED file, not the more common BAM file.

$ cat A.bed
chr1  10  20
chr1  20  30
chr2  0   500

$ cat my.genome
chr1  1000
chr2  500

$ bedtools genomecov -i A.bed -g my.genome
chr1   0  980  1000  0.98
chr1   1  20   1000  0.02
chr2   1  500  500   1
genome 0  980  1500  0.653333
genome 1  520  1500  0.346667

BED files give, at a minimum, a chromosome or contig name plus start and end positions, and that minimum is all we have here. Start coordinates are 0-based and end coordinates are exclusive, so the A.bed file merely states that a feature, here an arbitrary one, exists at positions 10 to 19 of chromosome 1, and another at positions 20 to 29. They are presumably distinct features, because otherwise

chr1 10 30

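would have also sufficed. The chr2 line is self-explanatory: it covers the whole chromosome. The my.genome file merely gives the total length of each chromosome. From this, genomecov can get to work, though it reports aggregate values for each chromosome, not positional values. Each output line gives the chromosome, a coverage depth, the number of bases at that depth, the chromosome length, and the resulting fraction. So the output says that there is zero coverage in 98% of chromosome 1 (980 of 1000 bases), and it lumps the two separately listed features on chromosome 1 together, so that they represent 2% of chromosome 1 at depth 1 (20 bases). The genome lines repeat the same summary over both chromosomes combined: 980 of 1500 bases have no coverage and 520 are covered once.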
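
To make the arithmetic concrete, here is a minimal Python sketch that rebuilds the same histogram from the two input files above. It is only an illustration of the counting, not how bedtools itself is implemented: the function name genomecov_histogram is made up for this page, the inputs are typed in by hand rather than parsed from files, and the one-counter-per-base approach is only sensible for toy genomes like this one.

```python
from collections import Counter

def genomecov_histogram(bed_intervals, genome_sizes):
    """Hypothetical helper: depth histogram per chromosome, then genome-wide.

    bed_intervals: list of (chrom, start, end), 0-based start, exclusive end.
    genome_sizes:  dict mapping chrom -> length, in my.genome order.
    Returns rows of (name, depth, bases_at_depth, total_length, fraction).
    """
    rows = []
    genome_counts = Counter()
    genome_total = sum(genome_sizes.values())
    for chrom, size in genome_sizes.items():
        depth = [0] * size  # one counter per base (toy genomes only)
        for c, start, end in bed_intervals:
            if c == chrom:
                for pos in range(start, min(end, size)):
                    depth[pos] += 1
        counts = Counter(depth)  # depth -> number of bases at that depth
        genome_counts.update(counts)
        for d in sorted(counts):
            rows.append((chrom, d, counts[d], size, counts[d] / size))
    for d in sorted(genome_counts):
        rows.append(("genome", d, genome_counts[d], genome_total,
                     genome_counts[d] / genome_total))
    return rows

# The contents of A.bed and my.genome from the example above.
a_bed = [("chr1", 10, 20), ("chr1", 20, 30), ("chr2", 0, 500)]
my_genome = {"chr1": 1000, "chr2": 500}

rows = genomecov_histogram(a_bed, my_genome)
for name, d, bases, total, frac in rows:
    print(f"{name}\t{d}\t{bases}\t{total}\t{frac:g}")
```

The two chr1 lines of A.bed each contribute 10 bases at depth 1, which is why they end up as a single "chr1 1 20" row. chr2 has no depth-0 row because none of its bases are uncovered.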