Calculating coverage
Introduction
Coverage can be calculated in several different ways and this pages aims to describe a few of them
Bedtools genomecov
This is one of the most commonly used methods and requires loading of the bedtools module, also used in the callSNPs.py tool. The base of the command is
bedtools genomecov -bga -ibam <bamfile>
Explanation
in a script with AWK
Here is an example script for using genomecov with AWK to calculate coverage
#!/bin/bash EXPECTED_ARGS=2 if [ $# -ne $EXPECTED_ARGS ]; then echo "This script uses awk to calculate average coverage in a bamfile. It needs $EXPECTED_ARGS arguments" echo "Correct usage: $0 <subfolder_with_all_the_bamfiles>> <minimum_coverage_value>" exit fi module load bedtools O=$(find $1 -iname "*.bam") ARRAY=( $O ) for BF in ${ARRAY[@]}; do echo -n "$BF: " bedtools genomecov -bga -ibam $BF |awk -v t=$2 '{if($4>=t) {j=$3-$2+1;i=i+j;SUM=SUM+$4*j;}} END {print SUM/(i+0.0);}' done