Revision as of 11:18, 15 February 2017

Introduction

Heng Li, the originator of samtools, wrote this aligner.

Usage

As with samtools, bwa went through some restructuring, so it has an old-style two-step usage (aln followed by sam{s,p}e), characterised by the following typical sequence of commands:

bwa index reference.fa
bwa aln -I -t 8 reference.fa s_1.txt > out.sai
bwa samse reference.fa out.sai s_1.txt > out.sam
samtools view -bSu out.sam | samtools sort -  out.sorted
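
The sequence above handles a single read file with samse. For paired reads, the old-style usage runs aln once per file and then combines both .sai files with sampe. The following is a minimal sketch of that variant; aln_paired, the BWA override variable and the output naming are assumptions made for illustration, and the -I switch from the example above is left out because it depends on the quality encoding of the reads.

```shell
# Sketch of the old-style two-step alignment for one pair of read files.
# aln_paired is a hypothetical helper, not part of bwa.
# Set BWA to use a different bwa binary (assumption: defaults to "bwa" on PATH).
aln_paired() {
    local ref="$1" fq1="$2" fq2="$3" out="$4"   # out: prefix for output files
    local bwa="${BWA:-bwa}"
    # Step 1: one aln run per read file, each producing a .sai file.
    "$bwa" aln -t 8 "$ref" "$fq1" > "${out}_1.sai" &&
    "$bwa" aln -t 8 "$ref" "$fq2" > "${out}_2.sai" &&
    # Step 2: sampe takes both .sai files, then both read files, in the same order.
    "$bwa" sampe "$ref" "${out}_1.sai" "${out}_2.sai" "$fq1" "$fq2" > "${out}.sam"
}

# Example call (file names are placeholders):
# aln_paired reference.fa s_1_1.txt s_1_2.txt out
```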

The more modern usage consists of just one step: bwa mem.

Note that the reference and the reads take no option switches: the prefix for the reference must come first, and the reads must come second.

Also note that if there are several readsets, it is best to align each one individually and obtain a separate sam/bam file for each (the two files of a pair count as one readset).

Canonical usage:

bwa mem [options] index_prefix [input_reads.fastq|input_reads_pair_1.fastq input_reads_pair_2.fastq]
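
The advice about aligning each readset on its own can be sketched as a small shell helper that is called once per readset, with either one read file (single reads) or two (a pair). The function name align_readset, the BWA override variable and the file names in the example calls are assumptions for illustration only.

```shell
# Sketch: run bwa mem for one readset and write one SAM file for it.
# align_readset is a hypothetical helper, not part of bwa.
# Set BWA to use a different bwa binary (assumption: defaults to "bwa" on PATH).
align_readset() {
    local prefix="$1" out_sam="$2"
    shift 2
    # After the shift, "$@" holds one read file (single) or two (paired).
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "expected 1 or 2 read files, got $#" >&2
        return 1
    fi
    # Index prefix first, reads second, as in the canonical usage.
    "${BWA:-bwa}" mem "$prefix" "$@" > "$out_sam"
}

# Example calls, one per readset (file names are placeholders):
# align_readset index_prefix lane1.sam lane1_1.fastq lane1_2.fastq
# align_readset index_prefix lane2.sam lane2.fastq
```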

Indexing

Before alignment, the reference must be indexed. By default, bwa uses the whole filename of the reference as the index prefix and writes index files with extensions appended to that name. A different prefix can be set with the -p option:

bwa index -p index_prefix input_reference.fasta

The index prefix (the reference filename itself if -p is not given) is then used for the actual alignment step.

Indexing may take a long time with large reference files. Here is some example output, from a run that took about 90 minutes (5399 seconds of real time):

[bwa_index] Pack FASTA... 150.69 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=4745806978, availableWord=345932252
[BWTIncConstructFromPacked] 10 iterations done. 99999986 characters processed.
[BWTIncConstructFromPacked] 20 iterations done. 199999986 characters processed.
.
.
.
[BWTIncConstructFromPacked] 530 iterations done. 4719199682 characters processed.
[BWTIncConstructFromPacked] 540 iterations done. 4741303618 characters processed.
[bwt_gen] Finished constructing BWT in 543 iterations.
[bwa_index] 2470.36 seconds elapse.
[bwa_index] Update BWT... 129.37 sec
[bwa_index] Pack forward-only FASTA... 302.45 sec
[bwa_index] Construct SA from BWT and Occ... 1220.18 sec
[main] Version: 0.7.12-r1039
[main] CMD: bwa index unplaced.scaf.fa
[main] Real time: 5398.987 sec; CPU: 4273.064 sec

The output files will be, in this case:

unplaced.scaf.fa.amb, a text file
unplaced.scaf.fa.ann, a text file
unplaced.scaf.fa.bwt, a binary file
unplaced.scaf.fa.pac, a binary file
unplaced.scaf.fa.sa, a binary file
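
Because indexing is slow, it can be worth checking that all five files are present before starting an alignment. The following is a minimal sketch of such a check; bwa_index_complete is a hypothetical helper name, and the check only tests that the files exist, not that they are valid.

```shell
# Sketch: check that the five index files for a given prefix exist.
# bwa_index_complete is a hypothetical helper, not part of bwa.
bwa_index_complete() {
    local prefix="$1" ext
    for ext in amb ann bwt pac sa; do
        if [ ! -f "${prefix}.${ext}" ]; then
            echo "missing index file: ${prefix}.${ext}" >&2
            return 1
        fi
    done
    return 0
}

# Example call, using the prefix from the output above:
# bwa_index_complete unplaced.scaf.fa || bwa index unplaced.scaf.fa
```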

Actual alignment step

bwa mem index_prefix input_reads_pair_1.fastq input_reads_pair_2.fastq

In this case we have paired read-files. With single reads of course only one name would be required.


Difference between revisions of "Bwa"
