A file format that records the variants shown by reads relative to the reference they are aligned to. Most often it covers multiple samples, and the nature of each variant is described in columns dedicated to each sample.
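As a rough illustration of the per-sample column layout, here is a minimal Python sketch that splits one data line into its fixed fields and its sample columns. The header and record lines, the sample names `sampleA` and `sampleB`, and the helper `parse_record` are all invented for this example; real files are better read with a dedicated VCF library.

```python
# Hypothetical two-sample header and data line (tab-separated); all values are invented.
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB"
RECORD = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30\tGT:DP\t0/1:14\t1/1:16"


def parse_record(header_line, record_line):
    """Split one VCF data line into fixed fields and per-sample fields."""
    columns = header_line.lstrip("#").split("\t")
    values = record_line.split("\t")

    # The first eight columns (CHROM .. INFO) describe the variant itself.
    fixed = dict(zip(columns[:8], values[:8]))

    # Column 9 (FORMAT) names the colon-separated keys used in every sample column.
    format_keys = values[8].split(":")

    # Columns 10 onwards hold one entry per sample.
    samples = {
        name: dict(zip(format_keys, field.split(":")))
        for name, field in zip(columns[9:], values[9:])
    }
    return fixed, samples


fixed, samples = parse_record(HEADER, RECORD)
print(fixed["REF"], fixed["ALT"])
print(samples["sampleA"])
```

Each sample ends up with its own small dictionary keyed by the FORMAT entries, which mirrors how the columns are laid out in the file.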

BCF is the binary (and typically compressed) version of the file format.

It used to be maintained by the 1000 Genomes Project. The latest version, 4.2, and its specification are hosted on the samtools website.


  • DP refers to the overall read depth at the locus/position, without taking base quality into account
  • DP4 refers to 4 numbers,