SNP Analysis and phylogenetics
Phylogenies have typically been inferred from gene sequences in which both variant and invariant sites are included, and this is very useful for branch length estimation.
When constructing trees from SNP analyses, we have only the variant sites.
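As a rough illustration of the difference, the sketch below splits a toy alignment into variant and invariant columns and compares a simple proportion-of-differences distance on the full alignment with the same distance on the SNP-only alignment. The sample names, sequences and helper functions (`split_sites`, `p_distance`) are made up for this example; it is not taken from any particular pipeline.

```python
def split_sites(alignment):
    """Return (variant, invariant) lists of 0-based column indices."""
    sequences = list(alignment.values())
    variant, invariant = [], []
    for i in range(len(sequences[0])):
        column = {seq[i] for seq in sequences}
        # A column is variant if more than one base is observed in it.
        (variant if len(column) > 1 else invariant).append(i)
    return variant, invariant


def p_distance(seq_a, seq_b):
    """Proportion of sites at which two equal-length sequences differ."""
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Hypothetical toy alignment: 10 sites, 2 of which vary.
alignment = {
    "sample_a": "ACGTACGTAC",
    "sample_b": "ACGTACGAAC",
    "sample_c": "ACCTACGAAC",
}

variant, invariant = split_sites(alignment)

# SNP-only alignment: keep just the variant columns.
snp_only = {
    name: "".join(seq[i] for i in variant)
    for name, seq in alignment.items()
}

full = p_distance(alignment["sample_a"], alignment["sample_b"])
snps = p_distance(snp_only["sample_a"], snp_only["sample_b"])
print(f"variant columns: {variant}")
print(f"distance on full alignment: {full}")
print(f"distance on SNP-only alignment: {snps}")
```

In this toy case the same single difference gives a distance of 0.1 over all ten sites but 0.5 over the two variant sites, which is the kind of inflation that discarding invariant sites can introduce.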
Andrew Rambaut's suggestion
Andrew Rambaut suggests introducing the number of invariant sites in the form of counts for each of the four bases. His post is here.
It mentions that, with variant sites alone, the tree topology will probably be correct, but that the branch lengths may be unreliable and most likely on the long side.
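One way such per-base counts might be obtained is sketched below. It assumes that a reference sequence is available, that SNP positions are given as 0-based indices into it, and that every other A/C/G/T position can be treated as invariant; this ignores regions that could not be called, so it is an approximation. The function name `invariant_base_counts` and the toy data are hypothetical.

```python
from collections import Counter


def invariant_base_counts(reference, snp_positions):
    """Count A, C, G and T at reference positions that are not SNPs.

    Assumes 0-based SNP positions; ambiguous bases such as N are skipped.
    """
    snps = set(snp_positions)
    counts = Counter(
        base
        for i, base in enumerate(reference.upper())
        if i not in snps and base in "ACGT"
    )
    # Always report all four bases, even when a count is zero.
    return {base: counts.get(base, 0) for base in "ACGT"}


# Hypothetical toy reference with SNPs at 0-based positions 2 and 7.
reference = "ACGTACGTAC"
counts = invariant_base_counts(reference, [2, 7])
print(counts)  # {'A': 3, 'C': 3, 'G': 1, 'T': 1}
```

The four counts can then be passed to whichever tree-building program is used, in whatever form it accepts constant-site counts.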
- "Human Y-chromosome variation in the genome-sequencing era" by Mark A. Jobling and Chris Tyler-Smith.
- Bakker et al. in their paper "A Whole-Genome Single Nucleotide Polymorphism-Based Approach To Trace and Identify Outbreaks Linked to a Common Salmonella enterica subsp. enterica Serovar Montevideo Pulsed-Field Gel Electrophoresis Type (div-cross)" discuss using nucmer and show-snps from the MUMmer package.