Bottlenose dolphin population genomic analysis

From wiki
Revision as of 00:10, 31 December 2016 by Rf (talk | contribs)
Jump to: navigation, search

Introduction

Used for population genetics studies.

Stages

Quality trimming

Using trimmomatic

MITOREFLOC=/storage/home/users/ml228/Genomics/NEA_BGI_data_ALL/NEA_BGI_clean_raw_data/Refs/mitogenome/gi_557468684_gb_KF570351.1_.fasta

  • FQN1, the forward paired output from Trimmomatic
  • FQN2, the reverse paired output from Trimmomatic
  • UBAM, the unmapped bam reflecting exclusion of mitochondiral mappings

MTBAM=${SSD1}/${MDN}/${SSD1}_${SRN1}_${SLN1}_to_mt_sorted.bam # the alignment to mito, sorted and now in bam format. MTSAM=${SSD1}/${MDN}/mito_aln-pe${SSD1}_${SRN1}_${SLN1}.sam # the raw sam alignment to mito, unsorted TFQ1=${SSD1}/${SDN}/${SSD1}_${SRN1}_${SLN1}_forward_paired.fq.gz # trimmed fq data file, forward paired reads TFQ2=${SSD2}/${SDN}/${SSD2}_${SRN2}_${SLN2}_reverse_paired.fq.gz # trimmed fq data file, reverse paired reads ACPTSZ=1000000 # acceptable size of a bam, a somewhat (only) risky way of seeing that it's OK.

REABAM=${SSD1}/${MOD}/aln-pe${SSD1}_${SRN1}_${SLN1}_sorted.bam REASAM=${SSD1}/${MOD}/aln-pe${SSD1}_${SRN1}_${SLN1}.sam

First we align to the mitochondrial reference:

bwa mem -t $NSLOTS $MITOREFLOC $TFQ1 $TFQ2 > $MTSAM

The resulting sam is sorted and converted to bam format:

samtools view -bSh $MTSAM | samtools sort - ${MTBAM%.*} -@ $NSLOTS

Actually we're only interested in the reads that did not map to the mitochondrial, and we can extract them into the their own bam with:

samtools view -f4 -bh $MTBAM > $UBAM -@ $NSLOTS

We can convert this extracted alignment back into FASTQ files with picard-tools and then compress as follows:

java -Xmx16g -jar $PICARDJARFILE SamToFastq I=${UBAM} F=${FQN1%.*} F2=${FQN2%.*}
pigz -f -p $NSLOTS ${FQN1%.*}
pigz -f -p $NSLOTS ${FQN2%.*}

These new reads can be aligned to the true reference

bwa mem -t $NSLOTS $REFLOC $FQN1 $FQN2 > $REASAM

The raw sam can be sorted and converted to bam to produce the alignment with the reads mapping to teh mitochondiral reference excluded as follows:

samtools view -bSh ${REASAM} | samtools sort - ${REABAM%.*} -@ $NSLOTS