Pyrad

Introduction

pyrad is software for assembling RADseq data. Running

module load pyrad

loads pyrad together with the software it depends on, mainly muscle and vsearch.

Guides

The go-to tutorial for this is at:

(Note that the tutorial assumes the executable is named pyRAD, whereas the executable name here contains only lower-case letters: pyrad.)

Highlights

  • With the pyrad module loaded, the "pyrad" executable is immediately available on the command line.
  • One of the "-p", "-d", "-D" or "-n" options is required.
  • "pyrad -n" generates a "params.txt" file which contains the settings for the run. A large part of the analysis can be configured by editing this file.
  • If using a Grid Engine job script, two parameters are important: no. 7 "N processors" and no. 37 "vsearch max threads". Multiply these together and supply the result to the "-pe multi" option in the job script. This requests the parallel environment, i.e. the total number of threads/cores that pyrad will use.
  • The first stage is de-multiplexing your short read dataset using the barcodes, a process which generates new fastq files according to the barcodes. This is done via:
pyrad -p params.txt -s 1
  • The second stage edits the raw reads for quality, after which they are converted to fasta. This is done via:
pyrad -p params.txt -s 2
  • The third stage de-replicates the short reads (collapsing identical reads into a single representative sequence) and then clusters them. This is done via:
pyrad -p params.txt -s 3
  • As the pattern shows, the "-s" option selects the stage, and "params.txt" must always be supplied via "-p".
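
The repeated "pyrad -p params.txt -s N" pattern can be wrapped in a small shell function. The following is only a sketch: the function name run_steps and the DRY_RUN variable are hypothetical helpers, not part of pyrad. By default it only prints the commands, so it can be tried without the module loaded.

```shell
# run_steps and DRY_RUN are hypothetical helpers, not part of pyrad.
# Usage: run_steps <params file> <step> [<step> ...]
run_steps() {
    params="$1"
    shift
    for step in "$@"; do
        cmd="pyrad -p ${params} -s ${step}"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): only show what would be executed.
            echo "$cmd"
        else
            # Real run: stop at the first stage that fails.
            # Unquoted on purpose so the string splits into arguments;
            # this assumes the params file name contains no spaces.
            $cmd || return 1
        fi
    done
}

# Show the commands for stages 1 to 3, in order.
run_steps params.txt 1 2 3
```

Setting DRY_RUN=0 before calling the function would execute the stages for real, one after the other.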

The rest of the tutorial describes stages 4 to 7 of the procedure.
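
For the Grid Engine point in the highlights, the slot count is simply the product of the two parameters. A minimal sketch of the arithmetic follows; the values 4 and 6 are made-up examples and should be replaced with whatever is set in your own params.txt.

```shell
# Example values only; copy the real ones from your params.txt.
n_processors=4       # parameter 7, "N processors"
vsearch_threads=6    # parameter 37, "vsearch max threads"

# Total number of slots to request from Grid Engine.
slots=$((n_processors * vsearch_threads))

# Print the line to place in the job script.
echo "#\$ -pe multi ${slots}"
```

With these example values the script prints "#$ -pe multi 24".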