Software for RADseq
module load pyrad
will load the necessary software, mainly muscle and vsearch.
The go-to tutorial for this is at:
(note that this tutorial assumes the executable is called pyRAD, whereas here the executable name contains only lower-case letters).
- with the pyrad module loaded, the "pyrad" executable is immediately available on the command line.
- one of the "-p", "-d", "-D" and "-n" options is required.
- "pyrad -n" generates a "params.txt" file which contains settings for the run. A large part of the analysis can be configured by editing this file.
- If using a Grid Engine job script, there are two important parameters: no. 7 "N processors" and no. 37 "vsearch max threads". Multiply these together and supply the result to the "-pe multi" option in the job script. This option requests the parallel environment, i.e. the total number of threads/cores that pyrad will use.
- The first stage is de-multiplexing your short read dataset using the barcodes, a process which generates new fastq files according to the barcodes. This is done via:
pyrad -p params.txt -s 1
- The second stage is editing the raw reads for quality, whereupon they are converted to fasta. This is done via:
pyrad -p params.txt -s 2
- The third stage de-replicates the short reads (i.e. collapses identical reads into a single representative sequence) and then clusters them. This is done via:
pyrad -p params.txt -s 3
- the pattern is that the "-s" option selects the stage, and "params.txt" must always be passed via "-p".
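The slot calculation for the Grid Engine job script can be sketched in shell as follows. This is only an illustration: the helper name total_slots and the values 4 and 6 are made up for the example and are not taken from any real params.txt; substitute whatever your own parameters 7 and 37 are set to.

```shell
#!/bin/sh
# Hypothetical helper (not part of pyrad): multiply the two params.txt
# settings to get the slot count to request from Grid Engine.
#   $1 = value of parameter 7  ("N processors")
#   $2 = value of parameter 37 ("vsearch max threads")
total_slots() {
    echo $(( $1 * $2 ))
}

# Example values only; read the real ones from your own params.txt.
N_PROCESSORS=4
VSEARCH_THREADS=6

SLOTS=$(total_slots "$N_PROCESSORS" "$VSEARCH_THREADS")
echo "request: -pe multi $SLOTS"
```

With these example values the job script would carry the directive line "#$ -pe multi 24".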
The rest of the tutorial describes stages 4 to 7 of the procedure.
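Since every stage uses the same command with only the "-s" value changing, stages 1 to 3 could be run from a small wrapper such as the sketch below. The function name run_stages and the PYRAD variable are hypothetical conveniences introduced here, not part of pyrad; the only pyrad options used are the "-p" and "-s" shown above. The wrapper stops at the first stage that fails, so a later stage never runs on incomplete output.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of pyrad): run the given stages in order
# and stop at the first failure.
#   $1    = params file
#   $2... = stage numbers, e.g. 1 2 3
# PYRAD may be set to override the executable name; it defaults to "pyrad".
run_stages() {
    params=$1
    shift
    for stage in "$@"; do
        echo "running stage $stage" >&2
        "${PYRAD:-pyrad}" -p "$params" -s "$stage" || {
            echo "stage $stage failed, stopping" >&2
            return 1
        }
    done
}

# Usage (with the pyrad module loaded):
#   run_stages params.txt 1 2 3
```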