Pyrad
Latest revision as of 17:39, 20 April 2016
Introduction
Software for RADseq
module load pyrad
will load the necessary software, mainly muscle and vsearch.
Guides
The go-to tutorial for this is at:
(Note that this tutorial assumes the executable is named pyRAD, whereas the executable name here contains only lower-case letters.)
Highlights
- With the pyrad module loaded, the "pyrad" executable is immediately available on the command line.
- One of the "-p", "-d", "-D" and "-n" options is required.
- "pyrad -n" generates a "params.txt" file which contains settings for the run. A large part of the analysis can be configured by editing this file.
- If using a Gridengine job script, two parameters are important: no. 7 "N processors" and no. 37 "vsearch max threads". Multiply these together and supply the result to the "-pe multi" option in the Gridengine job script. This sets the parallel environment, i.e. the total number of threads/cores that pyrad will use.
- The first stage is de-multiplexing your short read dataset using the barcodes, a process which generates new fastq files according to the barcodes. This is done via:
pyrad -p params.txt -s 1
- The second stage is editing the raw reads for quality, whereupon they are converted to fasta. This is done via:
pyrad -p params.txt -s 2
- The third stage de-replicates the short reads (collapses identical reads into a single representative sequence) and clusters them. This is done via:
pyrad -p params.txt -s 3
- As the pattern shows, the "-s" option selects the stage, and "params.txt" must always be passed via "-p".
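As a minimal sketch of the Gridengine arithmetic mentioned in the highlights, the slot count for "-pe multi" could be worked out as follows. The values 4 and 2 are purely illustrative and are not taken from any real params.txt; the variable names are hypothetical.

```shell
#!/bin/sh
# Illustrative values only; read the real ones from your own params.txt.
N_PROCESSORS=4         # params.txt no. 7, "N processors" (assumed value)
VSEARCH_MAX_THREADS=2  # params.txt no. 37, "vsearch max threads" (assumed value)

# Total slots = number of processes x threads per process.
SLOTS=$((N_PROCESSORS * VSEARCH_MAX_THREADS))

# Print the directive line to place near the top of the Gridengine job script.
echo "#\$ -pe multi ${SLOTS}"
```

With these example values the script prints a directive requesting 8 slots. If either parameter is changed in params.txt, the directive should be updated to match.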
The rest of the tutorial describes stages 4 to 7 of the procedure.
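The first three stages can also be chained in one script. The sketch below is a dry run: it only prints the commands, so it can be tried without pyrad installed. The names RUN, PARAMS, STAGES and run_stages are hypothetical. Stopping at the first failure assumes that each stage depends on the output of the previous one.

```shell
#!/bin/sh
# Dry-run sketch: prints the pyrad commands instead of executing them.
# Set RUN="" on a system where the pyrad module is loaded to run them for real.
RUN="echo"
PARAMS="params.txt"   # file generated by "pyrad -n" and edited for the run
STAGES="1 2 3"        # the stages covered in the highlights above

run_stages() {
    for s in $STAGES; do
        # Stop at the first failing stage (assumes later stages need earlier output).
        $RUN pyrad -p "$PARAMS" -s "$s" || return 1
    done
}

run_stages
```

Presumably stages 4 to 7 could be added to STAGES in the same way, since "-s" selects the stage, but check this against the tutorial first.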