Miseq Prokaryote FASTQ analysis
Revision as of 16:24, 18 July 2016
Introduction
We have our own MiSeq machine, and each week a run is carried out on various samples (each run costs about £1000 in consumables).
The results are uploaded onto HDRIVE.
Procedure
- Go into the marvin scratch area and create a new directory named after the date of the run.
- Make sure you have mounted the HDRIVE onto marvin.
- In the new scratch directory you've just created, create symlinks to the files in your mounted directory. An example is:
for i in /storage/home/users/ramon/mnt/miseqda/2016-07-15_160715_M01714_0021_000000000-ANWN5/*.fastq.gz; do ln -s "$i" .; done
- Then we can run a quality control program, the most widely used one being FastQC. The following script runs FastQC on each pair of FASTQ files, with the pairs processed in parallel.
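#!/bin/bash
# SGE options: run in the current directory, merge stderr into stdout,
# use bash, export the environment, and submit to the marvin.q queue.
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -V
#$ -q marvin.q
# some quick "argument accounting"
EXPECTED_ARGS=1 # change value to suit!
if [ $# -ne $EXPECTED_ARGS ]; then
    echo "error, this script should be fed with one argument: a filelist of fastq(.gz) files" >&2
    exit 1
fi
module load FASTQC
# Read the line of the filelist that corresponds to this array task;
# each line is expected to hold the R1 and R2 files of one pair.
N=( $(sed -n "${SGE_TASK_ID}p" "$1") )
R1=${N[0]}
R2=${N[1]}
# echo "fastqc $R1 $R2"
fastqc "$R1" "$R2"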
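The script above takes line number `SGE_TASK_ID` from the filelist and treats the first two whitespace-separated fields as the R1 and R2 files, so the filelist needs one pair per line. One possible way to build such a list is sketched below. It assumes the paired files differ only in `_R1_` versus `_R2_` (the usual Illumina naming), and the names `make_filelist`, `filelist.txt` and `fastqc_pairs.sh` are hypothetical, not part of the original procedure.

```shell
#!/bin/bash
# Build a filelist with one "R1 R2" pair per line from the current directory.
# Assumption: paired files differ only in _R1_ vs _R2_ (usual Illumina naming).
make_filelist() {
    local r1 r2
    for r1 in *_R1_*.fastq.gz; do
        [ -e "$r1" ] || continue      # skip if the glob matched nothing
        r2=${r1/_R1_/_R2_}            # derive the name of the R2 mate
        if [ -e "$r2" ]; then
            echo "$r1 $r2"
        else
            echo "warning: no R2 mate for $r1" >&2
        fi
    done
}

make_filelist > filelist.txt

# The script could then be submitted as an SGE array job with one task
# per line of the filelist (script name is hypothetical), for example:
# qsub -t 1-$(wc -l < filelist.txt) fastqc_pairs.sh filelist.txt
```

Run this from inside the scratch directory holding the symlinks. Files without a mate are reported on stderr and left out of the list, so every array task receives a complete pair.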