Main Page
KENNEDY HPC for Bioinf community
Contents
Usage of Cluster
- Cluster Manual
- Kennedy manual
- Why a Queue Manager?
- Available Software
- how to use the cluster training course
- windows network connect
Documented Programs
The following are extra notes on using these programs on the marvin cluster, with an emphasis on example use-cases. Most, if not all, have their own websites with more detailed manuals and further information.
* abacas | * albacore | * ariba | * aspera | * assembly-stats | * augustus | |
* BamQC | * bamtools | * banjo | * bcftools | * bedtools | * bgenie | |
* BLAST | * Blat | * blast2go: b2g4pipe | * bowtie | * bowtie2 | * bwa | |
* BUSCO | * CAFE | * canu | * cd-hit | * cegma | * clustal | |
* cramtools | * conda | * deeptools | * detonate | * diamond | * ea-utils | * ensembl |
* ETE | * FASTQC and MultiQC | * Archaeopteryx and Forester | * GapFiller | * GenomeTools | * gubbins | |
* JBrowse | * kallisto | * kentUtils | * last | * lastz | * macs2 | |
* Mash | * mega | * meryl | * MUMmer | * NanoSim | * nseq | |
* OrthoFinder | * PASA | * perl | * PGAP | * picard-tools | * poRe | |
* poretools | * prokka | * pyrad | * python | * qualimap | * quast | |
* qiime2 | * R | * RAxML | * Repeatmasker | * Repeatmodeler | * rnammer | |
* roary | * RSeQC | * samtools | * Satsuma | * sickle | * SPAdes | |
* squid | * sra-tools | * srst2 | * SSPACE | * stacks | * Thor | |
* Tophat | * trimmomatic | * Trinity | * t-coffee | * Unicycler | * velvet | |
* ViennaRNA |
Queue Manager Tips
A cluster is a shared resource, with different users running different types of analyses. Nearly all clusters use a piece of software called a queue manager to share the resource out fairly. The queue manager on marvin is Grid Engine. Its commands all begin with q; the most commonly used is qsub, which submits a command, wrapped in a jobscript, for processing. Here are some tips:
- Queue Manager Tips
- Queue Manager : shell script command
- Queue Manager emailing when jobs run
- General Command-line Tips
- DRMAA for further Gridengine automation
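As a rough illustration of what "submitting a command via a jobscript" means, the sketch below builds the text of a minimal Grid Engine jobscript in Python. The `#$` lines use standard qsub options (`-N`, `-cwd`, `-V`, `-pe`, `-M`, `-m`). The function name `render_jobscript`, the job name, the command, the email address and the parallel environment name `smp` are assumptions made for illustration only; the parallel environments configured on marvin may be named differently.

```python
def render_jobscript(command, job_name, slots=1, email=None, parallel_env="smp"):
    """Return the text of a minimal Grid Engine jobscript wrapping `command`.

    `parallel_env` is a hypothetical default: parallel environment names
    are site-specific, so check the cluster's own configuration.
    """
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",  # job name, as listed by qstat
        "#$ -cwd",            # run the job from the submission directory
        "#$ -V",              # pass the current environment to the job
    ]
    if slots > 1:
        # request several slots in the (assumed) parallel environment
        lines.append(f"#$ -pe {parallel_env} {slots}")
    if email:
        lines.append(f"#$ -M {email}")  # address to notify
        lines.append("#$ -m bea")       # mail at begin, end and abort
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical example: a 4-slot job with email notification.
script_text = render_jobscript(
    "samtools --version", "demo_job", slots=4, email="user@example.org"
)
print(script_text)
```

Saving the printed text to a file such as `demo_job.sh` (a hypothetical name) and running `qsub demo_job.sh` on the frontend would hand the job to the queue manager.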
Data Examples
Procedures
(A short sequence of tasks with a specific short-term goal, often a simple script)
Explanations
Pipelines
(Workflow with specific end-goals)
- Trinity_Protocol
- STAR BEAST
- callSNPs.py
- pairwiseCallSNPs
- mapping.py
- Edgen RNAseq
- Miseq Prokaryote FASTQ analysis
- snpcallphylo
- Bottlenose dolphin population genomic analysis
- ChIP-Seq Top2 in Yeast
- ChIP-Seq Top2 in Yeast 12.09.2017
- ChIP-Seq Top2 in Yeast 07.11.2017
- Bisulfite Sequencing
- microRNA and Salmo Salar
Protocols
(Extensive workflows with several possible end goals)
- Synthetic Long reads
- MinION (Oxford Nanopore)
- MinKNOW folders and log files
- Research Data Management
- MicroRNAs
Tech Reviews
Cluster Administration
- StABDMIN
- Hardware Issues
- marvin and IPMI (remote hardware control)
- restart a node
- mounting drives
- Admin Tips
- RedHat
- Globus_gridftp
- Galaxy Setup
- Son of Gridengine
- Blas Libraries
- CMake
- conda bioconda
- Users and Groups
- Installing software on marvin
- emailing
- biotime machine
- SCAN-pc laptop
- node1 issues
- 6TB storage expansion
- PIs storage sacrifice
- SAN relocation task
- Home directories max-out incident 28.11.2016
- Frontend Restart
- environment-modules
- H: drive on cluster
- Incident: Can't connect to BerkeleyDB
- Bioinformatics Wordpress Site
- Backups
- users disk usage
- Updating BLAST databases
- Python DRMAA
- message of the day
- SAN disconnect incident 10.01.2017
- Memory repair glitch 16.02.2017
- node9 network failure incident 16-20.03.2017
- Incorrect rebooting of marvin 19.09.2017
- ansible
- website and WordPress
- allow user access to other people's data
- RAM and RAM slots
- ldap is not ldap
- reset a password
- sending emails from command line examples
Courses
I2U4BGA
- Original schedule
- New schedule
- Actual schedule
- Course itself
- Biolinux Source course
- Directory Organization Exercise
- Glossary
- Key Bindings
- one-liners
- Cheatsheets
- Links
- pandoc modified manual
- Command Line Exercises
hdi2u
The half-day Linux course held on 20th April. A modified version of I2U4BGA.
RNAseq for DGE
- Theoretical background
- Quality Control and Preprocessing
- Mapping to Reference
- Mapping Quality Exercise
- Key Aspects of using R
- Estimating Gene Count Exercise
- Differential Expression Exercise
- Functional Analysis Exercise
Introduction to Unix 2017