Difference between revisions of "Main Page"
PeterThorpe (→Usage of Cluster) | PeterThorpe (→Cluster Administration)
Revision as of 11:12, 19 June 2020
KENNEDY HPC for Bioinf community
Usage of Cluster
- Cluster Manual
 - Kennedy manual
 - Why a Queue Manager?
 - Available Software
 - how to use the cluster training course
 - windows network connect
 
Documented Programs
The following are extra notes on using these programs on the marvin cluster, with an emphasis on example use-cases. Most, if not all, have their own websites with more detailed manuals and further information.
| * abacas | * albacore | * ariba | * aspera | * assembly-stats | * augustus | |
| * BamQC | * bamtools | * banjo | * bcftools | * bedtools | * bgenie | |
| * BLAST | * Blat | * blast2go: b2g4pipe | * bowtie | * bowtie2 | * bwa | |
| * BUSCO | * CAFE | * canu | * cd-hit | * cegma | * clustal | |
| * cramtools | * conda | * deeptools | * detonate | * diamond | * ea-utils | * ensembl | 
| * ETE | * FASTQC and MultiQC | * Archaeopteryx and Forester | * GapFiller | * GenomeTools | * gubbins | |
| * JBrowse | * kallisto | * kentUtils | * last | * lastz | * macs2 | |
| * Mash | * mega | * meryl | * MUMmer | * NanoSim | * nseq | |
| * OrthoFinder | * PASA | * perl | * PGAP | * picard-tools | * poRe | |
| * poretools | * prokka | * pyrad | * python | * qualimap | * quast | |
| * qiime2 | * R | * RAxML | * Repeatmasker | * Repeatmodeler | * rnammer | |
| * roary | * RSeQC | * samtools | * Satsuma | * sickle | * SPAdes | |
| * squid | * sra-tools | * srst2 | * SSPACE | * stacks | * Thor | |
| * Tophat | * trimmomatic | * Trinity | * t-coffee | * Unicycler | * velvet | |
| * ViennaRNA | 
Queue Manager Tips
A cluster is a shared resource with different users running different types of analyses. Nearly all clusters use a piece of software called a queue manager to share the resource fairly. The queue manager on marvin is called Grid Engine. Its commands all begin with q; the most commonly used is qsub, which submits a command, wrapped in a jobscript, for processing. Here are some tips:
- Queue Manager Tips
 - Queue Manager : shell script command
 - Queue Manager emailing when jobs run
 - General Command-line Tips
 - DRMAA for further Gridengine automation
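
To illustrate the qsub workflow described above, here is a minimal Python sketch that builds the text of a jobscript and the matching qsub call. The helper names (build_jobscript, qsub_command), the job name, the input file reads.fastq and the email address are all hypothetical. The `#$` lines use standard Grid Engine options (-N, -cwd, -j y, -m, -M), but the settings marvin actually expects may differ, so treat this as a sketch and check the linked tips pages. Nothing is submitted; the code only builds text.

```python
from typing import List, Optional


def build_jobscript(command: str, job_name: str, email: Optional[str] = None) -> str:
    """Return the text of a minimal Grid Engine jobscript wrapping `command`."""
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",  # job name shown by qstat
        "#$ -cwd",            # run in the directory the job was submitted from
        "#$ -j y",            # merge stderr into stdout
    ]
    if email:
        # Send mail when the job begins (b) and ends (e).
        # The address is a placeholder supplied by the caller.
        lines += ["#$ -m be", f"#$ -M {email}"]
    lines.append(command)
    return "\n".join(lines) + "\n"


def qsub_command(script_path: str) -> List[str]:
    """Return the argument list that would submit the jobscript (not executed here)."""
    return ["qsub", script_path]


if __name__ == "__main__":
    # Hypothetical example: run FastQC on a placeholder input file.
    print(build_jobscript("fastqc reads.fastq", "fastqc_demo"), end="")
    print(" ".join(qsub_command("fastqc_demo.sh")))
```

Saving the generated text to a file such as fastqc_demo.sh and running the printed qsub line on the frontend would place the job in the queue. Keeping the options inside the script, rather than on the command line, makes the submission easier to repeat.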
 
Data Examples
Procedures
(Short sequence of tasks with a specific short-term goal, often a simple script)
Explanations
Pipelines
(Workflow with specific end-goals)
- Trinity_Protocol
 - STAR BEAST
 - callSNPs.py
 - pairwiseCallSNPs
 - mapping.py
 - Edgen RNAseq
 - Miseq Prokaryote FASTQ analysis
 - snpcallphylo
 - Bottlenose dolphin population genomic analysis
 - ChIP-Seq Top2 in Yeast
 - ChIP-Seq Top2 in Yeast 12.09.2017
 - ChIP-Seq Top2 in Yeast 07.11.2017
 - Bisulfite Sequencing
 - microRNA and Salmo Salar
 
Protocols
(Extensive workflows with several possible end goals)
- Synthetic Long reads
 - MinION (Oxford Nanopore)
 - MinKNOW folders and log files
 - Research Data Management
 - MicroRNAs
 
Tech Reviews
Cluster Administration
- StABDMIN
 - Hardware Issues
 - marvin and IPMI (remote hardware control)
 - restart a node
 - mounting drives
 - Admin Tips
 - RedHat
 - Globus_gridftp
 - Galaxy Setup
 - Son of Gridengine
 - Blas Libraries
 - CMake
 - conda bioconda
 - Users and Groups
 - Installing software on marvin
 - emailing
 - biotime machine
 - SCAN-pc laptop
 - node1 issues
 - 6TB storage expansion
 - PIs storage sacrifice
 - SAN relocation task
 - Home directories max-out incident 28.11.2016
 - Frontend Restart
 - environment-modules
 - H: drive on cluster
 - Incident: Can't connect to BerkeleyDB
 - Bioinformatics Wordpress Site
 - Backups
 - users disk usage
 - Updating BLAST databases
 - Python DRMAA
 - message of the day
 - SAN disconnect incident 10.01.2017
 - Memory repair glitch 16.02.2017
 - node9 network failure incident 16-20.03.2017
 - Incorrect rebooting of marvin 19.09.2017
 - ansible
 - webstie and word press
 - allow user access to other peoples data
 - RAM and RAM slots
 - ldap is not ldap
 - reset a password
 - sending emails from command line examples
 - disk management after shelf disk failure
 
Courses
I2U4BGA
- Original schedule
 - New schedule
 - Actual schedule
 - Course itself
 - Biolinux Source course
 - Directory Organization Exercise
 - Glossary
 - Key Bindings
 - one-liners
 - Cheatsheets
 - Links
 - pandoc modified manual
 - Command Line Exercises
 
hdi2u
The half-day Linux course held on 20th April; a modified version of I2U4BGA.
RNAseq for DGE
- Theoretical background
 - Quality Control and Preprocessing
 - Mapping to Reference
 - Mapping Quality Exercise
 - Key Aspects of using R
 - Estimating Gene Count Exercise
 - Differential Expression Exercise
 - Functional Analysis Exercise
 
Introduction to Unix 2017