Difference between revisions of "Main Page"
| PeterThorpe (talk | contribs)   (→Cluster Administration) | |||
| (169 intermediate revisions by 3 users not shown) | |||
| Line 1: | Line 1: | ||
| + | |||
| + | '''KENNEDY HPC for Bioinf community ''' | ||
| + | * [[Kennedy manual]] | ||
| + | |||
| + | |||
| = Usage of Cluster= | = Usage of Cluster= | ||
| − | [[Cluster Manual]] | + | * [[Cluster Manual]] | 
| + | * [[Kennedy manual]] | ||
| + | * [[Why a Queue Manager?]] | ||
| + | * [[Available Software]] | ||
| + | * [[how to use the cluster training course]] | ||
| + | * [[windows network connect]] | ||
| + | |||
| + | = Documented Programs = | ||
| − | + | The following are extra notes on using these programs on the marvin cluster, with an emphasis on example use-cases. Most, if not all, have their own websites, with more detailed manuals and further information. | 
| − | * [[abacas]] | + | {|style="width:85%" | 
| − | * [[augustus]] | + | |* [[abacas]] | 
| − | * [[BamQC]] | + | |* [[albacore]] | 
| − | * [[BLAST]] | + | |* [[ariba]] | 
| − | * [[blast2go: b2g4pipe]] | + | |* [[aspera]] | 
| − | * [[bowtie2]] | + | |* [[assembly-stats]] | 
| − | * [[BUSCO]] | + | |* [[augustus]] | 
| − | * [[CAFE]] | + | |- | 
| − | * [[cd-hit]] | + | |* [[BamQC]] | 
| − | * [[cegma]] | + | |* [[bamtools]] | 
| − | * [[diamond]] | + | |* [[banjo]] | 
| − | * [[ensembl]] | + | |* [[bcftools]] | 
| − | * [[FASTQC and MultiQC]] | + | |* [[bedtools]] | 
| − | * [[JBrowse]] | + | |* [[bgenie]] | 
| − | * [[kallisto]] | + | |- | 
| − | * [[ | + | |* [[BLAST]] | 
| − | * [[MUMmer]] | + | |* [[Blat]] | 
| − | * [[OrthoFinder]] | + | |* [[blast2go: b2g4pipe]] | 
| − | * [[PGAP]] | + | |* [[bowtie]] | 
| − | * [[prokka]] | + | |* [[bowtie2]] | 
| − | * [[pyrad]] | + | |* [[bwa]] | 
| − | * [[python]] | + | |- | 
| − | * [[RAxML]] | + | |* [[BUSCO]] | 
| − | * [[samtools]] | + | |* [[CAFE]] | 
| − | * [[SPAdes]] | + | |* [[canu]] | 
| − | * [[sra-tools]] | + | |* [[cd-hit]] | 
| − | * [[srst2]] | + | |* [[cegma]] | 
| − | * [[stacks]] | + | |* [[clustal]] | 
| − | * [[trimmomatic]] | + | |- | 
| − | * [[velvet]] | + | |* [[cramtools]] | 
| + | |* [[conda]] | ||
| + | |* [[deeptools]] | ||
| + | |* [[detonate]] | ||
| + | |* [[diamond]] | ||
| + | |* [[ea-utils]] | ||
| + | |* [[ensembl]] | ||
| + | |- | ||
| + | |* [[ETE]] | ||
| + | |* [[FASTQC and MultiQC]] | ||
| + | |* [[Archaeopteryx and Forester]] | ||
| + | |* [[GapFiller]] | ||
| + | |* [[GenomeTools]] | ||
| + | |* [[gubbins]] | ||
| + | |- | ||
| + | |* [[JBrowse]] | ||
| + | |* [[kallisto]] | ||
| + | |* [[kentUtils]] | ||
| + | |* [[last]] | ||
| + | |* [[lastz]] | ||
| + | |* [[macs2]] | ||
| + | |- | ||
| + | |* [[Mash]] | ||
| + | |* [[mega]] | ||
| + | |* [[meryl]] | ||
| + | |* [[MUMmer]] | ||
| + | |* [[NanoSim]] | ||
| + | |* [[nseq]] | ||
| + | |- | ||
| + | |* [[OrthoFinder]] | ||
| + | |* [[PASA]] | ||
| + | |* [[perl]] | ||
| + | |* [[PGAP]] | ||
| + | |* [[picard-tools]] | ||
| + | |* [[poRe]] | ||
| + | |- | ||
| + | |* [[poretools]] | ||
| + | |* [[prokka]] | ||
| + | |* [[pyrad]] | ||
| + | |* [[python]] | ||
| + | |* [[qualimap]] | ||
| + | |* [[quast]] | ||
| + | |- | ||
| + | |* [[qiime2]] | ||
| + | |* [[R]] | ||
| + | |* [[RAxML]] | ||
| + | |* [[Repeatmasker]] | ||
| + | |* [[Repeatmodeler]] | ||
| + | |* [[rnammer]] | ||
| + | |- | ||
| + | |* [[roary]] | ||
| + | |* [[RSeQC]] | ||
| + | |* [[samtools]] | ||
| + | |* [[Satsuma]] | ||
| + | |* [[sickle]] | ||
| + | |* [[SPAdes]] | ||
| + | |- | ||
| + | |* [[squid]] | ||
| + | |* [[sra-tools]] | ||
| + | |* [[srst2]] | ||
| + | |* [[SSPACE]] | ||
| + | |* [[stacks]] | ||
| + | |* [[Thor]] | ||
| + | |- | ||
| + | |* [[Tophat]] | ||
| + | |* [[trimmomatic]] | ||
| + | |* [[Trinity]] | ||
| + | |* [[t-coffee]] | ||
| + | |* [[Unicycler]] | ||
| + | |* [[velvet]] | ||
| + | |- | ||
| + | |* [[ViennaRNA]] | ||
| + | |} | ||
| = Queue Manager Tips = | = Queue Manager Tips = | ||
| A cluster is a shared resource with different users running different types of analyses. Nearly all clusters use a piece of software called a queue manager to fairly share out the resource. The queue manager on marvin is called Grid Engine, and it has several commands available, all beginning with '''q''' and with '''qsub''' being the most commonly used as it submits a command via a jobscript to be processed. Here are some tips: | A cluster is a shared resource with different users running different types of analyses. Nearly all clusters use a piece of software called a queue manager to fairly share out the resource. The queue manager on marvin is called Grid Engine, and it has several commands available, all beginning with '''q''' and with '''qsub''' being the most commonly used as it submits a command via a jobscript to be processed. Here are some tips: | ||
| * [[Queue Manager Tips]] | * [[Queue Manager Tips]] | ||
| + | * [[Queue Manager : shell script command]] | ||
| + | * [[Queue Manager emailing when jobs run]] | ||
| + | * [[General Command-line Tips]] | ||
| + | * [[DRMAA for further Gridengine automation]] | ||
| = Data Examples = | = Data Examples = | ||
| Line 45: | Line 133: | ||
| (short sequence of tasks with a certain short-term goal, often a simple script) | (short sequence of tasks with a certain short-term goal, often a simple script) | ||
| * [[Calculating coverage]] | * [[Calculating coverage]] | ||
| + | * [[MinION Coverage sensitivity analysis]] | ||
| = Navigating genomic data websites= | = Navigating genomic data websites= | ||
| * [[Patric]] | * [[Patric]] | ||
| + | * [[NCBI]] | ||
| + | * [[IGSR/1000 Genomes]] | ||
| − | =  | + | = Explanations= | 
| + | * [[ITUcourse]] | ||
| * [[VCF]] | * [[VCF]] | ||
| + | * [[Maximum Likelihood]] | ||
| + | * [[SNP Analysis and phylogenetics]] | ||
| + | * [[Normalization]] | ||
| = Pipelines = | = Pipelines = | ||
| − | (Workflow with  | + | (Workflow with specific end-goals) | 
| * [[Trinity_Protocol]] | * [[Trinity_Protocol]] | ||
| * [[STAR BEAST]] | * [[STAR BEAST]] | ||
| * [[callSNPs.py]] | * [[callSNPs.py]] | ||
| + | * [[pairwiseCallSNPs]] | ||
| + | * [[mapping.py]] | ||
| * [[Edgen RNAseq]] | * [[Edgen RNAseq]] | ||
| * [[Miseq Prokaryote FASTQ analysis]] | * [[Miseq Prokaryote FASTQ analysis]] | ||
| + | * [[snpcallphylo]] | ||
| + | * [[Bottlenose dolphin population genomic analysis]] | ||
| + | * [[ChIP-Seq Top2 in Yeast]] | ||
| + | * [[ChIP-Seq Top2 in Yeast 12.09.2017]] | ||
| + | * [[ChIP-Seq Top2 in Yeast 07.11.2017]] | ||
| + | * [[Bisulfite Sequencing]] | ||
| + | * [[microRNA and Salmo Salar]] | ||
| =Protocols= | =Protocols= | ||
| (Extensive workflows with several possible end goals) | (Extensive workflows with several possible end goals) | ||
| − | [[Synthetic Long reads]] | + | * [[Synthetic Long reads]] | 
| + | * [[MinION (Oxford Nanopore)]] | ||
| + | * [[MinKNOW folders and log files]] | ||
| + | * [[Research Data Management]] | ||
| + | * [[MicroRNAs]] | ||
| + | |||
| + | = Tech Reviews = | ||
| + | * [[SWATH-MS Data Analysis]] | ||
| = Cluster Administration = | = Cluster Administration = | ||
| + | * [[StABDMIN]] | ||
| + | * [[Hardware Issues]] | ||
| + | * [[marvin and IPMI (remote hardware control)]] | ||
| + | * [[restart a node]] | ||
| + | * [[mounting drives]] | ||
| * [[Admin Tips]] | * [[Admin Tips]] | ||
| * [[RedHat]] | * [[RedHat]] | ||
| * [[Globus_gridftp]] | * [[Globus_gridftp]] | ||
| * [[Galaxy Setup]] | * [[Galaxy Setup]] | ||
| + | * [[Son of Gridengine]] | ||
| + | * [[Blas Libraries]] | ||
| * [[CMake]] | * [[CMake]] | ||
| + | * [[conda bioconda]] | ||
| * [[Users and Groups]] | * [[Users and Groups]] | ||
| + | * [[Installing software on marvin]] | ||
| + | * [[emailing]] | ||
| + | * [[biotime machine]] | ||
| + | * [[SCAN-pc laptop]] | ||
| * [[node1 issues]] | * [[node1 issues]] | ||
| * [[6TB storage expansion]] | * [[6TB storage expansion]] | ||
| + | * [[PIs storage sacrifice]] | ||
| + | * [[SAN relocation task]] | ||
| + | * [[Home directories max-out incident 28.11.2016]] | ||
| * [[Frontend Restart]] | * [[Frontend Restart]] | ||
| * [[environment-modules]] | * [[environment-modules]] | ||
| * [[H: drive on cluster]] | * [[H: drive on cluster]] | ||
| + | * [[Incident: Can't connect to BerkeleyDB]] | ||
| + | * [[Bioinformatics Wordpress Site]] | ||
| + | * [[Backups]] | ||
| + | * [[users disk usage]] | ||
| + | * [[Updating BLAST databases]] | ||
| + | * [[Python DRMAA]] | ||
| + | * [[message of the day]] | ||
| + | * [[SAN disconnect incident 10.01.2017]] | ||
| + | * [[Memory repair glitch 16.02.2017]] | ||
| + | * [[node9 network failure incident 16-20.03.2017]] | ||
| + | * [[Incorrect rebooting of marvin 19.09.2017]] | ||
| + | * [[ansible]] | ||
| + | * [[webstie and word press]] | ||
| + | * [[allow user access to other peoples data]] | ||
| + | * [[RAM and RAM slots]] | ||
| + | * [[ldap is not ldap]] | ||
| + | * [[reset a password]] | ||
| + | * [[sending emails from command line examples]] | ||
| + | * [[disk management after shelf disk failure]] | ||
| + | * [[firewall and iptables]] | ||
| + | |||
| + | = Courses = | ||
| + | |||
| + | ==I2U4BGA== | ||
| + | * [[Original schedule]] | ||
| + | * [[New schedule]] | ||
| + | * [[Actual schedule]] | ||
| + | * [[Course itself]] | ||
| + | * [[Biolinux Source course]] | ||
| + | * [[Directory Organization Exercise]] | ||
| + | * [[Glossary]] | ||
| + | * [[Key Bindings]] | ||
| + | * [[one-liners]] | ||
| + | * [[Cheatsheets]] | ||
| + | * [[Links]] | ||
| + | * [[pandoc modified manual]] | ||
| + | * [[Command Line Exercises]] | ||
| + | |||
| + | = hdi2u = | ||
| + | |||
| + | The half-day Linux course held on 20th April. A modified version of I2U4BGA. | ||
| + | |||
| + | * [[hdi2u_intro]] | ||
| + | * [[hdi2u_commandbased_exercises]] | ||
| + | * [[hdi2u_dirorg_exercise]] | ||
| + | * [[hdi2u_rendertotsv_exercise]] | ||
| + | |||
| + | = RNAseq for DGE = | ||
| + | * [[Theoretical background]] | ||
| + | * [[Quality Control and Preprocessing]] | ||
| + | * [[Mapping to Reference]] | ||
| + | * [[Mapping Quality Exercise]] | ||
| + | * [[Key Aspects of using R]] | ||
| + | * [[Estimating Gene Count Exercise]] | ||
| + | * [[Differential Expression Exercise]] | ||
| + | * [[Functional Analysis Exercise]] | ||
| + | |||
| + | = Introduction to Unix 2017 = | ||
| + | * [[Introduction_to_Unix_2017]] | ||
| + | |||
| + | |||
| + | |||
| + | ==Templates== | ||
| + | * [[edgenl2g]] | ||
Latest revision as of 19:14, 22 July 2020
KENNEDY HPC for Bioinf community
Usage of Cluster
- Cluster Manual
- Kennedy manual
- Why a Queue Manager?
- Available Software
- how to use the cluster training course
- windows network connect
Documented Programs
The following are extra notes on using these programs on the marvin cluster, with an emphasis on example use-cases. Most, if not all, have their own websites, with more detailed manuals and further information.
| * abacas | * albacore | * ariba | * aspera | * assembly-stats | * augustus | |
| * BamQC | * bamtools | * banjo | * bcftools | * bedtools | * bgenie | |
| * BLAST | * Blat | * blast2go: b2g4pipe | * bowtie | * bowtie2 | * bwa | |
| * BUSCO | * CAFE | * canu | * cd-hit | * cegma | * clustal | |
| * cramtools | * conda | * deeptools | * detonate | * diamond | * ea-utils | * ensembl | 
| * ETE | * FASTQC and MultiQC | * Archaeopteryx and Forester | * GapFiller | * GenomeTools | * gubbins | |
| * JBrowse | * kallisto | * kentUtils | * last | * lastz | * macs2 | |
| * Mash | * mega | * meryl | * MUMmer | * NanoSim | * nseq | |
| * OrthoFinder | * PASA | * perl | * PGAP | * picard-tools | * poRe | |
| * poretools | * prokka | * pyrad | * python | * qualimap | * quast | |
| * qiime2 | * R | * RAxML | * Repeatmasker | * Repeatmodeler | * rnammer | |
| * roary | * RSeQC | * samtools | * Satsuma | * sickle | * SPAdes | |
| * squid | * sra-tools | * srst2 | * SSPACE | * stacks | * Thor | |
| * Tophat | * trimmomatic | * Trinity | * t-coffee | * Unicycler | * velvet | |
| * ViennaRNA | 
Queue Manager Tips
A cluster is a shared resource, with different users running different types of analyses. Nearly all clusters use a piece of software called a queue manager to share out the resource fairly. The queue manager on marvin is called Grid Engine. It has several commands, all beginning with q; the most commonly used is qsub, which submits a command for processing via a jobscript. Here are some tips:
- Queue Manager Tips
- Queue Manager : shell script command
- Queue Manager emailing when jobs run
- General Command-line Tips
- DRMAA for further Gridengine automation
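As a rough illustration of what "submitting a command via a jobscript" can look like, the sketch below builds a small Grid Engine jobscript in Python and hands it to qsub. It is only a sketch under stated assumptions: the job name, the script filename, and the parallel environment name "smp" are hypothetical and not taken from the marvin configuration, so check the pages listed above for the settings actually in use. The `#$` lines are standard Grid Engine directives embedded in the script.

```python
import subprocess
from pathlib import Path


def build_jobscript(command, job_name="example_job", slots=1, pe_name="smp"):
    """Return the text of a Grid Engine jobscript that runs `command`.

    job_name and pe_name are illustrative defaults (assumptions), not
    values known to be configured on any particular cluster.
    """
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",  # name shown in the queue listing
        "#$ -cwd",            # run the job from the submission directory
        "#$ -j y",            # merge standard error into standard output
        "#$ -S /bin/bash",    # shell used to interpret the script
    ]
    if slots > 1:
        # Request several slots in a parallel environment (name is site-specific).
        lines.append(f"#$ -pe {pe_name} {slots}")
    lines.append(command)
    return "\n".join(lines) + "\n"


def submit(script_text, path="example_job.sh"):
    """Write the jobscript to `path` and submit it with qsub.

    Only works on a machine where qsub is installed; returns qsub's message.
    """
    Path(path).write_text(script_text)
    result = subprocess.run(
        ["qsub", path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Print the jobscript rather than submitting it, so this runs anywhere.
    print(build_jobscript("echo hello from $HOSTNAME", slots=4))
```

Calling `submit(build_jobscript("my_command"))` on the cluster frontend would place the job in the queue; keeping the script-building step separate from the submission step makes it easy to inspect the jobscript before anything is sent to the queue manager.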
Data Examples
Procedures
(short sequence of tasks with a certain short-term goal, often a simple script)
Explanations
Pipelines
(Workflow with specific end-goals)
- Trinity_Protocol
- STAR BEAST
- callSNPs.py
- pairwiseCallSNPs
- mapping.py
- Edgen RNAseq
- Miseq Prokaryote FASTQ analysis
- snpcallphylo
- Bottlenose dolphin population genomic analysis
- ChIP-Seq Top2 in Yeast
- ChIP-Seq Top2 in Yeast 12.09.2017
- ChIP-Seq Top2 in Yeast 07.11.2017
- Bisulfite Sequencing
- microRNA and Salmo Salar
Protocols
(Extensive workflows with several possible end goals)
- Synthetic Long reads
- MinION (Oxford Nanopore)
- MinKNOW folders and log files
- Research Data Management
- MicroRNAs
Tech Reviews
Cluster Administration
- StABDMIN
- Hardware Issues
- marvin and IPMI (remote hardware control)
- restart a node
- mounting drives
- Admin Tips
- RedHat
- Globus_gridftp
- Galaxy Setup
- Son of Gridengine
- Blas Libraries
- CMake
- conda bioconda
- Users and Groups
- Installing software on marvin
- emailing
- biotime machine
- SCAN-pc laptop
- node1 issues
- 6TB storage expansion
- PIs storage sacrifice
- SAN relocation task
- Home directories max-out incident 28.11.2016
- Frontend Restart
- environment-modules
- H: drive on cluster
- Incident: Can't connect to BerkeleyDB
- Bioinformatics Wordpress Site
- Backups
- users disk usage
- Updating BLAST databases
- Python DRMAA
- message of the day
- SAN disconnect incident 10.01.2017
- Memory repair glitch 16.02.2017
- node9 network failure incident 16-20.03.2017
- Incorrect rebooting of marvin 19.09.2017
- ansible
- webstie and word press
- allow user access to other peoples data
- RAM and RAM slots
- ldap is not ldap
- reset a password
- sending emails from command line examples
- disk management after shelf disk failure
- firewall and iptables
Courses
I2U4BGA
- Original schedule
- New schedule
- Actual schedule
- Course itself
- Biolinux Source course
- Directory Organization Exercise
- Glossary
- Key Bindings
- one-liners
- Cheatsheets
- Links
- pandoc modified manual
- Command Line Exercises
hdi2u
The half-day Linux course held on 20th April. A modified version of I2U4BGA.
RNAseq for DGE
- Theoretical background
- Quality Control and Preprocessing
- Mapping to Reference
- Mapping Quality Exercise
- Key Aspects of using R
- Estimating Gene Count Exercise
- Differential Expression Exercise
- Functional Analysis Exercise
Introduction to Unix 2017
