Prokka
Introduction
Prokka is a rapid genome annotator for circular bacterial genomes.
Usage
Prokka's full manual can be shown with prokka --docs.
Example jobscript for prokka
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -V
#$ -q marvin.q
#$ -pe multi 16
DIR=$
prokka --fast --cpus $NSLOTS --outdir
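The job script above is cut off after DIR=$ and --outdir. A minimal sketch of a complete submission script might look like the following. The input file name contigs.fasta, the output folder prokka_out, the helper function prokka_cmd and the fallback to one CPU outside the scheduler are assumptions made for illustration; they are not values from the original script.

```shell
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -V
#$ -q marvin.q
#$ -pe multi 16

# Hypothetical input and output names; adjust to your own data.
CONTIGS=${CONTIGS:-contigs.fasta}
OUTDIR=${OUTDIR:-prokka_out}
# NSLOTS is set by the scheduler from "-pe multi 16"; fall back to 1 elsewhere.
CPUS=${NSLOTS:-1}

# Print the command line this job is meant to run (illustrative helper).
prokka_cmd() {
    echo "prokka --fast --cpus $CPUS --outdir $OUTDIR $CONTIGS"
}

# Run only when prokka and the input file are available; otherwise just report.
if command -v prokka >/dev/null 2>&1 && [ -f "$CONTIGS" ]; then
    prokka --fast --cpus "$CPUS" --outdir "$OUTDIR" "$CONTIGS"
else
    echo "Not running (prokka or $CONTIGS missing): $(prokka_cmd)"
fi
```

Passing $NSLOTS to --cpus keeps prokka's thread count in step with the number of slots requested from the parallel environment.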
prokka's standard help file
Name:
  Prokka 1.12-beta by Torsten Seemann <torsten.seemann@gmail.com>
Synopsis:
  rapid bacterial genome annotation
Usage:
  prokka [options] <contigs.fasta>
General:
  --help             This help
  --version          Print version and exit
  --docs             Show full manual/documentation
  --citation         Print citation for referencing Prokka
  --quiet            No screen output (default OFF)
  --debug            Debug mode: keep all temporary files (default OFF)
Setup:
  --listdb           List all configured databases
  --setupdb          Index all installed databases
  --cleandb          Remove all database indices
  --depends          List all software dependencies
Outputs:
  --outdir [X]       Output folder [auto] (default )
  --force            Force overwriting existing output folder (default OFF)
  --prefix [X]       Filename output prefix [auto] (default )
  --addgenes         Add 'gene' features for each 'CDS' feature (default OFF)
  --addmrna          Add 'mRNA' features for each 'CDS' feature (default OFF)
  --locustag [X]     Locus tag prefix (default 'PROKKA')
  --increment [N]    Locus tag counter increment (default '1')
  --gffver [N]       GFF version (default '3')
  --compliant        Force Genbank/ENA/DDJB compliance: --addgenes --mincontiglen 200 --centre XXX (default OFF)
  --centre [X]       Sequencing centre ID. (default )
Organism details:
  --genus [X]        Genus name (default 'Genus')
  --species [X]      Species name (default 'species')
  --strain [X]       Strain name (default 'strain')
  --plasmid [X]      Plasmid name or identifier (default )
Annotations:
  --kingdom [X]      Annotation mode: Archaea|Bacteria|Mitochondria|Viruses (default 'Bacteria')
  --gcode [N]        Genetic code / Translation table (set if --kingdom is set) (default '0')
  --gram [X]         Gram: -/neg +/pos (default )
  --usegenus         Use genus-specific BLAST databases (needs --genus) (default OFF)
  --proteins [X]     FASTA or GBK file to use as 1st priority (default )
  --hmms [X]         Trusted HMM to first annotate from (default )
  --metagenome       Improve gene predictions for highly fragmented genomes (default OFF)
  --rawproduct       Do not clean up /product annotation (default OFF)
  --cdsrnaolap       Allow [tr]RNA to overlap CDS (default OFF)
Computation:
  --cpus [N]         Number of CPUs to use [0=all] (default '8')
  --fast             Fast mode - only use basic BLASTP databases (default OFF)
  --noanno           For CDS just set /product="unannotated protein" (default OFF)
  --mincontiglen [N] Minimum contig size [NCBI needs 200] (default '1')
  --evalue [n.n]     Similarity e-value cut-off (default '1e-06')
  --rfam             Enable searching for ncRNAs with Infernal+Rfam (SLOW!) (default '0')
  --norrna           Don't run rRNA search (default OFF)
  --notrna           Don't run tRNA search (default OFF)
  --rnammer          Prefer RNAmmer over Barrnap for rRNA prediction (default OFF)
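To illustrate how several of the options listed above can be combined, here is a small sketch that composes a prokka command line. Only flags that appear in the help text are used. The organism values, the locus tag ECK, the file names and the function name annotate_cmd are hypothetical and chosen purely for illustration.

```shell
# Compose a prokka command line from options listed in the help text.
# All values (genus, species, strain, locus tag, file names) are hypothetical.
annotate_cmd() {
    contigs=$1   # input FASTA file
    name=$2      # used for both --outdir and --prefix
    # "echo" only prints the command; drop it to actually run prokka.
    echo prokka \
        --kingdom Bacteria \
        --genus Escherichia --species coli --strain K12 \
        --usegenus \
        --locustag ECK \
        --outdir "$name" --prefix "$name" \
        --cpus "${NSLOTS:-8}" \
        "$contigs"
}

# Show the command that would be run for a hypothetical assembly.
annotate_cmd contigs.fasta ecoli_k12
```

The fallback of 8 CPUs mirrors the default stated in the help text; inside a job script the scheduler's $NSLOTS takes precedence.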
Output files
- If given a fragmented scaffold file (typically from a de novo assembler), prokka refers to each scaffold or contig as a "node".
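Before annotating a fragmented assembly it can help to know how many scaffolds or contigs the input holds, since each one is treated separately. A minimal sketch, assuming a standard FASTA file in which every sequence starts with a ">" header line; the function name count_contigs and the file name contigs.fasta are illustrative only.

```shell
# Count sequences in a FASTA file by counting header lines (those starting with ">").
count_contigs() {
    grep -c '^>' "$1"
}

# Report the count for a hypothetical input file, if it exists.
if [ -f contigs.fasta ]; then
    echo "contigs.fasta holds $(count_contigs contigs.fasta) sequences"
fi
```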
Installation issues (sysadmins only)
Prokka can be cloned from GitHub. The first step after cloning is to set up its databases, like so:
> ./prokka --setupdb
[16:54:57] Appending to PATH: /home/nutria/gitrepos/prokka/bin/../binaries/linux
[16:54:57] Appending to PATH: /home/nutria/gitrepos/prokka/bin/../binaries/linux/../common
[16:54:57] Appending to PATH: /home/nutria/gitrepos/prokka/bin
[16:54:57] Cleaning databases in /home/nutria/gitrepos/prokka/bin/../db
[16:54:57] Cleaning complete.
[16:54:57] Looking for 'makeblastdb' - found /usr/bin/makeblastdb
[16:54:57] Determined makeblastdb version is 2.2
[16:54:57] Making kingdom BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/kingdom/Archaea/sprot
[16:54:57] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/kingdom\/Archaea\/sprot -logfile /dev/null
[16:54:58] Making kingdom BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/kingdom/Bacteria/sprot
[16:54:58] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/kingdom\/Bacteria\/sprot -logfile /dev/null
[16:54:59] Making kingdom BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/kingdom/Mitochondria/sprot
[16:54:59] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/kingdom\/Mitochondria\/sprot -logfile /dev/null
[16:54:59] Making kingdom BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/kingdom/Viruses/sprot
[16:54:59] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/kingdom\/Viruses\/sprot -logfile /dev/null
[16:54:59] Making genus BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/genus/Enterococcus
[16:54:59] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/genus\/Enterococcus -logfile /dev/null
[16:54:59] Making genus BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/genus/Escherichia
[16:54:59] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/genus\/Escherichia -logfile /dev/null
[16:55:00] Making genus BLASTP database: /home/nutria/gitrepos/prokka/bin/../db/genus/Staphylococcus
[16:55:00] Running: makeblastdb -hash_index -dbtype prot -in \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/genus\/Staphylococcus -logfile /dev/null
[16:55:00] Looking for 'hmmpress' - found /usr/bin/hmmpress
[16:55:00] Determined hmmpress version is 3.1
[16:55:00] Pressing HMM database: /home/nutria/gitrepos/prokka/bin/../db/hmm/HAMAP.hmm
[16:55:00] Running: hmmpress \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/hmm\/HAMAP\.hmm
Working... done.
Pressed and indexed 1463 HMMs (1463 names).
Models pressed into binary file: /home/nutria/gitrepos/prokka/bin/../db/hmm/HAMAP.hmm.h3m
SSI index for binary model file: /home/nutria/gitrepos/prokka/bin/../db/hmm/HAMAP.hmm.h3i
Profiles (MSV part) pressed into: /home/nutria/gitrepos/prokka/bin/../db/hmm/HAMAP.hmm.h3f
Profiles (remainder) pressed into: /home/nutria/gitrepos/prokka/bin/../db/hmm/HAMAP.hmm.h3p
[16:55:01] Looking for 'cmpress' - found /usr/bin/cmpress
[16:55:01] Determined cmpress version is 1.1
[16:55:01] Pressing CM database: /home/nutria/gitrepos/prokka/bin/../db/cm/Viruses
[16:55:01] Running: cmpress \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/cm\/Viruses
Working... done.
Pressed and indexed 142 CMs and p7 HMM filters (142 names and 142 accessions).
Covariance models and p7 filters pressed into binary file: /home/nutria/gitrepos/prokka/bin/../db/cm/Viruses.i1m
SSI index for binary covariance model file: /home/nutria/gitrepos/prokka/bin/../db/cm/Viruses.i1i
Optimized p7 filter profiles (MSV part) pressed into: /home/nutria/gitrepos/prokka/bin/../db/cm/Viruses.i1f
Optimized p7 filter profiles (remainder) pressed into: /home/nutria/gitrepos/prokka/bin/../db/cm/Viruses.i1p
[16:55:01] Pressing CM database: /home/nutria/gitrepos/prokka/bin/../db/cm/Bacteria
[16:55:01] Running: cmpress \/home\/nutria\/gitrepos\/prokka\/bin\/\.\.\/db\/cm\/Bacteria
Working... done.
Pressed and indexed 564 CMs and p7 HMM filters (564 names and 564 accessions).
Covariance models and p7 filters pressed into binary file: /home/nutria/gitrepos/prokka/bin/../db/cm/Bacteria.i1m
SSI index for binary covariance model file: /home/nutria/gitrepos/prokka/bin/../db/cm/Bacteria.i1i
Optimized p7 filter profiles (MSV part) pressed into: /home/nutria/gitrepos/prokka/bin/../db/cm/Bacteria.i1f
Optimized p7 filter profiles (remainder) pressed into: /home/nutria/gitrepos/prokka/bin/../db/cm/Bacteria.i1p
[16:55:01] Looking for databases in: /home/nutria/gitrepos/prokka/bin/../db
[16:55:01] * Kingdoms: Archaea Bacteria Mitochondria Viruses
[16:55:01] * Genera: Enterococcus Escherichia Staphylococcus
[16:55:01] * HMMs: HAMAP
[16:55:01] * CMs: Bacteria Viruses
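The setup log shows prokka looking for makeblastdb, hmmpress and cmpress before it indexes anything. A sysadmin may want to confirm these are on the PATH first. The following is a rough sketch of such a pre-flight check; the function name check_tools is made up for this example, and the tool list is taken only from the log above, so it may not cover everything prokka needs (prokka --depends lists all dependencies).

```shell
# Report which of the given programs are on PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "found $tool at $path"
        else
            echo "missing $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools named in the --setupdb log above.
check_tools makeblastdb hmmpress cmpress ||
    echo "install the missing tools before running prokka --setupdb"
```

Once setup has finished, prokka --listdb should report the same kingdoms, genera, HMMs and CMs as the last lines of the log.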
Prokka seems to set its own paths
When invoking prokka with no arguments, one sees this:
[ramon@marvin ~]$ prokka
[13:52:03] Appending to PATH: /usr/local/Modules/modulefiles/tools/prokka/gitv1_8f07048/bin/../binaries/linux
[13:52:03] Appending to PATH: /usr/local/Modules/modulefiles/tools/prokka/gitv1_8f07048/bin/../binaries/linux/../common
[13:52:03] Appending to PATH: /usr/local/Modules/modulefiles/tools/prokka/gitv1_8f07048/bin
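Judging only from the log lines, the three appended directories all appear to be derived from the directory that holds the prokka script. The sketch below reproduces that pattern for a given bin directory. It is an inference from the output shown here, not taken from prokka's source, and the function name prokka_path_dirs is invented for illustration.

```shell
# Print the three directories seen in the log, relative to prokka's bin directory.
prokka_path_dirs() {
    bindir=$1
    echo "$bindir/../binaries/linux"
    echo "$bindir/../binaries/linux/../common"
    echo "$bindir"
}

# Reproduce the entries from the log for the module installation.
prokka_path_dirs /usr/local/Modules/modulefiles/tools/prokka/gitv1_8f07048/bin
```

If this reading is right, a module file only needs to put prokka's bin directory on the PATH; the bundled binaries are then found by prokka itself.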