Latest revision as of 13:57, 23 October 2018

Introduction

BLAST is the workhorse of bioinformatics. This page gives some tips on usage and (mostly) on how to speed it up.

BLAST

Read this first for how to run BLAST: https://www.ncbi.nlm.nih.gov/books/NBK279680/

http://nebc.nerc.ac.uk/bioinformatics/documentation/blast+/user_manual.pdf

Put this in your script or in your .bash_profile

export BLASTDB=/shelf/public/blastntnr/blastDatabases

To add the latest BLAST tools to your path:

export PATH=/shelf/apps/ncbi-blast-2.7.1+/bin/:$PATH

Now you can run against nr, nt, human_genomic, or uniprot_sprot.fasta:

> blastp -db nr -query amino_acid.fasta -out tests.txt

The taxonomy database is also in /shelf/public/blastntnr/blastDatabases. So if you want the extended BLAST output format (taxonomy fields such as staxids and sscinames), it should just work, provided you ask for those fields in -outfmt.
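
Before starting a long search it can be worth confirming that the two exports above took effect. The following is a minimal sketch only; check_tool is a hypothetical helper name, not part of BLAST, and the list of tools checked is just an example.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the BLAST environment before starting a long job.
# check_tool is a hypothetical helper, not something shipped with BLAST.

check_tool() {
    # Succeeds (exit status 0) if the named program is found on PATH.
    command -v "$1" >/dev/null 2>&1
}

for tool in blastp blastx blastn; do
    if check_tool "$tool"; then
        echo "$tool found at $(command -v "$tool")"
    else
        echo "$tool NOT found on PATH" >&2
    fi
done

# BLASTDB should point at the directory that holds the databases.
if [ -z "${BLASTDB:-}" ]; then
    echo "BLASTDB is not set" >&2
elif [ ! -d "$BLASTDB" ]; then
    echo "BLASTDB points to a missing directory: $BLASTDB" >&2
else
    echo "BLASTDB is $BLASTDB"
fi
```

If a tool is reported as missing, re-check the export PATH line in your script or .bash_profile.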

FULL EXAMPLE:

export BLASTDB=/shelf/public/blastntnr/blastDatabases

export PATH=/shelf/apps/ncbi-blast-2.7.1+/bin/:$PATH

blastp -db nr -query MY_AA.fasta -evalue 1e-5 -seg no -num_threads 8 -outfmt "6 std salltitles staxids sscinames scomnames sskingdoms" -out MY_AA_vs_nr.fa_NR_18oct2018.tab
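
The output file name in the example records the query, the database and the date of the run, which is handy because the databases change over time. As a sketch only, a name of that kind could be built automatically. blast_outname is a hypothetical helper, and the exact naming pattern it produces is an assumption modelled loosely on the example above, not a site convention.

```shell
#!/usr/bin/env bash
# Sketch: build an output name of the form <query>_vs_<db>_<date>.tab
# blast_outname is a hypothetical helper; the pattern is an assumption.

blast_outname() {
    # $1 = query FASTA file, $2 = database name
    local query_base stamp
    query_base=$(basename "$1")
    query_base=${query_base%.*}    # strip the last extension, e.g. .fasta
    # Date stamp such as 18oct2018 (LC_ALL=C keeps month names in English)
    stamp=$(LC_ALL=C date +%d%b%Y | tr '[:upper:]' '[:lower:]')
    printf '%s_vs_%s_%s.tab\n' "$query_base" "$2" "$stamp"
}

# Print the name that would be used for a run of MY_AA.fasta against nr
blast_outname MY_AA.fasta nr
```

The result can then be passed to -out, for example -out "$(blast_outname MY_AA.fasta nr)".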

Output formats

-m 9

This option is for the old (legacy) blast and mpiblast. The tabular output has the following columns, in order:

1. Query id
2. Subject id
3. % identity
4. alignment length
5. mismatches
6. gap openings
7. q. start
8. q. end
9. s. start
10. s. end
11. e-value
12. bit score
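
Because the columns come in a fixed order, hits can be filtered with awk by column number (% identity is column 3, e-value is column 11). The sketch below assumes tab-separated output in the 12-column layout listed above. The two sample lines are made-up values for illustration only, and filter_hits is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: filter 12-column tabular BLAST output by % identity and e-value.
# The sample data below is invented purely to demonstrate the filter.

sample_hits=$(mktemp)
printf 'q1\ts1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380\n' >  "$sample_hits"
printf 'q2\ts2\t45.00\t80\t44\t2\t10\t89\t1\t80\t0.5\t30.1\n'   >> "$sample_hits"

filter_hits() {
    # $1 = tabular file, $2 = minimum % identity, $3 = maximum e-value
    # Lines starting with '#' (comment lines, as written by -m 9) are skipped.
    # Adding 0 forces awk to compare the fields as numbers.
    awk -F'\t' -v min_id="$2" -v max_e="$3" \
        '!/^#/ && $3 + 0 >= min_id + 0 && $11 + 0 <= max_e + 0' "$1"
}

# Keep hits with at least 90% identity and an e-value of 1e-5 or better;
# with the sample data only the q1 line passes.
filter_hits "$sample_hits" 90 1e-5
```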

Benchmark Exercises

Transcriptome Panda Blood

92,600 transcripts (contigs) in a 2.4-million-line FASTA file.
blastx speed against the nr database split into 62 fragments: 5% of the contigs (about 4,500) completed in 33 hours.
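
To see why speeding things up matters, the figures above can be extrapolated to the whole data set. This is a rough sketch that assumes the search rate stays constant for the remaining contigs, which may not hold in practice.

```shell
#!/usr/bin/env bash
# Sketch: extrapolate the benchmark (5% done in 33 hours) to the full run.
# Assumes a constant rate; real runs may speed up or slow down.

total_contigs=92600
hours_so_far=33
percent_done=5

# Total hours if the rate stays constant: 33 h * 100 / 5
est_hours=$(( hours_so_far * 100 / percent_done ))
est_days=$(awk -v h="$est_hours" 'BEGIN { printf "%.1f", h / 24 }')

# Average throughput so far, in contigs per hour
contigs_per_hour=$(awk -v n="$total_contigs" -v p="$percent_done" -v h="$hours_so_far" \
    'BEGIN { printf "%.0f", n * p / 100 / h }')

echo "Estimated total: ${est_hours} hours (~${est_days} days), about ${contigs_per_hour} contigs/hour"
# prints: Estimated total: 660 hours (~27.5 days), about 140 contigs/hour
```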

