Difference between revisions of "Blast"
PeterThorpe (added blast databases with tax db.)
Latest revision as of 13:57, 23 October 2018
Introduction
BLAST is the workhorse of bioinformatics. This page gives some tips on usage and (mostly) on how to speed it up.
BLAST
READ this for how to run blast: https://www.ncbi.nlm.nih.gov/books/NBK279680/
http://nebc.nerc.ac.uk/bioinformatics/documentation/blast+/user_manual.pdf
- Put this in your script or in your .bash_profile:
export BLASTDB=/shelf/public/blastntnr/blastDatabases
- Add the latest BLAST tools to your path:
export PATH=/shelf/apps/ncbi-blast-2.7.1+/bin/:$PATH
- Now you can run against nr, nt, human_genomic or uniprot_sprot.fasta:
> blastp -db nr -query amino_acid.fasta -out tests.txt
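Before launching a long search it can help to confirm that the two exports above actually took effect. The following is only a minimal sketch: `have_tool` is a hypothetical helper name, not something provided by BLAST, and it checks nothing more than whether a command is reachable on the PATH.

```shell
# Hypothetical helper: succeed if the named command is reachable on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report which blastp (if any) the shell will pick up.
if have_tool blastp; then
    blastp -version
else
    echo "blastp not found on PATH" >&2
fi

# Show where BLAST will look for databases (prints <unset> if not exported).
echo "BLASTDB=${BLASTDB:-<unset>}"
```

If the reported version is not 2.7.1+, an older BLAST install is probably earlier in the PATH.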
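The taxonomy database is also in this directory: /shelf/public/blastntnr/blastDatabases. So if you want the extended BLAST output fields (taxonomy IDs, scientific names and so on), they should just work, provided you request them with -outfmt as in the full example below.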
FULL EXAMPLE:
export BLASTDB=/shelf/public/blastntnr/blastDatabases
export PATH=/shelf/apps/ncbi-blast-2.7.1+/bin/:$PATH
blastp -db nr -query MY_AA.fasta -evalue 1e-5 -seg no -num_threads 8 -outfmt "6 std salltitles staxids sscinames scomnames sskingdoms" -out MY_AA_vs_nr.fa_NR_18oct2018.tab
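The tab-separated file produced by the command above can be summarised with standard tools. As a sketch only, the hypothetical function `best_hits` below keeps the highest bit score hit per query and prints the query, subject, e-value, bit score and scientific name. It assumes the column order requested above: the 12 `std` columns first, then salltitles (13), staxids (14), sscinames (15), scomnames (16) and sskingdoms (17).

```shell
# best_hits FILE (hypothetical helper)
# Print query id, subject id, e-value, bit score and scientific name
# for the highest-bit-score hit of each query, sorted by query id.
# Assumes tab-separated output with the column order described above.
best_hits() {
    awk -F'\t' '
        # Keep a line if the query is new or its bit score beats the stored one.
        !($1 in best) || $12 + 0 > best[$1] + 0 {
            best[$1] = $12
            line[$1] = $1 "\t" $2 "\t" $11 "\t" $12 "\t" $15
        }
        END { for (q in line) print line[q] }
    ' "$1" | sort
}
```

Usage would be along the lines of `best_hits MY_AA_vs_nr.fa_NR_18oct2018.tab > best_hits.tab`. If the -outfmt string is changed, the column numbers in the function need to change with it.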
Output formats
-m 9
This is for the old (legacy) blast and mpiblast. It gives tabular output with the following columns:
- Query id
- Subject id
- % identity
- alignment length
- mismatches
- gap openings
- q. start
- q. end
- s. start
- s. end
- e-value
- bit score
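Because the columns are fixed, the table is easy to filter with awk. The sketch below is an illustration rather than a recommended pipeline: `filter_hits` is a hypothetical function name, and the thresholds are arbitrary. It keeps hits with at least a given % identity (column 3) and at most a given e-value (column 11), and skips the `#` comment lines that -m 9 writes between queries.

```shell
# filter_hits FILE MIN_IDENTITY MAX_EVALUE (hypothetical helper)
# Print only the tabular hit lines that pass both thresholds.
filter_hits() {
    awk -F'\t' -v min_id="$2" -v max_e="$3" '
        /^#/ { next }    # skip the comment lines that -m 9 adds
        $3 + 0 >= min_id + 0 && $11 + 0 <= max_e + 0
    ' "$1"
}
```

For example, `filter_hits results.tab 90 1e-5` would keep hits with at least 90% identity and an e-value of 1e-5 or lower, where `results.tab` stands in for your own output file. The same column numbers apply to the 12 `std` columns of `-outfmt 6` in BLAST+.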
Benchmark Exercises
Transcriptome Panda Blood
- 92600 transcripts (contigs) in a 2.4 million line FASTA file.
- blastx speed against an nr database split into 62 fragments: 5% of the contigs (about 4500) processed in 33 hours.
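To see why speed matters here, the figures above can be extrapolated to the whole transcriptome. This is a back-of-the-envelope sketch that assumes throughput stays constant over the run, which real searches may not; `estimate_hours` is a hypothetical helper name.

```shell
# estimate_hours DONE TOTAL HOURS_SO_FAR (hypothetical helper)
# Linear extrapolation: total hours = hours so far * total / done.
estimate_hours() {
    awk -v done_n="$1" -v total="$2" -v hours="$3" \
        'BEGIN { printf "%.0f\n", hours * total / done_n }'
}

# 4500 of 92600 contigs finished after 33 hours.
estimate_hours 4500 92600 33
```

With the numbers from this benchmark the estimate comes out at roughly 680 hours, about four weeks, for the full set of contigs.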