OrthoFinder

Revision as of 22:38, 21 June 2016

Introduction

OrthoFinder [1] is an orthology detection (orthogroup inference) tool. Its authors report improved orthogroup inference accuracy [1], and some users regard it as better than the widely used OrthoMCL.

Helpfile

To see the help text once the module is loaded, run the main command with no arguments:

orthofinder.py

Key points are as follows:

Simple Usage

Infers orthologous groups for the proteomes contained in fasta_directory, running number_of_blast_threads BLAST searches in parallel and subsequently number_of_orthofinder_threads threads in parallel for the OrthoFinder algorithm:

python orthofinder.py -f fasta_directory [-t <number_of_blast_threads>] [-a <number_of_orthofinder_threads>]

Advanced Usage

1. Prepares files for BLAST and prints the BLAST commands. Does not perform BLAST searches or infer orthologous groups. Useful if you want to prepare the files in the form required by OrthoFinder but perform the BLAST searches using a job scheduler or on a cluster, and then infer orthologous groups using option 2.

python orthofinder.py -f fasta_directory -p

2. Infers orthologous groups using pre-calculated BLAST results. These can be the results of BLAST searches completed after using option 1, or the WorkingDirectory from a previous OrthoFinder run. Species can be commented out with a '#' in the SpeciesIDs.txt file to exclude them from the analysis. See the README file for details.

python orthofinder.py -b precalculated_blast_results_directory [-a number_of_orthofinder_threads]

3. Adds species from fasta_directory to a previous OrthoFinder run, where precalculated_blast_results_directory is the directory containing the BLAST results files etc. from the previous run.

python orthofinder.py -b precalculated_blast_results_directory -f fasta_directory [-t number_of_blast_threads] [-a number_of_orthofinder_threads]
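
Option 2 mentions excluding species by commenting them out in SpeciesIDs.txt. A minimal Python sketch of that edit follows. It assumes each line of the file has the form "<id>: <fasta filename>" (check your own file and the README before relying on this), and the function name comment_out_species is made up for illustration.

```python
from __future__ import print_function  # keeps the example usable under python/2.7


def comment_out_species(lines, fasta_names):
    """Return a copy of the SpeciesIDs.txt lines with selected species commented out.

    lines       -- lines of SpeciesIDs.txt (assumed form: "0: SpeciesA.fa")
    fasta_names -- set of fasta filenames to exclude from the analysis
    """
    result = []
    for line in lines:
        stripped = line.strip()
        # Leave blank lines and already-commented lines unchanged.
        if not stripped or stripped.startswith("#"):
            result.append(line)
            continue
        # Text after the first ':' is taken to be the fasta filename (assumption).
        _, _, name = stripped.partition(":")
        if name.strip() in fasta_names:
            result.append("#" + line)
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    # Hypothetical file contents, not taken from a real run.
    example = ["0: SpeciesA.fa\n", "1: SpeciesB.fa\n", "2: SpeciesC.fa\n"]
    for out_line in comment_out_species(example, {"SpeciesB.fa"}):
        print(out_line, end="")
```

The function works on a list of lines rather than on the file itself, so the original SpeciesIDs.txt can be backed up before the edited lines are written out.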

Options and Arguments

-f fasta_directory, --fasta fasta_directory
    Predict orthogroups for the proteins in the fasta files in the fasta_directory

-b precalculated_blast_results_directory, --blast precalculated_blast_results_directory
    Predict orthogroups using the pre-calculated BLAST results in precalculated_blast_results_directory.

-t number_of_blast_threads, --threads number_of_blast_threads
    The number of BLAST processes to be run simultaneously. This should be increased by the user to at least 
    the number of cores on the computer so as to minimise the time taken to perform the BLAST all-versus-all 
    queries. [Default is 16]

-a number_of_orthofinder_threads, --algthreads number_of_orthofinder_threads
    The number of threads to use for the OrthoFinder algorithm and MCL after BLAST searches have been completed. 
    Running the OrthoFinder algorithm with a number of threads simultaneously increases the RAM 
    requirements proportionally so be aware of the amount of RAM you have available (and see README file). 
    Additionally, as the algorithm implementation is very fast, file reading is likely to be the 
    limiting factor above about 5-10 threads, and additional threads may have little effect other than
    increasing RAM requirements. [Default is 1]

-x speciesInfoFilename, --orthoxml speciesInfoFilename
    Output the orthogroups in the orthoxml format using the information in speciesInfoFilename.

-p, --prepare
    Only prepare the files in the format required by OrthoFinder and print out the BLAST searches that
    need to be performed, but don't run BLAST or infer orthologous groups.

-h, --help
    Print this help text
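
The guidance on -t and -a above can be folded into a small helper. This is only a sketch: the function name build_orthofinder_command is hypothetical, and the cap of 8 algorithm threads is an illustrative value picked from inside the 5-10 range mentioned above, not an OrthoFinder default.

```python
from __future__ import print_function  # keeps the example usable under python/2.7

import multiprocessing


def build_orthofinder_command(fasta_directory, blast_threads=None, alg_threads=1):
    """Build the argument list for a simple OrthoFinder run.

    blast_threads defaults to the number of cores, following the help text's
    advice to use at least that many BLAST processes.
    """
    if blast_threads is None:
        blast_threads = multiprocessing.cpu_count()
    # Cap algorithm threads: above roughly 5-10, file reading tends to be the
    # bottleneck and extra threads mostly cost RAM (cap of 8 is an assumption).
    alg_threads = max(1, min(alg_threads, 8))
    return ["orthofinder.py", "-f", fasta_directory,
            "-t", str(blast_threads), "-a", str(alg_threads)]


if __name__ == "__main__":
    cmd = build_orthofinder_command("ExampleDataset/", alg_threads=4)
    print(" ".join(cmd))
    # To launch the run, pass the list to subprocess.check_call(cmd).
```

Returning a list rather than a single string lets the command be passed to subprocess without shell quoting issues.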

Usage patterns

Many of these are taken from OrthoFinder's GitHub page, linked below.

The examples can be verbose because their outputs are included.

First, make sure the right modules are loaded:

$ module load OrthoFinder

Currently Loaded Modulefiles:
  1) modules              2) python/2.7           3) R/3.2.1              4) blastall/2.2.26      5) blastScripts/1.0.0   6) openmpi/1.6.5        7) mcl/14-137           8) mafft/7.147          9) FastTree/2.1.8      10) OrthoFinder/0.6.0

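Both python/2.7.6 and python/2.7.11 modules should work. To use python/2.7.11, unload python/2.7 first:

$ module unload python/2.7
$ module load python/2.7.11 OrthoFinder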

OrthoFinder has an example dataset, which can be run with the -f option:

$ orthofinder.py -f ExampleDataset/

OrthoFinder version 0.6.0 Copyright (C) 2014 David Emms

    This program comes with ABSOLUTELY NO WARRANTY.
    This is free software, and you are welcome to redistribute it under certain conditions.
    For details please see the License.md that came with this software.

1 threads for OrthoFinder algorithm

1. Checking required programs are installed
-------------------------------------------
Test can run "makeblastdb -help" - ok
Test can run "blastp -help" - ok
Test can run "mcl -h" - ok

2. Temporarily renaming sequences with unique, simple identifiers
------------------------------------------------------------------
Done

3. Dividing up work for BLAST for parallel processing
-----------------------------------------------------

3a. Creating BLAST databases
----------------------------


Building a new DB, current time: 06/21/2016 21:52:11
New DB name:   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/BlastDBSpecies0
New DB title:  /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/Species0.fa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 820 sequences in 0.0515079 seconds.


Building a new DB, current time: 06/21/2016 21:52:11
New DB name:   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/BlastDBSpecies1
New DB title:  /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/Species1.fa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 763 sequences in 0.0477939 seconds.


Building a new DB, current time: 06/21/2016 21:52:11
New DB name:   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/BlastDBSpecies2
New DB title:  /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/Species2.fa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 476 sequences in 0.03089 seconds.


Building a new DB, current time: 06/21/2016 21:52:11
New DB name:   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/BlastDBSpecies3
New DB title:  /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/Species3.fa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 674 sequences in 0.0401189 seconds.

4. Running BLAST all-versus-all
-------------------------------
Maximum number of BLAST processes: 16
2016-06-21 21:52:11 : This may take some time....
2016-06-21 21:52:11 : Running Blast 0 of 16
2016-06-21 21:52:11 : Running Blast 1 of 16
2016-06-21 21:52:11 : Running Blast 2 of 16
2016-06-21 21:52:11 : Running Blast 3 of 16
2016-06-21 21:52:11 : Running Blast 4 of 16
2016-06-21 21:52:11 : Running Blast 6 of 16
2016-06-21 21:52:11 : Running Blast 5 of 16
2016-06-21 21:52:11 : Running Blast 7 of 16
2016-06-21 21:52:11 : Running Blast 8 of 16
2016-06-21 21:52:11 : Running Blast 9 of 16
2016-06-21 21:52:11 : Running Blast 11 of 16
2016-06-21 21:52:11 : Running Blast 10 of 16
2016-06-21 21:52:11 : Running Blast 12 of 16
2016-06-21 21:52:11 : Running Blast 13 of 16
2016-06-21 21:52:11 : Running Blast 14 of 16
2016-06-21 21:52:11 : Running Blast 15 of 16
2016-06-21 21:52:17 : Finished Blast 15 of 16
2016-06-21 21:52:18 : Finished Blast 9 of 16
2016-06-21 21:52:19 : Finished Blast 14 of 16
2016-06-21 21:52:19 : Finished Blast 13 of 16
2016-06-21 21:52:19 : Finished Blast 10 of 16
2016-06-21 21:52:20 : Finished Blast 11 of 16
2016-06-21 21:52:20 : Finished Blast 12 of 16
2016-06-21 21:52:23 : Finished Blast 6 of 16
2016-06-21 21:52:23 : Finished Blast 7 of 16
2016-06-21 21:52:23 : Finished Blast 2 of 16
2016-06-21 21:52:23 : Finished Blast 4 of 16
2016-06-21 21:52:23 : Finished Blast 5 of 16
2016-06-21 21:52:24 : Finished Blast 1 of 16
2016-06-21 21:52:25 : Finished Blast 8 of 16
2016-06-21 21:52:28 : Finished Blast 0 of 16
2016-06-21 21:52:30 : Finished Blast 3 of 16
Done!

5. Running OrthoFinder algorithm
--------------------------------
2016-06-21 21:52:31 : Initial processing of each species
2016-06-21 21:52:31 : Starting species 0
2016-06-21 21:52:31 : Initial processing of species 0 complete
2016-06-21 21:52:31 : Starting species 1
2016-06-21 21:52:32 : Initial processing of species 1 complete
2016-06-21 21:52:32 : Starting species 2
2016-06-21 21:52:32 : Initial processing of species 2 complete
2016-06-21 21:52:32 : Starting species 3
2016-06-21 21:52:32 : Initial processing of species 3 complete
2016-06-21 21:52:34 : Connected putatitive homologs
2016-06-21 21:52:35 : Writen final scores for species 2 to graph file
2016-06-21 21:52:35 : Writen final scores for species 3 to graph file
2016-06-21 21:52:35 : Writen final scores for species 1 to graph file
2016-06-21 21:52:35 : Writen final scores for species 0 to graph file
[mclIO] reading </shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/OrthoFinder_v0.6.0_graph.txt>
.......................................
[mclIO] read native interchange 2733x2733 matrix with 5080 entries
[mcl] pid 23551
 ite   chaos  time hom(avg,lo,hi) m-ie m-ex i-ex fmv
  1    19.76  0.00 0.99/0.04/1.42 1.67 1.67 1.67   0
  2    15.38  0.00 0.94/0.39/1.23 1.47 1.20 2.01   4
  3     7.67  0.01 0.93/0.42/1.12 1.25 0.91 1.83   5
  4     6.63  0.01 0.92/0.39/1.22 1.22 0.73 1.33   5
  5     5.23  0.00 0.91/0.25/1.20 1.20 0.90 1.19   2
  6     4.25  0.02 0.89/0.38/1.20 1.17 0.92 1.10   1
  7     3.53  0.01 0.88/0.46/1.00 1.08 0.93 1.02   0
  8     2.30  0.01 0.88/0.50/1.00 1.05 0.94 0.95   0
  9     1.13  0.01 0.88/0.55/1.00 1.02 0.94 0.89   0
 10     1.04  0.01 0.90/0.55/1.00 1.02 0.94 0.84   0
 11     0.73  0.00 0.92/0.50/1.00 1.00 0.93 0.78   0
 12     0.64  0.01 0.94/0.63/1.00 1.00 0.91 0.71   0
 13     0.66  0.00 0.96/0.62/1.00 1.00 0.91 0.64   0
 14     0.67  0.00 0.97/0.62/1.00 1.00 0.88 0.57   0
 15     0.52  0.00 0.98/0.69/1.00 1.00 0.91 0.52   0
 16     0.50  0.01 0.99/0.75/1.00 1.00 0.91 0.47   0
 17     0.42  0.01 0.99/0.59/1.00 1.00 0.95 0.45   0
 18     0.25  0.01 0.99/0.57/1.00 1.00 0.96 0.43   0
 19     0.25  0.00 1.00/0.77/1.00 1.00 0.99 0.42   0
 20     0.25  0.01 1.00/0.77/1.00 1.00 0.98 0.42   0
 21     0.24  0.00 1.00/0.77/1.00 1.00 0.98 0.41   0
 22     0.24  0.01 1.00/0.76/1.00 1.00 0.99 0.41   0
 23     0.18  0.00 1.00/0.83/1.00 1.00 0.99 0.40   0
 24     0.24  0.00 1.00/0.89/1.00 1.00 1.00 0.40   0
 25     0.33  0.01 1.00/0.80/1.00 1.00 1.00 0.40   0
 26     0.44  0.00 1.00/0.68/1.00 1.00 1.00 0.40   0
 27     0.50  0.00 1.00/0.63/1.00 1.00 1.00 0.40   0
 28     0.44  0.00 1.00/0.69/1.00 1.00 1.00 0.40   0
 29     0.24  0.00 1.00/0.84/1.00 1.00 1.00 0.40   0
 30     0.06  0.00 1.00/0.96/1.00 1.00 1.00 0.40   0
 31     0.01  0.01 1.00/1.00/1.00 1.00 1.00 0.40   0
 32     0.00  0.00 1.00/1.00/1.00 1.00 0.99 0.40   0
[mcl] jury pruning marks: <100,99,99>, out of 100
[mcl] jury pruning synopsis: <99.6 or perfect> (cf -scheme, -do log)
[mclIO] writing </shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/clusters_OrthoFinder_v0.6.0_I1.5.txt>
.......................................
[mclIO] wrote native interchange 2733x1331 matrix with 2733 entries to stream </shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/clusters_OrthoFinder_v0.6.0_I1.5.txt>
[mcl] 1331 clusters found
[mcl] output is in /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/WorkingDirectory/clusters_OrthoFinder_v0.6.0_I1.5.txt

Please cite:
    Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD thesis,
    University of Utrecht, May 2000.
       (  http://www.library.uu.nl/digiarchief/dip/diss/1895620/full.pdf
       or  http://micans.org/mcl/lit/svdthesis.pdf.gz)
OR
    Stijn van Dongen, A cluster algorithm for graphs. Technical
    Report INS-R0010, National Research Institute for Mathematics
    and Computer Science in the Netherlands, Amsterdam, May 2000.
       (  http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z
       or  http://micans.org/mcl/lit/INS-R0010.ps.Z)

2016-06-21 21:52:35 : Ran MCL

6. Creating files for Orthologous Groups
----------------------------------------

When publishing work that uses OrthoFinder please cite:
    D.M. Emms & S. Kelly (2015), OrthoFinder: solving fundamental biases in whole genome comparisons
    dramatically improves orthogroup inference accuracy, Genome Biology 16:157.

Orthologous groups have been written to tab-delimited files:
   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/OrthologousGroups.csv
   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/OrthologousGroups_UnassignedGenes.csv
And in OrthoMCL format:
   /shelf/scratch/ramon/swmake/OrthoFinder-0.6/ExampleDataset/Results_Jun21/OrthologousGroups.txt
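
To work with the OrthoMCL-format file afterwards, a short parser may help. This sketch assumes each line has the usual OrthoMCL layout, "group_name: gene1 gene2 ...". The function name parse_orthomcl_groups and the sample lines are invented for illustration and are not taken from the run above.

```python
from __future__ import print_function  # keeps the example usable under python/2.7


def parse_orthomcl_groups(lines):
    """Map each group name to its list of gene identifiers.

    Assumes one group per line in the form "group_name: gene1 gene2 ...".
    Lines without a ':' separator are skipped.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, sep, members = line.partition(":")
        if not sep:
            continue
        groups[name.strip()] = members.split()
    return groups


if __name__ == "__main__":
    # Hypothetical content; for real data, read OrthologousGroups.txt instead:
    #   with open("OrthologousGroups.txt") as handle:
    #       groups = parse_orthomcl_groups(handle)
    sample = ["groupA: gene1 gene2 gene3", "groupB: gene4"]
    for group_name, genes in sorted(parse_orthomcl_groups(sample).items()):
        print(group_name, len(genes))
```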


Links

  • OrthoFinder's GitHub page, with some usage details

References

  1. D.M. Emms & S. Kelly (2015), OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy, Genome Biology 16:157.