T-coffee

Introduction

T-Coffee is a multiple sequence aligner, used especially for proteins, written by Cedric Notredame of the CRG in Barcelona.

It is quite a large program with many options and can interact with a number of other aligners, especially Clustal, as Cedric has often collaborated with Desmond Higgins.

T-Coffee stands for "Tree-based Consistency Objective Function for Alignment Evaluation". Its main distinction is that it is a consistency-based aligner.


Usage

T-Coffee is predominantly a command-line program. If you want protein alignments using a Graphical User Interface, try clustalx.

Despite the hyphen in its name, the T-Coffee executable is actually t_coffee, with an underscore.
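
A quick way to confirm that the underscore-named executable is installed is to look it up on the PATH. The following is a minimal sketch; the helper name have_cmd is made up for illustration and is not part of T-Coffee.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Note the underscore: the executable is t_coffee, not t-coffee.
if have_cmd t_coffee; then
    echo "t_coffee found at: $(command -v t_coffee)"
else
    echo "t_coffee not found on PATH" >&2
fi
```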

As input it can take various types of data, each identified by a letter:

  • P : PDB structure
  • S : Sequences (aligned or unaligned sequences)
  • M : Methods used to build the library
  • L : Precomputed T-Coffee library
  • A : Alignments that must be turned into a Library
  • X : Substitution matrices
  • R : Profiles

These letters are used with the -in option. Alternative options such as -seq also exist, so feeding the program an unaligned multi-FASTA file is as easy as:

t_coffee -seq sh3.fasta -outfile yourchoiceofname.aln

In this case the result is written to a file called "yourchoiceofname.aln", in Clustal format. T-Coffee also writes a file called "yourchoiceofname.aln.html", a colorized, more visually friendly version of the alignment which can be viewed in a browser.
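
For the -in option, the type letter is written directly in front of the file or method name, so the sequence file above would become "Ssh3.fasta". The sketch below only builds and prints such a command line; it does not run T-Coffee. The helper tc_in_arg is hypothetical, and the "-in=" and "-outfile=" spelling is an assumption based on how the T-Coffee documentation writes its examples, so check it against your installed version.

```shell
#!/bin/sh
# tc_in_arg is a hypothetical helper (not part of T-Coffee): it prefixes one
# of the type letters listed above to a name, e.g. S + sh3.fasta -> Ssh3.fasta.
tc_in_arg() {
    case "$1" in
        P|S|M|L|A|X|R) printf '%s%s\n' "$1" "$2" ;;
        *) echo "unknown type letter: $1" >&2; return 1 ;;
    esac
}

# Build the -in argument for an unaligned sequence file (type S).
seq_arg=$(tc_in_arg S sh3.fasta)

# Print the command instead of running it, so it can be inspected first.
# The -in=... form is an assumption; see the lead-in above.
echo "t_coffee -in=$seq_arg -outfile=yourchoiceofname.aln"
```

Several inputs of different types can be prepared the same way, calling the helper once per input.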