Cd-hit - Revision history. Page created by Rf, 2016-05-01T15:23:25Z: http://stab.st-andrews.ac.uk/wiki/index.php?title=Cd-hit&diff=194&oldid=prev
<p><b>New page</b></p><div>=Introduction=<br />
<br />
CD-HIT is primarily a clustering program. As input it takes FASTA sequence files, typically ones intended to serve as databases against which query sequences will be searched.<br />
<br />
A major concern with such FASTA files is how much redundancy they contain. Depending on the experiment or analysis being run, the database may hold more detail than is needed, in which case it helps to cluster similar sequences together. CD-HIT is used for this.<br />
<br />
=Common use-cases=<br />
<br />
== Clustering the Antibiotic Resistance Gene database ==<br />
<br />
cdhit-est -i argannot-nt_doc.fasta -o argannot_cdhit90 -d 0 > argannot_cdhit90.stdout</div>
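Alongside the representative sequences in argannot_cdhit90, CD-HIT writes a cluster listing to argannot_cdhit90.clstr, in which every cluster begins with a ">Cluster N" header line. A quick way to see how much redundancy was removed is to compare the number of input sequences with the number of clusters. The following is a minimal sketch; the helper names count_sequences and count_clusters are hypothetical and are not part of CD-HIT.

```shell
# Hypothetical helpers (not part of CD-HIT) for a quick before/after comparison.

# Count the records in a FASTA file: each record starts with a ">" header line.
count_sequences() {
    grep -c '^>' "$1"
}

# Count the clusters in a CD-HIT .clstr file: each cluster starts with ">Cluster N".
count_clusters() {
    grep -c '^>Cluster' "$1"
}
```

For the run above, this could be used as count_sequences argannot-nt_doc.fasta followed by count_clusters argannot_cdhit90.clstr. The closer the two numbers are, the less redundancy the input contained at the chosen identity threshold.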