Difference between revisions of "Two Eel Scaffolds"


Revision as of 11:56, 13 May 2016

Introduction

Two DNA scaffolds are presented:

  1. eelScaffold32. 679 422 bp and 42.25% GC.
  2. eelScaffold320. 246 433 bp and 43.47% GC.
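
Length and GC figures like those above can be recomputed from the FASTA files. The following is a minimal sketch, not the method used for the numbers quoted here: the function names are illustrative, and it assumes GC% is taken over unambiguous bases (A, C, G, T) only, which may differ from how the original values were obtained.

```python
def read_fasta(text):
    """Return {record_name: sequence} parsed from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = (line[1:].split() or [""])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def gc_percent(seq):
    """GC percentage over A/C/G/T only (assumption: N and other codes are ignored)."""
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy record for illustration; not real eel sequence.
demo = ">demoScaffold toy example\nATGCGC\nATTA\n"
for name, seq in read_fasta(demo).items():
    print(name, len(seq), "bp", f"{gc_percent(seq):.2f}% GC")
```

For a real scaffold, the text of a file such as eelScaffold320.fa would be passed to read_fasta in place of the toy string.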

We take tilapia (Oreochromis niloticus, Ensembl abbreviation ONI) to be the reference.

Two genes are expected to lie in or near the regions covered by these scaffolds:

  • eelScaffold32 may contain all or part of PDCD10b (programmed cell death 10b).
  • eelScaffold320 may contain all or part of nrd1a (nardilysin, N-arginine dibasic convertase).

ORF Analysis

ORF scans of sequences over 100 kbp often produce too much data to interpret directly, but they can be a useful first step for gauging the complexity of the sequence.
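
To make the idea of an ORF scan concrete, here is a minimal sketch of a six-frame scan. It is not the tool used for this analysis: the function names, the ATG-to-stop definition of an ORF, and the min_codons threshold are all assumptions chosen for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement; characters other than A/C/G/T are left unchanged."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end, frame) for ATG..stop ORFs on one strand.

    Coordinates are 0-based, half-open, and relative to the strand given.
    The length in codons includes the stop codon.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield (start, i + 3, frame)
                start = None


def find_orfs(seq, min_codons=100):
    """Scan both strands; minus-strand coordinates refer to the reverse complement."""
    seq = seq.upper()
    hits = [("+", s, e, f) for s, e, f in orfs_in_strand(seq, min_codons)]
    hits += [("-", s, e, f)
             for s, e, f in orfs_in_strand(reverse_complement(seq), min_codons)]
    return hits


# Toy sequence for illustration only.
print(find_orfs("ATGAAATAG", min_codons=3))
```

Raising min_codons is the simplest way to cut down the volume of hits on scaffolds of several hundred kbp.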

Gene Predictor

One of the most up-to-date gene predictors (as of 2016) is Augustus. It uses HMM profiles based on a related organism. For eel, Augustus makes two candidate organisms available: zebrafish (zb) and lamprey (lp). Tilapia is not available, but given time it is possible to train and establish an HMM profile for this organism.

An example Augustus command line is as follows:

augustus --species=lamprey eelScaffold320.fa >aug_s320_lp.gtf
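
Once a prediction file exists, a quick tally of feature types gives a first impression of what was predicted. The sketch below assumes only the standard nine tab-separated GTF columns and '#' comment lines; the function name and the toy lines are illustrative and are not real Augustus output.

```python
from collections import Counter


def count_features(gtf_text):
    """Count entries per feature type (column 3) in GTF-formatted text."""
    counts = Counter()
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # skip anything that is not a 9-column record
        counts[fields[2]] += 1
    return counts


# Toy lines for illustration; not real Augustus output.
demo = (
    "# toy example\n"
    "demoScaffold\tAUGUSTUS\tgene\t1\t900\t0.5\t+\t.\tg1\n"
    'demoScaffold\tAUGUSTUS\tCDS\t1\t300\t0.9\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";\n'
    'demoScaffold\tAUGUSTUS\tCDS\t600\t900\t0.9\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";\n'
)
print(dict(count_features(demo)))
```

For the command above, the contents of aug_s320_lp.gtf would be read and passed in place of the toy text.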


Augustus outputs in the GTF format; for browsing, we need to convert this to the related GFF format:

gtf2gff.pl --printExon --gff3 < aug_s32_lp.gtf --out=aug_s32_lp.gff
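
One visible difference between the two formats is the syntax of the ninth (attributes) column: GTF uses key "value"; pairs, whereas GFF3 uses key=value pairs separated by semicolons. The sketch below illustrates only that syntactic difference. It is not a reimplementation of gtf2gff.pl, and the function name is hypothetical; a real conversion does more, since GFF3 expresses the gene/transcript hierarchy through ID and Parent attributes and requires escaping of reserved characters.

```python
import re

# Matches GTF-style attribute pairs such as: gene_id "g1";
_ATTR = re.compile(r'(\S+)\s+"([^"]*)"')


def gtf_attrs_to_gff3(attr_column):
    """Rewrite GTF 'key "value";' pairs as GFF3-style 'key=value' pairs.

    Assumption: values contain no characters that GFF3 would require escaping.
    """
    pairs = _ATTR.findall(attr_column)
    return ";".join(f"{key}={value}" for key, value in pairs)


print(gtf_attrs_to_gff3('transcript_id "g1.t1"; gene_id "g1";'))
```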