Edgenl2g

Revision as of 12:09, 3 October 2016 by Rf (talk | contribs)

Introduction

Details of Edinburgh Genomics' Linux for Genomics course

Duration

  • 1 day
  • Two-thirds core Linux, one-third genomics focus

General Contents

Core Linux

  • The shell and commands
  • Getting help
  • Files and directories
  • Navigating the file system
  • File management
  • Permissions
  • Accessing files
  • Downloading remote files
  • Zipping and unzipping files
  • Pipes and redirects
  • Filtering / manipulating file content
  • Shell scripts
  • Process management

Focused Genomics

Command-line tools for genomics

  • seqtk
  • bioawk
  • samtools
  • bedtools
  • tabix

Detailed Contents Part One: Linux

The command ls lists files and subdirectories in a directory

The command man provides help for a command

Basic Linux/Unix tips for filenames

Changing directories

Tab completion for commands and filenames

Command history

Making and removing (empty) directories

Text editors

Reading text files

Copying files

Removing directories

Piping and outputting to files

Grep

What permissions mean
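
As a rough illustration of this topic, the sketch below sets permissions on a scratch file and prints the resulting permission string. The helper name show_perms is made up for this example; chmod, ls, and cut are standard tools.

```shell
# Show the permission string (first 10 characters of ls -l) for a file.
# show_perms is a hypothetical helper name used only in this example.
show_perms() {
    ls -l "$1" | cut -c1-10
}

tmpfile=$(mktemp)          # scratch file, so nothing real is modified
chmod 640 "$tmpfile"       # owner: read+write (6), group: read (4), others: none (0)
show_perms "$tmpfile"      # -rw-r-----
chmod u+x "$tmpfile"       # symbolic form: add execute for the owner only
show_perms "$tmpfile"      # -rwxr-----
rm -f "$tmpfile"
```

The first character is the file type (- for a regular file), followed by three read/write/execute triplets for the owner, the group, and everyone else.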

Head and tail

Redirection

Working with zipped data
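
A minimal sketch of this topic, assuming the data are gzip-compressed (the course may also cover other formats). The file names and the helper count_gz_lines are invented for the example.

```shell
# Count lines in a gzip-compressed file without writing an uncompressed copy.
# count_gz_lines is a hypothetical helper name used only in this example.
count_gz_lines() {
    gzip -dc "$1" | wc -l | tr -d ' '
}

tmpdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$tmpdir/example.txt"
gzip "$tmpdir/example.txt"                 # leaves example.txt.gz, removes example.txt
count_gz_lines "$tmpdir/example.txt.gz"    # 3
gzip -d "$tmpdir/example.txt.gz"           # restores example.txt
rm -r "$tmpdir"
```

Streaming with gzip -dc into a pipe is often preferable for large sequence files, because the uncompressed data never has to be stored on disk.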

Some other useful information

Stopping processes

Clearing the terminal

Copying and pasting text

Environment Variables

The FASTQ format

FASTQ on the command line
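
One common command-line task is counting the reads in a FASTQ file. The sketch below assumes the usual four-lines-per-record layout (no wrapped sequences); the reads and the helper name count_reads are made up for illustration.

```shell
# Count FASTQ records on standard input, assuming exactly four lines per record.
# count_reads is a hypothetical helper name used only in this example.
count_reads() {
    awk 'END { print NR / 4 }'
}

# Two made-up reads.
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nHHHH\n' | count_reads   # 2
```

Dividing the line count by four is safer than counting lines that start with @, because @ is also a valid quality character and can appear at the start of a quality line.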

Using paste to manipulate FASTQ
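
A minimal sketch of the idea, assuming four-line FASTQ records: paste given four - arguments reads four lines at a time from standard input and joins them with tabs, so each record ends up on one line. The reads and the helper name fastq_to_tab are invented for the example.

```shell
# Put each four-line FASTQ record on a single tab-separated line.
# fastq_to_tab is a hypothetical helper name used only in this example.
fastq_to_tab() {
    paste - - - -
}

# Two made-up reads; output columns are: header, sequence, separator, qualities.
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nHHHH\n' | fastq_to_tab
```

Once a record is one line, line-oriented tools such as grep, sort, cut, and awk can operate on whole reads; translating the tabs back to newlines with tr '\t' '\n' restores the FASTQ layout.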

awk for data in columns
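
As an illustration, the sketch below filters a small tab-separated table by a numeric column. The table (name, chromosome, length) and the helper name filter_by_length are made up for this example.

```shell
# Print name and length for rows whose third column exceeds a minimum.
# filter_by_length is a hypothetical helper name used only in this example.
filter_by_length() {
    # -F '\t' splits fields on tabs; -v passes the shell argument into awk.
    # Adding 0 forces a numeric comparison.
    awk -F '\t' -v min="$1" '$3 + 0 > min + 0 { print $1, $3 }'
}

# Made-up table: name, chromosome, length.
printf 'geneA\tchr1\t1200\ngeneB\tchr2\t300\ngeneC\tchr1\t950\n' | filter_by_length 500
```

In awk, $1, $2, and so on refer to columns of the current line, and the pattern before the braces selects which lines the action applies to.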

FASTQ to FASTA conversion
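
One possible way to do this with awk alone, assuming four-line FASTQ records; this is a sketch, not necessarily the approach taught in the course. The reads and the helper name fastq_to_fasta are invented for the example.

```shell
# Convert four-line FASTQ records on standard input to FASTA.
# fastq_to_fasta is a hypothetical helper name used only in this example.
fastq_to_fasta() {
    # Line 1 of each record: replace the leading "@" with ">".
    # Line 2 of each record: print the sequence unchanged.
    # Lines 3 and 4 (separator and qualities) are dropped.
    awk 'NR % 4 == 1 { print ">" substr($0, 2) } NR % 4 == 2 { print }'
}

# Two made-up reads.
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nHHHH\n' | fastq_to_fasta
```

The conversion discards quality scores, so it cannot be reversed.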

Process Management

Simple shell scripts

If you finish and you are bored

Detailed Contents Part Two: Command-line tools for Genomics

bedtools

seqtk

samtools

bioawk