Hdi2u intro

Course schedule

  • This is a cut-down version of a one-day course.
  • History and theory have been left out
  • Scripting is excluded (although there are plenty of one-liners)
  • The aim is to maximise the practical aspect.
  • Having said that, if you fall behind, listening is better than trying to catch up

Course website: http://stab.st-andrews.ac.uk/hdi2u/

Connecting to a remote Machine

This is presented before the introduction because some people might experience delays logging in.

  • We shall use a remote machine, not the machine you are logged into locally
  • The program we shall use is PuTTY.
  • Please try to locate PuTTY in the applications section or on AppsAnywhere

Configuring PuTTY for connection

  • Server: marvin.st-andrews.ac.uk
  • Terminal | Keyboard | check VT100+
  • Window | Selection | Control use of mouse | set xterm
  • Connection | Data | enter username
  • Connection | SSH | X11 forwarding | check yes
  • Back to PuTTY main screen | select Default Settings | click Save

You should now be able to open a session, enter your password, and get connected to marvin.
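
For anyone connecting from a Mac or Linux terminal instead of PuTTY, the settings above correspond roughly to a single ssh command. This is a minimal sketch, assuming the OpenSSH client is installed; the function name connect_marvin and the placeholder your_username are illustrative and not part of the course material.

```shell
# Hypothetical helper: open an SSH session to marvin for the given username.
# The -X option requests X11 forwarding, like the X11 setting in PuTTY.
connect_marvin() {
    ssh -X "$1@marvin.st-andrews.ac.uk"
}
```

Calling connect_marvin your_username would then prompt for your password, just as PuTTY does.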

Unix nearly 50 years old

  • Inspired by CTSS timesharing systems 1964
  • Computers were much slower then … but there was a lot less data too
  • Computers are now much faster … but still fall short in meeting big data challenges

Why so many different Unixes?

  • AIX, IBM’s Unix
  • HP-UX, HP’s Unix
  • Solaris, Sun’s (Oracle’s) Unix
  • Linux: Ubuntu, Debian, RedHat, SuSE, many others.
  • Mac OS X: is a Unix “under the hood”
  • On Windows, you can use Cygwin or install a virtual Linux.

Linux particularities

  • Connected to Open source code (GNU)
  • A grassroots movement
  • Immense information out on the web

Unix and Genomics: Common ground


  • A few large files, multitude of small files
  • Small inefficiencies add up to large delays


  • Automation
  • Small, gradual improvements
  • Focus on performance

Represents a style of work


  • Small tools, do one thing well
  • Combine these as building blocks for larger tasks
  • Look out for small inefficiencies: they add up to large delays
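
The building-block style can be sketched with a short pipeline. The example below is illustrative only: the file example.fa and its contents are made up, and it assumes the standard grep, cut, sort, uniq, wc and head tools are available.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)

# A tiny made-up FASTA file: header lines start with '>'.
printf '>seq1 human\nACGT\n>seq2 mouse\nGGCC\n>seq3 human\nTTAA\n' > "$workdir/example.fa"

# Two small tools: grep keeps the header lines, wc -l counts them.
n_seqs=$(grep '^>' "$workdir/example.fa" | wc -l)
echo "sequences: $n_seqs"

# Chain more tools: cut picks the second field of each header,
# sort groups equal values, uniq -c counts each group,
# sort -rn puts the largest count first, head keeps the top line.
top_species=$(grep '^>' "$workdir/example.fa" | cut -d' ' -f2 | sort | uniq -c | sort -rn | head -n 1)
echo "most common: $top_species"
```

Each tool does one small job, and the pipe symbol passes the output of one to the next. Swapping a single stage changes the question being asked without rewriting the rest.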

              Good news                 Bad news
Details       It’s there somewhere      Demands patience
Preparation   Subsequent actions easy   First time is hard
Memorizing    Repetition strengthens    Reliance on memory

Things to get used to

On one hand     On the other hand
Personal        Shared
Single load     Batch load
General usage   Focused usage

The command line (also called the shell) is Unix’s central tool

Unix Philosophy


  • Effective use of the command-line
  • Single optimised small tools can be used as building blocks
  • Exposes details rather than hiding them
  • A powerful approach, but one where big mistakes are easily made


  • Test before executing
  • Realise that the tiniest of details can be important
  • Consult the help documentation continuously
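
One way to test before executing is to preview what a command will act on before running it for real. The sketch below uses made-up file names in a temporary directory and is one possible habit, not a prescribed procedure.

```shell
# Set up a throwaway directory with three made-up files.
workdir=$(mktemp -d)
touch "$workdir/sample1.tmp" "$workdir/sample2.tmp" "$workdir/results.txt"

# Step 1: test. Let ls show exactly which files the pattern matches.
ls "$workdir"/*.tmp

# Step 2: still testing. Putting echo in front prints the command
# with the pattern expanded, instead of running it.
echo rm "$workdir"/*.tmp

# Step 3: execute only once the preview looks right.
rm "$workdir"/*.tmp

# Check what is left behind.
remaining=$(ls "$workdir")
echo "left: $remaining"
```

A single stray character in the pattern would change which files are deleted, which is why the preview steps come first. Running man rm shows the documentation for the command, including its options.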