Intro to RNA-Seq Data Analysis Course
Contents
Course schedule
- This is based on a 2-day Edinburgh Genomics course of the same name, with the following changes:
 
- - "Introduction to Linux" module excluded
- - "Sequencer technology overview" module excluded
- - No laboratory visit
- - 50% of that course was theoretical; here this will be reduced to 30%
 
- Each section begins with a "Talk", followed by a practical run-through.
- If necessary, some talk slides may be skipped, as the main aim is to get through the practicals.
- Having said that, if major theoretical points arise during a practical, they will be discussed.
- Course website: http://stab.st-andrews.ac.uk/i2rda/
- - This has all the presentations and practicals.
 
Connecting to a remote Machine
This is presented before the introduction because some people might experience delays logging in.
- We shall use a remote machine, not the machine you are logged into locally.
 - The program we shall use is PuTTY.
 - Please try to locate PuTTY in the applications section or on AppsAnywhere
 
Configuring PuTTY for connection
- Server: marvin.st-andrews.ac.uk
- Terminal | Keyboard | check VT100+
- Window | Selection | Control use of mouse | set xterm
- Connection | Data | enter username
- Connection | SSH | X11 Forwarding | check yes
- Back to PuTTY main screen | select Default Settings | click Save
 
You should now be able to "open" a session.
- Be aware that your password is typed blind, i.e. it does not appear on the screen.
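For anyone connecting from a macOS or Linux terminal rather than through PuTTY, the settings above correspond roughly to a single OpenSSH command. This is only a sketch and not part of the course setup: `your_username` is a placeholder, and it assumes an `ssh` client is installed locally. The `-X` option requests X11 forwarding, which matches the X11 checkbox in PuTTY.

```shell
# Placeholder: replace with the username you were given for the course.
user="your_username"
# Server name taken from the PuTTY settings above.
host="marvin.st-andrews.ac.uk"

# -X asks ssh to forward X11, like the "X11 Forwarding" box in PuTTY.
ssh_cmd="ssh -X ${user}@${host}"

# Print the command; paste it into a terminal to actually connect.
echo "$ssh_cmd"
```

As with PuTTY, the password prompt that follows does not echo what you type.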
 
Computing resources
- RNA-Seq is characterised by:
 
- - heavy workload
 - - several software programs
 - - long-running tasks.
 
These have three implications:
- - The marvin cluster is an 11-machine shared computing resource, not a personal computer: others are using it.
- - We need to load the special software before using it.
- - We want to be able to leave a process running unattended.
 
- For these three aspects, we have:
 
- - A queue system: we shall request an interactive session (`qrsh`) from the queue.
- - The module system, used to load, list and unload software programs.
- - The GNU Screen utility, so we can do other things while waiting.
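The three tools above can be checked from the command line before starting. The following is a minimal sketch: `check_tool` is a helper made up for this illustration, and `SOME_TOOL` in the comments is a placeholder rather than a real module name on marvin. The commented-out lines show the usual order of commands on the cluster, assuming the standard `module` subcommands (`avail`, `load`, `list`, `unload`).

```shell
# Report whether a command (or shell function) is visible in this shell.
# check_tool is a hypothetical helper, not something the cluster provides.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: available"
    else
        echo "$1: not found"
    fi
}

# The three tools mentioned above.
for tool in qrsh module screen; do
    check_tool "$tool"
done

# On the cluster itself, the workflow would then look roughly like this
# (left commented out because it only makes sense there):
#   qrsh                      # request an interactive session from the queue
#   module avail              # list the software the module system offers
#   module load SOME_TOOL     # SOME_TOOL is a placeholder, not a real module name
#   module list               # show what is currently loaded
#   module unload SOME_TOOL   # unload it again
```

If any of the three is reported as not found on the cluster, it is worth asking before going further, since the later practicals rely on them.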
 
GNU Screen 1
A program which allows several command-line sessions to be open at once, similar to open tabs in a web browser. Let's try it out.
- To enter a new session, type `screen`
- This will open quite a bare screen, except for an indicator line at the bottom.
- Screen works on the activator key concept: you need to use `Ctrl+l` (while the `Ctrl` key is held down, the `l` key is briefly pressed) to activate any of its functions.
- After pressing `Ctrl+l` and releasing, you then have a series of single keystrokes that will do various useful things.
- There will be two command-line windows open when you start it.
 - Let's learn how to get out of it first
 
- - Type `exit`; you should see that you have one command-line session fewer.
- - Type `exit` again and you will be told you have exited screen.
- - You are now back at the ordinary command line.
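If it is ever unclear whether a prompt belongs to screen or to the ordinary command line, the shell can tell you. GNU Screen sets the `STY` environment variable inside its sessions, so a small check like the sketch below works; `where_am_i` is a name invented for this example.

```shell
# GNU Screen sets STY inside its sessions; it is unset or empty elsewhere.
# where_am_i is a hypothetical helper name used only for this illustration.
where_am_i() {
    if [ -n "${STY:-}" ]; then
        echo "inside screen session ${STY}"
    else
        echo "ordinary command line"
    fi
}

where_am_i
```

Running it before typing `exit` is a quick way to avoid closing the wrong session.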
 
GNU Screen 2
- Go back into screen
- Switch back and forth between the two open sessions: use `Ctrl+l,n` (n for next) or `Ctrl+l,p` (p for previous).
- Don't see anything different when you do this? Look again at the bottom line: the asterisk has changed position.
 
- - The asterisk marks the active session.
- You can move to a numbered session with `Ctrl+l,1` or `Ctrl+l,2` for session no. 1 and no. 2 respectively.
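Screen can also be driven from the ordinary command line, which is one way to leave a long task running unattended. The sketch below starts a detached session, lists it, and closes it again. The session name `i2rda_demo` is made up for this example, and `sleep 60` merely stands in for a long-running task.

```shell
# Hypothetical session name, used only for this illustration.
session="i2rda_demo"

if command -v screen >/dev/null 2>&1; then
    # -d -m: start the session detached; -S: give it a name.
    screen -dmS "$session" sleep 60
    # List sessions; the new one should be shown as Detached.
    screen -ls
    # Tidy up by telling that session to quit.
    screen -S "$session" -X quit
    status="demo finished"
else
    status="screen not installed, demo skipped"
fi

echo "$status"
```

For a real job you would skip the quit step and reattach later with `screen -r i2rda_demo` to check on progress.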
Overview of RNA-Seq
- For gene expression analyses, RNA-Seq is seen as a more powerful replacement for microarrays