Difference between revisions of "ChIP-Seq Top2 peak-calling E2"

m (Protected "ChIP-Seq Top2 peak-calling E2" ([Edit=Allow only administrators] (indefinite) [Move=Allow only administrators] (indefinite)))
 
(6 intermediate revisions by one other user not shown)
Line 68: Line 68:
 
  └── 2017-06-27-08-T2WTINP_S8_L004_R2_001.fastq.gz
 
  └── 2017-06-27-08-T2WTINP_S8_L004_R2_001.fastq.gz
  
= Raw dataset quality analysis =
+
== Concatenation ==
  
Using FASTQC and then MultiQC.
+
Keeping the raw data split by lane is not necessary for these experiments, so the per-lane files above were concatenated to form:
  
Results linked [http://stab.st-andrews.ac.uk/top2/fastqc.html here].
+
rawcatdata/
 +
├── ULS1_INP_R1.fastq.gz
 +
├── ULS1_INP_R2.fastq.gz
 +
├── ULS1_IP_R1.fastq.gz
 +
├── ULS1_IP_R2.fastq.gz
 +
├── ULS1_T2_INP_R1.fastq.gz
 +
├── ULS1_T2_INP_R2.fastq.gz
 +
├── ULS1_T2_IP_R1.fastq.gz
 +
├── ULS1_T2_IP_R2.fastq.gz
 +
├── WT_T2_INP_R1.fastq.gz
 +
├── WT_T2_INP_R2.fastq.gz
 +
├── WT_T2_IP_R1.fastq.gz
 +
└── WT_T2_IP_R2.fastq.gz
 +
(order not respected)
  
= Adaptor-cutting and filtering procedure =
+
= Dataset quality analysis and filtering =
  
* First run was without any filtering or cutting.
+
== Raw unfiltered data ==
  
= BAM file quality analysis =
+
Using FASTQC to view quality aspects individually:
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_INP_R1_fastqc.html ULS1_INP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_INP_R2_fastqc.html ULS1_INP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_IP_R1_fastqc.html ULS1_IP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_IP_R2_fastqc.html ULS1_IP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_INP_R1_fastqc.html ULS1_T2_INP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_INP_R2_fastqc.html ULS1_T2_INP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_IP_R1_fastqc.html ULS1_T2_IP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_IP_R2_fastqc.html ULS1_T2_IP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_INP_R1_fastqc.html WT_T2_INP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_INP_R2_fastqc.html WT_T2_INP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_IP_R1_fastqc.html WT_T2_IP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_IP_R2_fastqc.html WT_T2_IP_R2]
  
Using BamQC.
+
and then MultiQC to integrate the FASTQC files into one report
  
== For the unfiltered reads ==
+
* [http://stab.st-andrews.ac.uk/top2/fqc1/multiqc_report.html integrated quality report].
  
Results linked [http://stab.st-andrews.ac.uk/top2/bamqc.html here].
+
== Adaptor-cutting and filtering procedure ==
  
= Papers =
+
The program '''cutadapt''' was applied first (as advised), followed by '''trimmomatic''', in the following manner:
  
== Genome-Organizing Factors Top2 and Hmo1 Prevent Chromosome Fragility at Sites of S phase Transcription ==
+
cutadapt -g ACACTCTTTCCCTACACGACGCTCTTCCGATCT...GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG ${SAMP} -o ${OUTDIR}
 +
RL="ILLUMINACLIP:${TS3PE2}:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:49"
 +
java -jar $TRIMMOJARFILE PE -threads $THRDS -phred33 $SAMP1 $SAMP2 $OUTR1 $OUTSING1 $OUTR2 $OUTSING2 $RL
  
link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE16258
+
The integrated MultiQC report is as follows:
  
Abstract:
+
* [http://stab.st-andrews.ac.uk/top2/ffqc2/multiqc_report.html MultiQC].
Specialized topoisomerases solve the topological constraints arising when replication forks encounter transcription. We have investigated the contribution of Top2 in S phase transcription. Specifically in S phase, Top2 binds intergenic regions close to transcribed genes. The Top2-bound loci exhibit low nucleosome density and accumulate gammaH2A when Top2 is defective. These intergenic loci associate with the HMG protein Hmo1 throughout the cell cycle and are refractory to the histone variant Htz1. In top2 mutants, Hmo1 is deleterious and accumulates at pericentromeric regions in G2/M. Our data indicate that Top2 is dispensable for transcription and that Hmo1 and Top2 bind in the proximity of genes transcribed in S phase suppressing chromosome fragility at the M-G1 transition. We propose that an Hmo1-dependent epigenetic signature together with Top2 mediate an S phase architectural pathway to preserve genome integrity.
 
  
== Topoisomerase II binds nucleosome-free DNA and acts redundantly with topoisomerase I to enhance recruitment of RNA Pol II in budding yeast ==
+
The individual FastQC reports are as follows:
  
link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE22626
+
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_INP_R1_cutad_q30l49_tro_fastqc.html ULS1_INP_R1]
 
+
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_INP_R2_cutad_q30l49_tro_fastqc.html ULS1_INP_R2]
DNA topoisomerases are believed to promote transcription by removing excessive DNA supercoils produced during elongation. However, it is unclear how topoisomerases in eukaryotes are recruited and function in the transcription pathway in the context of nucleosomes. To address this problem we present high-resolution genome-wide maps of one of the major eukaryotic topoisomerases, Topoisomerase II (Top2) and nucleosomes in the budding yeast, Saccharomyces cerevisiae. Our data indicate that at promoters Top2 binds primarily to DNA that is nucleosome-free. However, although nucleosome loss enables Top2 occupancy, the opposite is not the case and the loss of Top2 has little effect on nucleosome density. We also find that Top2 is involved in transcription. Not only is Top2 enriched at highly transcribed genes, but Top2 is required redundantly with Top1 for optimal recruitment of RNA polymerase II at their promoters. These findings and the examination of candidate-activated genes suggest that nucleosome loss induced by nucleosome remodeling factors during gene activation enables Top2 binding, which in turn acts redundantly with Top1 to enhance recruitment of RNA polymerase II.
+
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_IP_R1_cutad_q30l49_tro_fastqc.html ULS1_IP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_IP_R2_cutad_q30l49_tro_fastqc.html ULS1_IP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_INP_R1_cutad_q30l49_tro_fastqc.html ULS1_T2_INP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_INP_R2_cutad_q30l49_tro_fastqc.html ULS1_T2_INP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_IP_R1_cutad_q30l49_tro_fastqc.html ULS1_T2_IP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_IP_R2_cutad_q30l49_tro_fastqc.html ULS1_T2_IP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_INP_R1_cutad_q30l49_tro_fastqc.html WT_T2_INP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_INP_R2_cutad_q30l49_tro_fastqc.html WT_T2_INP_R2]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_IP_R1_cutad_q30l49_tro_fastqc.html WT_T2_IP_R1]
 +
* [http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_IP_R2_cutad_q30l49_tro_fastqc.html WT_T2_IP_R2]

Latest revision as of 10:19, 4 June 2018

Introduction

Experiment 2, or the second part of Top2 - ULS1 ChIP-Seq peak-calling

One of the authors also has a tutorial, reformatted as a wiki page for easy copy-pasting, here.

Raw data

  • ULS1 Immunoprecipitation files:
2017_06_27_07_Uls1IP-50925901
├── 2017-06-27-07-Uls1IP_S7_L001_R1_001.fastq.gz
├── 2017-06-27-07-Uls1IP_S7_L001_R2_001.fastq.gz
├── 2017-06-27-07-Uls1IP_S7_L002_R1_001.fastq.gz
├── 2017-06-27-07-Uls1IP_S7_L002_R2_001.fastq.gz
├── 2017-06-27-07-Uls1IP_S7_L003_R1_001.fastq.gz
├── 2017-06-27-07-Uls1IP_S7_L003_R2_001.fastq.gz
├── 2017-06-27-07-Uls1IP_S7_L004_R1_001.fastq.gz
└── 2017-06-27-07-Uls1IP_S7_L004_R2_001.fastq.gz
  • ULS1 Input files
2017_06_27_06_Uls1INP-50917952
├── 2017-06-27-06-Uls1INP_S6_L001_R1_001.fastq.gz
├── 2017-06-27-06-Uls1INP_S6_L001_R2_001.fastq.gz
├── 2017-06-27-06-Uls1INP_S6_L002_R1_001.fastq.gz
├── 2017-06-27-06-Uls1INP_S6_L002_R2_001.fastq.gz
├── 2017-06-27-06-Uls1INP_S6_L003_R1_001.fastq.gz
├── 2017-06-27-06-Uls1INP_S6_L003_R2_001.fastq.gz
├── 2017-06-27-06-Uls1INP_S6_L004_R1_001.fastq.gz
└── 2017-06-27-06-Uls1INP_S6_L004_R2_001.fastq.gz
  • Top2 Wildtype Immunoprecipitation files
2017_06_27_09_T2WTIP-50922924
├── 2017-06-27-09-T2WTIP_S9_L001_R1_001.fastq.gz
├── 2017-06-27-09-T2WTIP_S9_L001_R2_001.fastq.gz
├── 2017-06-27-09-T2WTIP_S9_L002_R1_001.fastq.gz
├── 2017-06-27-09-T2WTIP_S9_L002_R2_001.fastq.gz
├── 2017-06-27-09-T2WTIP_S9_L003_R1_001.fastq.gz
├── 2017-06-27-09-T2WTIP_S9_L003_R2_001.fastq.gz
├── 2017-06-27-09-T2WTIP_S9_L004_R1_001.fastq.gz
└── 2017-06-27-09-T2WTIP_S9_L004_R2_001.fastq.gz
  • Top2 ULS1 Input files
2017_06_27_10_T2Uls1INP-50926896
├── 2017-06-27-10-T2Uls1INP_S10_L001_R1_001.fastq.gz
├── 2017-06-27-10-T2Uls1INP_S10_L001_R2_001.fastq.gz
├── 2017-06-27-10-T2Uls1INP_S10_L002_R1_001.fastq.gz
├── 2017-06-27-10-T2Uls1INP_S10_L002_R2_001.fastq.gz
├── 2017-06-27-10-T2Uls1INP_S10_L003_R1_001.fastq.gz
├── 2017-06-27-10-T2Uls1INP_S10_L003_R2_001.fastq.gz
├── 2017-06-27-10-T2Uls1INP_S10_L004_R1_001.fastq.gz
└── 2017-06-27-10-T2Uls1INP_S10_L004_R2_001.fastq.gz
  • Top2 ULS1 Immunoprecipitation files:
2017_06_27_11_T2Uls1IP-50919923
├── 2017-06-27-11-T2Uls1IP_S11_L001_R1_001.fastq.gz
├── 2017-06-27-11-T2Uls1IP_S11_L001_R2_001.fastq.gz
├── 2017-06-27-11-T2Uls1IP_S11_L002_R1_001.fastq.gz
├── 2017-06-27-11-T2Uls1IP_S11_L002_R2_001.fastq.gz
├── 2017-06-27-11-T2Uls1IP_S11_L003_R1_001.fastq.gz
├── 2017-06-27-11-T2Uls1IP_S11_L003_R2_001.fastq.gz
├── 2017-06-27-11-T2Uls1IP_S11_L004_R1_001.fastq.gz
└── 2017-06-27-11-T2Uls1IP_S11_L004_R2_001.fastq.g
  • Top2 Wildtype Input files
2017_06_27_08_T2WTINP-50917953
├── 2017-06-27-08-T2WTINP_S8_L001_R1_001.fastq.gz
├── 2017-06-27-08-T2WTINP_S8_L001_R2_001.fastq.gz
├── 2017-06-27-08-T2WTINP_S8_L002_R1_001.fastq.gz
├── 2017-06-27-08-T2WTINP_S8_L002_R2_001.fastq.gz
├── 2017-06-27-08-T2WTINP_S8_L003_R1_001.fastq.gz
├── 2017-06-27-08-T2WTINP_S8_L003_R2_001.fastq.gz
├── 2017-06-27-08-T2WTINP_S8_L004_R1_001.fastq.gz
└── 2017-06-27-08-T2WTINP_S8_L004_R2_001.fastq.gz

Concatenation

Keeping the raw data split by lane is not necessary for these experiments, so the per-lane files above were concatenated to form:

rawcatdata/
├── ULS1_INP_R1.fastq.gz
├── ULS1_INP_R2.fastq.gz
├── ULS1_IP_R1.fastq.gz
├── ULS1_IP_R2.fastq.gz
├── ULS1_T2_INP_R1.fastq.gz
├── ULS1_T2_INP_R2.fastq.gz
├── ULS1_T2_IP_R1.fastq.gz
├── ULS1_T2_IP_R2.fastq.gz
├── WT_T2_INP_R1.fastq.gz
├── WT_T2_INP_R2.fastq.gz
├── WT_T2_IP_R1.fastq.gz
└── WT_T2_IP_R2.fastq.gz

(order not respected)
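
The concatenation step can be sketched as below. Gzip files can be joined directly with cat, because a multi-member gzip stream is still a valid gzip file. The function name concat_lanes and the mapping from raw prefixes (for example Uls1IP) to output names (for example ULS1_IP) are assumptions inferred from the file names above, not the commands that were actually run.

```shell
# concat_lanes SRC_DIR PREFIX OUT_DIR OUT_NAME
# Hypothetical helper: joins the per-lane files of one sample into one file per mate.
concat_lanes() {
    local src_dir="$1" prefix="$2" out_dir="$3" out_name="$4"
    local mate out
    mkdir -p "$out_dir" || return 1
    for mate in R1 R2; do
        out="$out_dir/${out_name}_${mate}.fastq.gz"
        # The glob expands in sorted order (L001, L002, ...), so R1 and R2
        # receive the same lane order and read pairs stay in step.
        cat "$src_dir"/*-"$prefix"_S*_L00?_"$mate"_001.fastq.gz > "$out" \
            || { rm -f "$out"; return 1; }
    done
}

# Example call (assumed name mapping):
# concat_lanes 2017_06_27_07_Uls1IP-50925901 Uls1IP rawcatdata ULS1_IP
```

The call would be repeated once per sample directory, each time with the matching output name.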

Dataset quality analysis and filtering

Raw unfiltered data

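FastQC was used to view the quality of each file individually:

  • ULS1_INP_R1: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_INP_R1_fastqc.html
  • ULS1_INP_R2: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_INP_R2_fastqc.html
  • ULS1_IP_R1: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_IP_R1_fastqc.html
  • ULS1_IP_R2: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_IP_R2_fastqc.html
  • ULS1_T2_INP_R1: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_INP_R1_fastqc.html
  • ULS1_T2_INP_R2: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_INP_R2_fastqc.html
  • ULS1_T2_IP_R1: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_IP_R1_fastqc.html
  • ULS1_T2_IP_R2: http://stab.st-andrews.ac.uk/top2/fqc1/ULS1_T2_IP_R2_fastqc.html
  • WT_T2_INP_R1: http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_INP_R1_fastqc.html
  • WT_T2_INP_R2: http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_INP_R2_fastqc.html
  • WT_T2_IP_R1: http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_IP_R1_fastqc.html
  • WT_T2_IP_R2: http://stab.st-andrews.ac.uk/top2/fqc1/WT_T2_IP_R2_fastqc.html

MultiQC was then used to integrate the FastQC results into one report:

  • integrated quality report: http://stab.st-andrews.ac.uk/top2/fqc1/multiqc_report.html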
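
A minimal sketch of how the two tools might be chained is given below. The function name run_qc, the directory arguments and the default thread count are assumptions for illustration. The options used (--threads and --outdir for fastqc, --outdir for multiqc) are standard for those tools, but the exact invocation used for this experiment is not recorded here.

```shell
# run_qc IN_DIR OUT_DIR [THREADS]
# Hypothetical helper: FastQC on every FASTQ file in IN_DIR, then one MultiQC summary.
run_qc() {
    local in_dir="$1" out_dir="$2" threads="${3:-4}"
    mkdir -p "$out_dir" || return 1
    # FastQC writes one HTML report and one zip archive per input file into out_dir.
    fastqc --threads "$threads" --outdir "$out_dir" "$in_dir"/*.fastq.gz || return 1
    # MultiQC scans out_dir for FastQC output and writes multiqc_report.html there.
    multiqc --outdir "$out_dir" "$out_dir"
}

# Example call (directory names are assumptions):
# run_qc rawcatdata fqc1 8
```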

Adaptor-cutting and filtering procedure

The program cutadapt was applied first (as advised), followed by trimmomatic, in the following manner:

cutadapt -g ACACTCTTTCCCTACACGACGCTCTTCCGATCT...GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG ${SAMP} -o ${OUTDIR}
RL="ILLUMINACLIP:${TS3PE2}:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:49"
java -jar $TRIMMOJARFILE PE -threads $THRDS -phred33 $SAMP1 $SAMP2 $OUTR1 $OUTSING1 $OUTR2 $OUTSING2 $RL
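
The commands above can be wrapped per sample roughly as follows. This is a sketch under stated assumptions: the function name trim_pair and the intermediate and output file names are hypothetical, and cutadapt is assumed to be run on each mate separately, as the single ${SAMP} argument suggests. Since -o in cutadapt names an output file, the sketch passes a file path rather than a directory. TRIMMOJARFILE, TS3PE2 and THRDS are the variables from the original commands and must be set beforehand.

```shell
# trim_pair SAMPLE IN_DIR OUT_DIR
# Hypothetical wrapper around the cutadapt and Trimmomatic calls shown above.
trim_pair() {
    local sample="$1" in_dir="$2" out_dir="$3" threads="${THRDS:-4}"
    local adapters="ACACTCTTTCCCTACACGACGCTCTTCCGATCT...GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG"
    local mate steps
    if [ -z "${TRIMMOJARFILE:-}" ] || [ -z "${TS3PE2:-}" ]; then
        echo "TRIMMOJARFILE and TS3PE2 must be set" >&2
        return 1
    fi
    mkdir -p "$out_dir" || return 1
    # Adapter removal on each mate; no reads are discarded here, so pairs stay in step.
    for mate in R1 R2; do
        cutadapt -g "$adapters" "$in_dir/${sample}_${mate}.fastq.gz" \
            -o "$out_dir/${sample}_${mate}_cutad.fastq.gz" || return 1
    done
    steps="ILLUMINACLIP:${TS3PE2}:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:49"
    # Trimmomatic PE order: in1 in2 out1-paired out1-unpaired out2-paired out2-unpaired.
    # $steps is left unquoted on purpose so that each step is passed as its own argument.
    java -jar "$TRIMMOJARFILE" PE -threads "$threads" -phred33 \
        "$out_dir/${sample}_R1_cutad.fastq.gz" "$out_dir/${sample}_R2_cutad.fastq.gz" \
        "$out_dir/${sample}_R1_paired.fastq.gz" "$out_dir/${sample}_R1_unpaired.fastq.gz" \
        "$out_dir/${sample}_R2_paired.fastq.gz" "$out_dir/${sample}_R2_unpaired.fastq.gz" \
        $steps
}

# Example call (directory names are assumptions):
# trim_pair WT_T2_IP rawcatdata filtdata
```

Leaving $steps unquoted only works while the adapter file path in TS3PE2 contains no spaces.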

The integrated MultiQC report is as follows:

  • MultiQC: http://stab.st-andrews.ac.uk/top2/ffqc2/multiqc_report.html

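The individual FastQC reports are as follows:

  • ULS1_INP_R1: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_INP_R1_cutad_q30l49_tro_fastqc.html
  • ULS1_INP_R2: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_INP_R2_cutad_q30l49_tro_fastqc.html
  • ULS1_IP_R1: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_IP_R1_cutad_q30l49_tro_fastqc.html
  • ULS1_IP_R2: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_IP_R2_cutad_q30l49_tro_fastqc.html
  • ULS1_T2_INP_R1: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_INP_R1_cutad_q30l49_tro_fastqc.html
  • ULS1_T2_INP_R2: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_INP_R2_cutad_q30l49_tro_fastqc.html
  • ULS1_T2_IP_R1: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_IP_R1_cutad_q30l49_tro_fastqc.html
  • ULS1_T2_IP_R2: http://stab.st-andrews.ac.uk/top2/ffqc2/ULS1_T2_IP_R2_cutad_q30l49_tro_fastqc.html
  • WT_T2_INP_R1: http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_INP_R1_cutad_q30l49_tro_fastqc.html
  • WT_T2_INP_R2: http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_INP_R2_cutad_q30l49_tro_fastqc.html
  • WT_T2_IP_R1: http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_IP_R1_cutad_q30l49_tro_fastqc.html
  • WT_T2_IP_R2: http://stab.st-andrews.ac.uk/top2/ffqc2/WT_T2_IP_R2_cutad_q30l49_tro_fastqc.html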