RRBS Analysis


1. RRBS Data Analysis

Sample Information: 15 RRBS samples (paired-end sequencing, 75 nt) ??
Patient datasets
X Normal / X Tumors
Performed by: Loh Wan Yi (Benoukraf’s Lab)
Date Data Received: 29th June, 2015

2. QC

Quality control checks were performed using FastQC. As shown in Figure 1, the sequenced reads (R1 and R2) have very good base quality: the per-base quality scores range from 32 to 40, so all positions fall within the green zone of the plot.

Figure 1: Per-Base Sequence Quality (FastQC, three panels).

Figure 2 shows the proportion of each base at each read position in the FASTQ files.

Figure 2: Per-Base Sequence Content (FastQC, three panels).

3. Read Alignment

RRBS samples were mapped against the human reference genome (hg19) using Bismark. Table 1 shows, for each sample, the total number of input reads given to Bismark, the number of paired-end alignments with a unique best hit, and the mapping efficiency. The average mapping efficiency of the RRBS samples is ~63%.

Table 1: Bismark Alignment Report

Sample   Input reads   Uniquely mapped   Mapping efficiency
27       31459427      21246583          67.5%
87       31227992      19379149          62.1%
89       53315689      33510024          62.9%
90       31985150      20075250          62.8%
FG014    56610577      37527305          66.3%
FG058    57764919      36032670          62.4%
FG060    55927417      33940745          60.7%
FG064    57553068      36614700          63.6%
FG070    56625979      34147955          60.3%
FG093    34302067      21298918          62.1%
LO1iT    52061596      32691864          62.8%
NP111    52191564      33204931          63.6%
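
The mapping efficiency reported in the table is the number of uniquely mapped pairs divided by the number of input reads, expressed as a percentage. As a quick sanity check, the sketch below recomputes it with awk for the first three samples. The function name `mapping_efficiency` is made up for this illustration and is not part of Bismark.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bismark): percentage of input reads
# that aligned uniquely, rounded to one decimal place.
mapping_efficiency() {
    # $1 = uniquely mapped count, $2 = input read count
    awk -v mapped="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * mapped / total }'
}

# Counts copied from the alignment report for samples 27, 87 and 89.
mapping_efficiency 21246583 31459427   # sample 27
mapping_efficiency 19379149 31227992   # sample 87
mapping_efficiency 33510024 53315689   # sample 89
```

Run as is, this prints 67.5, 62.1 and 62.9, which agree with the percentages in the report.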

Technical Information:

– Genome: hg19

– Software: Bismark

– Command Line: for i in *; do bismark /home/jason/bismark14-bowtie2_hg19/ --bam --bowtie2 -p 4 -1 "$i"/*1.fq.gz -2 "$i"/*2.fq.gz -o "$i"; done
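
Because the loop relies on the globs `*1.fq.gz` and `*2.fq.gz` matching exactly one file each, it can help to preview the commands before launching the alignments. The following is a minimal dry-run sketch, assuming one sub-directory per sample as in the command line above. It only prints the Bismark command and never runs it. The function name `print_bismark_cmd` is hypothetical; the genome folder and flags are copied from the command line above.

```shell
#!/bin/sh
# Dry run: print the Bismark command for each sample directory
# instead of executing it.
GENOME=/home/jason/bismark14-bowtie2_hg19/   # genome folder from the report

# Hypothetical helper: $1 = sample directory expected to contain exactly
# one *1.fq.gz (read 1) and one *2.fq.gz (read 2) file.
print_bismark_cmd() {
    dir=${1%/}
    set -- "$dir"/*1.fq.gz
    r1=$1; n1=$#
    set -- "$dir"/*2.fq.gz
    r2=$1; n2=$#
    if [ "$n1" -ne 1 ] || [ "$n2" -ne 1 ] || [ ! -f "$r1" ] || [ ! -f "$r2" ]; then
        echo "skipping $dir: expected one R1 and one R2 file" >&2
        return 1
    fi
    echo "bismark $GENOME --bam --bowtie2 -p 4 -1 $r1 -2 $r2 -o $dir"
}

# Preview the command for every sub-directory of the current folder.
for d in */; do
    [ -d "$d" ] || continue
    print_bismark_cmd "$d" || true
done
```

Directories with missing or ambiguous read files are reported on standard error and skipped, so a misnamed file shows up before any alignment time is spent.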

4. Methylation Scoring

The B-score (DNA methylation score) was calculated with GBSA from the BAM files obtained in Step 3. A comparative analysis of the output files was then performed using methylKit.

  • Clustering Samples:

Figure 3 shows a hierarchical clustering of the samples based on the similarity of their CpG methylation profiles.

Figure 3: Clustering Samples Based On The Similarity Of Their CpG Methylation Profiles.

  • PCA

Figure 4 is a scree plot showing the relative importance of the principal components.

Figure 4: Scree Plot For Importance of Components on Samples.

Figure 5 shows a scatter plot of the samples on the principal components.

Figure 5: Scatter Plot of the Samples.

Technical Information:

  • file.list=list('methyl_form_27_bscore_CpG_File.txt.gz', 'methyl_form_FG014_bscore_CpG_File.txt.gz', 'methyl_form_FG070_bscore_CpG_File.txt.gz', 'methyl_form_87_bscore_CpG_File.txt.gz', 'methyl_form_FG058_bscore_CpG_File.txt.gz', 'methyl_form_FG093_bscore_CpG_File.txt.gz', 'methyl_form_89_bscore_CpG_File.txt.gz', 'methyl_form_FG060_bscore_CpG_File.txt.gz', 'methyl_form_LO1iT_bscore_CpG_File.txt.gz', 'methyl_form_90_bscore_CpG_File.txt.gz', 'methyl_form_FG064_bscore_CpG_File.txt.gz', 'methyl_form_NP111_bscore_CpG_File.txt.gz')
  • myobj=read(file.list, sample.id=list('27', 'FG014', 'FG070', '87', 'FG058', 'FG093', '89', 'FG060', 'LO1iT', '90', 'FG064', 'NP111'), assembly='hg19', treatment=c(1,1,1,1,1,1,1,1,1,1,1,1), context='CpG')
  • meth=unite(myobj, destrand=FALSE)
  • clusterSamples(meth, dist='correlation', method='ward', plot=TRUE)
  • PCASamples(meth, screeplot=TRUE)
  • PCASamples(meth)