DF_HT-12_GeneExpression


1. Array QC

Array QC analysis was performed by Tan SiLi (CTRAD).

HumanHT-12 v4.0 Report

1.1 Sample Information

Sample Information: 24 total RNA samples from cell lines
Received by: Vannessa Mok (Richie Soong’s lab)
Date Sample Received: 29th January 2015
RNA Sample Assessment: All samples passed quality control (Section 1.2)
Additional Sample Comments: None applicable

1.2 Total RNA & cRNA Sample Data

 

Images: NanoDrop and RIN data (total RNA); RiboGreen data (cRNA)

1.3 Array Analysis

Analyzed by: Tan Sili (Richie Soong’s lab)
Analysis Date: 5th March 2015
Sample-independent metrics: All samples passed quality control (Figures 1-3)
Sample-dependent metrics: All samples passed quality control (Figures 4-5)
Hybridization and Batch effects: All samples passed quality control (Figure 6)
Additional remarks: None applicable

1.4 Array Data: Sample-Independent Metrics

Figure 1. Hybridization controls (panels: low, medium, high)
Figure 2. Low stringency controls
Figure 3. Biotin and high stringency controls

1.5 Array Data: Sample-Dependent Metrics

Figure 4. Gene intensity controls (panels: gene intensity, housekeeping)
Figure 5. Negative controls

1.6 Hybridization and Batch effects

Figure 6. Hybridization signal across all samples

1.7 About the Assay

HumanHT-12 v4.0 BeadChip
The HumanHT-12 v4.0 Expression BeadChip provides genome-wide transcriptional coverage of well-characterized genes, gene candidates, and splice variants, with a significant portion targeting well-established sequences supported by peer-reviewed literature. Probes were designed to cover content from NCBI RefSeq Release 38, as well as legacy UniGene content.

Number of probes: 47,231
Number of coding transcripts (well-established / provisional annotation): 28,688 / 11,121
Number of non-coding transcripts (well-established / provisional annotation): 1,752 / 2,209
UniGene: 3,461
Additional Sample Comments: None applicable
Methods

Total RNA Assessment:

  • NanoDrop 1000 Spectrophotometer (Cat no: SER-1K-1PR) (Thermo Fisher Scientific, Waltham, MA)
  • Agilent RNA 6000 Nano Kit (Cat no: 5067-1511) (Agilent Technologies, Santa Clara, California)

Sample Preparation: TotalPrep™-96 RNA Amplification Kit (Cat no: 4393543) (Life Technologies, Carlsbad, CA)
cRNA Assessment:

  • Quant-iT™ RiboGreen® RNA Assay Kit (Cat no: R11490) (Life Technologies, Carlsbad, CA)
  • RNA ScreenTape (Cat no: 5067-5576) (Agilent Technologies, Santa Clara, California)

Data Processing: GenomeStudio (2011.1)
Gene Expression Analysis: Gene Expression
PCA Analysis: Partek Genomics Suite v6.6

Quality metrics

Total RNA: 11 μL of >46 ng/μL, RIN: >8
cRNA: >750 ng (5 μL of >150 ng/μL)
Hybridization Controls: High > Medium > Low
Low Stringency Control: Perfect-Match (PM) > Mismatch (MM)
Biotin Control: High
Negative Controls (Background and Noise): Low
Gene Intensity Controls (Housekeeping and All Genes): Housekeeping > All Genes
Batch Effects: No clustering according to array number
Replication Reproducibility: Replicates are clustered close together
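The numeric thresholds and ordering rules above reduce to simple comparisons. The following is a minimal sketch of how they could be checked in Python; the record layout, field names and example values are hypothetical and are not taken from the actual QC workflow, which was carried out in GenomeStudio.

```python
from dataclasses import dataclass

# Thresholds copied from the "Quality metrics" list.
MIN_RNA_CONC_NG_PER_UL = 46.0    # total RNA concentration
MIN_RIN = 8.0                    # RNA integrity number
MIN_CRNA_CONC_NG_PER_UL = 150.0  # cRNA concentration
CRNA_VOLUME_UL = 5.0             # cRNA volume used


@dataclass
class SampleQC:
    # Hypothetical record layout, for illustration only.
    name: str
    rna_conc: float   # total RNA, ng/uL
    rin: float
    crna_conc: float  # cRNA, ng/uL


def passes_rna_qc(sample: SampleQC) -> bool:
    """Total RNA must exceed both the concentration and the RIN threshold."""
    return sample.rna_conc > MIN_RNA_CONC_NG_PER_UL and sample.rin > MIN_RIN


def crna_mass_ng(conc_ng_per_ul: float) -> float:
    """cRNA mass in the 5 uL input; 150 ng/uL corresponds to 750 ng."""
    return conc_ng_per_ul * CRNA_VOLUME_UL


def passes_crna_qc(sample: SampleQC) -> bool:
    return sample.crna_conc > MIN_CRNA_CONC_NG_PER_UL


def controls_ordered(high: float, medium: float, low: float) -> bool:
    """Hybridization controls are expected to satisfy High > Medium > Low."""
    return high > medium > low
```

The same comparison pattern applies to the other ordering rules (PM > MM, Housekeeping > All Genes).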

References

 

  • Whole-Genome Gene Expression Direct Hybridization Assay Guide (#11322355 Rev. A)
  • GenomeStudio™ GX Module Guide v1.0 (#11319121 Rev. A)
  • Technical Note: RNA Analysis – Gene Expression Microarray Data Quality Control

 

Raw data (scanner output - zip) can be downloaded here.

2. Expression Data Analysis

Sample Information: 8 assays x 3 replicates
Performed by: Ricky Lim (Benoukraf’s Lab)
Date Data Received: 13th March 2015

2.1 Data Normalization

The expression data were normalized with two methods from the limma Bioconductor package:

  • Quantile normalization with log transformation, using the normalizeBetweenArrays function
  • Background correction using control probes, quantile normalization, and log transformation, using the neqc function
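To illustrate what quantile normalization does, here is a minimal pure-Python sketch. It is not the limma implementation: ties are not averaged, missing values are not handled, and the function names are our own. Each array is forced onto a common reference distribution, taken as the mean of the k-th smallest values across arrays.

```python
import math


def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one per array)."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(sc[k] for sc in sorted_cols) / len(columns) for k in range(n)
    ]
    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def log2_transform(columns):
    """Log2-transform strictly positive intensities."""
    return [[math.log2(v) for v in col] for col in columns]
```

After this step every array has exactly the same set of values, only assigned to different probes, which is why the per-sample distributions look alike afterwards.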

After normalization, the triplicate samples clustered more tightly within their replicate groups than before normalization, as shown in the figures below (PCA Sample Clustering).

PCA Sample Clustering

Before Normalization
After Quantile-Normalization
After Background Correction and Quantile-Normalization

Before normalization, sample RG2 clustered closer to RDG501 than to the other samples in the RG group (see Hierarchical Sample Clustering below). After normalization, samples group according to their replicates.

Hierarchical Sample Clustering

Before Normalization
After Quantile-Normalization
After Background Correction and Quantile-Normalization

Furthermore, the expression distributions have more similar spreads across samples after normalization. The effect of normalization is shown in the figure below (Sample Distribution).

 

Sample Distribution

Before Normalization
After Quantile-Normalization
After Background Correction and Quantile-Normalization

Because an outlier appears in sample RG3 after quantile normalization alone, we recommend using the data after background correction and quantile normalization (neqc) for further analysis. In addition, replicates cluster more tightly in this dataset than after quantile normalization alone.

The full normalization procedure and code (PDF format) can be downloaded here.
The normalized data matrix (tab-separated txt format) can be downloaded here (right-click and save the link target).

2.2 Differential Gene Expression Analysis

State Comparisons:

  • For the Karpas 299 cell line (K):
    • Compare KD (Karpas-DMSO) to KG (Karpas-GSK343 (EZH2i))
    • Compare KD (Karpas-DMSO) to KJ (Karpas-JQ1 (BRD4i))
  • For the RL line (R):
    • Compare RL (DMSO) to RD (RL-DAC (DNMTi))
    • Compare RL (DMSO) to RG (RL-GSK343 (EZH2i))
    • Compare RL (DMSO) to RDG50 (RL-DAC / GSK (DNMTi / EZH2i))
    • Compare RL (DMSO) to RDG64 (RL-DAC / GSK (DNMTi / EZH2i))
    • Compare RDG50 to RDG64 (same line, same treatment done in parallel)
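The comparison list can be written down as a small data structure, which also confirms that 8 groups with 3 replicates each account for the 24 samples. This is only an illustrative sketch: the naming rule (group label followed by replicate number) is an assumption inferred from sample names such as RG2 and RDG501 that appear in this report, and the variable and function names are our own.

```python
# Treatment groups named in the comparison list (8 groups).
GROUPS = ["KD", "KG", "KJ", "RL", "RD", "RG", "RDG50", "RDG64"]
REPLICATES = 3

# (reference, treated) pairs, in the order they are listed.
COMPARISONS = [
    ("KD", "KG"),
    ("KD", "KJ"),
    ("RL", "RD"),
    ("RL", "RG"),
    ("RL", "RDG50"),
    ("RL", "RDG64"),
    ("RDG50", "RDG64"),
]


def sample_names(group, n=REPLICATES):
    """Assumed naming: group label plus replicate number, e.g. 'RG2'."""
    return [f"{group}{i}" for i in range(1, n + 1)]


def comparison_samples(reference, treated):
    """Return the two lists of sample names that enter one comparison."""
    return sample_names(reference), sample_names(treated)
```

For example, `comparison_samples("RL", "RG")` would return the three assumed RL sample names and the three RG sample names used in that two-class comparison.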

2.2.1 Differential Gene Expression Analysis: SAM

Differential expression analysis was carried out with SAM in MEV (http://www.tm4.org/mev.html), version 4.8.1 (Linux). The software can be downloaded here.

Input: normalized expression values (expressionDataNormalized.txt)
Input Parameters:
- SAM Version 1.0
- Study Design: Two Class Unpaired
- Number of unique permutations: 20
Output: All significant genes (differentialExpressionAnalysis.xls); each sheet contains a different state comparison
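One plausible reading of the value 20 is that, for a two-class unpaired design with 3 replicates per class, there are exactly C(6,3) = 20 distinct ways to assign 6 samples to the two classes. The sketch below enumerates these labelings and computes a SAM-style relative difference for one gene. It is an illustration only, not the MEV implementation: the fudge factor `s0` is left as a plain parameter, whereas SAM estimates it from the data.

```python
from itertools import combinations
from math import comb, sqrt


def unique_two_class_labelings(n_a, n_b):
    """All distinct splits of n_a + n_b samples into classes of size n_a, n_b."""
    samples = range(n_a + n_b)
    return [
        (grp_a, tuple(i for i in samples if i not in grp_a))
        for grp_a in combinations(samples, n_a)
    ]


def relative_difference(a, b, s0=0.0):
    """SAM-style statistic d = (mean(a) - mean(b)) / (s + s0).

    s is the pooled standard error of the difference in means.
    s0 defaults to 0 here purely for illustration.
    """
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    s = sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)


# With 3 replicates per class there are 20 unique labelings.
assert len(unique_two_class_labelings(3, 3)) == comb(6, 3) == 20
```

Recomputing the statistic under each labeling gives the permutation distribution against which the observed value is compared.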

The output file can be downloaded here.

Functional annotation was performed using DAVID 6.7 (Nature Protocols 2009; 4(1):44; Genome Biology 2003; 4(5):P3).

Input: gene symbols (OFFICIAL_GENE_SYMBOL) of all significant genes from the MEV differential expression analysis
Output:
- Gene_Ontology:
  - Selected for: GOTERM_BP_FAT, GOTERM_CC_FAT, GOTERM_MF_FAT
  - OutputFile: GO_differentialExpressionAnalysis.xls
- Pathways:
  - Selected for: BBID, BIOCARTA, KEGG_PATHWAY
  - OutputFile: Pathway_differentialExpressionAnalysis.xls

The output file for Gene_Ontology can be downloaded here.

The output file for Pathways can be downloaded here.

2.2.2 Differential Gene Expression Analysis: limma

Analysis was performed using limma, with the fdr method for multiple-testing adjustment. The detailed procedure is described here (section 2).
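In R's p.adjust, "fdr" is an alias for the Benjamini-Hochberg adjustment. Assuming that is what "fdr method" refers to here, the adjustment can be sketched in pure Python as follows. This illustrates the procedure and is not the code used in the analysis; the function name is our own.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # 1-based rank of this p-value in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A gene is then called significant when its adjusted p-value falls below the chosen FDR cutoff.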

The output file can be downloaded here.
Each state comparison is stored in a separate sheet.
The analysis package can be downloaded here for reproducibility.