Entering edit mode
Dear all,
I am using edgeR to analyse one of my RNAseq data.
In this experiment, we have three tumour samples from three different
patients.
Named: Patient_1, Patient_2, Patient_3
We did RNASeq on Tumour-cells under two different conditions: treated
and
untreated, and wanted to find differential expressed genes after
treatment.
In the end, we got following 6 RNASeq data (100bp, paired-end and
HiSeq2500):
Patient_1_treated, Patient_1_untreated,
Patient_2_treated, Patient_2_untreated,
Patient_3_treated, Patient_3_untreated,
So for each condition (tread V.S untreated), we have three biological
replicates.
I followed <4.4 RNA-Seq of oral carcinomas vs matched normal tissue>
in
edgeR Users Guide to analysis these data, for this case study is very
similar to what we did.
However, as you can see, from the MDS-plot, Patient_1_treated is very
close
to Patient_1_untreated on first two dimensions, which is same for
Patient_2
or Patient_3.
plotMDS: http://i.imgur.com/6AQVNhi.png
plotBCV : http://i.imgur.com/srqwuJC.png
So, without surprise, I end up with finding 0 differential expressed
genes
(FDR<0.1)
My questions are:
1.Could I say that, especially basing on the result of MDS-plot, the
biological replicates are not consistent in my case, or the quality
(fitness for purpose) is very low.
2. Should we add more replicates to increased statistic power or
trying
other statistic models, or you have another suggestion to deal with
this
kind of data ?
Thank you for your suggesting in advances.
Regards,
Sheng
=======================
R code:
library( "edgeR" )
files <- dir( pattern=".read_cnt", full.names = FALSE)
RG <- readDGE( files )
colnames( RG$samples )
d <- DGEList( counts = RG )
y <- d
colnames(y)<-c( "Patient_1_treated", "Patient_1_untreated",
"Patient_2_treated", "Patient_2_untreated","Patient_3_treated",
"Patient_3_untreated")
##==Filtering
keep <- rowSums(cpm(y) > 10 ) >= 3
y <- y[keep,]
dim(y)
##Re-compute the library sizes:
y$samples$lib.size <- colSums(y$counts)
y$samples
##Normalizing
y <- calcNormFactors(y)
y$samples
plotMDS(y, top=500, main ="Multi-Dimensional Scaling Plot for Count
Data")
###The design matrix
Patient <- factor( c( "Patient_1", "Patient_1","Patient_2",
"Patient_2","Patient_3", "Patient_3" ) )
Treat <- factor( c( "Treated", "Untread", "Treated",
"Untread","Treated",
"Untread") )
data.frame( Sample = colnames(y), Patient,Treat)
design <- model.matrix( ~Patient + Treat )
rownames( design ) <- colnames( y )
y <- estimateGLMCommonDisp( y, design, verbose = TRUE )
y <- estimateGLMTrendedDisp( y, design )
y <- estimateGLMTagwiseDisp( y, design )
fit <- glmFit(y, design)
lrt <- glmLRT( fit )
top <- topTags( lrt, n = 50 )
q_value = 0.05
summary( de <- decideTestsDGE( lrt, p.value = q_value ) )
q_value = 0.1
summary( de <- decideTestsDGE( lrt, p.value = q_value ) )
plotBCV( y, cex = 0.8)
#sessionInfo( )
#R version 3.0.1 (2013-05-16)
#Platform: x86_64-apple-darwin10.8.0 (64-bit)
#
#locale:
#[1] C
#
#attached base packages:
#[1] splines stats graphics grDevices utils datasets
methods
base
#
#other attached packages:
#[1] ggplot2_0.9.3.1 biomaRt_2.16.0 edgeR_3.2.3 limma_3.16.5
#
#loaded via a namespace (and not attached):
# [1] MASS_7.3-27 RColorBrewer_1.0-5 RCurl_1.95-4.1
XML_3.95-0.2
colorspace_1.2-2
# [6] dichromat_2.0-0 digest_0.6.3 grid_3.0.1
gtable_0.1.2
labeling_0.2
#[11] munsell_0.4 plyr_1.8 proto_0.3-10
reshape2_1.2.2 scales_0.2.3
#[16] stringr_0.6.2 tools_3.0.1
[[alternative HTML version deleted]]