Question about quality of biological replicates from RNAseq data
sheng zhao ▴ 90
@sheng-zhao-5316
Germany
Dear all,

I am using edgeR to analyse an RNASeq data set. In this experiment we have tumour samples from three different patients, named Patient_1, Patient_2 and Patient_3. We performed RNASeq on the tumour cells under two conditions, treated and untreated, and want to find genes that are differentially expressed after treatment. In the end we obtained the following six RNASeq libraries (100 bp, paired-end, HiSeq 2500):

Patient_1_treated, Patient_1_untreated, Patient_2_treated, Patient_2_untreated, Patient_3_treated, Patient_3_untreated

So for each condition (treated vs. untreated) we have three biological replicates. I followed section 4.4, "RNA-Seq of oral carcinomas vs matched normal tissue", in the edgeR User's Guide to analyse these data, because that case study is very similar to our experiment.

However, as the MDS plot shows, Patient_1_treated lies very close to Patient_1_untreated in the first two dimensions, and the same holds for Patient_2 and Patient_3.

plotMDS: http://i.imgur.com/6AQVNhi.png
plotBCV: http://i.imgur.com/srqwuJC.png

So, unsurprisingly, I end up finding 0 differentially expressed genes (FDR < 0.1).

My questions are:

1. Could I say, especially based on the MDS plot, that the biological replicates are not consistent in my case, or that the quality (fitness for purpose) of the data is very low?
2. Should we add more replicates to increase statistical power, or try other statistical models? Or do you have another suggestion for dealing with this kind of data?

Thank you in advance for your suggestions.

Regards,
Sheng

=======================
R code:

library("edgeR")

files <- dir(pattern = ".read_cnt", full.names = FALSE)
RG <- readDGE(files)
colnames(RG$samples)
d <- DGEList(counts = RG)
y <- d
colnames(y) <- c("Patient_1_treated", "Patient_1_untreated",
                 "Patient_2_treated", "Patient_2_untreated",
                 "Patient_3_treated", "Patient_3_untreated")

## Filtering
keep <- rowSums(cpm(y) > 10) >= 3
y <- y[keep, ]
dim(y)

## Re-compute the library sizes
y$samples$lib.size <- colSums(y$counts)
y$samples

## Normalizing
y <- calcNormFactors(y)
y$samples
plotMDS(y, top = 500, main = "Multi-Dimensional Scaling Plot for Count Data")

## The design matrix
Patient <- factor(c("Patient_1", "Patient_1", "Patient_2", "Patient_2",
                    "Patient_3", "Patient_3"))
Treat <- factor(c("Treated", "Untreated", "Treated", "Untreated",
                  "Treated", "Untreated"))
data.frame(Sample = colnames(y), Patient, Treat)
design <- model.matrix(~ Patient + Treat)
rownames(design) <- colnames(y)

## Dispersion estimation
y <- estimateGLMCommonDisp(y, design, verbose = TRUE)
y <- estimateGLMTrendedDisp(y, design)
y <- estimateGLMTagwiseDisp(y, design)

## Model fit and likelihood ratio test
fit <- glmFit(y, design)
lrt <- glmLRT(fit)
top <- topTags(lrt, n = 50)

q_value <- 0.05
summary(de <- decideTestsDGE(lrt, p.value = q_value))
q_value <- 0.1
summary(de <- decideTestsDGE(lrt, p.value = q_value))

plotBCV(y, cex = 0.8)

# sessionInfo()
# R version 3.0.1 (2013-05-16)
# Platform: x86_64-apple-darwin10.8.0 (64-bit)
#
# locale:
# [1] C
#
# attached base packages:
# [1] splines stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] ggplot2_0.9.3.1 biomaRt_2.16.0 edgeR_3.2.3 limma_3.16.5
#
# loaded via a namespace (and not attached):
#  [1] MASS_7.3-27 RColorBrewer_1.0-5 RCurl_1.95-4.1 XML_3.95-0.2 colorspace_1.2-2
#  [6] dichromat_2.0-0 digest_0.6.3 grid_3.0.1 gtable_0.1.2 labeling_0.2
# [11] munsell_0.4 plyr_1.8 proto_0.3-10 reshape2_1.2.2 scales_0.2.3
# [16] stringr_0.6.2 tools_3.0.1
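
For readers following the paired layout, here is a minimal base-R sketch (no edgeR needed) of the design matrix that the formula ~ Patient + Treat produces for these six samples. It only inspects the matrix; it does not reproduce the analysis. Releveling Treat so that the untreated samples are the reference is an assumption made here for readability, and the names treat_col and resid_df are illustrative only.

```r
# Paired layout from the post: 3 patients x 2 conditions.
Patient <- factor(rep(c("Patient_1", "Patient_2", "Patient_3"), each = 2))

# Assumption: use the untreated samples as the reference level, so the
# treatment coefficient reads as treated minus untreated.
Treat <- relevel(factor(rep(c("Treated", "Untreated"), times = 3)),
                 ref = "Untreated")

# Additive model: one baseline per patient plus a common treatment effect.
design <- model.matrix(~ Patient + Treat)
rownames(design) <- paste(Patient, tolower(Treat), sep = "_")
print(design)

# Position of the treatment column. glmLRT() tests the last column of the
# design by default, so the treatment term should sit there.
treat_col <- which(colnames(design) == "TreatTreated")
print(treat_col == ncol(design))

# Residual degrees of freedom: samples minus estimated coefficients.
resid_df <- nrow(design) - qr(design)$rank
print(resid_df)
```

Because each patient gets its own baseline term, treated and untreated samples from the same patient sitting close together on an MDS plot is what this model expects when patient-to-patient differences are large. The sketch also shows that six samples and four coefficients leave only two residual degrees of freedom for estimating dispersion, which is relevant to the question about adding replicates.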
RNASeq edgeR