DESeq questions
@shreyartha-mukherjee-4378
Last seen 9.7 years ago
Hi, I have RNA-seq count data for 30,400 genes across 6 conditions (3 replicates per condition). I am trying different normalization methods and then testing for differentially expressed genes between conditions. How can I check whether estimateSizeFactors() and estimateVarianceFunctions() give a good fit for my data? Also, is there a way to test whether the normalization is good? Any help is greatly appreciated.

Thanks,
Shrey
Simon Anders ★ 3.7k
@simon-anders-3855
Last seen 3.8 years ago
Zentrum für Molekularbiologie, Universi…
Hi Shrey

> I have RNA-seq count data for 30,400 genes across 6 conditions (3 replicates
> per condition). I was trying different normalization methods and then test
> for differentially expressed genes between conditions. How to test whether
> estimateSizeFactors() and estimateVarianceFunctions() does a good fit for my
> data? Also is there a way to test whether the normalization is good ? Any
> help is greatly appreciated,

To test whether the normalization (i.e., the size factor estimation) worked, make an MA plot for a pair of samples and mark the log ratio of their size factors with a horizontal line. Here is a demonstration with example data:

   library( DESeq )

   # Make some example data (or use your real data)
   cds <- makeExampleCountDataSet( )

   # Estimate the size factors
   cds <- estimateSizeFactors( cds )

   # Choose two samples for which you want to check whether they are
   # properly normalized with respect to each other
   s1 <- 1; s2 <- 2

   # Make the MA plot, i.e., plot the log fold change between the samples
   # against the mean of the log counts
   plot(
      ( log10( counts(cds)[,s1] ) + log10( counts(cds)[,s2] ) )/2,
      log10( counts(cds)[,s2] ) - log10( counts(cds)[,s1] ) )

   # In this plot, the bulk of the genes, which are not differentially
   # expressed, should scatter around a horizontal line in the middle.
   # The position of this line should be given by the log ratio of
   # the size factors. Mark the latter:
   abline(
      h = log10( sizeFactors(cds)[s2] ) - log10( sizeFactors(cds)[s1] ),
      col = "red" )

   # Now, the red line should go right through the middle of the bulk of
   # the genes that are not differentially expressed.

I hope that helps

Simon
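The visual check above can also be expressed as a number. The following is only a sketch under stated assumptions: it uses simulated Poisson counts instead of a real count data set, it recomputes median-of-ratios size factors in base R rather than calling DESeq, and all variable names (`true_depth`, `bulk_centre`, `offset`, and so on) are illustrative. It compares the median log ratio of two samples with the log ratio of their size factors; an offset near zero corresponds to the red line passing through the bulk of the genes.

```r
# Simulated counts stand in for counts(cds); every name here is illustrative.
set.seed(1)
n_genes    <- 2000
true_depth <- c(1.0, 2.0, 0.5, 1.5)              # assumed relative sequencing depths
base_mean  <- rexp(n_genes, rate = 1 / 200) + 5   # per-gene expression level
counts <- sapply(true_depth,
                 function(d) rpois(n_genes, lambda = base_mean * d))

# Median-of-ratios size factors, written out in base R:
# compare each sample with the per-gene geometric mean across all samples.
log_geo_mean <- rowMeans(log(counts))
usable <- is.finite(log_geo_mean)                 # drops genes with any zero count
size_factors <- apply(counts, 2, function(col)
  exp(median(log(col[usable]) - log_geo_mean[usable])))

# Numeric counterpart of the MA-plot check for samples s1 and s2:
# the median log ratio should sit close to the size factor log ratio.
s1 <- 1; s2 <- 2
m <- log10(counts[usable, s2]) - log10(counts[usable, s1])
bulk_centre <- median(m)
expected    <- log10(size_factors[s2]) - log10(size_factors[s1])
offset      <- bulk_centre - expected
print(c(bulk_centre = bulk_centre, expected = expected, offset = offset))
```

With real data, a clearly non-zero offset for some pair of samples would suggest looking at the MA plot for that pair more closely. The median is used here because it is insensitive to a minority of differentially expressed genes.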
Simon Anders ★ 3.7k
Hi Shrey

regarding your other question

On 12/06/2010 10:54 PM, Shreyartha Mukherjee wrote:
> I have RNA-seq count data for 30,400 genes across 6 conditions (3 replicates
> per condition). I was trying different normalization methods and then test
> for differentially expressed genes between conditions. How to test whether
> estimateSizeFactors() and estimateVarianceFunctions() does a good fit for my
> data? Also is there a way to test whether the normalization is good ? Any
> help is greatly appreciated,

To check the effect of estimateVarianceFunctions: have a look at the package vignette, specifically at Fig. 2. There, the variance estimates for each gene are plotted against the mean, and the estimated variance function is indicated by a red line. This should show a reasonable fit.

Simon
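For readers without the vignette at hand, the kind of mean-variance plot described above can be sketched in base R. This is not the vignette's code: the replicate counts are simulated from a negative binomial distribution with an assumed dispersion of 0.1, and `lowess()` is used as a stand-in smoother rather than the fit that estimateVarianceFunctions performs. All variable names are illustrative.

```r
# Simulated replicate counts for one condition (stand-in for real data).
set.seed(2)
n_genes <- 2000
mu <- rexp(n_genes, rate = 1 / 300) + 10
# Negative binomial counts: variance = mu + mu^2 / size, so size = 10
# corresponds to an assumed dispersion of 0.1.
reps <- sapply(1:3, function(i) rnbinom(n_genes, mu = mu, size = 10))

# Per-gene mean and variance across the replicates
gene_mean <- rowMeans(reps)
gene_var  <- apply(reps, 1, var)
keep <- gene_mean > 0 & gene_var > 0              # needed for the log scale

# Smooth trend on the log scale; lowess() is a stand-in, not DESeq's own fit.
trend <- lowess(log10(gene_mean[keep]), log10(gene_var[keep]), f = 0.5)

plot(log10(gene_mean[keep]), log10(gene_var[keep]), pch = ".",
     xlab = "log10 mean", ylab = "log10 variance")
lines(trend, col = "red")                          # smoothed variance trend
abline(0, 1, col = "blue", lty = 2)                # Poisson reference: variance = mean
```

In such a plot, a reasonable fit means the red curve follows the centre of the point cloud over the whole range of means. With only three replicates, the per-gene variance estimates scatter widely around the trend, which is expected.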