Question

How to find the standard deviation of the gene counts in DESeq2?


rafa.rios.50 • 0

@rafarios50-9912

Last seen 8.1 years ago

I'm working on the analysis of some RNA-seq data with DESeq2.

I have two conditions, treated and untreated, for two strains, mutant and wild type, with three replicates for each combination: 12 samples in total.

I have run all the possible comparisons with the results function, specifying in each case which groups define the numerator and the denominator of the fold change.

Now I'm wondering if it is possible to get the standard deviation of the genes with significant differential expression across comparisons.

I have followed the vignette at https://www.bioconductor.org/packages/3.3/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf, and it looks like this is possible, but only in separate pieces: the expression levels of a specific gene can be plotted across conditions (section 1.5.2, figure 2), and the standard deviation of the transformed count data can also be plotted (section 2.1.5, figure 4).

I'm not familiar with R, but my naive guess is that it is possible to select which data get plotted from the result of the DESeq function on the count data (dds).

Could you please give an example, or some insight, on how to subset or filter the dds to get the standard deviation of the counts (transformed or not) for the significant genes, i.e. those with padj below 0.05?

Thank you

deseq2 standard deviation • 3.6k views

ADD COMMENT • link updated 8.1 years ago by Michael Love 41k • written 8.1 years ago by rafa.rios.50 • 0

score 1 · Answer 1 · 2016-03-14


Michael Love 41k

@mikelove

Last seen 23 hours ago

United States

The standard deviation of the transformed count data in Figure 4 is calculated using the meanSdPlot function from the vsn package. It simply computes the sample standard deviation of each row, i.e.:

row.sd <- apply(m, 1, sd)

Here m is a matrix with genes in rows and samples in columns.
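To restrict this to the significant genes (padj below 0.05), one option is to subset the rows of the matrix before calling apply. The following is a minimal sketch on simulated data, not the method DESeq2 itself uses. The matrix m and the vector padj are made up here; in a real analysis they might come from something like assay(vst(dds)) and results(dds)$padj, which is an assumption about your workflow.

```r
# Simulated stand-in for a genes x samples matrix of transformed counts.
# In a DESeq2 workflow, m might come from assay(vst(dds)) and padj from
# results(dds)$padj; both are assumptions, not part of the answer above.
set.seed(1)
m <- matrix(rnorm(60, mean = 8), ncol = 6,
            dimnames = list(paste0("gene", 1:10), paste0("sample", 1:6)))
padj <- c(0.01, 0.20, NA, 0.03, 0.50, 0.04, 0.90, NA, 0.001, 0.06)
names(padj) <- rownames(m)

# Keep genes with a non-missing adjusted p-value below 0.05.
sig <- names(padj)[!is.na(padj) & padj < 0.05]

# Sample standard deviation across all samples, one value per significant gene.
row.sd <- apply(m[sig, , drop = FALSE], 1, sd)
row.sd
```

The explicit !is.na() check matters because padj can be NA for some genes, and drop = FALSE keeps the subset a matrix even if only one gene passes the threshold.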

However, I guess you want the within-group standard deviation of transformed data?

But I'm not sure what value this has: the standard deviation of transformed counts is not used in the statistical testing, and with only 3 replicates per group you would certainly want to plot the normalized counts themselves rather than a mean + sd summary.
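As a rough illustration of plotting the individual values instead of a mean + sd summary, here is a base R sketch for a single gene. The counts and group labels are simulated and hypothetical; with a real dds the values could be taken from counts(dds, normalized = TRUE), and DESeq2's plotCounts function (the one behind figure 2 of the vignette) draws a similar plot directly.

```r
# Hypothetical normalized counts for one gene across 12 samples
# (4 groups x 3 replicates); simulated here for illustration only.
set.seed(2)
group <- factor(rep(c("wt_untreated", "wt_treated",
                      "mut_untreated", "mut_treated"), each = 3))
value <- rpois(12, lambda = rep(c(50, 200, 60, 90), each = 3))
d <- data.frame(group = group, value = value)

# Draw every replicate as its own point instead of a mean + sd bar.
stripchart(value ~ group, data = d, vertical = TRUE,
           method = "jitter", pch = 19,
           ylab = "normalized count")
```

With three points per group the spread is visible directly, which is the information a standard deviation would otherwise try to summarize.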

ADD COMMENT • link 8.1 years ago Michael Love 41k


Yes, I would like to subset the matrix to the columns of the related replicates and then calculate the standard deviation for that group.

Would m[,col_i:col_j] do the subsetting, where col_i to col_j are the columns for one condition? I could then assign the result to a new variable (m_condition) and apply sd to it, as in your example.

ADD REPLY • link 8.1 years ago rafa.rios.50 • 0


Here is some general R code for calculating the sample standard deviation of groups of a matrix. Again, I don't know what purpose you have for this, so use at your own discretion.

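# toy matrix: 10 rows (genes) x 6 columns (samples)
m <- matrix(1:60, ncol=6)
# column indices for each group of replicates
idx <- list(1:2,3:4,5:6)
library(genefilter) # provides rowVars
# per-row sd within each group: one column of results per group
sqrt(sapply(idx, function(i) rowVars(m[,i])))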
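If the replicate columns are not contiguous, writing the index ranges by hand gets error-prone. A possible variant, sketched here in base R only (no genefilter), builds the index list from a group label per column. The condition factor is hypothetical; in a DESeq2 analysis it would typically be whichever column of colData(dds) holds your groups.

```r
# Same toy matrix as above: 10 rows (genes), 6 columns (samples).
m <- matrix(1:60, ncol = 6)

# Hypothetical group label for each column, in column order.
condition <- factor(c("A", "A", "B", "B", "C", "C"))

# split() gives the column indices belonging to each group.
idx <- split(seq_along(condition), condition)

# One column of per-gene standard deviations per group.
group.sd <- sapply(idx, function(i) apply(m[, i, drop = FALSE], 1, sd))
group.sd
```

The result has one row per gene and one named column per group, so group.sd[, "A"] is the within-group standard deviation for group A. The same caveat applies: this is a descriptive summary, not something used in the testing.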

ADD REPLY • link 8.1 years ago Michael Love 41k