Question: DESeq2 results with RUVSeq values
4.1 years ago by Assa Yeroslaviz (Munich, Germany):

Hi,

I have a data set in which we assume that one of the conditions shows transcriptional amplification of a set of genes. For that reason I decided to test the data with the RUVSeq package.

I ran the analysis as explained in the vignette and calculated the estimated factors of unwanted variation. I got the following factors:

> pData(set1)[,2]
[1] -0.4209557 -0.3984871 -0.4039097  0.4393171  0.4079229  0.3761126

I wanted to use these factors to analyse the dataset with DESeq2, but DESeq2 accepts only positive values as sizeFactors.

> sizeFactors(cds.postRUV) <- pData(set1)$W_1
Error in .local(object, ..., value) : size factors must be positive

Is there a way to recalculate these RUVSeq factors into positive values so that they fit the DESeq2 analysis?

thanks,
Assa

Answer: DESeq2 results with RUVSeq values
4.1 years ago by davide risso (Weill Cornell Medicine):

Mix 1 and Mix 2 are supposed to be different in the two samples, so I'm not surprised that you capture the group difference with the first factor of UV based on them.

You can either use only group B or subtract the expected fold-change from the expression matrix to use all the spike-ins. From our paper:

"Interestingly, one can relax the negative control gene assumption by requiring instead the identification of a set of J_c positive or negative controls, for which the value of β_c is known a priori but need not be zero. Then, Xβ_c is known and one can perform the singular value decomposition of log Y_c - Xβ_c - O_c to estimate W as in step 3 of RUVg above. Steps 4 and 5 remain the same. This allows us to make full use of all 92 ERCC spike-in controls for the SEQC data set."

I hope this helps.

Comment, 3.5 years ago by Pietà Schofield:

I have been struggling with this paragraph in the paper for some time, and I am still unsure how it is possible to use positive controls. How and where does one specify the expected values for the fold changes?

Reply, 3.4 years ago by davide risso:

Let me omit the offset for simplicity. The model for the control genes becomes log Y_c ~ X b_c + W a_c. If b_c is known (and different from zero), you can estimate W a_c by SVD on the matrix log Y_c - X b_c.

In R, it would be something like:

logY <- log1p(Y)
# one row per control, one column per sample: b_c[g] * X[s]
logY[controls,] <- logY[controls,] - outer(b_c[controls], X)

Assuming that Y is the full matrix of counts (genes in rows), controls is an indicator of the positive controls, b_c holds the known log fold changes, and X is a one-dimensional indicator variable (i.e., two-class comparison).

Answer: DESeq2 results with RUVSeq values
4.1 years ago by Michael Love (United States):

Add the factor(s) of unwanted variation to the column data:

dds$fuv1 <- fuv1

Then include the factor(s) in the design:

design(dds) <- ~ fuv1 + fuv2 + condition
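
To see what such a design encodes, here is a minimal base-R sketch. It assumes a single factor (fuv1) and reuses the W_1 values and sample names posted in this thread; the object names col_data and design_matrix are made up for illustration. It only builds the model matrix and does not run DESeq2.

```r
# Factor of unwanted variation and condition labels, copied from the thread.
col_data <- data.frame(
  fuv1 = c(-0.4209557, -0.3984871, -0.4039097,
            0.4393171,  0.4079229,  0.3761126),
  condition = factor(rep(c("ctrl", "strained"), each = 3),
                     levels = c("ctrl", "strained")),
  row.names = c("ctrl1", "ctrl2", "ctrl3", "strain1", "strain2", "strain3")
)

# The formula expands into one column per fitted coefficient:
# an intercept, a slope for fuv1, and the strained-vs-ctrl effect.
design_matrix <- model.matrix(~ fuv1 + condition, data = col_data)
print(design_matrix)
```

Keeping condition as the last term matters in practice, because results() reports the last variable of the design by default.
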
Answer: DESeq2 results with RUVSeq values
4.1 years ago by Assa Yeroslaviz (Munich, Germany):

Thanks Michael,

Somehow it doesn't really look right.

These are my samples and factors:

> pData(set1)
               x        W_1
ctrl1       ctrl -0.4209557
ctrl2       ctrl -0.3984871
ctrl3       ctrl -0.4039097
strain1 strained  0.4393171
strain2 strained  0.4079229
strain3 strained  0.3761126

and this is the analysis:

col.d$fuv1 <- pData(set1)$W_1

cds.postRUV <- DESeqDataSetFromMatrix(
  countData = counts_deseq_filt,
  colData   = col.d,
  design    = ~ fuv1 + conditions)

dds.postRUV <- DESeq(cds.postRUV)

dds.postRUV$conditions <- relevel(dds.postRUV$conditions, "ctrl")
res.postRUV <- results(object = dds.postRUV)

But then, when checking the adjusted p-values, I get the same value for almost all genes:

> table(res.postRUV$padj)

0.999903715587095 0.999947942273539 0.999989492263923 0.999997165182665
            28703                 1                 5                 1

Did I do something wrong here?

Answer: DESeq2 results with RUVSeq values
4.1 years ago by davide risso (Weill Cornell Medicine):

Hi Assa,

The reason you get these results is that your factor of UV is almost perfectly correlated with the biology: all negative values for your controls and all positive values for your "strained" samples. So you're testing and correcting for basically the same variable, and this causes trouble in the DE test.

Either your negative controls are not really negative controls and capture the biological difference between your samples, or the unwanted variation is truly perfectly correlated with the biology. If the latter is true, then I don't think RUV can help you here.
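
The confounding can be checked directly from the numbers posted above. The following is a small base-R sketch; the variable names are chosen here for illustration, and the variance inflation factor is just one common way to quantify collinearity between two predictors, not something DESeq2 or RUVSeq reports.

```r
# W_1 and condition labels as posted in the thread.
w1 <- c(-0.4209557, -0.3984871, -0.4039097,
         0.4393171,  0.4079229,  0.3761126)
condition <- factor(rep(c("ctrl", "strained"), each = 3))

# Correlation between the factor of unwanted variation and the group indicator.
is_strained <- as.numeric(condition == "strained")
r <- cor(w1, is_strained)

# Share of the variance of W_1 that condition alone explains, and the
# resulting variance inflation factor for a design containing both terms.
r_squared <- summary(lm(w1 ~ condition))$r.squared
vif <- 1 / (1 - r_squared)

print(c(correlation = r, r_squared = r_squared, vif = vif))
```

A correlation this close to 1 means the design ~ fuv1 + conditions has almost no independent information left to estimate the condition effect, which is consistent with the flat adjusted p-values reported above.
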

What are you using as negative controls?

Answer: DESeq2 results with RUVSeq values
4.1 years ago by Assa Yeroslaviz (Munich, Germany):

I am using the ERCC spike-ins as the control set.

Answer: DESeq2 results with RUVSeq values
4.1 years ago by davide risso (Weill Cornell Medicine):

It looks like the expression of the spike-ins is different in the two groups. Are you using the same mix for both controls and treated samples, or are you using Mix 1 vs Mix 2?

If you're using the same mix, it might be worth looking at the distribution of the spike-in reads across samples (e.g., box plots, MA plots, ...) as well as plotting the first principal components of the spike-in expression. If you see the samples cluster by biology, removing the variation inferred from the spike-ins will likely remove your signal of interest.
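
The principal-component check can be sketched in base R. Everything below is simulated and hypothetical: the number of spike-ins, their abundances, and the 4-fold shift in half of them are assumptions made only to show what "clustering by biology" looks like, and simulate_spikes is a helper invented for this example.

```r
set.seed(1)
group <- factor(rep(c("ctrl", "strained"), each = 3))
n_spikes <- 40
base_mean <- 2^seq(5, 12, length.out = n_spikes)  # hypothetical abundances

# Simulate Poisson counts (rows = spike-ins, columns = samples).
# fold_change multiplies the expected counts in the "strained" samples.
simulate_spikes <- function(fold_change) {
  lambda <- matrix(base_mean, nrow = n_spikes, ncol = length(group))
  strained <- group == "strained"
  lambda[, strained] <- lambda[, strained] * fold_change
  matrix(rpois(length(lambda), lambda), nrow = n_spikes)
}

same_input <- simulate_spikes(1)                                   # no group effect
shifted_input <- simulate_spikes(rep(c(4, 1), each = n_spikes / 2)) # half shifted 4-fold

# PCA on log counts with samples as rows, as prcomp expects.
pca_same <- prcomp(t(log1p(same_input)))
pca_shifted <- prcomp(t(log1p(shifted_input)))

# PC1 scores per group, and the size of PC1 in each scenario.
pc1_by_group <- split(pca_shifted$x[, 1], group)
print(pc1_by_group)
print(c(same = pca_same$sdev[1], shifted = pca_shifted$sdev[1]))
```

When the spike-ins differ between groups, the PC1 scores split cleanly by condition and PC1 is much larger than in the scenario without a group effect. On real data the same prcomp call would be applied to the log counts of the spike-in rows only.
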

Answer: DESeq2 results with RUVSeq values
4.1 years ago by Assa Yeroslaviz (Munich, Germany):

Hi Davide,

I am using Mix 1 vs Mix 2 of the ERCC spike-ins. I have tested this and it came out quite well for the two conditions. What I wasn't sure about was whether or not to use only group B of the spike-in mix, which has the same concentration in both samples. But as far as I understand from your explanation above, this will also remove the signal of interest I'm looking for.

Is this assumption correct?
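
For reference, the positive-control adjustment that davide risso describes earlier in the thread (subtract the known X b_c from the log counts of the controls, then take the SVD) can be tried on simulated data. The sketch below is an illustration under stated assumptions, not RUVSeq's implementation: the counts, the known log fold changes b_c, the loadings a_c, the hidden nuisance factor w_true, and the helper first_factor are all invented for this example, and the offset is omitted as in the reply above.

```r
set.seed(42)
x <- rep(c(0, 1), each = 3)               # two-class indicator (ctrl, strained)
w_true <- c(-0.5, 0.5, 0, 0.5, -0.5, 0)   # hidden nuisance factor, unrelated to x
n_controls <- 48
b_c <- rep(log(c(4, 0.5)), each = n_controls / 2)   # known log fold changes (assumed)
a_c <- rep(c(-0.5, 0.5), times = n_controls / 2)    # nuisance loadings (assumed)
base_mean <- 2^seq(6, 12, length.out = n_controls)

# Model from the thread: log Y_c ~ X b_c + W a_c (controls in rows, samples in columns).
log_mu <- log(base_mean) + outer(b_c, x) + outer(a_c, w_true)
Y_c <- matrix(rpois(length(log_mu), exp(log_mu)), nrow = n_controls)
logY_c <- log1p(Y_c)

# First factor of unwanted variation: center each control across samples,
# then take the first left singular vector of the samples-by-controls matrix.
first_factor <- function(log_expr) {
  centered <- sweep(log_expr, 1, rowMeans(log_expr))
  svd(t(centered))$u[, 1]
}

w_naive <- first_factor(logY_c)                       # controls used as if b_c were 0
w_adjusted <- first_factor(logY_c - outer(b_c, x))    # known X b_c subtracted first

print(c(naive_vs_group    = abs(cor(w_naive, x)),
        adjusted_vs_group = abs(cor(w_adjusted, x)),
        adjusted_vs_true  = abs(cor(w_adjusted, w_true))))
```

In this simulation the unadjusted factor tracks the group indicator, much like the W_1 posted above, while the adjusted factor tracks the simulated nuisance factor instead. Whether the same holds for a real experiment depends on how well the expected fold changes of the spike-ins are actually known.
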