Can normalized CPM values be used as RUVs input?
sooby • 0

Hi,

I am considering removing the batch effect from my samples to decrease the variation among replicates. RUVSeq is great and easy to use, and its documentation explains in detail how to use it for DE gene analysis. But I still have a question: I need to use batch-corrected gene expression values in my downstream analysis. I have one solution in mind, but I am not sure whether it is valid. Please help me.

My solution is :

1. Use edgeR to generate normalized CPM values.

2. Pass the normalized CPM values to RUVSeq and use the RUVs function to remove the unwanted variation (I chose RUVs because I have three replicates for each condition).

3. Use the normCounts function to get the corrected gene expression values.
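
As a rough illustration of the arithmetic behind step 1 only: a normalized CPM value divides each count by an effective library size (library size times a normalization factor) and scales the result to one million. The sketch below is plain Python rather than edgeR; the counts and normalization factors are invented toy numbers, and `normalized_cpm` is a hypothetical helper, not an edgeR function. In practice edgeR estimates the factors from the data.

```python
def normalized_cpm(counts, norm_factors):
    """Counts-per-million using effective library sizes.

    counts: list of rows (genes), each a list of per-sample counts.
    norm_factors: one scaling factor per sample (assumed given here).
    """
    n_samples = len(counts[0])
    # Library size = column sum of the raw counts
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    # Effective library size = library size * normalization factor
    eff = [lib_sizes[j] * norm_factors[j] for j in range(n_samples)]
    return [[row[j] * 1e6 / eff[j] for j in range(n_samples)] for row in counts]


# Toy data: 3 genes x 2 samples (library sizes 100 and 200)
counts = [[10, 20], [30, 60], [60, 120]]
norm_factors = [0.5, 2.0]  # made-up factors for illustration
cpm = normalized_cpm(counts, norm_factors)
print(cpm[0])
```

Steps 2 and 3 are not sketched here, since they depend on the RUVSeq functions themselves.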

After that, I would have normalized CPM values that are adjusted for unwanted variation. I will only use these values to explore the expression profiles of some specific genes in my samples. For the DE gene analysis, I will follow the examples shown in the RUVSeq manual.

Is my solution valid?

Or, if you have any suggestions, please let me know.

Best,

Sooby.

rnaseq ruvseq • 1.6k views
davide risso ▴ 950
University of Padova

I believe that your approach is reasonable for exploratory data analysis. As you point out, we recommend a different approach for differential expression, but your proposed solution will work if the objective is data exploration.

 


Thanks for your fast reply, Davide.

I found that when I used the RUVs function, the W values I got for every sample were all negative. Is that normal? Additionally, do you have any suggestions for choosing the best k in an RUV analysis? I found that a larger k gives better clustering of the replicates, but I think that with a large k we may lose some true DE genes. What is your opinion?

Sooby.



The values in W are not very informative: since we need to estimate both W and alpha, these are unidentifiable (e.g., flipping the signs of both alpha and W will give you the same solution). Increasing k too much may be a problem, especially if the negative control genes are not a perfect set (we generally find that k <= 3 is on the safe side).


Let me rephrase that: the values in W are actually informative, but not on an absolute scale, i.e., they are only informative in relation to one another and not for their actual values.
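
The sign ambiguity, and the relative rather than absolute meaning of W, can be checked with a small numerical sketch. This is plain Python with made-up toy values, not RUVSeq output: `W` stands for the samples-by-k matrix of unwanted factors, `alpha` for the k-by-genes matrix of their coefficients, and `matmul` and `negate` are hypothetical helpers introduced for illustration. Negating both matrices leaves the product of W and alpha unchanged, and the size of the gap between two samples in W is unaffected by the flip.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def negate(m):
    """Flip the sign of every entry of a matrix."""
    return [[-x for x in row] for row in m]


# Toy values: 3 samples, k = 1 factor, 2 genes (all W entries negative)
W = [[-0.5], [-0.25], [-1.0]]
alpha = [[2.0, -4.0]]

# The fitted unwanted-variation term is identical after flipping both signs
fit = matmul(W, alpha)
fit_flipped = matmul(negate(W), negate(alpha))
same_fit = fit == fit_flipped
print(same_fit)  # True

# Differences between samples keep their magnitude under the flip
W_flipped = negate(W)
gap = abs(W[0][0] - W[2][0])
gap_flipped = abs(W_flipped[0][0] - W_flipped[2][0])
```

So an all-negative W is not a warning sign in itself; what matters is how the samples compare with one another.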


Thanks Davide,

I think I get your point about the W values, and I will pay attention to the k values.

Thanks again,

Sooby.
