Hi,
I am considering removing the batch effect in my samples to decrease the variation among the replicates. I find RUVSeq great and easy to use, and its documentation gives very specific details on how to use it in DE gene analysis. But I still have a question: I need to use batch-corrected gene expression values in downstream analysis. I have one solution, but I am not sure whether it is valid. Please help me.
My solution is:
1. Use edgeR to generate normalized CPM values.
2. Put the normalized CPM values into RUVSeq and use the RUVs function to remove the unwanted variation (because I have three replicates for each condition).
3. Use the normCounts function to get the corrected gene expression values.
After that, I get normalized CPM values adjusted by removing the unwanted variation. I will only use these values to explore the expression profiles of some specific genes in my samples. For the DE gene analysis, I will follow the examples shown in the RUVSeq manual.
Is my solution possible?
Or, if you have any suggestions, please let me know.
Best,
Sooby.
Thanks for your fast reply, David.
I found that when I used the RUVs function, the W values I got for every sample were all negative. Is that normal? Additionally, do you have any suggestions for choosing the best k in an RUV analysis? I found that a bigger k can result in better clustering of the replicates. But I think that if we use a big k, we may lose some true DE genes. What is your opinion?
Sooby.
The values in W are not very informative: since we need to estimate both W and alpha, they are unidentifiable (e.g., swapping the signs of both alpha and W will give you the same solution). Increasing k too much may be a problem, especially if the negative control genes are not a perfect set (we generally find that k <= 3 is on the safe side).
Let me rephrase that: the values in W are actually informative, but not on an absolute scale, i.e., they're only informative in relation to one another and not for their actual values.
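To make the sign argument concrete, here is a minimal sketch in plain Python (not RUVSeq code, and not how RUVs estimates anything). The numbers for W and alpha are made up, and the helper name unwanted_term is hypothetical. It only illustrates, for a single factor (k = 1), that flipping the signs of both W and alpha leaves the fitted unwanted-variation term unchanged, while the gaps between samples in W keep their magnitude.

```python
def unwanted_term(w, alpha):
    """Return the samples x genes matrix W * alpha for one factor (k = 1)."""
    return [[w_i * a_j for a_j in alpha] for w_i in w]

# Made-up values for illustration: 4 samples, 3 genes, all W entries negative.
w = [-0.6, -0.4, -0.3, -0.1]
alpha = [1.5, -2.0, 0.5]

# Flip the sign of both factors, as in the "swapping signs" remark.
w_flipped = [-w_i for w_i in w]
alpha_flipped = [-a_j for a_j in alpha]

# The product that is removed from the data is identical in both cases.
same_fit = unwanted_term(w, alpha) == unwanted_term(w_flipped, alpha_flipped)

# The gap between two samples only changes sign, not size.
gap = w[0] - w[3]
gap_flipped = w_flipped[0] - w_flipped[3]
same_gap_size = abs(gap) == abs(gap_flipped)

print(same_fit, same_gap_size)
```

So an all-negative W and the corresponding all-positive W describe the same fit; what carries information is how the samples differ from one another in W.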
Thanks David,
I think I get your point about the W values, and I will pay attention to the value of k.
Thanks again,
Sooby.