Adding a group of samples reduces DEGs in other groups
Jonathan ▴ 10
@c31cf0e5
United States

I have a simple bulk RNA-seq experiment in which I compare relatively matched lesional and non-lesional samples using the voom-duplicateCorrelation-limma pipeline. Recently, I added a group of controls and, consequently, two more contrasts (lesional vs. controls, non-lesional vs. controls). To my surprise, this reduced the number of DEGs in the lesional vs. non-lesional comparison.

Two questions:

  1. If I understood correctly, is this because there is a high variance within the control group, and this variance affects the voom results of the entire experiment?
  2. Could I mitigate this by running the three pipelines separately, each with two compared groups and one contrast comparison?

Thanks!

RNASeq limma

Are libraries prepared in the same batch as the samples that were already present?


Yes, all libraries were prepared in the same batch.

@gordon-smyth
WEHI, Melbourne, Australia

> is this because there is a high variance within the control group, and this variance affects the voom results of the entire experiment?

Presumably yes, but you should directly examine whether the controls are variable by plotting the data (plotMDS) rather than making indirect conclusions from the number of DE genes.
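As a minimal sketch of that check (the counts, sample names, and group labels below are simulated purely for illustration and are not the poster's data), an MDS plot of log-CPM values coloured by group might look like this:

```r
library(edgeR)  # attaches limma, which provides plotMDS

# Simulated counts for illustration only: 1000 genes, 4 samples per group
set.seed(1)
group <- factor(rep(c("Control", "Lesional", "NonLesional"), each = 4))
counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))

# Normalize library sizes and convert to log2 counts per million
dge <- DGEList(counts = counts, group = group)
dge <- calcNormFactors(dge)
logcpm <- cpm(dge, log = TRUE)

# Samples from a highly variable group will appear widely scattered
mds <- plotMDS(logcpm, labels = group, col = as.integer(group))
```

If the control samples spread out much more than the lesional and non-lesional samples, that supports the explanation that the controls are inflating the variance estimates.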

> Could I mitigate this by running the three pipelines separately, each with two compared groups and one contrast comparison?

No, we definitely do not recommend that. Far better is to estimate quality weights either for individual samples or for groups (using voomLmFit or voomWithQualityWeights). To estimate sample weights with voomLmFit set sample.weights=TRUE. To estimate group weights, set var.group=Group.
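The two options can be sketched as follows. This is an illustrative example on simulated counts, assuming a three-level factor named `group` and a no-intercept design. The object names (`fit_sw`, `fit_gw`, `LvsNL`) are hypothetical, and the `block` argument used for paired samples is omitted to keep the sketch short:

```r
library(edgeR)  # provides voomLmFit; attaches limma

# Simulated counts for illustration only
set.seed(1)
group <- factor(rep(c("Control", "Lesional", "NonLesional"), each = 4))
counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))

# One coefficient per group, so that contrasts are easy to write
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

# Option 1: one quality weight per sample (suited to outlier samples)
fit_sw <- voomLmFit(counts, design, sample.weights = TRUE)

# Option 2: one variance weight per group (suited to a more variable group)
fit_gw <- voomLmFit(counts, design, var.group = group)

# All three groups stay in one model; only the contrast of interest is tested
contr <- makeContrasts(LvsNL = Lesional - NonLesional, levels = design)
fit2 <- eBayes(contrasts.fit(fit_gw, contr))
topTable(fit2, coef = "LvsNL")
```

Either fit keeps all samples in a single model, so the residual degrees of freedom from the controls are retained while their extra variability is down-weighted.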


Thank you. What are the pros/cons of using sample weights vs. using group weights? If I understand correctly, it seems that calculating individual sample weights is more computation-heavy, but more accurate.


No, individual sample weights are neither more computationally heavy nor necessarily more accurate.

If (after exploring your data with QC plots) your data seems to have outlier samples, then you should use sample weights. If the issue isn't outliers but rather systematically higher variability in one group than in another, then you should use group weights. It is not a matter of pros and cons but rather a matter of matching the analysis to the nature of the data.


The voomLmFit documentation says: "var.group - optional vector or factor indicating groups to have different array weights". I also tried running it with var.group=TRUE, but it failed because var.group had the wrong length. It seems that I should specify the groups, e.g. voomLmFit(..., var.group = phenoData$Group), is that right?

Additionally, if both outlier samples and increased variability are issues, can I use both options? e.g., voomLmFit(counts = DGE.cpm, design = design, block = phenoData$SubjectID, sample.weights = T, var.group = phenoData$Group)?

Thank you, I really appreciate it.


Yes, var.group should be the group factor. Sorry, I typed the wrong thing in my answer above, now corrected.

No, you cannot specify both options. The function can actually handle very general possibilities via the var.design argument, but I recommend that you stick to one of the two options I mentioned.
