2-Factor Experiment in edgeR, Effect of no reps in one set and two reps in others
@nancyjwahl7-11700

I am analyzing differential gene expression from an RNA-Seq experiment in which two factors were applied. I have multiple replicates among samples when just Factor 1 is considered, and there is significant clustering among the groups. When Factor 2 is considered as well, one group (one level of Factor 2 at a certain level of Factor 1) has no replicates, while the other levels of Factor 1 have 2 replicates for each level of Factor 2. I have good reason not to discard the group with no replicates, because the annotated set of differentially expressed genes strongly indicates that the effect of Factor 2 is significant.

My question is, do I need to do anything special with respect to the dispersion factor, or will the factor derived from the replicated samples be applied to the group lacking replicates? Thanks!

Nancy

rnaseq edger differential expression dispersion
@ryan-c-thompson-5618
Icahn School of Medicine at Mount Sinai…

If your design is ~Factor1 + Factor2, then you have nothing to worry about, since all the coefficients will be estimated with some replication. If your design is ~Factor1 * Factor2 (or ~Group, where Group is all the unique combinations of Factor1 and Factor2), then you do indeed have a group with no replication. You can still fit the model and test for differential expression using this group, because the other groups have replication from which the dispersions can be estimated.

The only caveats are the obvious ones. First, the group with no replication will not contribute anything to the dispersion estimation. If that group happens to have a higher dispersion than others, this would result in you underestimating the overall dispersions for the whole experiment, and thereby overestimating your significance. Second, the expression estimate for that group will be unreliable. In theory, the p-values should take this into account, but you can only ask so much of a statistical method, and with an N of 1, they might not be very reliable.
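The replication argument can be checked by counting residual degrees of freedom (number of samples minus the rank of the design matrix), since those are what the dispersion estimate draws on. The sketch below is only an illustration in plain Python, not edgeR code. The layout is an assumption, because the question does not give the exact numbers: Factor1 with hypothetical levels A, B, C, Factor2 with hypothetical levels ctrl and trt, two samples in every cell except a single sample in (C, trt).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination over exact fractions."""
    m = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Eliminate this column from every other row.
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def residual_df(design):
    """Residual degrees of freedom: samples minus estimable coefficients."""
    return len(design) - rank(design)

# ASSUMED layout (hypothetical level names and counts): samples per cell.
cells = {
    ("A", "ctrl"): 2, ("A", "trt"): 2,
    ("B", "ctrl"): 2, ("B", "trt"): 2,
    ("C", "ctrl"): 2, ("C", "trt"): 1,  # the unreplicated group
}
samples = [cell for cell, n in cells.items() for _ in range(n)]

# Additive design (~Factor1 + Factor2): intercept, B, C, trt.
additive = [[1, int(f1 == "B"), int(f1 == "C"), int(f2 == "trt")]
            for f1, f2 in samples]

# Group design (~0 + Group): one indicator column per cell.
group = [[int((f1, f2) == cell) for cell in cells] for f1, f2 in samples]

additive_df = residual_df(additive)
group_df = residual_df(group)

# In the group design each cell contributes (n - 1) residual df.
per_cell_df = {cell: n - 1 for cell, n in cells.items()}

print("additive residual df:", additive_df)
print("group residual df:", group_df)
print("residual df from the unreplicated cell:", per_cell_df[("C", "trt")])
```

Under this assumed layout the additive design leaves more residual degrees of freedom than the group design, and in the group design the single-sample cell contributes none, which matches the first caveat: its fitted value is its own observation, so it carries no information about variability.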

Thanks for your helpful reply. I plan to use the Factor1 + Factor2 design.

Nancy
