Question: edgeR: estimating dispersion for nested designs
4.8 years ago by
Mauve0
Norway
Mauve0 wrote:

Hi,

I have an RNA-seq experiment that is similar to the one in Section 3.5 of the edgeR User's Guide, i.e. a nested paired design, and I have used that approach to analyze my own data. Briefly, the experiment involves 60 RNA samples from two groups of bacterial strains, commensal (C) and disease-causing (D). Each group consists of 15 different strains, and each strain was sampled both treated with a chemical (IND) and untreated (CTR). My questions are about estimating dispersion in this type of scenario (which is skipped in the User's Guide):

  1. Can I correctly estimate common/trended and tagwise dispersion using estimateGLMCommonDisp/estimateGLMTrendedDisp and estimateGLMTagwiseDisp (relative to a design matrix), even though there are no true biological replicates?
  2. How do I calculate the prior degrees of freedom in this case?

Any help will be greatly appreciated.

modified 4.8 years ago • written 4.8 years ago by Mauve0

See Section 2.10 of the edgeR User's Guide, "What to do if you have no replicates".

written 4.8 years ago by Gordon Smyth39k

So which option was used in the Section 3.5 example?

written 4.8 years ago by Mauve0

The Section 3.5 example has replicates. There are 18 samples, and the design matrix has only 12 columns, so there are 6 residual df for estimating the dispersion. Hence all the edgeR glm dispersion estimation methods work.

PS. Please be careful to post follow-up questions as comments rather than answers. I have moved our interchange so far to be comments on your original question.

modified 4.8 years ago • written 4.8 years ago by Gordon Smyth39k
Answer: edgeR: estimating dispersion for nested designs
4.8 years ago by
Gordon Smyth39k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth39k wrote:

I think that you may have misunderstood the example in Section 3.5. You seem to be assuming that it has no replicates, but there are 18 samples, and the design matrix has only 12 columns, so there are 6 residual df for estimating the dispersion. Hence all the edgeR glm dispersion estimation methods work.
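The residual df count above can be checked by hand. The following is a minimal plain-Python sketch, not edgeR code. It assumes the Section 3.5 layout is three disease groups with three patients each, every patient measured with and without treatment (an assumption that matches the 18 samples and 12 columns quoted above). The helpers `nested_paired_design` and `matrix_rank` are hypothetical names introduced here, and the design is written with one indicator per patient plus one treatment indicator per group, which should span the same column space as the formula-based design matrix.

```python
from fractions import Fraction


def nested_paired_design(n_groups, n_subjects):
    """Build a 0/1 design matrix: one column per subject, plus one
    treatment column per group. Each subject contributes two rows
    (untreated, treated)."""
    n_subj_total = n_groups * n_subjects
    rows = []
    for g in range(n_groups):
        for s in range(n_subjects):
            for treated in (0, 1):
                row = [0] * (n_subj_total + n_groups)
                row[g * n_subjects + s] = 1          # subject baseline
                if treated:
                    row[n_subj_total + g] = 1        # group-specific treatment effect
                rows.append(row)
    return rows


def matrix_rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[rank][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


design = nested_paired_design(n_groups=3, n_subjects=3)
n_samples = len(design)
rank = matrix_rank(design)
residual_df = n_samples - rank
print(n_samples, rank, residual_df)  # 18 12 6
```

The point of the sketch is only that residual df is the number of samples minus the rank of the design matrix; any positive value leaves information for estimating dispersion.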

You have not entirely explained the purpose of your experiment. Do you want to find genes DE between IND and CTR, treating the different strains as biological replicates? In that case, your experiment is like Section 3.5 and you do have replicates.

If you want to find genes that are DE between IND and CTR for each strain separately, then you don't have replicates, and I don't think you can do any formal statistical analysis. Just compute fold changes.
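The difference between the two scenarios comes down to counting parameters. Below is a rough plain-Python sketch of that count for the 60-sample experiment in the question (2 groups, 15 strains per group, CTR and IND for each strain). It assumes full-rank designs with one baseline per strain and either one treatment effect per group or one per strain; the function name `residual_df` is a hypothetical helper, not an edgeR function.

```python
def residual_df(n_groups, strains_per_group, per_strain_effects):
    """Residual degrees of freedom = samples - fitted coefficients,
    assuming a full-rank design with one baseline per strain."""
    n_strains = n_groups * strains_per_group
    n_samples = 2 * n_strains                    # each strain has CTR and IND
    n_baseline = n_strains                       # one baseline level per strain
    # Treatment effect: one per strain, or one shared within each group
    n_treatment = n_strains if per_strain_effects else n_groups
    return n_samples - (n_baseline + n_treatment)


# Group-level IND vs CTR effects, strains acting as replicates
group_level = residual_df(2, 15, per_strain_effects=False)

# A separate IND vs CTR effect for every strain
per_strain = residual_df(2, 15, per_strain_effects=True)

print(group_level, per_strain)  # 28 0
```

With group-level treatment effects, 60 - 32 = 28 residual df remain for dispersion estimation. With a separate effect per strain, the model has as many coefficients as samples, so nothing is left over.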

written 4.8 years ago by Gordon Smyth39k
Answer: edgeR: estimating dispersion for nested designs
4.8 years ago by
Mauve0
Norway
Mauve0 wrote:

Thank you for clarifying. I was under the impression that real biological replicates were required to estimate dispersion correctly, but I realize now that, as I am only interested in group-level differences in expression between CTR and IND, the strains can be considered replicates.

written 4.8 years ago by Mauve0