Search
Question: RNA-seq paired analysis design
0
gravatar for cjmb3
10 months ago by
cjmb30
cjmb30 wrote:

Hello,

I'm sorry if this is a recurrent question but the more I read the more I get confused!

I have RNA-seq data from patients for which I have cells either "control" (CT) or "stimulated" (S). So for each patient, I have extracted a cell type and I'm comparing 2 treatments. If I want to analyze for deferentially expressed genes between CT and S, accounting for the patient effect, what design should I use ?

I have seen different ways of doing this in Limma, EdgeR and DESeq2 manuals. My opinion was that I should use:

design=model.matrix(~patient+treatment)

but I have been told that another way to go would be:

design=model.matrix(~0+treatment+patient)

which should (I thought) be the same as above, except here I have to use the function makeContrasts as an additional step. But the 2 models give me different results!

Another issue I have is that I have a huge variability in the read/sample for this data set, i.e. ranging from 6M reads to 30M. Even after normalization for library size and using the function voomWithQualityWeights (in Limma) I still get a library size effect, since I clearly see that my samples cluster together (in a heatmap for example) depending on their amount of total reads. Plus, if I look at the FPKM values of house keeping genes (using this ref. : http://www.sciencedirect.com/science/article/pii/S0168952513000899) I can see differential expression across samples, correlating with library size, even after filtering and normalization. Any idea if there is a way to correct for this?

Many thanks for your help!

Kind regards,

Christophe

ADD COMMENTlink modified 10 months ago by Gordon Smyth32k • written 10 months ago by cjmb30

If you want package maintainers to respond to this, you'll need to put the package names as tags.

ADD REPLYlink written 10 months ago by Aaron Lun17k

Thanks Aaron, I'm going to tag some but the thing is (I think) the theory behind the design formula should be independent of the package (same principle whether you use limma or DESeq2 for example).

ADD REPLYlink written 10 months ago by cjmb30
2
gravatar for Gordon Smyth
10 months ago by
Gordon Smyth32k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth32k wrote:

Both design matrices are in principle equivalent and do give the same results with edgeR or limma-trend. However limma-voom uses an approximation that will cause slightly different results if you use the second choice. If you want to read why, see the warning on the help("contrasts.fit") document page or read a previous posting this site: limma differences with and without an intercept . The first choice is the standard choice that is recommended in the limma and edgeR User's Guides. It is the simplest and always correct.

Generally speaking you don't need to anything to control for library size. If you really think you do, then post a new question on that topic with plots and more information about your data set.

ADD COMMENTlink modified 10 months ago • written 10 months ago by Gordon Smyth32k

Thanks a lot Gordon, it is much clearer now!

ADD REPLYlink written 10 months ago by cjmb30

Hi Gordon. In the post you reference, you mentioned there that you plan to re-write the linear modeling code in limma to store more information from each fit to run around this approximation problem. It sounds like that hasn't been done yet? If not, is it still on the radar, or do you think it's something that you folks just won't be able to get to?

ADD REPLYlink written 10 months ago by Steve Lianoglou12k

I tried to do the rewrite but it is much more difficult that I originally anticipated (given the need to maintain limma's generality and speed). It's still on the radar but I won't give a timeline.

ADD REPLYlink written 10 months ago by Gordon Smyth32k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 334 users visited in the last hour