I am trying to use limma-voom to compare the effect of a treatment between males and females in an RNA-seq experiment. I have paired samples, and my data look like this:
sample     condition   sex   nested
sample1    pre         M     1
sample1    post        M     1
sample2    pre         M     2
sample2    post        M     2
sample70   pre         F     1
sample70   post        F     1
...        ...         ...   ...
I set up my design as follows: design = model.matrix(~ sex + sex:nested + sex:condition, data), and then ran fit <- lmFit(v, design) in accordance with the manual.
I'm confused about how best to proceed in making a contrast matrix that compares the effect of the condition between the sexes while accounting for the pairing. Any help would be much appreciated!
What does "nested" mean here? You say that you want to compare the post vs pre treatment effect between the sexes, but how does "nested" come into this?
I apologize, I am answering from a different account since I seem to have reached a posting limit. "Nested" numbers each sample within its sex group, as recommended by the edgeR manual in the section on making comparisons both between and within groups.
I would like to use limma to contrast the change in gene expression due to the condition between the sexes, so this design seemed appropriate.
I see that you are following Section 3.5 of the edgeR User's Guide. The last two coefficients in your fitted model correspond to the treatment effect in females and the treatment effect in males. So you just need to take the contrast between the last two coefficients in your model. There are examples of this in Section 3.5.
I don't know how many columns your design matrix has or what the column names are, but the following should always work.
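As a minimal sketch of taking the contrast between the last two coefficients, assuming that sex, nested and condition are all factors and that "pre" is the reference level of condition: the toy targets table, the random matrix y standing in for the voom output v, and the names cont and fit2 are hypothetical and only there to make the example run.

```r
# Hypothetical toy data: 2 males and 2 females, each measured pre and post.
targets <- data.frame(
  sex       = factor(rep(c("M", "F"), each = 4)),
  nested    = factor(rep(c(1, 1, 2, 2), times = 2)),
  condition = factor(rep(c("pre", "post"), times = 4),
                     levels = c("pre", "post"))  # "pre" is the reference level
)

design <- model.matrix(~ sex + sex:nested + sex:condition, data = targets)

# The last two columns are the post-vs-pre effects within each sex.
n <- ncol(design)
print(colnames(design)[(n - 1):n])

# Contrast vector: last coefficient minus second-to-last coefficient,
# i.e. the difference in treatment effect between the sexes.
cont <- rep(0, n)
cont[n - 1] <- -1
cont[n] <- 1
names(cont) <- colnames(design)

# Only runs if limma is installed. Random numbers stand in for real data here;
# with real data, use the EList returned by voom() in place of y.
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  y <- matrix(rnorm(50 * nrow(targets)), nrow = 50)
  fit <- limma::lmFit(y, design)
  fit2 <- limma::contrasts.fit(fit, contrasts = cont)
  fit2 <- limma::eBayes(fit2)
  print(limma::topTable(fit2))
}
```

Building the contrast by position rather than by column name avoids having to know how many nested columns the design contains. Because sex is alphabetically ordered as F then M in this sketch, the contrast is the male treatment effect minus the female treatment effect.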
Great, thank you. Yes, nested is a factor. However, since I have unequal numbers of males and females, it is my understanding that I should remove columns from the design matrix that do not correspond to any sample. Am I correct in thinking this?
Well, yes, if there are columns of the design matrix that are all zero, then you should remove them. This is not essential, however, because limma will detect and remove any superfluous columns for you automatically. You will get warning messages about non-estimable coefficients and about coefficients being set to NA.
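If you prefer to drop the all-zero columns yourself, something along these lines should do it. This is only a sketch on hypothetical toy data with 3 males and 2 females; the names targets, all_zero and dropped are made up for illustration.

```r
# Hypothetical toy data: 3 males but only 2 females, each measured pre and post.
targets <- data.frame(
  sex       = factor(rep(c("M", "F"), times = c(6, 4))),
  nested    = factor(c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2)),
  condition = factor(rep(c("pre", "post"), times = 5),
                     levels = c("pre", "post"))
)

design <- model.matrix(~ sex + sex:nested + sex:condition, data = targets)

# A column with no nonzero entries does not correspond to any sample
# (here, the nonexistent third female).
all_zero <- colSums(design != 0) == 0
dropped <- colnames(design)[all_zero]
print(dropped)

# Keep only the columns that are used by at least one sample.
design <- design[, !all_zero, drop = FALSE]
print(colnames(design))
```

The sex:condition columns remain the last two columns after the subsetting, so a contrast built from the last two positions of the reduced design still compares the treatment effects between the sexes.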