DESeq2 design for paired samples
1
0
Entering edit mode
@e7994b78
Last seen 24 days ago
Norway

Hi,

I have two groups of mice - wild type and mutant in which every mice had one brain hemisphere treated/damaged (ipsi) and one hemisphere healthy (contra). I would like to assess if there is a difference in response to the treatment between WT and MUT. Another question is if the healthy hemispheres differ significantly in gene expression levels (mut vs. wt).

        > head(metaData)
ID genotype   hemi
1 WT_1       WT Contra
2 WT_2       WT Contra
3 WT_3       WT Contra
4 WT_1       WT   Ipsi
5 WT_2       WT   Ipsi
6 WT_3       WT   Ipsi


Based on another topic (DESEq2 Paired samples Before and after treatment ) I made my design as:

~ mouseID + hemi* genotype


But I got an error message:

Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.Please read the vignette section 'Model matrix not full rank': vignette('DESeq2')

I read a vignette but I am still not sure how to change the design so it reflects my question (how the genotype affects the response to treatment). Or is there something I am doing wrong with the IDs, should I add a column of "mice.n" similar as in the the topic I am referring to? In that topic I understand the additional column is required because of unequal number of patients in naive/second groups.

DESeq2 paired • 209 views
0
Entering edit mode
ATpoint ▴ 850
@atpoint-13662
Last seen 19 minutes ago
Germany

Mouse ID is nested with genotype, I would omit ID from the design.

0
Entering edit mode

Thank you! Is the one below correct?

~ genotype+hemi+genotype:hemi

1
Entering edit mode

When you read the vignette, did you read the section about individuals nested within groups? If you want to compare within mouse, that section is relevant.

0
Entering edit mode

Hi Michael,

Thank you for pointing this out. I did not find this particular section reading the vignette before, now I understand that the additional column is the nested and it explains a lot. Thank you very much, I am not sure if I grasp the idea correctly but I will try and in case of troubles I will ask more precise questions.

0
Entering edit mode

edit: yes, it works perfectly, thank you so much!

 design<- (~ genotype + ind.n + genotype:ind.n + genotype:hemi)
> dds <- DESeqDataSetFromMatrix(countData = countData,
+                               design = design, tidy = TRUE)

# Calling results
> res <- results(dds)
> mcols(res, use.names=TRUE)
DataFrame with 6 rows and 2 columns
type                                  description
<character>                                  <character>
baseMean       intermediate    mean of normalized counts for all samples
log2FoldChange      results log2 fold change (MLE): genotypeMUT.hemiIpsi
lfcSE               results         standard error: genotypeMUT.hemiIpsi
stat                results         Wald statistic: genotypeMUT.hemiIpsi
pvalue              results      Wald test p-value: genotypeMUT.hemiIpsi
> # Print the coefficients
> resultsNames(dds)
[1] "Intercept"            "genotype_MUT_vs_WT"   "ind.n_2_vs_1"
[4] "ind.n_3_vs_1"         "genotypeMUT.ind.n2"   "genotypeMUT.ind.n3"
[7] "genotypeWT.hemiIpsi"  "genotypeMUT.hemiIpsi"
> results(dds, contrast=list("genotypeMUT.hemiIpsi","genotypeWT.hemiIpsi"))
log2 fold change (MLE): genotypeMUT.hemiIpsi vs genotypeWT.hemiIpsi
Wald test p-value: genotypeMUT.hemiIpsi vs genotypeWT.hemiIpsi
DataFrame with 18057 rows and 6 columns
baseMean log2FoldChange     lfcSE      stat    pvalue      padj
<numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>


So now I can use the coefficients to make contrasts - do I understand correctly that the contrast from the vignette example (results(dds, contrast=list("grpY.cndB","grpX.cndB")) and in my case results(dds, contrast=list("genotypeMUT.hemiIpsi","genotypeWT.hemiIpsi")))

is similar to Interaction term in the design with Interaction (ignoring the individuals)? I mean if I run this design:

~ genotype+hemi+genotype:hemi


which woudl give me the coefficients:

> resultsNames(dds)
[1] "Intercept"            "genotype_MUT_vs_WT"   "hemi_Ipsi_vs_Contra"
[4] "genotypeMUT.hemiIpsi"


should I get similar result for results(dds, name="genotypeMUT.hemiIpsi") ? Is this correct?

0
Entering edit mode

For confirming that a particular statistical design is what you are intending, I'd recommend working with a statistician. Beyond what we've laid out in the vignette, I don't have sufficient time to follow up on the support site with statistical design questions.