DESeq2: Comparing A Knockout to a Strain
1
0
Entering edit mode
@viktor-thiel-9079
Last seen 2 days ago
Sweden

Dear all,

we performed RNAseq for a set of mouse samples. The samples come from different mouse strains, and from different knockout lines. Here is an excerpt of the sample metadata:

sample  condition      strain
1       WT             B6
2       WT             B6
3       WT             B6
4       Gene1KO        B6
5       Gene1KO        B6
6       Gene1KO        B6
7       Gene1KO        FOO
8       Gene1KO        FOO
9       Gene1KO        FOO
10      WT             BAR
11      WT             BAR
12      WT             BAR
13      Gene2KO        B6
14      Gene2KO        B6
15      Gene2KO        B6
16      WT             FOO
17      WT             FOO
18      WT             FOO

For the B6 strain, there are other knockouts than Gene1KO and Gene2KO which I omitted.

My design is:

design(dds) <- 0 + strain + condition

In general, the comparisons are straightforward to do because we want to compare the KOs to the WTs of the same strain. They are mostly in the B6 strain.

res1 <- results(dds, contrast = c("condition", "Gene1KO", "WT")
res2 <- results(dds, contrast = c("condition", "Gene2KO", "WT")

In a few particular cases, though, we are interested in comparing WT individuals of strain BAR to the overall effects of knockout Gene1KO, essentially treating strain as a batch variable. Phenotypically, we expect them to be similar overall with potentially pronounced effects in a few specific pathways. The Gene1KO samples come from several strains, but not from strain BAR. Incorporating the interaction condition:strain is not possible because we lack Gene1KO:BAR.

I believe this does what we want:

res3 <- results(dds, contrast = list("conditionGene1KO", "strainBAR")

Is this a valid approach to take? It runs fine, but results in ~22500 significant hits out of ~22800 genes after lfcShrink, which makes me a bit suspicious ...

Thanks in advance!

DESeq2 • 323 views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 14 hours ago
United States

That last contrast doesn't make sense, comparing across coefficients from different design terms. I'd recommend consulting with a local statistician on your statistical analysis plan.

ADD COMMENT
0
Entering edit mode

Thank you for confirming this is a bad idea.

I understand of course that your time is limited. I will leave my follow-up question here anyway, and if you or someone else responds I will have learned something, and if not I will come back once I grasp my misunderstanding.

My reasoning was that, for one gene i and sample j, I essentially have a model log2(cts) = b1 strainABC + b2 strainBAR + b3 * condGene1KO + ...

To go from baseline to Gene1KO: log2(cts) = condGene1KO

To go from baseline to strain BAR: log2(cts) = strainBAR

Difference: log2FC = strainBAR - condGene1KO

Obviously I am missing something here.

ADD REPLY

Login before adding your answer.

Traffic: 560 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6