I have pairs of RNA-seq samples from many cells that were exposed to different targeting and non-targeting knockdowns against three different genes. Each pair consists of material that comes from the inner part of the cell and the outer part of the cell. So we don't expect the two samples in a pair to be the same or to have the same baseline, but we are interested in the differences between those locations. My understanding of pairs in fishpond is that I should probably use pairing because it assumes the baselines are the same. Therefore, if we use pairing, the differences will be highlighted, which is what I want.
I want to run three different tests:
- targeting vs non-targeting guides for the same gene target
- inner cellular zone vs the cellular periphery for each knockdown guide
- the interaction between knockdown guide and cellular zone
After poring over the documentation, I am still unclear whether I should use a covariate term and/or a pairing term for each of these.
I have done the following so far:
- One simple condition term for test 1
- One simple condition term for test 2
- A condition plus pairing for test 3. The documentation suggests that should be equivalent to an interaction term. Is that correct? Because it only allows 1 sample per condition for the pairing, I would expect that I should actually include the covariate term so that I can do a four-way comparison (2 guides x 2 cellular zones). The covariate term is actually something this model considers a batch effect and will attempt to correct for, isn't that correct?
Finally, I am a little unclear about which output I should be most concerned with for each question. Currently, I am looking at the LFC. But the actual test statistic may be more valuable, since it seems to reflect both the strength of evidence (as in the q-value) and the size of the difference (as in the LFC). I would expect the Mann-Whitney Wilcoxon test to give a U and a p-value, not a q-value.
Is my current approach the correct one for what I hope to learn?
When I attempt to edit my original question, it says there is a field that is required, but I cannot find it, so apologies if this is not the right place to respond.
My experimental design can be summarized with the following table:
The soma and neurite samples from each pair come from the same cell. Even though we expect those samples to be different, we are interested in their differences, so the pair term seems appropriate.
I want to run three different tests:
How I have approached this so far:
I think there is a temporary bug in the edit button on the support site, but this works. Of your approaches:
1 - looks correct
2 - I would use
swish(y, x="cellular_zone", pair="pair")
so that you have more power by accounting for the pairs
3 - I would only include pair if you were comparing across cellular zone (CZ), so not exactly. If you want to know whether the CZ effect changes across condition, that would be an interaction design, as in here: https://thelovelab.github.io/fishpond/articles/swish.html#interaction-designs
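For reference, here is a minimal sketch of the paired call and the paired interaction call, run on simulated data. The column names `cellular_zone`, `pair`, and `guide` are assumptions chosen to mirror the design described in the question, and `makeSimSwishData()` only stands in for a real import, so treat this as an illustration rather than a recipe for the actual dataset.

```r
library(SummarizedExperiment)
library(fishpond)

set.seed(1)
# Simulated stand-in for real data: 20 samples with inferential replicates
y <- makeSimSwishData(n = 20)

# Hypothetical sample annotations (assumed names, not from the original data):
# samples 1-10 are soma, samples 11-20 are neurite
y$cellular_zone <- factor(rep(c("soma", "neurite"), each = 10),
                          levels = c("soma", "neurite"))
# each cell contributes exactly one soma and one neurite sample
y$pair <- factor(rep(1:10, times = 2))
# cells 1-5 received a non-targeting guide, cells 6-10 a targeting guide
y$guide <- factor(rep(rep(c("nontargeting", "targeting"), each = 5), times = 2))

# Standard preprocessing before swish
y <- scaleInfReps(y)
y <- labelKeep(y)

# Test 2: zone effect, accounting for the pairing within each cell
y_pair <- swish(y, x = "cellular_zone", pair = "pair")

# Test 3: does the zone effect differ between the two guide groups?
y_interaction <- swish(y, x = "cellular_zone", cov = "guide",
                       pair = "pair", interaction = TRUE)

# Per-feature results are stored in the row metadata
head(mcols(y_interaction)[, c("stat", "log2FC", "pvalue", "qvalue")])
```

As I understand the vignette, in the paired interaction case the within-pair log fold changes are compared across the two levels of the covariate, so each cell needs one sample per zone and the covariate needs exactly two levels. The `qvalue` column comes from the permutation procedure in `swish`, which is why the output contains a q-value in addition to the test statistic and LFC.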