Multi-factor paired RNAseq differential analysis with DESeq2
Matthew • 0
@f2420891
United States

Hello, I’m working with an RNAseq dataset from plants that are either infected with a fungus or left uninfected. I have both male and female genotypes, and each genotype has been cloned, with one clone inoculated and the other serving as the control. I’m also looking at floral and meristematic tissues, but I’m having a hard time figuring out how to use DESeq2 to make all the appropriate comparisons. I’ve seen posts like "DESeq2: Paired Test" and looked at the vignette, but I don’t understand a few things. Assume I have a metadata table with the columns sex, condition, and tissue, containing male or female, infected or uninfected, and flower or meristem, respectively. If I want to compare male infected flowers to female infected flowers, would the design look like this?

design(~sex + condition + tissue)


Or would it look more like

design(~sex*condition*tissue)


Or something else entirely? I can’t get my head around how to say that I want all the male infected flowers compared to all the female infected flowers, ignoring the other groups such as male or female uninfected flowers and any meristem tissue. Additionally, is there a way to choose between a paired and an independent comparison? I’d like comparisons like male infected flowers vs male uninfected flowers to be paired, since they compare the same genotypes, but comparisons like male infected flowers vs female infected flowers to be independent, since those are not the same genotypes, and I don’t understand how to distinguish between the two approaches. Thanks very much for any help!

DESeq2 RNASeq • 320 views
swbarnes2 ★ 1.0k
@swbarnes2-14086
San Diego

First, I'd start by looking at all your samples in PCA. Does sex have a strong effect? Does it make sense to compare one tissue with another?

The usual way to compare one subset of samples to another is to add a new column to colData that concatenates the design elements, use that column as your design, and then specify which groups to compare when you extract the results.
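As a rough sketch of that approach: the sample sheet below is hypothetical (3 genotypes per sex, each cloned into an infected and an uninfected plant, two tissues per plant), and the counts are simulated with DESeq2's `makeExampleDESeqDataSet`, so the numbers carry no biological meaning. The column name `combined_status` and the level names are assumptions chosen to match the question. Note that this design does not use the `genotype` column, so it treats the clones as independent samples.

```r
library(DESeq2)

set.seed(1)

# Hypothetical sample sheet: 2 sexes x 3 genotypes x 2 conditions x 2 tissues
coldata <- expand.grid(
  tissue    = c("flower", "meristem"),
  condition = c("uninfected", "infected"),
  genotype  = paste0("g", 1:3),
  sex       = c("male", "female"),
  stringsAsFactors = FALSE
)
# Make genotype labels unique across sexes (m_g1, f_g1, ...);
# this column is kept in colData but not used by the design below
coldata$genotype <- paste(substr(coldata$sex, 1, 1), coldata$genotype, sep = "_")

# New column that concatenates the design elements
coldata$combined_status <- factor(
  paste(coldata$sex, coldata$condition, coldata$tissue, sep = "_")
)

# Simulated counts only, standing in for a real count matrix
counts <- counts(makeExampleDESeqDataSet(n = 200, m = nrow(coldata)))
rownames(coldata) <- colnames(counts)

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ combined_status)
dds <- DESeq(dds)

# Male infected flowers (numerator) vs female infected flowers (denominator)
res <- results(dds, contrast = c("combined_status",
                                 "male_infected_flower",
                                 "female_infected_flower"))
head(res)
```

Any other pair of groups can be compared from the same fitted object by changing the two level names in `contrast`.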


Just to underscore swbarnes2's advice above, I'd also recommend discussing the statistical design with a local statistician or someone familiar with linear models in R. It's really worth working this out, so that you aren't guessing at the design or at the interpretation of the coefficients.


Thanks for the suggestion, I'll give that a try! Based on the PCA, all three factors do tend to influence the results, so I'll try the column addition. Just for clarification, when you say to concatenate design elements into a new column of colData, would that look like having a column called something like "combined_status" that says "male_infected_flower" in addition to having a column where you specify sex, another for infection status, and a third for tissue, then just using that combined status column as your design, like this?

design(~combined_status)


You can have as many columns as you like in colData; only those specified in the design will be used when you run the DESeq command. So sure, have the individual columns and the combined one. You can ask different questions with the separate columns later.
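To illustrate, here is a small base-R sketch with a hypothetical sample sheet that holds both the separate columns and the combined one. It uses `model.matrix` as a stand-in to show how the design formula, not the full set of columns, determines what gets modelled; as far as I know DESeq2 builds its model matrix from the design formula and colData in much the same way, but treat that as an assumption.

```r
# Hypothetical sample sheet with the separate factors and the combined column
coldata <- expand.grid(
  tissue    = c("flower", "meristem"),
  condition = c("uninfected", "infected"),
  sex       = c("male", "female")
)
coldata$combined_status <- factor(
  paste(coldata$sex, coldata$condition, coldata$tissue, sep = "_")
)

# Design on the combined column: one coefficient per group
# (the first level is absorbed into the intercept)
mm_combined <- model.matrix(~ combined_status, data = coldata)

# Additive design on the separate columns: intercept + one term per factor
mm_additive <- model.matrix(~ sex + condition + tissue, data = coldata)

ncol(mm_combined)      # 8
ncol(mm_additive)      # 4
colnames(mm_additive)
```

With an existing DESeq2 object, the design can be swapped with `design(dds) <- ~ sex + condition + tissue`, after which `DESeq()` has to be run again before extracting results.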