DESeq design and contrast with multiple factors
@fc2212d3
Last seen 12 months ago
United Kingdom

Hi everyone, I've been trying to teach myself R, mostly for RNA-seq analysis, and I feel like I'm making good progress, but I still can't completely wrap my head around the design formula. From what I've read, the order of the factors after the '~' doesn't matter. Is that correct?

I have a few hundred libraries from five different phenotypes (let's call them A, B, C, D & E) from patients who are either progressors (P) or non-progressors (NP). From what I can tell, based on running various PCAs, the major separator is phenotype.

I regularly want to find out differences between progressors (P) and non-progressors (NP) (colData$NP_P) for each given phenotype (colData$Pheno), but also differences between the 5 phenotypes irrespective of progression status of the patient.

At the moment I just do:

dds <- DESeqDataSetFromMatrix(countData=mat,colData=colData,design=~Pheno)

And when I want to look at NP vs P for a given phenotype, I filter the colData for that phenotype and run:

dds <- DESeqDataSetFromMatrix(countData=mat,colData=colData,design=~NP_P)

Is this the wrong way to go about it? Should I be using ~Pheno + NP_P, or ~Pheno + NP_P + Pheno:NP_P? I'm confused!
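
To make the difference between these candidate designs concrete, here is a minimal base-R sketch on a made-up sample table. The `toy` data frame, its three phenotypes and the replicate count are assumptions for illustration, not the real colData. It only lists the model-matrix columns each formula implies. DESeq2 derives its coefficients from the formula in a comparable way, although the names reported by resultsNames(dds) are formatted differently.

```r
# Hypothetical sample table: 3 phenotypes x 2 progression states, 2 replicates each
toy <- expand.grid(Pheno = c("A", "B", "C"), NP_P = c("NP", "P"), rep = 1:2)
toy$Pheno <- factor(toy$Pheno, levels = c("A", "B", "C"))  # A is the reference level
toy$NP_P  <- factor(toy$NP_P,  levels = c("NP", "P"))      # NP is the reference level

# Additive model: one P-vs-NP effect shared by all phenotypes
additive_cols  <- colnames(model.matrix(~ Pheno + NP_P, data = toy))

# Same terms in the other order: same columns, listed in a different order
reordered_cols <- colnames(model.matrix(~ NP_P + Pheno, data = toy))

# Interaction model: the P-vs-NP effect is allowed to differ between phenotypes
interact_cols  <- colnames(model.matrix(~ Pheno + NP_P + Pheno:NP_P, data = toy))

print(additive_cols)
print(reordered_cols)
print(interact_cols)
```

In the interaction model the NP_PP column is the P vs NP effect in the reference phenotype (A here), and each Pheno:NP_P column is the extra P vs NP effect in that phenotype relative to A. Note that in DESeq2, calling results() without arguments reports the last variable in the design, so term order can still change which comparison is returned by default.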

Lastly, if I do ~Pheno + NP_P + Pheno:NP_P, how do I set up the contrast for the Pheno:NP_P part? I tried: res <- data.frame(results(dds, contrast=c("PhenoA","NP","P"))) but it doesn't work. I tried to figure it out with resultsNames(dds) but couldn't.
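
One commonly used way to get a P vs NP comparison within a single phenotype is to combine the two factors into one grouping factor and contrast its levels directly. The DESeq2 vignette describes this in its section on interactions. The sketch below uses base R only, on a made-up sample table (`toy`, `group` and `contrast_A` are hypothetical names). The DESeq2 calls appear as comments because they need the real count matrix, and they assume the design is set to ~ group.

```r
# Hypothetical sample table: 3 phenotypes x 2 progression states, 2 replicates each
toy <- expand.grid(Pheno = c("A", "B", "C"), NP_P = c("NP", "P"), rep = 1:2)

# One level per phenotype/progression combination, e.g. "A_NP", "A_P", ...
toy$group <- factor(paste(toy$Pheno, toy$NP_P, sep = "_"))
group_levels <- levels(toy$group)
print(group_levels)

# Contrast in the three-element form: c(factor name, numerator level, denominator level)
# This one would give log2 fold changes of P over NP within phenotype A.
contrast_A <- c("group", "A_P", "A_NP")
stopifnot(all(contrast_A[2:3] %in% group_levels))

# With DESeq2 (not run here; assumes the columns of `mat` match the rows of colData):
# colData$group <- factor(paste(colData$Pheno, colData$NP_P, sep = "_"))
# dds <- DESeqDataSetFromMatrix(countData = mat, colData = colData, design = ~ group)
# dds <- DESeq(dds)
# res_A <- results(dds, contrast = c("group", "A_P", "A_NP"))
```

The first element of the contrast has to be the name of a variable in the design and the other two have to be levels of that variable, which is why c("PhenoA", "NP", "P") is rejected: "PhenoA" is not a column of colData. Swapping the last two elements flips the sign of the log2 fold changes.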

Any help is greatly appreciated!

Thanks!

RNASeq DESeq2 • 657 views
@mikelove
Last seen 13 hours ago
United States

For questions about a statistical analysis plan, I recommend working with a local statistician or someone familiar with linear models in R. I have to restrict my time on the support site to software-related questions.