Hey Mike,
a couple of questions on DESeq2, but first of all, some code to make my questions reproducible:
library(airway)
library(DESeq2)
library(magrittr)
data(airway)
dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway), colData = colData(airway), design = ~ cell + dex)
dds_airway <- DESeq(dds_airway)
- alpha & independentFiltering. Could it be a tiny bug that, when I set independentFiltering to FALSE, alpha is somehow not "set" in the DESeqResults object? Please compare the output of these commands:
(results(dds_airway, contrast = c("dex", "trt", "untrt"), alpha = 0.05, independentFiltering = TRUE)) %>% summary
(results(dds_airway, contrast = c("dex", "trt", "untrt"), alpha = 0.05, independentFiltering = FALSE)) %>% summary
(results(dds_airway, contrast = c("dex", "trt", "untrt"), alpha = 0.05, independentFiltering = FALSE)) %>% summary(alpha = 0.05)
- For an app I am developing, I am trying to "automatically" cover the cases where the covariate is a factor, where it is continuous, and where the factor has more than two levels. A quick check that I am doing it right, according to the documentation:
factor -> contrast = the 3-element vector
numeric -> name = the character name of the numeric
more than 2 levels -> rerun DESeq with "LRT" as test and then use the full & reduced model to specify the contrast
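To make the three cases concrete, here is a minimal sketch of the decision logic in base R. The helper name `choose_results_strategy` and the shape of its return value are hypothetical (they are not part of DESeq2); it only assumes that `results()` accepts `contrast` and `name`, and that `DESeq()` accepts `test = "LRT"` together with a `reduced` formula, as described in the documentation.

```r
# Hypothetical helper (not part of DESeq2): decide how to extract results
# for one covariate of the design, following the three cases listed above.
choose_results_strategy <- function(coldata, covariate, design) {
  x <- coldata[[covariate]]
  if (is.numeric(x)) {
    # Numeric covariate: the coefficient is named after the variable itself
    return(list(type = "numeric",
                results_args = list(name = covariate)))
  }
  lv <- levels(factor(x))
  if (length(lv) == 2) {
    # Two-level factor: c(variable, numerator level, denominator level).
    # Assumption: the first level is the reference, so the contrast is
    # second level vs first level; relevel the factor to flip the direction.
    return(list(type = "two_level",
                results_args = list(contrast = c(covariate, lv[2], lv[1]))))
  }
  # More than two levels: LRT of the full design against the design
  # without this covariate
  reduced <- update(design, paste("~ . -", covariate))
  list(type = "multi_level",
       deseq_args = list(test = "LRT", reduced = reduced))
}

# Usage with the objects defined earlier (requires DESeq2 and airway):
# s <- choose_results_strategy(colData(dds_airway), "dex", design(dds_airway))
# res <- do.call(results, c(list(dds_airway), s$results_args))
```

Note that with the default alphabetical level order, the two-level case for `dex` gives untrt vs trt, which is the opposite direction of the contrast I wrote explicitly above.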
Moreover, are you by chance aware of a dataset with a (possibly meaningful) use of a continuous covariate? As a toy case I am using airway with the read length, and I am (correctly) getting very few hits. If not, do you know a robust way of simulating such a dataset?
- I have seen you now recommending the salmon path for generating the counts, especially after the DTE/DGE/DTU paper by you and Charlotte. I found it a little harder to explain to cooperation partners, given the extra modeling step already at the counting level, and this is kind of keeping me in the "old and safe" featureCounts-based approach. Do you have a suggestion on how best to sell the advantages of the new method, well, apart from linking to your paper?
Thank you in advance!
Federico
Thank you for the clarification!
As for the LRT vs pairwise, you are right. I wanted at least to prompt the user that (s)he can perform the LRT when more than 2 levels are available.
I also had my personal small portion of pain with using ReportingTools, so I know what you mean - it is still quite a great tool, kudos to the developers for it!
Thanks for the tip on the dataset, I will look deeper - and in the meantime I hope some other user has already been looking for the same thing.
Finally, good points for selling the new method. I also found a recent presentation by Charlotte @CSAMA, so I have gathered enough info to become a good prophet for the novel approach.