Looking at the raw count data, ~55% of my genes have zero counts in all samples (64 samples in total across 8 genotypes). Although filterByExpr effectively removed many unwanted genes that are expressed at low levels but yield high fold changes, I noticed that some biologically interesting genes were removed as well. For example, if I take the subset of samples for a single genotype, with two time points each for control and stress, the following pattern (i.e., induction by stress at 2h) is interesting for my biological question:
        2h_C1_1 2h_C1_2 4h_C1_1 4h_C1_2 2h_S1_1 2h_S1_2 4h_S1_1 4h_S1_2
Gene_x        0       0      18      15      48      54      29      35
How can I prevent genes like this from being filtered out?
This is the code I use:
group <- factor(paste(targets$Treat,targets$Time,targets$Genotype,sep="."))
cbind(targets,Group=group)
y <- DGEList(counts=x)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
design <- model.matrix(~0+group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
my.contrasts <- makeContrasts(--contrast--,levels=design)
qlf <- glmQLFTest(fit, contrast=my.contrasts[--contrast--])
topTags(qlf)
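One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: in the code above, filterByExpr(y) is called on a DGEList that was created without group information, so the filter may be treating all samples as a single group and requiring expression in a large share of them. filterByExpr accepts a group (or design) argument, and passing it makes the required number of expressing samples depend on the smallest group size instead. The toy data below are hypothetical: Gene_x is taken from the table above, the 100 background genes and the four group labels are made up for illustration.

```r
library(edgeR)

# Hypothetical toy data: Gene_x from the table above plus 100 made-up
# background genes with a constant count, so library sizes are nearly equal.
samples <- c("2h_C1_1", "2h_C1_2", "4h_C1_1", "4h_C1_2",
             "2h_S1_1", "2h_S1_2", "4h_S1_1", "4h_S1_2")
gene_x <- c(0, 0, 18, 15, 48, 54, 29, 35)
background <- matrix(1000, nrow = 100, ncol = 8)
counts <- rbind(gene_x, background)
dimnames(counts) <- list(c("Gene_x", paste0("bg_", 1:100)), samples)

# Assumed grouping for this subset: treatment x time, two replicates each.
group <- factor(rep(c("C.2h", "C.4h", "S.2h", "S.4h"), each = 2))

# DGEList built without group information, as in the code above.
y <- DGEList(counts = counts)

# Filtering without group information: all samples count as one group.
keep_default <- filterByExpr(y)

# Filtering with the experimental groups supplied.
keep_grouped <- filterByExpr(y, group = group)

# Gene_x is the first row of the count matrix.
print(unname(keep_default[1]))
print(unname(keep_grouped[1]))
```

On this toy data the first call should drop Gene_x and the second should keep it. In the full analysis, the equivalent change would be keep <- filterByExpr(y, group=group), or building the design matrix first and using keep <- filterByExpr(y, design=design); whether that retains the genes of interest in the real 64-sample data set would need to be checked.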