Hi, I have some transcriptomics data from a microorganism collected at different locations. I've been trying to perform an LRT using edgeR to test the location effect on transcription. What I want is to make a full model with all the coefficients, and compare it with a null model in which location plays no role. I thought I had the correct formulas but after the test absolutely all the features/tags are significant (with absurdly small FDR), so I must be doing something wrong. A mockup of my code is as follows:
    x <- normalized_counts
    site <- factor(c("a","a","b","b","c","c","d","d","e","e"))
    design <- model.matrix(~ 0 + site)
    y <- DGEList(counts=x,group=site)
    y <- calcNormFactors(y)
    y <- estimateDisp(y,design)
    fit <- glmFit(y,design)
    lrt <- glmLRT(fit, coef=c(1:ncol(fit$design)))
In case anyone knows the sleuth package, I previously used the LRT implemented there, with the following configuration:
    so <- sleuth_prep(s2c, ~ site,
                      target_mapping = t2g,
                      aggregation_column = "ens_gene",
                      transformation_function = function(x) log2(x + 0.5))
    so <- sleuth_fit(so)
    so <- sleuth_fit(so, ~1, 'reduced')
    so <- sleuth_lrt(so, 'reduced', 'full')
I would like to run a similar test using edgeR, and would greatly appreciate any insights.
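For comparison, here is a minimal sketch of how the sleuth-style reduced-versus-full comparison (`~ site` against `~ 1`) might be set up in edgeR. It assumes an intercept parameterization of the design, so that the non-intercept coefficients are the differences between each site and the baseline site; dropping all of them leaves the intercept-only model as the null. The count matrix is simulated and purely hypothetical, standing in for a matrix of raw counts (edgeR is designed for raw rather than pre-normalized counts).

```r
library(edgeR)

# Hypothetical simulated raw counts (1000 tags x 10 samples), for illustration only
set.seed(1)
site <- factor(c("a","a","b","b","c","c","d","d","e","e"))
x <- matrix(rnbinom(1000 * length(site), mu = 50, size = 10),
            ncol = length(site))
rownames(x) <- paste0("tag", seq_len(nrow(x)))

# Intercept parameterization: column 1 is the baseline (site a),
# columns 2..5 are the differences of sites b..e from that baseline
design <- model.matrix(~ site)

y <- DGEList(counts = x, group = site)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)

# Dropping every non-intercept coefficient leaves the intercept-only
# model (~ 1) as the null, i.e. "site has no effect"
lrt <- glmLRT(fit, coef = 2:ncol(design))
topTags(lrt)
```

With the no-intercept design `~ 0 + site`, each coefficient is a site-level log-abundance rather than a difference between sites, so testing all of them at once asks whether every site's expression is zero instead of whether the sites differ, which may be why every tag comes out significant.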