edgeR: How to set model for glmQLFTest in time-course experiment (compared to glmLRT?
1
0
Entering edit mode
@heikkisarin-13379
Last seen 5 months ago

Hi, 

We have an experiment model:  ~ condition + time + condition:subject.nested + condition:time . We're trying find any differentially expressed genes/exons between condition groups across any of the timepoints when accounting for the within subject variability. When we setup the model.matric we get 36 coefficients in total.

We can test it with glmLRT(fit, coef=35:36). Doing so glmLRTtest tests between two models: 1) full: ~ condition + time + condition:subject.nested + condition:time,  and 2) reduced: ~ condition + time + condition:subject.nested.

However we would like to do this same also with glmQLFtest. How can we do this? How does one have to setup the coefficients or contrast? 

Really appreciate all the help!

 

edgeR edger "glmqlftest()" timecourse model • 1.1k views
ADD COMMENT
1
Entering edit mode
@ryan-c-thompson-5618
Last seen 9 months ago
Scripps Research, La Jolla, CA

As far as I know, the quasi-likelihood F test supports all the same designs and contrasts as the LRT, and the usage is almost identical. Just swap out glmFit for glmQLFit and swap out glmLRT for glmQLFTest. If you're having a specific problem getting it to work, please update your question with details with your code and the error it produced.

ADD COMMENT
0
Entering edit mode

There is no problems or errors when running the glmQLFit and glmQLFTest instead of LRT approach. The point of interest on the matter is that is the statistical background of both approaches similar in such way that if I also set the coef=35:36  in glmQLFtest the test will produce an answer to the same question as with glmLRT?

ADD REPLY
1
Entering edit mode

Yes, it will.

ADD REPLY
0
Entering edit mode

Thank you very much!

Little follow-up on the matter. Is it an issue if on plotQLDisp() the Average Log2 CPM value axis and the values plotted start from -5 ? Otherwise the plot seems symmetrical and somewhat similar to the ones in the user guide. 

ADD REPLY
1
Entering edit mode

Average abundances of -5 suggest that the lowest-abundance genes have, on average, one read per 32 million reads in each library. This seems pretty small for typical library sizes - have you forgotten to filter out low-abundance genes?

ADD REPLY
0
Entering edit mode

Yeah, I've done pre filtering before the analysis in two steps:

1) remove.zeros=TRUE  (when creating the DGEList object)

2) A <- rowSums(dge$counts)
dge <- dge[A>10, ,keep.lib.sizes=FALSE]

Should I do more aggressive pre-filtering or is it possible that the plots are fine? Also on the Dispersion estimate plot the Average Log CPM values start from the -5 value.

 

ADD REPLY
1
Entering edit mode

This seems okay; I can only assume that you must have very large libraries (>100 million reads, perhaps?).

ADD REPLY
0
Entering edit mode

Well, that's good to hear. Yeah, on average around 100 million reads and highest ones a bit over 200 million. 

ADD REPLY

Login before adding your answer.

Traffic: 584 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6