Question: edgeR vs glm.nb
dongamit123 (United States) wrote, 3.9 years ago:

Hi,

This might be quite a basic question, but it has been troubling me for a while. How is fitting a negative binomial GLM in R different from an edgeR DE analysis? I think edgeR allows one to model a different variance for each gene, which is not feasible with the standard glm.nb function in R. Other than this, do they both use the same model?

I have drug response data for cases and controls and would like to estimate the effect of the drug as well as the disease. With glm.nb I can analyze all the samples together using drug and disease as two factors (0/1). However, as far as I can tell, edgeR only allows pairwise comparisons: cases without drug vs controls without drug, or cases with drug vs controls with drug, to look at the effect of disease; and cases without drug vs cases with drug, or controls without drug vs controls with drug, to look at the effect of drug. This tremendously reduces the power of the data I have. I just want to confirm whether my understanding is correct. Any comments will be highly appreciated.

Thanks

AD

Answer: edgeR vs glm.nb
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 3.9 years ago:

No, you aren't correct in this understanding. Using glmFit and glmLRT, edgeR can fit completely general linear models, just the same as glm.nb.
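
As an illustration of the point above (not part of the original answer), here is a minimal sketch of fitting both factors at once with glmFit and glmLRT. The simulated counts, sample layout, and factor names are assumptions made up for the example; only the edgeR function names come from the thread.

```r
library(edgeR)

set.seed(1)
# Hypothetical data: 200 genes x 8 samples in a 2x2 layout of disease and drug.
disease <- factor(rep(c("control", "case"), each = 4), levels = c("control", "case"))
drug    <- factor(rep(c("none", "drug"), times = 4), levels = c("none", "drug"))
counts  <- matrix(rnbinom(200 * 8, mu = 50, size = 5), nrow = 200,
                  dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Additive model: both effects are estimated from all eight samples together.
design <- model.matrix(~ disease + drug)
y   <- estimateDisp(y, design)
fit <- glmFit(y, design)

# Likelihood ratio test for each effect, adjusted for the other factor.
lrt_disease <- glmLRT(fit, coef = "diseasecase")
lrt_drug    <- glmLRT(fit, coef = "drugdrug")
topTags(lrt_disease, n = 5)
```

The design matrix here plays the same role as the formula `~ disease + drug` would in a glm.nb call, so no subsetting into pairwise comparisons is needed.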


This is also a basic question, but I am having a hard time finding the answer in the documentation.

If y is a DGEList object that contains norm.factors and lib.size, are these already included in the GLM regression model, or must one add them explicitly via the lib.size parameter and/or the design matrix?

Put another way, if I call res = glmFit(y, design), is the model for each gene y_i ~ design + lib.size*norm.factors?

 

written 3.8 years ago by Guilherme Rocha

You do not have to (i.e., you should not) add them explicitly.

written 3.8 years ago by Steve Lianoglou

Also, the correct set-up for each gene looks more like this:

log(mu) = design %*% coef + log(lib.size*norm.factors)

where coef is a vector of estimated coefficients, mu is a vector of fitted values across libraries (an estimate of E(y) for a vector of counts y), and lib.size and norm.factors are vectors of their respective values across libraries.
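
To make the role of the offset concrete (this sketch is not from the original reply), the same set-up can be written for a single gene with glm.nb from MASS, passing log(lib.size*norm.factors) as an offset. The simulated library sizes, normalization factors, group labels, and expression proportions are all assumptions chosen for illustration.

```r
library(MASS)

set.seed(42)
n     <- 20
group <- factor(rep(c("A", "B"), each = n / 2))

# Hypothetical library sizes and normalization factors (scaled to multiply to 1).
lib.size     <- round(runif(n, 1e6, 3e6))
norm.factors <- exp(rnorm(n, sd = 0.1))
norm.factors <- norm.factors / exp(mean(log(norm.factors)))
eff.lib      <- lib.size * norm.factors

# Simulate counts for one gene whose expression proportion doubles in group B.
prop <- ifelse(group == "B", 4e-5, 2e-5)
y    <- rnbinom(n, mu = eff.lib * prop, size = 10)

# log(mu) = design %*% coef + log(lib.size * norm.factors)
fit <- glm.nb(y ~ group + offset(log(eff.lib)))
coef(fit)  # the groupB coefficient should land near log(2), about 0.69
```

Because the effective library size enters as an offset with a fixed coefficient of 1, it is not an estimated term in the design, which is why it should not be added to the design matrix by hand.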

written 3.8 years ago by Aaron Lun
Answer: edgeR vs glm.nb
dongamit123 (United States) wrote, 3.9 years ago:

Thanks for the quick response. 

Regarding section 3.5 of the edgeR manual: in this thread (question about the EdgeR package: additive models and blocking) you suggested subsetting the data and performing separate analyses on the "without hormone" and "with hormone" samples to find DE genes between diseased and healthy patients. Is there any way in edgeR to include all samples when computing DE genes between diseased and healthy patients, while adjusting for hormone differences?

 

 


Direct comparisons between diseased and healthy patients of the same hormone status are not possible if you block on the patient effect. The patient blocking factor will absorb differences between samples from different patients, making the disease effect impossible to interpret. On the other hand, you can't just drop the patient blocking factor from your design, as you would end up with hidden correlations between hormone-treated and untreated samples from the same patient. This underlies Gordon's suggestion: either analyse hormone-treated samples separately from untreated samples, so that each analysis contains only one sample per patient (thus avoiding correlations between samples), or analyse all samples with voom and duplicateCorrelation to adjust for the known correlations between samples from the same patient.
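
For readers who want to try the second option, a rough sketch of the voom and duplicateCorrelation route might look like the following. It is not taken from the original reply: the patient layout, factor names, and simulated counts are assumptions, and a real analysis may need further steps such as filtering lowly expressed genes.

```r
library(edgeR)
library(limma)

set.seed(2)
# Hypothetical layout: 8 patients (4 healthy, 4 diseased),
# each sampled once without and once with hormone.
patient <- factor(rep(paste0("p", 1:8), each = 2))
disease <- factor(rep(c("healthy", "diseased"), each = 8),
                  levels = c("healthy", "diseased"))
hormone <- factor(rep(c("no", "yes"), times = 8), levels = c("no", "yes"))
counts  <- matrix(rnbinom(300 * 16, mu = 100, size = 5), nrow = 300)
rownames(counts) <- paste0("gene", 1:300)

y      <- calcNormFactors(DGEList(counts = counts))
design <- model.matrix(~ disease + hormone)

# Patient is treated as a random blocking factor rather than a design column.
v      <- voom(y, design)
corfit <- duplicateCorrelation(v, design, block = patient)
fit    <- lmFit(v, design, block = patient,
                correlation = corfit$consensus.correlation)
fit    <- eBayes(fit)

# Disease effect estimated from all samples, adjusted for hormone status.
top <- topTable(fit, coef = "diseasediseased", number = 5)
top
```

Keeping patient out of the design matrix is what leaves the disease coefficient estimable, while the estimated within-patient correlation accounts for the paired samples.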

written 3.9 years ago by Aaron Lun

This follow-up question is unrelated to the topic of edgeR vs glm.nb. If you have a new question, please start a new thread rather than posting it as an answer to your previous question.

written 3.9 years ago by Gordon Smyth
Answer: edgeR vs glm.nb
dongamit123 (United States) wrote, 3.9 years ago:

Thank you for the clarification! -AD
