Question: Find if a covariate has effect on gene expression level using edgeR and DESeq2
0
gravatar for ta65
3.3 years ago by
ta650
ta650 wrote:

I have RNAseq data of patient and control samples with different age and gender.

I would like to know whether these covariate affect gene expression levels. If they do not have any effect remove them from model and create simpler model for differential expression analysis using edgeR and DESeq2.

Would you please let me know if there is a function in these packages to find the answer?

 

ADD COMMENTlink modified 3.3 years ago • written 3.3 years ago by ta650
Answer: Find if a covariate has effect on gene expression level using edgeR and DESeq2
3
gravatar for James W. MacDonald
3.3 years ago by
United States
James W. MacDonald52k wrote:

There aren't any functions per se, but it's not an uncommon thing to do. Do note however that you are fitting (tens of) thousands of models simultaneously, and age and/or gender (especially) will have an effect on at least some of those genes. One aspect of this sort of statistics is that you are doing things in bulk, and you don't have the ability to do lots of model checking, etc, and sometimes have to use an over-specified model because it is useful for some subset of the genes you care about (and you are thereby 'wasting' degrees of freedom on those genes for which the extra parameter(s) are not needed).

One thing you can do is a MDS or PCA plot to see if there are large sex or age related differences. You can also fit the 'full' model and then see how many genes have a significant coefficient for age or sex. If there are not too many such genes (for some definition of 'not too many'), you could decide to drop one or more coefficients. Or you could take the opposite approach and just fit the model with age and sex and burn the two degrees of freedom. If you have enough replication it might not matter that much.

ADD COMMENTlink written 3.3 years ago by James W. MacDonald52k

Thanks James. Would you please explain about what is " significant coefficient  ". sorry If this question is unrelevant. 

ADD REPLYlink written 3.3 years ago by ta650

So let's say you did something like this, where you have some data (called dglst, in my example) and a design matrix that includes sex and age.

fit <- glmQLFit(dglst, design)

testForSex <- glmQLFTest(fit, "sex")

topGenesForSex <- topTags(testForSex, Inf, p.value = 0.05)

nrow(topGenesForSex$table)

So now we fit a model that includes sex and age, and then extracted all the genes for which the 'sex' coefficient is significant at an FDR of 5%. We then see how many genes fulfill that criterion. You can then decide if that number is large enough for you to think that the sex coefficient is needed or not.

 

ADD REPLYlink written 3.3 years ago by James W. MacDonald52k
Answer: Find if a covariate has effect on gene expression level using edgeR and DESeq2
1
gravatar for Jenny Drnevich
3.3 years ago by
Jenny Drnevich1.9k
United States
Jenny Drnevich1.9k wrote:

I agree with Jim that for just 2 possible co-variates, I would just include them both as coefficients in the model. Even if there are only a few genes that one or both affect, it is better for them and the remaining genes likely won't be affected that much by losing 2 DF. However, another approach I recently used with some human data with 7 possible covariates was to first run WGCNA on the genes to divide the them into modules based on similar expression patterns, then use the eigengene values for each module in a BIC stepwise regression to select the best covariates for each module, and was pretty pleased with this broad level approach. I'm playing around with then running a separate limma/edgeR for the set of genes in each module using the best model to get gene-level results, but I'm not sure if it's a good idea or not because of the separate variance estimates for each gene set. I may broach this topic in a separate post...

Jenny

ADD COMMENTlink written 3.3 years ago by Jenny Drnevich1.9k

Thanks Jenny. Could you please share the codes that you have used for your analysis?

 

ADD REPLYlink written 3.3 years ago by ta650
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 144 users visited in the last hour