DESeq2 question about predicting abundance adjusted for covariates.
1
0
Entering edit mode
migimimi0 • 0
@migimimi0-15168
Last seen 6.1 years ago

Hi, I am doing some experiment and at a loss modeling by DESeq2. I want to estimate abundance of genes across the three conditions (namely treatment1, treatment2 and treatment3), adjusted for covariates of age. I make DESeq object, and estimate size factors, estimate dispersions, and do Wald test as follows where data is raw count matrix.

dds = DESeqDataSetFromMatrix(data, colData=mapping, design=~age_binned+treatment)
dds = DESeq(dds, fitType="local")

at this time, I can extract results of differential expression test by Wald test between treatment1, 2 and 3.

However, when I want to know adjusted abundance for all groups, how can I proceed? I did plotCounts(returnData=TRUE) which normalizes counts by sequencing depth and adds a pseudocount of 1/2, but in my understanding this doesn't account for covariates information like age, which are included in dispersion estimate process?

I tried also variance stabilized transformation (VST) for data, which transform size-factor normalized data by using estimated dispersion using design formula, but I don't know this already accounts for covariates information and can be interpreted directly as "age adjusted abundance", and thus comparable directly across groups.

My try so far is to extract size factors normalized values by counts(dds, normalize=TRUE), and make generalized linear model by feeding this data as dependent variables, and covariates as independent variable, then use predict() to get age-adjusted data. Is this the correct way to do such thing?

Or should I use VST data from design ~1 to feed for GLM or something?

To summarize, If I want to use abundance adjusted for sequence depth or library size variance, and age effect for comparison between groups, is this correct to use:

  1.  VST data directly (design ~age_binned + treatment)
  2.  Model VST data (design ~age_binned + treatment) in GLM or something to adjust for age effect
  3.  Model VST data (design ~1) in GLM or something to adjust for age effect
  4.  Model normalized counts by size factors in GLM or something to adjust for age effect
  5.  Model plotCounts() data frame in GLM or something to adjust for age effect

Certainly I'm missing something or poorly understood, thank you for help.

 

deseq2 • 1.5k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 12 hours ago
United States

hi,

"adjusted abundance for all groups" => we don't have any tools which will provide adjusted abundance based on covariates. The transformations in DESeq2 do not adjust for covariates. A simple approach is to transform the data (either simple log of normalized counts or a VST), compute a linear regression on nuisance variables, and then work the residuals. You can use limma's removeBatchEffect for this.

ADD COMMENT

Login before adding your answer.

Traffic: 943 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6