DESeq2 normalized data
akp • 0
@akp-8846

I understand the idea of using the negative binomial distribution to test whether a covariate (gene) is differentially expressed/abundant or not.

I wonder whether the same argument holds when the analysis is not performing a test but is, for example, regressing these genes on case/control status. In such a regression, can one continue to use relative abundance, or should one still use, say, the variance stabilizing transformation ...

regression deseq2 counts • 1.2k views

I think you will need to be more specific about what you mean by "regression analysis".


Regressing genes on the outcome (case/control).

@mikelove

"when the analysis is not performing any test but for example, regressing these genes over case/control. In this regression, one continue and use relative abundance or should still use say the variantestablizer ..."

Sorry, this is not clear enough for me to give an answer. The GLM is in fact very similar to a regression of the expected value for the normalized counts on the log scale over the case/control status. Can you restate the question in a more specific way as to your aims?
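
To make the similarity concrete, here is a sketch of the negative binomial GLM for a two-group design. The symbols are not from this thread and are introduced only for illustration: $K_{ij}$ is the raw count for gene $i$ in sample $j$, $s_j$ the size factor, $q_{ij}$ the expected normalized count, $\alpha_i$ the dispersion, and $x_j$ a 0/1 indicator of case status.

```latex
\begin{aligned}
K_{ij} &\sim \mathrm{NB}(\mu_{ij}, \alpha_i), \\
\mu_{ij} &= s_j \, q_{ij}, \\
\log_2 q_{ij} &= \beta_{i0} + \beta_{i1} x_j, \qquad x_j \in \{0, 1\}.
\end{aligned}
```

Under this reading, $\beta_{i1}$ is the log2 fold change between cases and controls for gene $i$, so the model is a regression of the log expected normalized count on case/control status.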


I am going to use a predictive model to classify cancer vs. non-cancer. You can think of it as a logistic regression; eventually, my model returns a coefficient for every covariate (gene). Then, when new data arrive, I can assign the new data points to either class based on those coefficients.

Typically, in this type of regression analysis, we standardize/rescale via "(x - mean(x))/sd(x)". I wonder whether one should use DESeq2 normalized data and skip "(x - mean(x))/sd(x)", or the other way around?


I would recommend variance stabilizing with VST or rlog, and not dividing out the row (gene) standard deviation.*

* See this explanation: A: Biclustering Normalizing by Row in Heatmap of DESeq2

With the variance stabilized data, you can then perform any kind of machine learning or prediction algorithm you like.
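
One possible way to see the effect of dividing out the row standard deviation is the toy sketch below, in plain Python. Everything in it is an assumption made for illustration: the numbers are made up and merely stand in for variance stabilized values (which in practice would come from DESeq2's `vst()` or `rlog()` in R), and the gene names and the helpers `center`, `zscore`, and `spread` are hypothetical. It compares centering only with full "(x - mean(x))/sd(x)" scaling for one gene with a clear case/control shift and one gene that barely changes.

```python
from statistics import mean, stdev

# Made-up values on a log2-like scale for six samples
# (three controls followed by three cases); not real data.
vst_values = {
    "gene_signal": [5.0, 5.2, 4.9, 8.1, 7.9, 8.3],         # clear case/control shift
    "gene_flat": [6.00, 6.05, 5.95, 6.02, 5.98, 6.03],     # only small fluctuations
}

def center(row):
    """Subtract the gene's mean; differences keep their original size."""
    m = mean(row)
    return [x - m for x in row]

def zscore(row):
    """Subtract the mean and divide by the gene's sample standard deviation."""
    m, s = mean(row), stdev(row)
    return [(x - m) / s for x in row]

def spread(row):
    """Range of a row, used here as a simple measure of scale."""
    return max(row) - min(row)

centered = {gene: center(values) for gene, values in vst_values.items()}
scaled = {gene: zscore(values) for gene, values in vst_values.items()}

for gene in vst_values:
    print(
        f"{gene}: centered range = {spread(centered[gene]):.2f}, "
        f"z-scored range = {spread(scaled[gene]):.2f}"
    )
```

After centering only, the gene with the real shift still spans a much wider range than the flat gene. After dividing by the standard deviation, both genes have unit variance, so the small fluctuations of the flat gene are blown up to the same scale as the real signal.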
