Normalisation using clinical traits in RNA seq using DESeq2

Dear Dr. Michael, I am Sukeshini K, pursuing a PhD in pediatric genetics at SIU (India). I would like to take a moment to thank you for developing the open-source DESeq2 package. My current work depends heavily on this package and its features, and I have a few questions on which I would be grateful for your guidance.

I need to normalise the count matrix for clinical traits other than batch and condition, such as age, BMI and sex. I have tried adding all the required traits to the colData object. However, we observed no difference in the normalised matrix with and without these clinical traits.

  1. Could you please let me know how to adjust for these clinical values?
  2. When I include batch in the colData, does that mean the matrix is normalised batch-wise, or should I separately treat my count matrix for batch effects?

This is the script I used to include the clinical traits:

dds2 <- DESeqDataSetFromMatrix(countData = count.df, colData = colData, design = ~ Age + Gender + BMI + Batch + Condition)

This still does not show any adjustment for the traits.

Apologies for such a naive question; I am new to RNA sequencing analysis. I would really appreciate your help here.


DESeq2 clinicaltraits Normalization batchnormalisation • 203 views
ATpoint ★ 4.2k

See the vignette, it's a FAQ:

When reading that section, substitute any covariate you add for "batch". The counts are never modified directly; the adjustment happens internally as part of the model.
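To make this concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: `makeExampleDESeqDataSet()` stands in for the poster's `count.df` and `colData`, and the `Batch` and `Age` columns are invented. It shows that the normalised counts depend only on the size factors and are therefore the same under either design, that the covariates take effect when the model is fitted, and, for plots only, one common way to obtain an adjusted matrix with `limma::removeBatchEffect()` (this assumes limma is installed).

```r
library(DESeq2)

# Simulated counts stand in for the poster's count.df and colData (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)    # 'condition' has levels A and B
dds$Batch <- factor(rep(c("b1", "b2"), times = 6))  # hypothetical batch labels
dds$Age   <- as.numeric(scale(runif(12, 3, 40)))    # hypothetical ages, centred and scaled

# Same counts, two different designs.
dds_simple <- dds
design(dds_simple) <- ~ condition
dds_full <- dds
design(dds_full) <- ~ Age + Batch + condition

# Normalised counts are raw counts divided by per-sample size factors,
# and size factor estimation does not use the design formula.
norm_simple <- counts(estimateSizeFactors(dds_simple), normalized = TRUE)
norm_full   <- counts(estimateSizeFactors(dds_full), normalized = TRUE)
same_matrix <- isTRUE(all.equal(norm_simple, norm_full))

# The covariates enter here: each gene gets a coefficient per design term,
# and the condition effect is estimated while controlling for Age and Batch.
dds_full <- DESeq(dds_full)
resultsNames(dds_full)
res <- results(dds_full, name = "condition_B_vs_A")

# For visualisation only (PCA, heatmaps): remove Batch and Age from
# variance-stabilised values while protecting the condition effect.
vsd <- varianceStabilizingTransformation(dds_full, blind = FALSE)
adjusted <- limma::removeBatchEffect(
  assay(vsd),
  batch      = vsd$Batch,
  covariates = vsd$Age,
  design     = model.matrix(~ condition, data = as.data.frame(colData(vsd)))
)
```

Here `same_matrix` should be `TRUE`, which matches the observation in the question. The `adjusted` matrix is intended for plotting; the differential expression test itself should still be run on the raw counts with the covariates in the design.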

Maybe a good read for you on how to estimate factors of unwanted variation and then correct for them, rather than adding individual parameters to the model one by one.
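As a rough sketch of what estimating factors of unwanted variation can look like, the example below uses `svaseq()` from the sva package on simulated data. This is an assumption on my part: the reading referred to above is not preserved here and may describe a different method, and the choice of two surrogate variables (`n.sv = 2`) is arbitrary. The simulated data contain no real hidden structure, so the example only illustrates the mechanics.

```r
library(DESeq2)
library(sva)

# Simulated data in place of a real experiment (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)  # 'condition' has levels A and B
dds <- estimateSizeFactors(dds)

# svaseq works on normalised counts; drop genes with very low counts first.
dat <- counts(dds, normalized = TRUE)
dat <- dat[rowMeans(dat) > 1, ]

# Full model contains the variable of interest, null model only the intercept.
mod  <- model.matrix(~ condition, data = as.data.frame(colData(dds)))
mod0 <- model.matrix(~ 1, data = as.data.frame(colData(dds)))

# Estimate two surrogate variables (arbitrary choice for this sketch).
svseq <- svaseq(dat, mod, mod0, n.sv = 2)

# Add the surrogate variables to colData and to the design, then fit as usual.
dds$SV1 <- svseq$sv[, 1]
dds$SV2 <- svseq$sv[, 2]
design(dds) <- ~ SV1 + SV2 + condition
dds <- DESeq(dds)
res <- results(dds, name = "condition_B_vs_A")
```

As with known covariates, the surrogate variables go into the design formula; the count matrix itself is left untouched.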


Thank you so much for the response. When estimating factors of unwanted variation, what is the better way to encode the variables? For continuous variables such as age and BMI, is scaling the better option, or should I convert age into a factor with groups such as 3 to 10, 10 to 18 and above 18? Also, I am still trying to adjust the matrix for all clinical variables. When the normalised matrix is saved, should I expect at least some change in the values? If not, at what stage will I observe a change in the values, or an adjustment for the clinical variables? This would help me understand the contribution of each clinical trait to gene expression.

Hopefully my question makes sense. Please let me know if I have understood correctly.

Thank you so much.
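For illustration only, the two encodings of age mentioned above could be set up in base R as follows. The ages and the group labels are made up. Which encoding is preferable is not settled in this thread; it likely depends on whether a linear effect of age on log expression is plausible.

```r
# Hypothetical ages in years, one per sample.
age <- c(4, 9, 12, 17, 25, 40)

# Option 1: keep age continuous, centred and scaled.
age_scaled <- as.numeric(scale(age))

# Option 2: bin age into the groups from the question:
# 3 to 10, 10 to 18, and above 18 (upper bounds inclusive).
age_group <- cut(
  age,
  breaks = c(3, 10, 18, Inf),
  labels = c("age3to10", "age10to18", "over18"),
  include.lowest = TRUE
)

table(age_group)
```

Either column can then be added to colData and used in the design formula in place of the raw age.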

