Question

How to correct for age, sex, etc. in RNA-seq data with DESeq2


treebig44 ▴ 40


Hi. I am new to RNA-seq expression analysis and to DESeq2. I have an RNA-seq dataset whose sample annotation contains the columns 'gender', 'age', etc., in addition to the column of interest, 'condition'.

How can I correct for the columns other than 'condition', i.e. for 'gender', 'age', etc.?

Please excuse me for being a newbie. Thanks.

DESeq2 RNASeq RNASeqData • 11k views

ADD COMMENT • link 2.7 years ago treebig44 ▴ 40


~gender+whatever+you+want+to+correct+for, see the section on design formula in the vignette.

ADD REPLY • link 2.7 years ago ATpoint ★ 4.6k


I see. Yeah, I took a look at the vignette, and it says the same thing you mentioned, with the addition that 'condition' should be the last entry in the design.

~gender+whatever+i+want+to+correct+for+condition

So, is it something like that?

ADD REPLY • link 2.7 years ago treebig44 ▴ 40


The order does not matter if you specify the contrast. It's smart to always specify the contrast, if only so you know what you are comparing to what.

Also consider carefully the possible ramifications of making the age data into factors.
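To illustrate the point about age, here is a minimal base-R sketch using `model.matrix`. The sample table is entirely made up (eight hypothetical samples), so treat it as an illustration and not as anyone's real data. It compares the design matrix when age stays numeric with the one obtained after converting age to a factor.

```r
# Hypothetical sample annotation; every value here is invented for illustration.
coldata <- data.frame(
  gender    = factor(c("M", "F", "M", "F", "M", "F", "M", "F")),
  age       = c(18, 25, 33, 41, 52, 60, 71, 80),
  condition = factor(rep(c("control", "treated"), each = 4))
)

# Age kept numeric: a single column, i.e. one slope per year of age.
mm_numeric <- model.matrix(~ gender + age + condition, data = coldata)

# Age converted to a factor: one indicator column per distinct age
# (minus the reference level), so the design grows with the number of ages.
coldata_f <- transform(coldata, age = factor(age))
mm_factor <- model.matrix(~ gender + age + condition, data = coldata_f)

print(colnames(mm_numeric))
print(ncol(mm_factor))

# With as many distinct ages as samples, the factor version has more
# columns than its rank, so its coefficients cannot all be estimated.
rank_factor <- qr(mm_factor)$rank
print(rank_factor < ncol(mm_factor))
```

In this toy example the factor version has more coefficients than there are samples, which is the kind of design DESeq2 would be expected to reject as not full rank. Keeping age numeric, or binning it into a few broad groups, avoids that.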

ADD REPLY • link 2.7 years ago swbarnes2 ★ 1.4k


Oh okay. I will specify the contrast in that case as well.

"possible ramifications of making the age data into factors"

So, if there are ramifications, does that mean I cannot include 'age' in the design? Or is there a way to include it in the design without any problems? So far, I was thinking of doing this:

design = ~ gender + age + condition

where gender has values 'M' and 'F', and age varies continuously from 18 to 80.
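The design described above could be sketched roughly as follows. The counts are simulated and the sample names, gene names, and the level names 'control' and 'treated' are assumptions made for this example, not taken from the actual dataset. The DESeq2 part only runs if the package is installed.

```r
# Hypothetical annotation: gender as a factor, age as a plain numeric column.
coldata <- data.frame(
  gender    = factor(c("M", "F", "M", "F", "M", "F", "M", "F")),
  age       = c(18, 25, 33, 41, 52, 60, 71, 80),
  condition = factor(rep(c("control", "treated"), each = 4)),
  row.names = paste0("sample", 1:8)
)

# Two orderings of the same terms give the same set of coefficients.
design_a <- ~ gender + age + condition
design_b <- ~ condition + gender + age
cols_a <- colnames(model.matrix(design_a, data = coldata))
cols_b <- colnames(model.matrix(design_b, data = coldata))
print(setequal(cols_a, cols_b))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  # Simulated negative binomial counts, 200 genes x 8 samples (illustration only).
  set.seed(1)
  gene_mu <- 2^runif(200, min = 4, max = 10)
  counts <- matrix(
    rnbinom(200 * 8, mu = rep(gene_mu, times = 8), size = 5),
    ncol = 8,
    dimnames = list(paste0("gene", 1:200), rownames(coldata))
  )

  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts, colData = coldata, design = design_a
  )
  dds <- DESeq2::DESeq(dds)

  # Naming the contrast explicitly: treated vs. control, adjusted for gender and age.
  res <- DESeq2::results(dds, contrast = c("condition", "treated", "control"))
  print(head(res))
}
```

Because the contrast is named explicitly, the same comparison is reported whether `design_a` or `design_b` is used. DESeq2 may also print a message suggesting that numeric covariates such as age be centered and scaled.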

ADD REPLY • link 2.7 years ago treebig44 ▴ 40

score 1 · Answer 1 · 2022-03-15


Michael Love 43k


Check out the vignette first.

ADD COMMENT • link 2.7 years ago Michael Love 43k


I found the following in the vignette:

Note: If there is unwanted variation present in the data (e.g. batch effects) it is always recommended to correct for this, which can be accommodated in DESeq2 by including in the design any known batch variables or by using functions/packages such as svaseq in sva (Leek 2014) or the RUV functions in RUVSeq (Risso et al. 2014) to estimate variables that capture the unwanted variation. In addition, the ashr developers have a specific method for accounting for unwanted variation in combination with ashr (Gerard and Stephens 2017).

ADD REPLY • link 2.7 years ago treebig44 ▴ 40