Using Age or another numerical variable to model expression in DESeq2
@krushnach80-11463
India

This is about a single-factor design. For example, if I have Age or another continuous numerical variable, how do I provide it in the design formula?

Going by this post, do I need to follow the advice 'You could dichotomise your continuous variables into meaningful groups', or can I go without it? Is grouping a numerical variable needed before running DESeq2? If I understand correctly, each age otherwise becomes its own factor level.

In my metadata/coldata I am giving a single numerical value, which is Blast percentage. My samples fall into about 5 sub-types, from M0 to M5.

Now for the interpretation part: how do I interpret the result?

Would it be the expression differences between my sub-types (samples) due to 'Age', or what would be the statistically correct way to convey the result?

I'm a bit confused, since in my coldata I'm not providing any information about my sub-types.

So if I would like to know which sub-types differ due to this Age variable, how do I get that information?

To find the differences between sub-types, I have already done an analysis providing the FAB classification, which is basically my sub-type information, and tested pairwise.

I would like to know: if I give a numerical variable to my design, how do I interpret the output in terms of gene expression?

Any suggestion or help would be really appreciated.

My design


dds <- DESeqDataSetFromMatrix(countData = data, colData = coldata, design = ~ Age)

# Optional: DESeq() below runs both of these steps itself
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)

dds <- DESeq(dds, parallel = TRUE)

Running resultsNames(dds), I get this:

[1] "Intercept"     "Age"

sessionInfo( )

0

If I understood correctly, you want to see the results, right?

res <- results(dds)
summary(res)


And then just view the genes on a volcano plot or something like that.

0

I know how to get the results, but I was not clear about the interpretation. If the numerical variable is turned into a factor, then, for example, 20.2 and 20.3 would be different levels even though both are essentially 20. With a categorical variable, say disease vs. non-disease, the interpretation is straightforward. What would it be here? That was my question.

2

You can import age and not make it a factor.
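A quick way to check how R is treating the column can be sketched as follows. The `coldata` data frame here is a made-up stand-in for the real sample table, and its values are hypothetical; the point is only the check and the conversion.

```r
# Hypothetical sample table; the real coldata will have its own values
coldata <- data.frame(
  row.names = paste0("sample", 1:4),
  Age = c("20.2", "20.3", "45.0", "61.5"),  # read in as text by mistake
  stringsAsFactors = TRUE                    # so Age becomes a factor
)

is.numeric(coldata$Age)   # FALSE: a factor gets one coefficient per extra level
nlevels(coldata$Age)      # 4 distinct "ages", each its own level

# Convert via character so the labels, not the internal level codes, are parsed
coldata$Age <- as.numeric(as.character(coldata$Age))
is.numeric(coldata$Age)   # TRUE: Age now enters the design as a single slope
```

Calling `as.numeric()` directly on a factor would return the level codes (1, 2, 3, ...) rather than the ages, which is why the conversion goes through `as.character()` first.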

0

Yes, I checked my metadata and it is not turned into a factor. A simple question: what is the intercept when I use age in my design? Is it the lowest age, which is turned into the reference level?

2

If you converted to a factor, the intercept is the reference level (you have to look at levels(age) to know what the reference is). If you use age as a continuous value, then the intercept is the fitted gene expression at birth (age 0).
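The two cases can be seen directly in the design matrix. This is a small base R sketch with made-up ages, not output from the poster's data. The centring step at the end is an optional extra that is not discussed in the thread; it is shown only because it moves the intercept to a more interpretable place than age 0.

```r
# Hypothetical ages for six samples
age <- c(20.2, 20.3, 35, 48, 52.5, 67)

# Continuous age: one intercept column plus one slope column.
# The intercept is the fitted value where age == 0.
X_num <- model.matrix(~ age)
colnames(X_num)

# Age as a factor: the intercept is the reference (first) level,
# and every other distinct age gets its own coefficient.
age_f <- factor(age)
levels(age_f)[1]
X_fac <- model.matrix(~ age_f)
ncol(X_fac)

# Optional: centring moves the intercept to the mean age
age_c <- age - mean(age)
X_cen <- model.matrix(~ age_c)
```

With the continuous coding the design has two columns regardless of how many samples there are, while the factor coding needs one column per distinct age.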

3
@james-w-macdonald-5106
United States

From what you have said, it's likely you are not specifying your model correctly. Right now you are simply testing for genes that change expression as a function of age (in which case you would interpret the beta for any significant gene as the log2 fold change for each unit change in age. As an example, if you used age in years and you had a beta of, say, 1, that would mean the gene expression doubles for every one-year increase in age). But you talk about a blast percentage, which I take to be the phenotype or treatment of interest.
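The arithmetic behind that interpretation, as a short sketch. The beta values are made up for illustration; in a real analysis they would come from the log2FoldChange column of the results table.

```r
# Hypothetical coefficient for Age, on the log2 scale
beta <- 1

fold_change_per_year   <- 2^beta         # 2: expression doubles per year
fold_change_per_decade <- 2^(beta * 10)  # 1024: the effect compounds over 10 years

# A smaller slope (also made up)
beta_small <- 0.05
2^(beta_small * 10)   # about 1.41-fold over 10 years
```

Because the coefficient is per unit of the variable, the same gene will show a different beta if age is supplied in months or decades instead of years.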

If you are actually trying to find genes that change due to the blast percentage, you need to add that to the design. You can keep age, which will then adjust for changes due to age (a nuisance parameter that you think might affect the gene expression, but you don't really care about). If what I am saying is confusing, then you should read (and re-read) the DESeq2 vignette, and probably the workflow as well, or find somebody local with experience and have them help or do the analysis for you.
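A design along those lines might look like the sketch below. It uses simulated counts from makeExampleDESeqDataSet() rather than the poster's data, and the Age and Blast columns are random, hypothetical values added only so the example runs on its own.

```r
library(DESeq2)

set.seed(1)
# Simulated counts standing in for the real data (200 genes, 12 samples)
dds <- makeExampleDESeqDataSet(n = 200, m = 12)

# Hypothetical covariates; in practice these come from coldata
colData(dds)$Age   <- round(runif(12, 20, 70), 1)
colData(dds)$Blast <- round(runif(12, 10, 95))

# Nuisance variable first, variable of interest last
design(dds) <- ~ Age + Blast

dds <- DESeq(dds)
resultsNames(dds)   # includes "Age" and "Blast"

# log2 fold change per one-point increase in blast percentage, adjusted for age
res_blast <- results(dds, name = "Blast")
summary(res_blast)
```

Since the covariates here are random, no genes are expected to be truly associated with them; the sketch only shows where each variable goes in the design and how to pull out the coefficient of interest.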

0

If I have to use age, then going by this post I think a cleaner way is to group it into age categories.

2

You have to use age from that post? Why is that? It is rarely optimal to categorize age rather than use it as a continuous variable. It takes more degrees of freedom to do so, and you end up saying things like 20.2 and 20.3 are different. And presumably that means 20.1 is 'the same' as 20.2 and 20.4 is 'the same' as 20.3, which doesn't make sense - there's no magic thing that happens between 20.2 and 20.3 that wouldn't also happen between any other 0.1 difference in age, so why arbitrarily add that distinction between essentially random cutpoints?
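The degrees-of-freedom point can be illustrated with a small base R sketch. The ages and the cut points are made up, and the cut at 20.25 is chosen deliberately to reproduce the 20.2 vs. 20.3 situation described above.

```r
# Hypothetical ages for ten samples
age <- c(20.1, 20.2, 20.3, 20.4, 33, 41.5, 47, 58, 62.3, 70)

# Continuous: a single slope, whatever the number of samples
ncol(model.matrix(~ age)) - 1        # 1 coefficient for age

# Binned: one coefficient per extra group, and the cut points are arbitrary
age_group <- cut(age, breaks = c(0, 20.25, 40, 60, Inf))
table(age_group)
ncol(model.matrix(~ age_group)) - 1  # 3 coefficients for age

# 20.2 and 20.3 land in different groups, while 20.3 and 33 are "the same"
as.character(age_group[age == 20.2]) != as.character(age_group[age == 20.3])
as.character(age_group[age == 20.3]) == as.character(age_group[age == 33])
```

Each extra coefficient is one fewer residual degree of freedom for estimating dispersion and testing the variable that actually matters.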

It also makes less sense if age is only being used as a nuisance variable, which by definition you don't really care about, but feel like you should adjust for. You don't want to spend any more degrees of freedom on a nuisance variable than is necessary, and certainly don't want to make the interaction between the nuisance variable and your variable of interest more complicated than necessary.

0

"Why is that": no, I was only citing that post as an example. "certainly don't want to make the interaction between the nuisance variable and your variable of interest more complicated than necessary.": now I'm clear.