RNA-seq: two-factor ANOVA - how to find the variability explained by each of the factors?
nooshin ▴ 300
@nooshin-5239

Hi all,

I have to do a two-way ANOVA on RNA-seq data and find out how much of the variability can be explained by each factor:

Time <- factor(rep(1:3,4),levels=3:1)
Sex <- factor(rep(c("Female","Male"),each=6),levels=c("Male","Female"))
design <- model.matrix(~Sex*Time)

I want to calculate the amount of variance that can be explained by the factor Sex, by the factor Time, and by the interaction between Sex and Time.

Would it be possible to do this using edgeR, DESeq, or limma? If yes, how?

Thanks a lot and looking forward.

N,

@gordon-smyth
WEHI, Melbourne, Australia

When you say "the amount of variance", I assume you actually mean the sum of squares (SS) and mean square (MS) quantities attributable to each term, as for any analysis of variance.

Well, yes, it can be done. But the answer will be different for each gene, making the results very hard to interpret. Are you sure that is what you need?


Thanks a lot for your response.

Yes, I want the SS and MS for each gene separately, as in an ordinary ANOVA. I need to calculate the variability attributable to each factor and to their interaction separately for each gene, as in the following example using anova:

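# Two-way ANOVA for one gene; rows are Sex, Time, Sex:Time, Residuals
res <- anova(lm(values ~ Sex*Time, data))

res_ss <- res$"Sum Sq"
res_ms <- res$"Mean Sq"
res_df <- res$"Df"

# v1: proportion of the total sum of squares explained by Sex
v1 <- res_ss[1]/sum(res_ss)

# v2: same for Sex, adjusted by the residual mean square (res_ms[4])
v2 <- ((res_ss - res_df*res_ms[4])/(sum(res_ss) + res_ms[4]))[1]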

Would it be possible to apply voom normalization to my data and then run exactly the same analysis as above on the voom-normalized values?

I would also like to do it with edgeR, so that I can check how similar the results are between at least two methods, such as limma and edgeR, or edgeR and DESeq.

Thanks.


Do what analysis? Your quantity v1 is just the proportion of SS explained by Sex, but v2 looks like a nonsense quantity. It doesn't seem to me to measure anything.


v2 is a less biased indicator of variance explained in the population by a predictor variable:

(SS_factor - df*(MS_residual))/ (SS_total + MS_residual)

It's basically the same as v1, but a normalized version, if it's possible to call it that :)
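
For reference, the ratio above can be written with the degrees of freedom of the term made explicit. This form is commonly called omega-squared, while v1 corresponds to eta-squared. The symbol names are labels chosen here for illustration, not notation from the thread:

```latex
\eta^2_{\text{factor}} = \frac{SS_{\text{factor}}}{SS_{\text{total}}},
\qquad
\omega^2_{\text{factor}} =
  \frac{SS_{\text{factor}} - df_{\text{factor}} \, MS_{\text{residual}}}
       {SS_{\text{total}} + MS_{\text{residual}}}
```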


OK, v2 seems to be a ratio of estimated variance components.

None of the packages will estimate variance components for you. In fact, fitting variance component models is problematic with weights (voom) or in a generalized linear model context (edgeR, DESeq).

You can simply compute a matrix of logCPM values, then repeat your anova calculation for each row.
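
A minimal sketch of that row-by-row calculation, under stated assumptions: the count matrix is simulated (50 hypothetical genes, 12 samples matching the design in the question), and a plain base-R log2-CPM with a 0.5 pseudocount stands in for edgeR's `cpm(..., log=TRUE)`, whose prior-count handling differs slightly. The helper name `anova_one_gene` is made up for this example. The fit is unweighted, so it does not use voom weights.

```r
# Simulated counts stand in for real data (hypothetical: 50 genes x 12 samples).
set.seed(1)
Time <- factor(rep(1:3, 4), levels = 3:1)
Sex  <- factor(rep(c("Female", "Male"), each = 6), levels = c("Male", "Female"))
counts <- matrix(rnbinom(50 * 12, mu = 100, size = 5), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("s", 1:12)))

# Simple log2 counts-per-million with a 0.5 pseudocount (assumption).
# With edgeR installed, edgeR::cpm(counts, log = TRUE) is the usual route.
lib_size <- colSums(counts)
logCPM <- log2(t((t(counts) + 0.5) / (lib_size + 1)) * 1e6)

# Per-gene two-way ANOVA: proportion of SS (v1-style) and the
# adjusted ratio (v2-style) for Sex, Time and Sex:Time.
anova_one_gene <- function(values) {
  res <- anova(lm(values ~ Sex * Time))
  ss <- res[["Sum Sq"]]
  df <- res[["Df"]]
  ms_resid <- res[["Mean Sq"]][length(ss)]   # residual row is last
  k <- seq_len(length(ss) - 1)               # model terms only
  prop_ss <- ss[k] / sum(ss)
  adj <- (ss[k] - df[k] * ms_resid) / (sum(ss) + ms_resid)
  names(prop_ss) <- paste0("propSS_", rownames(res)[k])
  names(adj) <- paste0("adj_", rownames(res)[k])
  c(prop_ss, adj)
}

# One row per gene, six columns (three terms x two measures).
var_explained <- t(apply(logCPM, 1, anova_one_gene))
head(var_explained)
```

Each row of `var_explained` belongs to one gene, so the values differ from gene to gene, as noted earlier in the thread. The adjusted ratio can be negative when a term explains less than its degrees of freedom times the residual mean square.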


Then this means that I can do the same anova analysis on logCPM values for each gene.

Thanks a lot.


Would you mind guiding me on how to do this?

Thanks.
