Which Package Should I Use to Predict Model Output Using RNA Seq Data?
theorist • 0
@theorist-23776

I'm asking for some general input on how to analyze some data.

I'm trying to figure out how to fit the following statistical model to the RNA-seq data in the linked SummarizedExperiment (SE) dataset. Briefly, the SE object contains expression values for individual genes across the different C. elegans life stages. I want to find the weighted average of these stage-specific expression levels that best fits predictions I've made using a model from our lab.

$$Y_g = \sum_l \beta_l X_{l,g} + \epsilon_g$$

where $X_{l,g}$ is the true expression level of gene $g$ in life stage $l$, $Y_g$ is my model prediction, and $\epsilon_g$ is an error term whose scale is proportional to the standard error of my model prediction. We don't know $X_{l,g}$ exactly (of course); instead we have multiple estimates of it from replicates of the RNA-seq experiments, that is, $\{x_{l,g,1}, x_{l,g,2}, \ldots\}$.

Before I started looking into Bioconductor, I was planning on using a weighted least squares approach, which accounts for the imprecision in $Y_g$ (but not in the $x_{l,g,i}$), but I'm wondering if there's a better way.
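
To make the weighted least squares idea concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: each $X_{l,g}$ is replaced by the mean of its replicates, each gene gets weight $1/\mathrm{se}_g^2$, the $\beta_l$ are left unconstrained (they are not forced to sum to 1), and all numbers and function names are made up for the example rather than taken from the dataset or from any Bioconductor package.

```python
# Weighted least squares sketch for Y_g = sum_l beta_l * X_{l,g} + eps_g.
# All data and helper names below are hypothetical, for illustration only.

def replicate_means(x):
    """x[l][g] is a list of replicate estimates; return the mean per stage and gene."""
    return [[sum(reps) / len(reps) for reps in stage] for stage in x]


def solve_linear_system(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta


def weighted_least_squares(x_mean, y, se):
    """Minimise sum_g w_g * (y_g - sum_l beta_l * x_mean[l][g])**2, w_g = 1 / se_g**2."""
    n_stages = len(x_mean)
    w = [1.0 / s ** 2 for s in se]
    # Normal equations: (X^T W X) beta = X^T W y.
    xtwx = [[sum(wg * x_mean[i][g] * x_mean[j][g] for g, wg in enumerate(w))
             for j in range(n_stages)] for i in range(n_stages)]
    xtwy = [sum(wg * x_mean[i][g] * y[g] for g, wg in enumerate(w))
            for i in range(n_stages)]
    return solve_linear_system(xtwx, xtwy)


# Toy data: 2 life stages, 4 genes, 2 replicates each (made-up values).
x = [
    [[1.0, 1.2], [2.0, 2.2], [3.1, 2.9], [4.0, 4.2]],
    [[0.5, 0.7], [1.5, 1.3], [0.9, 1.1], [2.0, 2.2]],
]
true_beta = [0.3, 0.7]
x_mean = replicate_means(x)
# Noise-free predictions built from the means, so the fit should recover true_beta.
y = [sum(b * x_mean[l][g] for l, b in enumerate(true_beta)) for g in range(4)]
se = [0.1, 0.2, 0.1, 0.3]  # hypothetical standard errors of the predictions

beta_hat = weighted_least_squares(x_mean, y, se)
print(beta_hat)
```

Because the replicates are collapsed to their means, this sketch ignores the uncertainty in the $x_{l,g,i}$ entirely, which is exactly the limitation I'm asking about.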

Many thanks,

Mike

regression • 214 views
@gordon-smyth
WEHI, Melbourne, Australia

Example workflows are available for limma, edgeR and DESeq2.

limma fits weighted linear models, as suggested in your question. The other two packages fit generalized linear models.