I'm asking for some general input on how to analyze some data.
I'm trying to figure out how to fit the following statistical model to the RNA-seq data in the SummarizedExperiment (SE) object for this dataset. Briefly, the SE object contains expression values for individual genes across the different C. elegans life stages. I want to find the weighted average of these stage-specific expression levels that best fits predictions I've made using a model from our lab.
$$ Y_g = \sum_l \beta_l X_{l,g} + \epsilon_g $$
where $X_{l,g}$ is the true expression level of gene $g$ in life stage $l$, $\beta_l$ is the weight on life stage $l$, $Y_g$ is my model prediction, and $\epsilon_g$ is an error term whose scale is proportional to the standard error of my model prediction. We don't know $X_{l,g}$ exactly (of course); instead we have multiple estimates of it from replicates of the RNA-seq experiments, that is, $\{x_{l,g,1}, x_{l,g,2}, \ldots\}$.
Before I started looking into Bioconductor, I was planning to use a weighted least squares approach, which deals with the imprecision in $Y_g$ (but not in $x_{l,g,i}$), but I'm wondering if there's a better way.
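To make the weighted least squares idea concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under assumptions: the data are simulated rather than taken from the SE object, the function name `fit_stage_weights` and the array layout are hypothetical, and the replicates are simply averaged, so the uncertainty in $x_{l,g,i}$ is ignored exactly as described above.

```python
import numpy as np

def fit_stage_weights(x_reps, y, y_se):
    """Weighted least squares estimate of the per-stage weights beta_l.

    x_reps : array (n_stages, n_genes, n_reps), replicate expression estimates
    y      : array (n_genes,), model predictions Y_g
    y_se   : array (n_genes,), standard errors of the predictions
    """
    # Collapse replicates to one estimate per stage and gene.
    # Assumption: replicate noise in x is ignored from here on.
    x_mean = x_reps.mean(axis=2)            # (n_stages, n_genes)
    design = x_mean.T                       # (n_genes, n_stages)
    # Scaling each row by 1/se is equivalent to WLS with weights 1/se^2.
    w = 1.0 / y_se
    beta, *_ = np.linalg.lstsq(design * w[:, None], y * w, rcond=None)
    return beta

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n_stages, n_genes, n_reps = 4, 200, 3
true_x = rng.gamma(2.0, 2.0, size=(n_stages, n_genes))
x_reps = true_x[:, :, None] + rng.normal(0.0, 0.1, size=(n_stages, n_genes, n_reps))
true_beta = np.array([0.1, 0.2, 0.3, 0.4])
y_se = rng.uniform(0.05, 0.2, size=n_genes)
y = true_beta @ true_x + rng.normal(0.0, y_se)

beta_hat = fit_stage_weights(x_reps, y, y_se)
print(beta_hat)
```

Because the design matrix here is built from replicate means, any noise in the expression estimates would bias the fitted weights toward zero, which is the part of the problem this sketch does not address.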
Many thanks,
Mike