I have a data set comparing two different cell line compositions in a developmental assay in Drosophila. On its own this would not be difficult to analyze, but the problem lies in the composition of each cell line.
The first cell line contains four different subtypes (A, B, C, D); the second contains only subtype C. We expect subtle changes in some of the genes, but we are not sure how many or what kind of changes.
I was wondering if this can be analyzed in a straightforward way. What can I say about gene expression differences between these two populations?
Is it possible to compare them in the standard way, just compare one to the other?
I am not sure it would tell me anything meaningful. If a gene is DE in the first population, what does that mean?
If anyone knows of a source or reference for this kind of experiment, I would appreciate a hint.
thanks
Assa
Just to tack on to Aaron's answer, we have a very straightforward deconvolution method in DESeq2, called unmix(). This was built because we needed it for a few local projects, and we've now tested it on a number of large bulk RNA-seq datasets. It does the simple thing: it fits a non-negative linear combination of the pure profiles on the raw expression scale, while comparing the observed expression vector to that combination in a variance-stabilized space. I like this VST approach relative to other approaches, which filter out the lowly and highly expressed genes. In my opinion, this is where lots of the signal resides.
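To make the idea concrete, here is a minimal Python sketch of the same principle. It is not the DESeq2 code (which is R), and several things in it are assumptions for illustration: the function names (`vst`, `combine`, `loss`, `estimate_proportions`), the particular asinh-based transform, the dispersion value `alpha = 0.01`, the made-up pure profiles for subtypes A to D, and the brute-force grid search over proportions, which stands in for a proper bounded optimizer. It also assumes the mixed and pure profiles are already normalized to a comparable scale.

```python
import math
from itertools import product


def vst(q, alpha):
    """Assumed variance-stabilizing transform for counts with dispersion alpha."""
    return (2.0 * math.asinh(math.sqrt(alpha * q)) - math.log(4.0 * alpha)) / math.log(2.0)


def combine(pure, weights):
    """Mix the pure profiles (one list of gene values per subtype) on the raw scale."""
    n_genes = len(pure[0])
    return [sum(w * profile[g] for w, profile in zip(weights, pure)) for g in range(n_genes)]


def loss(x, pure, weights, alpha):
    """Squared distance between observed and fitted profiles after the transform."""
    fitted = combine(pure, weights)
    return sum((vst(xg, alpha) - vst(fg, alpha)) ** 2 for xg, fg in zip(x, fitted))


def estimate_proportions(x, pure, alpha=0.01, steps=20):
    """Grid search over non-negative proportions that sum to one (step = 1/steps)."""
    k = len(pure)
    best_w, best_loss = None, math.inf
    for head in product(range(steps + 1), repeat=k - 1):
        if sum(head) > steps:
            continue  # the remaining proportion would be negative
        counts = head + (steps - sum(head),)
        w = [c / steps for c in counts]
        current = loss(x, pure, w, alpha)
        if current < best_loss:
            best_w, best_loss = w, current
    return best_w


# Hypothetical normalized expression of six genes in each pure subtype.
pure = [
    [120, 10, 5, 40, 60, 8],    # subtype A
    [15, 90, 10, 40, 55, 30],   # subtype B
    [5, 12, 150, 35, 70, 3],    # subtype C
    [10, 8, 20, 45, 65, 100],   # subtype D
]

# Simulate a noise-free mixed sample and try to recover its composition.
mixed = combine(pure, [0.4, 0.3, 0.2, 0.1])
print(estimate_proportions(mixed, pure))
```

The point of the sketch is only the split between the two scales: the mixing happens on the raw scale in `combine`, while the fit is judged on the transformed scale in `loss`, so that neither the lowly nor the highly expressed genes dominate. Real data are noisy, so the recovered proportions would only be approximate.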
Thanks Michael for this suggestion. This is one function I haven't seen before. Looking at the `?unmix` information, I was wondering if I understand it correctly.
Using this would mean that `x` holds my samples with the mixed population and `pure` holds the samples with only one subtype. Is this correct?
Thanks for the fast response. This is what I also thought. I still doubt, though, that this is what they are looking for. Maybe a little more background information would help. The experiment is about dendrites in Drosophila brains. We are interested in a neural population which is responsible for motion. This population can be divided into the aforementioned four subtypes, which are responsible for different kinds of motion. Although their morphology is similar, the four subtypes differ in the crucial parts responsible for their functionality.
The problem is that the four subtypes A-D are not separable. The best one can achieve is a mixture of cells with a higher proportion of A-B and a low(er) proportion of C-D. But this still wouldn't tell me how much RNA from each subtype is in the mixed samples.
And yes, you're right, the next step in the plan is to do a single-cell RNA-Seq experiment, but as this takes more time and effort, it was considered to first try and see if one can get some preliminary results doing a standard RNA-Seq experiment.