Appropriate experimental design for differential expression
Hi,

I am designing an RNA-seq experiment and would like advice before investing any money: does this design give me what I want? Imagine I have (1) 3 replicates of fibroblasts surrounding a tumor, (2) 3 replicates of an organoid (3D model) of this tumor, and (3) 3 replicates of fibroblasts co-cultured with the organoid. My ultimate goal is to find genes arising from the interaction of fibroblast and organoid, that is, genes that respond in an environment containing both fibroblasts and tumor. I am going to use edgeR for differential expression, and the group comparison would be co-culture vs fibroblast + organoid. Do you think this is right?

> group
                   group batch
fibroblast1        fb        1
fibroblast2        fb        1
fibroblast3        fb        1
organoid1          or        2
organoid2          or        2
organoid3          or        2
co-cultured1       co        3
co-cultured2       co        3
co-cultured3       co        3

group= as.factor(c(rep ("co",3), rep("fb+or", 6)))

y <- DGEList(counts = counts, group = ~group + batch)

edgeR normalization cancer
You are planning this experiment? You haven't done anything yet? Do not prep each sample type on its own day, as your batch column implies. Prepping all samples of one type on one day is the worst thing you could possibly do: batch is then completely confounded with sample type, so your results will be almost meaningless.

The best thing to do would be to do all the RNA extractions on the same day, and do all the library preps on the same day, so you have no batch effect. You can have Jill do all the RNA preps on Monday, and Henry do all the library preps on Wednesday, but do not have Sarah do the co-culture ones while Aaron does the fibroblasts and Christine does the organoid samples, and do not have Sarah do the fibroblast preps one day, the co-cultures a different day, and the organoids on a third day. (Running the samples on separate lanes or runs on an Illumina instrument is perfectly fine.)

If you cannot do all the preps on the same day, absolutely do not do all the samples of one type on one day. An arrangement like the one below would be far better than what you outlined:

fibroblast1        fb        1
fibroblast2        fb        2
fibroblast3        fb        3
organoid1          or        1
organoid2          or        2
organoid3          or        3
co-cultured1       co        1
co-cultured2       co        2
co-cultured3       co        3
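
To see why the arrangement above is better, here is a minimal base R sketch (no edgeR needed) comparing the two layouts. The object names `confounded` and `blocked` are made up for illustration; the only thing it checks is whether a `~ group + batch` design matrix has full rank, which is a necessary condition for separating the group effect from the batch effect.

```r
# Sample type for the nine libraries, in the order listed in the thread
group <- factor(rep(c("fb", "or", "co"), each = 3), levels = c("fb", "or", "co"))

# Original plan: each sample type prepped in its own batch (hypothetical name)
confounded <- data.frame(group = group, batch = factor(rep(1:3, each = 3)))

# Suggested plan: one replicate of each type in every batch (hypothetical name)
blocked <- data.frame(group = group, batch = factor(rep(1:3, times = 3)))

design_confounded <- model.matrix(~ group + batch, data = confounded)
design_blocked    <- model.matrix(~ group + batch, data = blocked)

# Coefficients are estimable only if the rank equals the number of columns
rank_confounded <- qr(design_confounded)$rank
rank_blocked    <- qr(design_blocked)$rank

cat("confounded: rank", rank_confounded, "of", ncol(design_confounded), "columns\n")
cat("blocked:    rank", rank_blocked, "of", ncol(design_blocked), "columns\n")
```

In the confounded layout the batch columns are exact copies of the group columns, so the matrix is rank deficient and no software can tell the two effects apart. In the blocked layout every coefficient can be estimated.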


And of course, 3 replicates is the absolute bare minimum...you'd be much happier with, say, 5. With three, if two cluster closely, and one is a bit away, it's hard to say if that's normal variation or an outlier you can easily justify omitting. But if 4 cluster together, and only one is out on its own, it's easier to justify removing it.

I would never be an accomplice to such a horrifying experimental design! Well, mostly because they'd never let me near the cell culture room in the first place. A walking talking fungal colony, that's what they said.

Sorry @Aaron Lun, I have seriously been asked to suggest an experimental design for this RNA-seq experiment, whose ultimate goal is identifying genes arising from the interaction of fibroblast cells and organoid cells. What would a reasonable design be, please?

I don't have anything to add to what @swbarnes2 has already said.

I will say, though, that if you really, truly, deeply care about setting up your experiment correctly; if you want your data analysis done correctly; and if you want accountability in each of these steps; you should seek a collaboration with a local bioinformatician in your region. It's all well and good to get free advice from strangers on the internet, but here as in all matters, you won't get what you don't pay for.

Sorry, I am pretty much a beginner in computational biology in our lab. The other postdoc is planning the experiment and has not done anything yet. My job is to guide her so that she does not spend a lot of money on nonsense. She wants to identify genes that are differentially expressed due to co-culturing fibroblasts and organoids, so she asked me what the design would be in DESeq2 or edgeR.

In DESeq2 is this right?

condition = as.factor(c(rep("co-cultured",3),rep("fb+or",6)))

Does it make biological sense for you to lump the or's and fb's together? I'd also strongly recommend making your condition file in Excel, because you listed the fibro samples first, but you keep putting the co condition first in your condition data.
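
If you would rather build the condition table in R than in a spreadsheet, one way to avoid the ordering mistake is to derive the condition from the sample names instead of typing `rep()` calls by hand. This is only a sketch in base R: the sample names follow the table in the question, and `lookup` and `coldata` are illustrative names, not anything DESeq2 or edgeR requires. It also keeps fb and or as separate levels, so lumping them stays a decision you make later rather than something baked into the sample sheet.

```r
# Sample names in the same order as the columns of the count matrix (assumed)
samples <- c(paste0("fibroblast", 1:3),
             paste0("organoid", 1:3),
             paste0("co-cultured", 1:3))

# Strip the trailing replicate number to recover the sample type
prefix <- sub("[0-9]+$", "", samples)

# Hypothetical lookup from sample type to the short labels used in the thread
lookup <- c("fibroblast" = "fb", "organoid" = "or", "co-cultured" = "co")

condition <- factor(unname(lookup[prefix]), levels = c("fb", "or", "co"))
coldata <- data.frame(condition = condition, row.names = samples)
print(coldata)

# With a real count matrix called 'counts' (not defined here), it is worth checking:
# stopifnot(identical(colnames(counts), rownames(coldata)))
```

Because each label is looked up from its own sample name, the fibroblast rows cannot end up tagged as co-culture just because the vector was typed in a different order.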