Question

RNA-seq: samples from different experiments, how to do DGEs?

Philippine • 0

@e28160be

Canada

Hello,

I have 5 experiments on bees. Each experiment is done on new bees (never the same bee), always with the same treatment but at different doses:

exp 1: control vs exposed dose 1 and dose 2 with 5 replicates each (for both control and exposed, 2 replicates from lab 1 and 3 from lab 2; sequenced together)
exp 2: control vs exposed dose 3 with 5 replicates each from lab 2 - done week 1
exp 3: control vs exposed dose 3 with 5 replicates each from lab 2 - done week 2
exp 4: control vs exposed dose 3 with 5 replicates each from lab 2 - done week 3
exp 5: control vs exposed dose 3 with 5 replicates each from lab 2 - done week 4

I am trying to figure out the best way to compare the effect of treatment vs control. All the experiments used the same RNA extraction, library preparation, sequencing technology, cleaning, alignment and read counting procedures.

For exp 1 (which I consider unique), I have added a batch effect = lab to my DGE procedure, since I saw two clusters corresponding to the lab of origin in the MDS plot.

For exp 2 to 5 I am lost. I am not sure what I should do, since they were done at different times even though the dose is the same in all 4 experiments.

Should I pool the read counts into one big table with 4x5 control and 4x5 exposed samples, apply a batch effect = exp #, and run the DGE script? Should I do the DGE analysis individually (exp 2, exp 3, exp 4, exp 5) and then somehow pool the results? In that case, should I take the mean of the logFC? Should I forget about traditional DGE analysis altogether and move to meta-analysis? If so, can I add more experiments with different treatments?

I would love your recommendations, since I am not sure which is the most robust way to analyse this kind of data. It's my first RNA-seq analysis with samples from different labs and different experiments.

Thank you for your help.

Philippine

DGEs edgeR RNASeqData • 573 views

score 0 · Answer 1 · 2024-10-25

Gordon Smyth 52k

@gordon-smyth

WEHI, Melbourne, Australia

From the information you give, it would seem that you could analyse all five experiments together with a batch effect for each lab and time combination, i.e.,

design <- model.matrix(~Exposed + Batch)

where Batch takes on a different value for each lab by time combination. Batch would take on 2 values for the two labs in experiment 1 and a new value for each of experiments 2-5.
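As a rough illustration of the Batch coding described above, here is a minimal base R sketch. The sample sheet is hypothetical and reconstructed only from the layout given in the question; the column names `Experiment`, `Lab` and `Dose` are assumptions, and `Exposed` is assumed to be a two-level factor (control vs exposed) that pools the doses. A dose-level factor would be an alternative coding.

```r
# Hypothetical sample sheet rebuilt from the description in the question
# (column names are illustrative, not taken from the original data).

# Experiment 1: control, dose 1 and dose 2, 5 replicates each,
# with 2 replicates from lab 1 and 3 from lab 2 in every group.
exp1 <- data.frame(
  Experiment = 1,
  Lab  = rep(rep(c("lab1", "lab2"), times = c(2, 3)), times = 3),
  Dose = rep(c("control", "dose1", "dose2"), each = 5)
)

# Experiments 2 to 5: control vs dose 3, 5 replicates each, all from lab 2.
exp2to5 <- data.frame(
  Experiment = rep(2:5, each = 10),
  Lab  = "lab2",
  Dose = rep(rep(c("control", "dose3"), each = 5), times = 4)
)

samples <- rbind(exp1, exp2to5)

# One batch per lab-by-time combination: two batches within experiment 1
# (one per lab) and one batch for each of experiments 2 to 5.
samples$Batch <- factor(ifelse(
  samples$Experiment == 1,
  paste0("exp1_", samples$Lab),
  paste0("exp", samples$Experiment)
))

# Assumption: Exposed is a two-level factor, with control as the reference.
samples$Exposed <- factor(
  ifelse(samples$Dose == "control", "control", "exposed"),
  levels = c("control", "exposed")
)

design <- model.matrix(~ Exposed + Batch, data = samples)

# Cross-tabulation to check that both groups occur in every batch.
table(samples$Batch, samples$Exposed)
```

A design matrix built this way could then be passed, together with the pooled count table, to edgeR functions such as `estimateDisp()` and `glmQLFit()`, with the exposure effect tested through the `Exposedexposed` coefficient.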

Gordon Smyth • 4 weeks ago

You mean:

Metadata table with batches

is that correct? Thanks :D

Philippine • 4 weeks ago

Sorry, I do not understand at all why you are equating batches with labs, since your original question seems to indicate that there are only two labs involved in total across the five experiments.

I am, however, suggesting six batches, the same as what you have, for some reason, called labs.

Gordon Smyth • 4 weeks ago

Thank you, indeed I called them labs, but they should be named Batch 1, 2, 3, 4, 5 and 6. Thank you again for your help! :D

Philippine • 4 weeks ago