Dealing with my.contrasts with replicates containing nested samples
1
0
Entering edit mode
James • 0
@James-24772
Last seen 7 months ago
United States

I am working with experimental designs from the edgeR User Guide. In this experimental design I have a single factor with 3 levels. Each level has 2 replicates and each replicate had 5 samples taken. see below.

    Treatments <- as.factor(c(rep("control", 10), rep("dose1", 10), rep("dose2", 10)))
Replicate <- c(rep("C1", 5), rep("C2", 5), rep("D1.1", 5), rep("D1.2", 5), rep("D2.1", 5), rep("D2.2", 5))
Sample <- 1:30
y <- as.data.frame(cbind(Treatments, Replicate, Sample))
design <- model.matrix ( ~0 + Replicate, data = y)


how would I setup my.contrast to compare (C1 & C2 with D1.1 & D1.2) and then compare C1 & C2 with D2.1 & D2.2)?

my.contrasts <- makeContrasts (
ControlVsDose1 = (ReplicateC1 + ReplicateC2) - (ReplicateD1.1 + ReplicateD1.2),
ControlVsDose2 = (ReplicateC1 + ReplicateC2) - (ReplicateD2.1 + ReplicateD2.2),
levels = design)


Is this my.contrasts doing what I think it's doing?

edgeR • 633 views
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

The edgeR code seems self-explanatory but your experiment itself is unclear. What you do you mean when you say that each level has 2 replicates and each replicate had 5 samples? What is the difference between replicates and samples? Your analysis is currently treating replicates as treatments and samples as replicates.

A more detailed explanation of what you mean by "nested samples" is necessary before good advice can be given. If the samples are technical replicates, then the edgeR User's Guide advices you to collapse them into one for each biological replicate by summing the counts (using sumTechReps).

0
Entering edit mode

Thank you for your help Gordon. It's an fish experiment where we have to consider variance among tanks within a treatment (aka tank effect). There are 3 treatments (2 different tanks per treatment for a total of 6 tanks). Muscle tissue sampled from 5 randomly chosen fish per tank was sequenced. I cannot treat each fish as a treatment replicate, they are nested in tank samples. These are not technical replicates either.

I know other labs that haven't figured out how to run this experimental design in edgeR, so any advice would go a long way. Also, as a bonus question, how would I set this experiment up in edgeR if I had 2 or 3 Treatments?

1
Entering edit mode

See https://support.bioconductor.org/p/132926/ for a previous answer to the same question. My advice remains the same as it was then.

0
Entering edit mode

Thanks for the link Gordon. How would you recommend best measure Tank variance with your code before moving on to the two options (collapsing the counts in each tank or using each fish as replicate?)

1
Entering edit mode

Plot the samples with plotMDS.

0
Entering edit mode

Thank you again. So any Tank Effect becomes an judgement call. I have now plotted out the samples in a few dimension on the plotMDS with method = "common" and and need to know how much variance each component explains to see if it worth considering as a Tank Effect. i.e. if the component only accounts for 0.5% of the data then I wouldn't assume that it's a significant Tank Effect.

I have read your other posts and you are very clear that any output for variance per PCA or MDS isn't in the original design for plotMDS. Would you then recommend to make a new df from the normalized DGEList and use prcom() and summary() to find how much variance each component explains?

on a similar note is there a way to get plotMDS to do a nMDS?

0
Entering edit mode

So any Tank Effect becomes an judgement call.

Is that such a problem? Do you see a clear tank effect in the plot or not? We're not worrying about invisible effects or finessing micro effects. Just looking for something noticeable. There is no substitute for judgement in choosing analysis strategies.

I have now plotted out the samples in a few dimension on the plotMDS with method = "common"

Any reason why you don't want to use the default method that we recommend for this sort of data? There's no need to forensically examine lots of dimensions. If the tank effect doesn't show up in the first two dimensions then either it isn't important or you have much worse batch effects to be worrying about.

You just need to run plotMDS with default settings and look. It's as simple as that.

need to know how much variance each component explains

Why do you need an arbitrary numerical measure that has no direct relationship to the intended analysis and which has no objective cutoff anyway?

if the component only accounts for 0.5% of the data then I wouldn't assume that it's a significant Tank Effect.

That is unhelpful. Tank would account for more than 0.5% even if the data was just random.

Would you then recommend ...

I've given my advice but you seem resistant to taking it! The edgeR package implements what I recommend.

is there a way to get plotMDS to do a nMDS?

plotMDS does what it is documented to do. How could nMDS possibly help you?

Anyway, it appears that your data has no significant DE. All the analysis options are more conservative than the analysis you've already done. So these decisions are moot. If you're not confident about making decisions, then just average the fish and accept that there's no DE. Next time consider using more tanks. Experiments with n=2 replicates are pretty minimal.

0
Entering edit mode

I am sorry, I didn't intend to frustrate you. I do appreciate and respect your advice, hence I seek your answers. I will proceed as you suggest.

Yes, two tanks per treatment was sub par. We were limited by space and cost per tank. I have several other experiments with adequate replication that I am using your code on, so I feel the need to know the code & data well.