Question

duplicateCorrelation(): block by duplicate sample or by cell line?

Gimena • 0

@ef404090

Last seen 12 weeks ago

United States

I have bulk RNA-seq data from 3 drug exposures (vehicle, low and high dose) x 2 replicates per condition x 6 cell lines, so a total of 36 samples.

I am interested in the Exposure effect.

I am using duplicateCorrelation and limma voom, but my cell line effect is eating all of the exposure effect (as seen with variancePartition).

library(limma)

# info: the metadata table (one row per sample)
# genes: the counts table (rows are genes, columns are samples)

design <- model.matrix(~ Exposure + CellLine, info)

vobj_tmp <- voom(genes, design, plot = TRUE)

# I am blocking by sample, meaning each pair of replicates per condition
dupcor <- duplicateCorrelation(vobj_tmp, design, block = info$SampleID)

vobj <- voom(genes, design, plot = TRUE, block = info$SampleID, correlation = dupcor$consensus.correlation)

fitDupCor <- lmFit(vobj, design, block = info$SampleID, correlation = dupcor$consensus.correlation)

fitDupCor <- eBayes(fitDupCor)

I get no significant genes, which I think is because the effect of the cell lines is huge (as I observed before with variancePartition).

1) Am I using the blocking correctly? Or should I block by SampleID in duplicateCorrelation() but block by CellLine in voom()?

2) Is it correct to use variancePartition while treating the replicates as independent samples? I am not using duplicateCorrelation for variancePartition.

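library(edgeR)
library(variancePartition)

gExpr <- DGEList(counts = genes)

gExpr <- calcNormFactors(gExpr)

design <- model.matrix(~ Exposure + CellLine, info)

vobjGenes <- voom(gExpr, design)

geneExpres <- vobjGenes$E

# form is the variancePartition model formula (its definition is not shown in this post)
varPart <- fitExtractVarPartModel(vobjGenes, form, info)

vp <- sortCols(varPart)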

3) Any other ideas on how I can rescue the exposure effect?

Thanks in advance, Gimena

variancePartition RNASeqData limma BulkRNAseq RNASeq • 1.3k views

ADD COMMENT • link updated 12 months ago by Gordon Smyth 52k • written 14 months ago by Gimena • 0

score 1 · Answer 1 · 2023-10-10

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 1 day ago

United States

If you have complete cases, cell line will be orthogonal to the treatment effect, so the fact that most of the variation is due to cell line should have no effect on being able to detect differences in exposure. Also, you should probably use voomLmFit instead.

fit <- eBayes(voomLmFit(gExpr, design, block = info$SampleID))
topTable(fit, 2)
topTable(fit, 3)

ADD COMMENT • link 14 months ago James W. MacDonald 67k

Hi James,

Yes, these are complete cases. Every cell line has all the treatment groups and technical replicates.

ADD REPLY • link 12 months ago Gimena • 0

score 0 · Answer 2 · 2023-10-11

Gordon Smyth 52k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

It is not clear to me why you are using duplicateCorrelation at all. None of the information you give about your experiment suggests any random effect or blocking. It is not clear what you mean by "replicates" or why they would not be independent samples.

ADD COMMENT • link 14 months ago Gordon Smyth 52k

I believe it's because the replicate cell lines aren't truly biological replicates, but instead just aliquots from the same cell suspension.

ADD REPLY • link 14 months ago James W. MacDonald 67k

Could be, but the question doesn't say that as far as I can see. If the replicates are indeed just technical replicates of aliquots, then there is no biological replication and an analysis in limma is impossible. limma needs some biological replicates just to estimate the duplicateCorrelation. Same goes for variancePartition -- there is insufficient information in this experiment to estimate variance components.

The only option really is to treat the replicates as independent samples, i.e., same analysis but without blocking or variance partition.

ADD REPLY • link 14 months ago • updated 12 months ago Gordon Smyth 52k

I am sorry for the confusion. It goes like this:

3 drug exposures (vehicle, low and high dose) x 2 replicates per condition (technical replicates) x 6 cell lines (biological replicates)

Technical replicates for me are the same cell line, same treatment, but grown in different wells.

That is why I want to use duplicateCorrelation for the technical replicates.

I hope this clarifies.

Thank you!!!!

ADD REPLY • link 12 months ago Gimena • 0

They are not technical replicates. With the Exposure+CellLine model, your experiment has only one level of variability so there is no possibility of estimating random effects. It is neither necessary nor correct to use duplicateCorrelation or variancePartition.

Given the balanced design, it is also not possible for the CellLine effect to be somehow "eating" the Exposure effect.
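To illustrate the point about balance, here is a minimal base R sketch on simulated data for a single gene. The layout mirrors the one described in the question (6 cell lines x 3 exposures x 2 wells), but the names `CL1` to `CL6`, the effect sizes and the noise level are all made-up assumptions, not values from the real experiment.

```r
set.seed(42)

# Hypothetical balanced layout: 6 cell lines x 3 exposures x 2 wells = 36 samples
d <- expand.grid(Rep = 1:2,
                 Exposure = factor(c("vehicle", "low", "high"),
                                   levels = c("vehicle", "low", "high")),
                 CellLine = factor(paste0("CL", 1:6)))

# Simulated log-expression for one gene: a large cell line effect,
# a small exposure effect, and well-to-well noise (all values invented)
cell_shift <- rnorm(6, sd = 5)[d$CellLine]
expo_shift <- c(vehicle = 0, low = 0.5, high = 1)[as.character(d$Exposure)]
d$y <- 8 + cell_shift + unname(expo_shift) + rnorm(nrow(d), sd = 0.3)

fit_small <- lm(y ~ Exposure, data = d)             # ignores cell line
fit_full  <- lm(y ~ Exposure + CellLine, data = d)  # adjusts for cell line

# In a balanced design the Exposure estimates agree between the two models
print(coef(fit_small)[c("Exposurelow", "Exposurehigh")])
print(coef(fit_full)[c("Exposurelow", "Exposurehigh")])

# What changes is the standard error: adjusting for cell line removes
# the cell line variation from the residuals
se_small <- summary(fit_small)$coefficients["Exposurehigh", "Std. Error"]
se_full  <- summary(fit_full)$coefficients["Exposurehigh", "Std. Error"]
print(c(without_CellLine = se_small, with_CellLine = se_full))
```

In this sketch, putting CellLine in the model does not shrink the estimated exposure effect. It only makes the exposure comparison more precise.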

The standard way to analyse this experiment would be to use the one-way layout approach described in the limma documentation. That would allow you to test for cell-line-specific exposure effects.
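A rough sketch of what a one-way layout could look like for this design is below. It runs on simulated counts so that it is self-contained. The cell line names (`CL1` to `CL6`), the exposure labels, the contrast names and the count distribution are assumptions for illustration only; with real data, `info` and `counts` would be the actual metadata and count table.

```r
library(limma)
library(edgeR)

set.seed(1)

# Hypothetical metadata: 6 cell lines x 3 exposures x 2 wells = 36 samples
info <- expand.grid(Rep = 1:2,
                    Exposure = c("vehicle", "low", "high"),
                    CellLine = paste0("CL", 1:6))

# Simulated counts purely for illustration (500 genes x 36 samples)
counts <- matrix(rnbinom(500 * nrow(info), mu = 100, size = 10),
                 nrow = 500,
                 dimnames = list(paste0("gene", 1:500), NULL))

# One-way layout: each CellLine x Exposure combination is its own group
group <- factor(paste(info$CellLine, info$Exposure, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

y <- calcNormFactors(DGEList(counts = counts))
v <- voom(y, design)
fit <- lmFit(v, design)   # no blocking, no duplicateCorrelation

# Exposure effects within a single cell line (CL1 chosen as an example)
cont <- makeContrasts(CL1.low_vs_vehicle  = CL1.low  - CL1.vehicle,
                      CL1.high_vs_vehicle = CL1.high - CL1.vehicle,
                      levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
tab <- topTable(fit2, coef = "CL1.low_vs_vehicle")
print(tab)
```

Contrasts for the other cell lines, or an average exposure effect across cell lines, can be written the same way from the group column names.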

ADD REPLY • link 12 months ago Gordon Smyth 52k