Question: Multi-factorial design for patient-replicates from different time points using duplicate correlation LIMMA Voom
0
gravatar for alva.james
2.6 years ago by
alva.james0
Germany
alva.james0 wrote:

Hello All,

I am trying to implement LIMMA room duplicate correlation strategy for my RNA-seq samples using blocking setup. I need to consider following points into account.

1. My samples are biologically paired and dependent also not all patients has its paired sample.

for instance, this is how my design table looks like,

Patient Condition Time
PE01 PH ID
PE01 PH REL
AE01 Non ID
PE05 Non REL

 I have 82 samples with more than half of patients has its paired sample and few of them doesn't. According to Limma from user guide, I have adapted to the following workflow. But I have a question,   how limma voom fits into the workflow for a paired experimental design that uses duplicateCorrelation() to handle blocking on the same subject, how is the appropriate experimental design (taking subject blocking into account) fed to voom?

My design looks like,

combined_fac <- factor(paste( design_table$Condition))

design <- model.matrix(~0 + combined_fac)

rownames(design)<-colnames(data)

colnames(design)<-c("Non","PH")

## Then followed with estimating correlation, blocking is input with design and then the Contract for finding DE genes

y <- DGEList(data)
y <- calcNormFactors(y)
v <- voom(y, design)
corfit <- duplicateCorrelation(v, design, block =design_table$Patient

)
v <- voom(y, design, block =design_table$Patient, correlation =
corfit$consensus)
fit <- lmFit(v, design, block = design_table$Patient, correlation =
corfit$consensus)
cont_matrix <- makeContrasts( PHvsNon = PH-Non,  levels=design)

fit2 <- contrasts.fit(fit, cont_matrix)
fit2 <- eBayes(fit2)
result_double <- topTable(fit2, number = nrow(data), sort.by ="none")


I wanted to make the understanding the workflow is right with for my experiment.? or what is the appropriate way to run voom such that it takes all of the necessary design information into its account? 

Thanks

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by alva.james0
Answer: Multi-factorial design for patient-replicates from different time points using d
2
gravatar for Aaron Lun
2.6 years ago by
Aaron Lun25k
Cambridge, United Kingdom
Aaron Lun25k wrote:

Time seems to be an important factor, so perhaps you should be doing:

combined_fac <- factor(paste0(design_table$Condition, ".", design_table$Time))

Even if you're only interested in DE between conditions, you need to include time in the design to avoid inflating the variance. It is simple to modify the contrasts to test for DE between conditions in the presence of time:

con <- makeContrasts( (PH.ID + PH.REL)/2 - (Non.ID + Non.REL)/2, levels=design)

In any case, your design matrix doesn't have a Phlike column, so I don't know what that is or where it comes from. I also don't know what model$Block is in the call to duplicateCorrelation, perhaps you mean design_tab$Patient.

Otherwise, the code looks fine.

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by Aaron Lun25k

Thanks Aaron for your time, yes Phlike is PH as it is in my design_table, I just corrected that in my primary question, also model$block. So yes I need the DE genes between PH and Non. Their data is my count and the column names of samples don't have "." so I don't understand why combined_fac should be defined as you define? Also regarding time, is it because statistically, the variance inflation matters in the analysis? Aslo as you see few patients doesnt have either REL or ID as pairs. In this case is it important  to take the time level into account

ADD REPLYlink modified 2.6 years ago • written 2.6 years ago by alva.james0

Does gene expression change between levels of Time? Presumably whoever designed the experiment thought that it does, otherwise they wouldn't have bothered collecting samples for the two time points. If there is DE between time points, you need to consider the Time factor in your design matrix. Otherwise, DE between time points will manifest as an increase in the estimated variance, which will reduce your power to detect DE between conditions.

ADD REPLYlink modified 2.6 years ago • written 2.6 years ago by Aaron Lun25k

Does gene expression change between levels of Time? No, we first started with that (to look into the difference between 2-time levels), but we didn't see any changes between time level but the subtype is driving the difference between samples.

ADD REPLYlink written 2.6 years ago by alva.james0

Honestly speaking, I don't see the harm in putting Time into the model. It's only one factor, so you hardly lose any residual d.f., and it does protect you against potential variance inflation. I don't know how you initially concluded that time had no effect; but if you used the same approach as you did in your original post, you obviously won't observe any DE because the variance inflation due to Condition is likely to be substantial.

ADD REPLYlink written 2.6 years ago by Aaron Lun25k

 I don't know how you initially concluded that time had no effect; --We initially started to find the DE genes (using Deseq2) between ID and REL of all samples. And the DE genes were very heterogeneous in their expression pattern. And it was very difficult to call it as distinguish expression pattern between ID and REL. But the between subtype is showing a very distinguish pattern compared to ID versus REL. Yes I can incorporate the Time into model. Thank you

ADD REPLYlink written 2.6 years ago by alva.james0

Sorry, I am getting error here, at makeContrasts : Error: unexpected symbol in " makeContrasts( ( combined_facPhlike ."

 

ADD REPLYlink modified 2.6 years ago • written 2.6 years ago by alva.james0

Why is there a dot by itself? The dot in my post is part of the column name of design.

ADD REPLYlink written 2.6 years ago by Aaron Lun25k

In my design there is also a dot, as  combined_facPhlike . REL in the design matrix

ADD REPLYlink written 2.6 years ago by alva.james0

Ah, you should use paste0 rather than paste when making combined_fac.

ADD REPLYlink written 2.6 years ago by Aaron Lun25k

Yeah thanks it works :)

ADD REPLYlink written 2.6 years ago by alva.james0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 169 users visited in the last hour