Extracting a subset of samples following SVA
@jonellevillar-15405
Last seen 10 months ago
Bergen

I am performing differential methylation analyses on different subgroups of a patient population. The samples were assayed on Illumina EPIC arrays. We have considered using ComBat to handle the batch effects, or simply blocking in limma as previously discussed on this forum.

However, I have a question about SVA and a singularity error with one particular subset. In that subset there is only 1 sample on several of the plates, and only 1 sample in the wells. I tried merging the samples and running the SVA model again, but the singularity error remained. I am now considering running the model plus batch effects with SVA on the entire targets sample set, and then extracting the subsets afterwards. Would this "fix" lead to false estimates in the downstream analyses? It seems I need a larger dataset to avoid those single samples on the plates and in the wells.

Thanks for your input.

Combat SVA Limma • 1.0k views
Kevin Blighe ★ 3.9k
@kevin
Last seen 1 day ago
Republic of Ireland

Hey, if you are implying that some of your batches have just a single sample and that there is evidence of a batch effect that involves these [batches], then there is not much that can be done. You should probably remove these samples from the study.
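
To illustrate one way a single-sample batch can produce a singularity, here is a minimal sketch in plain Python. The sample sheet, the factor names (`group`, `plate`, `well`) and the helpers `design_matrix` and `matrix_rank` are all hypothetical and are not taken from the poster's data. It assumes an additive design with treatment coding. When the only sample on a plate is also the only sample in a well, the two indicator columns are identical, so the design matrix loses rank. In R, comparing `qr(design)$rank` with `ncol(design)` gives the same check.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(samples, factors):
    """Intercept plus dummy columns per factor (first level dropped)."""
    columns = [[1] * len(samples)]
    names = ["(Intercept)"]
    for f in factors:
        levels = sorted({s[f] for s in samples})
        for level in levels[1:]:
            columns.append([1 if s[f] == level else 0 for s in samples])
            names.append(f + level)
    rows = [list(r) for r in zip(*columns)]
    return names, rows


# Hypothetical sample sheet: the last sample is alone on plate P3 and in well W3.
samples = [
    {"group": "ctrl", "plate": "P1", "well": "W1"},
    {"group": "case", "plate": "P1", "well": "W2"},
    {"group": "ctrl", "plate": "P2", "well": "W1"},
    {"group": "case", "plate": "P2", "well": "W2"},
    {"group": "ctrl", "plate": "P1", "well": "W2"},
    {"group": "case", "plate": "P3", "well": "W3"},
]
factors = ["group", "plate", "well"]

names_all, rows_all = design_matrix(samples, factors)
rank_all = matrix_rank(rows_all)

# Same design after dropping the single-sample batch.
names_kept, rows_kept = design_matrix(samples[:-1], factors)
rank_kept = matrix_rank(rows_kept)

print("all samples:", len(names_all), "columns, rank", rank_all)
print("after removal:", len(names_kept), "columns, rank", rank_kept)
```

A rank below the number of columns means that at least one coefficient cannot be estimated. In this toy case the plate P3 and well W3 effects cannot be separated from each other, and removing the lone sample restores full rank.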

Over the years, I have noticed how many users are keen to adjust for what they assume to be batch or other unknown effects in their data. Do you have concrete evidence of a batch effect or is it just your intuition that there is some batch or other effect that must exist? The best strategy is obviously a solid experimental design, but we do not always have this, as we know.

Kevin

Edit:

  • There is in fact a very informative thread where another approach is suggested: https://support.bioconductor.org/p/109040/#109042
  • The approach suggested by Aaron using duplicateCorrelation() is one that I employed in a study just recently (2 days ago).

Hi Kevin,
Thank you for your response. The only evidence I have is two pronounced clusters in the PCA plots, and scree plots that show 98% of the variability associated with dimension 1. At this point we are only considering batch effects from the EPIC array. I look forward to reading the approach suggested by Aaron using duplicateCorrelation. Were you satisfied with this option in your recent study? Best, Jonelle


Okay, I am convinced by 98% variation along PC1! Yes and no, with regard to the duplicateCorrelation approach: yes, because it allowed us to control for batch and Donor (Individual); no, because I would have preferred a larger and differently designed study.

Anyway, I trust that your analysis will go well.
