ComBat: 2 adjustment variables & continuous adjustment variables
Magda Price ▴ 60
@magda-price-6411
Last seen 7.4 years ago
Johnson, William Evan <wej at ...> writes:

> Hey Magda,
>
> The two-step method is still a reasonable approach. It has worked well for me in multiple situations. I do have a beta version of ComBat that will handle two batch variables at the same time. It works well in theory, but I have yet to test it thoroughly across multiple datasets. I'm willing to share the code if you want to test it on your data (let me know).
>
> ComBat in the sva package can handle numeric covariates, but it does not deal with continuous batch variables. Adjusting the mean of a continuous batch variable would be straightforward (assuming a linear effect), but the variance adjustment would be very tricky.
>
> Ultimately, since the two-step approach seems to have worked, I think your best option is to just move forward with those results.
>
> Thanks!
>
> Evan
>
> On Feb 19, 2014, at 4:00 AM, <bioconductor-request at ...> wrote:
>
> > Message: 23
> > Date: Tue, 18 Feb 2014 16:45:12 -0800
> > From: Magda Price <magdaprice at ...>
> > To: "bioconductor at ..." <bioconductor at ...>
> > Subject: [BioC] ComBat: 2 adjustment variables & continuous adjustment variables
> > Message-ID: <cadkr4v=ydd1abjxfhtd+xwq8mzmp_=urhvdtpxoturpqjzb7tg at ...>
> > Content-Type: text/plain
> >
> > Hi!
> >
> > I'm writing with a few questions about applying ComBat (sva package) to a set of ~50 samples run on the Illumina Infinium HumanMethylation450 BeadChip array (~450,000 DNA methylation data points).
> >
> > There is a large amount of variation in my data due to both the batch the samples were run in (3 different batches) and the position they were located in on the chip: specifically the row (6 different rows), but not the column. The chips are set up in a 6 row * 2 column format like this:
> >
> >     sample 01    sample 02
> >     sample 03    sample 04
> >     sample 05    sample 06
> >     sample 07    sample 08
> >     sample 09    sample 10
> >     sample 11    sample 12
> >
> > I read Dr. Evan Johnson's suggestions to someone else with this "2-batch-effect-variable" problem in the ComBat google group (https://groups.google.com/forum/#!topic/combat-user-forum/PcTxNlaUmAI). He had 2 good suggestions:
> >
> > 1. Combine the two batch variables into one, if 3-4 reps are left in each batch.
> > 2. Use ComBat twice, adjusting for the first batch using the second batch as a covariate, and then adjust for the second batch.
> >
> > I cannot go with the first suggestion because combining the 2 batch variables would create 18 batch categories (3 batches * 6 rows), and I would not have enough replicates per batch category.
> >
> > So I tried the second option: applying ComBat twice. I first corrected for row and then took the row-corrected data and applied ComBat again, correcting for batch. It seems to have worked, and the correlation of my technical replicates improves. I am seeking advice on two points:
> >
> > 1. The google group post is now a few years old. Is it still thought that the step-wise correction is a valid approach?
> > 2. Row would be better treated as a continuous adjustment variable than a factor. In the version of sva that I am using (3.0.2) I believe that only factor adjustment variables are supported. I have seen mention in a few forums that there might be an update to ComBat to adjust for a numeric batch variable. Is one available?
> >
> > Thank you in advance for your help!
> >
> > Magda Price,
> > University of British Columbia
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at ...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Hi Evan,

Thanks for your response, and so sorry for my delay; I wasn't notified by e-mail that you had responded. Since I wrote the first note, a few things have changed:

1. I have discovered a third batch variable (chip, in addition to plate & row)!
2. Based on forum feedback, I figured I was okay to stick with all factor adjustment variables.

An additional question has come up in reference to your suggestion:

> > Use ComBat twice, adjusting for the first batch using the second batch as a covariate, and then adjust for the second batch.

I can't include the second batch variable as a covariate; I get a singularity error because the batches are confounded with each other. For example, all samples on chip A were run in batch 1. Do you still think it is a valid approach if I can't use subsequent batch variables as covariates?

Thank you for the offer & advice!

Magda
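
The singularity error described above is what one would expect when one batch variable is nested in another: if every sample on a given chip was run in a single batch, the batch indicator columns are sums of the chip indicator columns, so a design matrix containing both is rank deficient. The sketch below is one way to check for that before fitting anything. It is written in plain Python rather than R, it is not part of sva or ComBat, and the sample sheet and the helper names `crosstab` and `is_nested` are hypothetical, chosen only to mirror the "all samples on chip A were run in batch 1" example.

```python
from collections import defaultdict


def crosstab(a, b):
    """Count how many samples fall into each (level of a, level of b) cell."""
    counts = defaultdict(int)
    for x, y in zip(a, b):
        counts[(x, y)] += 1
    return dict(counts)


def is_nested(inner, outer):
    """Return True if every level of `inner` occurs in exactly one level of `outer`.

    When this holds, the indicator columns of `outer` are sums of the
    indicator columns of `inner`, so a model with both is singular.
    """
    seen = defaultdict(set)
    for i, o in zip(inner, outer):
        seen[i].add(o)
    return all(len(levels) == 1 for levels in seen.values())


# Hypothetical sample sheet (assumption, not the poster's data):
# chips A and B were run in batch 1, chip C in batch 2.
chip = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
batch = [1, 1, 1, 1, 1, 1, 2, 2, 2]
row = [1, 2, 3, 1, 2, 3, 1, 2, 3]

print(crosstab(chip, batch))
print("chip nested in batch:", is_nested(chip, batch))
print("row nested in batch:", is_nested(row, batch))
```

In this toy layout chip is nested in batch, while row is crossed with batch, so only row could enter as a covariate alongside batch without making the design singular. The same cross-tabulation can be done in R with `table()` on the two factors.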