Combining pfilter (wateRmelon) and preprocessNoob (minfi)

Dear all,

I want to combine the minfi and wateRmelon packages to analyse my EPIC methylation data. I loaded my data into R as an RGChannelSetExtended with minfi. Next, I wanted to filter out low-quality samples and probes with the pfilter function of wateRmelon. The downside is that the function returns a MethylSet, which I cannot pass to minfi's preprocessNoob function (it requires an RGChannelSet). I noticed that in older versions of wateRmelon it was possible to get an RGChannelSetExtended back.

Is there anyone who can help me with this problem?



wateRmelon minfi methylation pfilter preprocessNoob

Is the goal to filter on detection p-values, or are you wanting to filter on bead count as well?

tgorri ▴ 10

Hi Sara,

Running pfilter on RGChannelSets has always been a problem because of how the detection p-values are calculated for the probes. The previous method did return an RGChannelSet, which then required manual filtering of the data after normalization, but this was changed a while ago.

If you do want to use pfilter, the only thing I can recommend is to run pfilter on your data and store the result in an object, then normalize the original (unfiltered) data object with preprocessNoob, and finally subset the normalised data by the row names and column names of the pfilter object.

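# 'data' is the unfiltered RGChannelSetExtended; pfilter returns a filtered MethylSet
filt <- pfilter(data)
# normalise the unfiltered data
norm <- preprocessNoob(data)
# keep only the probes (rows) and samples (columns) that passed pfilter
norm_filt <- norm[rownames(filt), colnames(filt)]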

I know this is not ideal, as we do recommend applying pfilter prior to normalization. We will look into this and try to come up with something for the upcoming Bioconductor release.
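A slightly more defensive version of the subsetting step can be sketched as below. The helper name subset_to_filtered is hypothetical (it is not part of minfi or wateRmelon); it only assumes that both objects carry row names (probes) and column names (samples), so it should work on MethylSets as well as on plain matrices. It warns if any probe kept by pfilter is missing from the normalised object instead of failing with a subscript error.

```r
# Hypothetical helper, not part of minfi or wateRmelon.
# norm: normalised object (e.g. output of preprocessNoob on the unfiltered data)
# filt: filtered object   (e.g. output of pfilter on the same data)
subset_to_filtered <- function(norm, filt) {
  # probes and samples present in both objects, in the order given by 'filt'
  keep_probes  <- intersect(rownames(filt), rownames(norm))
  keep_samples <- intersect(colnames(filt), colnames(norm))

  # report probes that passed filtering but are absent after normalisation
  dropped <- length(rownames(filt)) - length(keep_probes)
  if (dropped > 0) {
    warning(dropped, " filtered probe(s) not found in the normalised object")
  }

  norm[keep_probes, keep_samples, drop = FALSE]
}

# Intended use (assumes 'data' is the RGChannelSetExtended from the question):
# norm_filt <- subset_to_filtered(preprocessNoob(data), pfilter(data))
```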


In minfi we have the function subsetByLoci() which might be useful in these situations. The issue is that an RGChannelSet is indexed by Addresses (kind of locations of the probes) while a MethylSet (and friends) is indexed by CpG names and we sometimes have multiple addresses <-> 1 CpG because of the array design.

According to its documentation, this function will "Subset an RGChannelSet by CpG loci." The usage should be pretty clear from the man page:

   loci <- c("cg00050873", "cg00212031", "cg00213748", "cg00214611")
   subsetByLoci(RGsetEx.sub, includeLoci = loci)
   subsetByLoci(RGsetEx.sub, excludeLoci = loci)

Best, Kasper
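
Putting the two suggestions in this thread together, one possible way to filter before normalization is sketched below. This is an untested assumption-laden sketch, not a method confirmed by either package's authors: the function name filter_then_noob and the argument rgSet are hypothetical, and it assumes that pfilter returns an object whose row names are CpG names and whose column names are sample names, as described above. Note also that removing probes before preprocessNoob may change its background estimate, so results could differ somewhat from normalizing first and subsetting afterwards.

```r
# Hypothetical wrapper; requires minfi and wateRmelon to be installed.
# rgSet: an RGChannelSetExtended, e.g. from read.metharray.exp(..., extended = TRUE)
filter_then_noob <- function(rgSet) {
  # 1. pfilter identifies the samples and probes to keep (returns a MethylSet)
  filt <- wateRmelon::pfilter(rgSet)

  # 2. drop failed samples, then map the surviving CpG names back to
  #    RGChannelSet addresses with subsetByLoci
  rgKept <- rgSet[, colnames(filt)]
  rgKept <- minfi::subsetByLoci(rgKept, includeLoci = rownames(filt))

  # 3. normalise the filtered RGChannelSet
  minfi::preprocessNoob(rgKept)
}

# Intended use:
# norm_filt <- filter_then_noob(data)
```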

