Question: Combining pfilter (wateRmelon) and preprocessNoob (minfi)
3 months ago by
sara.blocquiaux0 wrote:

Dear all,

I want to combine the packages minfi and wateRmelon to analyse my EPIC methylation data. I loaded my data into R with minfi as an RGChannelSetExtended. Next, I wanted to filter out low-quality samples and probes with the pfilter function of wateRmelon. The downside is that the function returns a MethylSet, which I cannot pass to the preprocessNoob function of minfi (it requires an RGChannelSet). I noticed that in older versions of wateRmelon it was possible to get an RGChannelSetExtended back.

Is there anyone who can help me with this problem?

Best,

Sara

modified 3 months ago by tgorri10 • written 3 months ago by sara.blocquiaux0

Is the goal to filter on detection p-values, or are you wanting to filter on bead count as well?

written 3 months ago by James W. MacDonald
Answer: Combining pfilter (wateRmelon) and preprocessNoob (minfi)
3 months ago by
tgorri10 wrote:

Hi Sara,

Running pfilter on RGChannelSets has always been a problem because of how the detection p-values are calculated for the probes. The previous method did return an RGChannelSet, which then required the data to be filtered manually after normalization, but this was changed a while ago.

If you do want to use pfilter, the only thing I can recommend is to run pfilter on your data and store the result in an object, then normalize the original (unfiltered) data object with preprocessNoob, and finally subset the normalized data by the row and column names of the pfilter object.

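library(minfi)
library(wateRmelon)
filt <- pfilter(data)                              # MethylSet holding only probes/samples that pass
norm <- preprocessNoob(data)                       # normalize the original, unfiltered RGChannelSet
norm_filt <- norm[rownames(filt), colnames(filt)]  # keep only what pfilter retained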


I know it is not the ideal method, as we do recommend that you apply pfilter prior to normalization. We will look into this and try to come up with something for the upcoming Bioconductor release.

In minfi we have the function subsetByLoci() which might be useful in these situations. The issue is that an RGChannelSet is indexed by Addresses (kind of locations of the probes) while a MethylSet (and friends) is indexed by CpG names and we sometimes have multiple addresses <-> 1 CpG because of the array design.
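
To make the indexing mismatch concrete, here is a small base-R sketch. The manifest, CpG names and addresses are made up for illustration; the only fact it relies on is that some probes (Type I) use two addresses while others (Type II) use one, so a list of CpG names has to be expanded into a list of addresses before it can index an RGChannelSet.

```r
# Toy manifest with hypothetical CpG names and addresses (not real array data).
manifest <- data.frame(
  cpg      = c("cg_typeI", "cg_typeII"),
  type     = c("I", "II"),
  addressA = c("10001", "20001"),
  addressB = c("10002", NA),   # only Type I probes have a second address
  stringsAsFactors = FALSE
)

# Collect every address that belongs to the CpGs we want to keep.
addresses_for <- function(cpgs, manifest) {
  rows  <- manifest[manifest$cpg %in% cpgs, ]
  addrs <- c(rows$addressA, rows$addressB)
  addrs[!is.na(addrs)]
}

addresses_for("cg_typeI", manifest)   # two addresses for one CpG
addresses_for("cg_typeII", manifest)  # one address for one CpG
```

This is the kind of bookkeeping that has to happen when an address-indexed object is subset by CpG name.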

The documentation describes this function as "Subset an RGChannelSet by CpG loci." The usage should be pretty clear from the man page:

library(minfi)
library(minfiData)  # provides the example object RGsetEx.sub
loci <- c("cg00050873", "cg00212031", "cg00213748", "cg00214611")
subsetByLoci(RGsetEx.sub, includeLoci = loci)
subsetByLoci(RGsetEx.sub, excludeLoci = loci)
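
For the original question, one possible way to put the pieces together is sketched below: use the probes and samples that survive pfilter to subset the RGChannelSet, then run preprocessNoob on the result. This is an untested sketch, not a recommended pipeline. The function name `filter_then_noob` and the argument `rgset` are hypothetical, and it assumes that subsetByLoci accepts an RGChannelSetExtended and that the sample names of the pfilter output match those of the input.

```r
# Sketch only: assumes minfi and wateRmelon are installed and that `rgset`
# is an RGChannelSetExtended, e.g. from read.metharray.exp(..., extended = TRUE).
filter_then_noob <- function(rgset) {
  # pfilter returns a MethylSet containing only probes and samples that pass.
  filt <- wateRmelon::pfilter(rgset)

  # Translate the surviving CpG names back to addresses in the RGChannelSet.
  rgset_filt <- minfi::subsetByLoci(rgset, includeLoci = rownames(filt))

  # Drop the samples that pfilter removed.
  rgset_filt <- rgset_filt[, colnames(filt)]

  # Normalize the filtered RGChannelSet.
  minfi::preprocessNoob(rgset_filt)
}
```

Whether Noob behaves the same on a probe-filtered RGChannelSet as on the full array is worth checking on your own data before relying on this.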


Best, Kasper

written 3 months ago by Kasper Daniel Hansen
