Extracting normalized counts in MEDIPS
lshepard ▴ 40
@lshepard-7434
Last seen 8 days ago
United States

Hello,

Just a simple question about the MEDIPS package. I have found it to be a great resource for MeDIP-seq. However, one point that is bugging me is that the counts in the resulting file are not the TMM-normalized counts used for differential coverage testing (by the MEDIPS.meth functions). Is anyone aware of how I can extract the normalized values?

Thanks!

MEDIPS MeDIP-seq • 381 views
Lukas Chavez ▴ 570
@lukas-chavez-5781
Last seen 3.7 years ago
USA/La Jolla/UCSD

Hi lshepard,

Thank you for your positive feedback about MEDIPS! Unfortunately, the implementation does not allow TMM-normalized counts to be output. The fold change, group-wise means and p-values are all calculated based on the normalized counts, though.

To also output library-size-normalized counts, it would be necessary to edit the MEDIPS.diffMeth and MEDIPS.meth functions. However, there are no plans on our side to do so. Instead, I recommend looking into our new package qsea! It might not be as straightforward to use as MEDIPS, but on the other hand it is much more flexible.

All the best, Lukas


Hello Lukas. Thank you for the reply. After coming back to this after a while and normalizing by quantile, it seems that the counts are actually quantile normalized and that this output is what appears in the results table.

So I would like to double check that the following conclusions are correct:

When quantile normalization is selected:

• The count columns of the results table contain quantile normalized values, which is what the vignette states, but I am not sure why it is labeled a "warning":

Warning: In case of quantile normalisation, the counts - but not rpkm and rms values - of the returned result table will be quantile normalized.

• Is it then suitable to use the quantile normalized counts for visualization of gene-level methylation? I would prefer to use these values if they are the same ones used by edgeR.

Thanks so much for checking!


Hi,

When quantile normalization is selected:

The count columns of the results table contain quantile normalized values, which is what the vignette states, but I am not sure why it is labeled a "warning":

Warning: In case of quantile normalisation, the counts - but not rpkm and rms values - of the returned result table will be quantile normalized.

Yes, the ‘count’ columns in the result table contain the quantile normalized values. The warning is there to emphasize that the rpkm and rms values in the other columns are not quantile normalized.
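For readers unfamiliar with the term, here is a minimal Python sketch of textbook quantile normalization across samples. It is an illustration only: the function name `quantile_normalize` and the example counts are hypothetical, and nothing in this thread confirms that MEDIPS handles details such as ties in exactly this way.

```python
def quantile_normalize(samples):
    """Quantile-normalize samples given as equal-length lists (one list per sample).

    Textbook version: every sample is forced onto the same reference
    distribution, defined as the mean across samples at each rank.
    Ties are broken by position, which is a simplification.
    """
    n_windows = len(samples[0])
    # Sort each sample independently.
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the sorted values at each rank.
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_windows)
    ]
    normalized = []
    for s in samples:
        # Window indices ordered from lowest to highest count in this sample.
        order = sorted(range(n_windows), key=lambda i: s[i])
        out = [0.0] * n_windows
        for rank, window in enumerate(order):
            out[window] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical window counts for three samples (four windows each).
counts = [
    [5, 2, 3, 4],
    [40, 10, 30, 20],
    [3, 4, 6, 8],
]
normalized = quantile_normalize(counts)
```

After normalization every sample contains the same set of values, only assigned to different windows according to each sample's own ranking. This is why the normalized counts are comparable across replicates.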

Is it then suitable to use the quantile normalized counts for visualization of gene-level methylation? I would prefer to use these values if they are the same ones used by edgeR.

The quantile normalized counts are indeed what edgeR sees (in case you opted for that normalization). Whether it makes sense to visualize gene-level methylation values is another topic (do you mean average methylation values across a gene?).

All the best, Lukas


Hi Lukas, thank you for the quick reply!

Understood on the warning. Makes sense now.

Regarding visualization: simply put, yes, on a "per row" basis from the MEDIPS output. In short, my workflow uses specific windows as ROIs (e.g. promoters and gene bodies, which are later annotated with gene names), so for each row the methylation quantification covers one of these ranges.

The goal is to plot methylation values for each biological replicate from ranges of interest. For that we need normalized methylation (very similar to how one can do this for RNA-seq with DESeq2) to plot it accurately. At first, when using TMM and looking at the counts, the values did not make much sense, which is why I originally asked the question.

But now, with quantile normalization, the values look a lot better and seem appropriate for representing normalized methylation across the windows we define (i.e. each row of the MEDIPS table).

It's worth noting that the version of MEDIPS used was 1.34.0, but I don't think this has changed in the newest one (some values differ, but I believe the concept is the same as far as my main questions are concerned).


Hi lshepard,

I am glad to hear that the quantile normalized values give you reasonable visualizations. The TMM normalization happens within edgeR and there will be a scaling factor that, I think, goes into the model. Quantile normalization is done by MEDIPS and the results are provided to edgeR. Therefore, I can write out the quantile normalized values in the result table.
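To illustrate what a TMM scaling factor does to the raw counts, here is a rough Python sketch. It assumes the usual edgeR convention that the effective library size is the library size multiplied by the normalization factor, and it reports counts per million against that effective size. The function name `normalized_cpm`, the counts, the library sizes and the factors are all hypothetical; in practice the factors would come from edgeR itself, and this is not something MEDIPS writes out.

```python
def normalized_cpm(counts, lib_sizes, norm_factors):
    """Counts per million using effective library sizes.

    counts:       one list of window counts per sample
    lib_sizes:    total mapped reads per sample
    norm_factors: per-sample scaling factors (e.g. TMM factors, assumed given)
    """
    result = []
    for sample_counts, lib_size, factor in zip(counts, lib_sizes, norm_factors):
        # Effective library size = raw library size scaled by the factor.
        effective = lib_size * factor
        result.append([c * 1e6 / effective for c in sample_counts])
    return result


# Hypothetical data: two samples, three windows each.
window_counts = [[10, 50, 0], [20, 90, 5]]
lib_sizes = [1_000_000, 2_000_000]
tmm_factors = [1.0, 0.8]  # assumed values, not computed here

cpm = normalized_cpm(window_counts, lib_sizes, tmm_factors)
```

Unlike quantile normalization, this leaves the shape of each sample's count distribution untouched and only rescales it. That difference may explain why the two approaches can look quite different when plotted per window.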

I think your approach makes total sense. It's always just a question of what a reasonable region of interest is. Does it make sense to calculate a mean methylation value across the entire gene body? Not sure, maybe, maybe not. It depends on your question. The only other thing is that the counts / quantile normalized counts are not normalized by CpG density. While this might be negligible when you do a differential analysis, you still might want to plot actual %methylation values. The rms values in MEDIPS were supposed to reflect that, but honestly I would strongly recommend the qsea package for transforming counts into %methylation values (if that's something you want to do).

All the best, Lukas