missMethyl package: gsameth bias plots

Hi all,

I am using the Illumina Infinium HumanMethylation450 assay and want to do gene ontology testing using the gsameth function in the missMethyl package.

For the generated plots that show the bias resulting from the differing number of CpG probe sites per gene, what is the significance of the fitted curve? Why isn't a straight line of best fit used?

Thank you!



Hi Evani

I'm not 100% sure I understand your question. Setting plot.bias=TRUE produces a plot that shows the proportion of significantly differentially methylated (DM) genes in bins of ~200 genes. Genes are allocated to bins based on the number of CpGs annotated to each gene, so each point in the plot represents the proportion of DM genes out of the ~200 genes assigned to that bin. The blue line is a lowess fit through the points. Lowess is a robust local fit that can take any shape (i.e. it is not constrained to be a straight line, or a polynomial of order 2, etc.), which is why a straight line of best fit is not used.
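To make the binning concrete, here is a minimal sketch in plain Python. It is not the missMethyl code (the package is written in R), and everything in it is an assumption for illustration: the function names bin_proportions and local_linear_fit are hypothetical, the data are simulated, the x-coordinate of each point is taken to be the mean CpG count in the bin, and the smoother is a simplified tricube-weighted local linear fit without the robustness iterations that a real lowess performs.

```python
import random


def bin_proportions(n_cpgs, is_dm, bin_size=200):
    """Sort genes by CpG count, cut them into consecutive bins of
    `bin_size` genes (the last bin may be smaller), and return one
    (mean CpG count, proportion of DM genes) pair per bin."""
    order = sorted(range(len(n_cpgs)), key=lambda i: n_cpgs[i])
    points = []
    for start in range(0, len(order), bin_size):
        idx = order[start:start + bin_size]
        mean_cpgs = sum(n_cpgs[i] for i in idx) / len(idx)
        prop_dm = sum(1 for i in idx if is_dm[i]) / len(idx)
        points.append((mean_cpgs, prop_dm))
    return points


def local_linear_fit(xs, ys, frac=2 / 3):
    """Simplified lowess-style smoother: at each x, fit a straight line
    to the nearest `frac` of the points using tricube weights.
    No robustness iterations are done (an illustrative shortcut)."""
    n = len(xs)
    k = min(n, max(2, int(round(frac * n))))
    fitted = []
    for x0 in xs:
        # Bandwidth = distance to the k-th nearest point (guard against 0).
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Simulated example (made-up numbers): genes with more CpGs are more
# likely to be called DM, which is the bias the plot is meant to reveal.
random.seed(1)
n_cpgs = [random.randint(1, 50) for _ in range(2000)]
is_dm = [random.random() < 0.05 + 0.004 * n for n in n_cpgs]

points = bin_proportions(n_cpgs, is_dm, bin_size=200)
xs = [p[0] for p in points]
ys = [p[1] for p in points]
for x, y, f in zip(xs, ys, local_linear_fit(xs, ys)):
    print(f"mean CpGs {x:5.1f}  proportion DM {y:.3f}  smoothed {f:.3f}")
```

Each printed row corresponds to one point in the bias plot, and the smoothed column plays the role of the blue line. Because the line is fitted locally, it can bend wherever the data bend, which a single straight line cannot do.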

It is meant to help you eyeball the relationship between the number of CpGs associated with each gene and the proportion of genes called "significant". The expectation is that the more CpGs are associated with a gene, the more likely you are to call that gene "significant", and hence you would expect the blue line to increase from left to right. If you find that the line looks flat, then it is not as important to account for the bias in the data, although it won't hurt to use prior.prob=TRUE, as the relationship is determined empirically from the data.

The curve is not meant to have a significance measure associated with it. It is there to help you understand the bias in your data.



