Question

dmpFinder for other objects

neyousha • 0

@4a906e85

Last seen 16 months ago

Canada

I have been looking at the dmpFinder function, and I was wondering whether it could work with beta values (as a matrix) and phenotypes (as a vector, for example race) without having a MethylSet or other methylation object. My only worry is how the function would match the correct phenotypes to the correct sample IDs in the beta matrix. Since the sample IDs are the columns of the beta matrix, I am not sure what to do. Any suggestions would be appreciated. Thank you!

methylset minfi dmpFinder betaHMM beta • 1.2k views

ADD COMMENT • link updated 16 months ago by James W. MacDonald 68k • written 16 months ago by neyousha • 0

score 1 · Answer 1 · 2024-10-07

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 1 hour ago

United States

The help page is helpful for answering your question:

Usage:

     dmpFinder(dat, pheno, type = c("categorical", "continuous"),
         qCutoff = 1, shrinkVar = FALSE)

Arguments:

     dat: A 'MethylSet' or a 'matrix'.

   pheno: The phenotype to be tested for association with methylation.

The expectation is that you have samples in columns and beta values (ordered along the genome) in rows. The phenotypic data is also assumed to be in the same order as the columns of your matrix.
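Because the function matches phenotypes to samples purely by position, it can help to reorder the phenotype table explicitly before calling it. The following is a minimal base-R sketch under assumed toy data: the objects `beta` and `pheno_df` and the column names `sample_id` and `race` are hypothetical, not taken from the original post.

```r
# Hypothetical toy data: 3 CpGs (rows) x 4 samples (columns).
set.seed(1)
beta <- matrix(runif(12, 0.05, 0.95), nrow = 3,
               dimnames = list(paste0("cg", 1:3), c("S1", "S2", "S3", "S4")))

# Hypothetical phenotype table, listed in a different order from the columns.
pheno_df <- data.frame(sample_id = c("S3", "S1", "S4", "S2"),
                       race = c("B", "A", "B", "A"),
                       stringsAsFactors = FALSE)

# Reorder the phenotype rows to follow the column order of the matrix.
idx <- match(colnames(beta), pheno_df$sample_id)
stopifnot(!anyNA(idx))  # every matrix column must have a phenotype row
pheno_df <- pheno_df[idx, ]
stopifnot(identical(pheno_df$sample_id, colnames(beta)))

# Vector to pass as the 'pheno' argument.
pheno <- pheno_df$race

# With minfi installed, the call would then be along the lines of:
# dmp <- minfi::dmpFinder(beta, pheno = pheno, type = "categorical")
```

The two `stopifnot()` checks fail loudly if a sample is missing from the phenotype table or if the order still disagrees, which is safer than trusting that the two objects happen to line up.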

ADD COMMENT • link 16 months ago James W. MacDonald 68k

Hi James, thank you very much for responding. So if I have the matrix average_beta below:

[image: average_beta_matrix]

and the phenotype dataset pheno below:

[image: pheno]

where the sample IDs in the matrix columns match the IDs in the pheno rows, would dmpFinder give me a correct answer? Thank you! Neyousha

ADD REPLY • link 16 months ago neyousha • 0

Yes, but you should probably use M-values instead of betas (convert using logit2).

However, dmpFinder is just a way to allow people to use lmFit from limma to fit a model without having to figure out how to use lmFit. If you care to fit more than just one coefficient you will have to use lmFit directly.
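As a rough illustration of fitting more than one coefficient with lmFit directly, here is a minimal sketch. The data are simulated, and the objects `M` and `pheno_df` and the covariates `race` and `age` are hypothetical. The limma calls are guarded so that the design-matrix part still runs when limma is not installed.

```r
# Hypothetical toy data: 50 CpGs x 8 samples of M-values, plus two covariates.
set.seed(42)
M <- matrix(rnorm(50 * 8), nrow = 50,
            dimnames = list(paste0("cg", 1:50), paste0("S", 1:8)))
pheno_df <- data.frame(race = factor(rep(c("A", "B"), each = 4)),
                       age = c(34, 51, 47, 29, 62, 40, 55, 38),
                       row.names = colnames(M))

# Design matrix with more than one coefficient (race adjusted for age).
design <- model.matrix(~ race + age, data = pheno_df)

if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(M, design)   # one linear model per CpG
  fit <- limma::eBayes(fit)        # moderated statistics
  # Per-CpG statistics for the race effect, adjusted for age.
  top <- limma::topTable(fit, coef = "raceB", number = Inf)
  print(head(top))
}
```

The rows of `design` must be in the same order as the columns of `M`, for the same reason the phenotype vector must be when calling dmpFinder.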

ADD REPLY • link 16 months ago James W. MacDonald 68k

Thank you very much, James. Would you mind explaining why M-values should be used instead of beta values? I only have beta values available at this point.

ADD REPLY • link 16 months ago neyousha • 0

You are better off using M-values instead of beta values because you are fitting a conventional linear regression, in which case it's better if the underlying distribution of your data is at least 'hump shaped'. This will be true for M-values, but not necessarily for beta values, which are bounded between 0 and 1 and tend to be clustered at either extreme. If you have a large enough N you can assume the central limit theorem will be in effect, in which case the underlying distribution doesn't matter (but the less hump-shaped the data, the larger the N you need for the CLT to kick in).

It's just easier to defend using M-values because they are 'normal-ish' whereas beta values are not. If you really want to use betas, you might consider using DSS which models the data assuming a beta distribution.

But anyway, the logit2 function will convert your betas to M-values, so you can have M-values if you so desire.
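For reference, the conversion is the base-2 logit, M = log2(beta / (1 - beta)). Below is a minimal base-R sketch of that transform. The helper name `beta_to_m` and its `offset` argument are hypothetical and are not part of minfi. The clamping step is an added assumption to keep betas of exactly 0 or 1 from producing infinite values.

```r
# Hypothetical helper: base-2 logit of beta values, M = log2(beta / (1 - beta)).
beta_to_m <- function(beta, offset = 1e-6) {
  # Clamp into [offset, 1 - offset] so exact 0 or 1 does not give -Inf or Inf.
  b <- pmin(pmax(beta, offset), 1 - offset)
  log2(b / (1 - b))
}

# Hypothetical toy beta matrix: 2 CpGs (rows) x 3 samples (columns).
beta <- matrix(c(0.5, 0.8, 0.2, 0, 1, 0.5), nrow = 2,
               dimnames = list(c("cg1", "cg2"), c("S1", "S2", "S3")))
M <- beta_to_m(beta)

# A beta of 0.5 maps to an M-value of 0; betas near 0 or 1 map to large
# negative or positive M-values.
print(M)
```

With minfi loaded, `logit2(beta)` performs the same transform without the clamping, so betas of exactly 0 or 1 would need to be handled beforehand.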

ADD REPLY • link 16 months ago James W. MacDonald 68k