Is it possible to run DMRcate directly with genomic positions instead of probe names?
Jeni • 0
@jeni-23673
Spain

Hi!

I am trying to run DMRcate on a dataset downloaded from a paper. Instead of probe names with their corresponding beta values, this dataset consists of genomic positions with their corresponding beta values.

When I run cpg.annotate I have to specify arraytype = "EPIC" or "450K", but these data are labelled as neither. So, is it possible to give just genomic positions to cpg.annotate and get an annotation for finding differentially methylated regions?

Thanks!

DMRcate • 376 views
Tim Peters ▴ 120
@tim-peters-7579
Australia

Hi Jeni,

It would be a very odd paper indeed that didn't specify which platform their assay was run on. Doesn't the Methods section say anything at all about this? The first thing I'd do is read it to find out how the beta values were generated.

Assuming that the Methods section does say which array type was used, you can rename the rows of your matrix by matching each chromosome and position to its probe ID. For example, for EPIC, load the annotation like so:

library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)  # attach the annotation package first
data(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)
data(Locations)  # probe ID -> chromosome, position, strand
Locations
DataFrame with 865859 rows and 3 columns
                   chr       pos      strand
           <character> <integer> <character>
cg18478105       chr20  61847650           -
cg09835024        chrX  24072640           -
cg14361672        chr9 131463936           +
cg01763666       chr17  80159506           +
cg12950382       chr14 105176736           +
...                ...       ...         ...
cg23079522        chr3 160569628           -
cg16818145        chr3 182782277           -
cg14585103        chr8 139940608           -
cg10633746       chr17  18164442           +
cg12623625        chr1  17946923           +


And then use whatever string formatting is appropriate to your rownames to rename them to the probe IDs.
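
As a rough illustration of that renaming step, here is a minimal base R sketch. It assumes the matrix rownames look like "chr:pos" (a hypothetical format; adjust the key to whatever your rownames actually look like). The small `locs` table is only a toy stand-in for `Locations`, and `betas` and `betas_probes` are made-up names.

```r
# Toy stand-in for the Locations table (in real use, take chr and pos
# from Locations in IlluminaHumanMethylationEPICanno.ilm10b4.hg19).
locs <- data.frame(
  chr = c("chr20", "chrX", "chr9"),
  pos = c(61847650L, 24072640L, 131463936L),
  row.names = c("cg18478105", "cg09835024", "cg14361672"),
  stringsAsFactors = FALSE
)

# Hypothetical beta matrix whose rownames are "chr:pos" strings.
# The third row (chr1:12345) deliberately matches no probe.
betas <- matrix(
  c(0.10, 0.80, 0.55, 0.30,
    0.12, 0.79, 0.50, 0.33),
  ncol = 2,
  dimnames = list(
    c("chr9:131463936", "chr20:61847650", "chr1:12345", "chrX:24072640"),
    c("sample1", "sample2")
  )
)

# Build the same "chr:pos" key from the annotation and look up each row.
key <- paste(locs$chr, locs$pos, sep = ":")
idx <- match(rownames(betas), key)

# Keep only positions that correspond to a probe, then rename them.
keep <- !is.na(idx)
betas_probes <- betas[keep, , drop = FALSE]
rownames(betas_probes) <- rownames(locs)[idx[keep]]
```

Rows whose position matches no probe are dropped here. If many rows fail to match, that may be a sign the data come from a different array or genome build than assumed.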

Best, Tim

@james-w-macdonald-5106
United States

You don't need to specify the array type if you provide a GenomicRatioSet. And you can look at ?GenomicRatioSet-class to see how you construct one of those.


The problem is that I cannot create a GenomicRatioSet because I only have the matrix of beta values; I couldn't obtain anything more (such as IDAT files). My intention is to perform a differential methylation analysis using the beta value matrix that I downloaded from the provided material.


Of course you can create a GenomicRatioSet. Why do you think you can't? Is there something in the help page that you don't understand? Here's a fake example:

> library(minfi)
> fakebetas <- matrix(runif(10000), 1000)   # 1000 positions x 10 samples
> fakegr <- GRanges("chr1", IRanges(seq(1, 2000, 2), width = 1))
> fakeGRatioSet <- GenomicRatioSet(fakegr, fakebetas)
> fakeGRatioSet
class: GenomicRatioSet
dim: 1000 10
assays(1): Beta
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
Annotation
array:
Preprocessing
Method: NA
minfi version: NA
Manifest version: NA



You would obviously use your real beta values, and construct an appropriate GRanges object using the 'corresponding genomic positions' that you say you have. Or if it's really Illumina data (as Tim Peters seems to think), then you can just do what he suggested.
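
For real data, one possible way to get from the matrix to those objects is sketched below. It assumes the rownames encode positions as "chr:pos" (a hypothetical format), and the names `betas`, `coords`, `gr` and `grset` are made up for illustration. The Bioconductor step is guarded so that the parsing part still runs without minfi installed.

```r
# Hypothetical beta matrix with "chr:pos" rownames (an assumed format).
betas <- matrix(
  c(0.10, 0.80, 0.55,
    0.12, 0.79, 0.50),
  ncol = 2,
  dimnames = list(c("chr1:1000", "chr1:1500", "chr2:300"),
                  c("sample1", "sample2"))
)

# Split each rowname into a chromosome and an integer position.
parts <- do.call(rbind, strsplit(rownames(betas), ":", fixed = TRUE))
coords <- data.frame(
  chr = parts[, 1],
  pos = as.integer(parts[, 2]),
  stringsAsFactors = FALSE
)

# Build the GRanges and GenomicRatioSet only if the packages are available.
if (requireNamespace("minfi", quietly = TRUE) &&
    requireNamespace("GenomicRanges", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE)) {
  gr <- GenomicRanges::GRanges(
    seqnames = coords$chr,
    ranges = IRanges::IRanges(start = coords$pos, width = 1)
  )
  # Give the ranges the same names as the matrix rows so the two agree.
  names(gr) <- rownames(betas)
  grset <- minfi::GenomicRatioSet(gr, Beta = betas)
}
```

The resulting `grset` plays the role of `fakeGRatioSet` in the example above.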