Hi,
If you have only one region of interest you can do something like this:
library(BSgenome.Dmelanogaster.UCSC.dm3)
my_region_of_interest <- GRanges("chr3LHet:2500000-2555491")
dna <- getSeq(BSgenome.Dmelanogaster.UCSC.dm3,
my_region_of_interest)
CpG_sites <- shift(GRanges(seqnames(my_region_of_interest),
ranges(matchPattern("CG", dna[[1]]))),
start(my_region_of_interest) - 1)
Then:
CpG_sites
# GRanges object with 1846 ranges and 0 metadata columns:
# seqnames ranges strand
# <Rle> <IRanges> <Rle>
# [1] chr3LHet [2500021, 2500022] *
# [2] chr3LHet [2500058, 2500059] *
# [3] chr3LHet [2500060, 2500061] *
# [4] chr3LHet [2500075, 2500076] *
# [5] chr3LHet [2500080, 2500081] *
# ... ... ... ...
# [1842] chr3LHet [2555387, 2555388] *
# [1843] chr3LHet [2555427, 2555428] *
# [1844] chr3LHet [2555440, 2555441] *
# [1845] chr3LHet [2555464, 2555465] *
# [1846] chr3LHet [2555482, 2555483] *
# -------
# seqinfo: 1 sequence from an unspecified genome; no seqlengths
If you have a small number of regions (e.g. < 1000), you could loop over them and do the above in the loop. However, if you have thousands of regions or more, this might not be very efficient. In that case it might better to start by finding all the CpG sites for all chromosomes:
all_CpG_sites <- vmatchPattern("CG", BSgenome.Dmelanogaster.UCSC.dm3)
all_CpG_sites <- unstrand(all_CpG_sites[strand(all_CpG_sites) == "+"])
all_CpG_sites
# GRanges object with 6650479 ranges and 0 metadata columns:
# seqnames ranges strand
# <Rle> <IRanges> <Rle>
# [1] chr2L [ 1, 2] *
# [2] chr2L [11, 12] *
# [3] chr2L [83, 84] *
# [4] chr2L [85, 86] *
# [5] chr2L [91, 92] *
# ... ... ... ...
# [6650475] chrUextra [29003849, 29003850] *
# [6650476] chrUextra [29003856, 29003857] *
# [6650477] chrUextra [29004133, 29004134] *
# [6650478] chrUextra [29004383, 29004384] *
# [6650479] chrUextra [29004393, 29004394] *
# -------
# seqinfo: 15 sequences from an unspecified genome
and then to subset all_CpG_sites
by overlaps with the GRanges object containing your regions of interest:
subsetByOverlaps(all_CpG_sites, my_regions_of_interest)
Cheers,
H.