Entering edit mode
Florence Cavalli
▴
50
@florence-cavalli-4588
Last seen 9.2 years ago
Hello,
I am working with 450k methylation data.
- Is there a way to modify the annotation linked to an MethylSet
object,
(or is this something we are not supposed to do)?
I created my large MethylSet object under the pervious version of R
and
Bioc and at the time the annotation package
IlluminaHumanMethylation450kannotation.ilmn.v1.2 was used. However
this
package is not available in BioC 2.13
but IlluminaHumanMethylation450kanno.ilmn12.hg19 replaced it (as far
as I
understood).
I would like to be able to do this since I'd like to use the
new cpgCollapse() and blockFinder() functions from minfi that are
under
BioC 2.13. Indeed cpgCollapse() calls getAnnotation() and since I
cannot
load the annotation package in R-3.0.2/BioC 2.13 an error is returned
(and
the new functions are not available under the previous version of
BioC).
Is there a way around this problem ? I tried the annotation() function
but
without success. Or do I have to reconstruct my MethylSet object from
the
very beginning reading the raw data using the library
IlluminaHumanMethylation450kanno.ilmn12.hg19, normalysing (and re-
running
my clustering to be sure to have consistent results) ? It is just that
I
would save quite some time if I could actually convert my object.
- Also I am planning to use the the following workflow to run
blockFinder()
and find differentially methylated regions.
To be able to ultimately run blockFinder(), using your the minfi test
data
I created a GenomicMethylSet with mapToGenome() function from
the MethylSet object then I ran cpgCollapse() to obtain an
GenomicRatioSet object and finally used this object and a design
matrix to
run blockFinder().
See example:
As in the vignette:
library(IlluminaHumanMethylation450kmanifest)
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
library(minfi)
require(minfiData)
baseDir <- system.file("extdata", package = "minfiData")
targets <- read.450k.sheet(baseDir)
targets
sub(baseDir, "", targets$Basename)
RGset <- read.450k.exp(base = baseDir, targets = targets)
RGset
pd <- pData(RGset)
pd[,1:4]
Then
MSet.norm <- preprocessIllumina(RGset, bg.correct = TRUE, normalize =
"controls")
Mset.swan <- preprocessSWAN(RGset, MSet.norm)
mset <- Mset.swan
> mset
MethylSet (storageMode: lockedEnvironment)
assayData: 485512 features, 6 samples
element names: Meth, Unmeth
phenoData
sampleNames: 5723646052_R02C02 5723646052_R04C01 ...
5723646053_R06C02 (6 total)
varLabels: Sample_Name Sample_Well ... filenames (13 total)
varMetadata: labelDescription
Annotation
array: IlluminaHumanMethylation450k
annotation: ilmn12.hg19
Preprocessing
Method: SWAN (based on a MethylSet preprocesses as 'Illumina,
bg.correct
= TRUE, normalize = controls, reference = 1'
minfi version: 1.8.1
Manifest version: 0.4.0
>
> class(mset)
[1] "MethylSet"
attr(,"package")
[1] "minfi"
> gmSet = mapToGenome(mset)
> class(gmSet)
[1] "GenomicMethylSet"
attr(,"package")
[1] "minfi"
> gmSet_cpgCollapse=cpgCollapse(gmSet,what="M",returnBlockInfo=FALSE)
[cpgCollapse] Creating annotation.
Error in cpgCollapseAnnotation(gr, relationToIsland, islandName,
maxGap =
maxGap, :
object has to be ordered.
Not sure about the following !? But I seems that sorting the object
works
> gmSet_s=sort(gmSet)
>
gmSet_cpgCollapse=cpgCollapse(gmSet_s,what="M",returnBlockInfo=FALSE)
[cpgCollapse] Creating annotation.
[cpgCollapseAnnotation] Clustering islands and clusters of probes.
[cpgCollapseAnnotation] Computing new annotation.
[cpgCollapseAnnotation] Defining blocks.
[cpgCollapse] Collapsing data......
> class(gmSet_cpgCollapse)
[1] "GenomicRatioSet"
attr(,"package")
[1] "minfi"
## Sample groups
## > targets[,"Sample_Group"]
## [1] "GroupA" "GroupA" "GroupB" "GroupB" "GroupA" "GroupB"
> design <- model.matrix(~ 0+factor(c(1,1,2,2,1,2)))
> colnames(design) <- c("groupA", "groupB")
> head(design)
groupA groupB
1 1 0
2 1 0
3 0 1
4 0 1
5 1 0
6 0 1
> Block_g2=blockFinder(gmSet_cpgCollapse, design=design, coef = 2,
what =
"M",pickCutoff=TRUE,smooth=FALSE)
[bumphunterEngine] Using a single core (backend: doSEQ, version:
1.4.1).
[bumphunterEngine] Computing coefficients.
[bumphunterEngine] Performing 1000 permutations.
[bumphunterEngine] Computing marginal permutation p-values.
[bumphunterEngine] cutoff: 12033998108438140
[bumphunterEngine] Finding regions.
[bumphunterEngine] No bumps found!
Error in res$table$indexStart : $ operator is invalid for atomic
vectors
I assume that if some bumps are found it would not return an error!?
Would you advice to smooth the signal or not?
Is that the proper way of running this analysis ? (I wonder since I
did not
find any example for such analysis)
- Finally I noticed that in the cpgCollapse function the attribute
Relation_to_UCSC_CpG_Island and UCSC_CpG_Islands_Name from a data
frame
return by getAnnotation object are called.
However looking at the result of the getAnnotation() function called
there
isn't any Relation_to_UCSC_CpG_Island and UCSC_CpG_Islands_Name
columns
attributes (see below)? Are we supposed to use another annotation
package?
> cpgCollapse
function (object, what = c("Beta", "M"), maxGap = 500, blockMaxGap =
2.5 *
10^5, maxClusterWidth = 1500, dataSummary = colMeans, na.rm =
FALSE,
returnBlockInfo = TRUE, verbose = TRUE, ...)
{
what <- match.arg(what)
gr <- granges(object)
ann <- getAnnotation(object)
relationToIsland <- ann$Relation_to_UCSC_CpG_Island
islandName <- ann$UCSC_CpG_Islands_Name
....
....
}
class(gmSet)
gmSet
> class(gmSet)
[1] "GenomicMethylSet"
attr(,"package")
[1] "minfi"
> gmSet
class: GenomicMethylSet
dim: 485512 6
exptData(0):
assays(2): Meth Unmeth
rownames(485512): cg13869341 cg14008030 ... cg08265308 cg14273923
rowData metadata column names(0):
colnames(6): 5723646052_R02C02 5723646052_R04C01 ... 5723646053_R05C02
5723646053_R06C02
colData names(13): Sample_Name Sample_Well ... Basename filenames
Annotation
array: IlluminaHumanMethylation450k
annotation: ilmn12.hg19
Preprocessing
Method: SWAN (based on a MethylSet preprocesses as 'Illumina,
bg.correct
= TRUE, normalize = controls, reference = 1'
minfi version: 1.8.1
Manifest version: 0.4.0
A=getAnnotation(gmSet)
colnames(A)
> A=getAnnotation(gmSet)
> colnames(A)
[1] "chr" "pos"
[3] "strand" "Name"
[5] "AddressA" "AddressB"
[7] "ProbeSeqA" "ProbeSeqB"
[9] "Type" "NextBase"
[11] "Color" "Probe_rs"
[13] "Probe_maf" "CpG_rs"
[15] "CpG_maf" "SBE_rs"
[17] "SBE_maf" "Islands_Name"
[19] "Relation_to_Island" "Forward_Sequence"
[21] "SourceSeq" "Random_Loci"
[23] "Methyl27_Loci" "UCSC_RefGene_Name"
[25] "UCSC_RefGene_Accession" "UCSC_RefGene_Group"
[27] "Phantom" "DMR"
[29] "Enhancer" "HMM_Island"
[31] "Regulatory_Feature_Name" "Regulatory_Feature_Group"
[33] "DHS"
> A$Relation_to_UCSC_CpG_Island
NULL
> A$UCSC_CpG_Islands_Name
NULL
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods
[8] base
other attached packages:
[1] minfiData_0.4.2
[2] BiocInstaller_1.12.0
[3] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.2.0
[4] IlluminaHumanMethylation450kmanifest_0.4.0
[5] minfi_1.8.1
[6] bumphunter_1.2.0
[7] locfit_1.5-9.1
[8] iterators_1.0.6
[9] foreach_1.4.1
[10] Biostrings_2.30.0
[11] GenomicRanges_1.14.1
[12] XVector_0.2.0
[13] IRanges_1.20.0
[14] reshape_0.8.4
[15] plyr_1.8
[16] lattice_0.20-24
[17] Biobase_2.22.0
[18] BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] annotate_1.40.0 AnnotationDbi_1.24.0 base64_1.1
[4] beanplot_1.1 codetools_0.2-8 DBI_0.2-7
[7] digest_0.6.3 doRNG_1.5.5 genefilter_1.44.0
[10] grid_3.0.2 illuminaio_0.4.0 itertools_0.1-1
[13] limma_3.18.0 MASS_7.3-29 matrixStats_0.8.12
[16] mclust_4.2 multtest_2.18.0 nlme_3.1-111
[19] nor1mix_1.1-4 pkgmaker_0.17.4 preprocessCore_1.24.0
[22] R.methodsS3_1.5.2 RColorBrewer_1.0-5 registry_0.2
[25] rngtools_1.2.3 RSQLite_0.11.4 siggenes_1.36.0
[28] splines_3.0.2 stats4_3.0.2 stringr_0.6.2
[31] survival_2.37-4 tools_3.0.2 XML_3.98-1.1
[34] xtable_1.7-1
Thank you very much for developing these very useful packages!
Best,
Florence
--
Florence Cavalli, PhD.
The Hospital for Sick Children
Brain Tumour Research Centre
101 College Street, TMDT-11-701
Toronto, ON M5G1L7
Canada
[[alternative HTML version deleted]]