I have DNA methylation data collected from male and female subjects on the Illumina 450K array platform. I read the .IDAT files into R as a MethyLumiSet using the "methylumi" package, and I plan to pre-process the data with some of the normalization functions in the "wateRmelon" package:
    suppressPackageStartupMessages(require(methylumi))
    suppressPackageStartupMessages(require(wateRmelon))
    suppressPackageStartupMessages(require(IlluminaHumanMethylation450kanno.ilmn12.hg19))

    # Read sample data.
    # phenoData is a data.frame with the row names = SentrixPosition_barcode
    # and columns containing sample group, age and gender.
    # Barcodes are pulled from a column in phenoData called 'barcodes'.
    phenoData <- read.csv("/Users/Martens/Desktop/08272014/IDATs/Sample_Sheet.csv", header=TRUE)
    barcodes <- subset(phenoData, select=barcodes)

    # Import .IDAT files as methyLumiSet
    methyLumiSet <- methylumIDAT(barcodes = barcodes, pdat = phenoData, idatPath = "/Users/Martens/Desktop/08272014/IDATs")
As confirmation that all of the features were imported, I checked the number of rows in methyLumiSet:
    nrow(methyLumiSet)
    Features
      485577
The methyLumiSet is based on the eSet class in Biobase. I would like to remove all features on the X and Y chromosomes, as is common practice in DNA methylation analysis.
As an initial attempt, I tried to determine which probes fall on the Y chromosome using the following code with the idea that I would then remove those probes from the methyLumiSet.
methyLumiSet.ChrY <- methyLumiSet[fData(methyLumiSet)$CHROMOSOME=="Y", ]
However, when I check the number of probes, the result is 0 features:
    nrow(methyLumiSet.ChrY)
    Features
           0
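A possible explanation worth ruling out is that the CHROMOSOME column does not exist in the feature data at all. In R, `$` on a missing data.frame column returns NULL, and comparing NULL to "Y" gives a zero-length logical vector, which selects zero rows without raising an error. The following is only a minimal sketch of that behavior, using a toy data.frame as a stand-in for fData(methyLumiSet); the probe IDs and the CHR column name are hypothetical and are not taken from the real annotation.

```r
# Toy stand-in for fData(methyLumiSet); probe IDs and column names are hypothetical.
feature_data <- data.frame(
  CHR = c("1", "X", "Y", "7"),
  row.names = c("cg0001", "cg0002", "cg0003", "cg0004"),
  stringsAsFactors = FALSE
)

# A column that does not exist comes back as NULL, so the comparison is empty.
missing_col <- feature_data$CHROMOSOME == "Y"
length(missing_col)

# Indexing rows with that empty logical vector silently selects nothing.
nrow(feature_data[missing_col, , drop = FALSE])

# Inspect which columns are actually present before subsetting.
colnames(feature_data)

# Subsetting on a column that does exist drops the sex-chromosome probes.
keep <- !(feature_data$CHR %in% c("X", "Y"))
autosomal <- feature_data[keep, , drop = FALSE]
rownames(autosomal)
```

On the real object, colnames(fData(methyLumiSet)) would show which annotation columns are actually available.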
I cannot figure out why I am unable to subset features of my methyLumiSet, but a potential issue might be with the annotation. When I try to run the methylumi function 'featureFilter' to remove the X chromosome, I get the following warning and error:
methyLumiSet.Xfilt <- featureFilter(methyLumiSet, exclude.ChrX = TRUE)
    Warning message:
    In .featureFilter(eset, require.entrez = require.entrez, require.GOBP = require.GOBP, :
      HumanMethylation450k probes annotate to multiple accessions(!)
    Error in mget(featureNames(eset), envir = annotate::getAnnMap("CHR", annChip), :
      error in evaluating the argument 'envir' in selecting a method for function 'mget': Error: getAnnMap: package IlluminaHumanMethylation450k not available
When I try to install IlluminaHumanMethylation450k, I get the following:
    source("http://bioconductor.org/biocLite.R")
    biocLite("IlluminaHumanMethylation450k")
    BioC_mirror: http://bioconductor.org
    Using Bioconductor version 3.1 (BiocInstaller 1.18.3), R version 3.2.0.
    Installing package(s) ‘IlluminaHumanMethylation450k’
    Old packages: 'stringi', 'VariantAnnotation'
    Update all/some/none? [a/s/n]:
I type 'a' to update all and after updating I get the following error message.
# Warning message: package ‘IlluminaHumanMethylation450k’ is not available (for R version 3.2.0)
My only guess is that my problem subsetting probes by chromosome comes from not being able to load the annotation information, but I am stuck on how to fix it. According to the reference manual for methylumi, the package should be compatible with R version 3.2.0 and depends on IlluminaHumanMethylation450kanno.ilmn12.hg19. Even when I require this package, though, I don't understand how to link the annotation data to the methyLumiSet.
Any advice on properly annotating a methyLumiSet and/or removing the X and Y chromosomes from a methyLumiSet would be great. I am pretty new to R.
I would prefer to do this without coercing to another structure (e.g., SummarizedExperiment/minfi type of object) if possible. Thanks!