The dataset I am working on is Illumina HumanMethylation450 BeadChip data (GEO accession GSE86829), and I have to perform a similar normalization on other Illumina HumanMethylation 850k data too. Although normalised files are available, the project requires me to do it from scratch for uniformity. I have found R code for normalizing IDAT files, but it requires a .bgx manifest file as well. Posting the code below.
library(limma)
x <- read.idat(idatfiles, bgxfile)
y <- neqc(x)
My problem is that the above dataset, and many others that I am working on, do not have a .bgx file; instead the manifest is provided as a .bpm or .csv file along with the GSE86829_RAW.tar file. I am new to processing microarray data and to R libraries. Please help with code that simply normalizes all the IDAT files for every sample in a directory and writes the output as a single .txt file with one sample per column. Basically, I need the equivalent of what ReadAffy does for CEL files: something for IDAT files that does not need a .bgx file and works with a .txt/.csv manifest.
library(affy)
celpath = "/mnt/store_room/Dataset1/processing/validation/GSE65663_RAW"
data = ReadAffy(celfile.path=celpath)
Or suggest a different tool altogether that can process IDAT files easily without much coding.
Following the steps in the workflow link you provided, I tried to proceed:
Warning message: In readChar(con, nchars = n) : truncating string with embedded nuls
class: RGChannelSet
dim: 622399 15
metadata(0):
assays(2): Green Red
rownames(622399): 10600313 10600322 ... 74810490 74810492
rowData names(0):
colnames(15): GSM2309154_6264509024_R01C02 GSM2309155_6264509024_R02C02 ... GSM2309167_200190110117_R01C02 GSM2309168_200190110117_R03C02
colData names(0):
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19
[preprocessQuantile] Mapping to genome.
[preprocessQuantile] Fixing outliers.
[preprocessQuantile] Quantile normalizing.
It seems to have worked up to here. But I need the normalised values in a matrix with sample names as columns and probe IDs as rows, so I tried the code below, which throws the following error:
Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'as.matrix': no method for coercing this S4 class to a vector
Similarly with
> write.table(mSetSq, file='/mnt/store_room/DatasetM')
Error in as.vector(x) : no method for coercing this S4 class to a vector

I believe the write.matrix function is from the MASS package, and is intended to write a matrix or data.frame to a file. But that's not what you have, so it is not unexpected that it should fail. I would also note that having just the normalized values (which normalized values, btw? There are two sets!) in a matrix is almost surely not what you want. Part of the analysis of methylation data requires you to know where each CpG is located, and the GenomicRatioSet that is generated by preprocessQuantile contains that information. If you export the 'normalized values' to a file, you lose that information.
The minfi package is intended to generate objects that are then useful for analysis using other packages, and the workflow I pointed you to gives a very good explanation of how they all fit together. Is there some reason that the workflow is not useful for your purposes?
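If you do still want flat files, here is a minimal sketch of how the two sets of values could be pulled out of the GenomicRatioSet, rather than trying to coerce the object itself. It assumes minfi is installed and that the IDAT files are unpacked in one directory. The idat_dir path, the output file names, and the write_probe_matrix helper are placeholders I made up for illustration; read.metharray.exp, preprocessQuantile, getBeta and getM are minfi functions. Treat it as a starting point, not as the workflow's recommended approach.

```r
# Hypothetical helper (not part of minfi): write a numeric matrix as
# tab-delimited text, probes in rows, samples in columns, with the probe
# ID in the first column.
write_probe_matrix <- function(mat, file) {
  out <- data.frame(Probe_ID = rownames(mat), mat, check.names = FALSE)
  write.table(out, file = file, sep = "\t", quote = FALSE, row.names = FALSE)
}

# Placeholder: directory holding the unpacked IDAT files.
idat_dir <- "path/to/GSE86829_RAW"

# Only run the minfi steps if the package and the directory are available.
if (requireNamespace("minfi", quietly = TRUE) && dir.exists(idat_dir)) {
  library(minfi)
  rgSet  <- read.metharray.exp(base = idat_dir)  # RGChannelSet from the IDATs
  mSetSq <- preprocessQuantile(rgSet)            # GenomicRatioSet

  # The two sets of normalized values: beta values and M-values.
  write_probe_matrix(getBeta(mSetSq), "beta_values.txt")
  write_probe_matrix(getM(mSetSq), "m_values.txt")
}
```

Note that these text files carry only the probe IDs, so the genomic location of each CpG is gone. If you need it alongside the values, getAnnotation(mSetSq) returns the per-probe annotation and could be written out in the same way, but keeping the GenomicRatioSet for downstream analysis is still the safer route.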