I'm trying to link GEOquery and minfi.
Specifically I want to obtain beta values from the idat files on GEOquery. I was following this guide: https://kasperdanielhansen.github.io/genbioconductor/html/minfi.html, up until the preprocessing part. Meaning that I was able to obtain the RGset. Then I used my own preprocessing code to obtain the beta values. However, I cross checked them with the beta values that were on GEO and they weren't consistent.
For example, the accession number I used was GSE68777. So I went to that study on GEO and clicked on the first sample: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1681154. Then I scrolled down and clicked "Download full table" to download the samples and beta values in a text file.
Then I typed head(beta) and chose the first sample. Then I did command F in the text file for that sample and it's value there wasn't the same as the value from the beta data table. Hopefully you can help find the error.
Here is the code I'm using:
library(GEOquery) library(minfi) library("IlluminaHumanMethylation450kanno.ilmn12.hg19") library("IlluminaHumanMethylation450kmanifest") #Code copied from 450k Guide getGEOSuppFiles("GSE68777") untar("GSE68777/GSE68777_RAW.tar", exdir = "GSE68777/idat") head(list.files("GSE68777/idat", pattern = "idat")) idatFiles <- list.files("GSE68777/idat", pattern = "idat.gz$", full = TRUE) sapply(idatFiles, gunzip, overwrite = TRUE) rgSet <- read.metharray.exp("GSE68777/idat") geoMat <- getGEO("GSE68777") pD.all <- pData(geoMat[]) pD <- pD.all[, c("title", "geo_accession", "characteristics_ch1.1", "characteristics_ch1.2")] names(pD)[c(3,4)] <- c("group", "sex") pD$group <- sub("^diagnosis: ", "", pD$group) pD$sex <- sub("^Sex: ", "", pD$sex) sampleNames(rgSet) <- sub(".*_5", "5", sampleNames(rgSet)) rownames(pD) <- pD$title pD <- pD[sampleNames(rgSet),] pData(rgSet) <- pD