Loading ChAMP IdatFiles with two different ArrayTypes
FWNL15 • 0
@aba2f833
Netherlands

I want to load IDAT files from two different datasets for a ChAMP analysis. One dataset was run on the 450k array, the other on EPIC. With the script below I have to choose a single arraytype, so I get an error. Does anyone know how to solve this?

filesDATA <- './IdatFiles'
myloadDATA <- champ.load(directory = filesDATA,
                              method="ChAMP",
                              methValue="B",
                              autoimpute=TRUE,
                              filterDetP=TRUE,
                              ProbeCutoff=0,
                              SampleCutoff=0.1,
                              detPcut=0.01,
                              filterBeads=TRUE,
                              beadCutoff=0.05,
                              filterNoCG=TRUE,
                              filterSNPs=TRUE,
                              population=NULL,
                              filterMultiHit=TRUE,
                              filterXY=TRUE, # not default
                              force=FALSE,
                              arraytype=" ")

Analyse the two datasets separately, then use a meta-analysis to combine the two sets of results.

@james-w-macdonald-5106
United States

I am not familiar with ChAMP, but there is a function called combineArrays in minfi that is meant to accomplish that task.

Yuan Tian ▴ 290
@yuan-tian-13904
United Kingdom

Hi FWNL15:

Normally I don't recommend merging two datasets, let alone one EPIC and one 450K. If it were my project, I would analyse them separately and then integrate the results.

But in your case, if you really need to combine them into one big matrix, you can load them separately and then merge them by selecting the CpGs common to both datasets. So basically the pipeline would be:

myLoad1 <- champ.load("path/to/450K", arraytype="450K", ...)
myLoad2 <- champ.load("path/to/EPIC", arraytype="EPIC", ...)

# Then merge ONLY the beta data. The `merge` function might do the job as well.
commonCpG <- intersect(rownames(myLoad1$beta), rownames(myLoad2$beta))
MergedBeta <- cbind(myLoad1$beta[commonCpG, ], myLoad2$beta[commonCpG, ])
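
As a toy illustration of the merge step above, here is a minimal sketch in base R. The two small matrices are hypothetical stand-ins for myLoad1$beta and myLoad2$beta; the probe IDs, sample names and values are all invented.

```r
# Toy stand-ins for myLoad1$beta (450K) and myLoad2$beta (EPIC); all values invented.
beta450k <- matrix(c(0.10, 0.80, 0.55,
                     0.12, 0.79, 0.50),
                   nrow = 3,
                   dimnames = list(c("cg01", "cg02", "cg03"), c("S1", "S2")))
betaEPIC <- matrix(c(0.82, 0.52, 0.30,
                     0.77, 0.49, 0.33),
                   nrow = 3,
                   dimnames = list(c("cg02", "cg03", "cg04"), c("S3", "S4")))

# Keep only the probes present on both arrays; indexing both matrices with
# the same vector also puts their rows in the same order before cbind().
commonCpG <- intersect(rownames(beta450k), rownames(betaEPIC))
MergedBeta <- cbind(beta450k[commonCpG, , drop = FALSE],
                    betaEPIC[commonCpG, , drop = FALSE])

dim(MergedBeta)        # probes x samples after the merge
is.matrix(MergedBeta)  # a plain matrix, so there is no MergedBeta$beta
```

The `drop = FALSE` is only a safeguard so the result stays a matrix even if a single probe is shared.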

Then you can try normalisation, SVD, etc. In the SVD step, please check the batch effect carefully.

ChAMP is designed so that users who have only a beta matrix and a pheno vector can still use it. So as long as you merge the data into one big matrix, you should be able to use most ChAMP functions.


I was able to merge the beta data. However, I don't understand how to perform QC and Normalisation on the merged beta data.

champ.QC(beta = MergedBeta$beta, pheno=MergedBeta$pd$Sample_Name, mdsPlot=TRUE, densityPlot=TRUE, dendrogram=TRUE, PDFplot=TRUE, Rplot=TRUE, Feature.sel="None", resultsDir = "./CHAMP_QCimages/MergedBeta/")
[===========================]
[<<<<< ChAMP.QC START >>>>>>]
-----------------------------
champ.QC Results will be saved in ./01_preprocessing/CHAMP_QCimages/MergedBeta/
[QC plots will be proceed with 344782 probes and 84 samples.]

Error in MergedBeta$pd : $ operator is invalid for atomic vectors

This is the code I would normally use for QC, but it gives the error above and I don't really know how to solve it. I have the same problem with normalisation. In addition, normalisation requires choosing an 'arraytype', but the merged data still contains both array types.


Hi:

You don't have a merged pd yet; you need to create that manually as well. For example:

MergedPD <- rbind(myLoad1$pd, myLoad2$pd)

Also, your MergedBeta is already a matrix (the equivalent of myLoad1$beta), so you should pass MergedBeta itself rather than MergedBeta$beta.

Most ChAMP functions are designed to take a single matrix and a vector. myLoad$beta is just a matrix, nothing special, and so is MergedBeta. You may want to read the vignette more carefully and make sure you understand what each parameter means.

Note that the code above may fail if the pd tables of your two datasets have different columns (which is common). In that case, select the columns that exist in both pd tables and then merge. By the way, if I were you, I would label the samples by source, e.g. "dataset1" and "dataset2". Your merged data will probably have a batch effect, which should be adjusted later with champ.runCombat(), and a source label makes it easier to identify.
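
A minimal sketch of the pd merge in base R, covering the common-columns case and the source label. The two data frames are hypothetical stand-ins for myLoad1$pd and myLoad2$pd; the column names, the values and the `Dataset` column name are all invented for illustration.

```r
# Toy stand-ins for myLoad1$pd and myLoad2$pd; columns and values are invented.
pd1 <- data.frame(Sample_Name  = c("S1", "S2"),
                  Sample_Group = c("case", "control"),
                  Slide        = c("A", "A"),
                  stringsAsFactors = FALSE)
pd2 <- data.frame(Sample_Name  = c("S3", "S4"),
                  Sample_Group = c("case", "control"),
                  Age          = c(40, 52),
                  stringsAsFactors = FALSE)

# Label each source before merging (hypothetical column name "Dataset"),
# so the batch can be inspected and adjusted for later.
pd1$Dataset <- "dataset1"
pd2$Dataset <- "dataset2"

# rbind() on data frames needs matching columns, so keep only the shared ones.
commonCols <- intersect(colnames(pd1), colnames(pd2))
MergedPD <- rbind(pd1[, commonCols], pd2[, commonCols])

MergedPD
```

With the real objects, the rows of MergedPD should be in the same order as the columns of the merged beta matrix, which is worth checking before going further. The QC call would then presumably take `beta = MergedBeta` and a column of MergedPD as `pheno`, rather than `MergedBeta$beta` and `MergedBeta$pd$...`.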
