Question

Loading ChAMP IdatFiles with two different ArrayTypes

FWNL15 • 0

@aba2f833

Last seen 2.6 years ago

Netherlands

I want to load IDAT files from two different datasets to perform a ChAMP analysis. One dataset has array type 450K, the other EPIC. If I load the data using the following script, I have to choose a single array type, and therefore I get an error. Does anyone know how to solve this?

filesDATA <- './IdatFiles'
myloadDATA <- champ.load(directory = filesDATA,
                              method="ChAMP",
                              methValue="B",
                              autoimpute=TRUE,
                              filterDetP=TRUE,
                              ProbeCutoff=0,
                              SampleCutoff=0.1,
                              detPcut=0.01,
                              filterBeads=TRUE,
                              beadCutoff=0.05,
                              filterNoCG=TRUE,
                              filterSNPs=TRUE,
                              population=NULL,
                              filterMultiHit=TRUE,
                              filterXY=TRUE, # not default
                              force=FALSE,
                              arraytype=" ")

ChAMP • 1.5k views

ADD COMMENT • link updated 2.7 years ago by Yuan Tian ▴ 280 • written 2.8 years ago by FWNL15 • 0

Do it separately and then use meta to combine the two results.

ADD REPLY • link 2.8 years ago Shicheng Guo • 0

score 0 · Answer 1 · 2021-07-22

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 6 hours ago

United States

I am not familiar with ChAMP, but there is a function called combineArrays in minfi that is meant to accomplish that task.

ADD COMMENT • link 2.8 years ago James W. MacDonald 65k

score 0 · Answer 2 · 2021-07-25

Yuan Tian ▴ 280

@yuan-tian-13904

Last seen 4 days ago

United Kingdom

Hi FWNL15:

Normally I don't recommend merging two data sets, let alone when one is EPIC and the other is 450K. If it were my project, I would analyse them separately and then integrate the results.

But in your case, if you really need to combine them into one big matrix, you can load them separately and then merge them by selecting the CpGs common to both datasets. So basically the pipeline would be:

myLoad1 <- champ.load("path/to/450K", arraytype="450K", ...)
myLoad2 <- champ.load("path/to/EPIC", arraytype="EPIC", ...)

# Then merge ONLY the beta data. The `merge` function may do the job as well.
commonCpG <- intersect(rownames(myLoad1$beta), rownames(myLoad2$beta))
MergedBeta <- cbind(myLoad1$beta[commonCpG, ], myLoad2$beta[commonCpG, ])

Then you can try normalisation, SVD, etc. In the SVD, please check for batch effects carefully.

ChAMP is designed so that users who only have a single beta matrix and a pheno vector can still use it. So as long as you merge the data into one big matrix, you should be able to use most ChAMP functions.
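
The merge step above can be tried on toy data first. This is only a minimal sketch in base R: the probe IDs, sample names, and beta values are made up, and `beta450k` / `betaEPIC` stand in for `myLoad1$beta` and `myLoad2$beta`.

```r
# Toy beta matrices: rows are CpG probes, columns are samples.
# Probe IDs, sample names and values are invented for illustration.
beta450k <- matrix(c(0.10, 0.80, 0.55,
                     0.20, 0.75, 0.60),
                   nrow = 3,
                   dimnames = list(c("cg01", "cg02", "cg03"), c("A1", "A2")))
betaEPIC <- matrix(c(0.85, 0.50, 0.30,
                     0.70, 0.45, 0.35),
                   nrow = 3,
                   dimnames = list(c("cg02", "cg03", "cg04"), c("B1", "B2")))

# Keep only probes present in both matrices, in the same order for both.
commonCpG <- intersect(rownames(beta450k), rownames(betaEPIC))

# drop = FALSE keeps the result a matrix even if only one probe is shared.
MergedBeta <- cbind(beta450k[commonCpG, , drop = FALSE],
                    betaEPIC[commonCpG, , drop = FALSE])

print(dim(MergedBeta))        # probes x samples
print(is.matrix(MergedBeta))  # a plain matrix, so there is no $beta slot
```

Indexing both matrices by the same `commonCpG` vector is what guarantees the rows line up before `cbind`; the result is an ordinary matrix, which is why it is passed to ChAMP functions directly.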

ADD COMMENT • link 2.8 years ago Yuan Tian ▴ 280

I was able to merge the beta data. However, I don't understand how to perform QC and Normalisation on the merged beta data.

champ.QC(beta = MergedBeta$beta, pheno=MergedBeta$pd$Sample_Name, mdsPlot=TRUE, densityPlot=TRUE, dendrogram=TRUE, PDFplot=TRUE, Rplot=TRUE, Feature.sel="None", resultsDir = "./CHAMP_QCimages/MergedBeta/")
[===========================]
[<<<<< ChAMP.QC START >>>>>>]
-----------------------------
champ.QC Results will be saved in ./01_preprocessing/CHAMP_QCimages/MergedBeta/
[QC plots will be proceed with 344782 probes and 84 samples.]

Error in MergedBeta$pd : $ operator is invalid for atomic vectors

Usually I would use the code above for QC, but it now gives the error shown. I don't really know how to solve it. I have the same problem with normalisation. In addition, for normalisation you have to choose the 'arraytype', but the merged data still contains both array types.

ADD REPLY • link 2.7 years ago FWNL15 • 0

Hi:

You don't have a merged pd yet; you need to create that manually as well. For example:

MergedPD <- rbind(myLoad1$pd, myLoad2$pd)

And actually, your MergedBeta is already a matrix (the equivalent of myLoad1$beta), so you don't need to write MergedBeta$beta.

Most ChAMP functions are designed to work on a plain matrix and a vector. myLoad$beta is just a matrix, nothing special, and so is MergedBeta. You may want to read the vignette more carefully and make sure you understand what each parameter means.

Note that the code above may fail if your two pd files have different columns (which is common). In that case, select the columns that exist in both pd files and then merge. By the way, if I were you, I would label the samples by source, like "dataset1" and "dataset2". Your merged data will most likely have a batch effect, which should be adjusted later in champ.runCombat(), and a source label will help you identify it.
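
The pd merge with common columns and a source label could look roughly like this. It is a sketch in base R on made-up sample sheets: `pd450k` and `pdEPIC` stand in for `myLoad1$pd` and `myLoad2$pd`, and the `Dataset` column name is a hypothetical choice, not something ChAMP requires.

```r
# Toy sample sheets with different columns; all names and values are invented.
pd450k <- data.frame(Sample_Name = c("A1", "A2"),
                     Sample_Group = c("case", "control"),
                     Slide = c("S1", "S1"),
                     stringsAsFactors = FALSE)
pdEPIC <- data.frame(Sample_Name = c("B1", "B2"),
                     Sample_Group = c("control", "case"),
                     Age = c(54, 61),
                     stringsAsFactors = FALSE)

# Label the source of each sample before merging (hypothetical column name).
pd450k$Dataset <- "dataset1"
pdEPIC$Dataset <- "dataset2"

# rbind() on data frames needs matching column names, so keep the shared ones.
commonCols <- intersect(colnames(pd450k), colnames(pdEPIC))
MergedPD <- rbind(pd450k[, commonCols], pdEPIC[, commonCols])

print(MergedPD)
```

Stacking the pd files in the same order as the `cbind` of the beta matrices keeps the pd rows aligned with the beta columns; it is worth confirming that alignment (for example by comparing the sample names with `colnames(MergedBeta)`) before passing a pd column such as `MergedPD$Sample_Name` as `pheno`.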

ADD REPLY • link 2.7 years ago Yuan Tian ▴ 280