Question: QC and normalization of 450K data from CSVs instead of IDATs
15 months ago by
Simone170 wrote:


I have got some 450K data in CSV format, containing the following information (column names):  

CSV file 1:
SampleN.AVG_Beta, SampleN.Intensity, SampleN.Signal_A, SampleN.Signal_B

CSV file 2:
SampleN.Signal_Red, SampleN.Signal_Grn, SampleN.Pval

CSV file 3:
SampleID, Sample_Well, Sample_plate, Sentrix_ID, Sentrix_Position, [phenotype data...]

I have been able to create a MethyLumiSet from these data. However, to be able to normalize the data with ChAMP or minfi I would need an RGChannelSet and MethylSet for most normalization methods. When trying to convert the MethyLumiSet into an RGChannelSet I get the following error:

> methyl <- methylumiR(filename="data/mydata.csv", sampleDescriptions=annot)
> myrgset <- as(methyl, "RGChannelSet")
Error in methylumiToMinfi(from) :
  Cannot construct an RGChannelSet without full (OOB) intensities

I cannot get hold of out-of-band intensities or the IDATs of these data. I know I could simply perform PBC or standard quantile normalization on beta values. But I was wondering if there was a way of taking advantage of the fact that I have got more information than just beta values (i.e. red and green channel intensities, etc), for quality assessment and normalization procedures, even though I do not have OOB intensities. Is there a way of doing so in minfi or ChAMP, or do you have any other suggestions about how to deal with this?

Best wishes,

normalization minfi champ 450k csv • 278 views
Answer: QC and normalization of 450K data from CSVs instead of IDATs
15 months ago by
Yuan Tian110
University College London
Yuan Tian110 wrote:

Hello Simone:

I am not sure where you got these three CSV files, but ChAMP does not need a MethyLumiSet object. All ChAMP functions accept a plain beta matrix plus a list of phenotype data; that is all. So I suspect you can work directly from your CSV files: extract the beta matrix, the intensity matrix, etc. from them, then use ChAMP directly for normalization and analysis. I guess this information is in your CSV file 1.
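A minimal sketch of that extraction in base R, assuming the CSV has one probe ID column followed by per-sample columns named as in the question (`SampleN.AVG_Beta`, `SampleN.Intensity`, ...). The `TargetID` column, the sample names `S1`/`S2`, the values, and the helper `extract_matrix` are all made up for illustration; a real GenomeStudio export may be tab-delimited or carry extra header lines, so the `read.csv` call would need adjusting.

```r
# Toy stand-in for "CSV file 1"; TargetID, S1/S2 and all values are invented.
csv_text <- "TargetID,S1.AVG_Beta,S1.Intensity,S1.Signal_A,S1.Signal_B,S2.AVG_Beta,S2.Intensity,S2.Signal_A,S2.Signal_B
cg00000029,0.80,5000,900,4100,0.75,4000,950,3050
cg00000108,0.10,6000,5400,600,0.20,5000,4000,1000
cg00000109,0.50,3000,1500,1500,0.55,2000,900,1100"

# First column becomes the row names (probe IDs).
raw <- read.csv(text = csv_text, row.names = 1, check.names = FALSE)

# Hypothetical helper: collect every column ending in ".<suffix>" into a
# probes x samples matrix and strip the suffix from the column names.
extract_matrix <- function(df, suffix) {
  pattern <- paste0("\\.", suffix, "$")
  cols <- grep(pattern, colnames(df), value = TRUE)
  mat <- as.matrix(df[, cols, drop = FALSE])
  colnames(mat) <- sub(pattern, "", cols)
  mat
}

mybeta    <- extract_matrix(raw, "AVG_Beta")
intensity <- extract_matrix(raw, "Intensity")
```

A matrix built this way (probes in rows, samples in columns) is the kind of input the `beta` argument of `champ.norm` is given elsewhere in this thread.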

Maybe you can paste a couple of example rows from your CSV files here, so that more people can help you.


Yuan Tian


Thanks for your reply, Yuan. I know that ChAMP does not need a MethyLumiSet and that I can run

norm_pbc <- champ.norm(beta=mybeta, method="PBC", arraytype="450K")

on just the beta values. The same for BMIQ.

But for functional normalization I would need an rgSet, and for SWAN normalization both an rgSet and a MethylSet.

Importantly, I was also thinking that it might make sense to make use of the additional data I have got (Signal_A, Signal_B, Signal_Grn, Signal_Red, ...) for performing some QC (as far as possible without having the IDATs?).
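As a rough illustration of what such QC could look like without IDATs, here is a base-R sketch on toy matrices. It assumes the per-sample columns have already been pulled into probes x samples matrices, and it assumes the usual GenomeStudio convention that Signal_A is the unmethylated and Signal_B the methylated signal; comparing the recomputed beta with AVG_Beta is one way to check that assumption on real data. The 0.01 cutoff and all values are illustrative choices, not recommendations from this thread.

```r
set.seed(1)
probes  <- sprintf("cg%08d", 1:200)
samples <- c("S1", "S2", "S3")

# Toy probes x samples matrix generator standing in for the CSV columns.
toy <- function(lo, hi) {
  matrix(runif(length(probes) * length(samples), lo, hi),
         nrow = length(probes), dimnames = list(probes, samples))
}
signal_a <- toy(200, 8000)  # assumed: unmethylated signal (Signal_A)
signal_b <- toy(200, 8000)  # assumed: methylated signal (Signal_B)
pval     <- toy(0, 0.02)    # detection p-values (Pval)

# 1. Detection p-values: fraction of failed probes per sample, and
#    probes that pass in every sample.
pval_cutoff <- 0.01
failed      <- pval > pval_cutoff
frac_failed <- colMeans(failed)
keep_probe  <- rowSums(failed) == 0

# 2. Per-sample log2 median intensities; samples with unusually low
#    medians in both channels are candidates for exclusion.
med_meth   <- log2(apply(signal_b, 2, median))
med_unmeth <- log2(apply(signal_a, 2, median))

# 3. Recompute beta with the Illumina offset of 100, to compare with AVG_Beta.
beta <- signal_b / (signal_a + signal_b + 100)

qc <- data.frame(frac_failed, med_meth, med_unmeth)
print(qc)
```

The per-sample medians are similar in spirit to what minfi's `getQC` summarises for a MethylSet. If the matrices look sane, they could in principle be wrapped with minfi's `MethylSet` constructor (which takes `Meth` and `Unmeth` matrices), though whether a given downstream function accepts an object built this way would need testing.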

I had already described the full content of the CSV files in my original post. I think these files were generated using GenomeStudio, but I do not know for sure. Which additional information would you need? Please let me know.

Reply written 15 months ago by Simone
I see. Yes, functional normalization currently does require the S4 object produced by minfi's reading functions. I was thinking of the BMIQ solution. I cannot see an easy solution right now. Maybe some hacking on minfi's code would work. Your data look very much like an intermediate stage between the IDAT files and an RGChannelSet. Best, Yuan Tian
Reply written 15 months ago by Yuan Tian