lumi: how is the controlData to be read and used?
Pan Du ★ 1.2k
@pan-du-2010
Hi Gordon,

The use of control probe data has not been well developed yet. In fact, we would like to hear opinions from you and other developers about how to use control probe information.

The Control_Probe_profile.txt file can also be read by the lumiR function, but the user has to manually extract the exprs slot and add it to the controlProbe slot. We will develop this part further.

Thanks!
Pan

On 10/27/07 5:00 AM, "bioconductor-request at stat.math.ethz.ch" wrote:

> Message: 24
> Date: Sat, 27 Oct 2007 15:12:04 +1000
> From: Gordon Smyth <smyth at wehi.edu.au>
> Subject: [BioC] lumi: how is the controlData to be read and used?
> To: "BioC Mailing List" <bioconductor at stat.math.ethz.ch>
> Cc: Wei Shi <shi at wehi.edu.au>, Belinda Phipson <phipson at wehi.edu.au>
>
> The lumi package functions lumiB() and bgAdjust() mention the fact
> that control data can be used to background correct Illumina data.
> There is however no documentation regarding what the control data
> should contain, how it should be read in, or exactly how it is used.
>
> I have summary probe profile data output from BeadStudio (not
> normalized or background corrected):
>
>     Sample_Probe_Profile.txt
>
> I also have summary control probe data. This includes both positive
> and negative control probes:
>
>     Control_Probe_Profile.txt
>
> How is it recommended that I read and use the control data, prior to
> using lumiT(method="vst")? A complete code example would be helpful.
>
> Gordon
@gordon-smyth
WEHI, Melbourne, Australia
At 08:11 AM 28/10/2007, Pan Du wrote:
> Hi Gordon,
>
> The using of controlProbe data has not been well developed yet. Actually we
> would like to hear the opinions from you and other developers about how to
> using control probe information.

I am glad that you are open to opinions, but it is disconcerting that you are not prepared to make any recommendations about background correction.

This leaves me wondering how to use the lumi package at the moment. The results from vst do change somewhat depending on how the data has been background corrected. If you're not making any recommendations about background correction, does this mean that you're not yet ready to recommend vst either?

Are you saying that background correction makes so little difference that vst can be recommended regardless of the background correction method?

Or are you saying that the pre-processing methods of the lumi package are in general not yet fully developed or tested, so that we should not view them as recommendations?

> The Control_Probe_profile.txt file can also be read by lumiR function. But
> the user has to manually extract the exprs slot and add it into the
> controlProbe slot. We will further develop on this part.

No, this does not work, for several reasons.

Firstly, there is no slot called "controlProbe". I think you mean "controlData".

Secondly, there is no slot called "exprs". I think you mean the exprs() extractor function.

Thirdly, exprs(x) is a matrix, whereas the controlData slot has to be a data.frame.

Fourthly, and most serious, when lumiR() reads a control probe profile, exprs(x) loses any information on which probes are negative controls, making it completely useless for background correction. The row names are just probeID numbers:

    > x <- lumiR(file.path("data","Actn3_Control_Probe_Profile.txt"))
    Warning message:
    In lumiR(file.path("data", "Actn3_Control_Probe_Profile.txt")) :
      Duplicated IDs found and were merged!
    > rownames(exprs(x))[1:5]
    [1] "610064"    "100610064" "100580056" "580056"    "360035"

As I said, a complete code example would be helpful!

Gordon
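One possible way around the fourth point is to read the control file with base R rather than lumiR(), so that the control-type labels are kept. This is only a minimal sketch, not a lumi feature: it assumes a tab-delimited layout with TargetID and ProbeID columns followed by one signal column per array, and the inline text, column names, IDs and values below are all made up for illustration.

```r
# Hypothetical stand-in for a control probe profile (illustrative values only;
# the real BeadStudio export may use different column names).
txt <- "TargetID\tProbeID\tArrayA\tArrayB
negative\t1001\t90.1\t88.4
negative\t1002\t93.7\t91.2
housekeeping\t1003\t16185.3\t14039.6"

# read.delim() returns a data.frame, the class the controlData slot requires.
ctrl <- read.delim(text = txt, stringsAsFactors = FALSE)
is.data.frame(ctrl)

# The control type of every probe is still available ...
table(ctrl$TargetID)

# ... so the negative controls can be selected explicitly.
neg <- ctrl[ctrl$TargetID == "negative", ]
neg
```

With a real file, the `text = txt` argument would be replaced by the file path, and the column names would need to be checked against the actual export.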
> I am glad that you are open to opinions, but it is disconcerting that
> you are not prepared to make any recommendations about background correction.
> [...]
> Or are you saying that the pre-processing methods of the lumi package
> are in general not yet fully developed or tested, so that we should
> not view them as recommendations?

What I meant by the use of control probe data is using the control probe information for quality control. For the background adjustment, we currently believe that the method recommended by BeadStudio works well. Of course further improvement is possible, and contributions to this part are very welcome.

> No, this does not work, for several reasons.
> [...]
> Fourthly, and most serious, when lumiR() reads a control probe
> profile, exprs(x) loses any information on which probes are negative
> controls, making it completely useless for background correction. The
> row names are just probeID numbers:

We will work on this part later. Because we are busy with other projects, it may take several weeks.

Thanks!
Pan
At 10:17 PM 28/10/2007, Pan Du wrote:
> What I mean here for the using of control Probe data is using control Probe
> information for the quality control information. For the background
> adjustment part, currently, we believe using the BeadStudio recommended
> method works well. Of course further improvement is possible. The
> contribution in this part is very welcome.

OK, good, now we're getting somewhere. You're recommending BeadStudio's global background correction. Let me now rephrase my original question. Suppose that I have BeadStudio output data which is not background corrected. How can I use R to reproduce the background correction that BeadStudio would have done?

This is a very important question, because most Bioconductor users of the lumi package will, I guess, have Illumina output data which is not normalized and not background corrected. And we will not necessarily want to go back to BeadStudio to background correct.

I have summary probe profile data output from BeadStudio which is not background corrected. Let me repeat, it is not background corrected.

    Sample_Probe_Profile.txt

I also have control probe summary profiles and control gene summary profiles. This includes both positive and negative control probes:

    Control_Probe_Profile.txt
    Control_Gene_Profile.txt

I should surely be able to reproduce BeadStudio's background correction. Here is my best effort using the lumi package. Is this what you recommend?

    library(lumi)
    x <- lumiR("Sample_Probe_Profile.txt")
    controlgp <- lumiR("Control_Gene_Profile.txt")
    x@controlData <- as.data.frame(exprs(controlgp))
    xb <- lumiB(x,method="bgAdjust")
    y <- lumiT(xb,method="vst")
    y <- lumiN(y,method="quantile")

As you can see from the results below, lumiB() simply subtracted the negative control expression value from the expression values for each array.

Best wishes
Gordon

    > exprs(controlgp)[,1:4]
                        1957998084_A 1957998084_B 1957998084_C 1957998084_D
    biotin                   11508.6      10857.9      10641.8      10536.3
    cy3_hyb                  20252.0      19227.1      18964.8      19457.2
    high_stringency_hyb      47593.1      43267.2      43966.6      43207.8
    housekeeping             16185.3      14039.6      13277.5      13280.2
    labeling                    85.2         89.5         77.4         80.7
    low_stringency_hyb       17650.5      16441.4      16330.1      16844.8
    negative                    92.0         90.0         83.2         88.1

    > summary(exprs(x)[,1:4])
      1957998084_A        1957998084_B        1957998084_C        1957998084_D
     Min.   :   52.9    Min.   :   50.2    Min.   :   48.6    Min.   :   54.1
     1st Qu.:   86.6    1st Qu.:   84.3    1st Qu.:   78.2    1st Qu.:   82.3
     Median :   99.0    Median :   96.6    Median :   88.7    Median :   93.9
     Mean   :  511.4    Mean   :  501.0    Mean   :  400.3    Mean   :  448.0
     3rd Qu.:  163.9    3rd Qu.:  159.3    3rd Qu.:  138.3    3rd Qu.:  148.9
     Max.   :59875.4    Max.   :57223.1    Max.   :50414.0    Max.   :49213.6

    > summary(exprs(xb)[,1:4])
      1957998084_A        1957998084_B        1957998084_C        1957998084_D
     Min.   :  -39.09   Min.   :  -39.83   Min.   :  -34.64   Min.   :  -34.08
     1st Qu.:   -5.40   1st Qu.:   -5.73   1st Qu.:   -5.01   1st Qu.:   -5.80
     Median :    7.05   Median :    6.65   Median :    5.48   Median :    5.76
     Mean   :  419.47   Mean   :  411.01   Mean   :  317.04   Mean   :  359.90
     3rd Qu.:   71.95   3rd Qu.:   69.27   3rd Qu.:   55.08   3rd Qu.:   60.77
     Max.   :59783.48   Max.   :57133.12   Max.   :50330.79   Max.   :49125.42
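The subtraction visible in the output above can be mimicked in base R without lumi. The following is a sketch only, not lumiB()'s actual implementation: the toy matrix `e` holds just the Min. and Max. of the first two arrays from summary(exprs(x)), and `neg` holds the printed (rounded) values from the "negative" row of the control table.

```r
# Per-array negative-control averages, copied from the 'negative' row above.
neg <- c("1957998084_A" = 92.0, "1957998084_B" = 90.0)

# Toy intensities: Min. and Max. of the first two arrays from summary(exprs(x)).
e <- rbind(min = c(52.9, 50.2),
           max = c(59875.4, 57223.1))
colnames(e) <- names(neg)

# Subtract each array's negative-control value from every probe on that array
# (MARGIN = 2 sweeps the statistic across columns, i.e. across arrays).
eb <- sweep(e, 2, neg, "-")
eb
```

The resulting values agree with the Min. and Max. rows of summary(exprs(xb)) to within about 0.1, which is consistent with the control values having been rounded to one decimal place when printed.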
Hi Gordon,

Sorry for the late reply. I think that should work, because the Control_Gene_Profile.txt file basically averages the negative control probes. As described in the BeadStudio manual, its background adjustment basically subtracts the mean of the negative control probes. I am not sure whether BeadStudio also removes outliers, but the results should be close in any case.

I will also update the lumiR function (or write a new function) to read the Control_Probe_Profile.txt, because the negative control probes have the same probe Ids.

Thanks!
Pan
At 10:00 AM 30/10/2007, Pan Du wrote:
> Hi Gordon,
>
> Sorry for replying late. I think that should work because the
> Control_Gene_Profile.txt file basically averaged the negative control
> probes. As described in the BeadStudio manual, its background adjustment
> basically subtract the mean of negative control probes. But I am not sure
> whether BeadStudio did outlier removal or not. Anyway, the results should be
> close.

Thanks.

> Also I will update lumiR function (or write a new function) to read the
> Control_Probe_Profile.txt because the negative control probes have the same
> probe Ids.

Actually, the ProbeIDs are all different for the negative controls. It is the TargetIDs which are the same. Repetition of ProbeIDs only occurs when the same probe can be classified as more than one type of control (for example, mouse probe 60019 is both a cy3_hyb control and a low_stringency_hyb control).

Best wishes
Gordon
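The distinction between ProbeIDs and TargetIDs can be illustrated with a small probe-level table. This is a hypothetical sketch in base R: the column names `ArrayA` and `ArrayB`, the negative-control ProbeIDs and all intensities are invented, and only the double classification of probe 60019 is taken from the message above. No outlier removal is attempted, since it is not known whether BeadStudio applies any.

```r
# Hypothetical probe-level control table (IDs and values are made up,
# except that probe 60019 appears under two control types).
ctrl <- data.frame(
  TargetID = c("negative", "negative", "negative",
               "cy3_hyb", "low_stringency_hyb"),
  ProbeID  = c(101, 102, 103, 60019, 60019),
  ArrayA   = c(90, 94, 92, 20000, 20000),
  ArrayB   = c(88, 91, 91, 19000, 19000),
  stringsAsFactors = FALSE
)

# Negative controls share one TargetID but have distinct ProbeIDs.
neg <- ctrl[ctrl$TargetID == "negative", ]
anyDuplicated(neg$ProbeID) == 0

# A ProbeID repeats only where one probe is listed under two control types.
dup <- duplicated(ctrl$ProbeID) | duplicated(ctrl$ProbeID, fromLast = TRUE)
ctrl[dup, c("TargetID", "ProbeID")]

# Per-array background estimate: plain mean over the negative-control probes.
bg <- colMeans(neg[, c("ArrayA", "ArrayB")])
bg
```

Selecting rows by TargetID rather than by ProbeID is what keeps the negative controls identifiable, so any reader function for the control probe profile would need to retain that column.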