Question

can't use weights in single channel analysis?


Jenny Drnevich ★ 2.0k

@jenny-drnevich-2812


United States

Hello,

I have some 2-color microarrays that I need to analyze as single channel because some of the conditions I want to compare are unconnected. I've been trying to follow the example in the limmaUsersGuide() on "Separate Channel Analysis of Two-Color Data". However, it appears that neither normalizeBetweenArrays(method="Aquantile") nor the lmscFit() function uses the weights in the MAList object. Four of my 19 arrays had lower hybridization efficiency on the bottom quarter of the array (courtesy of an air bubble), leading to very low R and G values for these spots. The rest of the values on the arrays seem fine, so I don't want to completely throw these arrays out, just the bad spots, which I've given weights==0.

Within normalizeBetweenArrays(), the internal function normalizeQuantiles() can handle missing values, so I was able to do an appropriate Aquantile normalization by replacing the A values for spots with weight==0 with NA. However, you can't have missing values when calculating intraspotCorrelation(), nor can you have missing values for lmscFit(). In looking through the code of lmscFit(), it doesn't use weights, even if they are in the MAList object! So as is, lmscFit() won't give me the proper coefficients for the spots that have weight==0 on one or more arrays. I'm not sure I'm able to modify the code of lmscFit() to use the weights, or even if it can be modified to use the weights.

I've thought about manually creating a matrix of my R & G values, using NA for spots with weight==0, but I'm not sure how to model the correlation between the R & G channels on an array. Does anybody have any suggestions on how I can appropriately analyze my data?
Thanks,
Jenny

sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grid  stats  graphics  grDevices  datasets  utils  methods  base

other attached packages:
 [1] statmod_1.4.1        limma_3.2.1          affyQCReport_1.24.0  lattice_0.17-26      xtable_1.5-6
 [6] simpleaffy_2.22.0    genefilter_1.28.2    made4_1.20.0         scatterplot3d_0.3-29 gplots_2.7.4
[11] caTools_1.10         bitops_1.0-4.1       gdata_2.6.1          gtools_2.6.1         RColorBrewer_1.0-2
[16] ade4_1.4-14          affyPLM_1.22.0       preprocessCore_1.8.0 gcrma_2.18.1         affycoretools_1.18.0
[21] KEGG.db_2.3.5        GO.db_2.3.5          RSQLite_0.8-0        DBI_0.2-5            AnnotationDbi_1.8.1
[26] affy_1.24.2          Biobase_2.6.1        RWinEdt_1.8-2

loaded via a namespace (and not attached):
 [1] affyio_1.14.0   annaffy_1.18.0  annotate_1.24.0 biomaRt_2.2.0   Biostrings_2.14.10
 [6] Category_2.12.0 GOstats_2.12.0  graph_1.24.1    GSEABase_1.8.0  IRanges_1.4.9
[11] RBGL_1.22.0     RCurl_1.2-1     splines_2.10.1  survival_2.35-7 tools_2.10.1
[16] XML_2.6-0

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu
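The NA-masking step described in the question (setting A to NA wherever the spot weight is 0 before Aquantile normalization) might look roughly like the sketch below. The toy data, the object names `MA.masked` and `MA.norm`, and the choice of which spots are flagged are all assumptions made for illustration; they are not taken from the original analysis.

```r
library(limma)

set.seed(1)
n_probes <- 100
n_arrays <- 4

# Toy log-ratios (M) and average log-intensities (A); the values are made up.
M <- matrix(rnorm(n_probes * n_arrays), n_probes, n_arrays)
A <- matrix(runif(n_probes * n_arrays, 6, 14), n_probes, n_arrays)

# Spot weights: the bottom quarter of array 4 is flagged as bad (weight 0).
w <- matrix(1, n_probes, n_arrays)
w[76:100, 4] <- 0

MA <- new("MAList", list(M = M, A = A, weights = w))

# Mask the A values of zero-weight spots so they do not enter the quantiles.
MA.masked <- MA
MA.masked$A[MA.masked$weights == 0] <- NA

# Quantile-normalize the A values across arrays.
MA.norm <- normalizeBetweenArrays(MA.masked, method = "Aquantile")

# Count the spots that are still missing after normalization.
sum(is.na(MA.norm$A))
```

As the question notes, an object masked this way cannot be passed on to intraspotCorrelation() or lmscFit(), because neither accepts missing values.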

Normalization GO • 853 views

updated 15.9 years ago by Gordon Smyth 53k • written 15.9 years ago by Jenny Drnevich ★ 2.0k

score 0 · Answer 1 · 2010-03-11

Dear Jenny,

I understand your problem. My best suggestion is that you:

1. Subset your MAList into two, one for the probes which are good on all arrays (MA1), and one for probes which are good on 15 arrays (MA2). For MA2, subset down to 15 arrays.
2. Do a single channel analysis on MA1 with 19 arrays.
3. Do a single channel analysis on MA2 with 15 arrays. Skip the intraspotCorrelation step, instead input to lmscFit() the same intraspot correlation as for MA1.
4. Combine gene lists at the end.

Why don't intraspotCorrelation() and lmscFit() allow weights? There are two reasons. Firstly, it would need a lot more work from me. Secondly, it's not quite obvious how to do it. If you have a low weight w for a particular log-ratio on a particular array, one doesn't know in general whether both R and G were bad for that probe or just one. So there is no generally reliable way to translate MAList style weights into single channel weights.

Best wishes
Gordon

On Wed, 10 Mar 2010, Jenny Drnevich wrote:

> Hello,
>
> I have some 2-color microarrays that I need to analyze as single channel
> because some of the conditions I want to compare are unconnected.
> [...]
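The four steps suggested in the answer can be sketched as follows. This is a minimal illustration on simulated data, not the original analysis: the 6-array targets table, the conditions A, B and C, the B versus A contrast, and the helpers `make_design()` and `compare()` are all hypothetical. It also assumes that every bad array shares the same set of bad probes, so that the remaining arrays are complete for MA2, and that the statmod package is installed for intraspotCorrelation().

```r
library(limma)

set.seed(2)
n_probes <- 200
ids <- paste0("probe", seq_len(n_probes))
targets <- data.frame(
  FileName = paste0("array", 1:6),
  Cy3 = c("A", "B", "C", "A", "B", "C"),
  Cy5 = c("B", "A", "A", "C", "C", "B"),
  stringsAsFactors = FALSE
)
n_arrays <- nrow(targets)

# Toy log2 intensities with a shared spot effect, so R and G are correlated.
noise <- function(sd) matrix(rnorm(n_probes * n_arrays, sd = sd), n_probes, n_arrays)
spot <- noise(1)
G <- 10 + spot + noise(0.3)
R <- 10 + spot + noise(0.3)
rownames(G) <- rownames(R) <- ids

# The bottom quarter of arrays 5 and 6 is flagged as bad (weight 0).
w <- matrix(1, n_probes, n_arrays)
w[151:200, 5:6] <- 0
MA <- new("MAList", list(M = R - G, A = (R + G) / 2, weights = w))

# Hypothetical helper: channel-level design matrix for a set of arrays.
make_design <- function(tg) {
  f <- factor(targetsA2C(tg)$Target, levels = c("A", "B", "C"))
  design <- model.matrix(~ 0 + f)
  colnames(design) <- levels(f)
  design
}

# Step 1: split probes into fully good ones and ones bad on some arrays.
bad.arrays  <- colSums(MA$weights == 0) > 0
good.probes <- rowSums(MA$weights == 0) == 0
MA1 <- MA[good.probes, ]
MA2 <- MA[!good.probes, !bad.arrays]

# Step 2: single channel analysis of MA1 on all arrays.
design1 <- make_design(targets)
corfit  <- intraspotCorrelation(MA1, design1)
fit1    <- lmscFit(MA1, design1, correlation = corfit$consensus.correlation)

# Step 3: single channel analysis of MA2 on the good arrays only,
# reusing the intraspot correlation estimated from MA1.
design2 <- make_design(targets[!bad.arrays, ])
fit2    <- lmscFit(MA2, design2, correlation = corfit$consensus.correlation)

# Step 4: test the same contrast in both fits and combine the gene lists.
compare <- function(fit, design) {
  cont <- makeContrasts(BvsA = B - A, levels = design)
  topTable(eBayes(contrasts.fit(fit, cont)), number = Inf, sort.by = "none")
}
results <- rbind(compare(fit1, design1), compare(fit2, design2))
```

Because eBayes() and the multiple-testing adjustment are run separately on each subset here, one option (an assumption, not something the answer prescribes) is to recompute the adjusted p-values over the combined table, for example with `p.adjust(results$P.Value, method = "BH")`.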