HT HG-U133+ PM Array Plate: cdf and probe packages discrepancies
@marianne-tuefferd-3554
Last seen 9.7 years ago
Dear list,

I am trying to analyze Affymetrix HT HG-U133+ PM Array Plate data. I found some discrepancies between cdf and probe packages. In fact for some probesets, the cdf package contains more information than the probe package. These probesets are apparently control ones, but is it still expected? (I did not find any difference in HG-U133Plus2 array)

Thanks a lot for your help

Kind regards
Marianne

> sizePSinCDFnotinProbe
AFFX-NONSPECIFICGC10_AT AFFX-NONSPECIFICGC11_AT AFFX-NONSPECIFICGC12_AT
                    952                     960                     973
AFFX-NONSPECIFICGC13_AT AFFX-NONSPECIFICGC14_AT AFFX-NONSPECIFICGC15_AT
                    968                     960                     949
AFFX-NONSPECIFICGC16_AT AFFX-NONSPECIFICGC17_AT AFFX-NONSPECIFICGC18_AT
                    963                     942                     912
AFFX-NONSPECIFICGC19_AT AFFX-NONSPECIFICGC20_AT AFFX-NONSPECIFICGC21_AT
                    849                     813                     697
AFFX-NONSPECIFICGC22_AT AFFX-NONSPECIFICGC23_AT AFFX-NONSPECIFICGC24_AT
                    585                     407                     268
AFFX-NONSPECIFICGC25_AT  AFFX-NONSPECIFICGC3_AT  AFFX-NONSPECIFICGC4_AT
                      9                      25                     322
 AFFX-NONSPECIFICGC5_AT  AFFX-NONSPECIFICGC6_AT  AFFX-NONSPECIFICGC7_AT
                    703                     873                     914
 AFFX-NONSPECIFICGC8_AT  AFFX-NONSPECIFICGC9_AT         AFFX-R2-TAGA_AT
                    940                     959                      11
        AFFX-R2-TAGB_AT         AFFX-R2-TAGC_AT         AFFX-R2-TAGD_AT
                     11                      11                      11
        AFFX-R2-TAGE_AT         AFFX-R2-TAGF_AT         AFFX-R2-TAGG_AT
                     11                      11                      11
        AFFX-R2-TAGH_AT      AFFX-R2-TAGIN-3_AT      AFFX-R2-TAGIN-5_AT
                     11                      11                      11
     AFFX-R2-TAGIN-M_AT       AFFX-R2-TAGJ-3_AT       AFFX-R2-TAGJ-5_AT
                     11                      11                      11
      AFFX-R2-TAGO-3_AT       AFFX-R2-TAGO-5_AT       AFFX-R2-TAGQ-3_AT
                     11                      11                      11
      AFFX-R2-TAGQ-5_AT
                     11

> unlist(lapply(PRinfoPSinCDFnotinProbe_spl, nrow))
AFFX-NONSPECIFICGC10_AT AFFX-NONSPECIFICGC11_AT AFFX-NONSPECIFICGC12_AT
                      1                       1                       1
AFFX-NONSPECIFICGC13_AT AFFX-NONSPECIFICGC14_AT AFFX-NONSPECIFICGC15_AT
                      1                       1                       1
AFFX-NONSPECIFICGC16_AT AFFX-NONSPECIFICGC17_AT AFFX-NONSPECIFICGC18_AT
                      1                       1                       1
AFFX-NONSPECIFICGC19_AT AFFX-NONSPECIFICGC20_AT AFFX-NONSPECIFICGC21_AT
                      1                       1                       1
AFFX-NONSPECIFICGC22_AT AFFX-NONSPECIFICGC23_AT AFFX-NONSPECIFICGC24_AT
                      1                       1                       1
AFFX-NONSPECIFICGC25_AT  AFFX-NONSPECIFICGC3_AT  AFFX-NONSPECIFICGC4_AT
                      1                       1                       1
 AFFX-NONSPECIFICGC5_AT  AFFX-NONSPECIFICGC6_AT  AFFX-NONSPECIFICGC7_AT
                      1                       1                       1
 AFFX-NONSPECIFICGC8_AT  AFFX-NONSPECIFICGC9_AT         AFFX-R2-TAGA_AT
                      1                       1                       1
        AFFX-R2-TAGB_AT         AFFX-R2-TAGC_AT         AFFX-R2-TAGD_AT
                      1                       1                       1
        AFFX-R2-TAGE_AT         AFFX-R2-TAGF_AT         AFFX-R2-TAGG_AT
                      1                       1                       1
        AFFX-R2-TAGH_AT      AFFX-R2-TAGIN-3_AT      AFFX-R2-TAGIN-5_AT
                      1                       1                       1
     AFFX-R2-TAGIN-M_AT       AFFX-R2-TAGJ-3_AT       AFFX-R2-TAGJ-5_AT
                      1                       1                       1
      AFFX-R2-TAGO-3_AT       AFFX-R2-TAGO-5_AT       AFFX-R2-TAGQ-3_AT
                      1                       1                       1
      AFFX-R2-TAGQ-5_AT
                      1

The corresponding code is below:

library(affy)
library(hthgu133pluspmcdf)
library(hthgu133pluspmprobe)

PSn <- ls(hthgu133pluspmcdf)
PSHT <- mget(PSn, hthgu133pluspmcdf)
names(PSHT) <- toupper(names(PSHT))
cdfInfo <- unlist(lapply(PSHT, function(el){el[,1]}))
cdfInfo <- paste(cdfInfo, sub("_AT\\w*$", "_AT", names(cdfInfo)), sep = ".")
PSn <- toupper(PSn)

HTprobe <- as.data.frame(hthgu133pluspmprobe)
HTprobe$abs <- xy2indices(HTprobe$x, HTprobe$y, nr = 744)
HTprobe$Probe.Set.Name <- toupper(HTprobe$Probe.Set.Name)
ProbeInfo <- paste(HTprobe$abs, HTprobe$Probe.Set.Name, sep = ".")

length(unlist(lapply(PSHT, function(el){el[,1]}))) == length(HTprobe$abs) ## FLAG!!
length(intersect(ProbeInfo, cdfInfo))
length(setdiff(ProbeInfo, cdfInfo))
length(setdiff(cdfInfo, ProbeInfo))
## in common 519200 probe absolute positions

PSlocinCDFnotinProbe <- setdiff(cdfInfo, ProbeInfo)
PSinCDFnotinProbe <- unique(sub("^.*\\.", "", PSlocinCDFnotinProbe))
sizePSinCDFnotinProbe <- listLen(PSHT[PSinCDFnotinProbe])/2
names(sizePSinCDFnotinProbe) <- PSinCDFnotinProbe
PRinfoPSinCDFnotinProbe <- HTprobe[HTprobe$Probe.Set.Name %in% PSinCDFnotinProbe, ]
PRinfoPSinCDFnotinProbe_spl <- split(PRinfoPSinCDFnotinProbe, PRinfoPSinCDFnotinProbe$Probe.Set.Name)
unlist(lapply(PRinfoPSinCDFnotinProbe_spl, nrow))

PS: my sessionInfo is:

> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] hthgu133pluspmprobe_2.4.0 AnnotationDbi_1.6.1
[3] hthgu133pluspmcdf_2.4.0   affy_1.22.0
[5] Biobase_2.4.1

loaded via a namespace (and not attached):
[1] affyio_1.12.0        DBI_0.2-4            preprocessCore_1.6.0
[4] RSQLite_0.7-1        tools_2.9.0
cdf probe • 996 views
@james-w-macdonald-5106
Last seen 5 days ago
United States
Hi Marianne,

The cdf and probe packages we supply are simply a repackaging of the original Affymetrix data. We don't add or remove any of the data, so any discrepancies are due to differences in the data we get from Affymetrix. There are several chips for which the probe and cdf data are not consistent, although AFAIK the differences are always in control probes, so they are not critical.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
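One way to confirm that the mismatches are confined to control probesets is to test whether every probeset found only in the cdf carries the usual Affymetrix control prefix "AFFX-". The sketch below is a minimal illustration in base R, not the actual package data: `cdf_toy` and `probe_toy` are hypothetical stand-ins for the structures built from hthgu133pluspmcdf and hthgu133pluspmprobe in the question, and the index values are invented.

```r
# Hypothetical toy data (NOT the real hthgu133pluspm packages):
# a cdf-like list of index matrices and a probe-package-like data frame.
cdf_toy <- list(
  "1007_PM_S_AT"           = cbind(pm = c(101, 102, 103), mm = NA),
  "AFFX-R2-TAGA_AT"        = cbind(pm = c(201, 202, 203), mm = NA),
  "AFFX-NONSPECIFICGC3_AT" = cbind(pm = c(301, 302, 303, 304), mm = NA)
)
probe_toy <- data.frame(
  Probe.Set.Name = c(rep("1007_PM_S_AT", 3),
                     "AFFX-R2-TAGA_AT", "AFFX-NONSPECIFICGC3_AT"),
  abs = c(101, 102, 103, 201, 301),
  stringsAsFactors = FALSE
)

# Build "index.probeset" keys for each source, following the question's approach.
cdf_keys <- unlist(lapply(names(cdf_toy), function(ps) {
  paste(cdf_toy[[ps]][, "pm"], ps, sep = ".")
}))
probe_keys <- paste(probe_toy$abs, probe_toy$Probe.Set.Name, sep = ".")

# Keys present in the cdf but missing from the probe table.
only_in_cdf <- setdiff(cdf_keys, probe_keys)

# Strip the leading "index." part to recover the probeset names.
discrepant_sets <- unique(sub("^[^.]*\\.", "", only_in_cdf))

# TRUE only if every discrepant probeset name starts with "AFFX-".
all_controls <- all(grepl("^AFFX-", discrepant_sets))

print(discrepant_sets)
print(all_controls)
```

With the real packages, the same final check could be applied to the `PSinCDFnotinProbe` vector from the question, assuming the names have already been upper-cased as in that code.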