ReadAffy() Error - 2 Platforms in one file
Voke AO ▴ 760
@voke-ao-4830
Last seen 9.6 years ago
Hi all,

I get the following error when I try to create an AffyBatch object with the code below. I understand that both the HG-U133A and HG-U133B platforms were used. With getGEO it's a bit more straightforward, as I can specify pData(phenoData(gse9006dat[[2]])) or pData(phenoData(gse9006dat[[1]])) for the data I would like to analyse, but I can't seem to figure it out with the raw data. Any help will be greatly appreciated.

Thanks.

-Avoks

> untar("GSE9006/GSE9006_RAW.tar", exdir="study2")
> celFiles = unlist(list.files("study2", full.names = TRUE))
> gse9006preset = ReadAffy(filenames = celFiles)
Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData,  :
  Cel file study2/GSM254177.CEL.gz does not seem to be of HG-U133A type

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_x.1252  LC_CTYPE=English_x.1252
[3] LC_MONETARY=English_x.1252 LC_NUMERIC=C
[5] LC_TIME=English_x.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] u133x3pcdf_2.10.0    ggplot2_0.9.0        u133x3p.db_2.7.1
 [4] hgu133plus2.db_2.7.1 hgu95av2.db_2.7.1    hgu95a.db_2.7.1
 [7] hgu133b.db_2.7.1     hgu133a.db_2.7.1     hu6800.db_2.7.1
[10] org.Hs.eg.db_2.7.1   RSQLite_0.11.1       DBI_0.2-5
[13] annotate_1.34.0      AnnotationDbi_1.18.0 limma_3.12.0
[16] affy_1.34.0          GEOquery_2.23.1      Biobase_2.16.0
[19] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
 [1] affyio_1.24.0         BiocInstaller_1.4.3   colorspace_1.1-1
 [4] dichromat_1.2-4       digest_0.5.2          grid_2.15.0
 [7] IRanges_1.14.2        MASS_7.3-17           memoise_0.1
[10] munsell_0.3           plyr_1.7.1            preprocessCore_1.18.0
[13] proto_0.3-9.2         RColorBrewer_1.0-5    RCurl_1.91-1.1
[16] reshape2_1.2.1        scales_0.2.0          stats4_2.15.0
[19] stringr_0.6           tools_2.15.0          XML_3.9-4.1
[22] xtable_1.7-0          zlibbioc_1.2.0
hgu133a hgu133b hgu133plus2 hgu95a hgu95av2 hu6800 u133x3p
@sean-davis-490
Last seen 3 months ago
United States
On Thu, May 10, 2012 at 7:11 AM, Ovokeraye Achinike-Oduaran wrote:
> > gse9006preset = ReadAffy(filenames = celFiles)
> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData,  :
>   Cel file study2/GSM254177.CEL.gz does not seem to be of HG-U133A type

You'll need to process the HG-U133A and HG-U133B files separately. ReadAffy is just telling you that.

Sean
Hi Sean,

I realize that from the error, but I'm not quite sure how to go about processing it, hence the request for help. That's why I said that getGEO() is a bit more straightforward.

Thanks.

-Avoks
On Thu, May 10, 2012 at 7:41 AM, Ovokeraye Achinike-Oduaran wrote:
> I realize that from the error but I'm not quite sure how to go about
> processing it, hence the request for help on how to go about it.

You will need to determine which .CEL files are from the U133A platform and which are from the U133B platform. Then, you'll want to use those lists of CEL file names to call ReadAffy once for each list. It will be up to you to combine the results into one ExpressionSet after your normalization is complete for both platforms.

Hope that helps clarify things a bit.

Sean
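For anyone landing here later, the steps Sean describes could be sketched roughly as follows. This is only a sketch under a few assumptions: that the chip type is read from each CEL header with affyio::read.celfile.header() (its cdfName field), and that the files sit in the "study2" directory from the original post. The helper name split_cel_by_chip and its chip_of argument are made up for illustration and are not part of affy or affyio.

```r
# Sketch only: group CEL files by the chip type stored in each file header,
# then read and normalize each group on its own.

# Return a named list with one character vector of CEL paths per chip type.
# `chip_of` is a hypothetical hook; by default it reads the header via affyio.
split_cel_by_chip <- function(cel_files,
                              chip_of = function(f) affyio::read.celfile.header(f)$cdfName) {
  chip <- vapply(cel_files, chip_of, character(1), USE.NAMES = FALSE)
  split(cel_files, chip)
}

# Usage, guarded so it only runs when the packages and the data are available.
if (requireNamespace("affy", quietly = TRUE) &&
    requireNamespace("affyio", quietly = TRUE) &&
    dir.exists("study2")) {
  cel_files <- list.files("study2", pattern = "\\.CEL(\\.gz)?$",
                          full.names = TRUE, ignore.case = TRUE)
  by_chip <- split_cel_by_chip(cel_files)

  # One AffyBatch per platform: ReadAffy is called once for each list of files.
  batches <- lapply(by_chip, function(f) affy::ReadAffy(filenames = f))

  # Normalize each platform separately.
  esets <- lapply(batches, affy::rma)
}
```

Reading the header avoids relying on file names, since the GSM accessions alone do not say which platform a file belongs to.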
Thanks Sean. I guess I was looking for an easier way out. But all sorted.

Thanks.

-Avoks
@alexvpickering
Last seen 2.0 years ago
Canada

Bit late on the response, but this might be useful to someone else, as I just had to solve this problem myself. The basic idea is to get each GSEMatrix (getGEO returns one ExpressionSet per platform), grab its sample names, and use those to load the matching CEL files separately for each platform.

library(GEOquery)  # getGEO()
library(affy)      # ReadAffy(), affy::rma()
library(oligo)     # read.celfiles(), oligo::rma()

gse_name <- "GSE5054"
data_dir <- file.path(getwd(), "data")
gse_dir <- file.path(data_dir, gse_name)

#one ExpressionSet per platform
#(gse_dir is assumed to exist and to already contain the extracted CEL files)
esets <- getGEO(gse_name, destdir=gse_dir, GSEMatrix=TRUE)

data_list <- list()

for (eset in esets) {

    #build a regex that matches only this platform's sample names
    sample_names <- sampleNames(eset)
    pattern <- paste(".*", sample_names, ".*CEL", collapse="|", sep="")

    #get full paths for CELs that match pattern
    cel_paths <- list.files(gse_dir, pattern, full.names=TRUE, ignore.case=TRUE)

    #fallback used when affy warns or fails: load/rma the same CELs with oligo
    use_oligo <- function(cond) {
        raw_data <- read.celfiles(cel_paths)
        oligo::rma(raw_data)
    }

    #attempt to load/rma this platform's CELs with the affy package
    #(filenames=cel_paths, so that only the matching CELs are read)
    data <- tryCatch(
        {
            raw_data <- ReadAffy(filenames=cel_paths)
            affy::rma(raw_data)
        },
        warning = use_oligo,
        error = use_oligo
    )

    #add data to data_list, keyed by series and platform
    gpl_name <- annotation(eset)
    gse.gpl_name <- paste(gse_name, gpl_name, sep=".")
    data_list[[gse.gpl_name]] <- data
}
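
Once each platform has been normalized separately, as in the loop above, the combining step mentioned earlier in the thread could look something like the sketch below. It is not from the thread: stack_platforms is a hypothetical helper, and it assumes the columns of both matrices have already been renamed to a shared sample ID (the two arrays run for one sample typically carry different GSM accessions, so that mapping has to come from the phenotype data). Probe set IDs are prefixed with a platform label because some control probe sets appear on both chips.

```r
# Sketch only: stack two normalized expression matrices (probe sets x samples)
# into one matrix, keeping the samples that are present on both platforms.
# Assumes the columns were already renamed to shared sample IDs.
stack_platforms <- function(exprs_a, exprs_b, labels = c("A", "B")) {
  shared <- intersect(colnames(exprs_a), colnames(exprs_b))
  if (length(shared) == 0) stop("no sample IDs in common")
  # Prefix probe set IDs so probe sets present on both chips stay distinct.
  rownames(exprs_a) <- paste(labels[1], rownames(exprs_a), sep = ":")
  rownames(exprs_b) <- paste(labels[2], rownames(exprs_b), sep = ":")
  rbind(exprs_a[, shared, drop = FALSE], exprs_b[, shared, drop = FALSE])
}

# Toy matrices standing in for Biobase::exprs() of each normalized platform.
a <- matrix(1:4, nrow = 2, dimnames = list(c("p1", "AFFX-x"), c("s1", "s2")))
b <- matrix(5:10, nrow = 2, dimnames = list(c("p9", "AFFX-x"), c("s2", "s1", "s3")))
combined <- stack_platforms(a, b)

# Optionally wrap the result in a single ExpressionSet.
if (requireNamespace("Biobase", quietly = TRUE)) {
  combined_eset <- Biobase::ExpressionSet(assayData = combined)
}
```

With real data, the inputs would be the exprs() matrices of the two entries in data_list after renaming their columns.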