ReadAffy() Error - 2 Platforms in one file

Voke AO ▴ 760

Hi all,

I get the following error when I try to get an affybatch object in the code below. I understand that the Hu133a and Hu133b platforms were used. When using getGEO, it's a bit more straightforward, as I can specify pData(phenoData(gse9006dat[[2]])) or pData(phenoData(gse9006dat[[1]])) for the data I would like to analyse, but I can't seem to figure it out with the raw data. Any help will be greatly appreciated. Thanks.

-Avoks

> untar("GSE9006/GSE9006_RAW.tar", exdir="study2")
> celFiles = unlist(list.files("study2", full.names = TRUE))
> gse9006preset = ReadAffy(filenames = celFiles)
Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
  Cel file study2/GSM254177.CEL.gz does not seem to be of HG-U133A type

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_x.1252 LC_CTYPE=English_x.1252
[3] LC_MONETARY=English_x.1252 LC_NUMERIC=C
[5] LC_TIME=English_x.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] u133x3pcdf_2.10.0 ggplot2_0.9.0 u133x3p.db_2.7.1
 [4] hgu133plus2.db_2.7.1 hgu95av2.db_2.7.1 hgu95a.db_2.7.1
 [7] hgu133b.db_2.7.1 hgu133a.db_2.7.1 hu6800.db_2.7.1
[10] org.Hs.eg.db_2.7.1 RSQLite_0.11.1 DBI_0.2-5
[13] annotate_1.34.0 AnnotationDbi_1.18.0 limma_3.12.0
[16] affy_1.34.0 GEOquery_2.23.1 Biobase_2.16.0
[19] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
 [1] affyio_1.24.0 BiocInstaller_1.4.3 colorspace_1.1-1
 [4] dichromat_1.2-4 digest_0.5.2 grid_2.15.0
 [7] IRanges_1.14.2 MASS_7.3-17 memoise_0.1
[10] munsell_0.3 plyr_1.7.1 preprocessCore_1.18.0
[13] proto_0.3-9.2 RColorBrewer_1.0-5 RCurl_1.91-1.1
[16] reshape2_1.2.1 scales_0.2.0 stats4_2.15.0
[19] stringr_0.6 tools_2.15.0 XML_3.9-4.1
[22] xtable_1.7-0 zlibbioc_1.2.0

ADD COMMENT • link updated 8.1 years ago by alexvpickering ▴ 110 • written 12.0 years ago by Voke AO ▴ 760

Sean Davis 21k

On Thu, May 10, 2012 at 7:11 AM, Ovokeraye Achinike-Oduaran <ovokeraye@gmail.com> wrote:

> > gse9006preset = ReadAffy(filenames = celFiles)
> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>   Cel file study2/GSM254177.CEL.gz does not seem to be of HG-U133A type

You'll need to process the HG-U133A and HG-U133B files separately. ReadAffy is just telling you that.

Sean

> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD COMMENT • link 12.0 years ago Sean Davis 21k

Hi Sean,

I realize that from the error, but I'm not quite sure how to go about processing the files, hence the request for help. That's why I said that getGEO() is a bit more straightforward.

Thanks.

-Avoks

ADD REPLY • link 12.0 years ago Voke AO ▴ 760

On Thu, May 10, 2012 at 7:41 AM, Ovokeraye Achinike-Oduaran <ovokeraye@gmail.com> wrote:

> I realize that from the error but I'm not quite sure how to go about
> processing it, hence the request for help on how to go about it.
> That's why I had said that the getGEO() is a bit more straightforward.

You will need to determine which .CEL files are from the U133A platform and which are from the U133B platform. Then, you'll want to use those lists of CEL file names to call ReadAffy once for each list. It will be up to you to combine the results into one ExpressionSet after your normalization is complete for both platforms.

Hope that helps clarify things a bit.

Sean
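One way to carry out the steps above, as a minimal sketch rather than a definitive recipe: read the chip type from each CEL file header, group the file names by that type, and call ReadAffy once per group. It assumes that affyio::read.celfile.header() returns a list with a `cdfName` element and that the archive was extracted to "study2" as in the question. The helper names `cel_chip_type` and `split_cels_by_chip`, and the `read_chip` argument, are made up for illustration.

```r
# Return the chip type stored in a CEL file header.
# Assumption: the header list from affyio has a `cdfName` element.
cel_chip_type <- function(path) {
    affyio::read.celfile.header(path)$cdfName
}

# Group CEL file names by chip type.
# `read_chip` is a hypothetical hook so the grouping can be tried
# without real CEL files; it defaults to reading the file header.
split_cels_by_chip <- function(cel_files, read_chip = cel_chip_type) {
    chip <- vapply(cel_files, read_chip, character(1), USE.NAMES = FALSE)
    split(cel_files, chip)
}

# Usage, only run when the extracted directory from the question exists
if (dir.exists("study2")) {
    cel_files <- list.files("study2", pattern = "\\.CEL(\\.gz)?$",
                            full.names = TRUE, ignore.case = TRUE)
    cels_by_chip <- split_cels_by_chip(cel_files)

    # One AffyBatch per platform, each normalized on its own
    batches <- lapply(cels_by_chip, function(f) affy::ReadAffy(filenames = f))
    esets <- lapply(batches, affy::rma)
}
```

The result is a list with one normalized ExpressionSet per chip type, named by chip type. Combining them into a single object is left out here, since how to match samples across the two platforms depends on the study design.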

ADD REPLY • link 12.0 years ago Sean Davis 21k

Thanks Sean. I guess I was looking for an easier way out. But all sorted. Thanks.

-Avoks

ADD REPLY • link 12.0 years ago Voke AO ▴ 760

alexvpickering ▴ 110

Bit late on the response, but this might be useful to someone else. I just had to solve this problem myself. The basic idea is to get each GSEMatrix, grab the sample names, and use those to load the appropriate CEL files separately.

library(GEOquery)  #getGEO
library(Biobase)   #sampleNames, annotation
#affy and oligo both define rma, so their functions are called with a namespace prefix

gse_name <- "GSE5054"
data_dir <- file.path(getwd(), "data")
gse_dir <- file.path(data_dir, gse_name)
#getGEO needs destdir to exist; the CEL files are assumed to be in gse_dir already
dir.create(gse_dir, recursive=TRUE, showWarnings=FALSE)
esets <- getGEO(gse_name, destdir=gse_dir, GSEMatrix=TRUE)

data_list <- list()

for (eset in esets) {

    #one ExpressionSet per platform; its sample names identify that platform's CELs
    sample_names <- sampleNames(eset)
    pattern <- paste(".*", sample_names, ".*CEL", collapse="|", sep="")

    #get full paths for CEL's that match pattern
    cel_paths <- list.files(gse_dir, pattern, full.names=TRUE, ignore.case=TRUE)

    #the following will attempt to load/rma correct CELs with affy package
    #if it fails, it tries oligo
    data <- tryCatch (
        {
        #read only this platform's CELs, not every CEL in gse_dir
        raw_data <- affy::ReadAffy(filenames=cel_paths)
        affy::rma(raw_data)
        },
        warning = function(cond) {
            raw_data <- oligo::read.celfiles(cel_paths)
            return (oligo::rma(raw_data))
        },
        error = function(cond) {
            raw_data <- oligo::read.celfiles(cel_paths)
            return (oligo::rma(raw_data))
        }
    )
    #add data to data_list
    gpl_name <- annotation(eset)
    gse.gpl_name <- paste(gse_name, gpl_name, sep=".")
    data_list[[gse.gpl_name]] <- data
}

ADD COMMENT • link 8.1 years ago alexvpickering ▴ 110
