Question: ReadAffy() Error - 2 Platforms in one file
Voke AO wrote, 7.6 years ago:
Hi all,

I get the following error when I try to get an AffyBatch object with the code below. I understand that the Hu133a and Hu133b platforms were used. When using getGEO(), it's a bit more straightforward, as I can specify pData(phenoData(gse9006dat[[2]])) or pData(phenoData(gse9006dat[[1]])) for the data I would like to analyse, but I can't seem to figure it out with the raw data. Any help will be greatly appreciated.

Thanks.
-Avoks

> untar("GSE9006/GSE9006_RAW.tar", exdir="study2")
> celFiles = unlist(list.files("study2", full.names = TRUE))
> gse9006preset = ReadAffy(filenames = celFiles)
Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
  Cel file study2/GSM254177.CEL.gz does not seem to be of HG-U133A type

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_x.1252  LC_CTYPE=English_x.1252
[3] LC_MONETARY=English_x.1252 LC_NUMERIC=C
[5] LC_TIME=English_x.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] u133x3pcdf_2.10.0    ggplot2_0.9.0        u133x3p.db_2.7.1
 [4] hgu133plus2.db_2.7.1 hgu95av2.db_2.7.1    hgu95a.db_2.7.1
 [7] hgu133b.db_2.7.1     hgu133a.db_2.7.1     hu6800.db_2.7.1
[10] org.Hs.eg.db_2.7.1   RSQLite_0.11.1       DBI_0.2-5
[13] annotate_1.34.0      AnnotationDbi_1.18.0 limma_3.12.0
[16] affy_1.34.0          GEOquery_2.23.1      Biobase_2.16.0
[19] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
 [1] affyio_1.24.0         BiocInstaller_1.4.3   colorspace_1.1-1
 [4] dichromat_1.2-4       digest_0.5.2          grid_2.15.0
 [7] IRanges_1.14.2        MASS_7.3-17           memoise_0.1
[10] munsell_0.3           plyr_1.7.1            preprocessCore_1.18.0
[13] proto_0.3-9.2         RColorBrewer_1.0-5    RCurl_1.91-1.1
[16] reshape2_1.2.1        scales_0.2.0          stats4_2.15.0
[19] stringr_0.6           tools_2.15.0          XML_3.9-4.1
[22] xtable_1.7-0          zlibbioc_1.2.0
Answer: ReadAffy() Error - 2 Platforms in one file
Sean Davis (United States) wrote, 7.6 years ago:
On Thu, May 10, 2012 at 7:11 AM, Ovokeraye Achinike-Oduaran wrote:
> > gse9006preset = ReadAffy(filenames = celFiles)
> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>   Cel file study2/GSM254177.CEL.gz does not seem to be of HG-U133A type

You'll need to process the HG-U133A and HG-U133B files separately. ReadAffy is just telling you that.

Sean
Voke AO replied, 7.6 years ago:

Hi Sean,

I realize that from the error, but I'm not quite sure how to go about processing it, hence the request for help on how to go about it. That's why I had said that getGEO() is a bit more straightforward.

Thanks.
-Avoks
Sean Davis replied, 7.6 years ago:

You will need to determine which .CEL files are from the U133A platform and which are from the U133B platform. Then, you'll want to use those lists of CEL file names to call ReadAffy once for each list. It will be up to you to combine the results into one ExpressionSet after your normalization is complete for both platforms.

Hope that helps clarify things a bit.

Sean
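The steps Sean describes could be sketched roughly as follows. This is only an illustration, not code from the thread: split_by_chip() and rma_by_chip() are hypothetical helper names, and the sketch assumes that affyio::read.celfile.header() can read the (possibly gzipped) CEL headers and that its cdfName element identifies the chip type.

```r
# Hypothetical helper: group CEL files by the chip type stored in each header.
# chip_of defaults to affyio::read.celfile.header(), whose result includes a
# cdfName element (assumed here to be e.g. "HG-U133A" or "HG-U133B").
split_by_chip <- function(cel_files,
                          chip_of = function(f) affyio::read.celfile.header(f)$cdfName) {
  chips <- vapply(cel_files, chip_of, character(1), USE.NAMES = FALSE)
  split(cel_files, chips)
}

# Hypothetical helper: call ReadAffy once per chip type and RMA-normalise each
# batch on its own. Returns a named list of ExpressionSets, one per chip type.
rma_by_chip <- function(cel_dir) {
  cel_files <- list.files(cel_dir, pattern = "\\.cel(\\.gz)?$",
                          ignore.case = TRUE, full.names = TRUE)
  lapply(split_by_chip(cel_files), function(files) {
    affy::rma(affy::ReadAffy(filenames = files))
  })
}
```

With the files from the question, something like esets <- rma_by_chip("study2") followed by names(esets) should then list one entry per platform. Combining the two normalized sets is deliberately left out, since how to do that depends on the analysis.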
Voke AO replied, 7.6 years ago:

Thanks Sean. I guess I was looking for an easier way out. But all sorted.

Thanks.
-Avoks
Answer: ReadAffy() Error - 2 Platforms in one file
alexvpickering wrote, 3.7 years ago:

A bit late with the response, but this might be useful to someone else, as I just had to solve this problem myself. The basic idea is to get each GSEMatrix (one ExpressionSet per platform), grab its sample names, and use those to load the matching CEL files separately.

library(GEOquery)  # getGEO()
library(Biobase)   # sampleNames(), annotation()
library(affy)      # ReadAffy(), affy::rma()
#oligo is called via oligo:: below so that it does not mask affy's functions

gse_name <- "GSE5054"
gse_dir <- file.path(getwd(), "data", gse_name)
#assumption: the CEL files from the series' RAW archive are already in gse_dir
dir.create(gse_dir, recursive = TRUE, showWarnings = FALSE)
esets <- getGEO(gse_name, destdir = gse_dir, GSEMatrix = TRUE)

data_list <- list()

for (eset in esets) {
    #build one regex that matches the CEL file of any sample on this platform
    sample_names <- sampleNames(eset)
    pattern <- paste(".*", sample_names, ".*CEL", collapse = "|", sep = "")

    #get full paths for CELs that match pattern
    cel_paths <- list.files(gse_dir, pattern, full.names = TRUE, ignore.case = TRUE)

    #try to load and RMA-normalize only this platform's CELs with affy;
    #on any warning or error, fall back to oligo
    fallback <- function(cond) oligo::rma(oligo::read.celfiles(cel_paths))
    data <- tryCatch(
        affy::rma(ReadAffy(filenames = cel_paths)),
        warning = fallback,
        error = fallback
    )

    #add data to data_list, keyed by series and platform
    gpl_name <- annotation(eset)
    data_list[[paste(gse_name, gpl_name, sep = ".")]] <- data
}
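
One caveat with the pattern built from the sample names above: a pattern like ".*GSM12.*CEL" also matches a file for GSM123. A stricter match could look like the sketch below. match_cel_files() is a hypothetical helper, and it assumes the CEL file names start with the GSM accession, as in GSM254177.CEL.gz from the question.

```r
# Hypothetical helper (not part of GEOquery or affy): pick the CEL files that
# belong to a given set of GEO sample accessions.
match_cel_files <- function(sample_names, files) {
  # Assumed layout: each file name starts with its accession, e.g.
  # "GSM254177.CEL.gz" or "GSM254177_extra.CEL.gz".
  accession <- sub("^(GSM[0-9]+).*$", "\\1", basename(files))
  # Keep only CEL files, gzipped or not, regardless of case.
  is_cel <- grepl("\\.cel(\\.gz)?$", basename(files), ignore.case = TRUE)
  # Compare whole accessions so GSM12 never picks up GSM123.
  files[is_cel & accession %in% sample_names]
}
```

Inside the loop this could stand in for the list.files() call, for example cel_paths <- match_cel_files(sampleNames(eset), list.files(gse_dir, full.names = TRUE)).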