Question

Array Express HuGene-1_1-st-v1 CDF problem

mlyall • 0

@mlyall-7242

United Kingdom

Dear Team.

I am trying to analyse some microarray data that I downloaded from ArrayExpress, specifically E-GEOD-48452.

Downloading the data as an AffyBatch object works fine. However, when I try to perform any analysis, such as

library(ArrayExpress)  # provides ArrayExpress()
library(gcrma)         # provides gcrma()
geod_48452 <- ArrayExpress("E-GEOD-48452")
boxplot(geod_48452)
gcrma(geod_48452)

I get the message:

Could not obtain CDF environment, problems encountered:
Specified environment does not contain HuGene-1_1-st-v1
Library - package hugene11stv1cdf not installed
Bioconductor - hugene11stv1cdf not available

I have installed and updated every package I can find, but I cannot get around this. Any help on how to download and integrate this CDF would be greatly appreciated.

Many thanks in advance

Marcus Lyall

Clinical Fellow University of Edinburgh

sessionInfo()

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] simpleaffy_2.42.0 gcrma_2.38.0 genefilter_1.48.1 ArrayExpress_1.26.0 affycoretools_1.38.0 GO.db_3.0.0 RSQLite_1.0.0 DBI_0.3.1
[9] AnnotationDbi_1.28.2 GenomeInfoDb_1.2.5 IRanges_2.0.1 S4Vectors_0.4.0 affy_1.44.0 Biobase_2.26.0 BiocGenerics_0.12.1 BiocInstaller_1.16.2

loaded via a namespace (and not attached):
[1] acepack_1.3-3.3 affyio_1.34.0 annaffy_1.38.0 annotate_1.44.0 AnnotationForge_1.8.2 base64enc_0.1-2
[7] BatchJobs_1.6 BBmisc_1.9 BiocParallel_1.0.3 biomaRt_2.22.0 Biostrings_2.34.1 biovizBase_1.14.1
[13] bit_1.1-12 bitops_1.0-6 brew_1.0-6 BSgenome_1.34.1 Category_2.32.0 caTools_1.17.1
[19] checkmate_1.5.2 cluster_2.0.1 codetools_0.2-11 colorspace_1.2-6 DESeq2_1.6.3 dichromat_2.0-0
[25] digest_0.6.8 edgeR_3.8.6 evaluate_0.6 fail_1.2 ff_2.2-13 foreach_1.4.2
[31] foreign_0.8-63 formatR_1.1 Formula_1.2-1 gdata_2.13.3 geneplotter_1.44.0 GenomicAlignments_1.2.2
[37] GenomicFeatures_1.18.7 GenomicRanges_1.18.4 GGally_0.5.0 ggbio_1.14.0 ggplot2_1.0.1 GOstats_2.32.0
[43] gplots_2.16.0 graph_1.44.1 grid_3.1.2 gridExtra_0.9.1 GSEABase_1.28.0 gtable_0.1.2
[49] gtools_3.4.2 Hmisc_3.15-0 hwriter_1.3.2 iterators_1.0.7 KernSmooth_2.23-14 knitr_1.9
[55] lattice_0.20-31 latticeExtra_0.6-26 limma_3.22.7 locfit_1.5-9.1 MASS_7.3-40 Matrix_1.2-0
[61] munsell_0.4.2 nnet_7.3-9 oligoClasses_1.28.0 OrganismDbi_1.8.1 PFAM.db_3.0.0 plyr_1.8.1
[67] preprocessCore_1.28.0 proto_0.3-10 R.methodsS3_1.7.0 R.oo_1.19.0 R.utils_2.0.0 RBGL_1.42.0
[73] RColorBrewer_1.1-2 Rcpp_0.11.5 RcppArmadillo_0.4.650.1.1 RCurl_1.95-4.5 ReportingTools_2.6.0 reshape_0.8.5
[79] reshape2_1.4.1 rpart_4.1-9 Rsamtools_1.18.3 rtracklayer_1.26.3 scales_0.2.4 sendmailR_1.2-1
[85] splines_3.1.2 stringr_0.6.2 survival_2.38-1 tools_3.1.2 VariantAnnotation_1.12.9 XML_3.98-1.1
[91] xtable_1.7-4 XVector_0.6.0 zlibbioc_1.12.0

cdf microarray affy

updated 9.0 years ago by James W. MacDonald • written 9.0 years ago by mlyall

score 0 · Answer 1 · 2015-04-14

You will be better off using either oligo or xps to process HuGene 1.1 ST data. In principle you can use gcrma() on these files, but getting that to work will likely take quite a bit of effort. Instead, you can just use oligo, like so:

> library(ArrayExpress)
> ## getAE() downloads the files; the rawFiles element holds the CEL file names
> geod_48452 <- getAE("E-GEOD-48452")
> library(oligo)
> dat <- read.celfiles(filenames = geod_48452$rawFiles)
> eset <- rma(dat)
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 33297 features, 73 samples
  element names: exprs
protocolData
  rowNames: GSM1179042_A1649-14.CEL GSM1179041_A1649-13.CEL ...
    GSM1178970_A1359-01.CEL (73 total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM1179042_A1649-14.CEL GSM1179041_A1649-13.CEL ...
    GSM1178970_A1359-01.CEL (73 total)
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hugene.1.1.st.v1
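Once you have eset, a quick per-sample look at the normalised values covers roughly what the boxplot() call in the question was after. The following is only a minimal sketch: sample_summary is a hypothetical helper (not part of oligo or Biobase), and it assumes a numeric matrix with probesets in rows and samples in columns, which is what exprs(eset) returns.

```r
# Hypothetical helper (not from any package): quartiles of each sample
# (column) of an expression matrix, one row per sample in the result.
sample_summary <- function(expr) {
  stopifnot(is.matrix(expr), is.numeric(expr))
  t(apply(expr, 2, quantile, probs = c(0.25, 0.5, 0.75), na.rm = TRUE))
}

# With the objects from the session above (assumes Biobase is installed):
#   expr <- Biobase::exprs(eset)           # probesets x samples matrix
#   head(sample_summary(expr))             # quartiles per CEL file
#   boxplot(expr, las = 2, cex.axis = 0.5) # one box per sample
```

After rma() the quartiles should look very similar across samples; a sample that stands out is worth a closer look.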

If you want a method that takes probe sequence information into account, you could use SCAN.UPC. And since this data set is also deposited at GEO under GSE48452, you can just do:

library(SCAN.UPC)
eset <- SCAN("GSE48452")
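Both routes assume the relevant packages are already installed, and a missing package was the root of the original error. As a rough sketch, something like the following reports which packages are missing before you start; missing_packages is a hypothetical helper written for this post, and the package list is my assumption based on the code and the Annotation line in the output above.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# Packages used in this answer (assumed list, adjust as needed).
needed <- c("ArrayExpress", "oligo", "pd.hugene.1.1.st.v1", "SCAN.UPC")
to_install <- missing_packages(needed)
print(to_install)

# On current Bioconductor releases, anything listed can be installed with:
#   BiocManager::install(to_install)
```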