Question: GEOquery: Missing featureNames after creating expression set from series matrix file
knaxerova wrote:

Hi all,

I am having some trouble with expression sets created by the GEOquery package from GEO series matrix files. There seems to be a problem with parsing featureNames.

I begin by downloading a series matrix file:

> library(GEOquery)
> data <- getGEO(GEO = "GSE63252", destdir = getwd())

> eset <- data[[1]]
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 54675 features, 27 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM1544474 GSM1544475 ... GSM1544500 (27 total)
  varLabels: title geo_accession ... data_row_count (33 total)
  varMetadata: labelDescription
featureData
  featureNames: 1007_s_at 1053_at ... NA.17590 (54675 total)
  fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL570 

The expression sets created in this way have missing feature names from row 37085 onward; row 37084 is the last one with a proper probe ID.

> featureNames(eset)[37084]
[1] "227829_at"
> featureNames(eset)[37085]
[1] "NA"
> featureNames(eset)[37086]
[1] "NA.1"
> featureNames(eset)[50000]
[1] "NA.12915"
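
One quick way to gauge how many rows are affected is to count the placeholder names. The sketch below is only an illustration: `count_placeholder_names` is a hypothetical helper, not part of GEOquery or Biobase, and its pattern assumes the placeholders always look like the "NA", "NA.1", ... names printed above.

```r
# Hypothetical helper: count feature names that look like the placeholders
# seen above ("NA", "NA.1", "NA.2", ...).
count_placeholder_names <- function(feature_names) {
  sum(grepl("^NA(\\.[0-9]+)?$", feature_names))
}

# Small stand-in vector mirroring the names printed above
example_names <- c("1007_s_at", "227829_at", "NA", "NA.1", "NA.12915")
n_missing <- count_placeholder_names(example_names)
print(n_missing)

# On the real object (requires Biobase and the eset created above):
# count_placeholder_names(featureNames(eset))
```

A count well above zero on the real object would suggest that the feature annotation, rather than the expression matrix, is where the parsing went wrong.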

This happens with all series matrix files, not just this one, but everything is fine when creating expression sets from GDS records. Thanks so much in advance for any help.

Kamila

> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.3 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] affy_1.44.0         siggenes_1.40.0     multtest_2.22.0     GEOquery_2.32.0     Biobase_2.26.0      BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] affyio_1.34.0         BiocInstaller_1.16.4  bitops_1.0-6          MASS_7.3-40           preprocessCore_1.28.0 RCurl_1.95-4.5        stats4_3.1.3          survival_2.38-1      
 [9] tools_3.1.3           XML_3.98-1.1          zlibbioc_1.12.0  
modified 3.2 years ago by James W. MacDonald • written 3.2 years ago by knaxerova

Can you do me a favor and let me know the output of:

file.info('GPL570.soft')$size
modified 3.2 years ago • written 3.2 years ago by Sean Davis

Hi Sean - so sorry for the late reply! I did not realize my email notifications were turned off, and I did not see your post until today. Thanks so much for your help.

> file.info('GPL570.soft')$size
[1] 51941825

written 3.2 years ago by knaxerova

Looks like your GPL570.soft file is probably truncated: it is 51941825 bytes, whereas I get a file size of 65028051 bytes. I would suggest removing it and refetching a new copy.
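
For anyone hitting the same thing, the remove-and-refetch step might look like the sketch below. `remove_cached_file` is a hypothetical helper, not part of GEOquery. It assumes the platform annotation was cached as `GPL570.soft` in the `destdir` passed to `getGEO`, as in this thread; other GEOquery versions may use a different cache file name.

```r
# Hypothetical helper: delete a cached file if it exists.
# Returns TRUE if a file was removed, FALSE if there was nothing to remove.
remove_cached_file <- function(file_name, destdir = getwd()) {
  path <- file.path(destdir, file_name)
  if (!file.exists(path)) {
    return(FALSE)
  }
  file.remove(path)
}

# Demonstration in a throwaway directory, using a dummy stand-in file
demo_dir <- tempfile("geo_cache_")
dir.create(demo_dir)
writeLines("truncated", file.path(demo_dir, "GPL570.soft"))
removed <- remove_cached_file("GPL570.soft", destdir = demo_dir)
still_there <- file.exists(file.path(demo_dir, "GPL570.soft"))

# In the real session (needs network access and the GEOquery package):
# remove_cached_file("GPL570.soft", destdir = getwd())
# data <- getGEO(GEO = "GSE63252", destdir = getwd())
# file.info("GPL570.soft")$size   # compare against the size quoted above
```

Deleting the cached copy matters because `getGEO` was called with a `destdir`, so a truncated file left in that directory can be picked up again instead of a fresh download.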

written 3.2 years ago by Sean Davis

Wonderful, problem solved! Thank you so much!

written 3.2 years ago by knaxerova