Jack Zhu
Hi Thomas,
Sorry that I missed your posts on the Bioconductor mailing list. We
did have issues with updating recent GEO data, and that seems to have
been fixed:
-----------------------
> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
> dat <- dbGetQuery(con, "select * from gds where gds = 'GDS4252'")
> dat
ID gds
title
1 3354 GDS4252 Cystic fibrosis bronchial epithelial cells exposure to
Pseudomonas aeruginosa PA01 biofilms
description
1 Analysis of cystic fibrosis (CF) bronchial epithelial CFBE41o- cells
exposed to Pseudomonas aeruginosa PA01 biofilms. Cells overexpressing
508del-CFTR and cells rescued with wild type CFTR were examined. CFTR
mutations enhance the inflammatory response in the lung to PA01
infection.
type pubmed_id gpl platform_organism
platform_technology_type feature_count sample_organism sample_type
channel_count
1 Expression profiling by array 22821996 GPL570 Homo sapiens in
situ oligonucleotide 54675 Homo sapiens RNA
1
sample_count value_type gse order update_date
1 16 transformed count GSE30439 none 2013-04-23
----------------------------
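If it helps, the same check can be scripted so it is easy to repeat after each download. This is only a minimal sketch: `gds_present` is a name made up for illustration, and it assumes the DBI and RSQLite packages are installed.

```r
library(DBI)
library(RSQLite)

# Hypothetical helper (not part of GEOmetadb): TRUE when the given GDS
# accession has at least one row in the gds table.
gds_present <- function(con, gds_id) {
  res <- dbGetQuery(
    con,
    "select count(*) as n from gds where gds = ?",
    params = list(gds_id)  # bound parameter instead of pasting into the SQL
  )
  res$n[1] > 0
}

# Usage against a local copy of the database:
# con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
# gds_present(con, "GDS4252")
# dbDisconnect(con)
```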
Could you re-download GEOmetadb.sqlite.gz and try again? Please
don't hesitate to contact me directly if you still see any problems.
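One thing to watch for is an old copy of the file lingering in the working directory. A rough sketch of a clean refresh follows. `refresh_geometadb` is a hypothetical wrapper, and it assumes that `getSQLiteFile` unpacks the archive to `GEOmetadb.sqlite` in `destdir`, as in the session quoted below.

```r
# Hypothetical helper: delete stale local copies, then fetch a fresh database.
# It downloads a large file, so it is meant to be called by hand.
refresh_geometadb <- function(destdir = getwd()) {
  stale <- file.path(destdir, c("GEOmetadb.sqlite", "GEOmetadb.sqlite.gz"))
  unlink(stale[file.exists(stale)])
  GEOmetadb::getSQLiteFile(destdir = destdir, destfile = "GEOmetadb.sqlite.gz")
  # Assumed location of the unpacked database.
  file.path(destdir, "GEOmetadb.sqlite")
}

# Usage:
# db  <- refresh_geometadb()
# con <- DBI::dbConnect(RSQLite::SQLite(), db)
# DBI::dbGetQuery(con, "select gds, update_date from gds where gds = 'GDS4252'")
# DBI::dbDisconnect(con)
```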
Thanks.
Jack
On Thu, Jun 13, 2013 at 9:05 AM, Thomas H. Hampton
<thomas.h.hampton at dartmouth.edu> wrote:
> Hi Sean and Jack,
>
> Sorry to pester you with this. I posted it to BioC twice and got no
> response so I thought I should try contacting you more directly.
>
>
> The following getGEO query retrieves data files and metadata for a
> recent GEO submission of mine, one that has been curated:
>
> GDS4252 <- getGEO("GDS4252")
> Columns(GDS4252)
>> str(Columns(GDS4252))
> 'data.frame': 16 obs. of 4 variables:
> $ sample : Factor w/ 16 levels "GSM754979","GSM754980",..: 5 6 7 8 1 2 3 4 13 14 ...
> $ genotype/variation: Factor w/ 2 levels "CFTR mutant",..: 1 1 1 1 1 1 1 1 2 2 ...
> $ agent : Factor w/ 2 levels "PA01","unexposed": 1 1 1 1 2 2 2 2 1 1 ...
>
> The folks at NCBI have correctly created two factors with two levels
> to describe the 16 samples in my experiment.
>
> I am interested in retrieving similar information using GEOmetadb,
> but this has proved problematic.
>
> getSQLiteFile(destdir = getwd(), destfile = "GEOmetadb.sqlite.gz")
>
> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
> dat <- dbGetQuery(con, "select * from gds where gds = 'GDS4252'")
>
>> dat
> [1] ID gds title
> [4] description type pubmed_id
> [7] gpl platform_organism platform_technology_type
> [10] feature_count sample_organism sample_type
> [13] channel_count sample_count value_type
> [16] gse order update_date
> <0 rows> (or 0-length row.names)
>
> It seems, for starters, that this GDS identifier for my particular
> submission isn't accounted for in the current database.
>
> Others are, so it looks like my syntax and so forth is ok:
>
>> dat <- dbGetQuery(con, "select gds from gds limit 10")
>> dat
> gds
> 1 GDS5
> 2 GDS6
> 3 GDS10
> 4 GDS12
> 5 GDS15
> 6 GDS16
> 7 GDS17
> 8 GDS18
> 9 GDS19
> 10 GDS20
>
>
> There is also the question of where a set of fields (variable in
> number) describing sample factors and their levels would actually
> "live" in the SQLite database.
>
> This information does not seem to be an attribute of the GDS in any
> case:
>
>> dat <- dbGetQuery(con, "select fieldname from geodb_column_desc where TableName = 'gds'")
>> dat
> FieldName
> 1 ID
> 2 channel_count
> 3 description
> 4 feature_count
> 5 gds
> 6 order
> 7 platform
> 8 platform_organism
> 9 platform_technology_type
> 10 pubmed_id
> 11 reference_series
> 12 sample_count
> 13 sample_organism
> 14 sample_type
> 15 title
> 16 type
> 17 update_date
> 18 value_type
>
> Nor does it seem to be a feature stored in the samples:
>
>> dat <- dbGetQuery(con, "select fieldname from geodb_column_desc where TableName = 'gsm'")
>> dat
> FieldName
> 1 ID
> 2 channel_count
> 3 characteristics_ch1
> 4 characteristics_ch2
> 5 contact
> 6 data_processing
> 7 data_row_count
> 8 description
> 9 extract_protocol_ch1
> 10 extract_protocol_ch2
> 11 gpl
> 12 gse
> 13 gsm
> 14 hyb_protocol
> 15 label_ch1
> 16 label_ch2
> 17 label_protocol_ch1
> 18 label_protocol_ch2
> 19 last_update_date
> 20 molecule_ch1
> 21 molecule_ch2
> 22 organism_ch1
> 23 organism_ch2
> 24 source_name_ch1
> 25 source_name_ch2
> 26 status
> 27 submission_date
> 28 supplementary_file
> 29 title
> 30 treatment_protocol_ch1
> 31 treatment_protocol_ch2
> 32 type
>
>
> Any advice greatly appreciated.
>
>
> Tom
>
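On the second question quoted above (where the sample factors and their levels live), one place worth checking is a separate table for GDS subsets rather than the gds or gsm tables. The sketch below assumes the database ships a table named `gds_subset` with a `gds` column. That is an assumption to verify with `dbListTables(con)` and `dbListFields(con, "gds_subset")` on your own copy, and `gds_factors` is a hypothetical helper name.

```r
library(DBI)
library(RSQLite)

# Hypothetical helper: return all subset rows for one GDS, or NULL when the
# assumed gds_subset table is absent from this copy of the database.
gds_factors <- function(con, gds_id) {
  if (!("gds_subset" %in% dbListTables(con))) {
    return(NULL)
  }
  # select * avoids assuming any column other than gds.
  dbGetQuery(con, "select * from gds_subset where gds = ?", params = list(gds_id))
}

# Usage:
# con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
# gds_factors(con, "GDS4252")
# dbDisconnect(con)
```

If the table is present, each returned row should correspond to one level of one factor, which is the kind of grouping that `Columns(GDS4252)` reports.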