Question

collect an expression dataset


libya.tahani • 0

@libyatahani-9206

Last seen 9.3 years ago

Libyan Arab Jamahiriya

Hello,

I want to collect raw data or an expression dataset for breast cancer to validate my project, but I have not found one that meets my two requirements:

1) tumor-normal samples

2) Affymetrix (affy) arrays

Is there anyone who has used such a dataset, or who has information about one?

Could you give me a GEO link or website?

Regards,

Tahani,

limma CEL GEO affydata expresiondataset • 2.6k views

updated 10.1 years ago by Robert Castelo ★ 3.4k • written 10.1 years ago by libya.tahani • 0

score 0 · Answer 1 · 2015-11-25


Robert Castelo ★ 3.4k

@rcastelo

Last seen 3 months ago

Barcelona/Universitat Pompeu Fabra

hi Tahani,

You should check the GEOmetadb Bioconductor package, which lets you query the Gene Expression Omnibus (GEO) metadata for the kind of data you are looking for. The package vignette is a good starting point: it has several examples that you can adapt to your needs.
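As a rough sketch of the kind of query the vignette describes, the following looks for GEO series whose title mentions breast cancer and that were run on an Affymetrix human platform. The table and column names (gse, gse_gpl, gpl, manufacturer, organism) follow the schema used in the GEOmetadb vignette; the search terms, the database path, and the helper name build_query are assumptions made for illustration. A title match does not guarantee tumor-normal pairs, so each hit still needs to be checked by hand.

```r
# Hypothetical helper: build the SQL for one title keyword and one manufacturer.
# The keyword is pasted into the query as-is, so it must not contain quotes.
build_query <- function(keyword, manufacturer = "Affymetrix") {
  paste(
    "SELECT DISTINCT gse.gse, gse.title, gpl.gpl, gpl.title AS platform",
    "FROM gse",
    "JOIN gse_gpl ON gse_gpl.gse = gse.gse",
    "JOIN gpl ON gse_gpl.gpl = gpl.gpl",
    sprintf("WHERE gse.title LIKE '%%%s%%'", keyword),
    sprintf("AND gpl.manufacturer LIKE '%%%s%%'", manufacturer),
    "AND gpl.organism LIKE '%Homo sapiens%'"
  )
}

sql <- build_query("breast cancer")

db_file <- "GEOmetadb.sqlite"  # assumed location of the unzipped database

# Only run the query if RSQLite is installed and the database is present.
if (requireNamespace("RSQLite", quietly = TRUE) && file.exists(db_file)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), db_file)
  hits <- DBI::dbGetQuery(con, sql)
  DBI::dbDisconnect(con)
  print(head(hits))
}
```

The GSE accessions returned this way can then be looked up on the GEO website to confirm that raw CEL files and normal samples are available.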

cheers,

robert.


Martin Morgan · Answer 2 · 2015-11-25

Thanks, but I ran into a problem while downloading the database in R:

if(!file.exists('GEOmetadb.sqlite')) getSQLiteFile()

 trying URL 'http://dl.dropbox.com/u/51653511/GEOmetadb.sqlite.gz'
Content type 'application/octet-stream' length 272661479 bytes (260.0 MB)
downloaded 66.5 MB

Unzipping...
Error in sqliteSendQuery(con, statement, bind.data) :
  error in statement: database disk image is malformed
In addition: Warning messages:
1: In url(url_geo_1, open = "rb") :
  InternetOpenUrl failed: 'The operation timed out'
2: In url(url_geo_2, open = "rb") :
  cannot open: HTTP status was '403 Forbidden'
3: In download.file(url_geo, destfile = localfile, mode = "wb") :
  downloaded length 69693440 != reported length 272661479
4: closing unused connection 3 (http://gbnci.abcc.ncifcrf.gov/geo/GEOmetadb.sqlite.gz)
5: In file.remove(filename) :
  cannot remove file 'C:/Users/SONY/Documents/myproject/CELCEL/CELfiles/GEOmetadb.sqlite.gz', reason 'Permission denied'
Error in sqliteSendQuery(con, statement, bind.data) :
  error in statement: database disk image is malformed

score 0 · Answer 3 · 2015-11-25


Robert Castelo ★ 3.4k

@rcastelo

Last seen 3 months ago

Barcelona/Universitat Pompeu Fabra

Hi,

From your output it looks like the database (.sqlite.gz) file has not been properly downloaded. This file should actually take a few Gbytes, so the message that only 66.5 MB were downloaded indicates that something went wrong in this first step.

Please make sure that you are running the latest R (3.2.2) and the latest version of GEOmetadb. If you are unsure about this, please show the output of the command sessionInfo().

Maybe you should check with your local system administrator about these connectivity issues.
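One way to tell whether the download finished is to compare the size of the archive on disk with the length the server reported (272661479 bytes in the log above). The sketch below uses only base R; the helper name download_is_complete and the file path are assumptions, and raising the timeout option may or may not help depending on where the connection is being cut.

```r
# Hypothetical helper: TRUE only if the file exists and has the expected size.
download_is_complete <- function(path, expected_bytes) {
  file.exists(path) && file.size(path) == expected_bytes
}

gz_file  <- "GEOmetadb.sqlite.gz"  # assumed path of the downloaded archive
expected <- 272661479              # length reported by the server in the log

# Give slow connections more time than R's default of 60 seconds.
options(timeout = 3600)

if (!download_is_complete(gz_file, expected)) {
  message("Archive is missing or truncated; delete it and download again.")
}
```

A partial archive left over from a failed attempt should be removed before retrying, otherwise the unzip step can fail again with the same "malformed" error.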

cheers,

robert.



A small correction to my answer: when I said that the sqlite file is a few Gbytes large, I was referring to the uncompressed version; the compressed .sqlite.gz version is 260.0 MB, as shown in your output. Just to show you what you should be seeing, this is the result of this operation on my Linux box:

library(GEOmetadb)

## here i'm explicitly giving a destination for the file using argument 'destdir'
getSQLiteFile(destdir="/home/rcastelo")
trying URL 'http://gbnci.abcc.ncifcrf.gov/geo/GEOmetadb.sqlite.gz'
Content type 'application/x-gzip' length 272661479 bytes (260.0 MB)
==================================================
downloaded 260.0 MB

Unzipping...
Metadata associate with downloaded file:
                name               value
1     schema version                 1.0
2 creation timestamp 2015-11-21 11:48:28
[1] "/home/rcastelo/GEOmetadb.sqlite"

After this, you can start querying the GEOmetadb.sqlite file as described in the vignette of GEOmetadb. This is my session information, showing the specific version of each package involved:

sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora release 12 (Constantine)

locale:
 [1] LC_CTYPE=en_US.UTF8       LC_NUMERIC=C             
 [3] LC_TIME=en_US.UTF8        LC_COLLATE=en_US.UTF8    
 [5] LC_MONETARY=en_US.UTF8    LC_MESSAGES=en_US.UTF8   
 [7] LC_PAPER=en_US.UTF8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C      

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets
[7] methods   base     

other attached packages:
 [1] GEOmetadb_1.30.0     RSQLite_1.0.0        DBI_0.3.1           
 [4] GEOquery_2.36.0      Biobase_2.30.0       BiocGenerics_0.16.1
 [7] BiocInstaller_1.20.1 vimcom_1.2-3         setwidth_1.0-4      
[10] colorout_1.1-0      

loaded via a namespace (and not attached):
[1] tools_3.2.2    RCurl_1.95-4.7 bitops_1.0-6   XML_3.98-1.3
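Before running the vignette queries, it may be worth confirming that the unzipped file is intact. The following is a minimal sketch, assuming the database sits in the working directory; the helper name check_db is hypothetical. It asks SQLite to verify the file and lists its tables, and a damaged file would be expected to report problems or fail with the "malformed" error seen earlier.

```r
# Hypothetical helper: returns NULL if RSQLite or the file is unavailable,
# otherwise the SQLite integrity report and the table names.
check_db <- function(db_file) {
  if (!requireNamespace("RSQLite", quietly = TRUE) || !file.exists(db_file)) {
    return(NULL)
  }
  con <- DBI::dbConnect(RSQLite::SQLite(), db_file)
  on.exit(DBI::dbDisconnect(con))
  list(
    integrity = DBI::dbGetQuery(con, "PRAGMA integrity_check"),
    tables    = DBI::dbListTables(con)
  )
}

# Assumed path; adjust it to the 'destdir' used with getSQLiteFile().
str(check_db("GEOmetadb.sqlite"))
```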

cheers,

robert.

10.1 years ago • Robert Castelo ★ 3.4k


Dear Robert,

Thanks a lot for your care and your reply. OK, now I have the raw data and I want to analyze it with linear models for Affymetrix microarrays. Could you please help me do this step by step? I am a new user of R and Bioconductor.

cheers,

Tahani.

10.1 years ago • libya.tahani • 0