Hi,
I want to have a look at this experiment, which is deposited in GEO
under the reference GSM290549. This experiment contains 6 files:
GSM296418.csv.gz    293.0 Kb   CSV
GSM296418.locs.gz   7.2 Mb     LOCS
GSM296418.tif.gz    51.7 Mb    TIFF
GSM296418.txt.gz    11.4 Mb    TXT
GSM296418.xml.gz    665 b      XML
I am trying to find a way to reanalyse this, but am struggling to find
an appropriate way. I just want to know whether the genes in this array
are below or above the detection level threshold used in beadarray.
Has anybody got any advice about a way to analyse microarray data
deposited in GEO to get this kind of information?
Many thanks,
Nat
Hi Nathalie,
On 06/15/2011 02:13 PM, Nathalie Conte wrote:
> I am trying to find a way to reanalyse this, but am struggling to find
> an appropriate way. I just want to know whether the genes in this array
> are below or above the detection level threshold used in beadarray.
> Has anybody got any advice about a way to analyse microarray data
> deposited in GEO to get this kind of information?
> Many thanks,
> Nat
I am not sure what you mean by the 'detection level threshold used in
beadarray', but you can use GEOquery, as Sean suggested in a previous
mail/thread.
From a density plot of the expression values of this array, I would
say that the large peak represents the 'non-expressed' genes, or genes
expressed at very low levels, which could be similar to the detection
level threshold you mention. Running the following code, I get that
roughly 30% of the 20K genes (5821 of 20589) are below this threshold
(5.37 on the log2 scale, with signals ranging from 0 to 16).
library("GEOquery")  ## stops with an error if the package is not installed
## download single array GSM290549
gsm <- getGEO("GSM290549")
## extract 'Illumina average value' signal data
head(Table(gsm), n=3)
## ID_REF VALUE
##1 ILMN_10000 105.0698
##2 ILMN_10001 2355.704
##3 ILMN_10002 -9.846933
x <- as.numeric(Table(gsm)[, 'VALUE'])
range(x)
##[1] -35.65039 53405.58000
## transform data according to authors in original study
Meta(gsm)$data_processing
##[1] "Data were extracted with Illumina BeadStudio software using
##background subtraction and cubic spline normalization. Data were then
##adjusted by shifting the absolute minimum value for each array to be
##equal to 1; and then log2 transformed."
y <- log2(x + abs(min(x)) + 1)
range(y)
##[1] 0.00000 16.25923
## plot kernel density of signals
yDens <- density(y)
plot(yDens, main=Meta(gsm)$geo_accession)
## calculate the density peak value
densPeak <- yDens$x[which.max(yDens$y)]
## draw it
abline(v=densPeak, lwd=2, lty=2)
densPeak
##[1] 5.367655
2^(densPeak)
##[1] 41.28812
sum(y < densPeak)
##[1] 5821
sum(y > densPeak)
##[1] 14768
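
If per-probe calls are wanted rather than just the two counts, the same
peak-based cutoff can be applied probe by probe. Below is a minimal
sketch on simulated log2 signals, so it runs without downloading
anything. The probe IDs, the mixture parameters and the column names of
'calls' are made up for illustration; with the real array, y and the
ID_REF column of Table(gsm) would take their place.

```r
## Simulated log2 signals (hypothetical, not the GEO data):
## a tight background peak plus a broader set of expressed genes.
set.seed(42)
ids <- sprintf("PROBE_%05d", 1:2000)           # made-up probe IDs
y   <- c(rnorm(1200, mean = 5.4, sd = 0.3),    # 'non-expressed' bulk
         rnorm(800,  mean = 9.0, sd = 1.5))    # expressed genes

## Same peak-finding step as in the code above:
## location of the maximum of the kernel density estimate.
yDens    <- density(y)
densPeak <- yDens$x[which.max(yDens$y)]

## One call per probe, relative to the density peak.
calls <- data.frame(ID         = ids,
                    log2signal = y,
                    call       = ifelse(y > densPeak, "above", "below"))

table(calls$call)     # counts of probes above/below the peak
mean(y < densPeak)    # fraction of probes below the peak
```

Note that the peak sits in the middle of the background distribution,
so some background-level probes will always be called "above"; this is
a rough cutoff rather than a formal detection test.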
HTH.
J.
Hi Nathalie,
There is a calculateDetection function in beadarray that will compute
the detection scores commonly used for thresholding. However, this
relies on the negative controls being identifiable and present in the
data, which is not always the case for GEO-submitted data. The
approach that James suggests may be the only option.
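
To illustrate why the negative controls matter, here is a small base R
sketch of one common way to define a detection score: the fraction of
negative-control probes whose intensity is at least as high as that of
the probe of interest. This is only a sketch on simulated numbers, not
the calculateDetection implementation; the intensities, the probe names
and the 0.05 cutoff are assumptions made for the example.

```r
## Simulated intensities (hypothetical, not taken from GEO).
set.seed(1)
negCtrl <- rnorm(500, mean = 100, sd = 15)              # negative-control probes
probes  <- c(low = 95, borderline = 130, high = 400)    # probes to be scored

## Empirical detection p-value: proportion of negative controls with
## intensity greater than or equal to the probe's intensity.
detectionP <- sapply(probes, function(v) mean(negCtrl >= v))

## Call a probe 'detected' if its p-value is below an assumed cutoff.
detected <- detectionP < 0.05

detectionP
detected
```

Without the negative-control intensities there is nothing to compare
each probe against, which is why this kind of score cannot be computed
from a table that contains only ID_REF and VALUE.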
Best wishes,
Mark
On Wed, Jun 15, 2011 at 3:09 PM, James F. Reid
<james.reid at ifom-ieo-campus.it> wrote:
> I am not sure what you mean by the 'detection level threshold used in
> beadarray', but you can use GEOquery, as Sean suggested in a previous
> mail/thread.
> From a density plot of the expression values of this array, I would say
> that the large peak represents the 'non-expressed' genes, or genes
> expressed at very low levels, which could be similar to the detection
> level threshold you mention. Running the following code, I get that
> roughly 30% of the 20K genes are below this threshold (5.37 on the log2
> scale, with signals ranging from 0 to 16).
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>