Illumina beadarray GEO files
nac ▴ 280
@nac-4545
Hi,

I want to have a look at this experiment, which is deposited in GEO under the reference GSM 290549. This experiment contains 6 files:

GSM296418.csv.gz    293.0 Kb   CSV
GSM296418.locs.gz   7.2 Mb     LOCS
GSM296418.tif.gz    51.7 Mb    TIFF
GSM296418.txt.gz    11.4 Mb    TXT
GSM296418.xml.gz    665 b      XML

I am trying to find a way to reanalyse this, but am struggling to find an appropriate way. I just want to know whether the genes in this array are below or above the detection level threshold used in beadarray. Has anybody got any advice about a way to analyse microarray data deposited in GEO to get this kind of information?

Many thanks,
Nat
Microarray beadarray • 1.2k views
James F. Reid ▴ 610
@james-f-reid-3148
Hi Nathalie,

On 06/15/2011 02:13 PM, Nathalie Conte wrote:
> I just want to know whether the genes in this array are below or above the detection level threshold used in beadarray.
> Has anybody got any advice about a way to analyse microarray data deposited in GEO to get this kind of information?

I am not sure what you mean by the 'detection level threshold used in beadarray', but you can use GEOquery, as Sean suggested in a previous mail/thread.

Judging from a plot of the density of the expression values of this array, I would say that the large peak represents the 'non-expressed' genes, or genes expressed at very low levels, which could be similar to the detection level threshold you mention. Running the following code, I get that roughly 30% of the 20K genes are below this threshold (5.36 on the log2 scale, with signals ranging from 0 to 16).

    require("GEOquery") || stop("Could not load package 'GEOquery'.")

    ## download single array GSM290549
    gsm <- getGEO("GSM290549")

    ## extract 'Illumina average value' signal data
    head(Table(gsm), n=3)
    ##      ID_REF     VALUE
    ##1 ILMN_10000  105.0698
    ##2 ILMN_10001  2355.704
    ##3 ILMN_10002 -9.846933
    x <- as.numeric(Table(gsm)[, 'VALUE'])
    range(x)
    ##[1]   -35.65039 53405.58000

    ## transform data according to authors in original study
    Meta(gsm)$data_processing
    ##[1] "Data were extracted with Illumina BeadStudio software using
    ##background subtraction and cubic spline normalization. Data were then
    ##adjusted by shifting the absolute minimum value for each array to be
    ##equal to 1; and then log2 transformed."
    y <- log2(x + abs(min(x)) + 1)
    range(y)
    ##[1]  0.00000 16.25923

    ## plot kernel density of signals
    yDens <- density(y)
    plot(yDens, main=Meta(gsm)$geo_accession)
    ## calculate the density peak value
    densPeak <- yDens$x[which.max(yDens$y)]
    ## draw it
    abline(v=densPeak, lwd=2, lty=2)
    densPeak
    ##[1] 5.367655
    2^(densPeak)
    ##[1] 41.28812
    sum(y < densPeak)
    ##[1] 5821
    sum(y > densPeak)
    ##[1] 14768

HTH.
J.
Hi Nathalie,

There is a calculateDetection function in beadarray that will compute the detection scores commonly used for thresholding. However, this relies on the negative controls being identifiable and present in the data, which is not always the case for GEO-submitted data. The approach that James suggests may be the only option.

Best wishes,
Mark

On Wed, Jun 15, 2011 at 3:09 PM, James F. Reid wrote:
> By plotting the density of the expression values of this array I would say
> that the large peak represents the 'non-expressed' genes or genes expressed
> at very low levels which could be similar to the detection level threshold
> you mention. [...]
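To illustrate the kind of detection score discussed above, here is a minimal base-R sketch of an empirical detection p-value: for each probe, the proportion of negative-control intensities that are at least as large as the probe's signal. This only sketches the general idea and is not the beadarray implementation. The function name detection_pvalue, the intensity values, and the 0.05 cutoff are all assumptions made up for illustration; none of them come from the GEO sample.

```r
## Sketch: empirical detection p-values from negative-control intensities.
## All numbers below are made-up illustration data, not values from GEO.

## Hypothetical helper: for each probe signal, return the fraction of
## negative controls whose intensity is >= that signal.
detection_pvalue <- function(signal, negatives) {
  vapply(signal, function(s) mean(negatives >= s), numeric(1))
}

## hypothetical log2 intensities of 10 negative-control probes
negatives <- c(3.1, 4.0, 4.4, 4.9, 5.2, 5.3, 5.8, 6.0, 6.5, 7.0)

## hypothetical log2 intensities of three regular probes
signal <- c(probeA = 4.5, probeB = 6.2, probeC = 9.8)

p <- detection_pvalue(signal, negatives)

## call a probe 'detected' when few negatives reach its signal;
## the 0.05 cutoff is an assumption, not a value taken from this thread
detected <- p < 0.05

print(p)
print(detected)
```

With only a handful of negative controls the p-values are very coarse, which is one reason this approach depends on the control probes actually being present in the deposited data.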