Question: Unexpected results of differential expression analysis
6.3 years ago, Guest User wrote:
Hello,

I am analysing the GEO dataset GSE19736 using SAM (Significance Analysis of Microarrays), specifically the R package samr, but I am not getting the results I expected.

According to the published study, which also uses this tool, there should be 1028 differentially expressed genes (554 up-regulated and 474 down-regulated). When I run the analysis on the data I get many more differentially expressed genes. I don't know what I might be doing wrong or where the difference lies.

I am using the following code:

#Extracting files
> cel <- list.celfiles()
> abatch.raw <- read.celfiles(cel)

#Processing
> geneSummaries <- rma(abatch.raw)

#Extracting expression matrix
> expressionmatrix <- exprs(geneSummaries)

#SAM
> samrobj <- samr(data, resp.type="Quantitative", nperms=50, center.arrays=TRUE, assay.type="array")
> delta=2
> samr.plot(samrobj,delta)
> delta.table <- samr.compute.delta.table(samrobj)
> siggenes.table<-samr.compute.siggenes.table(samrobj,2.5, data, delta.table, min.foldchange=1.5, compute.localfdr=TRUE)
> samr.pvalues.from.perms(samrobj$tt, samrobj$ttstar)

If I understood correctly, the number of differentially expressed genes can be read off this way for the up-regulated genes:
> siggenes.table$ngenes.up

and this way for the down-regulated genes:
> siggenes.table$ngenes.lo

I find 1598 up-regulated genes and 1721 down-regulated genes, and the number varies greatly depending on the value I give to delta.

I also tried assessing differential expression with limma; in that case the number of differentially expressed genes was half the expected number.

Does anyone have any clue?
Thanks!
-- output of sessionInfo():

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8
 [4] LC_COLLATE=es_ES.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
 [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C

attached base packages:
 [1] compiler splines parallel stats graphics grDevices utils datasets methods
[10] base

other attached packages:
 [1] limma_3.14.4 pd.hugene.1.0.st.v1_3.8.0 GOstats_2.26.0
 [4] Category_2.26.0 GSEABase_1.22.0 graph_1.38.2
 [7] annaffy_1.32.0 KEGG.db_2.9.1 GO.db_2.9.0
[10] preprocessCore_1.20.0 samr_2.0 matrixStats_0.8.1
[13] impute_1.34.0 pdInfoBuilder_1.22.0 affxparser_1.30.2
[16] pd.huex.1.0.st.v2_3.8.0 RSQLite_0.11.4 oligo_1.22.0
[19] oligoClasses_1.20.0 nnet_7.3-4 mgcv_1.7-18
[22] Matrix_1.0-6 lattice_0.20-6 KernSmooth_2.23-8
[25] gcrma_2.30.0 affy_1.36.1 foreign_0.8-50
[28] DBI_0.2-7 cluster_1.14.2 survival_2.36-14
[31] rpart_3.1-54 BiocInstaller_1.8.3 annotate_1.38.0
[34] AnnotationDbi_1.22.6 Biobase_2.18.0 BiocGenerics_0.6.0

loaded via a namespace (and not attached):
 [1] affyio_1.26.0 AnnotationForge_1.2.1 Biostrings_2.26.3 bit_1.1-10
 [5] codetools_0.2-8 ff_2.2-11 foreach_1.4.1 genefilter_1.42.0
 [9] GenomicRanges_1.10.7 grid_2.15.1 IRanges_1.16.6 iterators_1.0.6
[13] nlme_3.1-104 RBGL_1.36.2 R.methodsS3_1.4.2 rstudio_0.97.246
[17] stats4_2.15.1 tools_2.15.1 XML_3.96-1.1 xtable_1.7-1
[21] zlibbioc_1.4.0

--
Sent via the guest posting facility at bioconductor.org.
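Two details of the samr calls above may be worth checking. As far as the samr documentation describes it, samr() takes a list (components x, y, geneid, genenames, logged2) as its first argument rather than a bare expression matrix, and the code above plots with delta=2 but builds the gene table with 2.5. The following is only a minimal sketch on simulated data, not GSE19736. The two-class unpaired design, the labels in y, the name sam_data and the value of delta are all assumptions made for illustration; the post does not state the study design.

```r
# Simulated stand-in for the RMA expression matrix (NOT the GSE19736 data).
set.seed(100)
n_genes <- 1000
n_samples <- 20
expressionmatrix <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
rownames(expressionmatrix) <- paste0("probe", seq_len(n_genes))

# Hypothetical two-class labels (1 = control, 2 = treated); an assumption.
y <- rep(c(1, 2), each = n_samples / 2)

# Give the first 100 simulated genes a true shift in group 2.
expressionmatrix[1:100, y == 2] <- expressionmatrix[1:100, y == 2] + 2

# samr() expects a list, not a bare matrix.
sam_data <- list(
  x = expressionmatrix,
  y = y,
  geneid = rownames(expressionmatrix),
  genenames = rownames(expressionmatrix),
  logged2 = TRUE  # RMA output is on the log2 scale
)

# One delta, reused everywhere so that plot and table describe the same cut.
delta <- 0.5

# Only run the SAM steps if the samr package is installed.
if (requireNamespace("samr", quietly = TRUE)) {
  library(samr)
  samrobj <- samr(sam_data, resp.type = "Two class unpaired", nperms = 50)
  delta.table <- samr.compute.delta.table(samrobj)
  siggenes.table <- samr.compute.siggenes.table(
    samrobj, delta, sam_data, delta.table, min.foldchange = 1.5
  )
  cat("up:", siggenes.table$ngenes.up,
      "down:", siggenes.table$ngenes.lo, "\n")
}
```

With the real data, x would be the matrix returned by exprs(geneSummaries) and y the phenotype coding that matches the chosen resp.type.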
Answer: Unexpected results of differential expression analysis
6.3 years ago, Wolfgang Huber (EMBL European Molecular Biology Laboratory) wrote:
Dear Laura,

did you already contact the authors of that paper for a transcript of their analysis / the exact parameters, software versions, filters, etc. used?

Best wishes
Wolfgang

On 22 Jun 2013, at 13:06, Laura [guest] <guest at bioconductor.org> wrote:

> [...]
> According to the published study, which also uses this tool, there should be 1028 differentially expressed genes (554 up-regulated and 474 down-regulated). When I run the analysis on the data I get a lot more of genes that are differentially expressed. I don't know what I might be doing wrong or where the difference lays.
> [...]
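Why the exact parameters and filters matter can be seen in a small simulation. The sketch below uses base R only, on simulated data (not GSE19736), and it uses a plain Welch t-statistic cutoff as a stand-in for SAM's delta. It illustrates how sensitive the gene count is to the threshold and to a fold-change filter; it is not a reimplementation of samr. All names and numbers in it are hypothetical.

```r
# Simulated data only; none of these numbers come from GSE19736.
set.seed(42)
n_genes <- 2000
n_per_group <- 6
group <- rep(c("control", "treated"), each = n_per_group)

# log2-scale expression; the first 200 genes get a true shift of 1 (2-fold).
x <- matrix(rnorm(n_genes * 2 * n_per_group, mean = 8, sd = 0.5),
            nrow = n_genes)
x[1:200, group == "treated"] <- x[1:200, group == "treated"] + 1

# Per-gene Welch t-statistic and log2 fold change.
m1 <- rowMeans(x[, group == "treated"])
m0 <- rowMeans(x[, group == "control"])
v1 <- apply(x[, group == "treated"], 1, var)
v0 <- apply(x[, group == "control"], 1, var)
t_stat <- (m1 - m0) / sqrt(v1 / n_per_group + v0 / n_per_group)
log2_fc <- m1 - m0

# Count genes passing both the statistic cutoff and the fold-change filter.
count_calls <- function(t_cut, min_fc) {
  sum(abs(t_stat) >= t_cut & abs(log2_fc) >= log2(min_fc))
}

cutoffs <- c(2, 2.5, 3, 4)
calls <- data.frame(
  t_cutoff = cutoffs,
  no_fc_filter = sapply(cutoffs, count_calls, min_fc = 1),
  fc_1.5 = sapply(cutoffs, count_calls, min_fc = 1.5)
)
print(calls)
```

The count can only fall as the cutoff rises or the fold-change filter is added, so two analyses of the same data can report quite different totals unless the threshold, the filter and the preprocessing are all matched.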