Question: Use probesets with highest baseline expression for differential gene expression in LIMMA
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 6.5 years ago:
Dear Ekta,

Jim has already pointed out that you have some incorrect perceptions about what limma does by default.

If you need to keep one probe for each gene symbol after a limma lmFit, and you want to choose the probe with the highest average expression, it is easy to do like this. I will assume that your linear model fit object is called 'fit', and your annotation includes a column called "Symbol" containing the gene symbol.

    o <- order(fit$Amean, decreasing=TRUE)
    dup <- duplicated(fit$genes$Symbol[o])
    fit.unique <- fit[o,][!dup,]

Now your fit object fit.unique has only one row for each symbol.

This sort of filtering has been done in many papers when it is wished to match symbols across platforms, or to do gene set testing.

Best wishes
Gordon

------------------ original message ----------------
[BioC] Use probesets with highest baseline expression for differential gene expression in LIMMA
Ekta Jain Ekta_Jain at jubilantbiosys.com
Thu Feb 23 04:06:09 CET 2012

Hi Jim,

I am using Affymetrix chip data. I need to analyse my dataset for differential gene expression (LIMMA). Each gene can be referenced by multiple probesets, and while performing LIMMA the expression values of these multiple probesets get averaged and this averaged value is assigned to that gene. I need to be able to simply select the probeset with the highest expression value to represent a gene. LIMMA by default averages the probeset values. I am not sure if I need to modify any default settings in LIMMA or use another package.

Thanks
Regards,
Ekta

-----Original Message-----
From: James W. MacDonald [mailto:jmacdon@uw.edu]
Sent: 22 February 2012 19:26
To: Ekta [guest]
Cc: bioconductor at r-project.org; Ekta Jain
Subject: Re: [BioC] Use probesets with highest baseline expression for differential gene expression in LIMMA

Hi Ekta,

On 2/21/2012 10:57 PM, Ekta [guest] wrote:
> Hello All,
> I am relatively new to R and Bioconductor. I would like to know if there is a way to alter LIMMA default options such that the package, instead of averaging signal intensities of probesets, selects the probesets with highest baseline
> expression/signal intensity?

You will have to be more precise than that. What exactly do you mean by 'selects the probesets with highest baseline expression'? Do you just want any probesets where one or more samples has high expression? That doesn't require limma. Or do you want probesets where some of the samples have much higher expression than others?

Best,

Jim

> Any help would be greatly appreciated.
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 2.9.1 (2009-06-26)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_India.1252;LC_CTYPE=English_India.1252;LC_MONETARY=English_India.1252;LC_NUMERIC=C;LC_TIME=English_India.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] limma_2.18.3
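To see what Gordon's order/duplicated recipe does, here is a minimal sketch on toy data using base R only. The matrix `expr`, the data frame `genes`, and the probe and symbol names are hypothetical stand-ins, and `rowMeans(expr)` merely plays the role of `fit$Amean`; with a real limma fit you would subset the fit object itself, as in the reply above.

```r
# Toy log-expression matrix (hypothetical values): 5 probesets x 4 arrays.
expr <- rbind(
  p1 = c(8.0, 8.2, 7.9, 8.1),
  p2 = c(10.0, 10.1, 9.8, 10.3),
  p3 = c(6.5, 6.4, 6.6, 6.3),
  p4 = c(5.0, 5.2, 5.1, 4.9),
  p5 = c(7.0, 7.1, 6.9, 7.2)
)

# Toy annotation (hypothetical): genes A and C are each hit by two probesets.
genes <- data.frame(
  ProbeID = rownames(expr),
  Symbol  = c("A", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# Stand-in for fit$Amean: average log-expression of each probeset.
Amean <- rowMeans(expr)

# Sort probesets from highest to lowest average expression ...
o <- order(Amean, decreasing = TRUE)

# ... then flag every symbol that has already been seen in that order,
# so only the first (highest-expressed) probeset per symbol survives.
dup  <- duplicated(genes$Symbol[o])
keep <- o[!dup]

expr.unique  <- expr[keep, , drop = FALSE]
genes.unique <- genes[keep, ]

print(genes.unique)   # keeps p2 (A), p5 (C), p3 (B)
```

One thing to check on real annotation: `duplicated()` treats repeated `NA` values as duplicates of each other, so all probesets without a symbol would collapse to a single row unless they are handled separately first.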
ying chen wrote, 6.5 years ago:
Hi guys,

When I ran arrayQualityMetrics, I got a strange error message and no result was generated. I did a similar run with another dataset last night and had no problem. I cannot tell what went wrong today. Any suggestion?

Thanks a lot for the help!

Ying

> library("affy")
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'browseVignettes()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation("pkgname")'.

> library("hgu133plus2hsentrezgcdf")
> mydata <- ReadAffy(cdfname="hgu133plus2hsentrezgcdf")
> mydata
AffyBatch object
size of arrays=1164x1164 features (150 kb)
cdf=hgu133plus2hsentrezgcdf (18185 affyids)
number of samples=353
number of genes=18185
annotation=hgu133plus2hsentrezgcdf
notes=
> hist(mydata)
> boxplot(mydata,col="red")
> library("arrayQualityMetrics")
> arrayQualityMetrics(expressionset=mydata,do.logtransform=TRUE)
The directory 'arrayQualityMetrics report for mydata' has been created.
Error in cpSubs(src, dest) :
  'dest' does not exist, and it cannot be created: arrayQualityMetrics report for mydata
In addition: Warning message:
In dir.create(dest) :
  cannot create dir 'arrayQualityMetrics report for mydata', reason 'No such file or directory'
>
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] arrayQualityMetrics_3.10.0     hgu133plus2hsentrezgcdf_13.0.0
[3] affy_1.32.1                    Biobase_2.14.0

loaded via a namespace (and not attached):
 [1] affyio_1.22.0         affyPLM_1.30.0      annotate_1.32.1
 [4] AnnotationDbi_1.16.17 beadarray_2.4.1     BiocInstaller_1.2.1
 [7] Biostrings_2.22.0     Cairo_1.5-1         cluster_1.14.2
[10] DBI_0.2-5             genefilter_1.36.0   grid_2.14.1
[13] Hmisc_3.9-2           hwriter_1.3         IRanges_1.12.6
[16] lattice_0.20-0        latticeExtra_0.6-19 limma_3.10.2
[19] preprocessCore_1.16.0 RColorBrewer_1.0-5  RSQLite_0.11.1
[22] setRNG_2009.11-1      splines_2.14.1      survival_2.36-12
[25] SVGAnnotation_0.9-0   tcltk_2.14.1        tools_2.14.1
[28] vsn_3.22.0            XML_3.9-4           xtable_1.7-0
[31] zlibbioc_1.0.0
Wolfgang Huber replied, 6.5 years ago:

Dear Ying,

Something funny is going on with your filesystem. What is the output of 'getwd()' and 'dir()' after the error is thrown?

Best wishes
Wolfgang

Feb/24/12 8:54 PM, ying chen scripsit:
>> arrayQualityMetrics(expressionset=mydata,do.logtransform=TRUE)
> The directory 'arrayQualityMetrics report for mydata' has been created.
> Error in cpSubs(src, dest) :
>   'dest' does not exist, and it cannot be created: arrayQualityMetrics report for mydata
> In addition: Warning message:
> In dir.create(dest) :
>   cannot create dir 'arrayQualityMetrics report for mydata', reason 'No such file or directory'

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
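The checks Wolfgang asks for can be run right after the error, and a couple of extra base-R tests may help narrow things down. The sketch below is only a diagnostic under assumptions: the variable names (`wd_exists`, `wd_writable`, `probe`, `created`) are made up for illustration, and nothing in the thread confirms what the actual cause was.

```r
# What Wolfgang asked for: where is R working, and what does it see there?
wd <- getwd()          # may be NULL if the working directory is not available
print(wd)
print(dir())

# Extra sanity checks (assumptions, not part of the original advice):
# does the working directory still exist, and can we write to it?
wd_exists   <- !is.null(wd) && file.exists(wd)
wd_writable <- wd_exists && file.access(wd, mode = 2) == 0   # 0 means success
cat("working directory exists:  ", wd_exists, "\n")
cat("working directory writable:", wd_writable, "\n")

# Try to create a directory with the same name as the report, but inside
# tempdir(), to see whether the name itself (spaces and all) is a problem.
probe   <- file.path(tempdir(), "arrayQualityMetrics report for mydata")
created <- dir.create(probe, showWarnings = FALSE) || file.exists(probe)
cat("could create test directory:", created, "\n")
unlink(probe, recursive = TRUE)   # clean up the test directory
```

If the working directory turns out to be missing or read-only, one option is to `setwd()` to a writable location, or to pass an explicit path through the `outdir` argument of `arrayQualityMetrics()`, and then rerun the report.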