Question: Romer and symbols2indices query
9.1 years ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:
Dear Loren,

I don't understand why you would want to read in a gmt file from the Broad Institute rather than use the curated rdata files that we provide for use with romer. The raw gmt files contain a mix of gene symbols of different species and a mix of official and non-official symbols. So one can't expect to match the symbols you get from a raw gmt file to your own data with any reliability. To construct the rdata files, we have carefully converted all gene aliases to official symbols and have mapped mouse to human and human to mouse orthologs.

This is the reason why we don't provide a read.gmt() function in limma, or a pre-made pipeline from the GSEABase read functions. We don't want you to get unreliable results simply because the gene symbols haven't been curated.

Best wishes
Gordon

> Date: Tue, 04 May 2010 07:43:46 -0700
> From: Loren Engrav <engrav at u.washington.edu>
> To: rbioc <bioconductor at stat.math.ethz.ch>
> Subject: Re: [BioC] Romer and symbols2indices query
>
> Thank you, got it
>
> Downloading rdata objects saves reading them into an rdata object, cool
>
> But for interest, in R/GSA there is
> GSA.read.gmt(filename.gmt) to read in a .gmt file
>
> Does limma or romer have an equivalent function?
>
>> From: Matthew Ritchie <mritchie at wehi.edu.au>
>> Date: Tue, 4 May 2010 14:44:23 +1000 (EST)
>> To: Loren Engrav <engrav at u.washington.edu>
>> Cc: rbioc <bioconductor at stat.math.ethz.ch>
>> Subject: Re: [BioC] Romer and symbols2indices query
>>
>> Dear Loren,
>>
>> You can find rdata objects of the Broad's MSigDB gene sets at
>>
>> http://bioinf.wehi.edu.au/software/MSigDB/index.html
>>
>> You are right, the 'symbols' argument of symbols2indices() should
>> contain the gene symbols corresponding to the probes from your
>> microarray data.
>>
>> For example, to use the human C2 collection, download the rdata file, then
>> run the following.
>>
>> load("human_c2.rdata")
>> c2 = symbols2indices(Hs.gmtl.c2, symbols)
>>
>> (this assumes 'symbols' is a vector containing the gene symbols from your
>> array data)
>>
>> Best wishes,
>>
>> Matt
>>
>>> Have done GSEA and GSA for set enrichment and am setting out to try romer
>>> and have a probably "simple" question
>>>
>>> To get the Broad set into a list of indices there is
>>> symbols2indices(gmtl.official, symbols) but
>>>
>>> 1) how do I get the Broad set into gmtl.official? And
>>> 2) is symbols a vector of MY probe sets of interest?
>>>
>>> I checked gmane and found only one comment about romer
>>> Also checked limma reference pdf
>>>
>>> Thank you
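To make the matching step in Matt's example concrete, here is a minimal base-R sketch of the idea: for each gene set, record which positions of the probe symbol vector belong to it. This is an illustration only, not limma's implementation of symbols2indices(); the names gene_sets, symbols and match_sets are made up for the example, and the toy gene sets are not taken from MSigDB.

```r
# Toy stand-ins (hypothetical): a named list of gene sets and the
# gene symbols of the probes, in the same order as the expression rows.
gene_sets <- list(
  SET_A = c("TP53", "MDM2", "CDKN1A"),
  SET_B = c("ERBB2", "EGFR", "KRAS")
)
symbols <- c("EGFR", "TP53", "GAPDH", "MDM2", "ERBB2")

# For each set, find the row positions whose symbol is in that set.
match_sets <- function(sets, symbols) {
  idx <- lapply(sets, function(set) which(symbols %in% set))
  idx[lengths(idx) > 0]  # drop sets with no matching probe
}

indices <- match_sets(gene_sets, symbols)
print(indices)
```

Because the matching is an exact string comparison, a probe labelled with an alias or a symbol from another species simply fails to match, which is the reliability concern raised in the post above.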
Answer: Romer and symbols2indices query
9.1 years ago, Loren Engrav wrote:
Thank you for the response.

Well, maybe I don't, and maybe I shouldn't. My thought was that tomorrow or the day after there will be a new version of the .gmt file, and it would be useful to be able to rerun things quickly. But maybe that is faulty logic; maybe the .gmt files do not change that often. I also thought the c2 set was curated. And then there is simple curiosity. I am also running GSEA, GSA, and GSEA in MEV, so it seemed best to keep the files similar.

However, I have now run romer and romer2 at weeks 1, 2, 3, 12 and 20 as below, but have not perused the results. I have only ~2000 genes of interest, so romer does not take very long and I can easily run it again with the files you mention.

But I have searched for the files you mention in help(romer) and the 3 pdfs and have missed them. Where may I find them? Or are they the files Matthew mentioned below at <http://bioinf.wehi.edu.au/software/MSigDB/index.html>?

I also have the other two questions in the "Romer warning serious? and nrot=9999?" thread.

Thank you again for the response

=====================

    # Ten groups of three arrays each
    romerDesign <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,9,9,10,10,10)))
    colnames(romerDesign) <- c("DW1", "YW1", "DW2", "YW2", "DW3", "YW3", "DW12", "YW12", "DW20", "YW20")
    # x stands for the week (1, 2, 3, 12 or 20)
    romerWkxcontrast.matrix <- makeContrasts(YWx-DWx, levels=romerDesign)
    romerLR <- read.delim(file="romer1842LR.txt", header=FALSE, sep="\t")
    romerLRmatrix <- as.matrix(romerLR)
    romerSymbols <- GSA1842symbolsvector
    # Read the Broad gmt file with GSEABase and convert it to a named list of gene sets
    Broad_c2.all.v2.5.symbols.gmt <- getGmt("c2.all.v2.5.symbols.gmt", collectionType=BroadCollection(category="c2"), geneIdType=SymbolIdentifier())
    Broad_c2.all.v2.5.symbols.gmtList <- geneIds(Broad_c2.all.v2.5.symbols.gmt)
    names(Broad_c2.all.v2.5.symbols.gmtList) <- names(Broad_c2.all.v2.5.symbols.gmt)
    Broad_c2.all.v2.5.symbols.gmtIndices = symbols2indices(Broad_c2.all.v2.5.symbols.gmtList, romerSymbols)
    romerResultWkx <- romer(Broad_c2.all.v2.5.symbols.gmtIndices, romerLRmatrix, romerDesign, contrast=romerWkxcontrast.matrix, array.weights=NULL, block=NULL, correlation, floor=FALSE, nrot=1000)
    romer2ResultWkx <- romer2(Broad_c2.all.v2.5.symbols.gmtIndices, romerLRmatrix, romerDesign, contrast=romerWkxcontrast.matrix, array.weights=NULL, block=NULL, correlation, nrot=1000)

> From: Gordon K Smyth <smyth at wehi.edu.au>
> Date: Thu, 6 May 2010 09:26:50 +1000 (AUS Eastern Standard Time)
> To: Loren Engrav <engrav at u.washington.edu>
> Cc: Yifang Hu <hu at wehi.edu.au>, rbioc <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Romer and symbols2indices query
>
> Dear Loren,
>
> I don't understand why you would want to read in a gmt file from the Broad
> Institute rather than use the curated rdata files that we provide for use
> with romer. The raw gmt files contain a mix of gene symbols of different
> species and a mix of official and non-official symbols. So one can't
> expect to match the symbols you get from a raw gmt file to your own data
> with any reliability. To construct the rdata files, we have carefully
> converted all gene aliases to official symbols and have mapped mouse to
> human and human to mouse orthologs.
>
> This is the reason why we don't provide a read.gmt() function in limma, or
> a pre-made pipeline from the GSEABase read functions. We don't want you
> to get unreliable results simply because the gene symbols haven't been
> curated.
>
> Best wishes
> Gordon
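As a side note on the design and contrast in the code above, the following base-R sketch shows what a contrast such as YW1-DW1 amounts to numerically. It assumes, as in the post, ten groups of three arrays each in that column order; the object names group, design and week1 are made up for the example. With limma loaded, makeContrasts(YW1-DW1, levels=design) would be the usual way to build such a contrast.

```r
# Ten groups with three arrays each, in the same order as in the post.
group <- factor(rep(1:10, each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- c("DW1", "YW1", "DW2", "YW2", "DW3", "YW3",
                      "DW12", "YW12", "DW20", "YW20")

# Hand-built contrast for week 1: +1 for YW1, -1 for DW1, 0 elsewhere.
week1 <- setNames(numeric(ncol(design)), colnames(design))
week1[c("YW1", "DW1")] <- c(1, -1)
print(week1)
```

Writing the contrast out once per week (YW1-DW1, YW2-DW2, and so on) also avoids the placeholder name YWx-DWx, which is not a column of the design matrix.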
Answer: Romer and symbols2indices query
9.0 years ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:
Dear Loren,

Don't forget to run

    MYsymbolsOfficial <- alias2SymbolTable(MYsymbols, species="Hs")

(choose the appropriate species) before running

    X <- symbols2indices(ListOfGeneSets, MYsymbolsOfficial)

otherwise you may miss many matches. Matching by gene symbol is unreliable unless you convert everything to current official symbols.

Best wishes
Gordon

> Date: Wed, 05 May 2010 08:15:04 -0700
> From: Loren Engrav <engrav at u.washington.edu>
> To: rbioc <bioconductor at stat.math.ethz.ch>
> Subject: Re: [BioC] Romer and symbols2indices query
>
> Bingo, thank you
> and romer ran
>
> These missing little tidbits can be brutal
>
>> From: Martin Morgan <mtmorgan at fhcrc.org>
>> Date: Wed, 05 May 2010 05:11:02 -0700
>> To: Loren Engrav <engrav at u.washington.edu>
>> Cc: rbioc <bioconductor at stat.math.ethz.ch>
>> Subject: Re: [BioC] Romer and symbols2indices query
>>
>> On 05/04/2010 07:57 PM, Loren Engrav wrote:
>>> Am back
>>>
>>> So I have romer and GSEABase running via previous help thank you, but while
>>> running I explore GSEABase
>>>
>>> And I have a lesser question for interest
>>>
>>> In GSEABase I do
>>> gmtObject <- getGmt("c2all.v2.5.symbols.gmt",
>>> collectionType=BroadCollection(category="c2"), geneIdType=SymbolIdentifier())
>>
>> or maybe getBroadSets ?
>>
>>> which finishes without error
>>> Then
>>> class(gmtObject) is GeneSetCollection
>>>
>>> How do I convert gmtObject to a list of gene sets as required in romer when
>>> using
>>
>> gmtl <- geneIds(gmtObject)
>> names(gmtl) <- names(gmtObject)
>>
>> ?
>>
>> Martin
>>
>>> X <- symbols2indices (ListOfGeneSets, MYsymbols)
>>>
>>> From: Vincent Carey <stvjc at channing.harvard.edu>
>>> Date: Tue, 4 May 2010 11:40:39 -0400
>>> To: Loren Engrav <engrav at u.washington.edu>
>>> Cc: rbioc <bioconductor at stat.math.ethz.ch>
>>> Subject: Re: [BioC] Romer and symbols2indices query
>>>
>>> Very briefly, the GSEABase package has relevant utilities for gmt file
>>> import/export and may be worth considering for these tasks.
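The effect Gordon describes can be seen in a small base-R sketch: converting aliases to official symbols before matching recovers genes that an exact string match would otherwise miss. The three-entry alias table, the helper to_official and the toy vectors are all made up for illustration; this is not how limma's alias2SymbolTable() works internally, and its handling of symbols it cannot map may differ from the pass-through used here.

```r
# Toy alias table (hypothetical and tiny; real mappings come from
# annotation packages, not from a hand-written vector).
alias_to_official <- c(p53 = "TP53", HER2 = "ERBB2", P21 = "CDKN1A")

# Replace known aliases by their official symbol; leave the rest unchanged.
to_official <- function(symbols, table) {
  hit <- symbols %in% names(table)
  symbols[hit] <- unname(table[symbols[hit]])
  symbols
}

my_symbols <- c("p53", "EGFR", "HER2", "GAPDH")
gene_set <- c("TP53", "ERBB2", "EGFR")

before <- which(my_symbols %in% gene_set)
after <- which(to_official(my_symbols, alias_to_official) %in% gene_set)
cat("matches before conversion:", length(before), "\n")
cat("matches after conversion: ", length(after), "\n")
```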
Thank you very much, this is interesting.

I did as you suggest and 7 of 1842 symbols were "incorrect". I will fix and rerun.

Thank you again, am grateful

> From: Gordon K Smyth <smyth at wehi.edu.au>
> Date: Fri, 7 May 2010 10:35:55 +1000 (AUS Eastern Standard Time)
> To: Loren Engrav <engrav at u.washington.edu>
> Cc: Yifang Hu <hu at wehi.edu.au>, rbioc <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Romer and symbols2indices query
>
> Dear Loren,
>
> Don't forget to run
>
> MYsymbolsOfficial <- alias2SymbolTable(MYsymbols, species="Hs")
>
> (choose the appropriate species) before running
>
> X <- symbols2indices(ListOfGeneSets, MYsymbolsOfficial)
>
> otherwise you may miss many matches. Matching by gene symbol is
> unreliable unless you convert everything to current official symbols.
>
> Best wishes
> Gordon
Written 9.1 years ago by Loren Engrav
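A check like the one reported in this comment (some symbols turning out not to be official) could be done along the following lines once the converted vector is available. This is a sketch on invented data: original and official are toy vectors standing in for MYsymbols and the converted result, and it assumes that a symbol with no official counterpart is marked NA in the converted vector.

```r
# Toy data (hypothetical): symbols as supplied, and the same symbols
# after conversion, with NA where no official symbol was found.
original <- c("TP53", "p53", "EGFR", "HER2", "NOTAGENE")
official <- c("TP53", "TP53", "EGFR", "ERBB2", NA)

# A symbol needs attention if it was changed or could not be mapped.
changed <- is.na(official) | official != original
report <- data.frame(original, official, stringsAsFactors = FALSE)[changed, ]
n_changed <- sum(changed)

print(report)
cat(n_changed, "of", length(original), "symbols were not official\n")
```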