Another question Re: question on hgu95 metadata and ontoTools
@elisabetta-manduchi-575
Last seen 9.7 years ago
Hi,

I'm writing with a question that follows up on a previous correspondence I had on this mailing list (copied below).

Essentially, I'm trying to use ontoTools to get a mapping between the union of the probe sets on the HGU95 av2, b, c, d, and e chips and GO biological process terms. To this end, I first created an environment named hgu95GO, defined as shown below using parent.env, following a suggestion by R. Gentleman:

> hgu95GO<-hgu95av2GO
> parent.env(hgu95GO)<-hgu95bGO
> parent.env(hgu95bGO)<-hgu95cGO
> parent.env(hgu95cGO)<-hgu95dGO
> parent.env(hgu95dGO)<-hgu95eGO

For this we have

> length(ls(env=hgu95GO))
[1] 12625

which is the same as the length for hgu95av2GO, rather than the length of the union of the 5 probe set collections from av2 to e (which is 62906). However, if I look up the value of a key corresponding to a probe set from b, c, ..., it does give me something by looking at the parent. I'm not too familiar with environments in R, but I guess this is the expected behavior.

Now, I've built a mapping:

ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)

where tms are the nodes of a GO biological process graph that I've built and obs is ls(env=hgu95GO). I'm concerned, though, that since the latter seems to be equal to ls(env=hgu95av2GO) only, I'm not getting the mapping I wanted from the union of all *5* collections of probe sets on the 5 Affy chips to my GO BP terms. Indeed, if I run:

> ooc.Hgu95.GOBP<-makeOOC(goBPonto, ooMapHgu952GOBP)
> print(OOmap(ooc.Hgu95.GOBP))
named sparse matrix of dim[1] 12625 7807

the first dimension is 12625, not 62906. Am I correct in my interpretation that I'm not getting what I was seeking? If so, how can I get around it? That is, what would be the quickest way to build a mapping from the union of the probe sets on the 5 Affy chips to my GO BP ontology?

Thanks for any help you might give me,
Elisabetta

On Wed, 17 Dec 2003, Elisabetta Manduchi wrote:

>
> Robert,
> thank you very much for such a prompt reply.
> A question, with your notation below, shouldn't I do my gets on E5 rather
> than on E1, if E1 is the ancestor? Was that just a typo or am I
> misunderstanding?
> In other words to deal with my case, I guess I could do the following
> (calling hgu95GO my new combined environment):
>
> hgu95GO<-hgu95av2GO
> parent.env(hgu95GO)<-hgu95bGO
> parent.env(hgu95bGO)<-hgu95cGO
> parent.env(hgu95cGO)<-hgu95dGO
> parent.env(hgu95dGO)<-hgu95eGO
>
> and then do my gets on hgu95GO, right?
> Elisabetta
>
> On Wed, 17 Dec 2003, Robert Gentleman wrote:
>
> > On Wed, Dec 17, 2003 at 03:47:54PM -0500, Elisabetta Manduchi wrote:
> > >
> > > Hi,
> > > I would like to create an environment that combines the hgu95av2GO,
> > > hgu95bGO, ...hgu95eGO into just one environment, that I can subsequently
> > > use as the otkvEnv argument in the ontoTools function otkvEnv2namedSparse.
> > > Is there a quick and simple way to do this in R, without having to define
> > > a new hash and the key-value mapping block by block (according to which
> > > environment the key belongs to)? In other words, I'm asking if there is a
> > > one-stop way in R to "union" the above 5 environments.
> >
> > Sort of, in R there are no hash tables but environments are close so
> > we used them. They have a rather unique aspect which is the parent
> > environment. So that if a value is not found in the first environment
> > the parent is searched. So to solve your problem you could do
> > something like
> >
> > parent.env(E2) <- E1
> > parent.env(E3) <- E2
> > ...
> >
> > then do your get's on E1 (with inherits=TRUE, which is the default
> > in get) and you should be almost set. The one issue that is not
> > easily solved is that if inherits is TRUE then you search up beyond
> > the last of your environments (your work space and then the search
> > list). But that should not be a problem in your case....
> >
> > Robert
> >
> > > Thanks,
> > > Elisabetta
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor@stat.math.ethz.ch
> > > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
> > >
> >
>

--
Elisabetta Manduchi
Computational Biology and Informatics Laboratory
Center for Bioinformatics
University of Pennsylvania
1428 Blockley Hall
423 Guardian Drive
Philadelphia, PA 19104-6021
phone: 215-573-4408
fax: 215 573-3111
email: manduchi@pcbi.upenn.edu
web: http://www.cbil.upenn.edu/~manduchi
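The behavior described in this post (ls() reporting only 12625 keys while lookups for keys from the other chips still succeed) can be reproduced with a small sketch. The two environments, the probe set names, and the GO identifiers below are made-up stand-ins, not the real hgu95 annotation data.

```r
# Toy stand-ins for two chip annotation environments (hypothetical keys and GO IDs).
envA <- new.env(parent = emptyenv())
envB <- new.env(parent = emptyenv())
assign("1000_at", c("GO:0000001", "GO:0000002"), envir = envA)
assign("41880_at", "GO:0000003", envir = envB)

# Chain the environments: lookups that miss in envA continue in envB.
parent.env(envA) <- envB

# ls() reports only the names stored in envA's own frame ...
own_keys <- ls(envir = envA)

# ... while get() with inherits = TRUE (its default) follows the parent chain.
inherited_value <- get("41880_at", envir = envA)

# exists() with inherits = FALSE confirms the key is not stored in envA itself.
stored_locally <- exists("41880_at", envir = envA, inherits = FALSE)
```

Here own_keys contains only the key assigned directly into envA, which mirrors why any code that enumerates keys with ls() sees the first chip only.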
GO probe affy graph ontoTools PROcess • 1.2k views
>
> Hi,
> I'm writing with a question that follows up on a previous correspondence
> I had within this mailing list (copied below).
> Essentially I'm trying to use ontoTools to get a mapping between the union
> of the probe sets on the HGU95 av2, b, c, d, e and GO biological process
> terms.
> To this end, I've first created an environment, named hgu95GO, which was
> defined as described below, using parent.env, following a suggestion by R.
> Gentleman:
>
> > hgu95GO<-hgu95av2GO
> > parent.env(hgu95GO)<-hgu95bGO
> > parent.env(hgu95bGO)<-hgu95cGO
> > parent.env(hgu95cGO)<-hgu95dGO
> > parent.env(hgu95dGO)<-hgu95eGO
>
> For this we have
>
> > length(ls(env=hgu95GO))
> [1] 12625
>
> which is the same as the length for hgu95av2GO, rather than the length of
> the union of the 5 probe set collections from av2 to e (which is 62906).
> However if I look for a value of a key corresponding to a probe set from
> b, c,... indeed it gives me something by looking at the parent. I'm not
> too familiar with environments in R, but I guess this is the expected
> behavior. Now, I've built a mapping:
>
> ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)

the otkv function does use ls() and will be stymied by the strange behavior of the environment union created above.

i note in the doc for parent.env that this function is regarded as dangerous and may be deprecated.

i believe that this "one time" operation can be done, albeit inelegantly, with a manual approach:

1) use the contents function in annotate to convert environments to lists

2) use listUnion to create the union of the lists. i have written a function to do this; it basically concatenates all uniquely named elements of two lists and forms unions of elements that share names in the two lists

listUnion <- function (x, y)
{
    # without names there is nothing to match on, so just concatenate
    if (is.null(names(x)) || is.null(names(y))) {
        warning("unnamed lists imply union is concatenation")
        return(unlist(list(x, y), recursive = FALSE))
    }
    # names present in both lists
    comm <- intersect(names(x), names(y))
    if (length(comm) == 0)
        return(unlist(list(x, y), recursive = FALSE))
    # elements whose names occur in only one of the lists
    u1 <- x[!(names(x) %in% comm)]
    u2 <- y[!(names(y) %in% comm)]
    # for shared names, take the union of the two values
    comml <- list()
    for (i in seq_along(comm))
        comml[[comm[i]]] <- union(x[[comm[i]]], y[[comm[i]]])
    return(unlist(list(u1, comml, u2), recursive = FALSE))
}

3) use list2env to create the environment of the final result

using code like

l1 <- contents(hgu95av2GO)
l2 <- contents(hgu95bGO)
...
l5 <- contents(hgu95eGO)
LU <- listUnion(l1,l2)
LU <- listUnion(LU,l3)
...
hgu95GO <- list2env(LU)

leads to 62906 entries in the resulting environment. these operations concluded very rapidly on my laptop. whether listUnion is something to go along with list2env in Biobase, or whether other tools for merging environments should be provided, are topics open to discussion.
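The three steps above can be sketched end to end on toy data using base R only. This is an illustration under assumptions, not the code from the post: as.list() on an environment is swapped in for contents(), mergeNamedLists is a hypothetical compact stand-in for listUnion, and the probe set names and GO identifiers are invented.

```r
# Hypothetical helper: names unique to either list are kept as is,
# and values stored under shared names are merged with union().
mergeNamedLists <- function(x, y) {
  shared <- intersect(names(x), names(y))
  merged <- c(x[setdiff(names(x), shared)], y[setdiff(names(y), shared)])
  for (key in shared) merged[[key]] <- union(x[[key]], y[[key]])
  merged
}

# Toy environments standing in for two chip annotation environments.
envA <- new.env()
envB <- new.env()
assign("p1_at", c("GO:0000001", "GO:0000002"), envir = envA)
assign("p2_at", "GO:0000003", envir = envA)
assign("p2_at", c("GO:0000003", "GO:0000004"), envir = envB)
assign("p3_at", "GO:0000005", envir = envB)

# Step 1: environments to named lists (base R as.list, an assumed substitute for contents()).
lists <- list(as.list(envA), as.list(envB))

# Step 2: fold the pairwise union over however many lists there are.
combined <- Reduce(mergeNamedLists, lists)

# Step 3: back to a single environment whose own frame holds every key.
combinedEnv <- list2env(combined, envir = new.env())
allKeys <- ls(envir = combinedEnv)
```

Because every key now lives in the frame of combinedEnv itself, ls() enumerates all of them, which is what functions that rely on ls() need.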
Hi Vince,

thanks a lot for your suggestion, which I'll try. I sent another email on this thread this morning, and I'm not sure it went through (although I do see it in the thread archive on the web). Namely, I thought of replacing my command:

obs<-ls(env=hgu95GO)

(which limited my observations to the 12625 probe sets on av2) with

obs<-hgu95.probeset

where the latter is the union of the probe sets for all 5 chips. Then, proceeding as before:

> ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)

things seemed to work as desired, at least in the short run. But from what you say, I gather it might be dangerous to use environments in this way, and using lists in the way you indicate might in any case be safer in the longer run.

Thanks again,
Elisabetta

On Tue, 13 Jan 2004, Vincent Carey 525-2265 wrote:

>
> >
> > Hi,
> > I'm writing with a question that follows up on a previous correspondence
> > I had within this mailing list (copied below).
> > Essentially I'm trying to use ontoTools to get a mapping between the union
> > of the probe sets on the HGU95 av2, b, c, d, e and GO biological process
> > terms.
> > To this end, I've first created an environment, named hgu95GO, which was
> > defined as described below, using parent.env, following a suggestion by R.
> > Gentleman:
> >
> > > hgu95GO<-hgu95av2GO
> > > parent.env(hgu95GO)<-hgu95bGO
> > > parent.env(hgu95bGO)<-hgu95cGO
> > > parent.env(hgu95cGO)<-hgu95dGO
> > > parent.env(hgu95dGO)<-hgu95eGO
> >
> > For this we have
> >
> > > length(ls(env=hgu95GO))
> > [1] 12625
> >
> > which is the same as the length for hgu95av2GO, rather than the length of
> > the union of the 5 probe set collections from av2 to e (which is 62906).
> > However if I look for a value of a key corresponding to a probe set from
> > b, c,... indeed it gives me something by looking at the parent. I'm not
> > too familiar with environments in R, but I guess this is the expected
> > behavior. Now, I've built a mapping:
> >
> > ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)
>
> the otkv function does use ls() and will be stymied by
> the strange behavior of the environment union created
> above.
>
> i note in the doc for parent.env that this function is
> regarded as dangerous and may be deprecated.
>
> i believe that this "one time" operation can be done,
> albeit inelegantly, with a manual approach:
>
> 1) use the contents function in annotate to convert
> environments to lists
> 2) use listUnion to create union of lists. i have
> written a function to do this; it basically concatenates
> all uniquely named elements of two lists and forms
> unions of elements that share names in the two lists
>
> listUnion <- function (x, y)
> {
>     if (is.null(names(x)) || is.null(names(y))) {
>         warning("unnamed lists imply union is concatenation")
>         return(unlist(list(x, y), recursive = FALSE))
>     }
>     comm <- intersect(names(x), names(y))
>     if (length(comm) == 0)
>         return(unlist(list(x, y), recursive = FALSE))
>     u1 <- x[!(names(x) %in% comm)]
>     u2 <- y[!(names(y) %in% comm)]
>     comml <- list()
>     for (i in 1:length(comm)) comml[[comm[i]]] <- union(x[[comm[i]]],
>         y[[comm[i]]])
>     return(unlist(list(u1, comml, u2), recursive = FALSE))
> }
>
> 3) use list2env to create the environment of the final result
>
> using code like
> l1 <- contents(hgu95av2GO)
> l2 <- contents(hgu95bGO)
> ...
> l5 <- contents(hgu95eGO)
> LU <- listUnion(l1,l2)
> LU <- listUnion(LU,l3)
> ...
> hgu95GO <- list2env(LU)
>
> leads to 62906 entries in the resulting environment. these operations
> concluded very rapidly on my laptop. whether listUnion is something
> to go along with list2env in Biobase, or other tools for merging
> environments should be provided, are topics open to discussion.
>
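The workaround in this reply, passing an explicit key vector rather than the result of ls() on the chained environment, might look like the following on toy data. How hgu95.probeset was actually built is not shown in the thread; collecting the keys per environment with union(), as below, is an assumption, and the environments and values are invented.

```r
# Toy environments standing in for two chip annotation environments.
envA <- new.env(parent = emptyenv())
envB <- new.env(parent = emptyenv())
assign("p1_at", "GO:0000001", envir = envA)
assign("p2_at", "GO:0000002", envir = envB)

# Gather the keys from each environment separately; ls() on the chained
# environment would list only the keys stored in envA.
allKeys <- union(ls(envir = envA), ls(envir = envB))

# Chain the environments as in the thread.
parent.env(envA) <- envB

# mget() with inherits = TRUE resolves every key through the parent chain.
values <- mget(allKeys, envir = envA, inherits = TRUE)
```

The result is a named list with one entry per key, whichever environment in the chain actually stores it.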
@elisabetta-manduchi-575
Last seen 9.7 years ago
I think I found an easy solution to the question I had posed below. Namely, I was setting

obs<-ls(env=hgu95GO)

which limited my observations to the 12625 probe sets on av2. By resetting

obs<-hgu95.probeset

where the latter is the union of the probe sets for all 5 chips, and then proceeding as before:

> ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)

things seem to work as desired. Sorry if I wasted anybody's time on this question.

Elisabetta

On Mon, 12 Jan 2004, Elisabetta Manduchi wrote:

>
> Hi,
> I'm writing with a question that follows up on a previous correspondence
> I had within this mailing list (copied below).
> Essentially I'm trying to use ontoTools to get a mapping between the union
> of the probe sets on the HGU95 av2, b, c, d, e and GO biological process
> terms.
> To this end, I've first created an environment, named hgu95GO, which was
> defined as described below, using parent.env, following a suggestion by R.
> Gentleman:
>
> > hgu95GO<-hgu95av2GO
> > parent.env(hgu95GO)<-hgu95bGO
> > parent.env(hgu95bGO)<-hgu95cGO
> > parent.env(hgu95cGO)<-hgu95dGO
> > parent.env(hgu95dGO)<-hgu95eGO
>
> For this we have
>
> > length(ls(env=hgu95GO))
> [1] 12625
>
> which is the same as the length for hgu95av2GO, rather than the length of
> the union of the 5 probe set collections from av2 to e (which is 62906).
> However if I look for a value of a key corresponding to a probe set from
> b, c,... indeed it gives me something by looking at the parent. I'm not
> too familiar with environments in R, but I guess this is the expected
> behavior. Now, I've built a mapping:
>
> ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)
>
> where tms are the nodes of a go biological process graph that I've built
> and obs is ls(env=hgu95GO).
> I'm concerned though that since the latter seems to be equal to
> ls(env=hgu95av2GO) only, I'm not getting the mapping that I desired from
> the union of all *5* collections of probe sets on the 5 Affy chips to my
> GO BP terms and indeed if I run:
>
> > ooc.Hgu95.GOBP<-makeOOC(goBPonto, ooMapHgu952GOBP)
> > print(OOmap(ooc.Hgu95.GOBP))
> named sparse matrix of dim[1] 12625 7807
>
> the first dimension is 12625, not 62906. Am I correct in my
> interpretation that I'm not getting what I was seeking? If so, how can I
> get around it? I.e. what would be the quickest way to build a mapping from
> the union of the probe sets on the 5 Affy Chips to my Go BP ontology?
> Thanks for any help you might give me,
> Elisabetta
>
> On Wed, 17 Dec 2003, Elisabetta Manduchi wrote:
>
> >
> > Robert,
> > thank you very much for such a prompt reply.
> > A question, with your notation below, shouldn't I do my gets on E5 rather
> > than on E1, if E1 is the ancestor? Was that just a typo or am I
> > misunderstanding?
> > In other words to deal with my case, I guess I could do the following
> > (calling hgu95GO my new combined environment):
> >
> > hgu95GO<-hgu95av2GO
> > parent.env(hgu95GO)<-hgu95bGO
> > parent.env(hgu95bGO)<-hgu95cGO
> > parent.env(hgu95cGO)<-hgu95dGO
> > parent.env(hgu95dGO)<-hgu95eGO
> >
> > and then do my gets on hgu95GO, right?
> > Elisabetta
> >
> > On Wed, 17 Dec 2003, Robert Gentleman wrote:
> >
> > > On Wed, Dec 17, 2003 at 03:47:54PM -0500, Elisabetta Manduchi wrote:
> > > >
> > > > Hi,
> > > > I would like to create an environment that combines the hgu95av2GO,
> > > > hgu95bGO, ...hgu95eGO into just one environment, that I can subsequently
> > > > use as the otkvEnv argument in the ontoTools function otkvEnv2namedSparse.
> > > > Is there a quick and simple way to do this in R, without having to define
> > > > a new hash and the key-value mapping block by block (according to which
> > > > environment the key belongs to)? In other words, I'm asking if there is a
> > > > one-stop way in R to "union" the above 5 environments.
> > >
> > > Sort of, in R there are no hash tables but environments are close so
> > > we used them. They have a rather unique aspect which is the parent
> > > environment. So that if a value is not found in the first environment
> > > the parent is searched. So to solve your problem you could do
> > > something like
> > >
> > > parent.env(E2) <- E1
> > > parent.env(E3) <- E2
> > > ...
> > >
> > > then do your get's on E1 (with inherits=TRUE, which is the default
> > > in get) and you should be almost set. The one issue that is not
> > > easily solved is that if inherits is TRUE then you search up beyond
> > > the last of your environments (your work space and then the search
> > > list). But that should not be a problem in your case....
> > >
> > > Robert
> > >
> > > > Thanks,
> > > > Elisabetta
> > > >
> > > > _______________________________________________
> > > > Bioconductor mailing list
> > > > Bioconductor@stat.math.ethz.ch
> > > > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
> > > >
> > >
> >
>
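The caveat quoted above, that a lookup with inherits = TRUE keeps searching past the last chained environment, can be seen in a small sketch. The environments are toy stand-ins, and ending the chain with emptyenv() is one possible way to confine lookups; it is offered as an assumption, not something proposed in the thread.

```r
# envB keeps an ordinary parent, so lookups can continue past it.
envA <- new.env(parent = emptyenv())
envB <- new.env(parent = globalenv())
assign("p1_at", "GO:0000001", envir = envA)
parent.env(envA) <- envB

# "mean" is not a probe set key, yet it is found: the search runs past envB
# into the global environment and on through the search list to base R.
leaks <- exists("mean", envir = envA)

# Ending the chain with the empty environment confines lookups to envA and envB.
parent.env(envB) <- emptyenv()
confined <- exists("mean", envir = envA)
```

With the chain terminated, a key that is absent from every chained environment is reported as missing rather than being resolved to an unrelated object that happens to share its name.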