Converting annotate lists to a matrix


michael watson IAH-C ★ 3.4k


Hi

This is kind of an R problem, but on Bioconductor data. For example, I have the hu6800PATH environment from the hu6800 annotation package. The example in the help is this:

    xx <- as.list(hu6800PATH)
    xx <- xx[!is.na(xx)]

What I actually want is a matrix with two columns, the first being probe id and the second being pathway id - I'm going to do some relational joins with this data using merge().

I've got as far as:

    as.matrix(unlist(xx))

But that doesn't give me exactly what I want. The rownames of the resulting matrix are set to the probe_ids, but where there are duplicate probe ids (where probes are in >1 pathway) R appends a number on the end.

Can anyone help me convert the list format from an annotation package to a matrix as I describe above?

Thanks
Mick

Annotation hu6800 convert • 1.4k views

Posted 20.4 years ago by michael watson IAH-C


John Zhang ★ 2.9k


> What I actually want is a matrix with two columns, the first being probe
> id and the second being pathway id - I'm going to do some relational
> joins with this data using merge().

You may try:

    xx <- as.list(hu6800PATH)
    xx <- unlist(xx, use.names = TRUE)
    xx <- cbind(names(xx), xx)

The first column of xx will be probe ids with an integer appended to the end if a probe has multiple mappings. Use pattern match to remove the trailing integers from the first column then you are done.

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084
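Editor's sketch of the suggestion above, run on a toy list rather than the real hu6800PATH data. The probe and pathway ids here are made up for illustration, and the pattern used to strip the suffix is an assumption, not something given in the reply.

```r
# Toy stand-in for as.list(hu6800PATH); these ids are illustrative only.
xx <- list(A_at = c("04010", "04060"), B_at = "00190")

# unlist() keeps the list names, adding 1, 2, ... only where an element
# holds more than one value.
flat <- unlist(xx, use.names = TRUE)
names(flat)   # "A_at1" "A_at2" "B_at"

# Strip the trailing digits that unlist() appended.
# ASSUMPTION: no genuine probe id ends in a digit; if one did,
# this pattern would truncate it as well.
probe_id <- sub("[0-9]+$", "", names(flat))

# Two-column character matrix: probe id, pathway id.
m <- cbind(probe_id, pathway_id = unname(flat))
```

This only works cleanly when real ids never end in a digit, so it is worth checking the ids of the chip in question before relying on it.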

Posted 20.4 years ago by John Zhang


Wolfgang Huber ★ 13k


EMBL European Molecular Biology Laborat…

Hi Michael,

try this:

    res = do.call("rbind", args=lapply(seq(along=xx),
            function(i) cbind(names(xx)[i], xx[[i]])))

    > res[1:5,]
         [,1]        [,2]
    [1,] "Z22536_at" "04010"
    [2,] "Z22536_at" "04060"
    [3,] "Z22536_at" "04350"
    [4,] "X60221_at" "00190"
    [5,] "X60221_at" "00193"

Best regards
Wolfgang

-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax: +44 1223 494486
Http: www.ebi.ac.uk/huber
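Editor's sketch of how the result above could feed the merge() join mentioned in the question. The list below is a toy subset built from the five rows shown in the output, and the `expr` table with its `value` column is hypothetical, included only to have something to join against.

```r
# Toy subset mimicking the first rows of the output above; a real probe
# may map to more pathways than are listed here.
xx <- list(Z22536_at = c("04010", "04060", "04350"),
           X60221_at = c("00190", "00193"))

# One small two-column matrix per probe, stacked row-wise.
res <- do.call("rbind", lapply(seq_along(xx),
         function(i) cbind(names(xx)[i], xx[[i]])))

# merge() works on data frames, so name the columns explicitly.
probe2path <- data.frame(probe_id = res[, 1], pathway_id = res[, 2],
                         stringsAsFactors = FALSE)

# HYPOTHETICAL second table keyed on probe id.
expr <- data.frame(probe_id = c("Z22536_at", "X60221_at"),
                   value = c(1.5, 2.5), stringsAsFactors = FALSE)

# Relational join: one row per (probe, pathway) pair, with value attached.
joined <- merge(probe2path, expr, by = "probe_id")
```

Note that merge() sorts the result by the key column by default, so the row order of `joined` will differ from that of `res`.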

Posted 20.4 years ago by Wolfgang Huber


michael watson IAH-C ★ 3.4k


Hi

Thanks for that :-)

It was actually an easy and quick way to do the latter that I was looking for. I can't just indiscriminately get rid of all integers that appear at the end of an id, in case there are ids that have integers at the end and are perfectly valid. So I am left faced with writing some kind of loop, which is what I wanted to avoid in the first place.

I don't want to annoy anyone, but am I the only person who finds the lists from Bioconductor annotation packages a little unhelpful and hard to work with? In every example in the help, the first thing they do is unlist() the list; so why is it a list in the first place?

Thanks
Mick

-----Original Message-----
From: John Zhang [mailto:jzhang@jimmy.harvard.edu]
Sent: 10 February 2005 13:44
To: michael watson (IAH-C)
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] Converting annotate lists to a matrix

> The first column of xx will be probe ids with an integer appended to
> the end if a probe has multiple mappings. Use pattern match to remove
> the trailing integers from the first column then you are done.
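Editor's sketch of one way to sidestep both the explicit loop and the suffix stripping: repeat each list name once per mapping, and unlist the values without names. This is not a method proposed in the thread, just a common base R idiom; the ids in the toy list are made up, with two of them deliberately ending in a digit.

```r
# Toy stand-in for the NA-filtered list; ids are illustrative only.
xx <- list(A_at = c("04010", "04060"), B7 = "00190", C2 = c("00190", "00193"))

# Number of pathway ids attached to each probe.
n <- lengths(xx)

# Repeat each probe id once per mapping and pair it with the flattened
# values. No names are generated by unlist(), so nothing needs stripping
# and ids such as "B7" or "C2" come through untouched.
res <- cbind(probe_id   = rep(names(xx), times = n),
             pathway_id = unlist(xx, use.names = FALSE))
```

`lengths()` assumes a reasonably recent R; `sapply(xx, length)` gives the same counts on older versions.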

Posted 20.4 years ago by michael watson IAH-C
