colnames and get means for the columns with the "same" names


Sean Davis 21k

@sean-davis-490


United States

Hi, Weiwei.

You probably want to look at a combination of merge() to combine your data with your conversion table followed by aggregate(). Read up on the help for those two functions and that should do it, if I understand what you want to do. However, keep in mind that "averaging" the probesets representing the same gene may not represent the best solution. Also, if you search the archive a bit, I know this question has come up before.

Sean

-----Original Message-----
From: Weiwei Shi [mailto:helprhelp@gmail.com]
Sent: Mon 11/6/2006 4:53 PM
To: r-help
Cc: bioconductor
Subject: [BioC] colnames and get means for the columns with the "same" names

hi,

I have a conversion table for colnames like this:

     Probe_ID          HUMAN_LLID
  1  AF106325_PROBE1         7052
  2  NM_019386_PROBE1        7052
  3  NM_012907_PROBE1         339
  4  AW917796_PROBE1        84196
  5  L27651_PROBE1          10864

The Probe_ID contains a list of colnames for another data.frame, say x1. I need to convert such colnames to another ID's system, HUMAN_LLID by using the table. The colnames of x1 with the same names (in HUMAN_LLID) need to be averaged. Is there a good way to do it?

I also put this question in bioconductor since I believe it might be solved by some package.

thanks.

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
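For readers who want to try the merge() plus aggregate() route suggested above, here is a minimal sketch. It is only one possible reading of the suggestion, not code from the thread. The conversion table is the five-row example from the question; the data frame x1 (3 samples by 5 probes, filled with 1:15) and the sample names S1..S3 are made up for illustration. Because the probes are columns of x1, the sketch first transposes so that each probe becomes a row that can be merged against the table.

```r
# Conversion table copied from the question
conv <- data.frame(
  Probe_ID   = c("AF106325_PROBE1", "NM_019386_PROBE1", "NM_012907_PROBE1",
                 "AW917796_PROBE1", "L27651_PROBE1"),
  HUMAN_LLID = c(7052, 7052, 339, 84196, 10864),
  stringsAsFactors = FALSE
)

# Hypothetical data: 3 samples (rows) x 5 probes (columns)
x1 <- as.data.frame(matrix(1:15, nrow = 3,
                           dimnames = list(paste0("S", 1:3), conv$Probe_ID)))

# One row per probe, one column per sample, plus the probe ID as a key
long <- data.frame(Probe_ID = colnames(x1), t(x1),
                   stringsAsFactors = FALSE, check.names = FALSE)

# Attach the target ID to every probe row
merged <- merge(conv, long, by = "Probe_ID")

# Average all probe rows that share a HUMAN_LLID, sample by sample
agg <- aggregate(merged[, rownames(x1)],
                 by = list(HUMAN_LLID = merged$HUMAN_LLID),
                 FUN = mean)

# Back to samples x IDs, with the IDs as column names
per_gene <- t(as.matrix(agg[, -1]))
colnames(per_gene) <- agg$HUMAN_LLID
```

In this toy example the two probes mapping to 7052 are averaged into a single column, and the other three IDs pass through unchanged. Probes missing from the conversion table are silently dropped by merge() with its default arguments.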


Weiwei Shi ★ 1.2k

@weiwei-shi-1407


Hi,

I played around with these two functions but did not get what I wanted. So I wrote a function that uses a loop, and it runs in a reasonable time:

> system.time(t3 <- iconix.convert(processed, 9, 7486, probes2llid.genego[,c(2,5)]))
[1] 12.356  4.494 16.836  0.000  0.000
> dim(t3)
[1]  129 4255

I am more interested in approaches other than "averaging". I will look into the archive, since this is a very common problem in microarray analysis.

I am posting my function here in case someone needs it in the future.

iconix.convert <- function(orig, st=9, ed=7486, c.table){
  # columns st..ed of orig hold the probe measurements
  t1 <- orig[, st:ed]
  # treat missing values as 0
  t1 <- sapply(t1, function(x){ x[is.na(x)] <- 0; x })
  # c.table: column 1 = probe IDs (colnames of orig), column 2 = target IDs
  x0 <- unique(c.table[,2])
  out <- matrix(0, dim(t1)[1], length(x0))
  j <- 1
  for (i in x0){
    # as.character() so that factor IDs select columns by name, not by level code
    avg.col <- as.character(c.table[c.table[,2]==i, 1])
    if (length(avg.col) > 1){
      # several probes map to this ID: average them row-wise
      t2 <- rowMeans(t1[, avg.col])
    } else{
      t2 <- t1[, avg.col]
    }
    out[,j] <- t2
    j <- j + 1
  }
  out <- as.data.frame(out)
  colnames(out) <- x0
  # re-attach the non-probe columns on either side of the averaged block
  out2 <- cbind(orig[, c(1:(st-1))], out, orig[,c((ed+1):dim(orig)[2])])
  colnames(out2)[dim(out2)[2]] <- "Group"
  out2
}
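The loop in the function above can also be written without an explicit counter by splitting the probe columns by target ID and taking row means per group. The following is a sketch under stated assumptions, not a drop-in replacement: the helper name average_by_id and the toy data are hypothetical, and unlike iconix.convert it skips missing values (na.rm = TRUE) instead of replacing them with 0, which changes the averages whenever NAs are present.

```r
# Hypothetical helper: average the columns of x that map to the same target ID.
#   x          data frame or matrix, samples in rows, probes in columns
#   probe_ids  probe names (column names of x)
#   target_ids target ID for each probe, same order as probe_ids
average_by_id <- function(x, probe_ids, target_ids) {
  # Keep only the mapped probes, in conversion-table order
  m <- as.matrix(x[, as.character(probe_ids), drop = FALSE])
  # Column positions grouped by target ID
  groups <- split(seq_along(target_ids), target_ids)
  # Row means within each group; drop = FALSE keeps single-probe groups as matrices
  cols <- lapply(groups, function(idx) rowMeans(m[, idx, drop = FALSE], na.rm = TRUE))
  # One column per target ID, named after the ID
  do.call(cbind, cols)
}

# Toy usage with the conversion table from the question and made-up data
conv <- data.frame(
  Probe_ID   = c("AF106325_PROBE1", "NM_019386_PROBE1", "NM_012907_PROBE1",
                 "AW917796_PROBE1", "L27651_PROBE1"),
  HUMAN_LLID = c(7052, 7052, 339, 84196, 10864),
  stringsAsFactors = FALSE
)
x1 <- as.data.frame(matrix(1:15, nrow = 3,
                           dimnames = list(paste0("S", 1:3), conv$Probe_ID)))
res <- average_by_id(x1, conv$Probe_ID, conv$HUMAN_LLID)
```

The result is a numeric matrix with one row per sample and one column per distinct ID. Any non-probe columns (such as the leading annotation columns and the trailing "Group" column handled by iconix.convert) would still need to be bound back on separately.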
