average replicate columns in a matrix

1


Wendy Qiao ▴ 360

@wendy-qiao-4501

Last seen 9.6 years ago

I have a matrix like the following:

> test=matrix(c(1:15),nrow=3,ncol=5)
> colnames(test)=c("A","A","B","B","B")
> test
     A A B  B  B
[1,] 1 4 7 10 13
[2,] 2 5 8 11 14
[3,] 3 6 9 12 15

I want to calculate the average of each set of replicates, i.e. I want the output to be

       A  B
[1,] 2.5 10
[2,] 3.5 11
[3,] 4.5 12

I can do this by looping through each level of the column names, but I was wondering if there is a function for calculating the average of replicates in one step.

Thank you in advance.
Wendy


ADD COMMENT • link updated 13.1 years ago by Amos Folarin ▴ 80 • written 13.1 years ago by Wendy Qiao ▴ 360

2


Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 11 days ago

EMBL European Molecular Biology Laborat…

Hi Wendy

this is by looping, but the R code is so compact that the added value of a specialised function might be marginal:

cn = colnames(test)
sapply(unique(cn), function(g) rowMeans(test[,cn==g,drop=FALSE]))

       A  B
[1,] 2.5 10
[2,] 3.5 11
[3,] 4.5 12

Or also:

by(t(test), colnames(test), colMeans)

INDICES: A
 V1  V2  V3
2.5 3.5 4.5
------------------------------------------------------------
INDICES: B
V1 V2 V3
10 11 12

Best wishes
Wolfgang

--
Wolfgang Huber EMBL http://www.embl.de/research/units/genome_biology/huber
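If the sapply() approach is needed repeatedly, it can be wrapped in a small helper. This is only a minimal sketch and not part of the original reply: the function name average_replicates and its groups argument are made up for illustration, and only base R is used.

```r
# Hypothetical helper (name and arguments are illustrative, not from any package):
# average the columns of a matrix that share the same label.
average_replicates <- function(x, groups = colnames(x)) {
  stopifnot(is.matrix(x), length(groups) == ncol(x))
  groups <- as.character(groups)  # so sapply() can name the output columns
  # One column of row means per unique label; drop = FALSE keeps a group
  # with a single replicate as a one-column matrix so rowMeans() still works.
  sapply(unique(groups), function(g) rowMeans(x[, groups == g, drop = FALSE]))
}

test <- matrix(1:15, nrow = 3, ncol = 5)
colnames(test) <- c("A", "A", "B", "B", "B")
res <- average_replicates(test)
print(res)
```

Note that sapply() simplifies its result, so for a matrix with a single row this sketch would return a named vector rather than a one-row matrix.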

ADD COMMENT • link 13.1 years ago Wolfgang Huber ★ 13k

1


Gordon Smyth 50k

@gordon-smyth

Last seen 5 minutes ago

WEHI, Melbourne, Australia

Dear Wendy,

The function avearrays() in the limma package does exactly this:

> library(limma)
> avearrays(test)
       A  B
[1,] 2.5 10
[2,] 3.5 11
[3,] 4.5 12

Best wishes
Gordon
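When the replicate labels are held in a separate vector rather than in the column names, they can, as far as I know, be passed through the ID argument of avearrays(). The sketch below assumes that limma is installed and that the matrix method accepts ID in this way; the variable names grp and ave are made up for illustration, and the block is skipped if limma is not available.

```r
# Assumes the Bioconductor package limma is installed; skipped otherwise.
if (requireNamespace("limma", quietly = TRUE)) {
  test <- matrix(1:15, nrow = 3, ncol = 5)   # no column names this time
  grp  <- c("A", "A", "B", "B", "B")         # replicate labels kept separately

  # ID supplies the labels that would otherwise be taken from colnames(test)
  # (assumption: the default/matrix method takes an ID argument).
  ave <- limma::avearrays(test, ID = grp)
  print(ave)
}
```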

ADD COMMENT • link 13.1 years ago Gordon Smyth 50k

0


Dear Gordon,

Is there a corresponding method for sum over replicate arrays? Thank you so much for your help!

Best regards,
Julie

ADD REPLY • link 13.1 years ago Julie Zhu ★ 4.3k

0


Dear Julie,

Do you mean computing the sum of the replicate columns rather than the mean? You can essentially do that with a call to rowsum() on the transposed matrix.

Best wishes
Gordon
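The rowsum() suggestion can be sketched as follows, using the example matrix from the original question. This is an illustration in base R only; the variable names sums, n and means are made up here.

```r
test <- matrix(1:15, nrow = 3, ncol = 5)
colnames(test) <- c("A", "A", "B", "B", "B")

# rowsum() adds up rows that share a group label, so transpose first
# (replicates become rows), sum by label, then transpose back.
sums <- t(rowsum(t(test), group = colnames(test)))
print(sums)
#      A  B
# [1,] 5 30
# [2,] 7 33
# [3,] 9 36

# Dividing each column by its replicate count recovers the means.
n <- table(colnames(test))[colnames(sums)]
means <- sweep(sums, 2, as.vector(n), "/")
```

By default rowsum() returns the groups in sorted order (reorder = TRUE), which may differ from the order in which the labels first appear in the matrix.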

ADD REPLY • link 13.1 years ago Gordon Smyth 50k

0


Dear Dr. Gordon:

I saw your reply below to somebody's question on the BioC mailing list. I already have the password to the list, which is why I can see the list's emails, but I still have to wait for the list moderator to approve my request, and my message is being held because of that. So I have copied and pasted below the original message I sent to the list, and I hope you can give me a quick answer or suggestion. Thanks a lot in advance!

Hi, List:

I have a question regarding the paired-samples test in the limma package. Here is what I have:

#### W and B are race
#### T and N are tumor and normal (paired by the same Samples), respectively
#### Samples are individual patients: tar$AccNum
> library(limma);
> tar<-readTargets("Test_Desc.txt");
> mydata<-read.delim("TestData.txt",as.is=TRUE,row.names=1);
> mydata<-mydata[,tar$mRNASampleNames];
> tar$Type<-gsub("Tumor","T",tar$Type)
> tar$Type<-gsub("Normal","N",tar$Type)
> group1<-paste(tar$RACE,tar$Type,sep=".");
> group<-factor(group1, levels=c( "W.T","W.N","B.T","B.N"))
> Samples<-factor(tar$AccNum);

#### first option to set up the design matrix
> design<-model.matrix(~Samples+group);
> colnames(design)<-sub("group","",colnames(design));
> colnames(design)<-sub("Samples","",colnames(design));
> colnames(design)[1] <- "Intercept";
> con.matrix<-makeContrasts(AA.T_AA.N=B.T-B.N,EA.T_EA.N=-W.N,levels=design);
> lmFit(mydata,design)->fit1;
Coefficients not estimable: B.N
Warning message:
Partial NA coefficients for 26804 probe(s)
> contrasts.fit(fit1, con.matrix)->fit2
Error in contrasts.fit(fit1, con.matrix) :
  trying to take contrast of non-estimable coefficient

This did not work, and I am not sure why. I then tried a second option to set up the design matrix:

> design<-model.matrix(~-1+group+Samples);
> colnames(design)<-sub("group","",colnames(design));
> colnames(design)<-sub("Samples","",colnames(design));
> con.matrix<-makeContrasts(AA.T_AA.N=B.T-B.N,EA.T_EA.N=W.T-W.N,levels=design);
> lmFit(mydata,design)->fit1;
Coefficients not estimable: S14810
Warning message:
Partial NA coefficients for 26804 probe(s)
> contrasts.fit(fit1, con.matrix)->fit2
> eBayes(fit2)->fit3

This works in the sense that it generates gene lists, but the call lmFit(mydata,design)->fit1 leaves a warning message (Coefficients not estimable: S14810). I am not sure why this happens. With some other datasets I did not see such a warning at all, so it seems to be dataset-specific. I am not sure why there is a warning here or whether the resulting list is reliable. Could someone please offer some suggestions?

Thanks in advance!
Ming

Ming Yi
ABCC/ISP, National Cancer Institute at Frederick

ADD REPLY • link 13.1 years ago Yi, Ming NIH/NCI [C] ▴ 100

0


Amos Folarin ▴ 80

@amos-folarin-4200

Last seen 9.6 years ago

Wendy,

Depending on what you want, i.e. row-based sums or the sum over the whole column, you should use colSums or sum as FUN. You can probably achieve what you want with something like this:

> sums <- by(t(test), INDICES=list(colnames(test)), FUN=colSums, simplify=TRUE)
> sums
: A
V1 V2 V3
 5  7  9
---------------------------------------------------------------------
: B
V1 V2 V3
30 33 36

> as.data.frame(c(sums)) ##return back to a
   A  B
V1 5 30
V2 7 33
V3 9 36

ADD COMMENT • link 13.1 years ago Amos Folarin ▴ 80
