Question

collapseRows function


Alyaa Mahmoud ▴ 10

@alyaa-mahmoud-4888

Last seen 9.6 years ago

Dear BioC group,

I am using the collapseRows function from the WGCNA package. I need to average expression data from probe sets that map to the same Entrez ID. The documentation states that I have to specify two arguments: rowID (the probe set identifiers) and rowGroup (the Entrez IDs), which I did below:

    summ = collapseRows(datET=ww, rowGroup=ww[,1], rowID=rownames(ww), method="Average")

The output should be an expression matrix in which the expression data are averaged over the probes that map to the same gene ID, with one row per gene ID. My problem is with this latter part: I don't get gene IDs in the rows. Instead, this is what I get:

              GSM97965.CEL.gz GSM97966.CEL.gz GSM97969.CEL.gz GSM97970.CEL.gz
    1                2.176576        2.176576        2.176576        2.176576
    10               2.176576        2.176576        2.176576        2.176576
    100              4.517431        3.414851        3.225376        3.113286
    1000            12.330020       12.666929       10.414347       10.479440
    10000            7.051655        6.951304        7.111559        8.566242
    100009676        2.176576        2.176576        2.359885        2.176576
              GSM97971.CEL.gz GSM97972.CEL.gz
    1                2.176576        2.176576
    10               2.176576        2.176576
    100              3.170813        3.093058
    1000            10.300492       10.425190
    10000            7.494686        7.415322
    100009676        2.176576        2.176576

Any help would be much appreciated.

Thanks,
Alyaa

probe probe • 2.1k views

updated 12.6 years ago by Alex Gutteridge ▴ 650 • written 12.6 years ago by Alyaa Mahmoud ▴ 10

score 0 · Answer 1 · 2011-09-29

Hi Alyaa,

What do you mean by 'gene IDs in the rows'? It looks like the function has worked fine to me. The rownames of the matrix you show look like they are Entrez Gene IDs and the columns are the (I assume) mean expression values for each gene.

-- Alex Gutteridge
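To illustrate why group identifiers end up as row names after averaging, here is a minimal base R sketch on toy data. It is not the WGCNA implementation: the matrix `expr`, the vector `entrez`, and all values are made up for illustration, and the averaging is done with `rowsum()` rather than `collapseRows()`.

```r
# Toy data: 5 probe sets (rows) x 2 samples (columns); values are invented.
expr <- matrix(c(2, 4, 6, 8, 10,
                 1, 3, 5, 7, 9),
               nrow = 5,
               dimnames = list(paste0("probe", 1:5), c("s1", "s2")))

# Hypothetical probe-to-Entrez mapping, one entry per row of expr.
entrez <- c("1", "1", "10", "100", "100")

# Sum the rows within each Entrez ID, then divide by the number of
# probes per ID to obtain the per-gene mean.
sums   <- rowsum(expr, group = entrez)
counts <- as.vector(table(entrez)[rownames(sums)])
avg    <- sums / counts

# The row names of the result are the group labels, i.e. the Entrez IDs.
print(avg)
```

The probe names disappear because several probes are merged into one row; the only label left that identifies each row is the group it was averaged over.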

score 0 · Answer 2 · 2011-09-29

On Thu, 29 Sep 2011 12:27:04 +0200, Alyaa Mahmoud wrote:
> Hi Alex
>
> I mean the (1, 10, 1000, 10000) that I get in the rows instead of
> gene IDs), i.e. yes the columns are the mean of expression values but
> don't know of which genes ??

I think we are using different terminology. By 'gene ID' I am referring to Entrez Gene IDs, which are the numeric identifiers you are seeing. You can use the org.Hs.eg.db package to map those identifiers to others if you want. Perhaps you would prefer gene symbols, in which case:

    library(annotate)
    library(org.Hs.eg.db)
    rownames(summ) = getSYMBOL(rownames(summ), "org.Hs.eg.db")

This will replace the numeric IDs in the summ matrix with gene symbols.

-- Alex Gutteridge
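One practical wrinkle with relabelling is that some Entrez IDs may have no symbol, which would leave NA row names. The sketch below is a base R illustration of one way to handle that. The helper `relabel_rows` and the toy `mat` and `lookup` objects are hypothetical names introduced here, not part of annotate or WGCNA. With real data, the lookup could presumably be built with `AnnotationDbi::mapIds(org.Hs.eg.db, keys = ids, column = "SYMBOL", keytype = "ENTREZID")`, which returns a named character vector. Also, as far as I know `collapseRows()` returns a list whose `datETcollapsed` element holds the matrix, so the relabelling may need to be applied to `summ$datETcollapsed` rather than to `summ` itself; check `str(summ)` to confirm.

```r
# Replace Entrez row names with gene symbols, keeping the Entrez ID
# wherever the lookup has no symbol for it.
# `lookup` is a named character vector: names are Entrez IDs, values are symbols.
relabel_rows <- function(mat, lookup) {
  ids <- rownames(mat)
  sym <- unname(lookup[ids])           # NA where an ID is not in the lookup
  missing <- is.na(sym)
  sym[missing] <- ids[missing]         # fall back to the Entrez ID
  rownames(mat) <- make.unique(sym)    # guard against duplicated symbols
  mat
}

# Toy matrix with Entrez IDs as row names; "999999999" is a made-up ID
# that is deliberately absent from the lookup.
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("1", "10", "999999999"), c("s1", "s2")))

# Small illustrative lookup; a real one would come from an annotation package.
lookup <- c("1" = "A1BG", "10" = "NAT2")

relabeled <- relabel_rows(mat, lookup)
print(rownames(relabeled))
```

Falling back to the Entrez ID keeps every row identifiable, and `make.unique()` avoids problems in downstream code that assumes distinct row names.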

score 0 · Answer 3 · 2011-09-29

On Thu, Sep 29, 2011 at 12:27 PM, Alyaa Mahmoud wrote:

Hi Alex,

I mean the (1, 10, 1000, 10000) that I get in the rows instead of gene IDs; i.e., yes, the columns are the means of the expression values, but I don't know for which genes.

-- Alyaa Mahmoud