Question: Problems selecting rows from dataframe (exprs) of GNF Atlas data....
7.9 years ago by Bas Jansen:
Dear fellow Bioconductor users:

Happy New Year!
At the moment I am analyzing the GNF Atlas data. I retrieved the data from the Gene Expression Omnibus using the package GEOquery, converted it to an expressionSet and extracted the expression values. So now I have a data frame from which I would like to extract the expression values of > 100 probe IDs for 79 tissues. Thing is, if I use a single probe ID, things go fine. However, whenever I use a string of probe IDs, things go awry.

See below:

***
> exprs[c("gnf1h00499_at"),]
              GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
(abbreviated for reasons of clarity)
***

As stated above: whenever I use a string of probe IDs (say, like 2 probe IDs), things go awry:

***
> exprs[c("gnf1h00499_at","gnf1h500_at"),]
              GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
NA                  NA       NA       NA       NA       NA       NA       NA
etc.
***

The gnf1h00500 probe is reported as NA, and I'm pretty sure it has real expression values associated with it.
The following just works fine:

***
> exprs[c(1:20,30:70),]
            GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
200000_s_at        0        0        0        0        0        0        0
200001_at          0        0        0        0        0        0        0
200002_at          0        0        0        0        0        0        0
200003_s_at        0        0        0        0        0        0        0
etc.
***

So, how do I select rows on the basis of probe IDs? Or better yet: what am I overlooking????

Thanks & kind regards,
Bas
go probe geoquery • 604 views
Answer: Problems selecting rows from dataframe (exprs) of GNF Atlas data....
7.9 years ago by Sebastian Thieme:
Hello,

happy new year too =)

You can use exprs[rownames(exprs) %in% "gnf1h00499_at", ] or exprs[rownames(exprs) %in% vectorOfNames, ], where vectorOfNames is a list or a vector of the names you are looking for. It is important that the object you are searching in is the first argument. If you want to look up a large number of names, use lists instead of data frames.

best

Basti
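The `%in%` approach suggested above can be sketched on a toy table. This is only an illustration under assumptions: `expr_df` and `wanted` are hypothetical names, the table holds a handful of probe IDs and values copied from the question, and the real object has many more rows and columns.

```r
# Toy stand-in for the expression table, using values shown in the question.
expr_df <- data.frame(
  GSM18768 = c(0, 0, 0, 5.770829),
  GSM18769 = c(0, 0, 0, 7.708739),
  row.names = c("200000_s_at", "200001_at", "200002_at", "gnf1h00499_at")
)

# Probe IDs to look up; "gnf1h500_at" is the ID typed in the question
# and is not a row name of this toy table.
wanted <- c("gnf1h00499_at", "200001_at", "gnf1h500_at")

# rownames(expr_df) is the left operand of %in%, so the result has one
# TRUE/FALSE per row of expr_df and can serve as a row index.
keep <- rownames(expr_df) %in% wanted
subset_df <- expr_df[keep, , drop = FALSE]

# Rows come back in table order, not in the order of 'wanted'.
print(rownames(subset_df))

# Requested IDs that are absent from the row names.
missing_ids <- setdiff(wanted, rownames(expr_df))
print(missing_ids)
```

Unlike indexing by name, the logical index never produces NA rows: IDs that do not exist are simply not returned, so a `setdiff()` check is a cheap way to see which requested IDs were dropped.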
Dear Sebastian:

Thanks for your swift reply. It works, but only for the probe IDs that start with a letter (only ~15 out of the > 100 probe IDs I want to investigate). Those that start with a number report back with "<0 rows> (or 0-length row.names)". The motto for the New Year seems to be 'Solve a problem, only to find new ones'. Phew.

Kind regards,
Bas
Reply written 7.9 years ago by Bas Jansen
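The thread does not establish why the IDs starting with a digit fail to match. One way to narrow it down is to compare the requested IDs with the actual row names and test a few guesses. The sketch below is an assumption-laden illustration: `diagnose_ids` is a hypothetical helper, and the two checks (stray whitespace, and the "X" prefix that `make.names()` puts in front of names starting with a digit) are only possible explanations, not a diagnosis of the original data.

```r
# Hypothetical helper (not from the thread): list requested IDs that are
# absent from the row names and test two possible explanations.
diagnose_ids <- function(wanted, available) {
  missing <- setdiff(wanted, available)
  data.frame(
    id = missing,
    # TRUE if the ID matches once surrounding whitespace is stripped
    whitespace = trimws(missing) %in% available,
    # TRUE if the row names carry the "X" prefix that make.names()
    # adds to names starting with a digit
    x_prefix = make.names(missing) %in% available,
    stringsAsFactors = FALSE
  )
}

# Toy inputs: one row name has an "X" prefix, one requested ID has a
# trailing space.
available <- c("X200000_s_at", "200001_at", "gnf1h00499_at")
wanted <- c("200000_s_at", "200001_at ", "gnf1h00499_at")

report <- diagnose_ids(wanted, available)
print(report)
```

Running the same comparison against `rownames()` of the real object would show directly whether the digit-leading IDs are present in the form being requested.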
Dear Bas,

I think you'll need to show us your original code, in particular what your 'exprs' is and how you have obtained it.

If you have "extracted the expression values" from an ExpressionSet ES like

    x <- exprs(ES)

then x is a matrix and not a data.frame -- but then your output would look slightly different. If you have done something like

    x <- data.frame(exprs(ES))

I can reproduce your output, including rows that are all NA -- for rownames that do not exist.

So: how did you create 'exprs' and are you sure your rownames are ok?

Cheers,

 - axel

BTW: try

    install.packages("fortunes")
    library("fortunes")
    fortune("dog")

to see why 'exprs' may not be a good name for your object... :-)

Axel Klenk
Research Informatician
Actelion Pharmaceuticals Ltd
Reply written 7.9 years ago by Axel Klenk
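The difference between the matrix and the data.frame described in the reply above can be reproduced without the real data. In this minimal sketch, `m` is a toy matrix standing in for `exprs(ES)` (two probes and two samples, with values taken from the question), and `ids` contains the ID typed in the question, which is not a row name here.

```r
# Toy matrix standing in for exprs(ES); values are those shown in the question.
m <- matrix(
  c(0, 5.770829, 0, 7.708739),
  nrow = 2,
  dimnames = list(c("200000_s_at", "gnf1h00499_at"),
                  c("GSM18768", "GSM18769"))
)
df <- data.frame(m)

ids <- c("gnf1h00499_at", "gnf1h500_at")  # second ID is not a row name

# Data frame: the unknown ID silently yields a row filled with NA.
from_df <- df[ids, ]
print(from_df)

# Matrix: the same request fails with an error instead, so the message
# is captured here rather than stopping the script.
from_matrix <- tryCatch(m[ids, ], error = function(e) conditionMessage(e))
print(from_matrix)
```

Seen this way, an all-NA row in the data.frame output is a sign that the requested row name does not exist, which is worth checking before assuming the expression values themselves are missing.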