Hi Bioconductors,

I recently noticed some unexpected behavior for AnnotatedDataFrame metadata. If you have an annotated data frame with metadata and add a new variable whose name is a shortened version of an existing variable, the metadata for the newly-added variable is set to the metadata of the existing variable. Here's an example:

    library(Biobase)

    annot <- AnnotatedDataFrame(data.frame(myvariable=runif(10)))
    varMetadata(annot)["myvariable", "labelDescription"] <- "random samples from a uniform distribution"

    annot$myvar <- rnorm(10)
    annot$newvar <- rnorm(10)
    annot[["myvari"]] <- rnorm(10)

    varMetadata(annot)

For me, this last step prints out:

                                         labelDescription
    myvariable random samples from a uniform distribution
    myvar      random samples from a uniform distribution
    newvar                                           <NA>
    myvari     random samples from a uniform distribution

even though I've only set the metadata for myvariable. I would expect any new variable to have NA for metadata, which is true for "newvar" above, but is not the case for the variables whose names are a shortened version of "myvariable" ("myvar" and "myvari"). I end up with misleading or incorrect metadata for the new variables "myvar" and "myvari". They can always be changed later, but I often find out what metadata I need to update at the end by checking which variables have NA labelDescriptions, so these new variables wouldn't show up.

I'm using bioc-devel. Here's the sessionInfo() output:

    R version 3.0.2 (2013-09-25)
    Platform: x86_64-apple-darwin10.8.0 (64-bit)

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
    [1] Biobase_2.22.0     BiocGenerics_0.8.0

    loaded via a namespace (and not attached):
    [1] tools_3.0.2

Thanks,
Adrienne
modified 5.7 years ago by Martin Morgan ♦♦ 24k • written 5.7 years ago by Adrienne Stilp 30
Martin Morgan ♦♦ 24k wrote:
On 02/19/2014 03:03 PM, Adrienne Stilp wrote:
> Hi Bioconductors,
>
> I recently noticed some unexpected behavior for AnnotatedDataFrame metadata. If you have an annotated data frame with metadata and add a new variable whose name is a shortened version of an existing variable, the metadata for the newly-added variable is set to the metadata of the existing variable. Here's an example:
>
> library(Biobase)
>
> annot <- AnnotatedDataFrame(data.frame(myvariable=runif(10)))
> varMetadata(annot)["myvariable", "labelDescription"] <- "random samples from a uniform distribution"
>
> annot$myvar <- rnorm(10)
> annot$newvar <- rnorm(10)
> annot[["myvari"]] <- rnorm(10)
>
> varMetadata(annot)
>
> For me, this last step prints out:
>
>                                      labelDescription
> myvariable random samples from a uniform distribution
> myvar      random samples from a uniform distribution
> newvar                                           <NA>
> myvari     random samples from a uniform distribution

Thanks for the bug report; this is fixed in Biobase 2.23.5, which will appear in the devel branch probably on Friday.

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
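The thread does not say what caused the behavior. One base R feature that produces the same symptom, offered here only as a plausible illustration and not as the confirmed cause in Biobase, is that `[` on a data frame partially matches row names on extraction. The sketch below uses a plain data frame named `md` (a hypothetical stand-in for the `varMetadata()` table, not a Biobase object) and contrasts partial lookup with an exact lookup via `match()`.

```r
# Base R only; Biobase is not needed for this illustration.
# `md` is a hypothetical stand-in for the varMetadata() table in the post.
md <- data.frame(
  labelDescription = c("random samples from a uniform distribution", NA),
  row.names = c("myvariable", "newvar"),
  stringsAsFactors = FALSE
)

# Extraction with `[` partially matches row names, so the abbreviated
# name "myvar" retrieves the entry stored under "myvariable".
partial <- md["myvar", "labelDescription"]

# match() compares names exactly, so "myvar" is not found and the
# lookup falls back to NA.
idx <- match("myvar", rownames(md))
exact <- if (is.na(idx)) NA_character_ else md$labelDescription[idx]

print(partial)
print(exact)
```

On Biobase versions older than the fix mentioned above, checking `name %in% rownames(varMetadata(annot))` before reading a description is one way to avoid relying on partial matches; whether that fits a given workflow is a judgment call.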