likely bug in cbind() for DataFrame
@kasper-daniel-hansen-2979
> df1 = DataFrame(A = c(1,2))
> df2 = DataFrame(B = c(1,2))
> rownames(df1) = c("a", "b")
> df1
DataFrame with 2 rows and 1 column
          A
  <numeric>
a         1
b         2
> cbind(df1, df2)
DataFrame with 2 rows and 2 columns
          A         B
  <numeric> <numeric>
1         1         1
2         2         2

rownames are removed. This does not happen for data.frame's.

Kasper
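For comparison, here is a minimal base R sketch of the data.frame behaviour referred to above. It uses only base R, so it says nothing about any particular DataFrame version; the variable names simply mirror the report.

```r
# Same steps as in the report, but with base data.frames.
df1 <- data.frame(A = c(1, 2))
df2 <- data.frame(B = c(1, 2))
rownames(df1) <- c("a", "b")

# cbind() on data.frames carries over the non-automatic row names of df1.
res <- cbind(df1, df2)
rownames(res)
```

The row names "a" and "b" survive the cbind() here, which is the behaviour the report expected from DataFrame as well.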
@michael-lawrence-3846
In general, DataFrame does less with rownames than data.frame does. This was for simplicity and performance. We could add rowname handling to cbind() for the sake of consistency; it was just ignored at the beginning.

On Thu, Aug 1, 2013 at 12:35 PM, Kasper Daniel Hansen <kasperdanielhansen@gmail.com> wrote:
> rownames are removed. This does not happen for data.frame's.
Well, I was bitten by this in some custom code for manipulating the colData slot of a SummarizedExperiment. For this class, the sampleNames are stored exactly as the rownames of @colData, so they are pretty important to keep. Now that I know, it is easy to work around in the code I had: I just save the rownames.

Still, I was very surprised by this, and I think we should keep the rownames.

Best,
Kasper

On Thu, Aug 1, 2013 at 3:58 PM, Michael Lawrence <lawrence.michael@gene.com> wrote:
> In general, DataFrame does less with rownames compared to data.frame. This
> was for simplicity and performance. So we could add that for the sake of
> consistency; it was just ignored at the beginning.
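The "save the rownames" workaround could look roughly like the sketch below. The helper name cbind_keep_rownames is hypothetical and not part of any package. It is demonstrated on base data.frames so that it runs without Bioconductor; applying it to DataFrame objects is an assumption that relies only on rownames(), rownames<- and cbind() being available for that class.

```r
# Hypothetical helper: cbind() that carries over the row names of its
# first argument, regardless of what the cbind() method does with them.
cbind_keep_rownames <- function(x, ...) {
  rn <- rownames(x)        # save the row names before binding
  out <- cbind(x, ...)
  if (!is.null(rn)) {
    rownames(out) <- rn    # restore them on the combined object
  }
  out
}

# Stand-in data: base data.frames instead of DataFrame objects.
df1 <- data.frame(A = c(1, 2), row.names = c("a", "b"))
df2 <- data.frame(B = c(1, 2))
res <- cbind_keep_rownames(df1, df2)
rownames(res)
```

With a SummarizedExperiment, the same idea would apply to the object returned by colData() before it is assigned back, so that the sample names are not lost.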
Please try 1.19.20. This is a pretty big change, so hopefully it plays well with everyone's code. There is some chance that duplicate row names will now be introduced (and throw an error), whereas before they were simply dropped. This is because we take the names of any vector argument as potential rownames, just like data.frame does.

On Thu, Aug 1, 2013 at 1:07 PM, Kasper Daniel Hansen <kasperdanielhansen@gmail.com> wrote:
> Still, I was very surprised by this, and I think we should keep the
> rownames.
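To illustrate the point about vector names becoming row names, here is a small base R sketch. The first part shows the data.frame behaviour being imitated. The second part is a hypothetical guard (safe_vector is an invented name) that strips names containing duplicates before a vector is passed on. Whether a given DataFrame version actually errors on such input is taken from the description above and is not verified here.

```r
# Base R: names on a vector argument become the row names of the data.frame.
v <- c(a = 1, b = 2)
df <- data.frame(A = v)
rownames(df)

# Hypothetical guard: if a vector's names contain duplicates, drop the names
# so they cannot end up as (invalid) duplicate row names.
safe_vector <- function(x) {
  if (!is.null(names(x)) && anyDuplicated(names(x)) > 0) unname(x) else x
}

w <- c(a = 1, a = 2)       # duplicated names
names(safe_vector(w))      # names were dropped
names(safe_vector(v))      # unique names are kept
```

Code that relied on names being silently dropped could wrap vector arguments in such a guard before calling DataFrame() or cbind().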