Question: Comparison of Column of AnnotatedDataFrame
Dario Strbenac wrote:

The documentation examples for AnnotatedDataFrame are quite limited and only show how to coerce between data.frame and AnnotatedDataFrame. How can I subset a column of an AnnotatedDataFrame and check it for equality with a particular value, without first converting it to a data.frame?

Modified 14 months ago • written 14 months ago by Dario Strbenac

Sorry if I am missing the point, but do you mean something different from this?

library(Biobase)

tmp <- AnnotatedDataFrame(iris)

tmp[, 1] # select the first column (returns an AnnotatedDataFrame).
# tmp[, 1]@data # slot access is a bad idea; see the comments below.
pData(tmp)[, 1] # check the values.

# select rows based on the value of one column:
pData(tmp[tmp$Sepal.Length > 7, 1])

    Sepal.Length
103          7.1
106          7.6
108          7.3
110          7.2
118          7.7
119          7.7
123          7.7
126          7.2
130          7.2
131          7.4
132          7.9
136          7.7
Modified 13 months ago • written 14 months ago by Diego Diez

Yes, but shouldn't the usual column accessor work?

> tmp[, "Sepal.Length"] > 7
Error in tmp[, "Sepal.Length"] > 7 :
  comparison (6) is possible only for atomic and list types
Written 14 months ago by Dario Strbenac

Ah I see. This is what I get:

> tmp[, "Sepal.Length"]
An object of class 'AnnotatedDataFrame'
  rowNames: 1 2 ... 150 (150 total)
  varLabels: Sepal.Length
  varMetadata: labelDescription
This at least explains why it is not working (you get something similar for tmp[1, ]). I guess the idea is that subsetting with `[` should return an `AnnotatedDataFrame` object, whereas accessing a column directly with `$` gets you the values. I have no idea whether this is the intended behavior.
Written 14 months ago by Diego Diez

The philosophy is that [ is an 'endomorphism' -- it returns an object of the same class as the one it is applied to. $ and [[ are not. Also, use pData() rather than slot access, and (strongly) consider S4Vectors::DataFrame for a more modern implementation of the AnnotatedDataFrame concept.
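
A minimal sketch of that distinction, assuming Biobase is installed and reusing the tmp object built from iris earlier in the thread (the name sub is only for illustration):

```r
library(Biobase)

# Same example object as earlier in the thread
tmp <- AnnotatedDataFrame(iris)

# `[` keeps the container class, so the result cannot be compared with `>` directly
sub <- tmp[, "Sepal.Length"]
is(sub, "AnnotatedDataFrame")

# `$` and `[[` extract the column itself as a plain vector
is.numeric(tmp$Sepal.Length)
is.numeric(tmp[["Sepal.Length"]])
```

Because `$` and `[[` hand back the underlying vector, comparisons such as `> 7` or `==` apply to their result, not to the result of `[`.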

Written 14 months ago by Martin Morgan

Thanks! Why is pData() preferred over slot access?

Written 14 months ago by Diego Diez

The 'usual' reasons for object-oriented programming -- it separates the user-oriented interface from the design decisions made by the developer. Often the two do not diverge much, but, for instance, the slots (internal developer business) of a DNAStringSet have little to do with the interface designed for the user.

Written 14 months ago by Martin Morgan

Sorry to be persistent about this, but I didn't consider using $ or [[ to be slot access. Maybe I am mistaken? I see (at least) three ways to access the data in the example above:

tmp$Sepal.Length # 1: subsetting method.
pData(tmp)$Sepal.Length # 2: accessor method, then subsetting method.
tmp@data$Sepal.Length # 3: slot access (bad).

Maybe I misunderstood and when you said "pData() rather than slot access" you meant example 3 here?

Written 14 months ago by Diego Diez

Yes, I meant example 3; the @data in your earlier comment is slot access. tmp$Sepal.Length, pData(tmp)$Sepal.Length, pData(tmp[, "Sepal.Length"])$Sepal.Length, etc. would all be acceptable, as would the [[ equivalents.
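
As a rough check that the acceptable forms agree, here is a small sketch, assuming Biobase is installed; the via_* and all_same names are only illustrative:

```r
library(Biobase)

tmp <- AnnotatedDataFrame(iris)

# Four interface-level ways of reaching the same column, none using slots
via_dollar  <- tmp$Sepal.Length
via_pdata   <- pData(tmp)$Sepal.Length
via_bracket <- pData(tmp[, "Sepal.Length"])$Sepal.Length
via_double  <- tmp[["Sepal.Length"]]

# Check whether all four give exactly the same vector
all_same <- identical(via_dollar, via_pdata) &&
  identical(via_dollar, via_bracket) &&
  identical(via_dollar, via_double)
all_same
```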

Written 14 months ago by Martin Morgan

I see! I completely forgot that I used the slot to access the data for checking in my original example. Now I have no idea why I did that in the first place. I have updated that comment to avoid misleading potential readers. Thank you!

Written 13 months ago by Diego Diez

And this works also:

tmp[["Sepal.Length"]] > 7
Written 14 months ago by Diego Diez

One more thought. I guess this behavior is also consistent with subsetting a data.frame using `drop = FALSE`: in that case `[` always returns a data.frame, whereas `$` and `[[` return a vector.
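
The analogy can be seen in base R alone. A short sketch using the built-in iris data.frame (variable names are illustrative); note that for a plain data.frame the default when selecting a single column is `drop = TRUE`, so `drop = FALSE` has to be requested explicitly:

```r
# Base R only: iris is a plain data.frame
one_col <- iris[, "Sepal.Length", drop = FALSE]  # stays a one-column data.frame
dropped <- iris[, "Sepal.Length"]                # default drop = TRUE gives a vector
by_name <- iris[["Sepal.Length"]]                # `[[` always gives the vector

class(one_col)
class(dropped)
identical(dropped, by_name)
```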

Modified 14 months ago • written 14 months ago by Diego Diez

It would be nice if the documentation had a section titled Accessors.

Written 14 months ago by Dario Strbenac
Dario Strbenac wrote:
anAnnotatedDataFrame[["columnName"]] == value
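
A concrete instance of this pattern, as a sketch assuming Biobase is installed and using iris as stand-in data (is_setosa and setosa_only are illustrative names):

```r
library(Biobase)

tmp <- AnnotatedDataFrame(iris)

# Compare a column with a value: gives a logical vector, one entry per row
is_setosa <- tmp[["Species"]] == "setosa"
sum(is_setosa)

# The logical vector can then index rows; `[` still returns an AnnotatedDataFrame
setosa_only <- tmp[is_setosa, ]
nrow(pData(setosa_only))
```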
Written 14 months ago by Dario Strbenac