Convert DataFrame to data.frame While Keeping Column Name Syntax
Dario Strbenac ★ 1.5k
@dario-strbenac-5916
Australia

I have a DataFrame object that I want to pass to a classification algorithm such as randomForest from CRAN, which complains about S4 objects as input. What is the best way to coerce it to a data.frame? I would like to keep the column names in their current gene symbol format; they may contain unusual characters, as in HLA-A. as.data.frame automatically converts column names to be syntactically valid, for example by replacing hyphens with periods. data.frame(myDataFrame, check.names = FALSE) effectively does what I want, but it is a constructor rather than a function for converting between types. Is there anything better to use?

S4Vectors • 97 views
@james-w-macdonald-5106
United States
> df <- DataFrame("HLA-A" = 1:5, "HLA-B" = 2:6, check.names = FALSE)
> df
DataFrame with 5 rows and 2 columns
      HLA-A     HLA-B
  <integer> <integer>
1         1         2
2         2         3
3         3         4
4         4         5
5         5         6
> as(df, "data.frame")
  HLA-A HLA-B
1     1     2
2     2     3
3     3     4
4     4     5
5     5     6


Doesn't seem like the names are run through make.names?


Oh. You were using as.data.frame. That just ends up re-creating the data.frame, and on top of it all there's an ... argument that is ignored! LOL

> as.data.frame(df, check.names = FALSE)
  HLA.A HLA.B
1     1     2
2     2     3
3     3     4
4     4     5
5     5     6
Warning message:
In .local(x, row.names, optional, ...) : Arguments in '...' ignored


However, there is the 'optional' argument that you could use, and which is documented, so hypothetically you could have just figured this out yourself.

From ?as.data.frame, and then following the link to ?base::as.data.frame, you will see:

Arguments:

x: any R object.

row.names: 'NULL' or a character vector giving the row names for the
          data frame.  Missing values are not allowed.

optional: logical. If 'TRUE', setting row names and converting column
          names (to syntactic names: see 'make.names') is optional.
          Note that all of R's 'base' package 'as.data.frame()' methods
          use 'optional' only for column names treatment, basically
          with the meaning of 'data.frame(*, check.names = !optional)'.

> as.data.frame(df, optional = TRUE)
  HLA-A HLA-B
1     1     2
2     2     3
3     3     4
4     4     5
5     5     6
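
For anyone landing here later, here is a minimal sketch that puts the two approaches from this thread side by side. It assumes S4Vectors is installed; the variable names (via_as, via_optional, via_default) are made up for illustration, and the exact behaviour may differ between S4Vectors versions.

```r
# Assumes the Bioconductor package S4Vectors is installed.
library(S4Vectors)

# Toy DataFrame with non-syntactic, gene-symbol-style column names.
df <- DataFrame("HLA-A" = 1:5, "HLA-B" = 2:6, check.names = FALSE)

# Option 1: S4 coercion with as().
via_as <- as(df, "data.frame")

# Option 2: as.data.frame() with the documented 'optional' argument.
via_optional <- as.data.frame(df, optional = TRUE)

# Default conversion, kept only for comparison with the two options above.
via_default <- as.data.frame(df)

# Inspect the column names produced by each route.
print(colnames(via_as))
print(colnames(via_optional))
print(colnames(via_default))
```

In the session shown above, the first two routes keep the hyphenated names while the default conversion turns them into HLA.A and HLA.B. Either of the first two results is a plain data.frame, so it can be handed to functions that reject S4 input.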