DataFrame after splitting and back
@laurent-gatto-5645

I have the following DataFrame

> k <- sample(3, 10, replace = TRUE)
> df <- DataFrame(k = k,
+                 x = round(rnorm(length(k)), 2),
+                 y = seq_len(length(k)),
+                 z = sample(LETTERS, length(k), replace = TRUE),
+                 ir = IRanges(seq_along(k), width = 10),
+                 r = Rle(sample(5, length(k), replace = TRUE)))
> df
DataFrame with 10 rows and 6 columns
           k         x         y           z        ir     r
   <integer> <numeric> <integer> <character> <IRanges> <Rle>
1          2      -0.8         1           E      1-10     1
2          2     -0.43         2           U      2-11     5
3          2     -0.67         3           U      3-12     4
4          1      0.58         4           L      4-13     2
5          2     -0.95         5           K      5-14     1
6          1      0.47         6           J      6-15     1
7          2      1.24         7           S      7-16     3
8          2     -1.73         8           M      8-17     5
9          1     -0.89         9           F      9-18     5
10         2      -1.3        10           D     10-19     4

that stores information about up to three subgroups, defined by column k (only groups 1 and 2 occur in this sample).

I can very efficiently group the rows into a new DataFrame by first splitting df based on k, then creating a new compressed one:

> df2 <- DataFrame(split(df, df$k))
> df2
DataFrame with 2 rows and 6 columns
              k                    x             y               z
  <IntegerList>        <NumericList> <IntegerList> <CharacterList>
1         1,1,1      0.58,0.47,-0.89         4,6,9           L,J,F
2     2,2,2,... -0.8,-0.43,-0.67,...     1,2,3,...       E,U,U,...
                  ir         r
       <IRangesList> <RleList>
1     4-13,6-15,9-18     2,1,5
2 1-10,2-11,3-12,... 1,5,4,...

Is there an easy and fast way to get back to df from df2?

@martin-morgan-1513

Maybe

DataFrame(lapply(df2, unsplit, df$k))

?
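A minimal sketch of how one might check this round trip, assuming S4Vectors and IRanges are installed and that the original grouping vector df$k is still available (unsplit() needs it to put the rows back in their original positions). The seed and the reduced set of columns are assumptions made here for brevity, not part of the question:

```r
library(S4Vectors)
library(IRanges)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
k <- sample(3, 10, replace = TRUE)
df <- DataFrame(k = k,
                x = round(rnorm(length(k)), 2),
                y = seq_len(length(k)))

# Split by k and compress into one row per group
df2 <- DataFrame(split(df, df$k))

# Unsplit every list column using the original grouping vector
df3 <- DataFrame(lapply(df2, unsplit, df$k))

# Compare column by column rather than relying on identical()
roundtrip_ok <- all(df3$k == df$k) &&
    all(df3$x == df$x) &&
    all(df3$y == df$y)
roundtrip_ok
```

The comparison is done per column because identical() on the two DataFrames may also be sensitive to row names or metadata.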

@michael-lawrence-3846

In devel, expand() has a new recursive argument (default TRUE); when it is set to FALSE, the list columns are expanded in parallel, so you can now do:

expand(df2, recursive=FALSE)

as long as you don't care that the data are sorted by "k".
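If the original row order does matter, one possible workaround is to reorder the expanded result by a column that records the original row position. In the question's example, y happens to be such an index (seq_len(length(k))); for other data, one would have to add a similar index column before splitting. This is only a sketch under that assumption, and it also assumes an S4Vectors version whose expand() accepts recursive=FALSE:

```r
library(S4Vectors)
library(IRanges)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
k <- sample(3, 10, replace = TRUE)
df <- DataFrame(k = k,
                x = round(rnorm(length(k)), 2),
                y = seq_len(length(k)))  # y doubles as the original row index

df2 <- DataFrame(split(df, df$k))

# Expand all list columns in parallel; rows come back grouped by k
expanded <- expand(df2, recursive = FALSE)

# Restore the original order using the index stored in y
restored <- expanded[order(expanded$y), ]

order_ok <- all(restored$y == df$y) && all(restored$k == df$k)
order_ok
```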
