Adding multiple columns from a function to data.table using := within groups - without specifying the LHS

When I tried to use := with a list on the RHS, the error message at the console advised me to read help(":="), and once I did that I achieved enlightenment:

help(":=")  # advice about  `multiple :=`

dt[ , `:=`(y1.lag = c(NA, head(y1,-1) ),
              y2.lag = c(NA, head(y2,-1) ),
              y1y2 = y1*y2) ,by=grp]
#------------    
dt
    grp period         y1          y2     y1.lag      y2.lag        y1y2
 1:   a      1 -0.2127395  1.33549660         NA          NA -0.28411285
 2:   a      2 -2.2005742 -0.07679158 -0.2127395  1.33549660  0.16898556
 3:   a      3  0.3857444 -0.47996397 -2.2005742 -0.07679158 -0.18514341
 4:   a      4 -1.5117554  0.50728778  0.3857444 -0.47996397 -0.76689506
 5:   a      5  1.7713902 -0.03092824 -1.5117554  0.50728778 -0.05478598
 6:   b      1  0.5033163  0.69815100         NA          NA  0.35139079
 7:   b      2  0.1125835 -2.19959623  0.5033163  0.69815100 -0.24763815
 8:   b      3  1.0252230 -1.76477546  0.1125835 -2.19959623 -1.80928832
 9:   b      4 -0.5484611 -1.35167910  1.0252230 -1.76477546  0.74134341
10:   b      5  1.3801637  0.67293665 -0.5484611 -1.35167910  0.92876276

It's rather neat to see that calling := as a function with a set of name = value arguments works much like do.call might with a regular named list. [Edit] To address the request to "externalize" the lagging specification while still coding it to work on particular columns:

lag.y1.y2.expr =  expression(`:=`(
    y1.lag = c(NA, head(y1, -1) ),
    y2.lag = c(NA, head(y2, -1)),
    y1y2 = y1 * y2  ) )

dt[, eval(lag.y1.y2.expr), by = 'grp']

I don't see this as having any deficiencies relative to your code, since your code did not allow (or even hint at) programmatic substitution of column names. If you want a somewhat more maintainable arrangement, with a single place to change the names of the input columns, this also succeeds:

# X1 and X2 are placeholders; the list at the end maps them to real columns
my.expr = substitute(`:=`(
                          y1.lag = c(NA, head(X1, -1) ),
                          y2.lag = c(NA, head(X2, -1) ),
                          y1y2 = X1 * X2
                           ) ,
                      list(X1=quote(y1), X2=quote(y2) ) )

dt[, eval(my.expr), by = 'grp']

I also suspect that bquote would work here; it sometimes simplifies working with R expression objects.
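For what it's worth, here is a minimal sketch of how the bquote version might look. It is not taken from the original question: the sample data (built with set.seed and rnorm) and the helper names col1 and col2 are assumptions made purely for illustration. bquote fills in each .( ) slot with the supplied symbol before data.table ever sees the expression.

```r
library(data.table)

# Illustrative data only (assumed; not the data shown in the output above)
set.seed(1)
dt <- data.table(grp    = rep(c("a", "b"), each = 5),
                 period = rep(1:5, times = 2),
                 y1     = rnorm(10),
                 y2     = rnorm(10))

# Hypothetical helpers: the single place where the input columns are named
col1 <- as.name("y1")
col2 <- as.name("y2")

# bquote() replaces each .(x) with the value of x, yielding an unevaluated
# call to `:=` that refers to the real column names
my.expr <- bquote(`:=`(
    y1.lag = c(NA, head(.(col1), -1)),
    y2.lag = c(NA, head(.(col2), -1)),
    y1y2   = .(col1) * .(col2)))

# Evaluate the pre-built call in j, once per group
dt[, eval(my.expr), by = grp]
```

Compared with substitute, the replacement list goes away; the .( ) markers show directly where each symbol lands. The output column names (y1.lag, y2.lag, y1y2) are still hard-coded in this sketch.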




