Grouping Set aggregation for data tables
groupingsets.Rd
Calculate aggregates at various levels of groupings producing multiple (sub-)totals. Reflects SQLs GROUPING SETS operations.
Usage
rollup(x, ...)
# S3 method for class 'data.table'
rollup(x, j, by, .SDcols, id = FALSE, label = NULL, ...)
cube(x, ...)
# S3 method for class 'data.table'
cube(x, j, by, .SDcols, id = FALSE, label = NULL, ...)
groupingsets(x, ...)
# S3 method for class 'data.table'
groupingsets(x, j, by, sets, .SDcols,
id = FALSE, jj, label = NULL, enclos = parent.frame(), ...)
Arguments
- x
data.table
.- ...
argument passed to custom user methods. Ignored for
data.table
methods.- j
expression passed to data.table
j
.- by
character column names by which we are grouping.
- sets
list of character vector reflecting grouping sets, used in
groupingsets
for flexibility.- .SDcols
columns to be used in
j
expression in.SD
object.- id
logical default
FALSE
. IfTRUE
it will add leading column with bit mask of grouping sets.- jj
quoted version of
j
argument, for convenience. When provided function will ignorej
argument.- label
label(s) to be used in the 'total' rows in the grouping variable columns of the output, that is, in rows where the grouping variable has been aggregated. Can be a named list of scalars, or a scalar, or
NULL
. Defaults toNULL
, which results in the grouping variables havingNA
in their 'total' rows. See Details.- enclos
the environment containing the symbols referenced by
jj
. When writing functions that accept aj
environment for non-standard evaluation by data.table,substitute()
it and forward it togroupingsets
using thejj
argument, set this to theparent.frame()
of the function that capturesj
.
Details
All three functions rollup, cube, groupingsets
are generic methods, data.table
methods are provided.
The label
argument can be a named list of scalars, or a scalar, or NULL
. When label
is a list, each element name must be (1) a variable name in by
, or (2) the first element of the class in the data.table x
of a variable in by
, or (3) one of 'character', 'integer', 'numeric', 'factor', 'Date', 'IDate'. The order of the list elements is not important. A label specified by variable name will apply only to that variable, while a label specified by first element of a class will apply to all variables in by
for which the first element of the class of the variable in x
matches the label
element name, except for variables that have a label specified by variable name (that is, specification by variable name takes precedence over specification by class). For label
elements with name in by
, the class of the label value must be the same as the class of the variable in x
. For label
elements with name not in by
, the first element of the class of the label value must be the same as the label
element name. For example, label = list(integer = 999, IDate = as.Date("3000-01-01"))
would produce an error because class(999)[1]
is not "integer"
and class(as.Date("3000-01-01"))[1]
is not "IDate"
. A corrected specification would be label = list(integer = 999L, IDate = as.IDate("3000-01-01"))
.
The label = <scalar>
option provides a shorter alternative in the case where only one class of grouping variable requires a label. For example, label = list(character = "Total")
can be shortened to label = "Total"
. When this option is used, the label will be applied to all variables in by
for which the first element of the class of the variable in x
matches the first element of the class of the scalar.
References
https://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUPING-SETS https://www.postgresql.org/docs/9.5/static/functions-aggregate.html#FUNCTIONS-GROUPING-TABLE
Examples
n = 24L
set.seed(25)
DT <- data.table(
color = sample(c("green","yellow","red"), n, TRUE),
year = as.Date(sample(paste0(2011:2015,"-01-01"), n, TRUE)),
status = as.factor(sample(c("removed","active","inactive","archived"), n, TRUE)),
amount = sample(1:5, n, TRUE),
value = sample(c(3, 3.5, 2.5, 2), n, TRUE)
)
# rollup
by_vars = c("color", "year", "status")
rollup(DT, j=sum(value), by=by_vars) # default id=FALSE
#> color year status V1
#> <char> <Date> <fctr> <num>
#> 1: red 2015-01-01 active 3.5
#> 2: green 2015-01-01 inactive 5.5
#> 3: green 2014-01-01 archived 3.5
#> 4: green 2015-01-01 archived 2.0
#> 5: yellow 2014-01-01 active 4.5
#> 6: red 2013-01-01 inactive 2.0
#> 7: green 2011-01-01 active 6.0
#> 8: red 2014-01-01 inactive 2.5
#> 9: green 2011-01-01 archived 2.5
#> 10: yellow 2015-01-01 active 2.0
#> 11: red 2012-01-01 archived 2.0
#> 12: red 2011-01-01 removed 3.5
#> 13: green 2014-01-01 inactive 8.0
#> 14: green 2011-01-01 removed 2.0
#> 15: yellow 2012-01-01 archived 2.5
#> 16: red 2013-01-01 removed 3.5
#> 17: green 2013-01-01 active 3.0
#> 18: green 2014-01-01 removed 2.5
#> 19: red 2011-01-01 archived 3.0
#> 20: red 2015-01-01 <NA> 3.5
#> 21: green 2015-01-01 <NA> 7.5
#> 22: green 2014-01-01 <NA> 14.0
#> 23: yellow 2014-01-01 <NA> 4.5
#> 24: red 2013-01-01 <NA> 5.5
#> 25: green 2011-01-01 <NA> 10.5
#> 26: red 2014-01-01 <NA> 2.5
#> 27: yellow 2015-01-01 <NA> 2.0
#> 28: red 2012-01-01 <NA> 2.0
#> 29: red 2011-01-01 <NA> 6.5
#> 30: yellow 2012-01-01 <NA> 2.5
#> 31: green 2013-01-01 <NA> 3.0
#> 32: red <NA> <NA> 20.0
#> 33: green <NA> <NA> 35.0
#> 34: yellow <NA> <NA> 9.0
#> 35: <NA> <NA> <NA> 64.0
#> color year status V1
#> <char> <Date> <fctr> <num>
rollup(DT, j=sum(value), by=by_vars, id=TRUE)
#> grouping color year status V1
#> <int> <char> <Date> <fctr> <num>
#> 1: 0 red 2015-01-01 active 3.5
#> 2: 0 green 2015-01-01 inactive 5.5
#> 3: 0 green 2014-01-01 archived 3.5
#> 4: 0 green 2015-01-01 archived 2.0
#> 5: 0 yellow 2014-01-01 active 4.5
#> 6: 0 red 2013-01-01 inactive 2.0
#> 7: 0 green 2011-01-01 active 6.0
#> 8: 0 red 2014-01-01 inactive 2.5
#> 9: 0 green 2011-01-01 archived 2.5
#> 10: 0 yellow 2015-01-01 active 2.0
#> 11: 0 red 2012-01-01 archived 2.0
#> 12: 0 red 2011-01-01 removed 3.5
#> 13: 0 green 2014-01-01 inactive 8.0
#> 14: 0 green 2011-01-01 removed 2.0
#> 15: 0 yellow 2012-01-01 archived 2.5
#> 16: 0 red 2013-01-01 removed 3.5
#> 17: 0 green 2013-01-01 active 3.0
#> 18: 0 green 2014-01-01 removed 2.5
#> 19: 0 red 2011-01-01 archived 3.0
#> 20: 1 red 2015-01-01 <NA> 3.5
#> 21: 1 green 2015-01-01 <NA> 7.5
#> 22: 1 green 2014-01-01 <NA> 14.0
#> 23: 1 yellow 2014-01-01 <NA> 4.5
#> 24: 1 red 2013-01-01 <NA> 5.5
#> 25: 1 green 2011-01-01 <NA> 10.5
#> 26: 1 red 2014-01-01 <NA> 2.5
#> 27: 1 yellow 2015-01-01 <NA> 2.0
#> 28: 1 red 2012-01-01 <NA> 2.0
#> 29: 1 red 2011-01-01 <NA> 6.5
#> 30: 1 yellow 2012-01-01 <NA> 2.5
#> 31: 1 green 2013-01-01 <NA> 3.0
#> 32: 3 red <NA> <NA> 20.0
#> 33: 3 green <NA> <NA> 35.0
#> 34: 3 yellow <NA> <NA> 9.0
#> 35: 7 <NA> <NA> <NA> 64.0
#> grouping color year status V1
#> <int> <char> <Date> <fctr> <num>
rollup(DT, j=lapply(.SD, sum), by=by_vars, id=TRUE, .SDcols="value")
#> grouping color year status value
#> <int> <char> <Date> <fctr> <num>
#> 1: 0 red 2015-01-01 active 3.5
#> 2: 0 green 2015-01-01 inactive 5.5
#> 3: 0 green 2014-01-01 archived 3.5
#> 4: 0 green 2015-01-01 archived 2.0
#> 5: 0 yellow 2014-01-01 active 4.5
#> 6: 0 red 2013-01-01 inactive 2.0
#> 7: 0 green 2011-01-01 active 6.0
#> 8: 0 red 2014-01-01 inactive 2.5
#> 9: 0 green 2011-01-01 archived 2.5
#> 10: 0 yellow 2015-01-01 active 2.0
#> 11: 0 red 2012-01-01 archived 2.0
#> 12: 0 red 2011-01-01 removed 3.5
#> 13: 0 green 2014-01-01 inactive 8.0
#> 14: 0 green 2011-01-01 removed 2.0
#> 15: 0 yellow 2012-01-01 archived 2.5
#> 16: 0 red 2013-01-01 removed 3.5
#> 17: 0 green 2013-01-01 active 3.0
#> 18: 0 green 2014-01-01 removed 2.5
#> 19: 0 red 2011-01-01 archived 3.0
#> 20: 1 red 2015-01-01 <NA> 3.5
#> 21: 1 green 2015-01-01 <NA> 7.5
#> 22: 1 green 2014-01-01 <NA> 14.0
#> 23: 1 yellow 2014-01-01 <NA> 4.5
#> 24: 1 red 2013-01-01 <NA> 5.5
#> 25: 1 green 2011-01-01 <NA> 10.5
#> 26: 1 red 2014-01-01 <NA> 2.5
#> 27: 1 yellow 2015-01-01 <NA> 2.0
#> 28: 1 red 2012-01-01 <NA> 2.0
#> 29: 1 red 2011-01-01 <NA> 6.5
#> 30: 1 yellow 2012-01-01 <NA> 2.5
#> 31: 1 green 2013-01-01 <NA> 3.0
#> 32: 3 red <NA> <NA> 20.0
#> 33: 3 green <NA> <NA> 35.0
#> 34: 3 yellow <NA> <NA> 9.0
#> 35: 7 <NA> <NA> <NA> 64.0
#> grouping color year status value
#> <int> <char> <Date> <fctr> <num>
rollup(DT, j=c(list(count=.N), lapply(.SD, sum)), by=by_vars, id=TRUE)
#> grouping color year status count amount value
#> <int> <char> <Date> <fctr> <int> <int> <num>
#> 1: 0 red 2015-01-01 active 1 4 3.5
#> 2: 0 green 2015-01-01 inactive 2 5 5.5
#> 3: 0 green 2014-01-01 archived 1 3 3.5
#> 4: 0 green 2015-01-01 archived 1 4 2.0
#> 5: 0 yellow 2014-01-01 active 2 5 4.5
#> 6: 0 red 2013-01-01 inactive 1 1 2.0
#> 7: 0 green 2011-01-01 active 2 9 6.0
#> 8: 0 red 2014-01-01 inactive 1 5 2.5
#> 9: 0 green 2011-01-01 archived 1 4 2.5
#> 10: 0 yellow 2015-01-01 active 1 4 2.0
#> 11: 0 red 2012-01-01 archived 1 4 2.0
#> 12: 0 red 2011-01-01 removed 1 1 3.5
#> 13: 0 green 2014-01-01 inactive 3 7 8.0
#> 14: 0 green 2011-01-01 removed 1 4 2.0
#> 15: 0 yellow 2012-01-01 archived 1 1 2.5
#> 16: 0 red 2013-01-01 removed 1 3 3.5
#> 17: 0 green 2013-01-01 active 1 2 3.0
#> 18: 0 green 2014-01-01 removed 1 5 2.5
#> 19: 0 red 2011-01-01 archived 1 1 3.0
#> 20: 1 red 2015-01-01 <NA> 1 4 3.5
#> 21: 1 green 2015-01-01 <NA> 3 9 7.5
#> 22: 1 green 2014-01-01 <NA> 5 15 14.0
#> 23: 1 yellow 2014-01-01 <NA> 2 5 4.5
#> 24: 1 red 2013-01-01 <NA> 2 4 5.5
#> 25: 1 green 2011-01-01 <NA> 4 17 10.5
#> 26: 1 red 2014-01-01 <NA> 1 5 2.5
#> 27: 1 yellow 2015-01-01 <NA> 1 4 2.0
#> 28: 1 red 2012-01-01 <NA> 1 4 2.0
#> 29: 1 red 2011-01-01 <NA> 2 2 6.5
#> 30: 1 yellow 2012-01-01 <NA> 1 1 2.5
#> 31: 1 green 2013-01-01 <NA> 1 2 3.0
#> 32: 3 red <NA> <NA> 7 19 20.0
#> 33: 3 green <NA> <NA> 13 43 35.0
#> 34: 3 yellow <NA> <NA> 4 10 9.0
#> 35: 7 <NA> <NA> <NA> 24 72 64.0
#> grouping color year status count amount value
#> <int> <char> <Date> <fctr> <int> <int> <num>
rollup(DT, j=sum(value), by=by_vars,
# specify label by variable name
label=list(color="total", year=as.Date("3000-01-01"), status=factor("total")))
#> color year status V1
#> <char> <Date> <fctr> <num>
#> 1: red 2015-01-01 active 3.5
#> 2: green 2015-01-01 inactive 5.5
#> 3: green 2014-01-01 archived 3.5
#> 4: green 2015-01-01 archived 2.0
#> 5: yellow 2014-01-01 active 4.5
#> 6: red 2013-01-01 inactive 2.0
#> 7: green 2011-01-01 active 6.0
#> 8: red 2014-01-01 inactive 2.5
#> 9: green 2011-01-01 archived 2.5
#> 10: yellow 2015-01-01 active 2.0
#> 11: red 2012-01-01 archived 2.0
#> 12: red 2011-01-01 removed 3.5
#> 13: green 2014-01-01 inactive 8.0
#> 14: green 2011-01-01 removed 2.0
#> 15: yellow 2012-01-01 archived 2.5
#> 16: red 2013-01-01 removed 3.5
#> 17: green 2013-01-01 active 3.0
#> 18: green 2014-01-01 removed 2.5
#> 19: red 2011-01-01 archived 3.0
#> 20: red 2015-01-01 total 3.5
#> 21: green 2015-01-01 total 7.5
#> 22: green 2014-01-01 total 14.0
#> 23: yellow 2014-01-01 total 4.5
#> 24: red 2013-01-01 total 5.5
#> 25: green 2011-01-01 total 10.5
#> 26: red 2014-01-01 total 2.5
#> 27: yellow 2015-01-01 total 2.0
#> 28: red 2012-01-01 total 2.0
#> 29: red 2011-01-01 total 6.5
#> 30: yellow 2012-01-01 total 2.5
#> 31: green 2013-01-01 total 3.0
#> 32: red 3000-01-01 total 20.0
#> 33: green 3000-01-01 total 35.0
#> 34: yellow 3000-01-01 total 9.0
#> 35: total 3000-01-01 total 64.0
#> color year status V1
#> <char> <Date> <fctr> <num>
rollup(DT, j=sum(value), by=by_vars,
# specify label by variable name and first element of class
label=list(color="total", Date=as.Date("3000-01-01"), factor=factor("total")))
#> color year status V1
#> <char> <Date> <fctr> <num>
#> 1: red 2015-01-01 active 3.5
#> 2: green 2015-01-01 inactive 5.5
#> 3: green 2014-01-01 archived 3.5
#> 4: green 2015-01-01 archived 2.0
#> 5: yellow 2014-01-01 active 4.5
#> 6: red 2013-01-01 inactive 2.0
#> 7: green 2011-01-01 active 6.0
#> 8: red 2014-01-01 inactive 2.5
#> 9: green 2011-01-01 archived 2.5
#> 10: yellow 2015-01-01 active 2.0
#> 11: red 2012-01-01 archived 2.0
#> 12: red 2011-01-01 removed 3.5
#> 13: green 2014-01-01 inactive 8.0
#> 14: green 2011-01-01 removed 2.0
#> 15: yellow 2012-01-01 archived 2.5
#> 16: red 2013-01-01 removed 3.5
#> 17: green 2013-01-01 active 3.0
#> 18: green 2014-01-01 removed 2.5
#> 19: red 2011-01-01 archived 3.0
#> 20: red 2015-01-01 total 3.5
#> 21: green 2015-01-01 total 7.5
#> 22: green 2014-01-01 total 14.0
#> 23: yellow 2014-01-01 total 4.5
#> 24: red 2013-01-01 total 5.5
#> 25: green 2011-01-01 total 10.5
#> 26: red 2014-01-01 total 2.5
#> 27: yellow 2015-01-01 total 2.0
#> 28: red 2012-01-01 total 2.0
#> 29: red 2011-01-01 total 6.5
#> 30: yellow 2012-01-01 total 2.5
#> 31: green 2013-01-01 total 3.0
#> 32: red 3000-01-01 total 20.0
#> 33: green 3000-01-01 total 35.0
#> 34: yellow 3000-01-01 total 9.0
#> 35: total 3000-01-01 total 64.0
#> color year status V1
#> <char> <Date> <fctr> <num>
# label is character scalar so applies to color only
rollup(DT, j=sum(value), by=by_vars, label="total")
#> color year status V1
#> <char> <Date> <fctr> <num>
#> 1: red 2015-01-01 active 3.5
#> 2: green 2015-01-01 inactive 5.5
#> 3: green 2014-01-01 archived 3.5
#> 4: green 2015-01-01 archived 2.0
#> 5: yellow 2014-01-01 active 4.5
#> 6: red 2013-01-01 inactive 2.0
#> 7: green 2011-01-01 active 6.0
#> 8: red 2014-01-01 inactive 2.5
#> 9: green 2011-01-01 archived 2.5
#> 10: yellow 2015-01-01 active 2.0
#> 11: red 2012-01-01 archived 2.0
#> 12: red 2011-01-01 removed 3.5
#> 13: green 2014-01-01 inactive 8.0
#> 14: green 2011-01-01 removed 2.0
#> 15: yellow 2012-01-01 archived 2.5
#> 16: red 2013-01-01 removed 3.5
#> 17: green 2013-01-01 active 3.0
#> 18: green 2014-01-01 removed 2.5
#> 19: red 2011-01-01 archived 3.0
#> 20: red 2015-01-01 <NA> 3.5
#> 21: green 2015-01-01 <NA> 7.5
#> 22: green 2014-01-01 <NA> 14.0
#> 23: yellow 2014-01-01 <NA> 4.5
#> 24: red 2013-01-01 <NA> 5.5
#> 25: green 2011-01-01 <NA> 10.5
#> 26: red 2014-01-01 <NA> 2.5
#> 27: yellow 2015-01-01 <NA> 2.0
#> 28: red 2012-01-01 <NA> 2.0
#> 29: red 2011-01-01 <NA> 6.5
#> 30: yellow 2012-01-01 <NA> 2.5
#> 31: green 2013-01-01 <NA> 3.0
#> 32: red <NA> <NA> 20.0
#> 33: green <NA> <NA> 35.0
#> 34: yellow <NA> <NA> 9.0
#> 35: total <NA> <NA> 64.0
#> color year status V1
#> <char> <Date> <fctr> <num>
rollup(DT, j=.N, by=c("color", "year", "status", "value"),
# label can be explicitly specified as NA or NaN
label = list(color=NA_character_, year=as.Date(NA), status=factor(NA), value=NaN))
#> color year status value N
#> <char> <Date> <fctr> <num> <int>
#> 1: red 2015-01-01 active 3.5 1
#> 2: green 2015-01-01 inactive 3.5 1
#> 3: green 2014-01-01 archived 3.5 1
#> 4: green 2015-01-01 archived 2.0 1
#> 5: green 2015-01-01 inactive 2.0 1
#> 6: yellow 2014-01-01 active 2.5 1
#> 7: red 2013-01-01 inactive 2.0 1
#> 8: yellow 2014-01-01 active 2.0 1
#> 9: green 2011-01-01 active 3.5 1
#> 10: red 2014-01-01 inactive 2.5 1
#> 11: green 2011-01-01 archived 2.5 1
#> 12: yellow 2015-01-01 active 2.0 1
#> 13: red 2012-01-01 archived 2.0 1
#> 14: red 2011-01-01 removed 3.5 1
#> 15: green 2014-01-01 inactive 3.0 2
#> 16: green 2011-01-01 removed 2.0 1
#> 17: yellow 2012-01-01 archived 2.5 1
#> 18: green 2011-01-01 active 2.5 1
#> 19: red 2013-01-01 removed 3.5 1
#> 20: green 2014-01-01 inactive 2.0 1
#> 21: green 2013-01-01 active 3.0 1
#> 22: green 2014-01-01 removed 2.5 1
#> 23: red 2011-01-01 archived 3.0 1
#> 24: red 2015-01-01 active NaN 1
#> 25: green 2015-01-01 inactive NaN 2
#> 26: green 2014-01-01 archived NaN 1
#> 27: green 2015-01-01 archived NaN 1
#> 28: yellow 2014-01-01 active NaN 2
#> 29: red 2013-01-01 inactive NaN 1
#> 30: green 2011-01-01 active NaN 2
#> 31: red 2014-01-01 inactive NaN 1
#> 32: green 2011-01-01 archived NaN 1
#> 33: yellow 2015-01-01 active NaN 1
#> 34: red 2012-01-01 archived NaN 1
#> 35: red 2011-01-01 removed NaN 1
#> 36: green 2014-01-01 inactive NaN 3
#> 37: green 2011-01-01 removed NaN 1
#> 38: yellow 2012-01-01 archived NaN 1
#> 39: red 2013-01-01 removed NaN 1
#> 40: green 2013-01-01 active NaN 1
#> 41: green 2014-01-01 removed NaN 1
#> 42: red 2011-01-01 archived NaN 1
#> 43: red 2015-01-01 <NA> NaN 1
#> 44: green 2015-01-01 <NA> NaN 3
#> 45: green 2014-01-01 <NA> NaN 5
#> 46: yellow 2014-01-01 <NA> NaN 2
#> 47: red 2013-01-01 <NA> NaN 2
#> 48: green 2011-01-01 <NA> NaN 4
#> 49: red 2014-01-01 <NA> NaN 1
#> 50: yellow 2015-01-01 <NA> NaN 1
#> 51: red 2012-01-01 <NA> NaN 1
#> 52: red 2011-01-01 <NA> NaN 2
#> 53: yellow 2012-01-01 <NA> NaN 1
#> 54: green 2013-01-01 <NA> NaN 1
#> 55: red <NA> <NA> NaN 7
#> 56: green <NA> <NA> NaN 13
#> 57: yellow <NA> <NA> NaN 4
#> 58: <NA> <NA> <NA> NaN 24
#> color year status value N
#> <char> <Date> <fctr> <num> <int>
# cube
cube(DT, j = sum(value), by = c("color","year","status"), id=TRUE)
#> grouping color year status V1
#> <int> <char> <Date> <fctr> <num>
#> 1: 0 red 2015-01-01 active 3.5
#> 2: 0 green 2015-01-01 inactive 5.5
#> 3: 0 green 2014-01-01 archived 3.5
#> 4: 0 green 2015-01-01 archived 2.0
#> 5: 0 yellow 2014-01-01 active 4.5
#> 6: 0 red 2013-01-01 inactive 2.0
#> 7: 0 green 2011-01-01 active 6.0
#> 8: 0 red 2014-01-01 inactive 2.5
#> 9: 0 green 2011-01-01 archived 2.5
#> 10: 0 yellow 2015-01-01 active 2.0
#> 11: 0 red 2012-01-01 archived 2.0
#> 12: 0 red 2011-01-01 removed 3.5
#> 13: 0 green 2014-01-01 inactive 8.0
#> 14: 0 green 2011-01-01 removed 2.0
#> 15: 0 yellow 2012-01-01 archived 2.5
#> 16: 0 red 2013-01-01 removed 3.5
#> 17: 0 green 2013-01-01 active 3.0
#> 18: 0 green 2014-01-01 removed 2.5
#> 19: 0 red 2011-01-01 archived 3.0
#> 20: 1 red 2015-01-01 <NA> 3.5
#> 21: 1 green 2015-01-01 <NA> 7.5
#> 22: 1 green 2014-01-01 <NA> 14.0
#> 23: 1 yellow 2014-01-01 <NA> 4.5
#> 24: 1 red 2013-01-01 <NA> 5.5
#> 25: 1 green 2011-01-01 <NA> 10.5
#> 26: 1 red 2014-01-01 <NA> 2.5
#> 27: 1 yellow 2015-01-01 <NA> 2.0
#> 28: 1 red 2012-01-01 <NA> 2.0
#> 29: 1 red 2011-01-01 <NA> 6.5
#> 30: 1 yellow 2012-01-01 <NA> 2.5
#> 31: 1 green 2013-01-01 <NA> 3.0
#> 32: 2 red <NA> active 3.5
#> 33: 2 green <NA> inactive 13.5
#> 34: 2 green <NA> archived 8.0
#> 35: 2 yellow <NA> active 6.5
#> 36: 2 red <NA> inactive 4.5
#> 37: 2 green <NA> active 9.0
#> 38: 2 red <NA> archived 5.0
#> 39: 2 red <NA> removed 7.0
#> 40: 2 green <NA> removed 4.5
#> 41: 2 yellow <NA> archived 2.5
#> 42: 3 red <NA> <NA> 20.0
#> 43: 3 green <NA> <NA> 35.0
#> 44: 3 yellow <NA> <NA> 9.0
#> 45: 4 <NA> 2015-01-01 active 5.5
#> 46: 4 <NA> 2015-01-01 inactive 5.5
#> 47: 4 <NA> 2014-01-01 archived 3.5
#> 48: 4 <NA> 2015-01-01 archived 2.0
#> 49: 4 <NA> 2014-01-01 active 4.5
#> 50: 4 <NA> 2013-01-01 inactive 2.0
#> 51: 4 <NA> 2011-01-01 active 6.0
#> 52: 4 <NA> 2014-01-01 inactive 10.5
#> 53: 4 <NA> 2011-01-01 archived 5.5
#> 54: 4 <NA> 2012-01-01 archived 4.5
#> 55: 4 <NA> 2011-01-01 removed 5.5
#> 56: 4 <NA> 2013-01-01 removed 3.5
#> 57: 4 <NA> 2013-01-01 active 3.0
#> 58: 4 <NA> 2014-01-01 removed 2.5
#> 59: 5 <NA> 2015-01-01 <NA> 13.0
#> 60: 5 <NA> 2014-01-01 <NA> 21.0
#> 61: 5 <NA> 2013-01-01 <NA> 8.5
#> 62: 5 <NA> 2011-01-01 <NA> 17.0
#> 63: 5 <NA> 2012-01-01 <NA> 4.5
#> 64: 6 <NA> <NA> active 19.0
#> 65: 6 <NA> <NA> inactive 18.0
#> 66: 6 <NA> <NA> archived 15.5
#> 67: 6 <NA> <NA> removed 11.5
#> 68: 7 <NA> <NA> <NA> 64.0
#> grouping color year status V1
#> <int> <char> <Date> <fctr> <num>
cube(DT, j = lapply(.SD, sum), by = c("color","year","status"), id=TRUE, .SDcols="value")
#> grouping color year status value
#> <int> <char> <Date> <fctr> <num>
#> 1: 0 red 2015-01-01 active 3.5
#> 2: 0 green 2015-01-01 inactive 5.5
#> 3: 0 green 2014-01-01 archived 3.5
#> 4: 0 green 2015-01-01 archived 2.0
#> 5: 0 yellow 2014-01-01 active 4.5
#> 6: 0 red 2013-01-01 inactive 2.0
#> 7: 0 green 2011-01-01 active 6.0
#> 8: 0 red 2014-01-01 inactive 2.5
#> 9: 0 green 2011-01-01 archived 2.5
#> 10: 0 yellow 2015-01-01 active 2.0
#> 11: 0 red 2012-01-01 archived 2.0
#> 12: 0 red 2011-01-01 removed 3.5
#> 13: 0 green 2014-01-01 inactive 8.0
#> 14: 0 green 2011-01-01 removed 2.0
#> 15: 0 yellow 2012-01-01 archived 2.5
#> 16: 0 red 2013-01-01 removed 3.5
#> 17: 0 green 2013-01-01 active 3.0
#> 18: 0 green 2014-01-01 removed 2.5
#> 19: 0 red 2011-01-01 archived 3.0
#> 20: 1 red 2015-01-01 <NA> 3.5
#> 21: 1 green 2015-01-01 <NA> 7.5
#> 22: 1 green 2014-01-01 <NA> 14.0
#> 23: 1 yellow 2014-01-01 <NA> 4.5
#> 24: 1 red 2013-01-01 <NA> 5.5
#> 25: 1 green 2011-01-01 <NA> 10.5
#> 26: 1 red 2014-01-01 <NA> 2.5
#> 27: 1 yellow 2015-01-01 <NA> 2.0
#> 28: 1 red 2012-01-01 <NA> 2.0
#> 29: 1 red 2011-01-01 <NA> 6.5
#> 30: 1 yellow 2012-01-01 <NA> 2.5
#> 31: 1 green 2013-01-01 <NA> 3.0
#> 32: 2 red <NA> active 3.5
#> 33: 2 green <NA> inactive 13.5
#> 34: 2 green <NA> archived 8.0
#> 35: 2 yellow <NA> active 6.5
#> 36: 2 red <NA> inactive 4.5
#> 37: 2 green <NA> active 9.0
#> 38: 2 red <NA> archived 5.0
#> 39: 2 red <NA> removed 7.0
#> 40: 2 green <NA> removed 4.5
#> 41: 2 yellow <NA> archived 2.5
#> 42: 3 red <NA> <NA> 20.0
#> 43: 3 green <NA> <NA> 35.0
#> 44: 3 yellow <NA> <NA> 9.0
#> 45: 4 <NA> 2015-01-01 active 5.5
#> 46: 4 <NA> 2015-01-01 inactive 5.5
#> 47: 4 <NA> 2014-01-01 archived 3.5
#> 48: 4 <NA> 2015-01-01 archived 2.0
#> 49: 4 <NA> 2014-01-01 active 4.5
#> 50: 4 <NA> 2013-01-01 inactive 2.0
#> 51: 4 <NA> 2011-01-01 active 6.0
#> 52: 4 <NA> 2014-01-01 inactive 10.5
#> 53: 4 <NA> 2011-01-01 archived 5.5
#> 54: 4 <NA> 2012-01-01 archived 4.5
#> 55: 4 <NA> 2011-01-01 removed 5.5
#> 56: 4 <NA> 2013-01-01 removed 3.5
#> 57: 4 <NA> 2013-01-01 active 3.0
#> 58: 4 <NA> 2014-01-01 removed 2.5
#> 59: 5 <NA> 2015-01-01 <NA> 13.0
#> 60: 5 <NA> 2014-01-01 <NA> 21.0
#> 61: 5 <NA> 2013-01-01 <NA> 8.5
#> 62: 5 <NA> 2011-01-01 <NA> 17.0
#> 63: 5 <NA> 2012-01-01 <NA> 4.5
#> 64: 6 <NA> <NA> active 19.0
#> 65: 6 <NA> <NA> inactive 18.0
#> 66: 6 <NA> <NA> archived 15.5
#> 67: 6 <NA> <NA> removed 11.5
#> 68: 7 <NA> <NA> <NA> 64.0
#> grouping color year status value
#> <int> <char> <Date> <fctr> <num>
cube(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"), id=TRUE)
#> grouping color year status count amount value
#> <int> <char> <Date> <fctr> <int> <int> <num>
#> 1: 0 red 2015-01-01 active 1 4 3.5
#> 2: 0 green 2015-01-01 inactive 2 5 5.5
#> 3: 0 green 2014-01-01 archived 1 3 3.5
#> 4: 0 green 2015-01-01 archived 1 4 2.0
#> 5: 0 yellow 2014-01-01 active 2 5 4.5
#> 6: 0 red 2013-01-01 inactive 1 1 2.0
#> 7: 0 green 2011-01-01 active 2 9 6.0
#> 8: 0 red 2014-01-01 inactive 1 5 2.5
#> 9: 0 green 2011-01-01 archived 1 4 2.5
#> 10: 0 yellow 2015-01-01 active 1 4 2.0
#> 11: 0 red 2012-01-01 archived 1 4 2.0
#> 12: 0 red 2011-01-01 removed 1 1 3.5
#> 13: 0 green 2014-01-01 inactive 3 7 8.0
#> 14: 0 green 2011-01-01 removed 1 4 2.0
#> 15: 0 yellow 2012-01-01 archived 1 1 2.5
#> 16: 0 red 2013-01-01 removed 1 3 3.5
#> 17: 0 green 2013-01-01 active 1 2 3.0
#> 18: 0 green 2014-01-01 removed 1 5 2.5
#> 19: 0 red 2011-01-01 archived 1 1 3.0
#> 20: 1 red 2015-01-01 <NA> 1 4 3.5
#> 21: 1 green 2015-01-01 <NA> 3 9 7.5
#> 22: 1 green 2014-01-01 <NA> 5 15 14.0
#> 23: 1 yellow 2014-01-01 <NA> 2 5 4.5
#> 24: 1 red 2013-01-01 <NA> 2 4 5.5
#> 25: 1 green 2011-01-01 <NA> 4 17 10.5
#> 26: 1 red 2014-01-01 <NA> 1 5 2.5
#> 27: 1 yellow 2015-01-01 <NA> 1 4 2.0
#> 28: 1 red 2012-01-01 <NA> 1 4 2.0
#> 29: 1 red 2011-01-01 <NA> 2 2 6.5
#> 30: 1 yellow 2012-01-01 <NA> 1 1 2.5
#> 31: 1 green 2013-01-01 <NA> 1 2 3.0
#> 32: 2 red <NA> active 1 4 3.5
#> 33: 2 green <NA> inactive 5 12 13.5
#> 34: 2 green <NA> archived 3 11 8.0
#> 35: 2 yellow <NA> active 3 9 6.5
#> 36: 2 red <NA> inactive 2 6 4.5
#> 37: 2 green <NA> active 3 11 9.0
#> 38: 2 red <NA> archived 2 5 5.0
#> 39: 2 red <NA> removed 2 4 7.0
#> 40: 2 green <NA> removed 2 9 4.5
#> 41: 2 yellow <NA> archived 1 1 2.5
#> 42: 3 red <NA> <NA> 7 19 20.0
#> 43: 3 green <NA> <NA> 13 43 35.0
#> 44: 3 yellow <NA> <NA> 4 10 9.0
#> 45: 4 <NA> 2015-01-01 active 2 8 5.5
#> 46: 4 <NA> 2015-01-01 inactive 2 5 5.5
#> 47: 4 <NA> 2014-01-01 archived 1 3 3.5
#> 48: 4 <NA> 2015-01-01 archived 1 4 2.0
#> 49: 4 <NA> 2014-01-01 active 2 5 4.5
#> 50: 4 <NA> 2013-01-01 inactive 1 1 2.0
#> 51: 4 <NA> 2011-01-01 active 2 9 6.0
#> 52: 4 <NA> 2014-01-01 inactive 4 12 10.5
#> 53: 4 <NA> 2011-01-01 archived 2 5 5.5
#> 54: 4 <NA> 2012-01-01 archived 2 5 4.5
#> 55: 4 <NA> 2011-01-01 removed 2 5 5.5
#> 56: 4 <NA> 2013-01-01 removed 1 3 3.5
#> 57: 4 <NA> 2013-01-01 active 1 2 3.0
#> 58: 4 <NA> 2014-01-01 removed 1 5 2.5
#> 59: 5 <NA> 2015-01-01 <NA> 5 17 13.0
#> 60: 5 <NA> 2014-01-01 <NA> 8 25 21.0
#> 61: 5 <NA> 2013-01-01 <NA> 3 6 8.5
#> 62: 5 <NA> 2011-01-01 <NA> 6 19 17.0
#> 63: 5 <NA> 2012-01-01 <NA> 2 5 4.5
#> 64: 6 <NA> <NA> active 7 24 19.0
#> 65: 6 <NA> <NA> inactive 7 18 18.0
#> 66: 6 <NA> <NA> archived 6 17 15.5
#> 67: 6 <NA> <NA> removed 4 13 11.5
#> 68: 7 <NA> <NA> <NA> 24 72 64.0
#> grouping color year status count amount value
#> <int> <char> <Date> <fctr> <int> <int> <num>
# groupingsets
groupingsets(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"),
sets = list("color", c("year","status"), character()), id=TRUE)
#> grouping color year status count amount value
#> <int> <char> <Date> <fctr> <int> <int> <num>
#> 1: 3 red <NA> <NA> 7 19 20.0
#> 2: 3 green <NA> <NA> 13 43 35.0
#> 3: 3 yellow <NA> <NA> 4 10 9.0
#> 4: 4 <NA> 2015-01-01 active 2 8 5.5
#> 5: 4 <NA> 2015-01-01 inactive 2 5 5.5
#> 6: 4 <NA> 2014-01-01 archived 1 3 3.5
#> 7: 4 <NA> 2015-01-01 archived 1 4 2.0
#> 8: 4 <NA> 2014-01-01 active 2 5 4.5
#> 9: 4 <NA> 2013-01-01 inactive 1 1 2.0
#> 10: 4 <NA> 2011-01-01 active 2 9 6.0
#> 11: 4 <NA> 2014-01-01 inactive 4 12 10.5
#> 12: 4 <NA> 2011-01-01 archived 2 5 5.5
#> 13: 4 <NA> 2012-01-01 archived 2 5 4.5
#> 14: 4 <NA> 2011-01-01 removed 2 5 5.5
#> 15: 4 <NA> 2013-01-01 removed 1 3 3.5
#> 16: 4 <NA> 2013-01-01 active 1 2 3.0
#> 17: 4 <NA> 2014-01-01 removed 1 5 2.5
#> 18: 7 <NA> <NA> <NA> 24 72 64.0