frollapply {data.table} | R Documentation |
Rolling user-defined function
Description
Fast rolling user-defined function (UDF) to calculate on a sliding window. Experimental. Please read, at least, the Caveats section below. For a "time-aware" rolling function (irregularly spaced time series), see frolladapt.
Usage
frollapply(X, N, FUN, ..., by.column=TRUE, fill=NA,
align=c("right","left","center"), adaptive=FALSE, partial=FALSE,
give.names=FALSE, simplify=TRUE, x, n)
Arguments
X |
Atomic vector, |
N |
Integer, non-negative, rolling window size. This is the total number of values included in the aggregate function. In case of an adaptive rolling function, the window size has to be provided as a vector for each individual value of |
FUN |
The function to be applied on subsets of |
... |
Extra arguments passed to |
by.column |
Logical. When |
fill |
An object; value to pad by. Defaults to |
align |
Character, specifying the "alignment" of the rolling window, defaulting to |
adaptive |
Logical, default |
partial |
Logical, default |
give.names |
Logical, default |
simplify |
Logical or a function. When |
x |
Deprecated, use |
n |
Deprecated, use |
Value
Argument simplify impacts the type returned. Its default value TRUE is set for convenience and backward compatibility, but it is advised to use simplify=unlist (or other desired function) instead.
- simplify=FALSE will always return a list where each element is the result of one iteration.
- simplify=unlist (or any other function) will return the object returned by the provided function when it is supplied with the results of frollapply using simplify=FALSE.
- simplify=TRUE will try to simplify results by unlist, rbind or other functions. Its behavior is subject to change, see the simplify argument section below for more details.
by.column argument
Setting by.column to FALSE allows applying a function to multiple variables rather than to a single vector. X is then expected to be a data.table, a data.frame or a list of equal length vectors, and the window size provided in N refers to the number of rows (or the length of the vectors in a list). See examples for use cases. The error "incorrect number of dimensions" is commonly observed when by.column was not set to FALSE while FUN expects its input to be a data.frame/data.table.
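As a minimal sketch of the by.column=FALSE behaviour described above, the following rolls a two-row window over a small data.table. The table dt, its columns a and b, and the function wsum are made up for illustration; the sketch assumes the default fill=NA and align="right".

```r
library(data.table)

# Hypothetical two-column input; names 'a' and 'b' are illustrative only
dt = data.table(a = 1:5, b = c(2, 4, 6, 8, 10))

# With by.column=FALSE, FUN receives a window of rows holding both columns,
# so it can combine them (here: sum of row-wise products within the window)
wsum = function(w) sum(w$a * w$b)

# N=2 refers to the number of rows in each window
res = frollapply(dt, 2L, wsum, by.column=FALSE, simplify=unlist)
res
# the first element is the fill value because the first window is incomplete
```

Calling the same FUN without by.column=FALSE would pass each column separately as an atomic vector, which is the situation the error mentioned above typically points to.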
simplify argument
When set to TRUE, the default, results from a rolling function, which are normally stored in a list, may be simplified with either unlist or rbindlist. It also attempts to match the type, size and names of the fill argument to the results of the function.
One should avoid simplify=TRUE when writing robust code. One reason is performance, as explained in the Performance consideration section below. Another is backward compatibility. For backward compatibility and performance one should always provide the desired function to simplify explicitly. In a future version we may change the internal simplifylist function; simplify=TRUE may then return an object of a different type, breaking downstream code.
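The relation between simplify=FALSE and an explicit simplify function can be sketched as follows. This is only an illustration of the behaviour described in the Value section; the variable names are arbitrary.

```r
library(data.table)

x = 1:5

# simplify=FALSE: one list element per iteration; incomplete windows hold 'fill'
res_list = frollapply(x, 3L, sum, simplify=FALSE)

# simplify=unlist: the supplied function is applied to that same list
res_vec = frollapply(x, 3L, sum, simplify=unlist)

is.list(res_list)                     # list with one element per input value
identical(unlist(res_list), res_vec)  # explicit function matches manual post-processing
```

Passing the function explicitly keeps the returned type under the caller's control, independent of how simplify=TRUE may behave in future versions.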
Caveats
With great power comes great responsibility.
An optimization used to avoid repeated allocation of window subsets (explained more deeply in Implementation section below) may, in special cases, return rather surprising results:
setDTthreads(1)
frollapply(c(1, 9), N=1L, FUN=identity) ## unexpected
#[1] 9 9
frollapply(c(1, 9), N=1L, FUN=list) ## unexpected
#      V1
#   <num>
#1:     9
#2:     9
setDTthreads(2, throttle=1) ## disable throttle
frollapply(c(1, 9), N=1L, FUN=identity) ## good only because threads >= input
#[1] 1 9 ## on Linux and macOS
frollapply(c(1, 5, 9), N=1L, FUN=identity) ## unexpected again
#[1] 5 5 9
The problem occurs, in scenarios that are rather unlikely for rolling computations, when the object returned from a function can be its input (e.g. identity) or a reference to it (e.g. list). In such cases one has to add an extra copy call:
setDTthreads(1)
frollapply(c(1, 9), N=1L, FUN=function(x) copy(identity(x))) ## only 'copy' would be equivalent here
#[1] 1 9
frollapply(c(1, 9), N=1L, FUN=function(x) copy(list(x)))
#      V1
#   <num>
#1:     1
#2:     9
- FUN calls are internally passed to parallel::mcparallel to evaluate them in parallel. We inherit a few limitations from the parallel package, explained below. This optimization can be disabled completely by calling setDTthreads(1); the limitations listed below then do not apply, because all iterations of FUN evaluation will be made sequentially without use of the parallel package. Note that on the Windows platform this optimization is always disabled due to the lack of fork used by the parallel package. One can use options(datatable.verbose=TRUE) to get extra information on whether frollapply is running multithreaded or not.
Warnings produced inside the function are silently ignored; for consistency we also ignore warnings when running the single threaded path.
- FUN should not use any on-screen devices, GUI elements, tcltk, or multithreaded libraries. Note that setDTthreads(1L) is passed to forked processes, therefore any data.table code inside FUN will be forced to be single threaded. It is advised not to call setDTthreads inside FUN. frollapply is already parallelized and nested parallelism is rarely a good idea. Any operation that could misbehave when run in parallel has to be handled, for example writing to the same file from multiple CPU threads:
old = setDTthreads(1L)
frollapply(iris, 5L, by.column=FALSE, FUN=fwrite, file="rolling-data.csv", append=TRUE)
setDTthreads(old)
Objects returned by FUN from forked processes are serialized. This may cause problems for objects that are not meant to be serialized, like data.table. We handle that for the data.table class internally in frollapply whenever FUN returns a data.table (this is checked on the result of the first FUN call, so it assumes the function is type stable). If a data.table is nested in another object returned from FUN, the problem may still manifest; in such a case one has to call setDT on the objects returned from FUN. This can also be handled nicely via the simplify argument, by passing a function that calls setDT on nested data.table objects returned from FUN. In any case, returning a data.table from FUN should, in the majority of cases, be avoided for performance reasons, see the UDF optimization section for details.
setDTthreads(2, throttle=1) ## disable throttle
## frollapply will fix DT in most cases
ans = frollapply(1:2, 2, data.table)
.selfref.ok(ans)
#[1] TRUE
ans = frollapply(1:2, 2, data.table, simplify=FALSE)
.selfref.ok(ans[[2L]])
#[1] TRUE
## nested DT not fixed
ans = frollapply(1:2, 2, function(x) list(data.table(x)), fill=list(data.table(NA)), simplify=FALSE)
.selfref.ok(ans[[2L]][[1L]])
#[1] FALSE
#### now if we want to use it
set(ans[[2L]][[1L]],, "newcol", 1L)
#Error in set(ans[[2L]][[1L]], , "newcol", 1L) :
#  This data.table has either been loaded from disk (e.g. using readRDS()/load()) or constructed manually (e.g. using structure()). Please run setDT() or setalloccol() on it first (to pre-allocate space for new columns) before assigning by reference to it.
#### fix as explained in error message
ans = lapply(ans, lapply, setDT)
.selfref.ok(ans[[2L]][[1L]])
#[1] TRUE
## fix inside frollapply via simplify
simplifix = function(x) lapply(x, lapply, setDT)
ans = frollapply(1:2, 2, function(x) list(data.table(x)), fill=list(data.table(NA)), simplify=simplifix)
.selfref.ok(ans[[2L]][[1L]])
#[1] TRUE
## automatic fix may not work for a non-type stable function
f = function(x) (if (x[1L]==1L) data.frame else data.table)(x)
ans = frollapply(1:3, 2, f, fill=data.table(NA), simplify=FALSE)
.selfref.ok(ans[[3L]])
#[1] FALSE
#### fix inside frollapply via simplify
simplifix = function(x) lapply(x, function(y) if (is.data.table(y)) setDT(y) else y)
ans = frollapply(1:3, 2, f, fill=data.table(NA), simplify=simplifix)
.selfref.ok(ans[[3L]])
#[1] TRUE
setDTthreads(2, throttle=1024) ## enable throttle
Due to possible future improvements in how results returned from a rolling function are simplified, the default simplify=TRUE may not be backward compatible for functions whose results are not yet automatically simplified. See the simplify argument section for details.
Performance consideration
frollapply is meant to run any UDF. If one needs to use a common function like mean, sum, max, etc., then we have highly optimized rolling functions, implemented in C, described in the froll manual.
Most crucial optimizations are the ones to be applied on UDF. Those are discussed in next section UDF optimization below.
When using by.column=FALSE one can subset the dataset before passing it to X to keep only the columns relevant for the computation:
x = setDT(lapply(1:1000, function(x) as.double(rep.int(x,1e4L))))
f = function(x) sum(x$V1 * x$V2)
system.time(frollapply(x, 100, f, by.column=FALSE))
#   user  system elapsed
#  0.376   0.073   0.233
system.time(frollapply(x[, c("V1","V2"), with=FALSE], 100, f, by.column=FALSE))
#   user  system elapsed
#  0.376   0.073   0.233
Avoid the partial argument, see the partial argument section of the froll manual.
Avoid simplify=TRUE and provide a function instead:
x = rnorm(1e5)
system.time(frollapply(x, 2, function(x) 1L, simplify=TRUE))
#   user  system elapsed
#  0.227   0.095   0.236
system.time(frollapply(x, 2, function(x) 1L, simplify=unlist))
#   user  system elapsed
#  0.054   0.049   0.091
CPU thread utilization in frollapply can be controlled by setDTthreads, which by default uses half of the available CPU threads. Usage of multiple CPU threads will be throttled for small input, as described in the setDTthreads manual.
The optimization that avoids repeated allocation of a window subset (see Implementation section for details) depends, in case of an adaptive rolling function, on R's growable bit. This feature was added in R 3.4.0. Adaptive frollapply will still work on older versions of R but, due to repeated allocation of the window subset, it will be much slower.
Parallel computation of FUN is handled by the parallel package (part of R core since 2.14.0) and its fork mechanism. Fork is not available on Windows, therefore frollapply will always be single threaded on that platform.
UDF optimization
FUN will be evaluated many times, so it should be highly optimized. The tips below are not specific to frollapply and can be applied to any code that is meant to run in many iterations.
It is usually better to return the most lightweight objects from FUN; for example, it will be faster to return a list rather than a data.table. In the case presented below, rbindlist is called on the results anyway (via the simplify argument), which makes the results equal:
fun1 = function(x) {tmp=range(x); data.table(min=tmp[1L], max=tmp[2L])}
fun2 = function(x) {tmp=range(x); list(min=tmp[1L], max=tmp[2L])}
fill1 = data.table(min=NA_integer_, max=NA_integer_)
fill2 = list(min=NA_integer_, max=NA_integer_)
system.time(a<-frollapply(1:1e4, 100, fun1, fill=fill1, simplify=rbindlist))
#   user  system elapsed
#  0.934   0.347   0.706
system.time(b<-frollapply(1:1e4, 100, fun2, fill=fill2, simplify=rbindlist))
#   user  system elapsed
#  0.010   0.033   0.094
all.equal(a, b)
#[1] TRUE
Code that is not dependent on a rolling window should be taken out as pre or post computation:
x = c(1L,3L)
system.time(for (i in 1:1e6) sum(x+1L))
#   user  system elapsed
#  0.218   0.002   0.221
system.time({y = x+1L; for (i in 1:1e6) sum(y)})
#   user  system elapsed
#  0.160   0.001   0.161
Being strict about data types removes the need for R to handle them automatically:
x = vector("integer", 1e6)
system.time(for (i in 1:1e6) x[i] = NA)
#   user  system elapsed
#  0.114   0.000   0.114
system.time(for (i in 1:1e6) x[i] = NA_integer_)
#   user  system elapsed
#  0.029   0.000   0.030
If a function calls another function under the hood, it is usually better to call the latter one directly:
x = matrix(c(1L,2L,3L,4L), c(2L,2L))
system.time(for (i in 1:1e4) colSums(x))
#   user  system elapsed
#  0.033   0.000   0.033
system.time(for (i in 1:1e4) .colSums(x, 2L, 2L))
#   user  system elapsed
#  0.010   0.002   0.012
Many functions are optimized to scale up for bigger input, yet for small input they may carry bigger overhead compared to their simpler counterparts. One may need to experiment on one's own data, but low overhead functions are likely to be faster when evaluated in many iterations:
## uniqueN
x = c(1L,3L,5L)
system.time(for (i in 1:1e4) uniqueN(x))
#   user  system elapsed
#  0.078   0.001   0.080
system.time(for (i in 1:1e4) length(unique(x)))
#   user  system elapsed
#  0.018   0.000   0.018
## column subset
x = data.table(v1 = c(1L,3L,5L))
system.time(for (i in 1:1e4) x[, v1])
#   user  system elapsed
#  1.952   0.011   1.964
system.time(for (i in 1:1e4) x[["v1"]])
#   user  system elapsed
#  0.036   0.000   0.035
Implementation
Evaluation of a UDF leaves very limited room for optimization, therefore speed improvements in frollapply should not be expected to be as good as in other data.table fast functions. frollapply is implemented almost exclusively in R, rather than C. Its speed improvement comes from two optimizations that have been applied:
No repeated allocation of a rolling window subset.
An object (of the type of X and the size of N) is allocated once (for each CPU thread), and then for each iteration this object is re-used by copying the expected subset of data into it. This means we still have to subset data on each iteration, but we only copy data into the pre-allocated window object, instead of allocating in each iteration. Allocation carries much bigger overhead than copying. The faster FUN evaluates, the bigger the relative speedup, because the cost of allocating a subset does not depend on how fast or slow FUN evaluates. See the Caveats section for possible edge cases caused by this optimization.
Parallel evaluation of FUN calls.
Until now (September 2025) all the multithreaded code in data.table was using OpenMP. It can be used only in C and it has very low overhead. Unfortunately it could not be applied in frollapply, because to evaluate a UDF from C code one has to call R's C API, which is not thread safe (it can be run only from single threaded C code). Therefore frollapply uses the parallel package to provide parallelism at the R language level. It uses fork parallelism, which has low overhead as well (unless the results of the computation are big in size, which is not an issue for rolling statistics). Fork is not available on Windows. See the Caveats section for limitations caused by using this optimization.
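The idea of fork-based parallelism at the R level can be sketched with the parallel package directly. This is a simplified illustration under stated assumptions, not data.table's internal code: the helper roll_chunk, the two-way split of iterations and the sequential fallback are all made up for this example.

```r
library(parallel)

x = c(1, 3, 5, 7, 9, 11, 13, 15)
n = 3L
ends = n:length(x)   # positions where a complete window ends

# Hypothetical helper: evaluate the UDF (here: sum) on every window ending in 'idx'
roll_chunk = function(idx) vapply(idx, function(i) sum(x[(i - n + 1L):i]), numeric(1L))

# Split the iterations into two chunks, one per worker process
half = length(ends) %/% 2L
chunks = list(ends[seq_len(half)], ends[-seq_len(half)])

if (.Platform$OS.type == "unix") {
  # Fork one child process per chunk; results come back serialized, in job order
  jobs = lapply(chunks, function(idx) mcparallel(roll_chunk(idx)))
  res = unlist(mccollect(jobs), use.names=FALSE)
} else {
  # No fork on Windows: evaluate sequentially instead
  res = roll_chunk(ends)
}

# Pad the incomplete leading windows, as a right-aligned rolling function would
res = c(rep(NA_real_, n - 1L), res)
res
```

Because each child is a forked copy of the parent process, x and n are visible without being exported explicitly; only the results have to be sent back, which is why small results keep the overhead low.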
Note
Be aware that rolling functions operate on the physical order of input. If the intent is to roll values in a vector by a logical window, for example an hour or a day, then one has to ensure that there are no gaps in the input, or use an adaptive rolling function to handle gaps, for which we provide the helper function frolladapt to generate the adaptive window size.
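The effect of a gap can be illustrated with a small sketch. The series below is hypothetical, and the adaptive window is computed by hand purely for illustration; it is a stand-in for, not a reproduction of, what frolladapt returns (for instance, incomplete leading windows are not set to NA here). frollsum from the froll manual is used as the rolling function.

```r
library(data.table)

# Hypothetical daily series where days 4 and 5 are missing
day = c(1L, 2L, 3L, 6L, 7L)
val = c(10, 20, 30, 40, 50)

# A fixed window of 3 rows rolls on physical order and silently spans the gap:
# the value for day 6 sums days 2, 3 and 6
fixed = frollsum(val, 3L)
# fixed: NA NA 60 90 120

# Hand-computed adaptive window: number of rows whose day falls in (d-3, d]
n_adapt = vapply(day, function(d) sum(day > d - 3L & day <= d), integer(1L))
adaptive = frollsum(val, n_adapt, adaptive=TRUE)
# adaptive: 10 30 60 40 90

data.table(day, val, fixed, n_adapt, adaptive)
```

With the adaptive window, the value for day 6 only covers observations from days 4 to 6, so the rows from days 2 and 3 no longer leak into it.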
See Also
froll, frolladapt, shift, data.table, setDTthreads
Examples
frollapply(1:16, 4, median)
frollapply(1:9, 3, toString)
## vectorized input
x = list(1:10, 10:1)
n = c(3, 4)
frollapply(x, n, sum)
## give names
x = list(data1 = 1:10, data2 = 10:1)
n = c(small = 3, big = 4)
frollapply(x, n, sum, give.names=TRUE)
## by.column=FALSE
x = as.data.table(iris)
flow = function(x) {
  v1 = x[[1L]]
  v2 = x[[2L]]
  (v1[2L] - v1[1L] * (1+v2[2L])) / v1[1L]
}
x[, "flow" := frollapply(.(Sepal.Length, Sepal.Width), 2L, flow, by.column=FALSE),
  by = Species][]
## rolling regression: by.column=FALSE
f = function(x) coef(lm(v2 ~ v1, data=x))
x = data.table(v1=rnorm(120), v2=rnorm(120))
frollapply(x, 4, f, by.column=FALSE)