
Convenient test of data equality between data.table objects. Performs some factor level stripping.

Usage

# S3 method for data.table
all.equal(target, current, trim.levels=TRUE, check.attributes=TRUE,
    ignore.col.order=FALSE, ignore.row.order=FALSE, tolerance=sqrt(.Machine$double.eps),
    ...)

Arguments

target, current

data.tables to compare. If current is not a data.table but check.attributes is FALSE, it is coerced to one via as.data.table.

trim.levels

A logical indicating whether to remove all unused levels from factor columns before running the equality check. It takes effect only when check.attributes is TRUE and ignore.row.order is FALSE.

check.attributes

A logical indicating whether to check attributes. This applies not only to the data.table itself but also to the attributes of its columns. The data.table attributes c("row.names", ".internal.selfref") are always skipped.

ignore.col.order

A logical indicating whether to ignore column order in the data.table.

ignore.row.order

A logical indicating whether to ignore row order in the data.table. This option requires column types on which a join can be made, so list, complex, and raw columns are not supported; integer64 is supported.

tolerance

A numeric value used when comparing numeric columns; the default is sqrt(.Machine$double.eps). Unless a non-default value is provided, it is forced to 0 when used together with ignore.row.order and either duplicate rows are detected or factor columns are present.

...

Passed down to the internal call of all.equal.
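
As an illustration of the tolerance argument described above, the following is a minimal sketch, assuming the data.table package is installed. The tables x and y and their column names are made up for this example. It compares two tables whose numeric column differs by a tiny amount, first with the default tolerance and then with tolerance = 0.

```r
library(data.table)

# Two hypothetical tables that differ only by a tiny numeric amount in column v
x <- data.table(id = 1:3, v = c(1, 2, 3))
y <- data.table(id = 1:3, v = c(1, 2, 3 + 1e-10))

# Default tolerance, sqrt(.Machine$double.eps): the difference is small
# enough to be treated as equal
within_default <- isTRUE(all.equal(x, y))

# tolerance = 0 requests an exact comparison, so the difference is reported
exact_match <- isTRUE(all.equal(x, y, tolerance = 0))

print(within_default)
print(exact_match)
```

Wrapping the call in isTRUE() keeps the result a single logical even when all.equal returns a character description of the difference.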

Details

For efficiency, the data.table method exits as soon as it detects a non-equality, unlike most all.equal methods, which continue checking and report further differences. It also handles the most time-consuming case, ignore.row.order = TRUE, very efficiently.
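
The early-exit behaviour can be observed with a small sketch like the one below, assuming the data.table package is installed. The tables and column names are invented for illustration, and the exact wording of the messages may vary between versions, so none is shown. Both columns differ, and the data.table result is compared with what the data.frame method returns for the same data.

```r
library(data.table)

# Hypothetical tables in which both columns contain a difference
x <- data.table(a = 1:3, b = c("p", "q", "r"))
y <- data.table(a = c(1L, 2L, 4L), b = c("p", "q", "s"))

# data.table method: expected to stop at the first differing column
dt_res <- all.equal(x, y)

# data.frame method on the same data: typically reports each differing
# column as a separate message
df_res <- all.equal(as.data.frame(x), as.data.frame(y))

print(dt_res)
print(df_res)
```

When a full list of differences is needed, comparing after as.data.frame, as in the second call, is one option, at the cost of the data.table-specific checks such as keys.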

Value

Either TRUE or a vector of mode "character" describing the differences between target and current.
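
Because the return value is either TRUE or a character vector, it should not be used directly as a condition. The sketch below, assuming the data.table package is installed and using made-up tables a and b, shows one way to branch on the result with isTRUE().

```r
library(data.table)

# Hypothetical tables that differ in one value of column v
a <- data.table(id = 1:3, v = c("x", "y", "z"))
b <- data.table(id = 1:3, v = c("x", "y", "w"))

res <- all.equal(a, b)

# isTRUE() is TRUE only for a literal TRUE, so a character description
# of the differences falls through to the else branch
if (isTRUE(res)) {
  message("tables match")
} else {
  message("tables differ: ", paste(res, collapse = "; "))
}
```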

See also

Examples

dt1 <- data.table(A = letters[1:10], X = 1:10, key = "A")
dt2 <- data.table(A = letters[5:14], Y = 1:10, key = "A")
isTRUE(all.equal(dt1, dt1))
#> [1] TRUE
is.character(all.equal(dt1, dt2))
#> [1] TRUE

# ignore.col.order
x <- copy(dt1)
y <- dt1[, .(X, A)]
all.equal(x, y)
#> [1] "Different column order"
all.equal(x, y, ignore.col.order = TRUE)
#> [1] TRUE

# ignore.row.order
x <- setkeyv(copy(dt1), NULL)
y <- dt1[sample(nrow(dt1))]
all.equal(x, y)
#> [1] "Column 'A': 10 string mismatches"
all.equal(x, y, ignore.row.order = TRUE)
#> [1] TRUE

# check.attributes
x <- copy(dt1)
y <- setkeyv(copy(dt1), NULL)
all.equal(x, y)
#> [1] "Datasets have different keys. 'target': [A]. 'current': has no key."
all.equal(x, y, check.attributes = FALSE)
#> [1] TRUE
x <- data.table(1L)
y <- 1L
all.equal(x, y)
#> [1] "target is data.table, current is numeric"
all.equal(x, y, check.attributes = FALSE)
#> [1] TRUE

# trim.levels
x <- data.table(A = factor(letters[1:10])[1:4]) # 10 levels
y <- data.table(A = factor(letters[1:5])[1:4]) # 5 levels
all.equal(x, y, trim.levels = FALSE)
#> [1] "Column 'A': Levels not identical. No attempt to refactor because trim.levels is FALSE"
all.equal(x, y, trim.levels = FALSE, check.attributes = FALSE)
#> [1] "Column 'A': Levels not identical. No attempt to refactor because trim.levels is FALSE"
all.equal(x, y)
#> [1] TRUE