Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Overview

A dependency-free collection of simple functions for cleaning rectangular data. The package lets you detect, count, and replace values, or discard rows and columns, using a predicate function. It also provides tools to check conditions and return informative error messages.

To cite arkhe in publications use:

  Frerebeau N (2024). _arkhe: Tools for Cleaning Rectangular Data_.
  Université Bordeaux Montaigne, Pessac, France.
  doi:10.5281/zenodo.3526659 <https://doi.org/10.5281/zenodo.3526659>,
  R package version 1.6.0, <https://packages.tesselle.org/arkhe/>.

A BibTeX entry for LaTeX users is

  @Manual{,
    author = {Nicolas Frerebeau},
    title = {{arkhe: Tools for Cleaning Rectangular Data}},
    year = {2024},
    organization = {Université Bordeaux Montaigne},
    address = {Pessac, France},
    note = {R package version 1.6.0},
    url = {https://packages.tesselle.org/arkhe/},
    doi = {10.5281/zenodo.3526659},
  }

This package is a part of the tesselle project
<https://www.tesselle.org>.

Installation

You can install the released version of arkhe from CRAN with:

install.packages("arkhe")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("tesselle/arkhe")

Usage

## Load the package
library(arkhe)

## Create a matrix
X <- matrix(sample(1:10, 25, TRUE), nrow = 5, ncol = 5)

## Add NA
k <- sample(1:25, 3, FALSE)
X[k] <- NA
X
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    6    7    1   NA   NA
#> [2,]    6    3    1    7    1
#> [3,]    9    5    6    4   10
#> [4,]   10    9    6   NA    5
#> [5,]    6   10   10    7    9

## Count missing values in rows
count(X, f = is.na, margin = 1)
#> [1] 2 0 0 1 0
## Count non-missing values in columns
count(X, f = is.na, margin = 2, negate = TRUE)
#> [1] 5 5 5 3 4

## Find rows with at least one NA
detect(X, f = is.na, margin = 1)
#> [1]  TRUE FALSE FALSE  TRUE FALSE
## Find columns without any NA
detect(X, f = is.na, margin = 2, negate = TRUE, all = TRUE)
#> [1]  TRUE  TRUE  TRUE FALSE FALSE

## Remove rows with any NA
discard(X, f = is.na, margin = 1, all = FALSE)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    6    3    1    7    1
#> [2,]    9    5    6    4   10
#> [3,]    6   10   10    7    9
## Remove columns with any NA
discard(X, f = is.na, margin = 2, all = FALSE)
#>      [,1] [,2] [,3]
#> [1,]    6    7    1
#> [2,]    6    3    1
#> [3,]    9    5    6
#> [4,]   10    9    6
#> [5,]    6   10   10

## Replace NA with zeros
replace_NA(X, value = 0)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    6    7    1    0    0
#> [2,]    6    3    1    7    1
#> [3,]    9    5    6    4   10
#> [4,]   10    9    6    0    5
#> [5,]    6   10   10    7    9
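
For comparison, the row-wise steps above can be approximated in base R. This is only a sketch of the predicate-based idea, not arkhe's implementation: the matrix is typed in by hand to match the printed `X`, and the names `na_cells`, `n_missing`, `has_missing`, and `X_complete` are illustrative.

```r
## Rebuild the example matrix by hand (filled column by column)
X <- matrix(c( 6,  6,  9, 10,  6,
               7,  3,  5,  9, 10,
               1,  1,  6,  6, 10,
              NA,  7,  4, NA,  7,
              NA,  1, 10,  5,  9), nrow = 5, ncol = 5)

## Apply the predicate to every cell: a logical matrix of the same shape
na_cells <- is.na(X)

## Count cells matching the predicate in each row
n_missing <- rowSums(na_cells)

## Flag rows where at least one cell matches
has_missing <- n_missing > 0

## Keep only the rows where no cell matches
X_complete <- X[!has_missing, , drop = FALSE]
```

Swapping `rowSums()` for `colSums()` and indexing columns instead of rows gives the column-wise variants. `drop = FALSE` keeps the result a matrix even when a single row remains.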

Contributing

Please note that the arkhe project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.