
Biplot

Usage

# S4 method for CA
biplot(
  x,
  axes = c(1, 2),
  type = c("rows", "columns", "contributions"),
  active = TRUE,
  sup = TRUE,
  label = c("rows", "columns")
)

# S4 method for PCA
biplot(
  x,
  axes = c(1, 2),
  type = c("form", "covariance"),
  active = TRUE,
  sup = TRUE,
  label = c("individuals", "variables")
)

Arguments

x

A CA or PCA object.

axes

A length-two numeric vector giving the dimensions to be plotted.

type

A character string specifying the biplot to be plotted (see below). It must be one of "rows", "columns", "contributions" (CA), "form" or "covariance" (PCA). Any unambiguous substring can be given.

active

A logical scalar: should the active observations be plotted?

sup

A logical scalar: should the supplementary observations be plotted?

label

A character vector specifying whether "rows"/"individuals" and/or "columns"/"variables" names must be mapped (e.g. for use with ggrepel::geom_label_repel()). Any unambiguous substring can be given.

Value

A ggplot2::ggplot object.

Details

A biplot is the simultaneous representation of the rows and columns of a rectangular dataset. It generalizes the scatterplot to multivariate data, making it possible to visualize as much information as possible in a single graph (Greenacre 2010).

The strength of a biplot is also its weakness: because it displays a lot of information at once, it can quickly become difficult to read. It may then be preferable to visualize the results for individuals and variables separately.

PCA Biplots

form

Form biplot (row-metric-preserving). The form biplot favors the representation of the individuals: the distance between the individuals approximates the Euclidean distance between rows. In the form biplot the length of a vector approximates the quality of the representation of the variable.

covariance

Covariance biplot (column-metric-preserving). The covariance biplot favors the representation of the variables: the length of a vector approximates the standard deviation of the variable and the cosine of the angle formed by two vectors approximates the correlation between the two variables. In the covariance biplot the distance between the individuals approximates the Mahalanobis distance between rows.
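The properties of the two PCA biplots can be checked numerically. The following is a minimal base-R sketch, not the package's implementation: it assumes unweighted rows and columns, a simulated data matrix, and the usual n - 1 divisor, so the scaling used internally by pca() may differ. The equalities are exact only when all dimensions are kept; a two-dimensional plot approximates them.

```r
## Minimal sketch (assumptions: simulated data, unit weights, n - 1 divisor)
set.seed(42)
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)
X[, 2] <- X[, 1] + 0.5 * X[, 2]             # induce some correlation
Z <- scale(X, center = TRUE, scale = FALSE)  # column-centered data
n <- nrow(Z)

## Singular value decomposition: Z = U D V'
dec <- svd(Z)
U <- dec$u
D <- diag(dec$d)
V <- dec$v

## Form biplot: rows carry the singular values
form_rows <- U %*% D
form_cols <- V

## Covariance biplot: columns carry the singular values
cov_rows <- U * sqrt(n - 1)
cov_cols <- V %*% D / sqrt(n - 1)

## Form biplot: distances between rows equal Euclidean distances
stopifnot(isTRUE(all.equal(as.vector(dist(form_rows)), as.vector(dist(Z)))))

## Covariance biplot: vector lengths equal standard deviations,
## cosines of angles equal correlations
len <- sqrt(rowSums(cov_cols^2))
cosines <- tcrossprod(cov_cols) / outer(len, len)
stopifnot(isTRUE(all.equal(len, apply(X, 2, sd))))
stopifnot(isTRUE(all.equal(cosines, cor(X))))

## Covariance biplot: squared distances between rows equal squared
## Mahalanobis distances (checked here from the first row to all rows)
d2_mahal <- mahalanobis(Z, center = Z[1, ], cov = cov(X))
d2_cov <- rowSums(sweep(cov_rows, 2, cov_rows[1, ])^2)
stopifnot(isTRUE(all.equal(as.vector(d2_mahal), d2_cov)))
```

In both cases the product of row and column coordinates reconstructs the centered data, which is why the choice between "form" and "covariance" only changes which set of points carries the metric.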

CA Biplots

rows

Row principal biplot: rows are displayed in principal coordinates and columns in standard coordinates.

columns

Column principal biplot: columns are displayed in principal coordinates and rows in standard coordinates.

contributions

Contribution biplot.
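To make the difference between the three CA biplots concrete, here is a minimal base-R sketch of how the coordinates can be derived from a contingency table. It is an illustration under stated assumptions, not the package's implementation: the table N is made up, and the contribution biplot is assumed to pair row principal coordinates with column standard coordinates rescaled by the square root of the column masses (one of the variants described by Greenacre 2010).

```r
## Hypothetical 4 x 3 contingency table (illustrative values only)
N <- matrix(
  c(20,  5,  3,
     8, 15,  6,
     4,  9, 18,
    10, 10, 10),
  nrow = 4, byrow = TRUE
)

P <- N / sum(N)   # correspondence matrix
r <- rowSums(P)   # row masses
k <- colSums(P)   # column masses

## Matrix of standardized residuals and its SVD
S <- diag(1 / sqrt(r)) %*% (P - outer(r, k)) %*% diag(1 / sqrt(k))
dec <- svd(S)
keep <- seq_len(min(dim(N)) - 1)  # non-trivial dimensions
U <- dec$u[, keep, drop = FALSE]
V <- dec$v[, keep, drop = FALSE]
d <- dec$d[keep]

## Standard and principal coordinates
row_std <- U / sqrt(r)                 # row i divided by sqrt(r[i])
col_std <- V / sqrt(k)                 # column j divided by sqrt(k[j])
row_pri <- sweep(row_std, 2, d, `*`)
col_pri <- sweep(col_std, 2, d, `*`)

## Coordinates displayed by each biplot (contribution variant is an assumption)
biplots <- list(
  rows          = list(rows = row_pri, columns = col_std),
  columns       = list(rows = row_std, columns = col_pri),
  contributions = list(rows = row_pri, columns = V)
)

## Squared contribution coordinates sum to 1 along each axis
colSums(biplots$contributions$columns^2)
```

In the contribution variant sketched here, the squared coordinate of a column on an axis is its contribution to that axis, so the columns that matter most for the solution stand out from the origin.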

References

Aitchison, J. and Greenacre, M. (2002). Biplots of Compositional Data. Journal of the Royal Statistical Society: Series C (Applied Statistics), 51(4): 375-92. doi:10.1111/1467-9876.00275.

Greenacre, M. J. (2010). Biplots in Practice. Bilbao: Fundación BBVA.

Author

N. Frerebeau

Examples

## Replicate examples from Greenacre 2010, p. 59-68
data("countries")

## Compute principal components analysis
## All rows and all columns obtain the same weight
row_w <- rep(1 / nrow(countries), nrow(countries)) # 1/13
col_w <- rep(1 / ncol(countries), ncol(countries)) # 1/6
Y <- pca(countries, scale = FALSE, weight_row = row_w, weight_col = col_w)

## Row-metric-preserving biplot (form biplot)
biplot(Y, type = "form") +
  ggrepel::geom_label_repel()


## Column-metric-preserving biplot (covariance biplot)
biplot(Y, type = "covariance") +
  ggrepel::geom_label_repel()


## Replicate examples from Greenacre 2010, p. 79-88
data("benthos")

## Compute correspondence analysis
X <- ca(benthos)

## Row principal CA biplot
biplot(X, type = "row") +
  ggrepel::geom_label_repel()
#> Warning: ggrepel: 91 unlabeled data points (too many overlaps). Consider increasing max.overlaps


## Column principal CA biplot
biplot(X, type = "column") +
  ggrepel::geom_label_repel()
#> Warning: ggrepel: 81 unlabeled data points (too many overlaps). Consider increasing max.overlaps


## Contribution CA biplot
biplot(X, type = "contrib") +
  ggrepel::geom_label_repel()
#> Warning: ggrepel: 93 unlabeled data points (too many overlaps). Consider increasing max.overlaps