Computes a principal components analysis based on the singular value decomposition.

pca(object, ...)

# S4 method for data.frame
pca(
  object,
  center = TRUE,
  scale = TRUE,
  rank = NULL,
  sup_row = NULL,
  sup_col = NULL,
  weight_row = NULL,
  weight_col = NULL
)

# S4 method for matrix
pca(
  object,
  center = TRUE,
  scale = TRUE,
  rank = NULL,
  sup_row = NULL,
  sup_col = NULL,
  weight_row = NULL,
  weight_col = NULL
)

Arguments

object

An \(m \times p\) numeric matrix or a data.frame.

...

Currently not used.

center

A logical scalar: should the variables be shifted to be zero-centered?

scale

A logical scalar: should the variables be scaled to unit variance?

rank

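An integer value specifying the maximal number of components to be kept in the results. If NULL (the default), at most \(p - 1\) components will be returned; fewer are kept when the data cannot support that many (the Examples return 4 components for 5 individuals).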

sup_row

A numeric or logical vector specifying the indices of the supplementary rows (individuals).

sup_col

A numeric or logical vector specifying the indices of the supplementary columns (variables).

weight_row

A numeric vector specifying the active row (individual) weights. If NULL (the default), no weights are used.

weight_col

A numeric vector specifying the active column (variable) weights. If NULL (the default), no weights are used.
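
As a rough illustration of what center, scale and the singular value decomposition do together, here is a minimal base R sketch. It is not the package's implementation: the toy matrix Z, the uniform row weights 1/m and the use of the divisor m for the variance are all assumptions, chosen because they agree with the printed Examples, where the eigenvalues sum to the number of active variables.

```r
## Hypothetical toy data: 6 individuals, 3 variables (not from the package)
set.seed(1)
Z <- matrix(rnorm(18), nrow = 6, ncol = 3,
            dimnames = list(NULL, c("a", "b", "c")))
m <- nrow(Z)

## center = TRUE: shift each variable to zero mean
Zc <- sweep(Z, 2, colMeans(Z))

## scale = TRUE: divide by the standard deviation (divisor m, an assumption)
s <- sqrt(colSums(Zc^2) / m)
Zs <- sweep(Zc, 2, s, "/")

## SVD of the standardized table, assuming uniform row weights 1/m
dec <- svd(Zs / sqrt(m))
eig <- dec$d^2                            # eigenvalues
row_coord <- Zs %*% dec$v                 # row (individual) coordinates
col_coord <- sweep(dec$v, 2, dec$d, "*")  # column (variable) coordinates

## With scaled variables the eigenvalues sum to the number of variables
sum(eig)
```

Under these assumptions the column coordinates equal the correlations between the variables and the components, which is consistent with get_coordinates(X, margin = 2) and get_correlations(X) printing the same values in the Examples.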

Value

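A PCA object. Its results can be extracted with the get_*() accessors, for example get_coordinates() and get_eigenvalues() as used in the Examples.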

References

Lebart, L., Piron, M. and Morineau, A. Statistique exploratoire multidimensionnelle: visualisation et inférence en fouille de données. Paris: Dunod, 2006.

See also

get_*(), stats::predict(), svd()

Other multivariate analysis: ca(), predict()

Author

N. Frerebeau

Examples

## Load data
data("compiegne", package = "folio")

## Compute principal components analysis
X <- pca(compiegne, scale = TRUE, sup_col = 7:10)

## Get row coordinates
get_coordinates(X, margin = 1)
#>          F1         F2         F3         F4  .sup
#> 5 -4.028910 -0.8641397 -0.8323490 -0.5311213 FALSE
#> 4 -1.200000  1.3998925  2.0665385  0.5545804 FALSE
#> 3  1.281060  3.0459165 -1.4606637  0.3039523 FALSE
#> 2  2.524114 -0.5491907  0.6135577 -1.2700893 FALSE
#> 1  1.423737 -3.0324786 -0.3870835  0.9426779 FALSE

## Get column coordinates
get_coordinates(X, margin = 2)
#>             F1         F2           F3           F4  .sup
#> A -0.757352643  0.6450544  0.002336832  0.101569114 FALSE
#> B  0.731814163  0.5399164 -0.397242180  0.123032258 FALSE
#> C -0.654603931 -0.2027554  0.699509268  0.202659109 FALSE
#> D  0.822280000 -0.5033252  0.120820243 -0.236477847 FALSE
#> E  0.766689218  0.5390548  0.337863168 -0.086348632 FALSE
#> F  0.866067134 -0.4640471 -0.079538530  0.168111998 FALSE
#> K -0.003259205  0.9040812  0.275262947  0.326889788 FALSE
#> L  0.447986933  0.8295169 -0.333466553  0.003079837 FALSE
#> M  0.884262154 -0.3352261  0.309946481 -0.098168669 FALSE
#> N -0.902522062 -0.1616100 -0.047496227 -0.396333502 FALSE
#> O  0.110236742 -0.9024775  0.072642687  0.410006379 FALSE
#> P  0.425781193  0.5743904  0.672218354 -0.192115939 FALSE
#> G  0.887407063 -0.4351508  0.067691802 -0.136272946  TRUE
#> H  0.575300330 -0.7523059 -0.070885782  0.313114383  TRUE
#> I  0.718162798 -0.5002278  0.134453678 -0.464689714  TRUE
#> J  0.756161121 -0.6513134 -0.058568231  0.024103134  TRUE

## Get row contributions
get_contributions(X, margin = 1)
#>          F1        F2        F3        F4
#> 5 58.575577  3.476175  9.088098  8.860150
#> 4  5.196417  9.122694 56.020766  9.660123
#> 3  5.922161 43.188663 27.987396  2.901780
#> 2 22.991073  1.404042  4.938247 50.666637
#> 1  7.314773 42.808425  1.965492 27.911310

## Get correlations between variables and dimensions
get_correlations(X)
#>             F1         F2           F3           F4  .sup
#> A -0.757352643  0.6450544  0.002336832  0.101569114 FALSE
#> B  0.731814163  0.5399164 -0.397242180  0.123032258 FALSE
#> C -0.654603931 -0.2027554  0.699509268  0.202659109 FALSE
#> D  0.822280000 -0.5033252  0.120820243 -0.236477847 FALSE
#> E  0.766689218  0.5390548  0.337863168 -0.086348632 FALSE
#> F  0.866067134 -0.4640471 -0.079538530  0.168111998 FALSE
#> K -0.003259205  0.9040812  0.275262947  0.326889788 FALSE
#> L  0.447986933  0.8295169 -0.333466553  0.003079837 FALSE
#> M  0.884262154 -0.3352261  0.309946481 -0.098168669 FALSE
#> N -0.902522062 -0.1616100 -0.047496227 -0.396333502 FALSE
#> O  0.110236742 -0.9024775  0.072642687  0.410006379 FALSE
#> P  0.425781193  0.5743904  0.672218354 -0.192115939 FALSE
#> G  0.887407063 -0.4351508  0.067691802 -0.136272946  TRUE
#> H  0.575300330 -0.7523059 -0.070885782  0.313114383  TRUE
#> I  0.718162798 -0.5002278  0.134453678 -0.464689714  TRUE
#> J  0.756161121 -0.6513134 -0.058568231  0.024103134  TRUE

## Get eigenvalues
get_eigenvalues(X)
#>    eigenvalues  variance cumulative
#> F1    5.542281 46.185672   46.18567
#> F2    4.296316 35.802634   81.98831
#> F3    1.524642 12.705352   94.69366
#> F4    0.636761  5.306341  100.00000
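
The printed tables can be cross-checked against each other with a few lines of base R. The sketch below copies the F1 row coordinates and the eigenvalues from the output above; it assumes uniform row weights, so that an eigenvalue is the mean squared row coordinate, and it does not call the package.

```r
## F1 row coordinates copied from get_coordinates(X, margin = 1)
f1 <- c(-4.028910, -1.200000, 1.281060, 2.524114, 1.423737)

## Eigenvalue of F1 as the mean squared coordinate (uniform weights assumed)
eig_f1 <- sum(f1^2) / length(f1)

## Contribution of each row to F1, in percent
contrib <- 100 * f1^2 / sum(f1^2)

## Eigenvalues copied from get_eigenvalues(X)
eig <- c(5.542281, 4.296316, 1.524642, 0.636761)

## Percentage of variance and its running total
pct <- 100 * eig / sum(eig)
cum <- cumsum(pct)

round(eig_f1, 4)
round(contrib, 2)
round(pct, 2)
round(cum, 2)
```

Up to rounding of the printed digits, eig_f1 should match the first eigenvalue, contrib the F1 column of get_contributions(X, margin = 1), and pct and cum the variance and cumulative columns of get_eigenvalues(X).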