Closed
I believe this is related to issue #23. The calculated proportion of variance is consistent with stats::prcomp without scaling (fixed after #23), but with scale. = TRUE it differs widely. I believe the solution would be to check scale. and, if scale. = TRUE, use the scaled matrix to calculate the proportion of variance.
Based on your example, but using scale. = TRUE:
set.seed(1)
x <- matrix(rnorm(200), nrow=20)
p1 <- prcomp_irlba(x, n=3, scale. = TRUE)
summary(p1)
Importance of components:
                          PC1    PC2    PC3
Standard deviation     1.5308 1.3415 1.3121
Proportion of Variance 0.2769 0.2126 0.2034
Cumulative Proportion  0.2769 0.4895 0.6929
p2 <- prcomp(x, tol=0.7, scale = TRUE)
summary(p2)
Importance of first k=4 (out of 10) components:
                          PC1    PC2    PC3    PC4
Standard deviation     1.5308 1.3415 1.3121 1.1814
Proportion of Variance 0.2343 0.1800 0.1722 0.1396
Cumulative Proportion  0.2343 0.4143 0.5864 0.7260
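The standard deviations agree between the two summaries, so the difference has to come from the denominator, i.e. the total variance. As a minimal sketch in base R only (no irlba call; the names total_var and prop are just for illustration): once the columns are scaled to unit variance, the total variance equals the number of columns, and dividing the squared standard deviations by that total gives the proportions.

```r
set.seed(1)
x <- matrix(rnorm(200), nrow = 20)

# Reference fit from base R, with columns scaled to unit variance.
p2 <- prcomp(x, scale. = TRUE)

# After centering and scaling, every column has variance 1,
# so the total variance is simply ncol(x).
total_var <- sum(apply(scale(x), 2, stats::var))

# Proportion of variance for the first three components.
prop <- p2$sdev[1:3]^2 / total_var
round(prop, 4)
```

If a truncated summary instead divides by the total variance of the unscaled matrix, the proportions come out different, which is the kind of mismatch shown above.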
Using scaled matrix:
# Proportion of Variance:
round(summary(p1)$importance[1, ]^2 /
  sum(apply(Re(scale(x, center = colMeans(x), scale = TRUE)), 2, stats::var)), 4)
   PC1    PC2    PC3
0.2343 0.1800 0.1722
# Cumulative Proportion:
round(cumsum(summary(p1)$importance[1, ]^2 /
  sum(apply(Re(scale(x, center = colMeans(x), scale = TRUE)), 2, stats::var))), 4)
   PC1    PC2    PC3
0.2343 0.4143 0.5864
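The same idea could be wrapped in a small helper. This is only a sketch: prop_var is a hypothetical function, not part of irlba or its summary method, and it assumes the center and scale. arguments are the ones passed to the PCA call. It is checked here against stats::prcomp so that it runs without irlba installed.

```r
# Hypothetical helper (not irlba's code): proportion of variance for a
# truncated set of standard deviations, using the matrix as it was
# actually decomposed (centered and/or scaled).
prop_var <- function(sdev, x, center = TRUE, scale. = FALSE) {
  xs <- scale(x, center = center, scale = scale.)
  # Total variance as prcomp defines it: squared Frobenius norm / (n - 1).
  total_var <- sum(xs^2) / (nrow(xs) - 1)
  sdev^2 / total_var
}

set.seed(1)
x <- matrix(rnorm(200), nrow = 20)

# Compare against the full decomposition from base R.
p <- prcomp(x, scale. = TRUE)
pv <- prop_var(p$sdev[1:3], x, scale. = TRUE)
round(pv, 4)
round(cumsum(pv), 4)
```

Using sum(xs^2) / (n - 1) rather than per-column stats::var keeps the denominator consistent with the singular values even when center = FALSE, since stats::var would re-center the columns.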