
Issue with proportion of variance in prcomp_irlba with scaling #32

@Baurice


I believe this is related to issue #23. Without scaling, the proportion of variance reported by prcomp_irlba is consistent with stats::prcomp (fixed after #23), but with scale. = TRUE the two differ widely. I believe the solution is to check scale. and, when scale. = TRUE, compute the proportion of variance from the scaled matrix.

Based on your example, but using scale. = TRUE:

set.seed(1)
x  <- matrix(rnorm(200), nrow=20)
p1 <- prcomp_irlba(x, n=3, scale. = TRUE)
summary(p1)
Importance of components:
                          PC1    PC2    PC3
Standard deviation     1.5308 1.3415 1.3121
Proportion of Variance 0.2769 0.2126 0.2034
Cumulative Proportion  0.2769 0.4895 0.6929
p2 <- prcomp(x, tol=0.7, scale. = TRUE)
summary(p2)
Importance of first k=4 (out of 10) components:
                          PC1    PC2    PC3    PC4
Standard deviation     1.5308 1.3415 1.3121 1.1814
Proportion of Variance 0.2343 0.1800 0.1722 0.1396
Cumulative Proportion  0.2343 0.4143 0.5864 0.7260

Using scaled matrix:

# Proportion of Variance:
round(summary(p1)$importance[1, ]^2 /
      sum(apply(Re(scale(x, center=colMeans(x), scale=TRUE)), 2, stats::var)), 4)
   PC1    PC2    PC3 
0.2343 0.1800 0.1722
# Cumulative Proportion:
round(cumsum(summary(p1)$importance[1, ]^2 /
      sum(apply(Re(scale(x, center=colMeans(x), scale=TRUE)), 2, stats::var))), 4)
   PC1    PC2    PC3 
0.2343 0.4143 0.5864 
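
The proposed fix can be sketched as below. This is a minimal illustration, not irlba's actual code: total_variance is a hypothetical helper, and the assumption that the mismatch comes from dividing by the total variance of the unscaled matrix is my reading of the numbers above. To keep the sketch self-contained it takes the standard deviations from stats::prcomp rather than prcomp_irlba. Note that when every column is scaled to unit variance, the total variance is simply ncol(x).

```r
# Hypothetical helper (not part of irlba): total variance of the matrix
# that the PCA was actually run on.
total_variance <- function(x, center = TRUE, scale. = FALSE) {
  xs <- scale(x, center = center, scale = scale.)
  sum(apply(xs, 2, stats::var))
}

set.seed(1)
x <- matrix(rnorm(200), nrow = 20)

# Reference fit; stats::prcomp returns all 10 standard deviations here.
fit  <- stats::prcomp(x, scale. = TRUE)
sdev <- fit$sdev[1:3]

# Denominator taken from the unscaled matrix (assumed source of the
# mismatch) versus the scaled matrix (the proposed fix).
prop_unscaled <- sdev^2 / total_variance(x, scale. = FALSE)
prop_scaled   <- sdev^2 / total_variance(x, scale. = TRUE)

print(round(prop_unscaled, 4))
print(round(prop_scaled, 4))
```

With the scaled denominator, prop_scaled should agree with the "Proportion of Variance" row that summary() reports for the stats::prcomp fit.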
