Discrepancy in irlba with center argument #22

@LTLA

irlba seems to behave inconsistently depending on whether column-centring is performed explicitly or via the center argument. To demonstrate, I've mocked up some single-cell RNA-seq data, with genes in rows and cells in columns:

set.seed(1000)
ncells <- 100
ngenes <- 10000
counts <- matrix(as.double(rpois(ncells*ngenes, lambda=100)), ncol=ncells)
centers <- rowMeans(counts)

If I apply irlba to the transposed matrix (i.e., genes are now columns, cells are rows), centring either explicitly outside the function or via the center argument, I get substantially different results:

library(irlba)
set.seed(100)
out <- irlba(t(counts - centers), nu=10, nv=10)
head(out$d)
## [1] 1105.339 1091.932 1086.880 1085.875 1083.415 1080.327

set.seed(100)
out2 <- irlba(t(counts), center=centers, nu=10, nv=10)
head(out2$d)
## [1] 3961.623 2629.205 1221.687 1190.174 1170.183 1165.110
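
One way to decide which of the two results to trust is to compare against a full SVD of the explicitly centred matrix, which does not go through irlba at all. The following is a minimal sketch using only base R; the names centred, centred2 and ref are introduced here for illustration and are not part of the original report. It first checks that the explicit centring really does remove each gene's mean, then computes reference singular values:

```r
# Rebuild the mock data from the report.
set.seed(1000)
ncells <- 100
ngenes <- 10000
counts <- matrix(as.double(rpois(ncells*ngenes, lambda=100)), ncol=ncells)
centers <- rowMeans(counts)

# Explicit centring: 'centers' is recycled down the rows of 'counts',
# so each gene has its own mean subtracted before transposing.
centred <- t(counts - centers)
stopifnot(max(abs(colMeans(centred))) < 1e-8)

# Same centring via scale(), as a cross-check on the recycling.
centred2 <- scale(t(counts), center=centers, scale=FALSE)
stopifnot(isTRUE(all.equal(centred, centred2, check.attributes=FALSE)))

# Full SVD as a reference; only the singular values are needed.
ref <- svd(centred, nu=0, nv=0)
head(ref$d)
```

Whichever of out$d and out2$d agrees with head(ref$d) to within numerical tolerance is the consistent one; the other code path would then be the one to investigate.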

I would have expected some small differences due to random initialization or numerical precision, but the differences in the singular values are far too large for that: the leading value is 1105.339 with explicit centring and 3961.623 with center. On a related note, running the following code in a fresh R session results in a segfault ("memory not mapped"):

set.seed(1000)
ncells <- 100
ngenes <- 10000
counts <- matrix(rpois(ncells*ngenes, lambda=100), ncol=ncells)
centers <- rowMeans(counts)
out <- irlba::irlba(t(counts), center=centers, nu=10, nv=10)
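
The only difference between this data and the first example is that as.double() is no longer wrapped around rpois(), so counts is stored as an integer matrix. The sketch below checks the storage types on a small stand-in matrix without calling irlba, and shows one way to coerce to double while keeping the dimensions; counts_int and counts_dbl are illustrative names, not part of the original report:

```r
set.seed(1000)

# Small stand-in for 'counts'; rpois() returns integers here.
counts_int <- matrix(rpois(20, lambda=100), ncol=4)
typeof(counts_int)
## [1] "integer"

# Coerce in place to double precision; dimensions are preserved.
counts_dbl <- counts_int
storage.mode(counts_dbl) <- "double"
typeof(counts_dbl)
## [1] "double"

stopifnot(identical(dim(counts_dbl), dim(counts_int)))
```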

Presumably this is related to counts being stored as integers, as coercing to double precision avoids the problem. Here's my session information:

R version 3.4.0 Patched (2017-04-24 r72627)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /home/cri.camres.org/lun01/Software/R/R-3-4-branch_devel/lib/libRblas.so
LAPACK: /home/cri.camres.org/lun01/Software/R/R-3-4-branch_devel/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] irlba_2.2.1   Matrix_1.2-11

loaded via a namespace (and not attached):
[1] compiler_3.4.0  grid_3.4.0      lattice_0.20-35
