Word vector model improvements #868

@davidmezzetti

Description

Make the following improvements to word vector model vectorization.

  • Lazy load the word vector model with multi-process loading. This avoids unnecessary memory usage with low volumes of data.
  • Allow specifying the number of parallel processes with the parallel parameter. When set to True, all available CPUs are used. When set to False, only a single CPU is used.
  • Set the imap chunksize to the encodebatch size for more efficient multi-processing.
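
The three changes above can be sketched roughly as follows. This is a minimal illustration and not the project's actual implementation. The toy dictionary model and the names load_model, get_model, resolve_processes, embed and vectorize are all hypothetical. Accepting an integer for parallel and skipping the pool when the input fits in one batch are also assumptions about how the described behavior could work.

```python
import os
from multiprocessing import Pool

# Hypothetical module-level cache: one model instance per process
MODEL = None


def load_model():
    """Hypothetical loader. A real one would read word vectors from disk."""
    return {"hello": [1.0, 0.0], "world": [0.0, 1.0]}


def get_model():
    """Lazily load the model the first time the current process needs it."""
    global MODEL
    if MODEL is None:
        MODEL = load_model()
    return MODEL


def resolve_processes(parallel):
    """Map the parallel setting to a process count (assumed semantics).

    True -> all available CPUs, False/None -> 1, integer -> that many.
    """
    if parallel is True:
        return os.cpu_count() or 1
    if parallel is False or parallel is None:
        return 1
    return max(1, int(parallel))


def embed(text):
    """Average the vectors of known tokens. Returns zeros if none are known."""
    model = get_model()
    vectors = [model[token] for token in text.lower().split() if token in model]
    if not vectors:
        return [0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def vectorize(texts, parallel=True, encodebatch=32):
    """Embed texts, using a process pool only when it is likely to pay off."""
    processes = resolve_processes(parallel)

    # Assumption: small inputs stay in the current process, so no worker
    # processes (and no extra model copies) are created for them
    if processes == 1 or len(texts) <= encodebatch:
        return [embed(text) for text in texts]

    # Each worker loads its own model lazily on the first embed() call.
    # chunksize matches the batch size so workers receive whole batches.
    with Pool(processes) as pool:
        return list(pool.imap(embed, texts, chunksize=encodebatch))
```

On platforms that start worker processes with spawn, a call such as vectorize(texts, parallel=4) should sit under an `if __name__ == "__main__":` guard. Matching chunksize to the batch size reduces the number of inter-process messages compared with the default chunksize of 1.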
