Skip to content

Use good data structures in Redux state, to avoid quadratic work #3949

@gnprice

Description

@gnprice

Priority checklist (with updates from our progress):


As discussed in #3339, there are a lot of places where we maintain large amounts of data -- like all the users in the realm -- in data structures like Array which make it inefficient to find the data we want at a given time, like a particular user.

Some other data structures are maintained as objects used as maps: notably, the collection of all the messages we have from the server. These are efficient for lookup... but both these objects-as-maps, and the Arrays, are extremely inefficient to build up for large amounts of data in the Redux style we use. Specifically, building a new array like [...state, newItem] or a new object like { ...state, [newKey]: newItem } has to copy the entire existing array or object, taking linear time -- which means building up an array or object with N items this way takes quadratic time O(N^2). Demo at #3339 (comment) .

So, we should fix that.

There are a couple of potential stages here:

  • As a modest but useful improvement, we can use Map instead of object-as-map. This is still quadratic time to build, but over 2x faster (Avoid linear scans through all users, streams, etc. #3339 (comment)), a solid constant-factor improvement. It's also cleaner in the code and for type-checking.
  • For a more complete solution, we'd want to be maintaining our data structures in an efficient way. Efficiently maintaining data structures without mutating them is a bit subtle, but there are well-understood techniques for doing so (called "persistent data structures"). One implementation of those in JS is Immutable.js.

The main technical obstacle I believe we'll need to resolve is how to serialize and deserialize these data structures for redux-persist. This shouldn't fundamentally be complicated once we work out how; but it'll require some studying of the docs of redux-persist and friends and of Immutable.js, then possibly of implementations where docs are inadequate, in order to see how to wire up appropriate serialization and deserialization functions. Relatedly, we'll want to think through and test the migration path for data serialized by a previous version of the app.

I've forked off #3950 for using Map in place of object-as-map (the first stage above), which will also be an opportunity to figure out this aspect.

A quick extra note, in addition to what's there:


Specifically about Immutable.js: one thing I remember taking from when I was reading about this area last year is that people get annoyed with converting their objects to and from Immutable's types.

My thinking on that -- untested, so maybe this is hard for some reason -- is that that could be addressed by

  • using Immutable for the big collections, like the user-id -> user map; but
  • using plain old JS objects (just as we do today) for the struct-like objects inside those collections, like the individual users.

Within this repo there's some ancient history with Immutable, from 2016: it was used at first and then was ripped out, in a series ending at 2eb654f. It looks like the strategy there was to use it even for the struct-like objects inside the collections; as expected, this meant its API showed up all over the code, and the people working on the app at the time decided they found that too annoying to keep doing.

Previous related chat discussion here:
https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/pure.20functions/near/793927

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions