Skip to content

Conversation

paultranvan
Copy link
Contributor

@paultranvan paultranvan commented Feb 27, 2025

We used to create a new reference on the documents state for any query.
This is problematic because we do a referential comparison on the updated documents state to evaluate if there is any update on it, resulting on all store's queries re-evaluation.
So, basically, we were doing a full queries re-evaluation for any query execution, even if the same query is run several time with no change.
This is particularly problematic when there are a lot of queries and/or
queries with many documents returned, as their re-evaluation will be
costful. Moreover, the document merging itself is costful.

The comparison between old and new docs is crucial for performances, so
we did some performances measures: we took a worst case where we fill
the store with 25K docs with a query and run again the same query with
no document change.
What we measure here is the time taken in the extractAndMergeDocument method:

  • Before: 73892 ms
  • Compare with lodash isEqual: 47530 ms
  • Compare with stringify + djb2 hash: 19785 ms
  • Compare with fast-deep-equal: 2404 ms (used solution)
  • Compare with JSON.stringify: 1899 ms
  • Compare with rev (for the record): 123ms

ℹ️ The JSON.stringify comparison is faster, but it does not guarantee
key order: 2 documents with same value but different order will be seen
as different. For safety, we prefer fast-deep-equal, slighly slower, but
more reliable.

ℹ️ The rev measurement is here because it would be the ideal solution:
assume any document change implies a new revision, as it is made with
CouchDB / PouchDB. However, this is more complicated in practice:

  • Some docs returned by the stack do not have revision
  • Some docs returned by the stack might have changed, with the same
    revision: for instance the thumbnails links for files change over
    time.
  • If we have a partial doc query, i.e. a query with .select(fields),
    and then another query returning a common doc, the content might be
    different between the 2 queries, with the same doc revision.
    As this is more complicated to deal with, we do not rely on revision
    yet.

We also improved the document merging. For 1K docs to merge:

  • Before: 1626ms
  • After: 278ms

@paultranvan paultranvan marked this pull request as draft February 27, 2025 17:29
@Crash--
Copy link
Contributor

Crash-- commented Feb 28, 2025

This is particularly problematic when there are a lot of queries

This is why we should add a garbage collector for queries! You've the paper for that :p.

@paultranvan paultranvan force-pushed the fix-store-documents-updates branch 2 times, most recently from 6c1d55c to 291a75a Compare March 7, 2025 13:09
@paultranvan paultranvan marked this pull request as ready for review March 7, 2025 13:09
@paultranvan paultranvan force-pushed the fix-store-documents-updates branch 4 times, most recently from 8c2c1c8 to a2bc4d0 Compare March 7, 2025 14:11
@paultranvan paultranvan force-pushed the fix-store-documents-updates branch 2 times, most recently from 9e79d2c to 2b1dc8e Compare March 11, 2025 16:05
@paultranvan paultranvan force-pushed the fix-store-documents-updates branch 3 times, most recently from fbbc164 to db2aabf Compare March 11, 2025 16:56
We used to create a new reference on the `documents` state for any
query.
This is problematic because we do a referential comparison on the
updated `documents` state to evaluate if there is any update on it,
resulting on all store's queries re-evaluation.
So, basically, we were doing a full queries re-evaluation for *any*
query execution, even if the same query is run several time with
no change.
This is particularly problematic when there are a lot of queries and/or
queries with many documents returned, as their re-evaluation will be
costful. Moreover, the document merging itself is costful.

The comparison between old and new docs is crucial for performances, so
we did some performances measures: we took a worst case where we fill
the store with 25K docs with a query and run again the same query with
no document change.

What we measure here is the time taken in the `extractAndMergeDocument`
method:

- **Before: 73892 ms**
- Compare with lodash isEqual: 47530 ms
- Compare with stringify + djb2 hash: 19785 ms
- **Compare with fast-deep-equal: 2404 ms (used solution)**
- Compare with JSON.stringify: 1899 ms
- Compare with rev (for the record): 123ms

ℹ️  The JSON.stringify comparison is faster, but it does not guarantee
key order: 2 documents with same value but different order will be seen
as different. For safety, we prefer fast-deep-equal, slighly slower, but
more reliable.

ℹ️  The rev measurement is here because it would be the ideal solution:
assume any document change implies a new revision, as it is made with
CouchDB / PouchDB. However, this is more complicated in practice:
- Some docs returned by the stack do not have revision
- Some docs returned by the stack might have changed, with the same
  revision: for instance the thumbnails links for files change over
time.
- If we have a partial doc query, i.e. a query with `.select(fields)`,
  and then another query returning a common doc, the content might be
different between the 2 queries, with the same doc revision.
As this is more complicated to deal with, we do not rely on revision
yet.
It appears @fastify/deepmerge is faster than merge from lodash.
To measure performances, we first query 1000 docs. Then, we mutate each
doc, and make the query again.
We measure the whole `extractAndMergeDocument` function.

Before: 1626ms
After: 278ms
@paultranvan paultranvan force-pushed the fix-store-documents-updates branch from db2aabf to b0714ba Compare March 11, 2025 17:03
@paultranvan paultranvan merged commit f482946 into master Mar 11, 2025
3 checks passed
@paultranvan paultranvan deleted the fix-store-documents-updates branch March 11, 2025 17:40
Ldoppea added a commit to cozy/create-cozy-app that referenced this pull request Apr 3, 2025
In cozy/cozy-client#1594 we added `@fastify/deepmerge` module into
cozy-client

Unfortunately this library is packaged as-is, without any transpilation
done. So old projects like cozy-home or cozy-drive will struggle to use
it as its code contains some modern instructions like optional chaining

To make it work, we want to configure webpack to transpile this library
Ldoppea added a commit to cozy/create-cozy-app that referenced this pull request Apr 3, 2025
In cozy/cozy-client#1594 we added the @fastify/deepmerge node module to
cozy-client

Unfortunately this library is packaged as-is, without any transpilation
done. So old projects like cozy-home or cozy-drive will struggle to use
it as its code contains some modern instructions like optional chaining

With current webpack configuration, babel won't transpile external code

We want to configure webpack to transpile the @fastify/deepmerge module
Ldoppea added a commit to cozy/cozy-drive that referenced this pull request Apr 17, 2025
Since cozy/cozy-client#1594 the `selectors.spec.js` test where failing

This was due to incorrect mock for `client.requestQuery` that would
return incorrect values with mixed folders and files where
queryDefinition did request for only one type

With previous cozy-client implementation, the `extractAndMergeDocument`
would always return a new reference for the documents list. Then on the
store's Reducer, the `haveDocumentsChanged` variable would always be
`true`, so the queries helpers would try to auto update the queries and
so would replay them with Sift

The combination of incorrect mock + incorrect `dir_id` fields would
result to Sift not being able to retrieve existing files and so it
would chose to empty the entire `data` field from the `file` query in
the redux store

Also the `directory` query would contains both files and folders, so
the test did see 13 items in queries, but for the wrong reason:
- Instead of having: `files query=> 10 files` and `directories query => 3 folders`
- We had: `files query=> 0 files` and `directories query => 3 folders and 10 files`

With the new cozy-client implementation, the `extractAndMergeDocument`
would return the reference for the documents list as both `file` and
`directory` queries where mocked to receive the exact same content, so
the documents list was identical. Then the on the store's Reducer, the
`haveDocumentsChanged` variable would be `false`, and so the queries
helpers would not replay queries with Sift

So we obtained the following result:
`files query=> 3 folders and 10 files` and
`directories query => 3 folders and 10 files`, resulting to 26 items

This shows that with the new implementation, cozy-client is more
sensitive to impossible tests configurations

By fixing the mock, we now correctly have: `files query=> 10 files` and
`directories query => 3 folders`

The fix on the `dir_id` is not mandatory here, because there are no
intersection between both queries results, but we chose to still add it
in order to have a clean test setup
test: Correctly set dir_id for `selectors` tests
Ldoppea added a commit to cozy/cozy-drive that referenced this pull request Apr 17, 2025
Since cozy/cozy-client#1594 the `selectors.spec.js` test where failing

This was due to incorrect mock for `client.requestQuery` that would
return incorrect values with mixed folders and files where
queryDefinition did request for only one type

With previous cozy-client implementation, the `extractAndMergeDocument`
would always return a new reference for the documents list. Then on the
store's Reducer, the `haveDocumentsChanged` variable would always be
`true`, so the queries helpers would try to auto update the queries and
so would replay them with Sift

The combination of incorrect mock + incorrect `dir_id` fields would
result to Sift not being able to retrieve existing files and so it
would chose to empty the entire `data` field from the `file` query in
the redux store

Also the `directory` query would contains both files and folders, so
the test did see 13 items in queries, but for the wrong reason:
- Instead of having: `files query=> 10 files` and `directories query => 3 folders`
- We had: `files query=> 0 files` and `directories query => 3 folders and 10 files`

With the new cozy-client implementation, the `extractAndMergeDocument`
would return the reference for the documents list as both `file` and
`directory` queries where mocked to receive the exact same content, so
the documents list was identical. Then the on the store's Reducer, the
`haveDocumentsChanged` variable would be `false`, and so the queries
helpers would not replay queries with Sift

So we obtained the following result:
`files query=> 3 folders and 10 files` and
`directories query => 3 folders and 10 files`, resulting to 26 items

This shows that with the new implementation, cozy-client is more
sensitive to impossible tests configurations

By fixing the mock, we now correctly have: `files query=> 10 files` and
`directories query => 3 folders`

The fix on the `dir_id` is not mandatory here, because there are no
intersection between both queries results, but we chose to still add it
in order to have a clean test setup
test: Correctly set dir_id for `selectors` tests
Ldoppea added a commit to cozy/cozy-drive that referenced this pull request Apr 17, 2025
Since cozy/cozy-client#1594 the `selectors.spec.js` test where failing

This was due to incorrect mock for `client.requestQuery` that would
return incorrect values with mixed folders and files where
queryDefinition did request for only one type

With previous cozy-client implementation, the `extractAndMergeDocument`
would always return a new reference for the documents list. Then on the
store's Reducer, the `haveDocumentsChanged` variable would always be
`true`, so the queries helpers would try to auto update the queries and
so would replay them with Sift

The combination of incorrect mock + incorrect `dir_id` fields would
result to Sift not being able to retrieve existing files and so it
would chose to empty the entire `data` field from the `file` query in
the redux store

Also the `directory` query would contains both files and folders, so
the test did see 13 items in queries, but for the wrong reason:
- Instead of having: `files query=> 10 files` and `directories query => 3 folders`
- We had: `files query=> 0 files` and `directories query => 3 folders and 10 files`

With the new cozy-client implementation, the `extractAndMergeDocument`
would return the reference for the documents list as both `file` and
`directory` queries where mocked to receive the exact same content, so
the documents list was identical. Then the on the store's Reducer, the
`haveDocumentsChanged` variable would be `false`, and so the queries
helpers would not replay queries with Sift

So we obtained the following result:
`files query=> 3 folders and 10 files` and
`directories query => 3 folders and 10 files`, resulting to 26 items

This shows that with the new implementation, cozy-client is more
sensitive to impossible tests configurations

By fixing the mock, we now correctly have: `files query=> 10 files` and
`directories query => 3 folders`

The fix on the `dir_id` is not mandatory here, because there are no
intersection between both queries results, but we chose to still add it
in order to have a clean test setup
test: Correctly set dir_id for `selectors` tests
Ldoppea added a commit to cozy/cozy-drive that referenced this pull request Apr 17, 2025
Since cozy/cozy-client#1594 the `selectors.spec.js` test where failing

This was due to incorrect mock for `client.requestQuery` that would
return incorrect values with mixed folders and files where
queryDefinition did request for only one type

With previous cozy-client implementation, the `extractAndMergeDocument`
would always return a new reference for the documents list. Then on the
store's Reducer, the `haveDocumentsChanged` variable would always be
`true`, so the queries helpers would try to auto update the queries and
so would replay them with Sift

The combination of incorrect mock + incorrect `dir_id` fields would
result to Sift not being able to retrieve existing files and so it
would chose to empty the entire `data` field from the `file` query in
the redux store

Also the `directory` query would contains both files and folders, so
the test did see 13 items in queries, but for the wrong reason:
- Instead of having: `files query=> 10 files` and `directories query => 3 folders`
- We had: `files query=> 0 files` and `directories query => 3 folders and 10 files`

With the new cozy-client implementation, the `extractAndMergeDocument`
would return the reference for the documents list as both `file` and
`directory` queries where mocked to receive the exact same content, so
the documents list was identical. Then the on the store's Reducer, the
`haveDocumentsChanged` variable would be `false`, and so the queries
helpers would not replay queries with Sift

So we obtained the following result:
`files query=> 3 folders and 10 files` and
`directories query => 3 folders and 10 files`, resulting to 26 items

This shows that with the new implementation, cozy-client is more
sensitive to impossible tests configurations

By fixing the mock, we now correctly have: `files query=> 10 files` and
`directories query => 3 folders`

The fix on the `dir_id` is not mandatory here, because there are no
intersection between both queries results, but we chose to still add it
in order to have a clean test setup
test: Correctly set dir_id for `selectors` tests
Ldoppea added a commit to cozy/cozy-drive that referenced this pull request Apr 17, 2025
Since cozy/cozy-client#1594 the `selectors.spec.js` test where failing

This was due to incorrect mock for `client.requestQuery` that would
return incorrect values with mixed folders and files where
queryDefinition did request for only one type

With previous cozy-client implementation, the `extractAndMergeDocument`
would always return a new reference for the documents list. Then on the
store's Reducer, the `haveDocumentsChanged` variable would always be
`true`, so the queries helpers would try to auto update the queries and
so would replay them with Sift

The combination of incorrect mock + incorrect `dir_id` fields would
result to Sift not being able to retrieve existing files and so it
would chose to empty the entire `data` field from the `file` query in
the redux store

Also the `directory` query would contains both files and folders, so
the test did see 13 items in queries, but for the wrong reason:
- Instead of having: `files query=> 10 files` and `directories query => 3 folders`
- We had: `files query=> 0 files` and `directories query => 3 folders and 10 files`

With the new cozy-client implementation, the `extractAndMergeDocument`
would return the reference for the documents list as both `file` and
`directory` queries where mocked to receive the exact same content, so
the documents list was identical. Then the on the store's Reducer, the
`haveDocumentsChanged` variable would be `false`, and so the queries
helpers would not replay queries with Sift

So we obtained the following result:
`files query=> 3 folders and 10 files` and
`directories query => 3 folders and 10 files`, resulting to 26 items

This shows that with the new implementation, cozy-client is more
sensitive to impossible tests configurations

By fixing the mock, we now correctly have: `files query=> 10 files` and
`directories query => 3 folders`

The fix on the `dir_id` is not mandatory here, because there are no
intersection between both queries results, but we chose to still add it
in order to have a clean test setup
zatteo pushed a commit to cozy/cozy-drive that referenced this pull request Apr 17, 2025
Since cozy/cozy-client#1594 the `selectors.spec.js` test where failing

This was due to incorrect mock for `client.requestQuery` that would
return incorrect values with mixed folders and files where
queryDefinition did request for only one type

With previous cozy-client implementation, the `extractAndMergeDocument`
would always return a new reference for the documents list. Then on the
store's Reducer, the `haveDocumentsChanged` variable would always be
`true`, so the queries helpers would try to auto update the queries and
so would replay them with Sift

The combination of incorrect mock + incorrect `dir_id` fields would
result to Sift not being able to retrieve existing files and so it
would chose to empty the entire `data` field from the `file` query in
the redux store

Also the `directory` query would contains both files and folders, so
the test did see 13 items in queries, but for the wrong reason:
- Instead of having: `files query=> 10 files` and `directories query => 3 folders`
- We had: `files query=> 0 files` and `directories query => 3 folders and 10 files`

With the new cozy-client implementation, the `extractAndMergeDocument`
would return the reference for the documents list as both `file` and
`directory` queries where mocked to receive the exact same content, so
the documents list was identical. Then the on the store's Reducer, the
`haveDocumentsChanged` variable would be `false`, and so the queries
helpers would not replay queries with Sift

So we obtained the following result:
`files query=> 3 folders and 10 files` and
`directories query => 3 folders and 10 files`, resulting to 26 items

This shows that with the new implementation, cozy-client is more
sensitive to impossible tests configurations

By fixing the mock, we now correctly have: `files query=> 10 files` and
`directories query => 3 folders`

The fix on the `dir_id` is not mandatory here, because there are no
intersection between both queries results, but we chose to still add it
in order to have a clean test setup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants