Skip to content

Faster dataset search page updates #8930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 9, 2025
Merged

Conversation

wardi
Copy link
Contributor

@wardi wardi commented May 8, 2025

Fixes #8914

Proposed fixes:

  • use hx-boost on facet links, show more/less, facet pills, search form, pagination buttons
  • render only the search results and facet snippets in response to htmx requests
  • update select-switch.js to submit form with htmx when it hx-boost is set on form

example previous search page update time: 2.29s, with these changes: 0.05s (45x faster) on a debug instance with none of the static assets cached. On a perfectly cached production instance the worst case improvement is about 3x.

Other changes:

  • switch the "Remove" facet pills "x" icon tooltip attribute from data-bs-toggle to data-toggle because the former stays active on the htmx request (tooltip appearance changes)
  • moved blocks from templates/package/search.html to templates/package/snippets/search_results.html
    • page_primary_action
    • form
    • package_search_results_list

Features:

  • includes tests covering changes
  • includes updated documentation
  • includes user-visible changes
  • includes API changes
  • includes bugfix for possible backport

@wardi wardi marked this pull request as draft May 8, 2025 19:56
@wardi wardi force-pushed the 8914-faster-search-updates branch from d80cf2b to bc4e194 Compare May 13, 2025 11:26
@wardi
Copy link
Contributor Author

wardi commented May 13, 2025

We could avoid the extra form div if we decided to set htmx.config.disableInheritance = true globally. https://htmx.org/docs/#config

It might break the code of other early adopters of HTMX in ckan but might be worth it, WDYT @mutantsan @amercader ?

@mutantsan
Copy link
Contributor

  1. The tooltip wouldn't work with data-toggle attribute. As according to the bootstrap, we should initialize the tooltips manually. And we do it in main.js with. As you already've implemented the reinitialization for data-module scripts, you should do the same for the tooltip. To follow the CKAN approach, the tooltip initialization should probably be moved to a separate JS script, that will be on an element, that requires initialization.
$(function() {
  $('[data-bs-toggle="tooltip"]').each(function (index, element) {
    bootstrap.Tooltip.getOrCreateInstance(element)
  })
})
  1. I'd leave it as is, unless we will have problems with the current approach. Changing the global behavior doesn't feel right. Also, we might change the config only for the search page if we really need it.
    a. Btw, if I delete the hx-boost="false" from the wrapper div, it still works for me. What does it fix?
  2. Do we really need the select-switch.js now? If you'll add these attributes on a form hx-trigger="change, submit" it should handle the sort.
  3. Check the mobile search-facets, it seems, that the close event isn't initialized after swap, so it would be impossible to exit the modal.
  4. Apply search and go back with browser back button, you'll see the JS error in the console this.el.select2 is not a function ...
  5. Do we need some kind of load indicator in cases, when we have some heavy logic on backend, and the search isn't really immediate (like it is when you test it locally)

@mutantsan
Copy link
Contributor

Also, when you'll fix the tooltip, you'll probably encounter a weird issue, when clicking on the pill will re-render the page with new data, but the tooltip will keep hanging there. This happens, because the tooltip is attached by bootstrap to the body. We need to dispose of the tooltip in this case. The straightforward way is to add this to htmx-ckan.js:

document.body.addEventListener('htmx:beforeSwap', function (evt) {
    // Dispose any active tooltips before HTMX swaps the DOM
    document.querySelectorAll('[data-bs-toggle="tooltip"]').forEach(el => {
        const tooltip = bootstrap.Tooltip.getInstance(el);
        if (tooltip) {
            tooltip.dispose();
        }
    });
});

But if you'll move the tooltip initialization to a separate JS script, you might try to add a click event on an element with it, and inside the event handler try to locate the tooltip and call tooltip.dispose() method.

Or we can drop the tooltip from these elements and rely on title, which renders the browser tooltip anyway.

@wardi
Copy link
Contributor Author

wardi commented May 15, 2025

  1. The tooltip wouldn't work with data-toggle attribute. As according to the bootstrap, we should initialize the tooltips manually. And we do it in main.js with. As you already've implemented the reinitialization for data-module scripts, you should do the same for the tooltip. To follow the CKAN approach, the tooltip initialization should probably be moved to a separate JS script, that will be on an element, that requires initialization.

Thank you this worked great.

3. I'd leave it as is, unless we will have problems with the current approach. Changing the global behavior doesn't feel right. Also, we might change the config only for the search page if we really need it.
   a. Btw, if I delete the `hx-boost="false"` from the wrapper `div`, it still works for me. What does it fix?

I've removed the hx-boost="false" wrapper div and replaced it with hx-disinherit="hx-boost" on the form. What this does is prevent any links that could be added by template overrides from being boosted because those links might point to other pages.

4. Do we really need the `select-switch.js` now? If you'll add these attributes on a form `hx-trigger="change, submit"` it should handle the sort.

Love this, done.

We could even have the trigger run when the user pauses while entering search input, but I've left that out of this PR because it's a larger behavior change. The search is executed now when the user leaves the search box and they don't have to click the button anymore, however.

5. Check the mobile `search-facets`, it seems, that the close event isn't initialized after swap, so it would be impossible to exit the modal.

Now fixed with more hacks in htmx-ckan.js (let me know if this looks ok)

6. Apply search and go back with browser back button, you'll see the JS error in the console `this.el.select2 is not a function ...`

IIUC the select2.js isn't being initialized when htmx navigates back or forwards. I'm not sure what causes this file to be run in the first place or the best way to fix it for the htmx navigation case, do you have any suggestion?

7. Do we need some kind of load indicator in cases, when we have some heavy logic on backend, and the search isn't really immediate (like it is when you test it locally)

I don't think so. The response should always be almost instant with solr and we don't have a load indicator for reloading the page itself that does even more work.

@wardi
Copy link
Contributor Author

wardi commented May 15, 2025

@mutantsan re: select2 failure on history navigation: I've tried disabling the history cache but that has no effect. I've also failed to even find an event that I can catch when navigating back and forwards. We might need https://github.com/bigskysoftware/htmx-extensions/blob/main/src/restored/README.md 🤔

@wardi wardi mentioned this pull request May 26, 2025
@wardi
Copy link
Contributor Author

wardi commented May 27, 2025

@mutantsan I think I've fixed the select2 issue with a small change to our autocomplete module 52232f4 and a select2 upgrade from #8967

This version of select2 doesn't destroy the original select element making the fix much easier.

@mutantsan
Copy link
Contributor

@wardi, did you this error, btw? Can we fix it?

image

@wardi
Copy link
Contributor Author

wardi commented May 29, 2025

@mutantsan I don't see these errors, what do I need to do to trigger them?

@mutantsan
Copy link
Contributor

@wardi TBH, I don't do anything special 🤔Just apply search, facets, sort, go back and forth with back/forward browser buttons
also, I use firefox browser

@wardi
Copy link
Contributor Author

wardi commented May 29, 2025

I'm using firefox 137.0.2 . I see only warnings and no errors.
Screenshot from 2025-05-29 13-28-57

@wardi wardi marked this pull request as ready for review May 29, 2025 19:40
@wardi
Copy link
Contributor Author

wardi commented May 29, 2025

@aleeexgreeen if you have a moment please take a look at my midnight-blue template changes here to make sure I didn't revert or break anything.

@aleeexgreeen
Copy link
Contributor

@wardi No issues/nothing broken on the midnight blue side

@wardi
Copy link
Contributor Author

wardi commented May 30, 2025

@amercader this is ready for review. Everything looks good in my testing, but are you able to reproduce the errors that @mutantsan is reporting?

@amercader
Copy link
Member

amercader commented May 30, 2025

Was looking at this right now, I can't see the error. I wonder if it's caused by running different htmx versions? @mutantsan if you clear local storage do you still get the errors?

@@ -0,0 +1 @@
Navigate dataset search pages, facets and sort order without page reloads
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we mention here any changes at the template level that extension developers need to be aware of? e.g. how to make custom search pages HTMX enabled, what the new *_htmx.html do, etc. We can polish it before the next release but it's proaably easier to write up now while it's fresh

@amercader
Copy link
Member

This looks great, and it's obviously a massive user experience improvements, I've just added a few comments about documenting the changes.

I read about hx-boost and its potential issues. Just so I understand better, the difference between using hx-boost vs regular hx-post or hx-get with an hx-target attribute is just the hx-boost will hook into the page navigation? I suppose that's what we want in the search page to keep the current behaviour with standard links.
It does create weird things though, e.g, when clicking a pagination link the page scrolls to the top (correctly) but if you click the back button you get scrolled down to the pagination again. This is not a deal breaker but we should be on the look out for more things like this.

@mutantsan
Copy link
Contributor

@amercader @wardi
Here, take a look. I've tried the same in the Chrome browser
htmx-history

@mutantsan
Copy link
Contributor

I did a little debug, that's the info about the error:
image

The HTMX uses the browser local storage and stores there the cache for the page. Each unique URL will be cached. Since I click here and there and everything goes into the URL, we cache it.

We can disable the cache if we want.

By default, browsers have something like 5MB or 10MB max size per domain. In my case it was 5MB. And the HTMX cache was 10.
(htmx.config.historyCacheSize). So he though he has 5MB more to write, but there were none.

Now to the quota exceeding. I found this script to check the storage size

image

I found it weird, that 1 page adds 1480kb to the cache size. Although the response is 20~ KB. But it appears, that the HTMX doesn't cache the response direct (which is just a snippet). When the debug mode was enabled, it probably stored it as well. Because without it, multiple cache items take much less space

image

So, I assume we shouldn't worry about this, as it doesn't break anything and won't be an issue in production. Maybe, the only thing we can be aware of, is disable cache where we don't need it (for example, search should be already fast, do we need to cache it?) or tune the HTMX cache size.

@wardi
Copy link
Contributor Author

wardi commented Jun 1, 2025

I'm fine with disabling the history if it's creating errors for users in some cases, and because caching is really hard to get right.

We could disable caching history globally with htmx.config.historyEnabled https://htmx.org/docs/#config

@wardi
Copy link
Contributor Author

wardi commented Jun 1, 2025

OTOH when navigating back and forward on a web browser between normal pages the browser won't send a new request to the server. This history cache is emulating normal browser behavior so maybe we should keep the default?

@mutantsan
Copy link
Contributor

I suggest to lower the htmx.config.historyCacheSize from 10 to 5 globally. I assume, this should help to fix the issue. You can check your browser(s) and see what quota do you have for local storage. If some have a default 10, some 5, we can target the lower value.

I don’t think disabling cache globally is a good idea. At max, we can disable it with hx-history. And even when the quota exceeding error happens, it doesn’t really break anything.

The only important thing we have to consider here - what if it caches the value, but the uncached response by the URL would be different, cause something is changed already?

@amercader
Copy link
Member

Great sleuthing @mutantsan.

IIUC the issue:

  • HTMX caches all innerHTML body contents, not just the sent snippets
  • On debug mode this is much larger than on prod mode (~250Kb vs 50Kb). That's because of Flask Debug Toolbar
  • If the browser max size < htmx max size AND the max size is exceeded -> we get the error

If that is correct, I think what @mutantsan suggests to lower the htmx max value to 5 might be harmless, although I still haven't been able to reproduce it neither in Firefox nor on Chrome. In fact the local storage (as reported by the script you linked to ) never goes above 5Mb no matter how many HTMX requests I do.

In any case, as it seems it only happens on debug mode and is not reproducible I think we can merge this as it is, but...

@amercader
Copy link
Member

Another thing I noticed that worries me slightly is that when clicking something that does an htmx request and then looking at the Page Source (i.e Ctrl+U) this only displays the returned snippet (e.g. the source code starts with <section class="module" id="search-results" hx-swap-oob="true" hx-swap="none">).
Happens in both Chrome and FF.
Other than being really off-putting I wonder if this is a symptom of something that we are not doing correctly?

@wardi
Copy link
Contributor Author

wardi commented Jun 2, 2025

@amercader In developer tools we can still see the full version of actual DOM / page source in the inspector (and export it to HTML if needed). Maybe "View source" always returns the raw contents of the last server response instead of what is generated by JS?

@amercader
Copy link
Member

I've been investigating a bit more and it looks like this is the standard behaviour of hx-boost, so happy to merge this once the minor issues are sorted

@wardi
Copy link
Contributor Author

wardi commented Jun 3, 2025

@amercader minor issues addressed in bd445df let's wait for @smotornyuk to merge one of the select2 upgrade PRs and merge back from master before merging this one.

@smotornyuk
Copy link
Member

smotornyuk commented Jun 9, 2025

@amercader In developer tools we can still see the full version of actual DOM / page source in the inspector (and export it to HTML if needed). Maybe "View source" always returns the raw contents of the last server response instead of what is generated by JS?

The "show page source" returns the last response of the server for the current URL or fetches a new one if there is no cached response. For example, you can open JS console and run window.history.pushState({}, '', '/api/action/status_show'). This will change the URL in the address bar, but won't update the page content. Nevertheless, the browser thinks that URL reflects the current page, so if you focus on the page and press Ctrl+U, you'll see the API response instead of the real source of the current content. Here browser has no cached version so it will initiate a new request.

In case of HTMX, when boosted fragment is pulled, it rewrites the URL via history.pushState. When you press Ctrl-U, browser sees that you have just received the content from the given URL and shows this cached content, without a new real request to the server. That's why you see only one fragment. To see the whole page, you can press F5 - the source page will reload and trigger the real request to the server, pulling the whole page

BTW, I merged select2 upgrade, you can proceed with this PR

@amercader amercader merged commit 40bbc4b into master Jun 9, 2025
43 of 45 checks passed
@amercader amercader deleted the 8914-faster-search-updates branch June 9, 2025 13:59
@amercader
Copy link
Member

Great work @wardi , thanks @mutantsan and @smotornyuk for the reviews and insights

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Faster dataset search page updates
5 participants