philroche

Add new metadata key "persistent_urls" which removes the hash from all database urls when set to "true"

This PR is just to gauge whether this, or something like it, is something you would consider merging.

I understand the reason why the substring of the hash is included in the url, but
there are some use cases where the urls should persist across deployments, for example
bookmarks or scripts that use the JSON API.

This is the initial commit for this feature. Tests and documentation updates to follow.

@philroche philroche force-pushed the feature/persistent_urls branch from 800e6ff to 0d77a89 Compare May 14, 2018 09:41
@simonw
Owner

simonw commented May 16, 2018

The URL does persist across deployments already, in that you can use the URL without the hash and it will redirect to the current location. Here's an example of that: https://san-francisco.datasettes.com/sf-trees/Street_Tree_List.json

This also works if you attempt to hit the incorrect hash, e.g. if you have deployed a new version of the database with an updated hash. The old hash will redirect, e.g. https://san-francisco.datasettes.com/sf-trees-c4b972c/Street_Tree_List.json
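As a rough illustration of the redirect behaviour described above, the decision could be sketched like this. The function name `resolve_database_path`, its parameters, and the 7-character hash prefix (inferred from the `c4b972c` example) are assumptions for illustration, not Datasette's actual code.

```python
def resolve_database_path(requested, name, current_hash, prefix_length=7):
    """Return the path segment to redirect to, or None if no redirect is needed.

    requested:     first path segment from the URL, e.g. "sf-trees" or "sf-trees-c4b972c"
    name:          known database name, e.g. "sf-trees"
    current_hash:  full content hash of the currently deployed database
    prefix_length: assumed number of hash characters used in the URL
    """
    expected = "{}-{}".format(name, current_hash[:prefix_length])
    if requested == expected:
        # Already the current hashed URL, serve it directly.
        return None
    if requested == name or requested.startswith(name + "-"):
        # Missing or stale hash: redirect to the current hashed URL.
        return expected
    raise LookupError("unknown database: {}".format(requested))
```

With a current hash beginning `c4b972c`, both `sf-trees` and a stale `sf-trees-0000000` would resolve to `sf-trees-c4b972c`.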

If you serve Datasette from an HTTP/2 proxy (I've been using Cloudflare for this) you won't even have to pay the cost of the redirect - Datasette sends a Link: <URL>; rel=preload header with those redirects, which causes Cloudflare to push out the redirected source as part of that HTTP/2 request. You can fire up the Chrome DevTools to watch this happen.

r.headers["Link"] = "<{}>; rel=preload".format(path)
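For context, a minimal framework-free sketch of how a redirect carrying that header might be assembled. The helper name `build_hash_redirect` and the choice of a 302 status are assumptions made for the example; only the `Link` header format comes from the line above.

```python
def build_hash_redirect(path):
    """Return (status, headers) for a redirect to the hashed URL `path`."""
    headers = {
        # Where the client should go next.
        "Location": path,
        # Hint that lets an HTTP/2 proxy push the redirect target
        # alongside the redirect response itself.
        "Link": "<{}>; rel=preload".format(path),
    }
    # 302 is assumed here for illustration.
    return 302, headers
```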

All of that said... I'm not at all opposed to this feature. For consistency with other Datasette options (e.g. --cors) I'd prefer to do this as an optional argument to the datasette serve command - something like this:

datasette serve mydb.db --no-url-hash

@simonw
Owner

simonw commented May 16, 2018

The principal benefit provided by the hash URLs is that Datasette can set a far-future cache expiry header on every response. This is particularly useful for JavaScript API work as it makes fantastic use of the browser's cache. It also means that if you are serving your API from behind a caching proxy like Cloudflare you get a fantastic cache hit rate.

An option to serve URLs without the hash would also need to turn off the cache headers.

Maybe the option should support both? If you hit a page with the hash in the URL you still get the cache headers, but hits to the URL without the hash serve uncached content directly.
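One way the "support both" idea might look, as a sketch only. The function name `cache_headers`, the one-year `max_age` default, and the use of `no-cache` for hashless URLs are all assumptions for illustration; the thread does not specify exact header values.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def cache_headers(url_has_hash, max_age=ONE_YEAR_SECONDS):
    """Pick Cache-Control headers based on whether the URL includes the hash."""
    if url_has_hash:
        # Content behind a hashed URL never changes, so it can be
        # cached far into the future.
        return {"Cache-Control": "max-age={}".format(max_age)}
    # A hashless URL may point at new data after a deployment, so
    # caches are asked to revalidate (assumed policy).
    return {"Cache-Control": "no-cache"}
```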

@philroche
Author

Excellent, I was not aware of the auto redirect to the new hash. My bad.

This solves my use case.

I do agree that your suggested --no-url-hash approach is much neater. I will investigate.

@philroche philroche closed this May 21, 2018