-
Notifications
You must be signed in to change notification settings - Fork 129
Remove code related to http caching #644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
duckdb_extension_load(httpfs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using the public one? (like json
/icu
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we still want it statically compiled into the binary, and since httpfs is in a separate repo these days we have to get it like this afaik. If you know of some other way to do that, which does not require us to have an additional submodule, then I'm all for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't duckdb_extension_load(httpfs)
just do exactly that?
For example with json:
y=# select * from duckdb.raw_query($$select extension_name, loaded, installed, install_mode, install_path from duckdb_extensions() where extension_name = 'json';$$);
NOTICE: result: extension_name loaded installed install_mode install_path
VARCHAR BOOLEAN BOOLEAN VARCHAR VARCHAR
[ Rows: 1]
json true true STATICALLY_LINKED (BUILT-IN)
raw_query
-----------
(1 row)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I thought JSON was also out of tree...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this is it: 7579326
ad5ec30
to
cbe5b24
Compare
I would prefer to see more progress on supporting community extensions (it's not clear whether community extensions are currently supported) and possibly supporting "persistent" settings (i.e. That said, if we are going to remove / replace it -- which I do think we should -- I would certainly like to remove it before 1.0. I would be curious, among those watching the repo, whether removing this feature without an immediate alternative would break their usage / use cases. |
We had implemented our own http caching in #88 when nothing like that existed yet. Now there are two different threads that are implementing something like this in a better way than we have done. So this PR removes our custom caching code from our codebase. Instead in a future release we'll make sure that pg_duckdb works well with these at least one of these other caching solutions. For reference the two projects that are currently implementing caching are: 1. An official in-memory cache by DuckDB, which will eventually improved to also support on-disk caching: duckdb/duckdb#16463 2. A community provided extension called `cache_httpfs`: https://github.com/dentiny/duck-read-cache-fs
7579326
to
d0106ee
Compare
My main concern now is that extensions introduce settings that the user may want to change. Since a pg_duckdb user does not directly interact with the DuckDB instance lifecycle, it's not clear if and when the user needs to (re-)set these settings. Basically, it's the same problem as secrets. That said, the defaults for cache_httpfs seem sane to me, though I would like to see this feature implemented which would then have some sort of setting to limit the disk cache size. That said, the current cache implementation does not have such a limit either—but it also does not eagerly cache everything. |
Okay, I'm merging this then. And we can add something to configure settings (or even general "duckdb setup commands") in a follow up PR. |
Since #644 we need to have git installed to build our custom DuckDB.
Since #644 we need to have git installed to build our custom DuckDB. Our docker build container didn't have that so our nightly Docker builds were broken for the past few days.
Since #644 we need to have git installed to build our custom DuckDB. Our docker build container didn't have that so our nightly Docker builds were broken for the past few days.
We had implemented our own http caching in #88 when nothing like that existed yet. Now there are two different threads that are implementing something like this in a better way than we have done. So this PR removes our custom caching code from our codebase. Instead in a future release we'll make sure that pg_duckdb works well with these at least one of these other caching solutions.
For reference the two projects that are currently implementing caching are:
cache_httpfs
: https://github.com/dentiny/duck-read-cache-fs