-
Notifications
You must be signed in to change notification settings - Fork 2.5k
[Python] Rework internals of object registration #12625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Python] Rework internals of object registration #12625
Conversation
…ked by the PythonReplacementScan to insert registered objects
@Mytherin should I add support for replacement scans to DESCRIBE before this gets merged, or are we fine with this regression in behavior for now? I should probably rewrite the tests that I have marked as expecting an error now if they don't really need to use DESCRIBE for now |
Ah.. seems I didn't test this enough, registered objects do not take precedence over temporary views, which is a different behavior than before. Since nobody has made a report of this issue, it's safe to assume nobody is relying on registered objects taking precedence over temporary views. But it is something to keep in mind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! One comment. Let's also look at making DESCRIBE work for replacement scans in a separate PR.
…d to concatenate them in 'bind_basetableref.cpp'
…m/Tishj/duckdb into python_rework_registering_objects
Thanks! LGTM |
Merge pull request duckdb/duckdb#12625 from Tishj/python_rework_registering_objects
Previously we would create TEMP VIEWs for every registered object.
This is unintuitive behavior and could also silently cause DuckDB to overwrite temporary views created by users.
In #12161 we cleaned up how replacement scans take place and fix the issues around lifetime of referenced python objects.
This enables us to use this same mechanism for registered objects, everywhere that a replacement scan can be used, registered objects can also be used[1].
Internal details:
Through a ClientContextState we create a bridge between the DuckDBPyConnection and the PythonReplacementScan.
The
registered_objects
are now stored inside the (new)PythonContextState
.DuckDBPyConnection.register
adds to this map.DuckDBPyConnection.unregister
removes from this map.PythonReplacementScan
now tries to find a suitable registered object before any frames are scanned.Other changes
DuckDBPyConnection.from_df
andDuckDBPyConnection.from_arrow
had weird implementations that were handrolling the replacement, causing a couple issues.These have been simplified to just do a replacement into a TableRef and a TableFunctionRelation is created from that.
Bind(BaseTableRef &ref)
handrolled logic to concatenate catalog+schema+table names, this has been unified withParseInfo::QualifierToString
The
external_dependencies
fromClientContext
was removed, this was only used in the python client and was needed because created VIEWs could not contain ExternalDependencies, so the Connection was responsible for keeping the objects that the VIEWs relied on alive - this is no longer necessary because of [Python Dev] Push CTE internally for every (python) replacement scan that occurred. #12161[1] One place where this isn't true is DESCRIBE, as it only supports VIEWs and TABLEs at the moment, support would need to be added for replacement scans there to enable the behavior we lost here.