Skip to content

Conversation

Tishj
Copy link
Contributor

@Tishj Tishj commented Jun 20, 2024

Previously we would create TEMP VIEWs for every registered object.
This is unintuitive behavior and could also silently cause DuckDB to overwrite temporary views created by users.

In #12161 we cleaned up how replacement scans take place and fix the issues around lifetime of referenced python objects.

This enables us to use this same mechanism for registered objects, everywhere that a replacement scan can be used, registered objects can also be used[1].

Internal details:

Through a ClientContextState we create a bridge between the DuckDBPyConnection and the PythonReplacementScan.
The registered_objects are now stored inside the (new) PythonContextState.

DuckDBPyConnection.register adds to this map.
DuckDBPyConnection.unregister removes from this map.
PythonReplacementScan now tries to find a suitable registered object before any frames are scanned.

Other changes

  • DuckDBPyConnection.from_df and DuckDBPyConnection.from_arrow had weird implementations that were handrolling the replacement, causing a couple issues.
    These have been simplified to just do a replacement into a TableRef and a TableFunctionRelation is created from that.

  • Bind(BaseTableRef &ref) handrolled logic to concatenate catalog+schema+table names, this has been unified with ParseInfo::QualifierToString

  • The external_dependencies from ClientContext was removed, this was only used in the python client and was needed because created VIEWs could not contain ExternalDependencies, so the Connection was responsible for keeping the objects that the VIEWs relied on alive - this is no longer necessary because of [Python Dev] Push CTE internally for every (python) replacement scan that occurred. #12161

[1] One place where this isn't true is DESCRIBE, as it only supports VIEWs and TABLEs at the moment, support would need to be added for replacement scans there to enable the behavior we lost here.

@duckdb-draftbot duckdb-draftbot marked this pull request as draft June 20, 2024 17:21
@Tishj Tishj marked this pull request as ready for review June 20, 2024 17:21
@duckdb-draftbot duckdb-draftbot marked this pull request as draft June 20, 2024 19:17
@Tishj Tishj marked this pull request as ready for review June 20, 2024 19:17
@Tishj
Copy link
Contributor Author

Tishj commented Jun 21, 2024

@Mytherin should I add support for replacement scans to DESCRIBE before this gets merged, or are we fine with this regression in behavior for now?

I should probably rewrite the tests that I have marked as expecting an error now if they don't really need to use DESCRIBE for now

@Mytherin Mytherin changed the base branch from feature to main June 21, 2024 12:36
@Tishj
Copy link
Contributor Author

Tishj commented Jun 25, 2024

Ah.. seems I didn't test this enough, registered objects do not take precedence over temporary views, which is a different behavior than before.
That is kind of the issue this solves as well though, because previously created temporary views by the same name are currently silently deleted, replaced by the registered object.

Since nobody has made a report of this issue, it's safe to assume nobody is relying on registered objects taking precedence over temporary views. But it is something to keep in mind.

@duckdb-draftbot duckdb-draftbot marked this pull request as draft June 25, 2024 21:11
@Tishj Tishj marked this pull request as ready for review June 25, 2024 22:35
@duckdb-draftbot duckdb-draftbot marked this pull request as draft June 28, 2024 12:31
@Mytherin Mytherin marked this pull request as ready for review June 28, 2024 12:31
Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! One comment. Let's also look at making DESCRIBE work for replacement scans in a separate PR.

@duckdb-draftbot duckdb-draftbot marked this pull request as draft July 1, 2024 14:22
@Tishj Tishj marked this pull request as ready for review July 1, 2024 14:43
@duckdb-draftbot duckdb-draftbot marked this pull request as draft July 2, 2024 06:58
@Tishj Tishj marked this pull request as ready for review July 3, 2024 14:34
@duckdb-draftbot duckdb-draftbot marked this pull request as draft July 8, 2024 08:12
@Tishj Tishj marked this pull request as ready for review July 8, 2024 08:49
@Mytherin Mytherin merged commit ae3ffa0 into duckdb:main Jul 11, 2024
@Mytherin
Copy link
Collaborator

Thanks! LGTM

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Jul 17, 2024
Merge pull request duckdb/duckdb#12625 from Tishj/python_rework_registering_objects
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants