cpwithers
Contributor

Brief summary of the change made

Add STREAM as a pre-table keyword for the Databricks dialect. Without it, the parser mistakes the STREAM keyword and the function that follows it for a table name and its alias.

Take this example query: SELECT * FROM STREAM read_files('gs://my-bucket/avroData', includeExistingFiles => false);

Before this change, the parser produced the following:

[L:  1, P: 10]      |            from_clause:
// Removed
[L:  1, P: 15]      |                                naked_identifier:             'STREAM'
[L:  1, P: 21]      |                        whitespace:                           ' '
[L:  1, P: 22]      |                        alias_expression:
[L:  1, P: 22]      |                            naked_identifier:                 'read_files'

After this change the output is:

[L:  1, P: 10]      |            from_clause:
// Removed
[L:  1, P: 15]      |                    from_expression_element:
[L:  1, P: 15]      |                        keyword:                              'STREAM'
[L:  1, P: 21]      |                        whitespace:                           ' '
[L:  1, P: 22]      |                        table_expression:
[L:  1, P: 22]      |                            table_reference:
[L:  1, P: 22]      |                                naked_identifier:             'READ_FILES'
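
To make the before/after difference concrete, here is a minimal toy sketch in Python. It is not SQLFluff's grammar code: `parse_from_element` and `PRE_TABLE_KEYWORDS` are hypothetical names invented for illustration, and the function only classifies bare words. It shows why an identifier-first parser reads `STREAM read_files` as a table plus alias, and how consuming an optional leading keyword first changes the result.

```python
# Toy illustration only; this is NOT SQLFluff's implementation.
# PRE_TABLE_KEYWORDS is a hypothetical set mirroring the keyword this PR adds.
PRE_TABLE_KEYWORDS = frozenset({"STREAM"})


def parse_from_element(tokens, pre_table_keywords=frozenset()):
    """Classify the bare words of one FROM element.

    tokens: words that follow FROM, e.g. ["STREAM", "read_files"].
    Returns a list of (segment_type, raw) pairs.
    """
    result = []
    rest = list(tokens)
    # Consume an optional leading keyword before looking for the table.
    if rest and rest[0].upper() in pre_table_keywords:
        result.append(("keyword", rest.pop(0)))
    # The next word is taken as the table (or table function) name.
    if rest:
        result.append(("table_reference", rest.pop(0)))
    # A word left over after the table is read as an implicit alias.
    if rest:
        result.append(("alias_expression", rest.pop(0)))
    return result


words = ["STREAM", "read_files"]
before = parse_from_element(words)                     # no pre-table keywords known
after = parse_from_element(words, PRE_TABLE_KEYWORDS)  # STREAM recognised first
print(before)
print(after)
```

With no pre-table keywords, `STREAM` is classified as the table reference and `read_files` as its alias, matching the shape of the old output; with `STREAM` in the set, it becomes a keyword and `read_files` the table reference.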

Fixes: #6414

Are there any other side effects of this change that we should be aware of?

None

Pull Request checklist

  • Please confirm you have completed any of the necessary steps below.

  • Included test cases to demonstrate any code changes, which may be one or more of the following:

    • .yml rule test cases in test/fixtures/rules/std_rule_cases.
    • .sql/.yml parser test cases in test/fixtures/dialects (note YML files can be auto generated with tox -e generate-fixture-yml).

modifiedBefore => current_date());

-- Reads a streaming table
SELECT * FROM STREAM read_files('gs://my-bucket/avroData', includeExistingFiles => false);
Contributor

@keraion keraion May 21, 2025

nit: newline

Contributor Author

fixed, thanks.

@cpwithers cpwithers force-pushed the feat/databricks-dialect-update branch from b9e53b8 to 1e18b62 Compare May 21, 2025 15:40
Contributor

Coverage Results ✅

Name    Stmts   Miss  Cover   Missing
-------------------------------------
TOTAL   19692      0   100%

249 files skipped due to complete coverage.

Contributor

@keraion keraion left a comment

LGTM. Thanks for the contribution!

@keraion keraion added this pull request to the merge queue May 22, 2025
Merged via the queue into sqlfluff:main with commit 518970e May 22, 2025
28 checks passed
@cpwithers cpwithers deleted the feat/databricks-dialect-update branch May 22, 2025 08:36
thomascjohnson pushed a commit to thomascjohnson/sqlfluff that referenced this pull request Jun 17, 2025
…qlfluff#6910)

Co-authored-by: Chris Withers <chris.withers@flagstoineim.com>
Development

Successfully merging this pull request may close these issues.

New databricks streaming table function makes file unparsable
2 participants