Description
Search before asking
- I searched the issues and found no similar issues.
What Happened
SQLFluff fails to parse a Databricks SQL notebook that starts with a Markdown magic cell (interpreted as comments) when the notebook has been exported from Databricks or committed to a Git repo. The default format of a leading Markdown cell adds an empty line before the first actual cell.
Edit:
Just tested: this behavior persists for any type of leading cell. Any non-SQL cell as the first cell produces this issue. R, Python, and shell cells are all saved with a trailing empty line before the command terminator.
Expected Behaviour
The notebook gets parsed and linted.
Observed Behaviour
SQLFluff fails on parsing and refuses to parse the rest of the file.
When the empty line before the first -- COMMAND ---------- is removed, the file is parsed without problems.
Adding any keyword, statement, etc. also results in a successfully parsed file, although that content is then rendered as part of the Markdown cell and not interpreted or executed later. So the result is effectively an invalid notebook.
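The first workaround above (removing the empty line before the first command terminator) can be sketched as a small pre-processing step run before linting. This is a minimal sketch, not part of SQLFluff: the function name `strip_blank_before_first_command` and the separator pattern are assumptions for illustration, and whether Databricks accepts the modified file on re-import has not been checked.

```python
import re

# Assumed shape of the Databricks cell terminator line, e.g. "-- COMMAND ----------".
COMMAND_RE = re.compile(r"^-- COMMAND -+\s*$")


def strip_blank_before_first_command(source: str) -> str:
    """Drop blank lines that directly precede the first command terminator.

    Hypothetical helper: only the first terminator is touched, everything
    else in the notebook source is returned unchanged.
    """
    lines = source.split("\n")
    for idx, line in enumerate(lines):
        if COMMAND_RE.match(line):
            start = idx
            # Walk backwards over any blank lines just above the terminator.
            while start > 0 and lines[start - 1].strip() == "":
                start -= 1
            return "\n".join(lines[:start] + lines[idx:])
    # No terminator found: nothing to do.
    return source
```

The cleaned text would be written to a temporary copy and linted instead of the original, so the committed notebook stays in the format Databricks exported.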
How to reproduce
Parsable, but invalid SQL Notebook.
-- Databricks notebook source
-- MAGIC %md
-- MAGIC # MarkDown Cell
-- COMMAND ----------
-- MAGIC %md
-- MAGIC # Test
-- COMMAND ----------
-- DBTITLE 1,Create Widget
CREATE WIDGET TEXT base_ts_txt DEFAULT "2024-05-29 18:00:00";
Not parsable (an empty line precedes the first -- COMMAND ---------- line):
-- Databricks notebook source
-- MAGIC %md
-- MAGIC # MarkDown Cell

-- COMMAND ----------
-- MAGIC %md
-- MAGIC # Test
-- COMMAND ----------
-- DBTITLE 1,Create Widget
CREATE WIDGET TEXT base_ts_txt DEFAULT "2024-05-29 18:00:00";
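To compare the two variants without creating files, the notebook text can be generated in Python and passed to SQLFluff's simple API. This is a sketch under assumptions: `build_notebook`, `parses`, and `report` are hypothetical helper names, only the empty line before the first terminator is varied, and `parses` assumes that `sqlfluff.parse` raises a `ValueError` subclass when the input contains unparsable sections. The outcome may differ between SQLFluff versions.

```python
SEPARATOR = "-- COMMAND ----------"

# The three cells from the example notebook in this report.
CELLS = [
    "-- MAGIC %md\n-- MAGIC # MarkDown Cell",
    "-- MAGIC %md\n-- MAGIC # Test",
    "-- DBTITLE 1,Create Widget\n"
    'CREATE WIDGET TEXT base_ts_txt DEFAULT "2024-05-29 18:00:00";',
]


def build_notebook(blank_before_first_command: bool) -> str:
    """Return the example notebook, optionally with the problematic empty line."""
    first_gap = "\n\n" if blank_before_first_command else "\n"
    return (
        "-- Databricks notebook source\n"
        + CELLS[0] + first_gap + SEPARATOR + "\n"
        + CELLS[1] + "\n" + SEPARATOR + "\n"
        + CELLS[2] + "\n"
    )


def parses(sql: str) -> bool:
    """Return True if SQLFluff parses the text without raising (needs sqlfluff installed)."""
    import sqlfluff  # third-party; imported lazily so build_notebook works without it

    try:
        sqlfluff.parse(sql, dialect="databricks")
    except ValueError:
        # Assumption: parse failures surface as a ValueError subclass.
        return False
    return True


def report() -> None:
    """Print the parse result for both variants."""
    for blank in (False, True):
        print(f"blank line before first COMMAND: {blank} -> parses: {parses(build_notebook(blank))}")
```

Calling `report()` in an environment with SQLFluff installed prints one line per variant, which makes it easy to check whether a given release still shows the behaviour described here.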
Dialect
Databricks
Version
3.2.5
Configuration
[tool.sqlfluff.core]
dialect = "databricks"
templater = "jinja"
sql_file_exts = ".sql"
verbose = 2
exclude_rules = ["RF04", "ST05", "ST06","ST09", "LT05", "AL01", "ST01", "RF02", "RF03","AL09"]
ignore_files = []
[tool.sqlfluff.indentation]
indent_unit = "space"
tab_space_size = 4
allow_implicit_indents = "True"
[tool.sqlfluff.layout.type.groupby_clause]
line_position = "alone:strict"
[tool.sqlfluff.layout.type.statement_terminator]
spacing_before = "touch"
line_position = "trailing"
[tool.sqlfluff.layout.type.comma]
spacing_before = "touch"
line_position = "leading"
[tool.sqlfluff.layout.type.end_of_file]
spacing_before = "touch"
[tool.sqlfluff.rules.capitalisation.keywords]
capitalisation_policy = "upper"
[tool.sqlfluff.rules.convention.quoted_literals]
preferred_quoted_literal_style = "double_quotes"
[tool.sqlfluff.templater.python.context]
catalog = "xxx"
current_catalog = "xxx"
target_schema = "xxx"
source_schema = "xxx"
pos_ts = "2024-01-01 00:00:00"
pfc_ts = "2024-01-01 00:00:00"
delivery_begin_incl = "2024-01-01 00:00:00"
delivery_end_excl = "2024-01-01 00:00:00"
future_begin_incl = "2024-01-01 00:00:00"
base_ts_txt = "2024-01-01 00:00:00"
pid = ""
proc_id = ""
Are you willing to work on and submit a PR to address the issue?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct