Add Apache Doris SQL dialect support to SQLFluff #6979
This PR adds comprehensive support for the Apache Doris SQL dialect to SQLFluff, extending the existing MySQL grammar with Doris-specific syntax features. The implementation includes support for CREATE TABLE, DROP TABLE, INSERT statements, and various Doris-specific table properties and clauses.

Features Added

1. CREATE TABLE Statement Support

- **Engine Types**: Support for Doris-specific engines including `olap`, `mysql`, `elasticsearch`, `hive`, `hudi`, `iceberg`, `jdbc`, `broker`
- **Key Types**: Support for `DUPLICATE KEY`, `AGGREGATE KEY`, and `UNIQUE KEY` syntax
- **Partitioning**:
  - Auto partitioning with `AUTO PARTITION BY RANGE(function) ()`
  - Manual range partitioning with `PARTITION BY RANGE(columns)`
  - List partitioning with `PARTITION BY LIST(columns)`
- **Distribution**: Support for `DISTRIBUTED BY HASH(columns)` and `DISTRIBUTED BY RANDOM`
- **Rollup Definitions**: Support for `ROLLUP` clauses
- **Table Properties**: Support for `PROPERTIES` clauses with key-value pairs
- **Index Definitions**: Support for Doris-specific index types (`INVERTED`, `BITMAP`, `BLOOM_FILTER`)
- **Generated Columns**: Support for `GENERATED ALWAYS AS` syntax
- **CREATE TABLE ... AS SELECT (CTAS)**: Full support for CTAS with optional properties
- **CREATE TABLE ... LIKE**: Support for creating tables based on existing table structure
- **External Tables**: Support for `CREATE EXTERNAL TABLE` and `CREATE TEMPORARY EXTERNAL TABLE`

2. DROP TABLE Statement Support

- Support for `DROP TABLE [IF EXISTS] [db_name.]table_name [FORCE]` syntax
- Database-qualified table names
- Optional `IF EXISTS` clause
- Optional `FORCE` keyword for unrecoverable table deletion

3. INSERT Statement Support

- Basic `INSERT INTO table VALUES (...)` syntax
- Column specification with `INSERT INTO table (col1, col2) VALUES (...)`
- `DEFAULT` value support in VALUES clauses
- Multiple row insertion with comma-separated VALUES
- `INSERT ... SELECT` statements
- **Partition specification** with `PARTITION (p1, p2)` clause
- **Label specification** with `WITH LABEL label1` clause
- Complex combinations of all INSERT features

4. Doris-Specific Keywords and Grammar

- Added Doris-specific reserved and unreserved keywords
- Support for Doris aggregation functions (`MAX`, `MIN`, `REPLACE`, `SUM`, `BITMAP_UNION`, `HLL_UNION`, `QUANTILE_UNION`)
- Support for `STORED` and `VIRTUAL` keywords in generated columns
- Proper handling of Doris-specific data types and constraints

Technical Implementation

- **Base Dialect**: Extends MySQL dialect to leverage existing MySQL grammar compatibility
- **Custom Segments**: Implements Doris-specific grammar segments for complex syntax
- **Keyword Management**: Properly manages Doris-specific keywords without conflicts
- **Grammar Extensions**: Adds Doris-specific grammar rules while maintaining compatibility

Testing

Comprehensive test coverage includes:

- **CREATE TABLE tests**: 15+ test files covering various table creation scenarios
- **DROP TABLE tests**: 4 test files covering different drop scenarios
- **INSERT tests**: 9 test files covering various insert patterns
- **Hive integration tests**: Multiple test files for Hive catalog integration
- **Complex syntax tests**: Edge cases and combinations of multiple features

Compatibility

- **MySQL Compatibility**: Maintains compatibility with existing MySQL syntax
- **Doris Standards**: Follows official Apache Doris documentation and syntax
- **Backward Compatibility**: No breaking changes to existing functionality

Documentation

The implementation follows the official Apache Doris documentation:

- [CREATE TABLE](https://doris.apache.org/docs/dev/sql-manual/sql-statements/table-and-view/table/CREATE-TABLE)
- [DROP TABLE](https://doris.apache.org/docs/dev/sql-manual/sql-statements/table-and-view/table/DROP-TABLE)
- [INSERT](https://doris.apache.org/docs/dev/sql-manual/sql-statements/data-modification/DML/INSERT)

Files Changed

- `src/sqlfluff/dialects/dialect_doris.py` - Main dialect implementation
- `test/fixtures/dialects/doris/` - Comprehensive test suite (30+ test files)

This PR provides complete Apache Doris SQL dialect support, enabling SQLFluff to properly parse, lint, and format Doris SQL code while maintaining compatibility with existing MySQL-based workflows.
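As a rough illustration of the syntax this description lists, the sketch below collects one invented statement per statement type and passes them to SQLFluff's simple Python API. The table, column, partition, and label names are hypothetical, the dialect label `doris` is an assumption based on the file name `dialect_doris.py`, and whether each statement parses cleanly depends on the SQLFluff version installed.

```python
"""Hypothetical Doris statements linted through SQLFluff's simple API (a sketch)."""

# Invented examples; every table, column, partition, and label name is illustrative.
DORIS_EXAMPLES = {
    "create_table": """
CREATE TABLE IF NOT EXISTS example_db.events (
    event_id BIGINT,
    event_date DATE,
    payload VARCHAR(256)
)
ENGINE = olap
DUPLICATE KEY (event_id)
PARTITION BY RANGE (event_date) (
    PARTITION p202401 VALUES LESS THAN ('2024-02-01')
)
DISTRIBUTED BY HASH (event_id) BUCKETS 10
PROPERTIES ("replication_num" = "1");
""",
    "drop_table": "DROP TABLE IF EXISTS example_db.events FORCE;\n",
    "insert": (
        "INSERT INTO example_db.events PARTITION (p202401) WITH LABEL label1 "
        "(event_id, event_date, payload) VALUES (1, '2024-01-15', DEFAULT);\n"
    ),
}


def lint_examples(dialect="doris"):
    """Return {example name: list of violation dicts} for each statement.

    The dialect label is an assumption; sqlfluff is imported lazily so the
    example statements can be inspected without the package installed.
    """
    import sqlfluff  # third-party package

    return {
        name: sqlfluff.lint(sql, dialect=dialect)
        for name, sql in DORIS_EXAMPLES.items()
    }
```

Calling `lint_examples()` with SQLFluff installed returns the violations per statement; parse failures, if any, would show up among them rather than raising.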
I've taken an initial pass through this and it seems solid so far. The linter checks will fail on the missing newlines and there appears to be an empty test sql file. There are some opportunities for simplifying some of the classes down, but overall, very nice work. I'll take a second pass when I get the chance.
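The two problems mentioned in this review (missing trailing newlines and an empty test SQL file) can be spotted locally before pushing. The following is a minimal sketch using only the standard library; the function name is hypothetical and it is not part of SQLFluff's own tooling.

```python
from pathlib import Path


def find_fixture_problems(fixture_dir):
    """Return (empty_files, missing_newline_files) for .sql files under fixture_dir."""
    empty, missing_newline = [], []
    for path in sorted(Path(fixture_dir).rglob("*.sql")):
        data = path.read_bytes()
        if not data.strip():
            # Nothing but whitespace: the fixture has no statement to parse.
            empty.append(path)
        elif not data.endswith(b"\n"):
            # Content present but no final newline, which end-of-file checks flag.
            missing_newline.append(path)
    return empty, missing_newline
```

For example, `find_fixture_problems("test/fixtures/dialects/doris")` would return two lists of paths to fix by hand.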
Review threads (outdated, resolved):
- test/fixtures/dialects/doris/create_hive_table_as_select_basic.sql
- test/fixtures/dialects/doris/create_hive_table_as_select_partitioned.sql
- test/fixtures/dialects/doris/create_hive_table_as_select_with_comment.sql
- test/fixtures/dialects/doris/create_hive_table_as_select_with_properties.sql
de6374f to 4b7bc2d
Hi @keraion, thanks for your review. I fixed all issues. PTAL
Would you mind running the pre-commit hooks on these files? If using tox
So I need to run this command in my local environment? What does it do?
Yes, this will handle a few of the linting and formatting issues that remain. We have guides on getting tox set up, how to run the hooks automatically when committing, and a last checks guide that very briefly goes over it, but this command will run all the pre-commit hooks that we have in the [...]. Namely, this will run black, ruff, and a few other helpers to clean up. These should be able to autofix most issues, but there may be a few items that need to be manually addressed. LMK if you have any other questions!
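For readers who want to run the hooks outside tox, here is a minimal sketch that calls the `pre-commit` CLI directly. It is not the tox command referred to in this thread; it assumes `pre-commit` is installed and on the PATH, and the helper names are hypothetical.

```python
import subprocess


def precommit_command(files=None):
    """Build a pre-commit invocation for all files, or only the given ones."""
    if files:
        return ["pre-commit", "run", "--files", *files]
    return ["pre-commit", "run", "--all-files"]


def run_precommit(files=None):
    """Run the hooks and return the exit code.

    A non-zero code means a hook failed or rewrote files, so a second run
    after reviewing the changes is usually needed.
    """
    return subprocess.run(precommit_command(files), check=False).returncode
```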
Hi @keraion, I ran tox and reformatted the code; it now looks good:
Coverage Results ✅
Awesome work! Thanks so much for the contribution. 🎉
Fix #6978
Pull Request checklist
Please confirm you have completed any of the necessary steps below.

- Included test cases to demonstrate any code changes, which may be one or more of the following:
  - `.yml` rule test cases in `test/fixtures/rules/std_rule_cases`.
  - `.sql`/`.yml` parser test cases in `test/fixtures/dialects` (note YML files can be auto generated with `tox -e generate-fixture-yml`).
  - `test/fixtures/linter/autofix`.
- Added appropriate documentation for the change.
- Created GitHub issues for any relevant followup/future enhancements if appropriate.
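Since the checklist pairs each `.sql` parser fixture with a `.yml` file, a quick local check for unpaired fixtures can help before running the generator. This is a minimal standard-library sketch; the function name is hypothetical and it assumes the `.yml` sits next to the `.sql` with the same stem.

```python
from pathlib import Path


def sql_without_yml(dialect_dir):
    """List .sql fixtures in dialect_dir that have no sibling .yml file."""
    return [
        sql_path
        for sql_path in sorted(Path(dialect_dir).glob("*.sql"))
        if not sql_path.with_suffix(".yml").exists()
    ]
```

Any path returned by `sql_without_yml("test/fixtures/dialects/doris")` would still need its YML generated, for instance with the `tox -e generate-fixture-yml` command the checklist mentions.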