Skip to content

Parsed results of FROM Clause in SQL query  #3

@JasperGuo

Description

@JasperGuo

Hi,

Thanks for your great job in building such a dataset for nl2sql.
I am now conducting a characteristics study of the sql on it.
I find that some parsed results of FROM Clause in SQL Query are confusing.

Here is an example.
The FROM Clause in the query contains two join operations and three tables involved.
But in the parsed results of table_units, only two of the tables are involved.

    {
        "db_id": "department_management",
        "query": "SELECT DISTINCT T1.creation FROM department AS T1 JOIN management AS T2 ON T1.department_id  =  T2.department_id JOIN head AS T3 ON T2.head_id  =  T3.head_id WHERE T3.born_state  =  'Alabama'",
        "sql": {
            "except": null,
            "from": {
                "conds": [
                    [
                        false,
                        2,
                        [
                            0,
                            [
                                0,
                                1,
                                false
                            ],
                            null
                        ],
                        [
                            0,
                            11,
                            false
                        ],
                        null
                    ]
                ],
                "table_units": [
                    [
                        "table_unit",
                        0
                    ],
                    [
                        "table_unit",
                        2
                    ]
                ]
            },
           ...
        }
    }

Based on my further inspection, it may be caused by the order of JOIN and ON clauses.
Here is a clear and correct example.

{
        "db_id": "bike_1",
        "query": "SELECT count(*) FROM station AS T1 JOIN trip AS T2 JOIN station AS T3 JOIN trip AS T4 ON T1.id  =  T2.start_station_id AND T2.id  =  T4.id AND T3.id  =  T4.end_station_id WHERE T1.city  =  \"Mountain View\" AND T3.city  =  \"Palo Alto\"",
        "sql": {
            "except": null,
            "from": {
                "conds": [
                    [
                        false,
                        2,
                        [
                            0,
                            [
                                0,
                                1,
                                false
                            ],
                            null
                        ],
                        [
                            0,
                            16,
                            false
                        ],
                        null
                    ],
                    "and",
                    [
                        false,
                        2,
                        [
                            0,
                            [
                                0,
                                12,
                                false
                            ],
                            null
                        ],
                        [
                            0,
                            12,
                            false
                        ],
                        null
                    ],
                    "and",
                    [
                        false,
                        2,
                        [
                            0,
                            [
                                0,
                                1,
                                false
                            ],
                            null
                        ],
                        [
                            0,
                            19,
                            false
                        ],
                        null
                    ]
                ],
                "table_units": [
                    [
                        "table_unit",
                        0
                    ],
                    [
                        "table_unit",
                        2
                    ],
                    [
                        "table_unit",
                        0
                    ],
                    [
                        "table_unit",
                        2
                    ]
                ]
            },
            ...
        }
    },

Please let me know if I misunderstand something of the parsed results.

Great Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions