Skip to content

wrong results in CLI 1.2.1 vs Python 1.1.2 #16780

@M2ATrail

Description

@M2ATrail

What happens?

python 1.1.2 (Correct)
CLI 1.2.1 (Wrong)

the scripts runs daily, I copied the db file from a different system that runs that function daily (created using duckdb 1.1.2 inside python inside docker inside azure linux debian vm, but don't think this is relevant.)

running same query brings wrong result in ui
for background, i have this in the prepare in python

        conn.execute("""
        UPDATE everflow_daily SET sub5 = replace(sub5, '''', '');
        UPDATE everflow_daily SET sub5 = lower(trim(sub5));
        UPDATE everflow_daily SET sub5 = NULL WHERE sub5 = 'n/a';
        UPDATE everflow_daily SET sub4 = NULL WHERE sub4 = 'N/A';
        UPDATE everflow_daily SET sub3 = NULL WHERE sub3 = 'N/A';
        """)

in regular python, bring normal results

Image

ui gave me a heartache since i have been using this for months and depending on this.

Image

turns out this is cli issue, cli:

Image

for reference: https://discord.com/channels/909674491309850675/1352739533694832743

in python, this is one of function that gets run daily to create the db file, this is the related table

def process_everflow():
    """loads everflow data, cleans up sub5, sub4, sub3


    """
    print('processing everflow')
    with conn_duckdb() as conn:
        conn.execute(
            fr"CREATE OR REPLACE TABLE 'everflow_daily' AS SELECT * FROM read_csv_auto('{get_data.everflow_daily_export_path}/*.csv', union_by_name = true, filename = true)")

        conn.execute(fr"CREATE OR REPLACE TABLE 'everflow_offer' AS SELECT * FROM read_csv_auto('{get_data.temp_export_path}/offersTable.csv', union_by_name = true, filename = true)")

        conn.execute("""
        UPDATE everflow_daily SET sub5 = replace(sub5, '''', '');
        UPDATE everflow_daily SET sub5 = lower(trim(sub5));
        UPDATE everflow_daily SET sub5 = NULL WHERE sub5 = 'n/a';
        UPDATE everflow_daily SET sub4 = NULL WHERE sub4 = 'N/A';
        UPDATE everflow_daily SET sub3 = NULL WHERE sub3 = 'N/A';
        """)

        conn.execute(fr"""
        ALTER TABLE everflow_daily
        ADD offer_type VARCHAR(32);

        UPDATE everflow_daily
        SET offer_type = CASE 
            WHEN UPPER(offer_name) LIKE '%CPA%' THEN 'CPA'
            WHEN UPPER(offer_name) LIKE '%CPC%' THEN 'CPC'
            WHEN UPPER(offer_name) LIKE '%CPL%' THEN 'CPL'
            ELSE 'Other'
        END;
        """)

To Reproduce

idk it is private data, sub4 have phone numbers, the folder is just csv files, one per day, sub4 is a column there.
Running on Azure, up to date.
OS: Windows (Windows 11 Enterprise)
Size: Standard D4ads v6 (4 vcpus, 16 GiB memory)

OS:

Windows

DuckDB Version:

1.2.1

DuckDB Client:

CLI and Python

Hardware:

No response

Full Name:

John Doe

Affiliation:

John Doe

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - I cannot share the data sets because they are confidential

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions