Memory usage never decreases after DELETE FROM #9263

@dufferzafar

What happens?

I'm using an in-memory database and have a workload that updates a table using the C++ Appender interface. When left running, memory usage continues to rise until the application has to be killed.

To Reproduce

#include "vendor/duckdb/duckdb.hpp"

#include <iostream>
#include <memory>
#include <random>
#include <string>
#include <unordered_map>

using namespace duckdb;

int main()
{
    std::unordered_map<std::string, std::string> options = {{"threads", "2"}};
    auto config = DBConfig(options, false);

    DuckDB     db(nullptr, &config);
    Connection con(db);

    // Setup
    auto res = con.Query(R"(
        CREATE TABLE main(
            k1 VARCHAR, k2 VARCHAR, k3 UINTEGER, k4 VARCHAR,
            f1 DOUBLE, f2 DOUBLE, f3 DOUBLE, f4 DOUBLE, f5 DOUBLE, f6 DOUBLE,
            PRIMARY KEY(k1, k2, k3)
        );

        CREATE TABLE updates(
            id UINTEGER, k1 VARCHAR, k2 VARCHAR, k3 UINTEGER, k4 VARCHAR,
            f1 DOUBLE, f2 DOUBLE, f3 DOUBLE, f4 DOUBLE, f5 DOUBLE, f6 DOUBLE
        );
    )");

    auto update = R"(
        INSERT INTO main
        (
            SELECT * EXCLUDE (rnk, id) FROM (
                SELECT *, RANK() OVER
                    (PARTITION BY k1, k2, k3 ORDER BY id DESC) AS rnk
                FROM updates
                QUALIFY rnk = 1
            )
        )
        ON CONFLICT (k1, k2, k3)
        DO UPDATE SET 
            f1 = excluded.f1, f2 = excluded.f2, f3 = excluded.f3,
            f4 = excluded.f4, f5 = excluded.f5, f6 = excluded.f6
        ;
    )";

    std::default_random_engine generator;
    std::uniform_int_distribution<int> random_key(1, 1'000'000);

    long m_updateCnt{0};
    duckdb::Appender appender(con, "updates");
    while (true)
    {
        for (size_t i = 1; i < 500'000; i++)
        {
            appender.BeginRow();
            appender.Append<uint32_t>(++m_updateCnt);
            appender.Append<duckdb::string_t>("Group");
            appender.Append<duckdb::string_t>("Case");
            appender.Append<uint32_t>(random_key(generator));
            appender.Append<duckdb::string_t>("Record");
            appender.Append<double>(i);
            appender.Append<double>(i);
            appender.Append<double>(i);
            appender.Append<double>(i);
            appender.Append<double>(i);
            appender.Append<double>(i);
            appender.EndRow();
        }
        appender.Flush();
        
        res = con.Query(update);
        std::cout << " upsert=" << res->GetValue(0, 0).ToString();
        res = con.Query("DELETE FROM updates;");
        std::cout << " delete=" << res->GetValue(0, 0).ToString();
        std::cout << " count=" << con.Query("SELECT count(*) FROM main;")->GetValue(0, 0).ToString();

        m_updateCnt = 0;

        std::cout << std::endl;
    }

    return 0;
}
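
To put numbers on the growth, the loop above could also print the process's resident set size on each iteration. The helper below is only a sketch and is not part of the original reproduction: `current_rss_kb` is a hypothetical name, and it assumes Linux, where `/proc/self/status` exposes a `VmRSS:` line in kB.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of DuckDB): returns this process's resident
// set size in kB as reported by /proc/self/status, or -1 if the file or the
// VmRSS line is unavailable (e.g. on non-Linux systems).
long current_rss_kb()
{
    std::ifstream status("/proc/self/status");
    std::string   line;
    while (std::getline(status, line))
    {
        // The line looks like "VmRSS:     123456 kB".
        if (line.rfind("VmRSS:", 0) == 0)
        {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return kb;
        }
    }
    return -1;
}
```

Adding `std::cout << " rss_kb=" << current_rss_kb();` just before the `std::endl` in the loop would log the value next to the upsert and delete counts, making it easy to see whether resident memory drops after `DELETE FROM updates`.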

OS:

Linux

DuckDB Version:

0.8.1

DuckDB Client:

C++

Full Name:

Shadab Zafar

Affiliation:

Tower Research Capital

Have you tried this on the latest main branch?

I have tested with a release build (and could not test with a main build)

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • Yes, I have
