Skip to content

Conversation

chrisiou
Copy link
Contributor

@chrisiou chrisiou commented Sep 18, 2024

This PR introduces a new approach for defining and generating settings and a thread-safe accessor for those settings.

Steps to add a new setting

1. Update the settings.json file:

{
    "name": "<name>",
   "description": "<description>",
   "return_type": "<cpp_type>", 		// the return type of the GetSetting function
   "sql_type": "<sql_type>",
   "scope": "<scope>",

   // (optional) configuration options
   "verification": ["set", "reset"]		// (*) adds a verification function at the start of the auto-generated Set or/and Reset methods
   "custom_conversion_and_validation": true,	// (*) enables custom methods implementations
   "aliases": ["<alias_name1>", "<alias_name2>"]

}

(*): Indicates flags that enable custom implementations.
If any of the starred options (*) are enabled, a new code section will be added to custom_settings.cpp. See Step 4 for more details. These options won’t overwrite any existing content.

2. Add a new variable in the configuration structure
To store the value of the new setting, add a new variable in the appropriate config structure:

  • For global settings, add it to src/include/duckdb/main/config.hpp.
  • For local settings, add it to src/include/duckdb/main/client_config.hpp.

note: If the setting is autogenerated, the setting.name will be used as the variable name for the configuration structure. Ensure consistency by using the same name.

3. To generate the necessary code, run:

make generate-files

Alternatively, if you only want to run the script that automates the settings auto-generation process, use the generate_settings.pyscript. You can run it: python3 scripts/generate_settings.py

4. (Optional) Implement custom functionality
If the new setting requires custom functionality, look for the newly added corresponding block in custom_settings.cpp and modify the new methods. Otherwise, everything should be auto-generated.

note: If you want the setting's value to be applied only under certain conditions, use the verification method. If the condition is not met, return false to prevent the value from being set.

5. Write tests and verify that everything works correctly.


About this PR:

Race Condition Fix: Templated Accessor for Settings

Previously, settings were accessed directly from the options, which could lead to race conditions. This PR introduces a templated accessor to improve thread safety. The GetSetting accessor now ensures proper locking by obtaining a mutex before returning the setting's value.

Example usage:

auto &db_config = DBConfig::GetConfig(context);
auto index_scan_max_count = db_config.GetSetting<IndexScanMaxCount>();

The GetSetting methods in DBConfig (they are similar in the ClientConfig):

template <class OP>
typename OP::RETURN_TYPE GetSetting(const ClientContext &context) {
	std::lock_guard<mutex> lock(config_lock);
	return OP::GetSetting(context).template GetValue<typename OP::RETURN_TYPE>();
}

template <class OP>
Value GetSettingValue(const ClientContext &context) {
	std::lock_guard<mutex> lock(config_lock);
	return OP::GetSetting(context);
}

What functionality is actually needed for adding a new setting (almost all is now being handled automatically)

  • A structure that stores all relevant information about the setting and declares its methods in the settings.hpp file.
  • A variable to store the setting's value within a configuration structure, such as DBConfig or ClientContext.
  • A declaration of the setting's scope in the ConfigurationOption internal_options list found in config.cpp.
  • The implementation of the setting's methods.

Explanation of generate_settings.py

The generate_settings.py script automates several steps when adding a new setting:

  • Parse and reorder settings
    It calls scripts/settings_scripts/parse_and_sort_settings_in_json.py, which parses and sorts entries in settings.json. The settings are stored globally as SettingsList objects.
  • Update the settings header file
    It invokes scripts/settings_scripts/update_settings_header_file.py to automatically declare new settings and their methods in settings.hpp.
  • Update setting scopes
    It runs scripts/settings_scripts/update_settings_scopes.py, which adds new setting scopes and aliases to the internal_options configuration list in config.cpp.
  • Generate source code
    It executes two scripts:
    • The scripts/settings_scripts/update_autogenerated_functions.py auto-generates method definitions in autogenerated_settings.cpp.
    • The scripts/settings_scripts/update_custom_signats_functions.py adds a block to custom_settings.cpp, only if custom validation or functionality is required.
  • Format the output
    The script also formats settings.hpp, autogenerated_settings.cpp, and config.cpp files.

Disable automatic addition of new coding section in custom_settings.cpp

The code related to settings in settings.hpp, config.cpp, and autogenerated_settings.cpp is automatically generated. However, if the addition of a new settings block in custom_settings.cpp causes issues, it can be disabled. To do so, remove the code that is mentioned in the corresponding comment in the scripts/settings_scripts/update_settings_src_code.py

@chrisiou chrisiou requested a review from samansmink September 18, 2024 15:46
@duckdb-draftbot duckdb-draftbot marked this pull request as draft September 18, 2024 16:02
@chrisiou chrisiou marked this pull request as ready for review September 18, 2024 17:45
@Giorgi
Copy link
Contributor

Giorgi commented Sep 18, 2024

Can the C API for settings get some love too? Inconsistency between duckdb_settings() and duckdb_get_config_flag

Copy link
Contributor

@samansmink samansmink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @chrisiou, Thanks for the PR, looks great! I've left a few comments.

@Mytherin Mytherin changed the base branch from main to feature September 25, 2024 14:09
@duckdb-draftbot duckdb-draftbot marked this pull request as draft September 26, 2024 12:38
@chrisiou chrisiou marked this pull request as ready for review September 26, 2024 21:22
Mytherin added a commit that referenced this pull request Oct 16, 2024
…14383)

Follow-up from #14018, finalizing
the implementation

This PR enables the generation of settings through a JSON file, avoiding
having to touch many places in the code to add a single setting.
Examples of such settings:

##### Simple

```json
        "name": "allow_extensions_metadata_mismatch",
        "description": "Allow to load extensions with not compatible metadata",
        "type": "BOOLEAN",
        "scope": "global"
```

##### More bells and whistles
```json
    {
        "name": "checkpoint_threshold",
        "description": "The WAL size threshold at which to automatically trigger a checkpoint (e.g. 1GB)",
        "type": "VARCHAR",
        "scope": "global",
        "internal_setting": "checkpoint_wal_size",
        "custom_implementation": [
            "set",
            "get"
        ],
        "aliases": [
            "wal_autocheckpoint"
        ]
    },
```


| Setting | Description |

|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
| name | Name of the setting |
| description | Description of the setting |
| type | SQL type of the setting (e.g. BIGINT) |
| return_type | C++ return type of the setting - if not set this is
derived from the SQL type (e.g. VARCHAR -> string) |
| scope | Global (DBConfig) or Local (ClientConfig) |
| internal_setting | C++ variable name of the setting - if not provided
equal to name |
| on_callbacks | Custom callbacks to generate (set/reset) - generates
OnGlobalSet/OnLocalSet/OnGlobalReset/OnLocalReset callbacks that can be
used to verify input data |
| custom_implementation | Whether or not the setting code is
automatically generated - can be either true/false, or a list providing
which has a custom implementation (set/reset/get) |
| struct | Struct name of the setting, defaults to setting name in
CamelCase |
| aliases | List of aliases to generate for the setting |

Not all settings can be fully auto-generated, therefore they are split
between `custom_settings.cpp` and `autogenerated_settings.cpp`. The
`custom_implementation` flag controls whether or not code for the
setting is automatically generated.
@Mytherin Mytherin merged commit 7a93930 into duckdb:feature Oct 16, 2024
38 of 41 checks passed
@Mytherin
Copy link
Collaborator

Thanks @chrisiou, I've rounded this off in #14383

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants