A module for TOML v1.0.0
support. It provides functionality to read/write TOML files and convert them directly to/from Jai data structures.
- Full TOML v1.0.0 support.
- All run-time data types are supported as their native type including reference types (
pointer
,array
,any
), with the exception ofuntagged unions
. - Generic data is supported through the Toml.Value struct.
- Dates/times are supported types provided in datetime.jai (until Jai has native types, Apollo_time.Calendar_Time is not usable here).
- Safe sum-types with
@SumType
notes to indicate the struct has a tag enum followed by a matching union. - Modifying the default behavior like: renaming, omitting, changing Type representation like Hash_Tables, enum as int, extra validation, or handling complex data like binary encodings are supported through custom handlers.
ok, my_struct := Toml.deserialize(toml_string, My_Struct);
- Read TOML directly into any (nested) struct or generic
Toml.Value
. - Any field not in the TOML fails unless annotated with
@TomlOptional
. Any superfluous fields in the TOML are ignored. - Compile-time constants are ignored and not compared.
- If the parser understands the input it will accept it even if it is officially invalid according to the standard. This may in the future change to be more strict. See the validation example for more details.
ok, toml_string := Toml.serialize(my_struct);
- Write any (nested) struct or
Toml.Value
to TOML string. - Null pointers/null Anys by default are written as
"~null~"
, this can be controlled via the context membertoml.null_value
. - Compile-time
constants
&imports
are skipped. #place
members (containing pointers) may not be safe! Overlapping members are all serialized.
Success -> bool
Error message -> interceptable logger
- Asserts are used to confirm correct implementation of the module. A triggered assert indicates a bug.
- Input data errors are flagged by the return boolean indicating the success of the operation. If a procedure indicates failure the reason will be found in the error log.
- To prevent or redirect the TOML module from writing to the error log the user can set a catching or wrapping logger in the context, see examples.
Set a context.allocator:
Toml.deserialize(toml_string, My_Struct,, my_alloc);
defer release(my_alloc);
Dynamically sized data like arrays and strings are returned on the provided context allocator. As such, the safe lifetime of all returned objects ends when the memory of the allocator is released.
- Allocation for returned data are in the context allocator. Instead of free_x() or deinit_x() procedures the user is expected to push an allocator such that all data can be dropped together.
- None of the returned data references allocated data of the input arguments.
- Temporary storage is only used for error messages as some of the intermediate data may be large. If there is a user-friendly way for the caller to replace the temporary allocator let me know, in which case we can just use temp.
- Most temporarily allocated memory in the module is freed in one go by releasing a pool allocator.
- In most cases this means that only the returned data will be left allocated on the context allocator.
By default this Toml modules makes several choices like, enums as string, members serialized by their name, no members omitted, etc.
These choices may be fine for the large majority of use-cases but not all. Custom handlers enable the user to modify the serialization/deserialization of types.
Custom handlers follow the same design as Print_Style.struct_printer
, except instead of writing to the builder it returns the Toml.Value to be written.
Note that these custom handler drive "what" is (de)serialized, they do not help with formatting of the toml. Formatting can be controlled via toml.default_format_int
/ toml.default_format_float
in the context, see the formatting example.
Serialization can be modified by changing the custom handler procedure in the context.
enum_to_value :: (input: Any) -> done:=true, ok:=false, toml:Value=.{} {
if input.type.type != .ENUM return false;
enum_value := Reflection.get_enum_value(input.value_pointer, xx input.type);
ok, value := Toml.type_to_value(*enum_value, type_info(s64));
return true, ok, value;
}
ok, toml := Toml.serialize(
my_struct
,, toml = .{custom_type_to_value=enum_to_value}
);
The
jai
and optionallytoml-test
executables should be in the PATH.
jai ./tests.jai
The module is tested by compiling and running the examples. The command above does that for all examples. It will also run the tests from toml-test
if the executable can be found in the PATH or the tools/toml-test directory.
The module supports both Typed as well as Generic data using Toml.Value. To avoid having to maintain 2 implementations the Typed version is implemented by going through the Generic version. As a result there are some extra data copies/allocations as well as limitations like no support for integers > S64_MAX and slightly reduced accuracy for float due to intermediate conversion through float64.
We only have a run-time reflection based implementation, e.g. Any based like: value_to_type :: (toml: Value, output: Any)
. A compile-time reflection based implementation, see the comptime branch, that has the types determined at compile-time like value_to_type :: (toml: Value, data: *$T)
would have slightly cleaner code, better run-time performance, and can handle generic custom types (like Hash_Table
) more conveniently, but it does not support dynamic types like Any, and custom handlers become more complex, with module parameters, code insertion and#poke_name
. Having multiple implementations is considered undesirable. This should be reevaluated when the #poke_name
replacement has landed in Jai.
Custom handlers have several design goals:
- Keep the core Toml module implementation clean from complexity.
- Enable custom serialization of types we do not own (external Modules) as we may not be able to add notes or remove members.
- The same type should be serializable in different ways defined at the procedure call site, not struct definition.
- Enable data tweaks during serialization as opposed to full copies of nested structure trees (separate structs for serialization and use within the application).
- The user should be able to opt-out of the default behavior of handling special types like Toml.Value, Chrono, and SumTypes. A user can opt-out by setting a procedure which is a no-op or any procedure that does not call the default_custom_handler.
In addition to handlers we currently also support making struct members optional through a @TomlOptional note. This means there are 2 ways to do the same thing, it also requires the original Type(_Info) to be modified. It is likely this note will be removed when these kinds of operations are better supported through custom handlers.
The context member toml.null_value
is introduced as a customization point as it would otherwise be relatively complicated to catch all cases where pointers can occur. toml.null_value can be any kind of Toml.Value, by default it is a string of value "~null~"
. Deserialization of the value "~null~"
for a *string
member can be surprising as it would result into a null pointer instead of a string with value "~null~"
. The ~
s are added to make it a string that is less likely to occur naturally.
Alternative sentinel values like and empty table {}
which is the equivalent of a struct without members are not chosen as due to @TomlOptional
it is also equivalent to a struct with all optional members.
The tokenization/lexer phase completes before the parser starts. As a result memory needs to be allocated to store the tokens. There is no particular reason for this implementation other than that I wanted to experiment with this type of lexer. Unlike traditional lexers the one implemented here does not parse the tokens into literals like int/float/etc it just determines a token starts and ends. It is the responsibility of the parser to interpret the string when it has more context.