Proposal: Decl Literals #9938

@SpexGuy

Enum literals are a powerful and extremely useful feature of Zig. This proposal changes their definition slightly to make them more useful in a wider variety of cases, and renames them to "Decl Literals". I'll start with a thorough description of the feature, and then end with a discussion of the tradeoffs of this proposal.

Description

Part 1: Decl Literals

The current procedure for enum literal casting is to look in the target type for an enum field matching the name of the literal. I propose to generalize that to look instead for a type field (decl or enum/union tag) matching the name of the literal. With this change, decl literals can be coerced to any namespace type. This can be especially useful for modeling default values and common presets. For example:

const DeloreanOptions = packed struct {
    enable_flux_capacitor: bool,
    target_speed_mph: u7,
    enable_remote_control: bool = false,

    pub const time_travel: DeloreanOptions = .{
        .enable_flux_capacitor = true,
        .target_speed_mph = 88,
    };

    pub const highway: DeloreanOptions = .{
        .enable_flux_capacitor = false,
        .target_speed_mph = 60,
    };
};

pub fn startDelorean(options: DeloreanOptions) void { ... }

test {
    // coerce from decl literal to constant
    startDelorean(.time_travel);

    // late binding coercion is supported
    const highway_literal = .highway;
    startDelorean(highway_literal);

    // explicit instantiation still works too.
    startDelorean(.{
        .enable_flux_capacitor = true,
        .target_speed_mph = 88,
        .enable_remote_control = true,
    });
}

Part 2: Operations on Decl Literals

We can further define a couple of operations on decl literals, to take advantage of their ability to infer a namespace:

2A: .decl_literal()

Calling a decl literal performs the following operations (%N refers to the result of step N):

  1. Require a result type.
  2. Look up the decl literal in %1.
  3. Call %2.
  4. Coerce %3 to %1 if not forwarding the result location.

This can remove repetition in initialization code:

var array: std.ArrayList(u32) = .init(allocator);
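
The steps above can be sketched with a small self-contained test. This is only an illustration, assuming a compiler that implements the call form described in 2A; Counter and Pair are hypothetical types invented for the example, not part of the proposal.

```zig
const std = @import("std");

// Hypothetical type used only for illustration.
const Counter = struct {
    value: u32,

    pub fn init(start: u32) Counter {
        return .{ .value = start };
    }
};

// Hypothetical aggregate whose field types supply the result type.
const Pair = struct {
    first: Counter,
    second: Counter,
};

test "call form resolves init through the result type" {
    // The annotation supplies the result type, so .init is looked up in Counter.
    const a: Counter = .init(4);
    // Equivalent longhand that works without the proposal.
    const b: Counter = Counter.init(4);
    try std.testing.expectEqual(b.value, a.value);

    // Each field's declared type supplies the result type for its initializer.
    const p: Pair = .{ .first = .init(1), .second = .init(2) };
    try std.testing.expectEqual(@as(u32, 3), p.first.value + p.second.value);
}
```

In both cases the namespace is never spelled out at the call site; it comes from the variable annotation or from the declared field type.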

2B: .decl_literal{ .field = value, ... }

Instantiating a decl literal with this syntax does the following:

  1. Require a result type.
  2. Require that the result type is a union.
  3. Look up a field in %1 named decl_literal.
  4. Use the struct literal to initialize field %3 of the result location.

This extends the current special-case initialization for void tags to work for struct tags as well.

test {
    // init void tag (this works already)
    var t: std.builtin.TypeInfo = .Type;

    // init struct tag
    t = .Pointer{
        .size = .One,
        .is_const = true,
        .is_volatile = false,
        .is_allowzero = false,
        .alignment = 0,
        .address_space = .generic,
        .child = u32,
        .sentinel = @as(?u32, null),
    };

    // init struct tag with late binding
    const tag = if (comptime something()) .Fn else .BoundFn;
    t = tag{
        .calling_convention = .Unspecified,
        .alignment = 0,
        .is_generic = false,
        .is_var_args = false,
        .return_type = u32,
        .args = &[_]FnArg{},
    };
}
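
For comparison, here is a minimal sketch of the two initialization forms that exist without 2B: the bare literal for a void tag, and the nested .{ .tag = .{ ... } } form for a struct payload, which is the form 2B would shorten. Shape and area are hypothetical names chosen for the example.

```zig
const std = @import("std");

// Hypothetical tagged union with one void tag and one struct payload.
const Shape = union(enum) {
    empty,
    rect: struct { w: u32, h: u32 },
};

fn area(s: Shape) u32 {
    return switch (s) {
        .empty => 0,
        .rect => |r| r.w * r.h,
    };
}

test "existing union initialization forms" {
    // Void tag: the bare literal already coerces to the union.
    var s: Shape = .empty;
    try std.testing.expectEqual(@as(u32, 0), area(s));

    // Struct payload: today this needs the nested form.
    // Under 2B it would be written .rect{ .w = 4, .h = 6 }.
    s = .{ .rect = .{ .w = 4, .h = 6 } };
    try std.testing.expectEqual(@as(u32, 24), area(s));
}
```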

Discussion

1: Decl Literals

An extremely common pattern in C when building a bitfield enum is to create extra named constants for common sets of flags. These defaults often behave like a de-facto enum, with custom specifications being very uncommon. Zig's solution to bitfields is to use packed structs. However, a packed struct can have only one default (.{}), which in the case of a bitfield is usually reserved for the zero value. You can declare default values as decls in the bitfield namespace, but in doing so you lose a lot of the ergonomics that those decls might provide, because every use has to spell out the full path: obj.foo(some.deeply.nested.package.FooFlags.flushCpuCaches).

This friction causes a conflict when specifying field defaults. You can either specify defaults so that .{} is a useful value, or specify defaults so that fields must be correctly initialized. These two things are often not the same. The second one is safer, but the first is often more ergonomic. With decl literals, there is an ergonomic alternative for useful default values which lets .{} syntax be reserved for intentional initialization.
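
A minimal sketch of that arrangement, assuming a compiler that implements Part 1: the flag fields carry no defaults, so a hand-written .{ ... } must name every flag, while the common presets live in the namespace as decls. FlushFlags, its fields, and countSet are hypothetical names used only for illustration.

```zig
const std = @import("std");

// Hypothetical flag set; the names are illustrative only.
const FlushFlags = packed struct(u8) {
    // The flag fields have no defaults, so .{} alone does not compile.
    cpu_caches: bool,
    gpu_caches: bool,
    write_buffers: bool,
    // Padding up to the u8 backing integer.
    _reserved: u5 = 0,

    // Named presets, reachable as .none and .all wherever FlushFlags is expected.
    pub const none: FlushFlags = .{ .cpu_caches = false, .gpu_caches = false, .write_buffers = false };
    pub const all: FlushFlags = .{ .cpu_caches = true, .gpu_caches = true, .write_buffers = true };
};

// Counts how many bits are set in the backing integer.
fn countSet(flags: FlushFlags) u32 {
    return @popCount(@as(u8, @bitCast(flags)));
}

test "presets and explicit initialization" {
    try std.testing.expectEqual(@as(u32, 0), countSet(.none));
    try std.testing.expectEqual(@as(u32, 3), countSet(.all));
    // Intentional, fully spelled-out initialization still works.
    try std.testing.expectEqual(@as(u32, 1), countSet(.{
        .cpu_caches = true,
        .gpu_caches = false,
        .write_buffers = false,
    }));
}
```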

There is an additional tradeoff between modeling such a structure as a packed struct or an extensible enum. In theory, the packed struct is better on nearly all metrics. It documents the bit meanings, reflection code can understand it, and it's clearer and easier to make custom variants. But in the current language, the common case of using a preset is much less ergonomic with a packed struct than an enum. This feature solves that tradeoff, making packed struct the clear choice.

The std lib and stage 2 compiler don't make heavy use of this sort of bitfield API, but it's common in C/C++ libraries and their Zig bindings. Some examples:

https://github.com/SpexGuy/Zig-ImGui/blob/1469da84a3d90e9d96a87690f0202475b0f875df/zig/imgui.zig#L53-L97

https://github.com/MasterQ32/SDL.zig/blob/f3a3384e6a7b268eccb4aa566e952b05ff7eebfc/src/wrapper/sdl.zig#L43-L56

I don't believe that this pattern comes from the language design of C, but instead from the high information density of bitfields. This property carries over to Zig, so there shouldn't be any reason that these sorts of APIs wouldn't be desirable in Zig. I suspect the current lack of them comes from the lack of ergonomics surrounding these features, not because there are "better" patterns that we choose to use instead.

2A: Call syntax

I really like this syntax for initialization, and I think it's a consistent extension of the var x: T = .{} syntax. Consider the current pattern:

const value = package.SomeType.init(4);

The reader does not necessarily know that the type of value is package.SomeType. This is usually true by convention, but careful readers and editor tools cannot know for sure. In contrast, with the new syntax:

const value: package.SomeType = .init(4);

The reader and tools now know for sure that value must be of type package.SomeType. This syntax conveys extra information, and is consistent with a preference for x: T = .{} over x = T{}.

Examples of code this would affect are everywhere, but here are some examples from the std lib and stage 2:


zig/lib/std/bit_set.zig

Lines 428 to 430 in f42725c

pub fn iterator(self: *const Self, comptime options: IteratorOptions) Iterator(options) {
    return Iterator(options).init(&self.masks, last_item_mask);
}

would become:

pub fn iterator(self: *const Self, comptime options: IteratorOptions) Iterator(options) {
    return .init(&self.masks, last_item_mask);
}

zig/src/Compilation.zig

Lines 1445 to 1451 in f42725c

.emit_analysis = options.emit_analysis,
.emit_docs = options.emit_docs,
.work_queue = std.fifo.LinearFifo(Job, .Dynamic).init(gpa),
.c_object_work_queue = std.fifo.LinearFifo(*CObject, .Dynamic).init(gpa),
.astgen_work_queue = std.fifo.LinearFifo(*Module.File, .Dynamic).init(gpa),
.keep_source_files_loaded = options.keep_source_files_loaded,
.use_clang = use_clang,

would become:

.emit_analysis = options.emit_analysis,
.emit_docs = options.emit_docs,
.work_queue = .init(gpa),
.c_object_work_queue = .init(gpa),
.astgen_work_queue = .init(gpa),
.keep_source_files_loaded = options.keep_source_files_loaded,
.use_clang = use_clang,

zig/src/codegen/spirv.zig

Lines 247 to 261 in f42725c

pub fn init(spv: *SPIRVModule) DeclGen {
    return .{
        .spv = spv,
        .air = undefined,
        .liveness = undefined,
        .args = std.ArrayList(ResultId).init(spv.gpa),
        .next_arg_index = undefined,
        .inst_results = InstMap.init(spv.gpa),
        .blocks = BlockMap.init(spv.gpa),
        .current_block_label_id = undefined,
        .code = std.ArrayList(Word).init(spv.gpa),
        .decl = undefined,
        .error_msg = undefined,
    };
}

would become:

pub fn init(spv: *SPIRVModule) DeclGen {
    return .{
        .spv = spv,
        .air = undefined,
        .liveness = undefined,
        .args = .init(spv.gpa),
        .next_arg_index = undefined,
        .inst_results = .init(spv.gpa),
        .blocks = .init(spv.gpa),
        .current_block_label_id = undefined,
        .code = .init(spv.gpa),
        .decl = undefined,
        .error_msg = undefined,
    };
}

// alu instructions
try expect_opcode(0x07, Insn.add(.r1, 0));
try expect_opcode(0x0f, Insn.add(.r1, .r2));
try expect_opcode(0x17, Insn.sub(.r1, 0));
try expect_opcode(0x1f, Insn.sub(.r1, .r2));
try expect_opcode(0x27, Insn.mul(.r1, 0));
try expect_opcode(0x2f, Insn.mul(.r1, .r2));
try expect_opcode(0x37, Insn.div(.r1, 0));
try expect_opcode(0x3f, Insn.div(.r1, .r2));

would become:

// alu instructions
try expect_opcode(0x07, .add(.r1, 0));
try expect_opcode(0x0f, .add(.r1, .r2));
try expect_opcode(0x17, .sub(.r1, 0));
try expect_opcode(0x1f, .sub(.r1, .r2));
try expect_opcode(0x27, .mul(.r1, 0));
try expect_opcode(0x2f, .mul(.r1, .r2));
try expect_opcode(0x37, .div(.r1, 0));
try expect_opcode(0x3f, .div(.r1, .r2));

There may be an argument that this is too implicit, and removes information that would have previously been available. However, it is still clear where to look for the relevant function, and it's clear that a function call is being made. It's also clearer now what the return type of the function is, where that was not known before. So I think this change is still reasonable.

2B: Union struct init syntax

This syntax could be used in a large number of places in the std lib and stage 2 compiler. Search for the regex \.\{ \.\w+ = \.\{ to find them. Some examples for convenience:


zig/src/AstGen.zig

Lines 701 to 704 in f42725c

.data = .{ .@"unreachable" = .{
    .safety = true,
    .src_node = gz.nodeIndexToRelative(node),
} },

would become:

.data = .@"unreachable"{
    .safety = true,
    .src_node = gz.nodeIndexToRelative(node),
},

zig/src/AstGen.zig

Lines 6079 to 6082 in f42725c

.data = .{ .switch_capture = .{
    .switch_inst = switch_block,
    .prong_index = undefined,
} },

would become:

.data = .switch_capture{
    .switch_inst = switch_block,
    .prong_index = undefined,
},

Because the void tag syntax works, I intuitively expected the proposed syntax to work as well. So I think this feature has a certain amount of consistency on its side. However, it also has some significant drawbacks:

  • It creates multiple ways to initialize a union
  • It only works for structs or fixed-size arrays

There are alternatives, but I don't like them either:

- The above but also .tag{ value } initializes tag to value

  • Kind of strange, we don't allow braced init anywhere else. Also it's ambiguous for an array type of length 1.

- const u: U = .tag = value;

  • This just drops the .{}. Also it's difficult to read, and it's a new syntactic form which would now be allowed in non-typechecked code.

- const u: U = .tag: value;

  • This is inconsistent: in all other situations, : specifies types, not values.

- const u: U = .tag value;

  • This one looks kind of cool: val = .tag.{ .x = 4, .y = 6 };. But we don't use bare word order like this anywhere else in the language. It's probably ambiguous with something.

- const u: U = .tag(init_expr);

  • Ambiguous with a function call, would kind of break the "function calls look like function calls" rule. If we were going to use any of these options, this would be my preference. But I don't think it's needed.

Because of this, I don't think 2B should be accepted. But I wanted to put it out there anyway for completeness.

Metadata

Labels: accepted (this proposal is planned), proposal (this issue suggests modifications; if it also has the "accepted" label then it is planned)
