Elfy is an ELF parsing library that uses mmap (multi-platform) to map files into memory for reading. It supports both 32-bit and 64-bit ELF formats and handles little/big-endian byte ordering.
First, fetch the elfy library for your project:
zig fetch --save git+https://github.com/rawC1nnamon/zig.elfy#master
Then, after b.addExecutable(...), add the following code to your build.zig:
const elfy = b.dependency("elfy", .{
    .target = target,
    .optimize = optimize,
});
// Where 'exe' is your project executable
exe.root_module.addImport("elfy", elfy.module("elfy"));
Elfy offers a rich API that supports parsing the most relevant ELF data. The binary can be initialized in two modes: .ReadOnly or .ReadWrite. The latter allows basic section content modification and the creation of a new ELF with these changes.
const std = @import("std");
const elfy = @import("elfy");

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // The binary must be a variable
    var binary = try elfy.Elf.init("/bin/cat", .ReadOnly, allocator);
    defer binary.deinit();

    const header = binary.getHeader();
    std.debug.print("Entry {x}, Machine: {s}\n", .{ header.getEntryPoint(), @tagName(header.getMachine()) });

    // Iterator return values must also be variables
    var sections = try binary.getIterator(elfy.ElfSection);
    while (try sections.next()) |section| {
        const name = try binary.getSectionName(section);
        const section_type = section.getType();
        std.debug.print("Name: {s}, Type: {s}\n", .{ name, @tagName(section_type) });
    }
}
Elf is the struct that contains the API; it is accessed as elfy.Elf. Initialize the binary with the init function, Elf.init(path: []const u8, mode: MapMode, allocator: Allocator), and make sure to release it with defer binary.deinit().
Elfy uses a hybrid parsing method that is partly eager and partly lazy. Lightweight data, such as sections, string tables, and symbol names, is processed during init. Heavier data, such as symbol content, dynamic symbols, and relocations, is parsed lazily through an Iterator, which you obtain with try binary.getIterator(...). This function accepts the following iterable data structures:
ElfProgram: Program/segment headers.
ElfSection: Section headers.
ElfSymbol: Symbol sections.
ElfDynamic: Dynamic symbol sections.
ElfRelocation: Relocation sections.
// elfy.ElfProgram, elfy.ElfSection, elfy.ElfSymbol ...
var sections = try binary.getIterator(elfy.ElfSection);
while (try sections.next()) |section| {
    // ...
}
The iterator returned by getIterator(...) provides a next() function that retrieves the next data structure (use a while loop to iterate) and a reset() function that rewinds the iterator. The iterator also exposes flow-control fields such as index, remaining, and count. Avoid modifying these fields.
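As a rough sketch of how next() and reset() fit together, the snippet below walks the section headers twice with a single iterator: once to count them and once to print their names. It assumes reset() takes no arguments and returns nothing, which the description above does not state; adjust the call if its signature differs.

```zig
const std = @import("std");
const elfy = @import("elfy");

pub fn main() !void {
    var binary = try elfy.Elf.init("/bin/cat", .ReadOnly, std.heap.page_allocator);
    defer binary.deinit();

    var sections = try binary.getIterator(elfy.ElfSection);

    // First pass: count the section headers.
    var total: usize = 0;
    while (try sections.next()) |_| total += 1;
    std.debug.print("{d} sections\n", .{total});

    // Assumption: reset() rewinds the iterator and returns void.
    sections.reset();

    // Second pass: print each section name with the same iterator.
    while (try sections.next()) |section| {
        std.debug.print("{s}\n", .{try binary.getSectionName(section)});
    }
}
```

Keeping a local counter, rather than reading the iterator's own count field, leaves the flow-control fields untouched.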
Lastly, the binary instance returned by init(...) contains useful functions that facilitate parsing complex structures:
createElf(name: []const u8) ElfError!void
getHeader() ElfHeader
getIterator(comptime T: type) ElfError!Iterator(T)
getSectionName(section: ElfSection) ElfError![]const u8
getSectionData(section: ElfSection) ElfError![]const u8
getSectionByIndex(index: u64) ElfError!ElfSection
getSectionByName(name: []const u8) ElfError!ElfSection
getSectionByType(section_type: SectionType) ElfError!ElfSection
getSectionDataByName(name: []const u8) ElfError![]const u8
modifySectionData(section: ElfSection, data: []const u8) ElfError!void
getSymbolName(symbol: ElfSymbol) ElfError![]const u8
getDynName(dynamic: ElfDynamic) ElfError!?[]const u8
getRelocationLinkedSymbol(relocation: ElfRelocation, reloc_index: u64) ElfError!ElfSymbol
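To illustrate how the lookup helpers combine, here is a minimal sketch that fetches a section by name and reads its bytes, using only the signatures listed above. The /bin/cat path and the .rodata and .text section names are illustrative choices; a binary without those sections would presumably return an ElfError from the lookup.

```zig
const std = @import("std");
const elfy = @import("elfy");

pub fn main() !void {
    var binary = try elfy.Elf.init("/bin/cat", .ReadOnly, std.heap.page_allocator);
    defer binary.deinit();

    // Two-step lookup: resolve the section header, then read its contents.
    const rodata = try binary.getSectionByName(".rodata");
    const data = try binary.getSectionData(rodata);
    std.debug.print("{s}: {d} bytes\n", .{ try binary.getSectionName(rodata), data.len });

    // One-step lookup when only the bytes are needed.
    const text = try binary.getSectionDataByName(".text");
    std.debug.print(".text: {d} bytes\n", .{text.len});
}
```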
Elfy contains six data structures: ElfHeader, ElfProgram, ElfSection, ElfSymbol, ElfDynamic, and ElfRelocation. These are tagged unions that contain both 32-bit and 64-bit fields, but you don't need to use switch (...) to access the content. Each data structure has methods to retrieve information (see types). For example:
var symbols = try binary.getIterator(elfy.ElfSymbol);
while (try symbols.next()) |symbol| {
    _ = symbol.getInfo();
    _ = symbol.getSize();
    _ = symbol.getBind();
    _ = symbol.getType();
    _ = symbol.getVisibility();
    // ...
}
There is one special case: relocation.getType(...) receives a machine value as a parameter (for example, header.getMachine()) and returns a tagged union containing the relocation type for each architecture. You must use switch (reloc_type) to unpack it. For example:
const header = binary.getHeader();
var relocations = try binary.getIterator(elfy.ElfRelocation);
while (try relocations.next()) |relocation| {
    const linked_symbol = try binary.getRelocationLinkedSymbol(relocation, relocations.index);
    const symbol_name = try binary.getSymbolName(linked_symbol);

    // relocation.getType() receives the ELF machine as parameter
    const reloc_type = try relocation.getType(header.getMachine());
    switch (reloc_type) {
        .X86_64 => |t| std.debug.print("Type: {s}, Symbol Name: {s}\n", .{ @tagName(t), symbol_name }),
        // You can add more architectures if needed (ARM, RISCV, SPARC, etc.)
        else => std.debug.print("[!] Unsupported architecture\n", .{}),
    }
}
Supported relocation types are:
@"386"
S390
ARM
PPC
PPC64
MIPS
ALPHA
AARCH64
LOONGARCH
RISCV
SPARC
X86_64
With the binary opened in .ReadWrite mode, you can modify the content of a section as long as the new content is no longer than the original section data. For example:
const buf: []const u8 = &[_]u8{
    0x48, 0xc7, 0xc0, 0x01, 0x00, 0x00, 0x00,
    0x48, 0xc7, 0xc7, 0x01, 0x00, 0x00, 0x00,
    0x48, 0x8d, 0x35, 0x0a, 0x00, 0x00, 0x00,
    0x48, 0xc7, 0xc2, 0x0c, 0x00, 0x00, 0x00,
    0x0f, 0x05, 0x48, 0xc7, 0xc0, 0x3c, 0x00,
    0x00, 0x00, 0x48, 0xc7, 0xc7, 0x00, 0x00,
    0x00, 0x00, 0x0f, 0x05, 0x48, 0x65, 0x6c,
    0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c,
    0x64, 0x0a,
    // ...
};
const text = try binary.getSectionByName(".text");
try binary.modifySectionData(text, buf);
try binary.createElf("new_binary");
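Since the replacement must fit inside the existing section, it can help to check the length before writing. The sketch below wraps that check in patchSection, a hypothetical helper that is not part of elfy; the ./a.out path, the four-byte patch, and the PatchTooLarge error name are likewise placeholders chosen for illustration.

```zig
const std = @import("std");
const elfy = @import("elfy");

/// Hypothetical helper (not part of elfy): refuses patches that are longer
/// than the section they are meant to overwrite.
fn patchSection(binary: *elfy.Elf, name: []const u8, patch: []const u8) !void {
    const section = try binary.getSectionByName(name);
    const original = try binary.getSectionData(section);
    if (patch.len > original.len) return error.PatchTooLarge;
    try binary.modifySectionData(section, patch);
}

pub fn main() !void {
    // Placeholder path: use a binary you are allowed to open in .ReadWrite mode.
    var binary = try elfy.Elf.init("./a.out", .ReadWrite, std.heap.page_allocator);
    defer binary.deinit();

    // Four x86 NOP bytes, used purely as example data.
    const patch = [_]u8{ 0x90, 0x90, 0x90, 0x90 };
    try patchSection(&binary, ".text", &patch);

    // Write the modified image to a new file.
    try binary.createElf("new_binary");
}
```

The library may well perform the same size check internally; doing it up front simply makes the failure explicit at the call site.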
I created this library with the features I needed. If you're looking for a data structure that this parser doesn't cover (e.g., ElfNote, relr, etc.), you can write custom functions that parse it from the raw section data. For example, if you want to parse notes:
const elf = std.elf;

fn parseNote(note: []const u8) !elf.Elf64_Nhdr { ... }
// ...
var sections = try binary.getIterator(elfy.ElfSection);
while (try sections.next()) |section| {
    // Skip every section that does not hold notes
    if (section.getType() != .SHT_NOTE) continue;

    const raw_note = try binary.getSectionData(section);
    const note = try parseNote(raw_note);
    _ = note; // use the parsed note header here
}
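One possible shape for such a custom parser is sketched below. It decodes only the fixed-size note header at the start of the section data and assumes the file's byte order matches the host's; the TruncatedNote error name is made up for this example, and std.elf.Elf64_Nhdr is taken from Zig's standard library rather than from elfy.

```zig
const std = @import("std");
const elf = std.elf;

/// Hypothetical parser: copies the leading note header out of `note`.
/// Assumption: the ELF file uses the host's byte order.
fn parseNote(note: []const u8) !elf.Elf64_Nhdr {
    const header_size = @sizeOf(elf.Elf64_Nhdr);
    // Reject buffers too short to contain a complete header.
    if (note.len < header_size) return error.TruncatedNote;
    return std.mem.bytesToValue(elf.Elf64_Nhdr, note[0..header_size]);
}
```

The name and descriptor bytes follow the header, with lengths given by n_namesz and n_descsz, so a fuller parser would slice them out of the remaining data.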