fex: Add support for library forwarding #413255

Open: neobrain wants to merge 1 commit into master


Contributor

@neobrain neobrain commented Jun 2, 2025

Library forwarding (often misleadingly called "thunks") is a FEX feature that redirects API calls to specific ARM64 host libraries instead of emulating the full x86 builds of those libraries.

This has multiple benefits:

  • it improves performance due to avoiding translation overhead (particularly in games)
  • it enables support for proprietary / non-Mesa GPU drivers
  • it's required for compatibility with some specific applications

To test this feature locally, make sure to run FEXConfig and enable the various libraries in the Libraries tab.

Things done

The new derivation successfully builds support for all libraries FEX can forward by creating a development RootFS containing glibc/libstdc++ headers and some additional library headers. As far as the build side is concerned, this all works. (I was actually surprised how easy nix makes this!)

Unfortunately there is a fundamental roadblock I'm unsure how to resolve: To enable library forwarding (for example for libvulkan), FEX uses a libvulkan-host.so wrapper library that dlopen()s the true ARM64 libvulkan.so installed on the host. Currently that dlopen call fails, since it only searches the nix store. Looking in the nix store doesn't really make sense here though, since FEX is more of a runtime and the true library consumer is the emulated application (which itself is typically not built by nix).

This problem isn't exclusive to Vulkan but probably applies to all other libraries FEX can forward. I'm guessing there is a "right way" to do this on NixOS, but ideally a non-NixOS nix-shell should be able to just load an externally managed /usr/lib64/libvulkan.so without further ado.

Does anyone with more nix experience than me have any advice on how to proceed here?

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • Nixpkgs 25.11 Release Notes (or backporting 24.11 and 25.05 Nixpkgs Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
  • NixOS 25.11 Release Notes (or backporting 24.11 and 25.05 NixOS Release notes)
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.


  src = fetchFromGitHub {
-   owner = "FEX-Emu";
+   owner = "neobrain";
Contributor Author

This is currently required for neobrain/FEX@5811914 , which will be included in the next FEX release (2506).

Member

It looks like quite a simple patch. Would it be possible to use substituteInPlace for that instead, like this? Or is there something else in that branch that is needed?

substituteInPlace ThunkLibs/GuestLibs/CMakeLists.txt \
  --replace-fail "--target=i686-linux-unknown" "--target=i686-linux-gnu"

Contributor Author

The patch just got merged upstream and a release is scheduled later this week, so we can drop the use of my fork soon :)

@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Jun 2, 2025
@nix-owners nix-owners bot requested a review from andre4ik3 June 2, 2025 15:16
@NixOSInfra NixOSInfra added the 12.first-time contribution This PR is the author's first one; please be gentle! label Jun 2, 2025
Member

@andre4ik3 andre4ik3 left a comment

I compiled it and ran the tests, seems to work fine. Tested running basic binaries too. However I'm not able to test the library forwarding functionality as I don't (currently) have any graphical aarch64 host. I'll try to set up a VM to test it, but for now here is some feedback on just the package file :)


  llvmPackages.stdenv.mkDerivation (finalAttrs: {
    pname = "fex";
-   version = "2505";
+   # version = "2505";
+   version = "5811914a784cc6fdfdc90242ab7f8ea6ff39ec38";
Member

Maybe it's better to prefix it with 2505 or 2506 to at least indicate what base version it is? I'm not sure having just a raw commit hash as a version is good...

Contributor

IMO we should just wait for 2506 to be released

@andre4ik3
Member

Oh also nixfmt is failing, please run nix fmt :P

@neobrain
Contributor Author

neobrain commented Jun 2, 2025

I compiled it and ran the tests, seems to work fine. Tested running basic binaries too. However I'm not able to test the library forwarding functionality as I don't (currently) have any graphical aarch64 host. I'll try to set up a VM to test it, but for now here is some feedback on just the package file :)

Thanks for the suggestions and for testing so quickly!

FWIW I think for Vulkan you should be able to run vulkaninfo even on headless setups, assuming you can install at least the lvp (lavapipe) driver on that system.

@kuruczgy
Contributor

kuruczgy commented Jun 2, 2025

To test this feature locally, make sure to run FEXConfig and set up the appropriate paths in the Libraries tab (fex-emu/HostThunks and fex-emu/GuestThunks from the nix-store, respectively).

Could you elaborate a bit more on this? What paths should be set exactly? Are the defaults appropriate?

For me this PR works and I can run Portal 2 fine, but this feels a bit too easy... How can I verify that it's actually running with the forwarded libraries?

Also I noticed that the HostThunks are located at /nix/store/...-fex-.../nix/store/...-fex.../lib/fex-emu/HostThunks, somehow the $out path is getting appended to itself.

Unfortunately there is a fundamental roadblock I'm unsure how to resolve: To enable library forwarding (for example for libvulkan), FEX uses a libvulkan-host.so wrapper library that dlopen()s the true ARM64 libvulkan.so installed on the host. Currently that dlopen call fails, since it only searches the nix store. Looking in the nix store doesn't really make sense here though, since FEX is more of a runtime and the true library consumer is the emulated application (which itself is typically not built by nix).

This problem isn't exclusive to Vulkan but probably applies to all other libraries FEX can forward. I'm guessing there is a "right way" to do this on NixOS, but ideally a non-NixOS nix-shell should be able to just load an externally managed /usr/lib64/libvulkan.so without further ado.

Could you explain in a little more detail how all this is supposed to work? Here is my current understanding:

  • When a guest program tries to open something that looks like a libvulkan.so, FEX instead opens libvulkan-host.so (which is an aarch64-linux lib), and sets up all the magic required for the guest to call into it.
  • libvulkan-host.so is just a shim that dlopens a libvulkan.so from the host, and just forwards all the calls.
  • The path to libvulkan.so should be known at compile time (it's "${vulkan-loader}/lib/libvulkan.so"); if necessary, this can be hardcoded into the dlopen call. But also, does dlopen not respect RPATH? That should also contain "${vulkan-loader}/lib", and then dlopen("libvulkan.so") or something similar should just work.

@neobrain
Contributor Author

neobrain commented Jun 2, 2025

To test this feature locally, make sure to run FEXConfig and set up the appropriate paths in the Libraries tab (fex-emu/HostThunks and fex-emu/GuestThunks from the nix-store, respectively).

Could you elaborate a bit more on this? What paths should be set exactly? Are the defaults appropriate?

For me this PR works and I can run Portal 2 fine, but this feels a bit too easy... How can I verify that it's actually running with the forwarded libraries?

Thanks for checking back on this: The defaults will not enable any library forwarding, so instead do the following:

  • In FEXConfig's Libraries tab, tick the checkboxes for libraries that should use forwarding
  • In FEXConfig's Libraries tab, set Host library folder to /nix/store/<fex>/lib/fex-emu/HostThunks and Guest library folder to /nix/store/<fex>/share/fex-emu/GuestThunks (nixpkgs should probably just patch FEX to hardcode this path) (no longer needed)
  • For 32-bit applications you need to manually edit ~/.fex-emu/Config.json and point the ThunkHostLibs32 and ThunkGuestLibs32 keys to the corresponding HostThunk_32 and GuestThunks_32 paths. (no longer needed)

If set up properly and library forwarding successfully kicks in, FEX should log a LoadLib: message for each forwarded library loaded by the emulated application (this is printed to FEXServer or stderr, depending on your logging configuration in FEXConfig):

[DEBUG] LoadLib: libvulkan -> /nix/store/<fex>/lib/fex-emu/HostThunks/libvulkan-host.so
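
As a rough way to check this from a terminal, the sketch below filters FEX log output for the LoadLib: lines described above and prints the forwarded library names. The function name forwarded_libs is made up for illustration, and the pattern only assumes the log format shown in this comment; it is not an official FEX tool.

```shell
#!/bin/sh
# Print the name of every library that FEX reported as forwarded.
# Reads FEX log output on stdin and matches lines of the form
#   [DEBUG] LoadLib: <name> -> <path to the -host.so wrapper>
# Error lines such as "[ERROR] LoadLib: Failed to ..." are ignored.
forwarded_libs() {
  sed -n 's/^\[DEBUG] LoadLib: \([^ ]*\) -> .*/\1/p' | sort -u
}
```

Assuming logging goes to stderr rather than FEXServer, something like `FEXInterpreter ./some-app 2>&1 | forwarded_libs` (with ./some-app standing in for the emulated program) should print one line per forwarded library, and nothing if forwarding did not kick in.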

Could you explain in a little more detail how all this is supposed to work? Here is my current understanding:

You got it mostly right. We can't entirely rely on magic though, so each of the layers is a bit thicker than you might expect:

  • FEX never exposes the "true" x86 libvulkan.so to the guest. Instead, FEX intercepts any filesystem syscalls that touch libvulkan.so and returns a custom libvulkan-guest.so to the guest. This is still an x86 library, which is why cross-compilation is required in this PR.
  • libvulkan-guest.so exports the same functions that the original library exported, but each function is a stub that triggers a FEX-specific architecture transition to call into the corresponding function from libvulkan-host.so. (Depending on the library and on the function, more bespoke logic may be involved.)
  • libvulkan-host.so (indeed an aarch64-linux lib) also exports the same functions as the original library. This library will query function pointers from the host aarch64 libvulkan.so via dlopen and dlsym, so that its function implementations can invoke the true host implementation. (Again, depending on the library and on the function, more bespoke logic may be involved.)

Due to how flexible the system needs to be to deal with all the oddities in each library, I implemented custom tooling for FEX to automate part of the boilerplate required in all this process:

  • This is the thunkgen tool in FEX/ThunkLibs/Generator
  • It parses library headers using libclang, parses function declarations along with parameters, and generates the boilerplate used for libvulkan-guest.so and for libvulkan-host.so
  • thunkgen parses the library headers four times: Once in the 32-bit guest context (x86), once in the 64-bit guest context (x86_64), and twice in the host context (arm64). This is done to detect any potential data layout differences between the architectures and emit struct repacking logic as needed. That's also why a dev rootfs is needed with x86 headers.

It gets more complicated once you involve things like function pointers, but as far as packaging is concerned I've already gone into more detail than you probably cared about :)

The path to libvulkan.so should be known at compile time (it's "${vulkan-loader}/lib/libvulkan.so"); if necessary, this can be hardcoded into the dlopen call. But also, does dlopen not respect RPATH? That should also contain "${vulkan-loader}/lib", and then dlopen("libvulkan.so") or something similar should just work.

I'm not familiar with RPATH, but I do remember some messages in the build logs about it that only came up with nix. Perhaps related? I'll look again once I fix my nix environment. I remembered incorrectly. Doesn't look like LD_DEBUG=libs works with nix, so I'm not sure how to get more information out of it than "it doesn't work".

@neobrain
Contributor Author

neobrain commented Jun 2, 2025

Also I noticed that the HostThunks are located at /nix/store/...-fex-.../nix/store/...-fex.../lib/fex-emu/HostThunks, somehow the $out path is getting appended to itself.

FEX's CMake files use ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}/fex-emu to determine the installation path for the -host.so libraries. Apparently CMAKE_INSTALL_LIBDIR is an absolute path into the nix store, whereas CMake documentation says it's either lib or lib64. This is easy to solve with another substituteAll line, but I'm surprised you're not hitting this in other packages.

EDIT: Using CMAKE_INSTALL_FULL_LIBDIR works and is actually shorter anyway. Added this as a commit in my fork, hopefully will be included in the upcoming release as well.
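
To illustrate why the path doubles, here is a small sketch of the string concatenation involved. The helper names and the /nix/store/HASH-fex path are placeholders for illustration only; the first function mirrors the ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}/fex-emu pattern, the second shows the result one gets when an already-absolute libdir is used directly, which is roughly what CMAKE_INSTALL_FULL_LIBDIR provides.

```shell
#!/bin/sh
# Naive join: always prepends the prefix, as in
#   ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}/fex-emu
naive_thunk_dir() {
  printf '%s/%s/fex-emu\n' "$1" "$2"
}

# Tolerant join: an absolute libdir is used as-is, a relative one
# (the "lib" or "lib64" that CMake documents) is placed under the prefix.
safe_thunk_dir() {
  case "$2" in
    /*) printf '%s/fex-emu\n' "$2" ;;
    *)  printf '%s/%s/fex-emu\n' "$1" "$2" ;;
  esac
}
```

With prefix /nix/store/HASH-fex and libdir /nix/store/HASH-fex/lib, the naive join yields the nested /nix/store/HASH-fex//nix/store/HASH-fex/lib/fex-emu, while the tolerant join yields /nix/store/HASH-fex/lib/fex-emu.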

@neobrain
Contributor Author

neobrain commented Jun 3, 2025

I've been looking some more into the dlopen issue. As suspected, dlopen will search in all sorts of paths in the nix store, but it will skip /usr/lib64 on my openSUSE setup (confirmed this via strace).

The path to libvulkan.so should be known at compile time (it's "${vulkan-loader}/lib/libvulkan.so"),

Why though? It's the emulated application that depends on Vulkan, not FEX itself, so FEX seems like the wrong place to inject a vulkan-loader package dependency into. Note that we're planning on massively scaling up the number of libraries we forward in FEX, so such a dependency would need to be added for every single supported library.

if necessary this can be hardcoded into the dlopen call

Sadly using dlopen("/usr/lib64/libvulkan.so") will just fail later down the road when the Vulkan loader attempts to load its own dependencies (libz.so & friends).

But also does dlopen not respect RPATH? That should also contain "${vulkan-loader}/lib", and then dlopen("libvulkan.so") or something similar should just work.

Now that I understand a little bit better what RPATH is: Does this require adding vulkan-loader as a dependency to FEX, and if not then what use cases would this help with? (Note that FEX is typically run on proprietary programs, which generally aren't built with nix.)

EDIT: Actually, even when using vulkan-loader the same problem comes up: libvulkan.so.1 will be found from the nix-store, but the actual host driver (as listed by absolute path in /usr/share/vulkan/icd.d/xxx_icd.aarch64.json) can't be loaded since it still depends on e.g. /usr/lib64/libz.so.1. Meanwhile, it seems that I can't use LD_LIBRARY_PATH=/usr/lib64 since that overrides libc as well, which breaks the nix-built FEX entirely.
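
For anyone wanting to reproduce this, a minimal diagnostic sketch: it prints the library_path entry of each Vulkan ICD manifest passed as an argument, so the driver libraries can then be inspected one by one. The function name is made up, and the sed pattern assumes the key and value sit on the same line, which is an assumption about how the manifests are formatted rather than a guarantee.

```shell
#!/bin/sh
# Print the "library_path" value found in each ICD manifest given as argument.
# Assumes the key and its string value appear on one line of the JSON file.
icd_library_paths() {
  sed -n 's/.*"library_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$@"
}
```

Running `icd_library_paths /usr/share/vulkan/icd.d/*.json` and then `ldd` on each printed path should show which host libraries (libz.so.1 and so on) the driver expects to find outside the nix store.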

@kuruczgy
Contributor

kuruczgy commented Jun 3, 2025

Why though? It's the emulated application that depends on Vulkan, not FEX itself, so FEX seems like the wrong place to inject a vulkan-loader package dependency into.

Well the emulated application depends on the x86 vulkan-loader. How would you guarantee that an aarch64 vulkan-loader is even present on the system? I don't see any other way than just adding it as a dep of FEX.

Note that we're planning on massively scaling up the number of libraries we forward in FEX, so such a dependency would need to be added for every single supported library.

Indeed that seems like an issue. For now I would just kick this issue down the road, and focus on the currently available forwarded libraries.

Sadly using dlopen("/usr/lib64/libvulkan.so")

Nothing guarantees that /usr/lib64/libvulkan.so either exists, or if it does it's an aarch64 and not an arm lib.

Now that I understand a little bit better what RPATH is: Does this require adding vulkan-loader as a dependency to FEX, and if not then what use cases would this help with?

Yes, I think for now we should add vulkan-loader as a dep.

@neobrain
Contributor Author

neobrain commented Jun 3, 2025

Yes, I think for now we should add vulkan-loader as a dep.

Seems like an okay (if unsatisfactory) solution for NixOS deployments. What would non-NixOS users do, where vulkan-loader will still fail?

@neobrain
Contributor Author

neobrain commented Jun 3, 2025

With the latest update, FEXConfig will properly initialize HostThunks and GuestThunks automatically if ~/.fex-emu/Config.json does not already exist. The paths won't be updated automatically when updating the FEX package though, so we'll still want to fix this by hardcoding the proper paths in the FEX source.

@andre4ik3 andre4ik3 mentioned this pull request Jun 4, 2025
13 tasks
@neobrain
Contributor Author

neobrain commented Jun 9, 2025

The paths won't be updated automatically when updating the FEX package though, so we'll still want to fix this by hardcoding the proper paths in the FEX source.

What would be a good way to do this? Is it possible to know and query the final store path of FEX itself and patch its own sources with that? Otherwise, I think I would have to fall back to inferring the location at runtime based on the FEX executable path or similar.

@andre4ik3
Member

Is it possible to know and query the final store path of FEX itself

If you are talking about patching FEX sources so it has a reference to the store path where it's installed, you can try substituteInPlace/replaceVars with $out
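
Since substituteInPlace only exists inside the nixpkgs stdenv, here is a standalone sketch of the same idea in plain sh: replace a literal placeholder with the install path and fail loudly if the placeholder is missing, which is the guarantee --replace-fail is meant to give. The function name replace_fail, the @THUNK_DIR@ placeholder and the file it is applied to are all hypothetical; the thread does not say which FEX source file would be patched.

```shell
#!/bin/sh
# Replace every occurrence of a placeholder in a file, failing if it is absent.
# Limitation of this sketch: the placeholder must not contain regex
# metacharacters or the "|" delimiter (something like @THUNK_DIR@ is fine).
replace_fail() {
  file=$1 placeholder=$2 value=$3
  grep -qF -- "$placeholder" "$file" || {
    echo "pattern $placeholder not found in $file" >&2
    return 1
  }
  # Escape characters that are special in the replacement part of sed's s command.
  escaped=$(printf '%s' "$value" | sed 's/[\\&|]/\\&/g')
  sed "s|$placeholder|$escaped|g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

In a derivation, the equivalent would be a substituteInPlace call in a patch phase with "$out/lib/fex-emu/HostThunks" (or similar) as the replacement, so the final store path ends up baked into the binary.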


# Add include paths for thunkgen invocation
substituteInPlace ThunkLibs/HostLibs/CMakeLists.txt \
--replace-fail "-- " "-- $(cat ${llvmPackages.stdenv.cc}/nix-support/libc-cflags) $(cat ${llvmPackages.stdenv.cc}/nix-support/libcxx-cxxflags) $NIX_CFLAGS_COMPILE"
Contributor Author

If I move from stdenvGcc13 to stdenv(Gcc14), this fails to find C++ headers (which are located e.g. in 5nyv2zzsm5ckfrxji7jgmhf5nb31snwq-x86_64-unknown-linux-gnu-gcc-13.3.0/include/c++/13.3.0/type_traits). Any advice how I can retrieve that path symbolically? I tried using gcc-unwrapped, but that is somehow not recognized within the llvmPackages context.

Contributor Author

Actually, it turns out the HostLibs build is fine and it was the GuestLibs that were failing. I've added the aarch64 libstdc++ header paths to the latter and it seems to compile fine (presumably it's the same set of headers as in pkgsCross.gnu64/32 anyway).

@rowanG077
Member

rowanG077 commented Jun 9, 2025

Just as a comment: I'm running on an Apple Silicon Linux system with 16k pages and I get an error when building this PR about a jemalloc page size mismatch:

[2626/6715] Linking CXX executable EmitterTests/Emitter_SVE_Tests
FAILED: EmitterTests/Emitter_SVE_Tests FEXCore/unittests/Emitter/Emitter_SVE_Tests-b12d07c_tests.cmake /build/source/build/FEXCore/unittests/Emitter/Emitter_SVE_Tests-b12d07c_tests.cmake 
: && /nix/store/vp6zgsnwkqjn89lgdkizr2vpxp6zvv9m-clang-wrapper-19.1.7/bin/clang++ -O3 -DNDEBUG -fomit-frame-pointer -flto=thin -fuse-ld=lld -fPIE -pie -Xlinker --dependency-file=FEXCore/unittests/Emitter/CMakeFiles/Emitter_SVE_Tests.dir/link.d External/vixl/src/CMakeFiles/vixl.dir/code-buffer-vixl.cc.o External/vixl/src/CMakeFiles/vixl.dir/compiler-intrinsics-vixl.cc.o External/vixl/src/CMakeFiles/vixl.dir/cpu-features.cc.o External/vixl/src/CMakeFiles/vixl.dir/utils-vixl.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/assembler-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/assembler-sve-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/debugger-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/decoder-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/disasm-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/cpu-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/instructions-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/macro-assembler-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/macro-assembler-sve-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/operands-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/pointer-auth-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/registers-aarch64.cc.o FEXCore/unittests/Emitter/CMakeFiles/Emitter_SVE_Tests.dir/SVE_Tests.cpp.o -o EmitterTests/Emitter_SVE_Tests  External/Catch2/src/libCatch2Main.a  FEXCore/Source/libFEXCore_Base.a  FEXCore/Source/libJemallocLibs.a  External/Catch2/src/libCatch2.a  /nix/store/pi9j5cp4jdqb8y0fwbky0cp16vzcddi3-fmt-10.2.1/lib/libfmt.so.10.2.1  /nix/store/4bz4qd9lardx0fmnc43h3wimsj5bagci-xxHash-0.8.3/lib/libxxhash.so  External/cephes/libcephes_128bit.a  External/SoftFloat-3e/libsoftfloat_3e.a  -ldl  External/jemalloc/libFEX_jemalloc.a  External/jemalloc_glibc/libFEX_jemalloc_glibc.a  -lpthread && cd /build/source/build/FEXCore/unittests/Emitter && 
/nix/store/9cbpyw2256lkll43i97438fhb26h0xvc-cmake-3.31.6/bin/cmake -D TEST_TARGET=Emitter_SVE_Tests -D TEST_EXECUTABLE=/build/source/build/EmitterTests/Emitter_SVE_Tests -D TEST_EXECUTOR= -D TEST_WORKING_DIR=/build/source/build/FEXCore/unittests/Emitter -D TEST_SPEC= -D TEST_EXTRA_ARGS= -D TEST_PROPERTIES= -D TEST_PREFIX= -D TEST_SUFFIX=.SVE_Tests.Emitter -D TEST_LIST=Emitter_SVE_Tests_TESTS -D TEST_REPORTER= -D TEST_OUTPUT_DIR= -D TEST_OUTPUT_PREFIX= -D TEST_OUTPUT_SUFFIX= -D TEST_DL_PATHS= -D CTEST_FILE=/build/source/build/FEXCore/unittests/Emitter/Emitter_SVE_Tests-b12d07c_tests.cmake -P /build/source/External/Catch2/extras/CatchAddTests.cmake
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
terminate called without an active exception
CMake Error at /build/source/External/Catch2/extras/CatchAddTests.cmake:70 (message):
  Error running test executable
  '/build/source/build/EmitterTests/Emitter_SVE_Tests':

    Result: Subprocess aborted
    Output: 

@kuruczgy
Contributor

kuruczgy commented Jun 9, 2025

16k pages

AFAIK the official upstream stance is that only 4k pages are supported by FEX.

@rowanG077
Member

rowanG077 commented Jun 9, 2025

Right, when running programs. I would at least expect to be able to build this, so that muvm can then be used to run FEX. But if the build itself does not work on 16k pages, the entire FEX build will need to be done in muvm as well.

@neobrain
Contributor Author

neobrain commented Jun 9, 2025

Just as a comment: I'm running on an Apple Silicon Linux system with 16k pages and I get an error when building this PR about a jemalloc page size mismatch:

You would have to set doCheck=false to skip the tests to finish the build, but FEX indeed doesn't support running on kernels with 16K page sizes. The common Asahi workaround is to run FEX in a microvm. https://github.com/nrabulinski/nixos-muvm-fex might help, but it's in experimental state AFAIK.
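
A quick way to tell in advance whether a machine is affected is to check the kernel page size. The sketch below treats 4096 bytes as the only supported size, which is an assumption taken from the comments in this thread rather than from FEX documentation; the function name is made up.

```shell
#!/bin/sh
# Succeed only if the page size (in bytes) is 4096.
# With no argument, the running kernel is queried via getconf.
page_size_supported() {
  size=${1:-$(getconf PAGESIZE)}
  [ "$size" -eq 4096 ]
}
```

For example, `page_size_supported || echo "expect the FEX test suite to abort here"` prints the warning on a 16K-page Asahi kernel, where the tests shown above fail with "Unsupported system page size".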

@neobrain
Contributor Author

neobrain commented Jun 9, 2025

If you are talking about patching FEX sources so it has a reference to the store path where it's installed, you can try substituteInPlace/replaceVars with $out

Perfect, thanks! The correct paths are now baked into the FEX executables, so now the only configuration needed is ticking the FEXConfig checkboxes for libraries that should be forwarded.

Member

@andre4ik3 andre4ik3 left a comment

LGTM, builds and runs the test suite OK. Also verified there's no longer a nested Nix directory in the result.

@neobrain neobrain force-pushed the fex_library_forwarding branch from 476ba94 to fb7f952 Compare June 14, 2025 10:43
@neobrain
Contributor Author

LGTM, builds and runs the test suite OK. Also verified there's no longer a nested Nix directory in the result.

Thanks for re-checking! I rebased and squashed the patches in the PR now (no other change).

I've been thinking some more about the "dlopen won't find /usr/lib64/libXYZ.so" problem. I think the most reasonable (if unsatisfying) option is to offload that problem to the user, who needs to ensure the libraries are visible at runtime. For NixOS, this could be done using nixGL for OpenGL and Vulkan, and NIX_LD_LIBRARY_PATH for other libraries.

If that's okay, this should be ready to merge now!

@github-actions github-actions bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Jun 14, 2025
@andre4ik3
Member

the "dlopen won't find /usr/lib64/libXYZ.so" problem. I think the most reasonable (if unsatisfying) option is to offload that problem to the user, who needs to ensure the libraries are visible at runtime. For NixOS, this could be done using nixGL for OpenGL and Vulkan, and NIX_LD_LIBRARY_PATH for other libraries.

In NixOS this can probably be abstracted with a module and FHS environment for user-specified packages that should be wrapped, similar to how Steam is wrapped, although I'm not sure. I'll try and implement it after this PR is merged.

@nixpkgs-ci nixpkgs-ci bot added 12.approved-by: package-maintainer This PR was reviewed and approved by a maintainer listed in any of the changed packages. and removed 12.first-time contribution This PR is the author's first one; please be gentle! labels Jun 25, 2025
@valpackett

valpackett commented Jul 6, 2025

Currently testing on non-NixOS (postmarketOS), with a flake that generates a dev shell, I want it to just use Mesa from nixpkgs:

      devShells.aarch64-linux.default =
        armpkgs.mkShell { packages = [ armpkgs.fex armpkgs.mesa pkgs.vulkan-tools ]; };

Qt's X11 backend doesn't like freedreno(?)
❯ FEXConfig
Warning: Ignoring XDG_SESSION_TYPE=wayland on Gnome. Use QT_QPA_PLATFORM=wayland to run on Wayland anyway.

(process:2017432): Gtk-WARNING **: 03:24:45.559: Locale not supported by C library.
	Using the fallback 'C' locale.
Opening /home/val/.config/.fex-emu/Config.json
qt.glx: qglx_findConfig: Failed to finding matching FBConfig for QSurfaceFormat(version 2.0, options QFlags<QSurfaceFormat::FormatOption>(), depthBufferSize -1, redBufferSize 1, greenBufferSize 1, blueBufferSize 1, alphaBufferSize -1, stencilBufferSize -1, samples -1, swapBehavior QSurfaceFormat::SingleBuffer, swapInterval 1, colorSpace QSurfaceFormat::DefaultColorSpace, profile  QSurfaceFormat::NoProfile)
qt.glx: qglx_findConfig: Failed to finding matching FBConfig for QSurfaceFormat(version 2.0, options QFlags<QSurfaceFormat::FormatOption>(), depthBufferSize -1, redBufferSize 1, greenBufferSize 1, blueBufferSize 1, alphaBufferSize -1, stencilBufferSize -1, samples -1, swapBehavior QSurfaceFormat::SingleBuffer, swapInterval 1, colorSpace QSurfaceFormat::DefaultColorSpace, profile  QSurfaceFormat::NoProfile)
qt.glx: qglx_findConfig: Failed to finding matching FBConfig for QSurfaceFormat(version 2.0, options QFlags<QSurfaceFormat::FormatOption>(), depthBufferSize -1, redBufferSize 1, greenBufferSize 1, blueBufferSize 1, alphaBufferSize -1, stencilBufferSize -1, samples -1, swapBehavior QSurfaceFormat::SingleBuffer, swapInterval 1, colorSpace QSurfaceFormat::DefaultColorSpace, profile  QSurfaceFormat::NoProfile)
Could not initialize GLX
fish: Job 1, 'FEXConfig' terminated by signal SIGABRT (Abort)

❯ QT_QPA_PLATFORM=wayland FEXConfig
Warning: Ignoring XDG_SESSION_TYPE=wayland on Gnome. Use QT_QPA_PLATFORM=wayland to run on Wayland anyway.
qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in ""
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, xcb

So I've also had to add armpkgs.libsForQt5.qt5.qtwayland.

Then in FEXConfig I ticked logging + host libraries (drm, Vulkan, WaylandClient, GL) and saved the config.

❯ file (which vulkaninfo)
/nix/store/pm6i90ajmz3xc6d1aiynr3z9xf8a89fi-vulkan-tools-1.4.313.0/bin/vulkaninfo: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66/lib/ld-linux-x86-64.so.2, for GNU/Linux 3.10.0, not stripped
❯ FEXInterpreter (which vulkaninfo)
ERROR: [Loader Message] Code 0 : /nix/store/bpsn4p46ga7pfmprmbicmvl7j00xdsjd-mesa-25.1.3/lib/libvulkan_asahi.so: cannot open shared object file: No such file or directory
ERROR: [Loader Message] Code 0 : loader_icd_scan: Failed loading library associated with ICD JSON /nix/store/bpsn4p46ga7pfmprmbicmvl7j00xdsjd-mesa-25.1.3/lib/libvulkan_asahi.so. Ignoring this JSON
ERROR: [Loader Message] Code 0 : /nix/store/bpsn4p46ga7pfmprmbicmvl7j00xdsjd-mesa-25.1.3/lib/libvulkan_broadcom.so: cannot open shared object file: No such file or directory
ERROR: [Loader Message] Code 0 : loader_icd_scan: Failed loading library associated with ICD JSON /nix/store/bpsn4p46ga7pfmprmbicmvl7j00xdsjd-mesa-25.1.3/lib/libvulkan_broadcom.so. Ignoring this JSON
[…]
ERROR: [Loader Message] Code 0 : vkCreateInstance: Found no drivers!
Cannot create Vulkan instance.
❯ file /nix/store/bpsn4p46ga7pfmprmbicmvl7j00xdsjd-mesa-25.1.3/lib/libvulkan_freedreno.so
/nix/store/bpsn4p46ga7pfmprmbicmvl7j00xdsjd-mesa-25.1.3/lib/libvulkan_freedreno.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, BuildID[sha1]=64ebb13d42c0577bad0610d4e9d38e54566f8185, not stripped

Somehow the config is ignored, there's no logging nor are thunks used..?! Looking at strace, config is read, thunk libs are checked.. but x86-64 libvulkan is indeed loaded without replacement:

openat(AT_FDCWD, "/home/val/.config/.fex-emu/AppConfig/vulkaninfo.json", O_RDONLY) = -1 ENOENT (No such file or directory)
faccessat(AT_FDCWD, "/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/share/fex-emu/GuestThunks//libdrm-guest.so", F_OK) = 0
faccessat(AT_FDCWD, "/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/share/fex-emu/GuestThunks//libvulkan-guest.so", F_OK) = 0
faccessat(AT_FDCWD, "/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/share/fex-emu/GuestThunks//libwayland-client-guest.so", F_OK) = 0
faccessat(AT_FDCWD, "/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/share/fex-emu/GuestThunks//libGL-guest.so", F_OK) = 0
[...]
openat(AT_FDCWD, "/nix/store/nszik1q8ffmvsqk54kbc75dwyxwvi2nm-vulkan-loader-1.4.313.0/lib/libvulkan.so.1", O_RDONLY|O_CLOEXEC) = 5

Adding the explicit path in a ~/.config/.fex-emu/ThunksDB.json

{
  "DB": {
    "Vulkan": {
      "Library": "libvulkan-guest.so",
      "Overlay": [
        "/nix/store/nszik1q8ffmvsqk54kbc75dwyxwvi2nm-vulkan-loader-1.4.313.0/lib/libvulkan.so",
        "/nix/store/nszik1q8ffmvsqk54kbc75dwyxwvi2nm-vulkan-loader-1.4.313.0/lib/libvulkan.so.1",
        "/nix/store/nszik1q8ffmvsqk54kbc75dwyxwvi2nm-vulkan-loader-1.4.313.0/lib/libvulkan.so.1.4.313",
        "@PREFIX_LIB@/libvulkan.so",
        "@PREFIX_LIB@/libvulkan.so.1",
        "@HOME@/.local/share/Steam/ubuntu12_32/steam-runtime/pinned_libs_64/libvulkan.so.1"
      ]
    }
  }
}

makes it attempt the replacement…

and crash :(
❯ FEXInterpreter (which vulkaninfo)
[DEBUG] LoadLib: libvulkan -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libvulkan-host.so
[ERROR] LoadLib: Failed to initialize thunk library libvulkan. Check if the corresponding host library is installed or disable thunking of this library.
fish: Job 1, 'FEXInterpreter (which vulkaninf…' terminated by signal SIGILL (Illegal instruction)
❯ lldb (which FEXInterpreter) -- (which vulkaninfo)
(lldb) r
Process 2022991 launched: '/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/bin/FEXInterpreter' (aarch64)
[DEBUG] LoadLib: libvulkan -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libvulkan-host.so
[ERROR] LoadLib: Failed to initialize thunk library libvulkan. Check if the corresponding host library is installed or disable thunking of this library.
Process 2022991 stopped
* thread #1, name = 'FEXInterpreter', stop reason = signal SIGILL: illegal opcode
    frame #0: 0x0000aaaaaae1c0ec FEXInterpreter`___lldb_unnamed_symbol6557
FEXInterpreter`___lldb_unnamed_symbol6557:
->  0xaaaaaae1c0ec <+0>: hlt    #0x1

FEXInterpreter`LogMan::Throw::InstallHandler:
    0xaaaaaae1c0f0 <+0>: ret    

FEXInterpreter`LogMan::Throw::UnInstallHandler:
    0xaaaaaae1c0f4 <+0>: ret    

FEXInterpreter`LogMan::Msg::InstallHandler:
    0xaaaaaae1c0f8 <+0>: adrp   x8, 1311
(lldb) bt
* thread #1, name = 'FEXInterpreter', stop reason = signal SIGILL: illegal opcode
  * frame #0: 0x0000aaaaaae1c0ec FEXInterpreter`___lldb_unnamed_symbol6557
    frame #1: 0x0000aaaaaacc0e24 FEXInterpreter`FEX::HLE::ThunkHandler_impl::LoadLib(std::basic_string_view<char, std::char_traits<char>>) + 3368

(same with libwayland-client first if I also add that)

…wait, does the thunk even have a way to find the Nix aarch64 libs?

❯ ldd /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libwayland-client-host.so
	linux-vdso.so.1 (0x0000fffff7ffe000)
	libdl.so.2 => /nix/store/7kpxf47mzykkdn39lcnhj9z9ngpihamf-glibc-2.40-66/lib/libdl.so.2 (0x0000fffff7f40000)
	libstdc++.so.6 => /nix/store/gd2iijcd4yaglkmkz3csbbvg81nd3k7x-gcc-14.2.1.20250322-lib/lib/libstdc++.so.6 (0x0000fffff7cd0000)
	libm.so.6 => /nix/store/7kpxf47mzykkdn39lcnhj9z9ngpihamf-glibc-2.40-66/lib/libm.so.6 (0x0000fffff7c20000)
	libgcc_s.so.1 => /nix/store/gd2iijcd4yaglkmkz3csbbvg81nd3k7x-gcc-14.2.1.20250322-lib/lib/libgcc_s.so.1 (0x0000fffff7be0000)
	libc.so.6 => /nix/store/7kpxf47mzykkdn39lcnhj9z9ngpihamf-glibc-2.40-66/lib/libc.so.6 (0x0000fffff7a00000)
	/nix/store/7kpxf47mzykkdn39lcnhj9z9ngpihamf-glibc-2.40-66/lib/ld-linux-aarch64.so.1 (0x0000fffff7fb0000)
❯ patchelf --print-rpath /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libwayland-client-host.so
/nix/store/7kpxf47mzykkdn39lcnhj9z9ngpihamf-glibc-2.40-66/lib:/nix/store/gd2iijcd4yaglkmkz3csbbvg81nd3k7x-gcc-14.2.1.20250322-lib/lib
❯ strings /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libwayland-client-host.so | rg nix
/nix/store/7kpxf47mzykkdn39lcnhj9z9ngpihamf-glibc-2.40-66/lib:/nix/store/gd2iijcd4yaglkmkz3csbbvg81nd3k7x-gcc-14.2.1.20250322-lib/lib

uuuhh.. maybe on NixOS somehow? (0.o) oh, this is the dlopen issue discussed above. My first thought was to run some cursed commands to mess directly with the nix store to add the rpaths :D
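The `ldd` and `patchelf --print-rpath` output above shows only glibc and the GCC runtime on the host thunk's RPATH, so a `dlopen()` by bare soname has no store directory to find libwayland-client in. A rough Python model of that lookup follows. It is a sketch under stated assumptions: the real dynamic loader also consults `ld.so.cache` and default system directories and has its own precedence rules, and `find_library` is a made-up helper name.

```python
import os


def find_library(soname, *search_paths):
    """Return the first existing '<dir>/<soname>' or None.

    Each argument is a colon-separated directory list, e.g. the string
    printed by `patchelf --print-rpath` or the value of LD_LIBRARY_PATH.
    Simplification (assumption): lists are searched in the order given;
    the real loader's precedence and cache lookups are not modelled.
    """
    for path_list in search_paths:
        for directory in path_list.split(":"):
            if not directory:
                continue  # skip empty entries such as a trailing ':'
            candidate = os.path.join(directory, soname)
            if os.path.exists(candidate):
                return candidate
    return None
```

Calling `find_library("libwayland-client.so.0", rpath)` with the RPATH string printed above would return `None` in this model unless one of those two directories happens to contain the library. Appending the aarch64 wayland `lib` directory to the list is what lets the lookup succeed.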

❯ doas patchelf --add-rpath /nix/store/f7gkj7w3yryys36dldvixff26sy44y69-wayland-1.23.1/lib/ /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libwayland-client-host.so
❯  doas patchelf --add-rpath /nix/store/pxkv9p2rf4phqckl6iaq4ni6myi4b2w7-vulkan-loader-1.4.313.0/lib /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libvulkan-host.so
❯ lldb (which FEXInterpreter) -- (which vulkaninfo)
(lldb) r
Process 2025631 launched: '/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/bin/FEXInterpreter' (aarch64)
[DEBUG] LoadLib: libwayland-client -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libwayland-client-host.so
[DEBUG] Loaded 41 syms
[DEBUG] LoadLib: libvulkan -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libvulkan-host.so
[DEBUG] Loaded 1357 syms
[DEBUG] Thunks: Adding host trampoline for guest function 0x7ffff7d85660 via unpacker 0x7ffff7697ab0
[…]
[DEBUG] Thunks: Adding guest trampoline from address 0x7ffff7505200 to guest function 0x7ffff7698090
Process 2025631 stopped
* thread #1, name = 'FEXInterpreter', stop reason = signal SIGSEGV: address not mapped to object (fault address=0x0)
    frame #0: 0x0000000000000000
error: memory read failed for 0x0
(lldb) bt
* thread #1, name = 'FEXInterpreter', stop reason = signal SIGSEGV: address not mapped to object (fault address=0x0)
  * frame #0: 0x0000000000000000
    frame #1: 0x00007ffff75e58cc libvulkan-host.so`X11Manager::GuestToHostDisplay(_XDisplay*) + 668
    frame #2: 0x00007ffff75edc7c libvulkan-host.so`void GuestWrapperForHostFunction<VkResult (VkInstance_T*, VkXlibSurfaceCreateInfoKHR const*, VkAllocationCallbacks const*, unsigned long*), VkInstance_T*, VkXlibSurfaceCreateInfoKHR const*, VkAllocationCallbacks const*, unsigned long*>::Call<ParameterAnnotations{}, ParameterAnnotations{false, true}, ParameterAnnotations{}, ParameterAnnotations{false, true}, ParameterAnnotations{}>(void*) + 112
    frame #3: 0x0000aaab6bee33a8

hm. X11 is busted (not good for Steam…) but let's see:

❯ set -e DISPLAY
❯ FEXInterpreter (which vulkaninfo)
[DEBUG] LoadLib: libwayland-client -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libwayland-client-host.so
[DEBUG] Loaded 41 syms
[DEBUG] LoadLib: libvulkan -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libvulkan-host.so
[DEBUG] Loaded 1357 syms
[…]
GPU0:
VkPhysicalDeviceProperties:
---------------------------
	apiVersion        = 1.4.311 (4210999)
	driverVersion     = 25.1.3 (104861699)
	vendorID          = 0x5143
	deviceID          = 0x43050c01
	deviceType        = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
	deviceName        = Adreno X1-85
[…]
fish: Job 1, 'FEXInterpreter (which vulkaninf…' terminated by signal SIGSEGV (Address boundary error)

Success! \o/ (Still crashes after listing all the info)

vkcube next?
❯ FEXInterpreter (which vkcube) --wsi wayland
vkEnumerateInstanceExtensionProperties failed to find the VK_KHR_surface instance extension.

This indicates that no compatible Vulkan installable client driver (ICD) is present or that the system is not configured to present to the screen.

Oops, let's add /nix/store/…/lib/libvulkan.so (without .1) to the ThunksDB..

❯ FEXInterpreter (which vkcube) --wsi wayland
Unable to find the Vulkan runtime on the system.
❯ strace FEXInterpreter (which vkcube) --wsi wayland
[…]
openat2(AT_FDCWD, "/nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/share/fex-emu/GuestThunks//libvulkan-guest.so", {flags=O_RDONLY|O_CLOEXEC, resolve=0}, 24) = 6
readlinkat(AT_FDCWD, "/proc/self/fd/6", "/nix/store/8wijrw312xlplqgysfj05"..., 4096) = 97
read(6, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
fstat(6, {st_mode=S_IFREG|0555, st_size=501280, ...}) = 0
mmap(NULL, 294280, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 6, 0) = 0x7ffff7b7f000
fstat(6, {st_mode=S_IFREG|0555, st_size=501280, ...}) = 0
readlinkat(AT_FDCWD, "/proc/self/fd/6", "/nix/store/8wijrw312xlplqgysfj05"..., 4096) = 97
mmap(0x7ffff7ba5000, 131072, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 6, 0x25000) = 0x7ffff7ba5000
fstat(6, {st_mode=S_IFREG|0555, st_size=501280, ...}) = 0
readlinkat(AT_FDCWD, "/proc/self/fd/6", "/nix/store/8wijrw312xlplqgysfj05"..., 4096) = 97
mmap(0x7ffff7bc5000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 6, 0x44000) = 0x7ffff7bc5000
fstat(6, {st_mode=S_IFREG|0555, st_size=501280, ...}) = 0
readlinkat(AT_FDCWD, "/proc/self/fd/6", "/nix/store/8wijrw312xlplqgysfj05"..., 4096) = 97
mmap(0x7ffff7bc6000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 6, 0x44000) = 0x7ffff7bc6000
fstat(6, {st_mode=S_IFREG|0555, st_size=501280, ...}) = 0
readlinkat(AT_FDCWD, "/proc/self/fd/6", "/nix/store/8wijrw312xlplqgysfj05"..., 4096) = 97
close(6)                                = 0
munmap(0x7ffff7b7f000, 294280)          = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x4), ...}) = 0
write(1, "Unable to find the Vulkan runtim"..., 49Unable to find the Vulkan runtime on the system.
) = 49

wat??? Hm, let's try something else-

❯ FEXInterpreter (which vkquake)
/nix/store/xirk3k08dm9ikwg20y2in7nwx27664x1-vkquake-1.32.2/bin/vkquake: symbol lookup error: /nix/store/nszik1q8ffmvsqk54kbc75dwyxwvi2nm-vulkan-loader-1.4.313.0/lib/libvulkan.so.1: undefined symbol: __gxx_personality_v0

Lovely.

❯ FEXInterpreter (which gzdoom)
/nix/store/gknkk8wdjdnc9h39qn6jmpng4caqw3fm-gzdoom-4.14.2/share/games/doom/gzdoom: symbol lookup error: /nix/store/42dlyn8qbxdp55gxd35xymik0nhy0hdw-gtk+3-3.24.49/lib/libgdk-3.so.0: undefined symbol: wl_data_device_interface

Missing the newest changes I see. And without host libwayland?

❯ FEXInterpreter (which gzdoom)
GZDoom g4.14.2 -  - SDL version
Compiled on Jan  1 1980

[INFO] CLONE_CLEAR_SIGHAND passed to clone3. Returning EINVAL.
[INFO] clone: Unsupported flags w/o CLONE_THREAD (Shared Resources), 4100
OS: postmarketOS edge, Linux 6.11.0 on x86_64
GZDoom version g4.14.2
[DEBUG] LoadLib: libvulkan -> /nix/store/8wijrw312xlplqgysfj056c56bcczyf8-fex-2506/lib/fex-emu/HostThunks//libvulkan-host.so
[…]
[DEBUG] Thunks: Adding guest trampoline from address 0x7fffe1060980 to guest function 0x7ffff0137290


*** Fatal Error ***
Illegal opcode (signal 4)
Address: 0x7ffff54b9690

Generating gzdoom-crash.log and killing process 2033240, please wait... [INFO] CLONE_CLEAR_SIGHAND passed to clone3. Returning EINVAL.
[INFO] clone: Unsupported flags w/o CLONE_THREAD (Shared Resources), 4100
[INFO] CLONE_CLEAR_SIGHAND passed to clone3. Returning EINVAL.
[INFO] clone: Unsupported flags w/o CLONE_THREAD (Shared Resources), 4100
sh: gdb: not found

…actually, vkmark with X11 backend works!

[Image: Screenshot From 2025-07-06 05-38-05-min]


So, it seems like FHS wrapping might not be necessary: if you make sure the HostThunks have the matching host libraries in their RPATH (e.g. libvulkan-host.so → /nix/store/<the aarch64 one here>-vulkan-loader-1.4.…/lib), these host libraries do get loaded! However, ThunksDB.json must also point to the full x86 paths in the nix store, or the @PREFIX_LIB@ runtime expansion must be changed to accept nix paths.
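A simplified model of the ThunksDB behaviour observed above, sketched in Python. It is inferred from the strace output and the ThunksDB.json experiment in this comment, not taken from FEX's source: `PREFIX_LIB_DIRS` is a hypothetical stand-in for whatever `@PREFIX_LIB@` expands to, and `expand_overlay` / `is_forwarded` are invented names.

```python
# Hypothetical stand-in for the directories @PREFIX_LIB@ expands to.
# The real list lives inside FEX and is not taken from this thread.
PREFIX_LIB_DIRS = ["/usr/lib", "/usr/lib64"]


def expand_overlay(entry, home):
    """Expand the @PREFIX_LIB@ and @HOME@ placeholders of one Overlay entry."""
    if entry.startswith("@PREFIX_LIB@"):
        rest = entry[len("@PREFIX_LIB@"):]
        return [directory + rest for directory in PREFIX_LIB_DIRS]
    return [entry.replace("@HOME@", home)]


def is_forwarded(opened_path, overlay, home="/home/user"):
    """Return True if the path the guest opens matches an expanded Overlay entry."""
    expanded = [p for entry in overlay for p in expand_overlay(entry, home)]
    return opened_path in expanded
```

In this model, an Overlay containing only `@PREFIX_LIB@/libvulkan.so.1` never matches `/nix/store/nszik1q8ffmvsqk54kbc75dwyxwvi2nm-vulkan-loader-1.4.313.0/lib/libvulkan.so.1`, which would be consistent with the x86-64 loader being opened without replacement until the explicit store path was added.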

--replace-fail "FEX_CONFIG_OPT(ThunkGuestLibs32, THUNKGUESTLIBS32);" "fextl::string ThunkGuestLibs32() { return \"$out/share/fex-emu/GuestThunks_32/\"; }"
substituteInPlace Source/Tools/LinuxEmulation/Thunks.cpp \
--replace-fail "FEX_CONFIG_OPT(ThunkHostLibsPath, THUNKHOSTLIBS);" "fextl::string ThunkHostLibsPath() { return \"$out/lib/fex-emu/HostThunks/\"; }" \
--replace-fail "FEX_CONFIG_OPT(ThunkHostLibsPath32, THUNKHOSTLIBS32);" "fextl::string ThunkHostLibsPath32() { return \"$out/lib/fex-emu/HostThunks/\"; }"


uh oh! should be ../HostThunks_32/ here!!

[ERROR] Tried to call guest function with arguments packed to a 64-bit address
Illegal instruction

Contributor Author

@neobrain neobrain Jul 8, 2025


Oof yes, thanks for catching this!

EDIT: Fixed!


heh, looks like the just-released 2507 also changed from having a separate option for 32-bit to appending _32 to the ThunkHostLibsPath, which would've fixed this anyway!

@neobrain neobrain force-pushed the fex_library_forwarding branch from fb7f952 to 2099d24 July 8, 2025 11:01
@neobrain
Contributor Author

neobrain commented Jul 8, 2025

Currently testing on non-NixOS (postmarketOS), with a flake that generates a dev shell, I want it to just use Mesa from nixpkgs:

Awesome, thanks for testing!

So, it seems like FHS wrapping might not be necessary: if you make sure the HostThunks have the matching host libraries in their RPATH (e.g. libvulkan-host.so → /nix/store/<the aarch64 one here>-vulkan-loader-1.4.…/lib), these host libraries do get loaded! However, ThunksDB.json must also point to the full x86 paths in the nix store, or the @PREFIX_LIB@ runtime expansion must be changed to accept nix paths.

Yeah, we're basically running into the same problems that nixGL was developed to solve. Sadly we can't use the store paths without adding false dependencies in FEX to each library that it supports forwarding for. I like andre4ik3's idea of using a module on NixOS to transparently add an FHS env, but non-NixOS users will probably be stuck with nixGL and manually juggling LD_LIBRARY_PATHs :(

@valpackett

stuck with nixGL and manually juggling LD_LIBRARY_PATHs

well, nixGL does a bunch of stuff we don't care about (nvidia, vdpau…) and like 2-5 lines that actually matter.

To make the host GLVND libGL find Mesa, all we need is either an LD_LIBRARY_PATH to the host Mesa, or a manual /run/opengl-driver symlink to the same. The host libGL thunk actually already has the host GLVND on its RPATH (I've used patchelf for other libraries though they don't all always work well). I also modified the ThunksDB.json to intercept actual Nix store paths to emulated libGL:

ad-hoc hacks turned into nix code that modifies the fex package from here
  let
    thunksConf = armpkgs.writeText "nixthunks.json" (builtins.toJSON { DB = {
      GL = {
        Library = "libGL-guest.so";
        Overlay = builtins.map (m: "${m.pre}/${m.name}") (armpkgs.lib.cartesianProduct {
          pre = ["${x64pkgs.libglvnd}/lib" "${i686pkgs.libglvnd}/lib" "@PREFIX_LIB@"];
          name = ["libGL.so" "libGL.so.1" "libGL.so.1.2.0" "libGL.so.1.7.0"];
        });
      };
      Vulkan = {
        Library = "libvulkan-guest.so";
        Overlay = builtins.map (m: "${m.pre}/${m.name}") (armpkgs.lib.cartesianProduct {
          pre = [
            "${x64pkgs.vulkan-loader}/lib" "${i686pkgs.vulkan-loader}/lib"
            # does not actually catch it by that path:
            "/usr/lib/pressure-vessel/overrides/lib/x86_64-linux-gnu"
            "/usr/lib/pressure-vessel/overrides/lib/i386-linux-gnu"
            "@HOME@/.local/share/Steam/ubuntu12_32/steam-runtime/usr/lib/x86_64-linux-gnu"
            "@HOME@/.local/share/Steam/ubuntu12_32/steam-runtime/usr/lib/i386-linux-gnu"
            "@PREFIX_LIB@"
          ];
          name = ["libvulkan.so" "libvulkan.so.1" "libvulkan.so.1.3.239" "libvulkan.so.${x64pkgs.lib.getVersion x64pkgs.vulkan-loader}"];
        });
      };
      drm = {
        Library = "libdrm-guest.so";
        Overlay = builtins.map (m: "${m.pre}/${m.name}") (armpkgs.lib.cartesianProduct {
          pre = ["${x64pkgs.libdrm}/lib" "@PREFIX_LIB@"];
          name = ["libdrm.so" "libdrm.so.2" "libdrm.so.2.4.0" "libdrm.so.${x64pkgs.lib.getVersion x64pkgs.libdrm}"];
        });
      };
      asound = {
        Library = "libasound-guest.so";
        Overlay = builtins.map (m: "${m.pre}/${m.name}") (armpkgs.lib.cartesianProduct {
          pre = ["${x64pkgs.alsa-lib}/lib" "@PREFIX_LIB@"];
          name = ["libasound.so" "libasound.so.2" "libasound.so.2.0.0"];
        });
      };
      WaylandClient = {
        Library = "libwayland-client-guest.so";
        Overlay = builtins.map (m: "${m.pre}/${m.name}") (armpkgs.lib.cartesianProduct {
          pre = ["${x64pkgs.wayland}/lib" "${i686pkgs.wayland}/lib" "@PREFIX_LIB@"];
          name = ["libwayland-client.so" "libwayland-client.so.0" "libwayland-client.so.0.20.0"
            "libwayland-client.so.0.${x64pkgs.lib.removePrefix "1." (x64pkgs.lib.getVersion x64pkgs.wayland)}"];
        });
      };
    }; });
    # at least with turnip, steamwebhelper using vulkan results in a lot of glitches
    # XXX: Vulkan=0 does not apply from there?? have to copy the file to ~/.config/.fex-emu/AppConfig/ ????
    steamwebConf = armpkgs.writeText "steamweb.json" (builtins.toJSON { ThunksDB = { GL = 0; Vulkan = 0; }; });
    patchedFex = armpkgs.fex.overrideAttrs (old: {
      postFixup = ''
        # GL already includes the path to host libglvnd!

        patchelf --add-rpath ${armpkgs.lib.makeLibraryPath [armpkgs.vulkan-loader]} \
          $out/lib/fex-emu/HostThunks/libvulkan-host.so \
          $out/lib/fex-emu/HostThunks_32/libvulkan-host.so

        patchelf --add-rpath ${armpkgs.lib.makeLibraryPath [armpkgs.libdrm]} \
          $out/lib/fex-emu/HostThunks/libdrm-host.so

        patchelf --add-rpath ${armpkgs.lib.makeLibraryPath [armpkgs.wayland]} \
          $out/lib/fex-emu/HostThunks/libwayland-client-host.so \
          $out/lib/fex-emu/HostThunks_32/libwayland-client-host.so

        patchelf --add-rpath ${armpkgs.lib.makeLibraryPath [armpkgs.alsa-lib]} \
          $out/lib/fex-emu/HostThunks/libasound-host.so

        cp ${thunksConf} $out/share/fex-emu/ThunksDB.json
        cp ${steamwebConf} $out/share/fex-emu/AppConfig/steamwebhelper.json
      '';
    });
  in

…and I got Steam with Portal 2 (GL) working eventually! However, the one really cursed issue I've encountered is that the steamapps' shebangs are not respected, everything is forcibly launched with a Nix bash, so I've had to do awful things like adding

echo "$0, $SHELL, $BASH"
[ -z "$BASH_VERSION" ] && exec /bin/bash "$0" "$@"

to .local/share/Steam/steamapps/common/Portal 2/portal2.sh (the echo just to see what's going on: that's where I saw $SHELL being a /nix/store path, when this is supposed to be running in a container).

@neobrain
Contributor Author

Not entirely familiar with the nixpkgs process - what's left to be done for this to be merged? Besides the now-fixed 32-bit issue, this feature seems to be working fine for everyone who tested (thanks again!). The runtime setup needs followup work but that's fine since libraries need to be manually enabled after installation.


@valpackett Pretty cool you got it working in the end! I'm assuming the main intent was to document your research and progress here, but if I've overlooked proposed changes to the PR let me know.

@pbsds pbsds requested a review from RossComputerGuy July 14, 2025 18:21
Member

@RossComputerGuy RossComputerGuy left a comment


This will need quite a few improvements; I'll give it a proper review later today.

@neobrain
Contributor Author

neobrain commented Jul 20, 2025

This will need quite a few improvements; I'll give it a proper review later today.

@RossComputerGuy Sounds good, thanks - happy to adapt as needed whenever you find time for the review.

@nixpkgs-ci nixpkgs-ci bot added the "2.status: merge conflict" label (This PR has merge conflicts with the target branch) Jul 26, 2025
@neobrain neobrain force-pushed the fex_library_forwarding branch from 2099d24 to 469fab0 July 29, 2025 08:24
@neobrain
Contributor Author

Rebased on the recently merged FEX 2507.1 update, which includes my upstream changes that allow cutting down the substituteInPlaces a little.

@nixpkgs-ci nixpkgs-ci bot removed the "2.status: merge conflict" label (This PR has merge conflicts with the target branch) Jul 29, 2025
@qbisi
Contributor

qbisi commented Aug 9, 2025

Sorry to ask in this post, but do we need a NixOS rootfs to run OpenGL/Vulkan-related applications?
I tried this recipe on my Rockchip RK3588 board, but glxinfo (x86_64 ELF) does not work with library forwarding.

@neobrain
Contributor Author

Sorry to ask in this post, but do we need a NixOS rootfs to run OpenGL/Vulkan-related applications?
I tried this recipe on my Rockchip RK3588 board, but glxinfo (x86_64 ELF) does not work with library forwarding.

FEX isn't designed to work without a RootFS, so this is expected. Library forwarding only removes the need for having x86 GL/Vulkan drivers (and other forwarded libraries) in the rootfs.

@qbisi
Contributor

qbisi commented Aug 10, 2025

FEX isn't designed to work without a RootFS, so this is expected. Library forwarding only removes the need for having x86 GL/Vulkan drivers (and other forwarded libraries) in the rootfs.

Is there a prebuilt rootfs for NixOS? A NixOS rootfs stores almost everything in /nix/store; would it suffice to provide a rootfs containing /run/opengl-driver?

@valpackett

FEX's RootFS feature is only really needed for resolving the same paths differently based on emulated vs. native execution. FEX absolutely works fine without a RootFS in a properly mixed/multilib system – I managed to get it to run Steam games both using Nix-on-non-NixOS (see research above) and Flatpak with hacks.

@qbisi what you're probably missing for library forwarding is making ThunksDB.json point to the Nix store paths of the x86_64 libGL that your glxinfo loads, see #413255 (comment) above.

(Specifically for forwarding, you're not required to have a /run/opengl-driver pointing to x86 Mesa in the x86 emulated world. The /run/opengl-driver path comes from nixpkgs' build of GLVND, which is what applications load when they load libGL. GLVND then dynamically loads the actual driver to delegate to; what we intercept with FEX forwarding is that first libGL directly loaded by the application, i.e. GLVND itself.)

…ThunksDB is actually the most "painful" part of using Nix for this. Maybe the package could provide a ThunksDB generator function that takes in x86/64 pkgs and makes a json file for them that the outer user nix code would install? To improve the whole situation with

we can't use the store paths without adding false dependencies in FEX to each library that it supports forwarding for
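One possible shape for such a generator, sketched in Python rather than Nix so that it can be run standalone. It mirrors the cartesian-product approach of the Nix snippet earlier in the thread; the function name `make_thunks_db` and the input dictionary layout are hypothetical, not an existing FEX or nixpkgs interface.

```python
import itertools
import json


def make_thunks_db(libraries):
    """Build ThunksDB.json text from per-library prefixes and sonames.

    `libraries` maps a DB key (e.g. "Vulkan") to a dict with:
      "guest":    file name of the guest thunk library
      "prefixes": directories the emulated application may load from
      "sonames":  file names to intercept in each of those directories
    """
    db = {}
    for name, spec in libraries.items():
        db[name] = {
            "Library": spec["guest"],
            # Every prefix combined with every soname, prefix-major order.
            "Overlay": [
                f"{prefix}/{soname}"
                for prefix, soname in itertools.product(
                    spec["prefixes"], spec["sonames"]
                )
            ],
        }
    return json.dumps({"DB": db}, indent=2)
```

The caller would pass the x86_64 (and, where needed, i686) store `lib` directories as prefixes alongside `@PREFIX_LIB@`, then install the resulting text as ThunksDB.json. Keeping those store paths in user-side configuration rather than in the FEX package is what would avoid the false dependencies mentioned in the quote.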

@RossComputerGuy
Member

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 413255
Commit: 469fab0d260d82050b679f326fe75e1e0b7588be


aarch64-linux

❌ 2 packages failed to build:
  • fex
  • muvm

Error logs: `aarch64-linux`
fex
[6649/6718] Linking CXX executable EmitterTests/Emitter_Loadstore_Tests
FAILED: EmitterTests/Emitter_Loadstore_Tests FEXCore/unittests/Emitter/Emitter_Loadstore_Tests-b12d07c_tests.cmake /build/source/build/FEXCore/unittests/Emitter/Emitter_Loadstore_Tests-b12d07c_tests.cmake 
: && /nix/store/wq44hhz0hv7700q0b8yjc8zr6cj3sdxq-clang-wrapper-19.1.7/bin/clang++ -O3 -DNDEBUG -fomit-frame-pointer -flto=thin -fuse-ld=lld -fPIE -pie -Xlinker --dependency-file=FEXCore/unittests/Emitter/CMakeFiles/Emitter_Loadstore_Tests.dir/link.d External/vixl/src/CMakeFiles/vixl.dir/code-buffer-vixl.cc.o External/vixl/src/CMakeFiles/vixl.dir/compiler-intrinsics-vixl.cc.o External/vixl/src/CMakeFiles/vixl.dir/cpu-features.cc.o External/vixl/src/CMakeFiles/vixl.dir/utils-vixl.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/assembler-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/assembler-sve-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/debugger-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/decoder-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/disasm-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/cpu-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/instructions-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/macro-assembler-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/macro-assembler-sve-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/operands-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/pointer-auth-aarch64.cc.o External/vixl/src/CMakeFiles/vixl.dir/aarch64/registers-aarch64.cc.o FEXCore/unittests/Emitter/CMakeFiles/Emitter_Loadstore_Tests.dir/Loadstore_Tests.cpp.o -o EmitterTests/Emitter_Loadstore_Tests  External/Catch2/src/libCatch2Main.a  FEXCore/Source/libFEXCore_Base.a  FEXCore/Source/libJemallocLibs.a  External/Catch2/src/libCatch2.a  /nix/store/07ymgn51nwfvsngh9zxnd2nhwj13ylzn-fmt-10.2.1/lib/libfmt.so.10.2.1  /nix/store/20mkvm8mcfyg1n8b5f059h7qgvg2zs1w-xxHash-0.8.3/lib/libxxhash.so  External/cephes/libcephes_128bit.a  External/SoftFloat-3e/libsoftfloat_3e.a  -ldl  External/jemalloc/libFEX_jemalloc.a  External/jemalloc_glibc/libFEX_jemalloc_glibc.a  -lpthread && cd /build/source/build/FEXCore/unittests/Emitter && 
/nix/store/2pbjx4gij8dvgs4cmh01dny1727jy5mi-cmake-3.31.7/bin/cmake -D TEST_TARGET=Emitter_Loadstore_Tests -D TEST_EXECUTABLE=/build/source/build/EmitterTests/Emitter_Loadstore_Tests -D TEST_EXECUTOR= -D TEST_WORKING_DIR=/build/source/build/FEXCore/unittests/Emitter -D TEST_SPEC= -D TEST_EXTRA_ARGS= -D TEST_PROPERTIES= -D TEST_PREFIX= -D TEST_SUFFIX=.Loadstore_Tests.Emitter -D TEST_LIST=Emitter_Loadstore_Tests_TESTS -D TEST_REPORTER= -D TEST_OUTPUT_DIR= -D TEST_OUTPUT_PREFIX= -D TEST_OUTPUT_SUFFIX= -D TEST_DL_PATHS= -D CTEST_FILE=/build/source/build/FEXCore/unittests/Emitter/Emitter_Loadstore_Tests-b12d07c_tests.cmake -P /build/source/External/Catch2/extras/CatchAddTests.cmake
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
terminate called without an active exception
CMake Error at /build/source/External/Catch2/extras/CatchAddTests.cmake:70 (message):
  Error running test executable
  '/build/source/build/EmitterTests/Emitter_Loadstore_Tests':
Result: Subprocess aborted
Output: 

Call Stack (most recent call first):
/build/source/External/Catch2/extras/CatchAddTests.cmake:175 (catch_discover_tests_impl)

ninja: build stopped: subcommand failed.

@RossComputerGuy
Member

Fex uses jemalloc which doesn't like my desktop's 64k page size...

@neobrain
Contributor Author

neobrain commented Aug 15, 2025

Fex uses jemalloc which doesn't like my desktop's 64k page size...

Does nixpkgs-review override doCheck? The FEX derivation explicitly disables it after all...

The FEX derivation has a note about non-4k systems, and this is already an issue in the existing FEX derivation:

  # Unsupported on non-4K page size kernels (e.g. Apple Silicon)
  doCheck = true;

I'm not very familiar with nixpkgs-review, but perhaps you can post the thoughts you had without the tool since your system does not support running the tests?

@andre4ik3
Member

The FEX derivation explicitly disables it

enables* (It used to be explicitly disabled I think)

But yes, on non-4k pagesize you need to do fex.overrideAttrs { doCheck = false; } or run in a 4k pagesize VM.
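To check whether a given machine is affected, the kernel page size can be queried directly. A minimal Python sketch: the helper name `fex_checks_supported` is made up for illustration, and the 4096-byte requirement is taken from the derivation comment quoted above.

```python
import mmap


def fex_checks_supported(page_size=mmap.PAGESIZE):
    """Return True when the page size is 4 KiB, which the FEX tests expect."""
    return page_size == 4096


if __name__ == "__main__":
    print(f"page size: {mmap.PAGESIZE} bytes")
    if not fex_checks_supported():
        # Matches the workaround suggested in this comment.
        print("non-4K pages: use fex.overrideAttrs { doCheck = false; }")
```

On a 64k-page system like the one in the nixpkgs-review log, `fex_checks_supported()` would return `False`.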

@neobrain
Contributor Author

neobrain commented Aug 15, 2025

enables* (It used to be explicitly disabled I think)

Oops, yes, I was just remembering the comment in the derivation and so thought it was still disabled, but you're right. Thanks 😅

Labels
10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 12.approvals: 1 This PR was reviewed and approved by one person. 12.approved-by: package-maintainer This PR was reviewed and approved by a maintainer listed in any of the changed packages.