
Option for disabling mmap for safetensors loading for network storage users #2288

@0xymoro


Hi, a few weeks ago I opened an issue about a CPU bottleneck, and I have finally found the root cause. It wasn't really a CPU bottleneck: the CPU was frantically managing mmap access over a network volume, and that was the actual bottleneck.

For network storage, the call in comfy/utils.py at line 13,

sd = safetensors.torch.load_file(ckpt, device=device.type)

uses mmap, which is hugely inefficient on network volumes: about a 30-50x slowdown. Over a network volume, a single SDXL safetensors file loads in 1-2 seconds with the workaround below, but takes 40-50 seconds with the stock call above.

I hacked together this:

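        try:
            # Read the whole file sequentially, then deserialize from memory (no mmap).
            with open(ckpt, 'rb') as f:
                sd = safetensors.torch.load(f.read())
        except Exception:
            # Fall back to the stock mmap-based loader.
            sd = safetensors.torch.load_file(ckpt, device=device.type)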

This works on my SDXL safetensors files and falls back to the normal loader for certain ControlNet checkpoints.
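For anyone who wants to reproduce the comparison on their own volume, here is a rough timing sketch. The helper names `time_call` and `read_whole_file` are made up for illustration and are not part of ComfyUI or safetensors. Note that a second run on the same file may be served from the OS page cache, so first-run numbers are the ones that matter.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def read_whole_file(path):
    """Read the file sequentially into memory, as the workaround does."""
    with open(path, "rb") as f:
        return f.read()


# Possible usage, assuming safetensors is installed and `ckpt` is a checkpoint path:
#   import safetensors.torch
#   _, t_read = time_call(lambda: safetensors.torch.load(read_whole_file(ckpt)))
#   _, t_mmap = time_call(safetensors.torch.load_file, ckpt)
#   print(f"read+load: {t_read:.1f}s, load_file (mmap): {t_mmap:.1f}s")
```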

This issue has been referenced already in #1992 (comment)

I think an option to disable mmap, as in the first branch above, is necessary. Otherwise models are extremely inefficient to load on any cloud provider platform that runs on K8s with network PVCs.
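A minimal sketch of what such an option could look like, not a proposal for the exact implementation. The function name `load_safetensors`, the `disable_mmap` parameter, the `bytes_loader`/`file_loader` hooks, and the `COMFY_DISABLE_MMAP` environment variable are all hypothetical; only `safetensors.torch.load` and `safetensors.torch.load_file` are real calls.

```python
import os


def _env_flag(name):
    """Treat 1/true/yes/on (case-insensitive) as enabled."""
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def load_safetensors(path, device="cpu", disable_mmap=None,
                     bytes_loader=None, file_loader=None):
    """Load a safetensors checkpoint, optionally bypassing mmap.

    disable_mmap=None defers to the (hypothetical) COMFY_DISABLE_MMAP
    environment variable. The loader hooks exist only so the sketch can be
    exercised with stand-ins; by default the real safetensors calls are used.
    """
    if disable_mmap is None:
        disable_mmap = _env_flag("COMFY_DISABLE_MMAP")  # hypothetical name

    if bytes_loader is None or file_loader is None:
        import safetensors.torch  # imported lazily, only when needed
        bytes_loader = bytes_loader or safetensors.torch.load
        file_loader = file_loader or safetensors.torch.load_file

    if disable_mmap:
        # One sequential read of the whole file, then deserialize from memory.
        with open(path, "rb") as f:
            data = f.read()
        sd = bytes_loader(data)
        # safetensors.torch.load takes no device argument, so move tensors after.
        if device != "cpu":
            sd = {k: v.to(device) for k, v in sd.items()}
        return sd

    # Default path: the stock loader.
    return file_loader(path, device=device)
```

The trade-off is that the non-mmap path holds the entire file in host memory while deserializing, so making it opt-in keeps the current behavior for local disks.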
