Description
Checklist
- My issue is specific & actionable.
- I am not suggesting a protocol enhancement.
- I have searched on the issue tracker for my issue.
Description
This is a followup on user @endomorphosis's comment in the Filecoin community discussions about IPFS hashing being slow.
I noticed that when trying to index large ML models that the IPFS daemon hashing seems to be single threaded, and therefore somewhat slow when indexing large files. If this is funded, it is my hope that someone in your org can try to create a new spec, to parallelize the hashing of large files.
Per @lidel in an ipfs-steering conversation on 2 April 2024:
In my mind this is not about inventing new hashing specifications, this is about making the most popular implementation majority of ecosystem uses for data onboarding (Kubo) better. My translation:

"the IPFS daemon hashing [..] slow when indexing large files"
→ Kubo's commands like `ipfs add` are not as fast as they "should be", when comparing with `sha256sum` over the number of chunks

"parallelize the hashing of large files."
→ improve implementation, make core commands like `ipfs dag import|export` and `ipfs add` as fast as possible (we know they are not)

Once we have reference implementation, we can add some rules of thumb how to implement UnixFs hashing and chunking to "notes for implementers" section of wip Unix specification.
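
To make the "parallelize the hashing" idea concrete, here is a minimal sketch in Python of hashing fixed-size chunks concurrently and checking that the result matches a sequential pass. It is not how Kubo implements `ipfs add`: the function names, the worker count, and the 256 KiB chunk size are assumptions chosen for illustration, and the sketch only computes SHA-256 digests of raw chunk bytes rather than building UnixFS nodes or CIDs.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Assumed chunk size for this sketch (256 KiB); not read from any Kubo config.
CHUNK_SIZE = 256 * 1024


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def sha256_hex(chunk: bytes) -> str:
    """Return the hex SHA-256 digest of a single chunk."""
    return hashlib.sha256(chunk).hexdigest()


def hash_chunks_sequential(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Baseline: hash every chunk one after another on a single thread."""
    return [sha256_hex(c) for c in split_chunks(data, chunk_size)]


def hash_chunks_parallel(
    data: bytes, chunk_size: int = CHUNK_SIZE, workers: int = 4
) -> list[str]:
    """Hash chunks on a thread pool; pool.map keeps results in chunk order.

    Threads are enough here because hashlib releases the GIL while
    hashing large buffers, so chunks can be hashed on several cores.
    """
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sha256_hex, chunks))


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # 1 MiB of sample data
    digests = hash_chunks_parallel(payload)
    assert digests == hash_chunks_sequential(payload)
    print(f"hashed {len(digests)} chunks")
```

Because each chunk is hashed independently, the per-chunk digests are identical regardless of how many workers are used; only the wall-clock time changes. A real importer would also need to parallelize or pipeline reading, chunking, and DAG construction, which this sketch leaves out.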