Skip to content

fix!: require path in addFile call #754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 13, 2025
Merged

Conversation

achingbrain
Copy link
Member

@achingbrain achingbrain commented Mar 6, 2025

To ensure metadata is always stored, the CID returned from addFile will resolve to a directory.

The path property is now required of the import candidate passed to addFile, if it's not necessary to store a path as part of a file, use addBytes or addByteStream.

Since urlSource returns a { content, path } candidate, it can be used with addFile.

To import directly from a URL without a path, a urlByteStream function has been added for use with addByteStream.

Fixes #707

BREAKING CHANGE: addFile requires the import candidate to have a path property and the returned CID will always resolve to a directory

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

If the user passes a `FileCandidate` with a `path` property or passes
the `wrapWithDirectoryOption` the return type of `fs.addFile` will
be a `DirectoryEntry`.

This breaks existing behaviour where it will always be a `FileEntry`

Since `URLSource` returns a `{ content, path }` candidatem this can
cause a `DirectoryEntry` to be returned whereas it used to always be
a `FileEntry`.  An `ignorePath` option has been added to restore the
previous behavior.

Fixes #707

BREAKING CHANGE: addFile can sometimes return a directory, URLSource returns directory when a path is present
@achingbrain achingbrain requested a review from a team as a code owner March 6, 2025 17:20
@achingbrain
Copy link
Member Author

achingbrain commented Mar 6, 2025

cc @2color - the other option here is to return a DirectoryEntry only if wrapWithDirectory is passed - this means it will ignore the path field if it is present. Feels like that might be user unfriendly though.

E.g.

addFile({ content }) -> FileEntry
addFile({ content, path }) -> FileEntry
addFile({ content }, { wrapWithDirectory }) -> DirectoryEntry
addFile({ content, path }, { wrapWithDirectory }) -> DirectoryEntry

As implemented it is:

addFile({ content }) -> FileEntry
addFile({ content, path }) -> DirectoryEntry
addFile({ content }, { wrapWithDirectory }) -> DirectoryEntry
addFile({ content, path }, { wrapWithDirectory }) -> DirectoryEntry

@2color
Copy link
Member

2color commented Mar 7, 2025

I'm pretty confused about the role and interplay between path and wrapWithDirectory.

IIUC, path implies wrapWithDirectory is true, because you can only retain a path by wrapping in a directory.

However, if you only pass wrapWithDirectory, how is the path for file chosen?

I added the following test (locally), but I don't understand why only passing wrapWithDirectory returns a directory, but enumerating it (with ls) returns no results?

const cid = await fs.addFile({
      content: Uint8Array.from([0, 1, 2, 3, 4])
    }, {
      wrapWithDirectory: true
    })

    // spellchecker:disable-next-line
    expect(cid.toString()).to.equal('bafybeiczsscdsbs7ffqz55asqdf3smv6klcw3gofszvwlyarci47bgf354')

    await expect(fs.stat(cid)).to.eventually.have.property('type', 'directory')

    const contents = await all(fs.ls(cid)) // ⛔️ why is contents of length 0?

and

As implemented it is:

addFile({ content }) -> FileEntry
addFile({ content, path }) -> DirectoryEntry
addFile({ content }, { wrapWithDirectory }) -> DirectoryEntry
addFile({ content, directory }, { wrapWithDirectory }) -> DirectoryEntry

In the last example, do you mean path rather than directory?

Either way, that seems to make sense. If you pass either path or directory, you should expect to get back a directory.

@achingbrain
Copy link
Member Author

achingbrain commented Mar 10, 2025

The directory containing a raw entry without a path being empty seems to be a bug in the importer.

wrapWithDirectory has never made much sense, perhaps we just need to exclude it from the options?

That said I’d really like the output type to be consistent.

Maybe we also make path be required so you always get a DirectoryEntry? wrapWithDirectory is redundant then.

If the user just wants to add file contents without metadata (file name, mode/mtime/etc) they can use fs.addBytes/fs.addBytesStream.

@2color
Copy link
Member

2color commented Mar 10, 2025

wrapWithDirectory has never made much sense, perhaps we just need to exclude it from the options?

I can't really comment on it, since I am not sure I fully understood how it's meant to be used. From my understanding, path is a more precise version of what wrapWithDirectory would do.

Maybe we also make path be required so you always get a DirectoryEntry? wrapWithDirectory is redundant then.

Seems sensible.

@achingbrain
Copy link
Member Author

achingbrain commented Mar 10, 2025

I am not sure I fully understood how it's meant to be used.

The intended use-case is reflected in this test.

That said, I'm not sure why we couldn't require use of something like:

await addAll([{
  path: '/foo.txt',
  content: Uint8Array.from([0, 1, 2, 3])
}, {
  path: '/bar.txt',
  content: Uint8Array.from([0, 1, 2, 3])
}])

instead of:

await addAll([{
  path: 'foo.txt',
  content: Uint8Array.from([0, 1, 2, 3])
}, {
  path: 'bar.txt',
  content: Uint8Array.from([0, 1, 2, 3])
}], {
  wrapWithDirectory: true
})

Either way, I think wrapWithDirectory doesn't make a whole lot of sense for adding individual entries with addFile.

@achingbrain
Copy link
Member Author

achingbrain commented Mar 10, 2025

We should pull the mfs/interop changes out of this PR and merge them first as they are backwards compatible and we can avoid a major in the mfs module.

Done in #758

@achingbrain achingbrain changed the title fix: support wrapWithDirectory in addFile fix!: require path in addFile call Mar 10, 2025
achingbrain added a commit that referenced this pull request Mar 11, 2025
Switch away from `addFile` as it will require paths after #754
achingbrain added a commit that referenced this pull request Mar 11, 2025
Switch away from `addFile` as it will require paths after #754
@SgtPooki
Copy link
Member

After reading the conversation here, I feel awkward about addFile returning a directory, and addByteStream adding a single file.

could we be even more explicit and just create a addDirectory method, and make addFile call addBytes/etc under the hood?

@2color
Copy link
Member

2color commented Mar 11, 2025

could we be even more explicit and just create a addDirectory method, and make addFile call addBytes/etc under the hood?

That's a fair point. It's worth pointing out that w3storage/storacha learned from user feedback and wrap by default to ensure filename is retained.

Either way, we already have addDirectory.

Copy link
Member

@2color 2color left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@2color
Copy link
Member

2color commented Mar 13, 2025

Any blockers @achingbrain?

@2color
Copy link
Member

2color commented Mar 13, 2025

As implemented it is:

addFile({ content }) -> FileEntry
addFile({ content, path }) -> DirectoryEntry
addFile({ content }, { wrapWithDirectory }) -> DirectoryEntry
addFile({ content, path }, { wrapWithDirectory }) -> DirectoryEntry

Actually, I can't get addFile to return DirectoryEntry with the latest release using any of these combination.

@2color
Copy link
Member

2color commented Mar 13, 2025

Here's something that's confusing about the current addFile API: path can refer to the path on the filesystem of the file you want to import.

This will return a directory with the filename retained.

const helia = await createHeliaHTTP()
const ufs = unixfs(helia)

// add a file and wrap in a directory
const readmeCid = await fs.addFile({
  path: './huge-file.html'
})

However, any other combination using a bytestream (even including wrapWithDirectory: true) will never retain a directory:

const helia = await createHeliaHTTP()

// create a filesystem on top of Helia, in this case it's UnixFS
const ufs = unixfs(helia)

const readmeCid = await ufs.addFile({
  content: fs.createReadStream('./huge-file.html'),
  path: './huge-file.html'
}, {
  wrapWithDirectory: true
}) // RETURNS file
const readmeCid = await ufs.addFile({
  content: fs.createReadStream('./huge-file.html'),
  path: './huge-file.html'
}) // RETURNS file

@bumblefudge
Copy link

Here's something that's confusing about the current addFile API: path can refer to the path on the filesystem of the file you want to import.

This will return a directory with the filename retained.

const helia = await createHeliaHTTP()
const ufs = unixfs(helia)

// add a file and wrap in a directory
const readmeCid = await fs.addFile({
  path: './huge-file.html'
})

However, any other combination using a bytestream (even including wrapWithDirectory: true) will never retain a directory:

const helia = await createHeliaHTTP()

// create a filesystem on top of Helia, in this case it's UnixFS
const ufs = unixfs(helia)

const readmeCid = await ufs.addFile({
  content: fs.createReadStream('./huge-file.html'),
  path: './huge-file.html'
}, {
  wrapWithDirectory: true
}) // RETURNS file
const readmeCid = await ufs.addFile({
  content: fs.createReadStream('./huge-file.html'),
  path: './huge-file.html'
}) // RETURNS file

reproduced all of the above today, with a fresh build (by accident)

@achingbrain achingbrain merged commit c0bf36e into main Mar 13, 2025
18 checks passed
@achingbrain achingbrain deleted the fix/support-wrap-with-directory branch March 13, 2025 17:57
@achingbrain achingbrain mentioned this pull request Mar 13, 2025
@achingbrain
Copy link
Member Author

Thanks for the input all, you can try out the changes here ahead of final release (see #766) with npm i @helia/unixfs@next

@bumblefudge
Copy link

bumblefudge commented Mar 14, 2025

using @next, all of the above is still true:

  • adding from addFile(path: {local fs} is creating a directory, regardless of wrapWithDirectory: false being passed in the options
  • adding from addFile(content: {readStream} is creating a file, regardless of wrapWithDirectory: true being passed in the options

(the CID changes, tho, depending on the wrapWithDirectory flag, so it IS being passed and affecting the contents of that root UnixFS CID, but none of the stats change?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

addFile in @helia/unixfs doesn't respect the path and field and wrapWithDirectory option
4 participants