Content routing issues with "Reprovider.Strategy" set to "roots" #9416

@AnnaArchivist

Installation method

ipfs-update or dist.ipfs.tech

Version

Kubo version: 0.16.0
Repo version: 12
System version: amd64/darwin
Golang version: go1.19.1

Config

{
  "API": {
    "HTTPHeaders": {}
  },
  "Addresses": {
    "API": "/ip4/127.0.0.1/tcp/5001",
    "Announce": [],
    "AppendAnnounce": [],
    "Gateway": "/ip4/127.0.0.1/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/54957",
      "/ip6/::/tcp/54957",
      "/ip4/0.0.0.0/udp/54957/quic",
      "/ip6/::/udp/54957/quic"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ"
  ],
  "DNS": {
    "Resolvers": {}
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": true
    }
  },
  "Experimental": {
    "AcceleratedDHTClient": false,
    "FilestoreEnabled": true,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range",
        "User-Agent"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "<redacted>"
  },
  "Internal": {},
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": [],
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {
    "Interval": "12h",
    "Strategy": "roots"
  },
  "Routing": {
    "Methods": null,
    "Routers": null,
    "Type": "dhtclient"
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "GracePeriod": "20s",
      "HighWater": 900,
      "LowWater": 600,
      "Type": "basic"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "RelayClient": {},
    "RelayService": {},
    "ResourceMgr": {},
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}

Description

I'm seeing unexpected behavior when using Reprovider.Strategy "roots".

This might be one or multiple bugs in kubo, in the documentation, in the gateway, in the specs, or in some interaction between them all. Treat it as an umbrella bug: I'm not too familiar with the different subsystems that might be involved, and I'm writing this from a user's perspective, not an IPFS developer's perspective.

Hopefully someone can help figure out whether this needs to be split up into smaller tickets, or filed elsewhere. I'd be happy to work with you all to make that happen!

Background

(Re)providing very large directories with lots of data can take a long time to complete. My understanding from the config docs is that you can use Reprovider.Strategy "roots" to provide only the directory CID itself, provided that the directory is pinned. Fantastic, that should make (re)providing much faster!
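
For reference, a minimal sketch of the relevant settings using `ipfs config` (equivalent to editing the Reprovider section of the config shown above):

$ ipfs config Reprovider.Strategy roots   # announce only the roots of pinned content
$ ipfs config Reprovider.Interval         # prints the reprovide interval, "12h" by default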

This blog post then suggests that IPFS nodes should still be able to find the content linked from such a directory, e.g. when using <directory CID>/filename (emphasis mine):

If you are adding folders of files to IPFS, only the CID for the pinned folder will be advertised (all the blocks will still be retrievable with Bitswap once a connection to the node is established).

Therefore, I would expect that I can do this:

  1. Add new files with a wrapping directory, pinning only the directory.
  2. Provide using Reprovider.Strategy "roots", so that only the "directory CID" is provided.
  3. On a gateway, go to /ipfs/<directory CID>/filename (NOT /ipfs/<file CID>, which will surely not work because it has never been provided)
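
Spelled out as commands, the expectation looks roughly like this (a sketch with placeholder CIDs, not the exact reproduction below):

# On the providing node:
$ ipfs add --wrap-with-directory --pin=true 1.txt   # suppose the wrapping directory CID is <dirCID>
$ ipfs dht provide <dirCID>                         # only the root is announced to the DHT

# On any gateway, this should work, since <dirCID> is resolvable via the DHT
# and the file blocks come over Bitswap once the provider is connected:
#   /ipfs/<dirCID>/1.txt
# while this should fail, since <fileCID> was never announced:
#   /ipfs/<fileCID>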

Steps taken

Some of these steps might be unnecessary, but I tried to stick as closely as possible to my production use case while still keeping the example minimal. When debugging this, you might be able to make it even more minimal.

$ ipfs init # Create a new repo.
$ ipfs config --json Experimental.FilestoreEnabled true # Since we do that in my production setup.
$ ipfs config --json Reprovider.Strategy '"roots"' # Enable our roots strategy.
$ ipfs daemon --offline # First start in offline mode, so that "ipfs add" doesn't accidentally provide all our CIDs.

# In a different window:
$ head -c 1000 < /dev/urandom > 1.txt # Fill 1.txt with random data.
$ head -c 1000 < /dev/urandom > 2.txt # Fill 2.txt with random data.
$ ipfs add --wrap-with-directory --nocopy --recursive --hash=blake2b-256 --chunker=size-1048576 1.txt 2.txt # Using the same command line parameters as we use in production.
# Note that this gives us our directory CID, as well as the CIDs of 1.txt and 2.txt.

# In the first window again:
$ ipfs daemon # Start the daemon properly now (be sure to kill the previous one)

# In window 2 again:
$ ipfs bitswap reprovide # This often crashes the daemon (see below).
$ ipfs dht provide -v <directory CID> # Alternative if `bitswap reprovide` crashes. Note that we omit "-r".
$ ipfs stats provide # Verify that providing happened.

Ok, we have now provided just the directory CID onto the IPFS network. We can verify that it can be found by using this diagnostic tool. Indeed, I can consistently confirm that the directory can be found and accessed on my node. Similarly, we can consistently confirm that the CIDs for 1.txt and 2.txt cannot be found, which is in line with our expectations.
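
From a second node, the same check can be done on the command line; a sketch, using `ipfs dht findprovs` (the kubo 0.16 command for querying provider records):

$ ipfs dht findprovs <directory CID>       # should print the providing node's peer ID
$ ipfs dht findprovs <file CID of 2.txt>   # should print nothing, since that CID was never provided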

First bug: 2.txt not accessible

Now it becomes a bit more iffy: I can't always fully reproduce the next steps, but most of the time I can.

  1. I go to a gateway. I've tested this with ipfs.io, cloudflare-ipfs.com, gateway.pinata.cloud, and ipfs.fleek.co. It would be good to also test with a locally hosted gateway, but I haven't done that yet.
  2. On the gateway, I go to /ipfs/<directory CID>/1.txt. Usually, this works!
  3. Now I try to go to /ipfs/<directory CID>/2.txt. Usually, this times out. No matter how long I wait, I can't get it to work. Even when using a different gateway it often still times out! And sometimes I can't even access the directory itself, /ipfs/<directory CID>.

Interestingly, on a different IPFS node, ipfs resolve -r /ipfs/<directory CID>/2.txt usually does work, and a subsequent ipfs dag get on its output also usually does give me the data from 2.txt.
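
For concreteness, the two commands from the paragraph above, with placeholders:

$ ipfs resolve -r /ipfs/<directory CID>/2.txt   # usually succeeds, printing /ipfs/<file CID of 2.txt>
$ ipfs dag get <file CID of 2.txt>              # usually returns the data for 2.txt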

Second bug: crash during ipfs bitswap reprovide

@lidel moved this to #9418

Related discussions

In my investigations so far, I've found a couple of people who have mentioned potentially related issues:

My hypothesis is that once an IPFS node (e.g. a gateway) has cached the directory block locally, it no longer remembers which node it originally fetched it from, so when it later looks up 2.txt it has no connection to the only peer that has the data. And since the CID for 2.txt itself is not cached, or provided on IPFS at all, it simply can't be found.

If I'm right about this, I see two possibilities, depending on whether this is the intended behavior:

  1. Yes, this is the intended behavior. In that case, what is the purpose of Reprovider.Strategy "roots" in the first place? Is it only useful when we know that other nodes will always recursively fetch all the "children" of a root node? If so, that should be documented more clearly. It would also be fairly brittle: if a node goes down partway through a recursive fetch (which could be many terabytes!), it would be hard to resume.
  2. No, this is not the intended behavior. In that case, you probably know better than I do how to resolve it. I could imagine either remembering which node we originally got the "directory CID" from and re-establishing a connection with it when fetching a child, or checking all the nodes that provide the "directory CID" to see if they have the child CIDs. Or something else entirely, I dunno!

I hope this is helpful! Again, happy to provide more details and think through this with you all.
