
Conversation

AudriusButkevicius
Member

Not sure if I'm doing the invalidation right, but you get the idea of what I'm trying to do.

@AudriusButkevicius
Member Author

@calmh, any comments on the general idea?

@calmh
Member

calmh commented Feb 21, 2018

I'm not sure I like a protocol message for this. As far as I'm concerned this is a bug on the sender side - we're specifically asking for a block with a certain hash (or, I forget, maybe just implicitly because it claimed it had that hash to begin with) and it's giving us something else. It should not be our responsibility, as the receiver, to tell it what to do about that.

We could / should instead verify the hash as we're sending and trigger a rescan as appropriate. Of course we don't want to do that all the time, as it's expensive. Potential ways to mitigate:

  • If we have a weak hash, use that. We just need a CRC check here, really...
  • Keep track of block hashes requested per peer device, in something like a size-limited LRU list (see the sketch after this list). Verify the hash of the block we're sending if it's a repeat request (and then mark it as verified if it actually does check out, so we don't do it for requests three, four, ...).
  • Just suck it up and always do the hash, it's going to be faster than the network speed anyhow and the receiver also does it...
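
A minimal sketch of that second option, as a size-limited LRU keyed by block hash; the type and method names are illustrative, not Syncthing's actual API:

package lru

import "container/list"

// seenRequests remembers the last max block hashes a peer asked for,
// so a repeat request for the same block can trigger verification.
type seenRequests struct {
	max   int
	order *list.List               // front = most recently requested
	byKey map[string]*list.Element // block hash -> list entry
}

func newSeenRequests(max int) *seenRequests {
	return &seenRequests{
		max:   max,
		order: list.New(),
		byKey: make(map[string]*list.Element),
	}
}

// Seen records the hash and reports whether it was already present.
// The oldest entry is evicted once the list exceeds its size limit.
func (s *seenRequests) Seen(blockHash []byte) bool {
	key := string(blockHash)
	if el, ok := s.byKey[key]; ok {
		s.order.MoveToFront(el)
		return true
	}
	s.byKey[key] = s.order.PushFront(key)
	if s.order.Len() > s.max {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.byKey, oldest.Value.(string))
	}
	return false
}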

@AudriusButkevicius
Member Author

Well, my argument is that the receiver already does that, so why not use that as a mechanism for signaling?

@calmh
Member

calmh commented Feb 21, 2018 via email

@AudriusButkevicius
Member Author

Well, in that case the last option sounds the most reasonable.

@AudriusButkevicius
Member Author

So do we agree on the last option as the measure?
I guess we could check scan speed and, if it's below, say, 20 MB/s, not do that?

@AudriusButkevicius
Member Author

To be honest, my heart breaks thinking about all the phones and RPis doing that, versus just one lousy protocol message.

@calmh
Member

calmh commented Feb 27, 2018

If you think that's too painful, why not choose the middle one (check the block the second time the same peer asks for it - this should never happen, really). That still keeps the bug workaround where it belongs, on the buggy side.

@AudriusButkevicius
Member Author

I don't think this is always a bug; it can also be caused by interactions with other apps.

@AudriusButkevicius
Member Author

Do you agree or disagree that it's not always a bug?

@calmh
Member

calmh commented Mar 4, 2018 via email

@AudriusButkevicius
Member Author

Ok

@AudriusButkevicius
Member Author

So I was thinking about this, but any LRU idea seems like it's going to be hit and miss in a many-small-files situation.

A Bloom filter would probably work okay-ish, but getting the size right is hard, as at some point you'll just start flagging everything.

What about this:

We make the weak hash always be there (in the scans at least; we can still skip it when rolling over the local file, to reduce transfers).
We send weak hashes as part of the request.
We check that what we've read matches the weak hash they've given us; if it doesn't, we go on a venture of looking things up in our own database, seeing if those weak hashes match, and if they don't, doing a strong hash comparison, rescans and whatnot.

An alternative would be to check whether the weak hash sent is non-zero and only do this checking in that case, but that's somewhat half-assed, as then we sort of both don't have the problem anymore and still have it at the same time, depending on the current moon phase.
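
As a rough sketch of that flow on the sending side, with the stdlib Adler-32 standing in for the actual weak hash implementation (only the shape of the check comes from the proposal above):

package sketch

import (
	"bytes"
	"crypto/sha256"
	"hash/adler32"
)

// verifyOutgoing checks the buffer we read from disk against the hashes
// the requester sent, cheapest check first. It reports whether the data
// is safe to send as-is.
func verifyOutgoing(buf []byte, wantWeak uint32, wantStrong []byte) bool {
	// Weak hash first: a CRC-like checksum, far cheaper than SHA-256.
	if wantWeak != 0 && adler32.Checksum(buf) == wantWeak {
		return true
	}
	// Weak hash absent or mismatched: confirm with the strong hash.
	sum := sha256.Sum256(buf)
	return bytes.Equal(sum[:], wantStrong)
}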

@calmh
Member

calmh commented Mar 21, 2018

Yeah, I agree on the weak hash thing. Make it mandatory for scans. Check it on transmission. I don't think we should do anything fancy on failure though - return an error code to the requester, tag the file as invalid in the database (this will also fire an index update at some point, so we stop getting requests for it), and fire a rescan (which should pick up that it's invalid in the db and do a full hash).
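
A sketch of that failure path, with a toy model standing in for the real internals; db, scanQueue and the invalid flag are hypothetical stand-ins, and errNoSuchFile stands in for the protocol-level error code:

package sketch

import "errors"

var errNoSuchFile = errors.New("no such file")

type fileInfo struct{ invalid bool }

type model struct {
	db        map[string]fileInfo // "folder/name" -> file info
	scanQueue chan string         // files waiting for a rescan
}

// onVerifyFailure runs when the data read from disk no longer matches
// the hashes we advertised for it.
func (m *model) onVerifyFailure(key string) error {
	if fi, ok := m.db[key]; ok {
		fi.invalid = true // the next index update tells peers to stop asking
		m.db[key] = fi
	}
	m.scanQueue <- key // the rescan sees the invalid flag and does a full hash
	return errNoSuchFile // error code back to the requester
}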

@AudriusButkevicius
Member Author

Have a look at this once you have time.

Well tested as usual.

@AudriusButkevicius
Member Author

I have no idea how to fix the serialization test; I probably need to downgrade something, but it's not obvious what.

@AudriusButkevicius
Member Author

Or was that fixed by hand?

@AudriusButkevicius
Member Author

@imsodin have a look

Member

@imsodin imsodin left a comment

I think the added benchmarks and the updated vendor should go into a separate PR.

} else {
l.Debugln("Finder failed to verify buffer", err)
}
if err := verifyBuffer(buf, block); err != nil {
Member

Should we trigger a rescan of the file here?

Member Author

We are midway through a pull, so if the stuff doesn't match, I suspect the next scan will pick it up.

This block map fixing stuff was there to deal with me fking up the first version and storing half of the hash in the map, so I am just removing the 2-3 year old code that was repairing the damage.

Copy link
Member

Yeah right, I didn't get the context right. And thanks for the info on the map fix; I was sort of wondering about that but didn't see the point of the removed code, so kept silent :)


// The file probably did not get rescanned, as mtimes haven't changed.
// Invalidate it, and trigger another scan.
cf.Invalidate(m.shortID)
Member

I don't get where mtime comes into play. Neither do I understand the need to invalidate (that's maybe linked to not getting the mtime bit).
In general, wouldn't it be simpler and use fewer resources to rescan the file directly, instead of explicitly hashing after checking the block hash? The current flow hashes a changed file twice.

Member Author

The idea here is as follows:

  1. We get what the other side expects to receive as part of the request.
  2. We validate that what we've read matches the weak hash; if it does, nothing happens.
  3. If it doesn't match, we validate using SHA-256; if that matches, all is good.
  4. If it still doesn't match, we look up our own file in the database and compare hashes for that block. If our hashes don't match the request's hashes, the requester is asking for something different, so we do nothing.

You are right that we don't need to hash here: we know that what we've read doesn't match the hash the requester asked us for, and we also know that our hash matches theirs, which means the file has somehow changed, so we can skip HashFile and go straight to invalidation and rescan.

The idea behind the mtime is that if something modifies files in place, or artificially adjusts mtimes, we could miss changes; hence we double-check what we're reading.
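
A sketch of that last decision, assuming steps 1-3 already failed; dbStrong here is the block hash from our own database entry (hypothetical plumbing):

package sketch

import "bytes"

// afterMismatch decides what to do once the data read from disk has
// failed both hash checks against the request.
func afterMismatch(reqStrong, dbStrong []byte) (rescan bool) {
	if !bytes.Equal(dbStrong, reqStrong) {
		// The requester wants a different version of the file than the
		// one we have on record; nothing on our side to fix.
		return false
	}
	// Our record agrees with the request but the disk does not: the
	// file changed behind our back, so invalidate it and rescan.
	return true
}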

Member

Ok, I get the invalidation now. One little nit there: in almost all cases it will result in an index update sent with the invalid file info, directly followed by an updated valid file info. Only in the rare case of a file that has unchanged size and mtime but changed content will the invalidation have any effect; in all other cases it is useless overhead. You could pass the file info from Request, check whether it is equivalent to the current file in the database (as in (f FileInfo) IsEquivalent), and invalidate if it is, scan otherwise.
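
A sketch of that equivalence check, comparing only size and mtime here; the real (f FileInfo) IsEquivalent looks at more fields:

package sketch

import "time"

type statInfo struct {
	size    int64
	modTime time.Time
}

// needsInvalidation reports whether invalidation is required to force a
// rescan: only when size and mtime are unchanged, because then a plain
// scan would skip the file. In every other case a normal rescan picks
// the change up by itself and the invalid index update is just overhead.
func needsInvalidation(statted, db statInfo) bool {
	return statted.size == db.size && statted.modTime.Equal(db.modTime)
}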

Member Author

Request does not have a FileInfo; it just has folder, name, offset, size, hash, weakHash.
Essentially the goal is to force-rescan the file.

Member

I don't know whether force-rescanning is a good idea. The only real case I can think of where a file has changed content without changed size and mtime is bitrot. In that case we'd rather not scan and send the data.

As to what I meant by checking whether the file has changed: within Request the file is stat'ed, and within recheckFile the current file info is obtained from the db. That's all the information needed to check whether the two are "equivalent".

@AudriusButkevicius
Member Author

AudriusButkevicius commented Mar 26, 2018

This will have the negative effect that files sourced from devices that did not weak-hash will now be hit with constant SHA-256 checks for every request.

The alternatives would be:

  1. Files scanned prior to this feature are not double-checked.
  2. Force everyone to rehash everything to populate weak hashes.

@imsodin
Member

imsodin commented Mar 26, 2018

Please not 2. Other options:
3. Add a database transition where any file without weak hashes is rehashed fully when changed, or just weak-hashed if unchanged. Clean but cumbersome.
4. Don't ever use SHA-256 for validation; just assume everything is fine, as before, if there are no weak hashes from either side. Makes this best-effort, but generally cheaper.

@AudriusButkevicius
Member Author

4 is essentially 1.

@calmh ?

@calmh
Member

calmh commented Mar 26, 2018 via email

@calmh
Member

calmh commented Apr 11, 2018

I'm not sure why we need to pass the weak hash in the request. As far as I understand it, we only need to check that the hash of the data we are going to send matches the hash we have on file locally, and by extension matches the weak hash we have on file locally. Adding it to the protocol requires us to nail down what the weak hash is exactly, require everyone to use it, etc.

@AudriusButkevicius
Member Author

Right, it seems build.go updated gogoproto or something, as I can't reproduce it by running go generate locally anymore.

@AudriusButkevicius
Member Author

Huh, still fails.

@calmh
Member

calmh commented Apr 21, 2018 via email

@calmh
Member

calmh commented Apr 22, 2018

I pushed a commit to your branch. From the diff it looks like I have newer protoc stuff, and I guess that does not come from vendoring.

Member

@calmh calmh left a comment

So this makes weak hashes non-optional, right? So we should remove the config, autodetection/benchmarking, environment variable, etc.?

@@ -106,7 +104,7 @@ func (ph *parallelHasher) hashFiles(ctx context.Context) {
panic("Bug. Asked to hash a directory or a deleted file.")
}

blocks, err := HashFile(ctx, ph.fs, f.Name, f.BlockSize(), ph.counter, ph.useWeakHashes)
Member

Maybe remove this parameter altogether.

Member Author

It's still used in some cases, I think.

@AudriusButkevicius
Member Author

The autodetection and all is still useful, as weak hashing can still be disabled when pulling.

@calmh
Member

calmh commented Apr 22, 2018

So the autodetection is about scanning speed, and with this in, we should always scan with both the strong and the weak hash, no? For pulling we have the min-percentage-changed threshold, which I guess should remain and can be set >100 to disable the use of weak hashes.

I mean, we could still avoid scanning with the weak hash, but that shifts the burden, as the sender then always has to use the strong hash for verification, which does not seem fair?

@AudriusButkevicius
Member Author

Yeah, sure, we can just keep the percentage.

@AudriusButkevicius
Member Author

So the hashFile(useWeakHash) parameter is used in cpuBench, which is then reported back to UR.
If we remove the parameter, we end up with sha256perf meaning SHA-256 + Adler-32.

Member

@imsodin imsodin left a comment

With verification I believe it is beneficial to always use weak hashing in the long run even for devices that would previously have it disabled, as the verification will be cheaper.

@@ -1382,9 +1382,54 @@ func (m *Model) Request(deviceID protocol.DeviceID, folder, name string, offset
return protocol.ErrGeneric
}

if !scanner.Validate(buf, hash, weakHash) {
m.recheckFile(deviceID, folderFs, folder, name, int(offset)/len(buf), hash)
Member

Return with ErrNoSuchFile here.

// The hashes provided as part of the request match what we expect to find according
// to what we have in the database, yet the content we've read off the filesystem doesn't.
// Something is fishy; invalidate the file and rescan it.
cf.Invalidate(m.shortID)
Member

My concern from an outdated comment still applies:

I don't know whether force rescanning is a good idea. The only real case where a file has changed content without changed size&mtime I can think of is bitrot. In that case we'd rather not scan and send the data.

The only benefit of invalidation that I see is that the other device won't try to pull from us again until we update. That might even be bad, if other blocks are still fine.

Member Author

We can't force a rescan if the mtime has not changed, for example, so invalidation allows us to force one.

Member

I am aware it is necessary for force-rescanning; my concern is exactly about force-rescanning. It feels inconsistent, as such changes will only be detected and scanned if the blocks are requested by another device. And I still can't think of another typical case of a modified file with mtime and size unchanged, except bitrot.

Member Author

mmapped files, files on FAT32 where the mtime granularity is 2 s, etc.

Member

@calmh calmh left a comment

Yeah, the common case is the file changing without the mtime changing: because of mmap, low-resolution mtime combined with an unlucky scan time, intentional resetting of mtime (ID3 taggers), etc.

I’m fine with hash perf meaning strong+weak. It fits with the spirit of the benchmark being “how fast is this machine at hashing a file”.

@AudriusButkevicius
Member Author

So I'll leave hash performance as it is, and leave the flag in there, as I don't see much harm in reporting two numbers rather than one, just for the sake of having a flag.

What is left to do here?

imsodin
imsodin previously approved these changes Apr 22, 2018
Member

@imsodin imsodin left a comment

Ok, I am convinced.

I still propose to check whether the db file and the file info from lstat are "equivalent" (as in: the scanner won't hash) and only invalidate in that case, but I can file a follow-up PR to discuss that.

@calmh
Member

calmh commented Apr 22, 2018

A test? I think the requests_test.go thing has a couple of tests that run the request path; something in there could maybe be used for inspiration.

@AudriusButkevicius
Member Author

Added a crude test.

@AudriusButkevicius
Member Author

So, what's up with this?

@calmh calmh merged commit ef0dcea into syncthing:master May 5, 2018
@AudriusButkevicius AudriusButkevicius deleted the rescanreq branch May 5, 2018 09:42