-
Notifications
You must be signed in to change notification settings - Fork 37.8k
leveldb: Win32WritableFile without memory mapping #6917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Use a simple Win32WritableFile, equivalent to PosixWritableFile in posix_env.cc
Please submit to https://github.com/bitcoin/leveldb first? |
Thought about that, but I think it gets more exposure and testing here. |
@laanwj Sure! |
ut ACK |
1 similar comment
ut ACK |
We should probably benchmark a reindex with this?
|
@sipa Yes, and if its much slower it's reason to figure out whats strictly required for the syncs. And if it's not, we could call it done. Reindex time right now is ruined by signature validation, if someone benchmarks that they must be sure to bypass. |
This is the same as how it works on POSIX already, directly converted to WIN32 API. By all means, benchmark, but IMO we should apply this nevertheless (if it has no new bugs). We should first get rid of stability issues and only then worry about optimization. This will likely save many people from having to do a reindex at all. |
Concept ACK |
Flush() is supposed to clear application-side buffers (like fflush). As this writable file implementation uses the OS function directly, there are no buffers to flush.
Last commit is optional and potentially risky. From what I understand, leveldb's WritableFiles are supposed to do their own caching, and ... it's of course also possible to use libc Then again these are all performance concerns. As said above it may be wrong to worry about that here, I'm mostly concerned with this being correct. |
Isn't LevelDB dead at all? Their Github repo seemed to be not actively developed on the last time I checked. |
@Diapolo That is a good observation. That's why I'm doing an attempt at maintaining it. |
@laanwj Yes, will first do a -reindex on testnet to be able to compare before and after... after that I'm going to kill the testnet instance and see if it's corrupting. |
Thanks! |
Here is an alternative implementation of |
@jonasschnelli it would be useful to have executables here, mind pointing your nightly builder at this? |
Here we go: https://bitcoin.jonasschnelli.ch/pulls/6917/ |
I am so glad about it you can't imagine the time I lost because of this bug. This is the only reason I don't have a full node on my laptop. Trying it right now. |
return ((x + y - 1) / y) * y; | ||
std::wstring path; | ||
ToWidePath(fname, path); | ||
DWORD Flag = PathFileExistsW(path.c_str()) ? OPEN_EXISTING : CREATE_ALWAYS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use OPEN_ALWAYS (which should open if exists, and create if not, according to https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858%28v=vs.85%29.aspx)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this code is just moved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal is to have f=fopen(filename, "wb")
semantics. It's possible that this could be improved, but indeed I just copied the code from the previous WritableFile implementation.
@laanwj We don't do small writes anyway, so I doubt an application-level cache would help. |
Concept ACK in any case. |
We don't, but maybe leveldb does internally? (WritableFiles are used for all of leveldb's file creation and writing, also when building/rebuilding tables, not just to write out database batches. Not sure how careful it is to not, say, call Append multiple times to write different parts of data structures) |
I just have started a fresh node (mainnet / this PR) on a win8.1pro with enabled mcafee antivirus (SSD / 2GB RAM / 1.4 GHZ Core i5) ... will report soon. |
Have synced up to 318301,.. twice restarted bitcoin-qt (just to see if stop / start / checkblocks works). No problems with the database so far. My node is still syncing (very slow, serval blocks per minutes). |
@jonasschnelli that's a known issue and unrelated to these changes, see #5664 |
Pro tip for corruption problem : Attach a disk, index the blockchain data on it, unplug savagely. I had corruption everytimes my cat played with the ethernet cable between my computer and my network disk. I did not managed to reproduce the problem so far by just killing the process, I'll try once I find a spare disk somewhere. |
I'm pretty surprised that having the blockchain database on a network disk
works at all...
|
Mapped Drive, then just pointing to SMB share. I'm not at my office, so I'll try with external drive unplugged savagely instead of real network disk as soon as I can. |
Well it's good to know that it seems to work even with a network |
Yes I agree, this is why I switched to a normal VM later on Microsoft Azure. But since I've moved to VM they are sometimes rebooted automatically for maintainance purposes, and it got corrupted several time. The good thing about mapped disk is that thanks to weaker synchronization guarantee, it makes easier to reproduce data corruption problems ! :) |
The issue that this change tries to improve is extremely easy to reproduce already: it happens nearly every time bitcoind or bitcoin-qt running on Windows crashes, or the Windows crashes, or the power is pulled. This happens for local filesystems on drives connected through SATA. The PR does not purport to fix issues with external harddisks, network filesystems, corrupting cables, felines, and other additional sources of complications. Being robust to those would be great, if realistically possible, but is not the immediate goal here. |
I have powered down (unclean sudden shutdown) my Win8.1 VM and restarted Windows and Bitcoin-Qt. Bitcoin-Qt started without issues and is ~ on the same height (353900). Would it be worth to do the same procedure with the current master WITHOUT this PR to compare it? |
Sure. If you have a VM environment where you can take a snapshot, that's helpful, you can return to that after messing up the db. |
I made a snapshot before the first sudden shutdown. But just powered down the host by unplugging the power cable... Best chain could be activated and the node continues syncing. Will do some more shutdowns and see if nothing happens,... then switch to current master and try again. |
Re: "bad undo data", see #6923 - it's a potential issue, but not something that could be solved by this (as it isn't a leveldb problem) |
I now did restore serval times back to my snapshot and did some sudden power downs. ~50% it will end up with a corrupt database (100% with "bad undo data"). Will now try to run my snapshot with current master without this PR. |
That's strange - I only managed to get the undo data error once in all my tries, whereas (without this PR) it produced consistent leveldb errors. |
Difference in VM flushing behavior, I guess --- it may just be that the VM buffers and reorders writes and ignores fsync. Although... we appear to be missing a FileCommit in UndoWriteToDisk-- I think we need to have synced the blocks and undo before calling the insert. It would probably be better for performance if it wrote the block then undo, then did the syncs on both however. Edit: oh nevermind, FlushBlockFile hits the undo file too. :-/ |
With the current master (without this PR), after every sudden shut down, i get "Corruptions: error in middle of record". This PR looks after an improvement but getting #6923 for roughly 50% of power-off shutdowns. |
If it improves things, that is good. Filed this for the leveldb repository: bitcoin-core/leveldb-old#9 |
Closing the pull here, as it has been moved to the leveldb repository. It should come back to the bitcoin repository in a subtree update. |
Use a simple Win32WritableFile, equivalent to PosixWritableFile in posix_env.cc
The goal is to solve leveldb corruption issues on windows - this issue has been reported many times, see #5865, #6727, #5610, ...
In my testing with this change I was unable to cause the database to be corrupted when crashing a windows laptop in various ways using notmyfault. Without the change it happened the first time I tried.