@jianhuangli jianhuangli commented Jul 29, 2025

These are cherry-picks based on branch redfs-ubuntu-noble-6.8.0-58.60@62241a58cec8.

Test results:

xfstests-rocky95-uring-dio-2507261601.log:Failures: generic/099 generic/120 generic/193 generic/237 generic/263 generic/317 generic/319 generic/344 generic/346 generic/355 generic/375 generic/389 generic/444 generic/617 generic/631 generic/633 generic/647 generic/683 generic/697 generic/729
xfstests-rocky95-uring-dio-2507261601.log-Failed 20 of 756 tests
--
xfstests-rocky95-uring-dio-2507261601.log:Failures: generic/521
xfstests-rocky95-uring-dio-2507261601.log-Failed 1 of 2 tests
--
xfstests-rocky95-uring-nodio-2507261355.log:Failures: generic/099 generic/192 generic/209 generic/237 generic/263 generic/317 generic/319 generic/355 generic/363 generic/366 generic/375 generic/389 generic/444 generic/451 generic/617 generic/631 generic/633 generic/647 generic/683 generic/697 generic/729 generic/749
xfstests-rocky95-uring-nodio-2507261355.log-Failed 22 of 755 tests
--
xfstests-rocky95-uring-nodio-2507261355.log:Failures: generic/521
xfstests-rocky95-uring-nodio-2507261355.log-Failed 1 of 2 tests
--
xfstests-rocky95-without-uring-dio-2507261238.log:Failures: generic/099 generic/120 generic/193 generic/237 generic/263 generic/317 generic/319 generic/344 generic/346 generic/355 generic/375 generic/389 generic/444 generic/531 generic/617 generic/631 generic/633 generic/647 generic/683 generic/697 generic/729
xfstests-rocky95-without-uring-dio-2507261238.log-Failed 21 of 756 tests
--
xfstests-rocky95-without-uring-dio-2507261238.log:Failures: generic/521
xfstests-rocky95-without-uring-dio-2507261238.log-Failed 1 of 2 tests
--
xfstests-rocky95-without-uring-nodio-2507261159.log:Failures: generic/099 generic/192 generic/209 generic/237 generic/263 generic/317 generic/319 generic/355 generic/363 generic/366 generic/375 generic/389 generic/444 generic/451 generic/531 generic/617 generic/631 generic/633 generic/647 generic/683 generic/697 generic/729
xfstests-rocky95-without-uring-nodio-2507261159.log-Failed 22 of 755 tests
--
xfstests-rocky95-without-uring-nodio-2507261159.log:Failures: generic/521
xfstests-rocky95-without-uring-nodio-2507261159.log-Failed 1 of 2 tests

bsbernd and others added 11 commits July 25, 2025 08:26
When the writeback cache is enabled, it is beneficial for data consistency
to tell the FUSE server when the kernel prepares a page for caching.
This lets the FUSE server react and lock the page.

In addition, the same call lets the FUSE server decide how much data it locks,
and the kernel keeps the returned range in its DLM lock management.

If the server does not support the feature, it is disabled after the first
unsuccessful use.

- Add DLM_LOCK fuse opcode
- Add lock caching for pages in the writeback cache.
  A FUSE call is sent whenever the kernel prepares a page for the
  writeback cache. The kernel keeps track of the locks it has already
  acquired, so it does not ask again for a range that is already locked
  (except for the case that is documented in the code).
- Use rb-trees for the management of the already 'locked' page ranges
- Use rw_semaphore for synchronization in fuse_dlm_cache

(cherry picked from commit 287c884)
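
The tracking of already locked page ranges described in the DLM_LOCK commit above can be illustrated with a small user-space sketch. This is not the kernel code: the commit uses rb-trees and an rw_semaphore, while this example swaps in a fixed-size sorted array and no locking to stay short. All names (`range_cache`, `range_cache_add`, `range_cache_covers`, `MAX_RANGES`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 64 /* hypothetical capacity of the sketch */

/* Inclusive range of page indices the server reported as locked. */
struct page_range {
    uint64_t start;
    uint64_t end;
};

/* Sorted, non-overlapping, non-adjacent ranges (stand-in for an rb-tree). */
struct range_cache {
    struct page_range r[MAX_RANGES];
    size_t n;
};

/* True if a ends before b starts and the two do not touch. */
static bool strictly_before(struct page_range a, struct page_range b)
{
    return a.end < b.start && b.start - a.end > 1;
}

/* True if [start, end] is already locked, so no new request is needed. */
static bool range_cache_covers(const struct range_cache *c,
                               uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < c->n; i++)
        if (c->r[i].start <= start && end <= c->r[i].end)
            return true;
    return false;
}

/* Record a newly locked range, merging overlapping or adjacent entries.
 * Returns false and leaves the cache untouched if it would overflow. */
static bool range_cache_add(struct range_cache *c, uint64_t start, uint64_t end)
{
    struct page_range out[MAX_RANGES + 1];
    struct page_range add = { start, end };
    size_t n = 0;
    bool placed = false;

    assert(start <= end);
    for (size_t i = 0; i < c->n; i++) {
        struct page_range cur = c->r[i];

        if (strictly_before(cur, add)) {
            out[n++] = cur;
        } else if (strictly_before(add, cur)) {
            if (!placed) {
                out[n++] = add;
                placed = true;
            }
            out[n++] = cur;
        } else {
            /* Overlapping or adjacent: grow the range being inserted. */
            if (cur.start < add.start)
                add.start = cur.start;
            if (cur.end > add.end)
                add.end = cur.end;
        }
    }
    if (!placed)
        out[n++] = add;
    if (n > MAX_RANGES)
        return false;
    for (size_t i = 0; i < n; i++)
        c->r[i] = out[i];
    c->n = n;
    return true;
}
```

A caller would check `range_cache_covers()` before sending a lock request and call `range_cache_add()` with the range the server returned. Merging adjacent ranges keeps the lookup a single containment test per entry.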
Add support for invalidating inode aliases when doing inode invalidation.
This is useful for distributed file systems that use DLM for cache
coherency: when a client loses its inode lock, it should invalidate
its inode cache and dentry cache, since another client may delete
the file after getting the inode lock.

Signed-off-by: Yong Ze Chen <yochen@ddn.com>
(cherry picked from commit 49720b5)
Renumber the operation code to a high value to avoid conflicts with upstream.

(cherry picked from commit 27a0e9e)
Send a DLM_WB_LOCK request in the page_mkwrite handler to enable FUSE
filesystems to acquire a distributed lock manager (DLM) lock for
protecting upcoming dirty pages when a previously read-only mapped
page is about to be written.

Signed-off-by: Cheng Ding <cding@ddn.com>
(cherry picked from commit ec36c45)
Allow read_folio to return an EAGAIN error and translate it to
AOP_TRUNCATED_PAGE so that page fault and read operations are retried.
This prevents a deadlock caused by a folio lock/DLM lock order reversal:
 - Fault or read operations acquire the folio lock first, then the DLM lock.
 - The FUSE daemon blocks new DLM lock acquisition while it is invalidating
   the page cache, and invalidate_inode_pages2_range() acquires the folio lock.
To prevent the deadlock, the FUSE daemon fails the DLM lock acquisition
with EAGAIN if it detects an in-flight page cache invalidation.

Signed-off-by: Cheng Ding <cding@ddn.com>
(cherry picked from commit 8ecf118)
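
The EAGAIN translation described in the commit above can be modelled in user space as follows. This is a sketch under assumptions, not the patch itself: `AOP_TRUNCATED_PAGE` is defined locally with the value the kernel uses, and `mock_server`, `mock_read_folio`, `translate_read_error` and `read_with_retry` are hypothetical names. In the kernel the retry is done by the page cache caller after the folio has been released, not by a loop like this one.

```c
#include <assert.h>
#include <errno.h>

/* Local copy for the sketch; the kernel defines this in its own headers. */
#define AOP_TRUNCATED_PAGE 0x80001

/* Map the server's EAGAIN to "retry the lookup"; pass other results through. */
static int translate_read_error(int server_err)
{
    return server_err == -EAGAIN ? AOP_TRUNCATED_PAGE : server_err;
}

/* Hypothetical server that refuses DLM locks while invalidations are running. */
struct mock_server {
    int invalidations_left;
};

static int mock_read_folio(struct mock_server *s)
{
    int err = 0;

    if (s->invalidations_left > 0) {
        s->invalidations_left--;
        err = -EAGAIN; /* server detected an in-flight invalidation */
    }
    return translate_read_error(err);
}

/* Retry while the read asks for it, up to max_tries attempts. */
static int read_with_retry(struct mock_server *s, int max_tries, int *tries)
{
    int ret;

    assert(max_tries > 0);
    *tries = 0;
    do {
        ret = mock_read_folio(s);
        (*tries)++;
    } while (ret == AOP_TRUNCATED_PAGE && *tries < max_tries);
    return ret;
}
```

Because the reader backs off instead of waiting for the DLM lock while holding the folio lock, the invalidation can take the folio lock and finish, after which the retried read succeeds.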
generic/488 fails with fuse2fs in the following fashion:

generic/488       _check_generic_filesystem: filesystem on /dev/sdf is inconsistent
(see /var/tmp/fstests/generic/488.full for details)

This test opens a large number of files, unlinks them (which really just
renames them to fuse hidden files), closes the program, unmounts the
filesystem, and runs fsck to check that there aren't any inconsistencies
in the filesystem.

Unfortunately, the 488.full file shows that there are a lot of hidden
files left over in the filesystem, with incorrect link counts.  Tracing
fuse_request_* shows that there are a large number of FUSE_RELEASE
commands that are queued up on behalf of the unlinked files at the time
that fuse_conn_destroy calls fuse_abort_conn.  Had the connection not
aborted, the fuse server would have responded to the RELEASE commands by
removing the hidden files; instead they stick around.

Create a function to push all the background requests to the queue and
then wait for the number of pending events to hit zero, and call this
before fuse_abort_conn.  That way, all the pending events are processed
by the fuse server and we don't end up with a corrupt filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
(cherry picked from commit d4262f9)
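
The ordering problem described in the commit above (abort before the queued RELEASE requests are handled) can be shown with a toy single-threaded model. Everything here is an assumption made for illustration: the `mock_conn` counters and the functions `flush_bg`, `server_drain`, `abort_conn` and `conn_destroy` are hypothetical, and the real code waits for the server asynchronously instead of calling it directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters standing in for the connection's request queues. */
struct mock_conn {
    unsigned bg_queued; /* background requests not yet sent to the server */
    unsigned pending;   /* requests the server can see */
    unsigned completed; /* requests the server answered */
    unsigned aborted;   /* requests dropped by the abort */
};

/* Push every background request onto the pending queue. */
static void flush_bg(struct mock_conn *c)
{
    c->pending += c->bg_queued;
    c->bg_queued = 0;
}

/* Stand-in for the server answering everything it can see. */
static void server_drain(struct mock_conn *c)
{
    c->completed += c->pending;
    c->pending = 0;
}

/* Abort drops whatever is still queued or pending. */
static void abort_conn(struct mock_conn *c)
{
    c->aborted += c->bg_queued + c->pending;
    c->bg_queued = 0;
    c->pending = 0;
}

/* Teardown; flush_first selects the behaviour the commit introduces. */
static void conn_destroy(struct mock_conn *c, bool flush_first)
{
    assert(c != NULL);
    if (flush_first) {
        flush_bg(c);
        while (c->pending > 0) /* wait for pending requests to hit zero */
            server_drain(c);
    }
    abort_conn(c);
}
```

With `flush_first` set, every queued request reaches the server before the abort, which corresponds to the hidden files being removed rather than left behind.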
This is a preparation to allow fuse-io-uring bg queue
flush from flush_bg_queue()

This does two function renames:
fuse_uring_flush_bg -> fuse_uring_flush_queue_bg
fuse_uring_abort_end_requests -> fuse_uring_flush_bg

And fuse_uring_abort_end_queue_requests() is moved to
fuse_uring_stop_queues().

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
(cherry picked from commit e70ef24)
This provides a single API for flushing background requests,
for example when the bg queue is flushed before
the remainder of fuse_conn_destroy() runs.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
(cherry picked from commit fc4120c)
When writing back pages with writeback caching enabled, the code copied the data into
temporary pages to avoid a deadlock during memory reclaim.

This is an adaptation and backport of a patch by Joanne Koong <joannelkoong@gmail.com>.

Since we use pinned memory with io_uring, we don't need the temporary copies,
and we don't use the AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM flag in the pagemap.

Link: https://www.spinics.net/lists/linux-mm/msg407405.html
Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
(cherry picked from commit 114c4df)
When the kernel sends a dlm request and the fuse server responds with
an error other than ENOSYS, the lock size will most likely be set to
zero. In that case the kernel aborts the fuse connection, which is
completely unnecessary.

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
(cherry picked from commit 0bc2f9c)
Check whether dlm is still enabled when interpreting the error
returned by the fuse server.

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
(cherry picked from commit f6fbf7c)
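
The reply handling that the last two commits and the original DLM_LOCK commit describe (disable on ENOSYS, do not abort on other errors, check that dlm is still enabled) might look roughly like the sketch below. The `mock_conn` structure, its fields, and `handle_dlm_reply` are hypothetical and only model the decision logic as stated in the commit messages. They are not the kernel's data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection state for the sketch. */
struct mock_conn {
    bool dlm_enabled; /* cleared after the first unsupported reply */
    bool aborted;     /* must stay false for ordinary DLM errors */
};

/* Interpret a DLM reply; returns how many bytes may be cached as locked. */
static uint64_t handle_dlm_reply(struct mock_conn *fc, int err,
                                 uint64_t locked_len)
{
    assert(fc != NULL);

    /* The feature may already have been disabled by an earlier reply. */
    if (!fc->dlm_enabled)
        return 0;

    /* Server does not implement the request: stop using the feature. */
    if (err == -ENOSYS) {
        fc->dlm_enabled = false;
        return 0;
    }

    /* Any other error, or a zero-length lock: cache nothing, but keep
     * the connection alive instead of aborting it. */
    if (err != 0 || locked_len == 0)
        return 0;

    return locked_len;
}
```

The point of the sketch is that none of the branches sets `aborted`: a failed lock request only means nothing is recorded as locked.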
@jianhuangli jianhuangli requested review from bsbernd and openunix July 29, 2025 03:11
@bsbernd bsbernd merged commit 1b5f902 into DDNStorage:redfs-rhel9_5-503.40.1 Aug 6, 2025
0 of 4 checks passed