Skip to content

Conversation

tstabrawa
Copy link
Contributor

@tstabrawa tstabrawa commented Nov 5, 2024

Motivation and Context

Starting in kernel version 6.3.0, Linux page migration is done in a batched manner, where several pages are first unmapped, and then the pages moved. See @RodoMa92's bisection test results in #15140 for details on the kernel change. As a result of this change, and due to ZFS's implicit use of fallback_migrate_folio (by not specifying a .migrate_folio function in its address_space_operations), when dirty pages are migrated, they are written back during the "move" step. Then, when the "move" step is retried, fallback_migrate_folio calls into migrate_folio, which in turn calls into migrate_folio_extra, and migrate_folio_extra BUG's out due to the page still being under writeback.

Note: The kernel page migration code will actually wait for any previously-started writeback during the "unmap" step, which could explain why unbatched retries didn't hit this problem prior to kernel 6.3.0. For example, in 6.2.16, unmap_and_move will wait for writeback to complete each time it retries migration on a page (such as would be the case if fallback_migrate_folio called writeout for the page). In contrast, in 6.3.1, since only the "move" step is repeated on move-related retries, the kernel doesn't wait for writeback operations started by fallback_migrate_folio.

A previous attempt at fixing this problem was merged as part of PR #16568, but this turned out not to be a complete fix. @RodoMa92 re-raised the problem and explained that, while the problem occurred less frequently, it still occurred approximately once per 20 attempts.

Description

Rather than ZFS using the default fallback_migrate_folio function as a result of not specifying a .migrate_folio function, this PR causes it to use the simple folio migration function (migrate_folio) instead. Notably, migrate_folio (since it's intended for filesystems that don't use buffers / private data) doesn't start writeback like fallback_migrate_folio does, and therefore migrate_folio_extra wouldn't BUG() on pages still under writeback, even without us setting PagePrivate on every page.

Additionally, this PR reverts the changes merged as part of PR #16568, as these changes are no longer necessary when using migrate_folio instead of fallback_migrate_folio.

A side effect of this change could be that ZFS page migration gets noticeably faster and/or more effective (as it would no longer be skipping and/or starting writeback on all dirty pages being migrated).

How Has This Been Tested?

Tested by user @RodoMa92 - results in the following comments on #16568:

Also, regression-tested by running ZFS Test Suite (on Ubunutu 23.10, running kernel version 6.5.0-44-generic. No crashes were observed, though several tests were skipped/failed/killed. I don't expect these errors were caused by my changes but would defer to your expertise if you have any concerns:

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

This reverts commit b052035.

Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
@AllKind AllKind mentioned this pull request Nov 5, 2024
13 tasks
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
@tstabrawa
Copy link
Contributor Author

Looks like I had a typo in the code for older kernel APIs (.migrate_page should be .migratepage). Amending my commit to fix.

diff --git a/module/os/linux/zfs/zpl_file.c b/module/os/linux/zfs/zpl_file.c
index f7ae381..f6e0143 100644
--- a/module/os/linux/zfs/zpl_file.c
+++ b/module/os/linux/zfs/zpl_file.c
@@ -1094,7 +1094,7 @@ const struct address_space_operations zpl_address_space_operations = {
 #ifdef HAVE_VFS_MIGRATE_FOLIO
 	.migrate_folio	= migrate_folio,
 #else
-	.migrate_page	= migrate_page,
+	.migratepage	= migrate_page,
 #endif
 };

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Nov 5, 2024
@behlendorf behlendorf requested a review from bwatkinson November 6, 2024 01:22
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for running this down so quickly. This is a much cleaner fix and clearly the intended way to handle this.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Nov 6, 2024
@behlendorf behlendorf closed this in f38e2d2 Nov 6, 2024
behlendorf pushed a commit that referenced this pull request Nov 6, 2024
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes #16568
Closes #16723
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Nov 6, 2024
This reverts commit b052035.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Nov 6, 2024
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
@RodoMa92
Copy link

RodoMa92 commented Nov 9, 2024

Thanks for this, @tstabrawa. Still held up fine as of today. I feel much more confident now to say it's finally closed, and allow me again to use the amazing features of ZFS.

Again, thanks a lot :)

Marco

ixhamza pushed a commit to truenas/zfs that referenced this pull request Nov 11, 2024
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Jan 26, 2025
This reverts commit b052035.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Jan 26, 2025
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
robn pushed a commit to robn/zfs that referenced this pull request May 12, 2025
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
(cherry picked from commit 7b6e967)
robn pushed a commit to robn/zfs that referenced this pull request May 12, 2025
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes openzfs#16568
Closes openzfs#16723
(cherry picked from commit 7b6e967)
@robn robn mentioned this pull request May 12, 2025
13 tasks
tonyhutter pushed a commit that referenced this pull request May 27, 2025
Avoids using fallback_migrate_folio, which starts unnecessary writeback
(leading to BUG in migrate_folio_extra).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com>
Closes #16568
Closes #16723
(cherry picked from commit 7b6e967)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants