From 1e09e214f25631cd32a6b048f76b9c2abfc32502 Mon Sep 17 00:00:00 2001 From: Yuanchu Xie Date: Tue, 25 Feb 2025 22:59:22 +0900 Subject: [PATCH] mm: multi-gen LRU: ignore non-leaf pmd_young for force_scan=true BugLink: https://bugs.launchpad.net/bugs/2099996 [ Upstream commit bceeeaed4817ba7ad9013b4116c97220a60fcf7c ] When non-leaf pmd accessed bits are available, MGLRU page table walks can clear the non-leaf pmd accessed bit and ignore the accessed bit on the pte if it's on a different node, skipping a generation update as well. If another scan occurs on the same node as said skipped pte. The non-leaf pmd accessed bit might remain cleared and the pte accessed bits won't be checked. While this is sufficient for reclaim-driven aging, where the goal is to select a reasonably cold page, the access can be missed when aging proactively for workingset estimation of a node/memcg. In more detail, get_pfn_folio returns NULL if the folio's nid != node under scanning, so the page table walk skips processing of said pte. Now the pmd_young flag on this pmd is cleared, and if none of the pte's are accessed before another scan occurs on the folio's node, the pmd_young check fails and the pte accessed bit is skipped. Since force_scan disables various other optimizations, we check force_scan to ignore the non-leaf pmd accessed bit. Link: https://lkml.kernel.org/r/20240813163759.742675-1-yuanchu@google.com Signed-off-by: Yuanchu Xie Acked-by: Yu Zhao Cc: "Huang, Ying" Cc: Lance Yang Signed-off-by: Andrew Morton Stable-dep-of: ddd6d8e975b1 ("mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats") Signed-off-by: Sasha Levin Signed-off-by: Koichiro Den Signed-off-by: Stefan Bader --- mm/vmscan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 37a89d54cb2b..dc8929303742 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3429,7 +3429,7 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long addr, struct vm_area goto next; if (!pmd_trans_huge(pmd[i])) { - if (should_clear_pmd_young()) + if (!walk->force_scan && should_clear_pmd_young()) pmdp_test_and_clear_young(vma, addr, pmd + i); goto next; } @@ -3516,7 +3516,7 @@ restart: walk->mm_stats[MM_NONLEAF_TOTAL]++; - if (should_clear_pmd_young()) { + if (!walk->force_scan && should_clear_pmd_young()) { if (!pmd_young(val)) continue;