CVE Details
Description
In the Linux kernel, the following vulnerability has been resolved:

mm/swap: fix race when skipping swapcache

When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads swap in the same entry at the same time, they get different pages (A, B). Before one thread (T0) finishes the swapin and installs page (A) into the PTE, another thread (T1) could finish the swapin of page (B), swap_free the entry, then swap out the possibly modified page, reusing the same entry. This defeats the pte_same check in (T0) because the PTE value is unchanged, causing an ABA problem. Thread (T0) will install a stale page (A) into the PTE and cause data corruption.

One possible callstack is like this:

CPU0                                   CPU1
----                                   ----
do_swap_page()                         do_swap_page() with same entry
swap_read_folio() <- read to page A    swap_read_folio() <- read to page B
...                                    set_pte_at()
                                       swap_free() <- entry is free
pte_same() <- Check pass, PTE seems
              unchanged, but page A
              is stale!
swap_free() <- page B content lost!
set_pte_at() <- stale page A installed!

In addition, for ZRAM, swap_free() allows the swap device to discard the entry content, so even if page (B) is not modified, if swap_read_folio() on CPU0 happens later than swap_free() on CPU1, it may also cause data loss.

To fix this, reuse swapcache_prepare, which pins the swap entry using the cache flag and allows only one thread to swap it in. It also prevents any parallel code from putting the entry in the cache. The pin is released after the page table is unlocked.

Racers just loop and wait, since this is a rare and very short event. A schedule_timeout_uninterruptible(1) call is added to avoid repeated page faults wasting too much CPU, causing livelock, or adding too much noise to perf statistics.
A similar livelock issue was described in commit 029c4628b2eb ('mm: swap: get rid of livelock in swapin readahead').

Reproducer:

This race can be triggered easily using a well-constructed reproducer and a patched brd (with a delay in the read path) [1].

With the latest 6.8 mainline, data loss caused by the race can be observed easily:

$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
Polulating 32MB of memory region...
Keep swapping out...
Starting round 0...
Spawning 65536 workers...
32746 workers spawned, wait for done...
Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
Round 0 Failed, 15 data loss!

This reproducer spawns multiple threads sharing the same memory region, using a small swap device. Each pair of threads updates mapped pages one by one in opposite directions to try to create a race, while one dedicated thread keeps swapping the data out using madvise.

The reproducer triggers the race about once every 5 minutes, so the race is entirely possible in production.

After this patch, I ran the reproducer for several hundred rounds and observed no data loss.

Performance overhead is minimal (about 2% in this run). Microbenchmark, swapin of 10G from 32G zram:

Before: 10934698 us
After: 11157121 us
Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag)

[kasong@tencent.com: v4]
Link: https://lkml.kernel.org/r/20240219082040.7495-1-ryncsn@gmail.com
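The ABA interleaving and the effect of pinning the entry can be replayed deterministically in a small model. The following is a minimal sketch only, not kernel code: the `Model` class, its methods, and the entry number 7 are hypothetical stand-ins for the PTE, the swap slot, swapcache_prepare, pte_same, swap_free and set_pte_at, and the "threads" are simply steps executed in the problematic order.

```python
from dataclasses import dataclass, field

ENTRY = 7  # hypothetical swap entry number used throughout this toy model


@dataclass
class Model:
    """Toy state: one swap slot and one PTE (illustrative, not kernel types)."""
    swap: dict = field(default_factory=lambda: {ENTRY: 0})  # entry -> stored value
    pte: tuple = ("swap", ENTRY)      # ("swap", entry) or ("page", value)
    pinned: set = field(default_factory=set)

    def try_pin(self, entry):
        """Stand-in for swapcache_prepare(): only one thread can pin an entry."""
        if entry in self.pinned:
            return False
        self.pinned.add(entry)
        return True

    def unpin(self, entry):
        self.pinned.discard(entry)

    def read(self, entry):
        """Stand-in for swap_read_folio(): returns a private copy of the data."""
        return self.swap[entry]

    def install(self, observed_pte, value):
        """Stand-in for the pte_same() check, then swap_free() + set_pte_at()."""
        if self.pte != observed_pte:
            return False
        self.swap.pop(observed_pte[1], None)  # entry content may be discarded
        self.pte = ("page", value)
        return True

    def write_and_swap_out(self, entry, new_value):
        """Modify the mapped page, then swap it out reusing the same entry."""
        assert self.pte[0] == "page"
        self.swap[entry] = new_value
        self.pte = ("swap", entry)


def racy_interleaving():
    m = Model()
    seen = m.pte
    page_a = m.read(ENTRY)             # T0 reads page A, then stalls
    page_b = m.read(ENTRY)             # T1 reads page B
    m.install(seen, page_b)            # T1 installs B and frees the entry
    m.write_and_swap_out(ENTRY, page_b + 1)  # B modified, same entry reused
    m.install(seen, page_a)            # T0: check passes (ABA), stale A installed
    return m.pte


def pinned_interleaving():
    m = Model()
    seen = m.pte
    assert m.try_pin(ENTRY)            # T0 pins the entry first
    page_a = m.read(ENTRY)
    t1_got_pin = m.try_pin(ENTRY)      # T1 loses the race and backs off
    m.install(seen, page_a)
    m.unpin(ENTRY)                     # pin released after the PTE update
    m.write_and_swap_out(ENTRY, page_a + 1)  # later write and swap-out
    seen = m.pte
    assert m.try_pin(ENTRY)            # a later fault swaps in again
    value = m.read(ENTRY)
    m.install(seen, value)
    m.unpin(ENTRY)
    return t1_got_pin, m.pte


print(racy_interleaving())    # the update (value 1) is lost
print(pinned_interleaving())  # the latest value survives
```

In the racy run the PTE ends up mapping the old value even though a newer one was written and swapped out; with the pin, the second thread never gets to read a private copy, so the only installed page is the one that later writes go to.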
See more information about CVE-2024-26759 from MITRE CVE dictionary and NIST NVD
CVSS Scoring
NOTE: The CVSS v3.1 metrics and score provided below are preliminary and subject to review.
Base Score: 5.5
CVSS Vector: CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
Attack Vector: Local
Attack Complexity: Low
Privileges Required: Low
User Interaction: None
Scope: Unchanged
Confidentiality Impact: None
Integrity Impact: None
Availability Impact: High
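The base score can be cross-checked against the vector string. The sketch below applies the base-metric equations and weights from the CVSS v3.1 specification; the function names `roundup` and `base_score` are illustrative, and the parser assumes a well-formed vector containing only base metrics.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(x):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector such as 'CVSS:3.1/AV:L/.../A:H'."""
    # Skip the leading 'CVSS:3.1' prefix, then split each 'metric:value' pair.
    m = dict(part.split(":") for part in vector.split("/")[1:])
    scope = m["S"]
    c, i, a = (WEIGHTS[k][m[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * PR_WEIGHTS[scope][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact subscore is 6.42 × 0.56 ≈ 3.60 and the exploitability subscore is about 1.83; their sum, about 5.43, rounds up to the listed 5.5.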
Errata information