ack-tegra

Author	SHA1	Message	Date
Kalesh Singh	15b48eb602	Revert "ANDROID: 16K: Use vma_area slab cache for pad VMA" This reverts aosp/I24c5f5d0eb3b06acf506f18f5eb57cd497b13d6d. Bug: 440210631 Bug: 432564748 Change-Id: I936ae92313fa32fed80efe1bb35c9b4da0afd8d2 Signed-off-by: Kalesh Singh <kaleshsingh@google.com>	2025-08-21 15:43:26 -07:00
Greg Kroah-Hartman	1741b1e583	Merge android16-6.12 into android16-6.12-lts This merges the android16-6.12 branch into the -lts branch, catching it up with the latest changes in there. It contains the following commits: * `21ed84930c` UPSTREAM: Revert "usb: xhci: Implement xhci_handshake_check_state() helper" * `5b3ae3bcbe` BACKPORT: usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed * `5c72e9faba` ANDROID: rust_binder: adjust errors from death notifications * `9e02edea7f` ANDROID: rust_binder: use u64 for death cookie * `4317f0aeff` ANDROID: f2fs: fixup ABI break due to reserved_pin_section * `25bdb4a624` Revert "ANDROID: ABI: update symbol list for honor" * `a76eb2b67b` ANDROID: GKI: Update oplus symbol list * `6222007a04` ANDROID: mm/readahead: add for bypass high order allocation * `659d7bb454` ANDROID: ABI: Update symbol list for exynos * `26937a37f5` ANDROID: MODVERSIONS: hide type definition in drivers/usb/core/driver.c * `8760b6e4f5` ANDROID: usb: Add vendor hook for usb suspend and resume * `da662aecc8` FROMLIST: KVM: Avoid synchronize_srcu() in kvm_io_bus_register_dev() * `4be05c6524` FROMLIST: KVM: arm64: vgic: Explicitly implement vgic_dist::ready ordering * `d6045efc66` FROMLIST: KVM: arm64: vgic-init: Remove vgic_ready() macro * `f06dd0cd35` ANDROID: rust_binder: release threads before refs * `5bbd30a60b` ANDROID: ABI: Update pixel symbol list * `bafbebf2ab` ANDROID: GKI: Update symbol list for xiaomi * `b7b130b7cc` ANDROID: export folio_deactivate() for GKI purpose. * `41f730f9c4` ANDROID: GKI: update exynos symbol list * `766ecae19f` UPSTREAM: xhci: dbctty: disable ECHO flag by default * `8ea40f5243` ANDROID: GKI: Update xiaomi symbol list. 
* `5594b4731d` ANDROID: vendor_hooks: export tracepoint symbols * `0d4cc1daff` ANDROID: KVM: arm64: Don't update IOMMU under memory pressure * `672185e575` ANDROID: iommu/iommu: Handle multi-page deferred sg mappings * `740d42d181` ANDROID: vendor_hooks: Add vendor_hook in futex to fix the OEM scheduling priority bug * `6eb6f346ac` ANDROID: ABI: Update symbol list for mtk * `c302079179` ANDROID: vendor_hooks: Add vendor hook for GenieZone demand paging * `5c1cddc983` ANDROID: vendor_hooks: Add vendor hook for GenieZone para-virtualization * `d893caf112` ANDROID: ashmem_rust: Add support for retrieving an ashmem area's vmfile * `0be74214c0` ANDROID: ashmem_rust: Add support for querying the size of an ashmem region * `eb50f663c4` ANDROID: ashmem_rust: Add support for providing an ashmem region's name * `6bdbae6ea9` ANDROID: ashmem_rust: Add is_ashmem_file() * `0d890f867e` ANDROID: ABI: update symbol list for honor * `12727f8a4b` FROMGIT: f2fs: introduce reserved_pin_section sysfs entry * `286cd9d628` ANDROID: GKI: Update RTK STB KMI symbol list * `7b4f7682b5` ANDROID: GKI: Update symbol list for Amlogic * `862ce4b2c4` ANDROID: KVM: arm64: iommu: Fix power tracking * `61184996a8` ANDROID: drivers/iommu: Fix return value in iommu_map_sg * `acad0cd51d` ANDROID: ABI: update symbol list for galaxy * `393dbad32c` ANDROID: vendor_hook: add condition to call for freezing fail * `b62fe47ba2` ANDROID: fix ashmem_rust return EINVAL bug in ashmem_rust.rs * `a7e1300b95` ANDROID: Revert "cpufreq: Avoid using inconsistent policy->min and policy->max" * `15d2fe0544` ANDROID: qcom: Update the ABI symbol list * `f6ca783ba2` UPSTREAM: scsi: ufs: qcom: Check gear against max gear in vop freq_to_gear() * `237708e9d3` ANDROID: GKI: Update symbols list file for honor White list the vm_normal_folio_pmd * `f18e354aa9` ANDROID: mm: export vm_normal_folio_pmd to allow vendors to implement simplified smaps * `c181c478b0` ANDROID: vendor_hooks: add hook to record slab free * `d2e452e197` 
ANDROID: Build fixups with PROXY_EXEC v18 + !CONFIG_SMP * `4f9e4406e4` ANDROID: Update proxy-exec logic from v14 to v18 * `3fa8dabe1a` ANDROID: GKI: update asr symbols list * `94310b3f77` ANDROID: Add the dma header to aarch64 allowlist * `880d6538c5` UPSTREAM: usb: gadget: u_serial: Fix race condition in TTY wakeup * `b115bf2302` ANDROID: ABI: Update symbol list for mtk * `e87018c5f9` FROMGIT: sched/deadline: Fix dl_server runtime calculation formula * `e2bf362ee2` FROMGIT: sched/core: Fix migrate_swap() vs. hotplug * `06ca12d7d2` ANDROID: GKI: update the ABI symbol list * `55972ed83a` ANDROID: Fixup init_user_ns CRC change * `4e873ad607` ANDROID: user: Add vendor hook to user for GKI purpose * `a097cd9c30` ANDROID: export find_user() for GKI purpose. * `85b8233f7e` ANDROID: rust_binder: use euid from the task * `969c904869` ANDROID: ashmem: rename VmAreaNew->VmaNew * `2ab3e5f283` ANDROID: rust_binder: rename VmAreaNew->VmaNew * `2ef75ab83a` ANDROID: rust_binder: use tgid_nr_ns for getting pid * `6a2be11026` UPSTREAM: task: rust: rework how current is accessed * `602e2300de` UPSTREAM: rust: add PidNamespace * `12dfc1d9cb` UPSTREAM: rust: miscdevice: add mmap support * `8e67cb756f` UPSTREAM: mm: rust: add VmaNew for f_ops->mmap() * `bd140ddf75` UPSTREAM: mm: rust: add mmput_async support * `0c50773076` UPSTREAM: mm: rust: add lock_vma_under_rcu * `0b5465bb31` UPSTREAM: mm: rust: add vm_insert_page * `d7f52612c5` UPSTREAM: mm: rust: add vm_area_struct methods that require read access * `f03d4f7490` UPSTREAM: mm: rust: add abstraction for struct mm_struct * `2ef6dbc73e` BACKPORT: rust: miscdevice: change how f_ops vtable is constructed * `1acd3b312f` Revert "FROMLIST: mm: rust: add abstraction for struct mm_struct" * `a012c15566` Revert "FROMLIST: mm: rust: add vm_area_struct methods that require read access" * `3be00a9bf8` Revert "FROMLIST: mm: rust: add vm_insert_page" * `3aed88205e` Revert "FROMLIST: mm: rust: add lock_vma_under_rcu" * `a121b6e72f` Revert 
"FROMLIST: mm: rust: add mmput_async support" * `9248564a81` Revert "FROMLIST: mm: rust: add VmAreaNew for f_ops->mmap()" * `6de3ace5b5` Revert "FROMLIST: rust: miscdevice: add mmap support" * `b7f54dd23b` Revert "BACKPORT: FROMLIST: task: rust: rework how current is accessed" * `5913c80b22` ANDROID: iommu/arm-smmu-v3-kvm: Fix idmap free_leaf * `c40c54e669` UPSTREAM: erofs: impersonate the opener's credentials when accessing backing file * `4d0200d0a9` BACKPORT: erofs: add 'fsoffset' mount option to specify filesystem offset * `399deda7b5` ANDROID: scsi: ufs: add UFSHCD_ANDROID_QUIRK_NO_IS_READ_ON_H8 * `f6b1ab83f6` ANDROID: rust_binder: remove binder_logs/procs/pid immediately * `dd35623c83` ANDROID: ABI: update symbol list for mtktv * `58beebb30f` FROMLIST: fuse: give wakeup hints to the scheduler * `0f917e4066` ANDROID: virt: gunyah: Replace arm_smccc_1_1_smc with arm_smccc_1_1_invoke * `33429dd323` UPSTREAM: posix-cpu-timers: fix race between handle_posix_cpu_timers() and posix_cpu_timer_del() * `6483832947` ANDROID: GKI: Update symbol list file for xiaomi * `668635cd34` UPSTREAM: usb: gadget: uvc: dont call usb_composite_setup_continue when not streaming Change-Id: I64074144d1a6da9fdd3b4dd5f8314ccea4f9d9e8 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2025-07-13 12:17:44 +00:00
John Stultz	4f9e4406e4	ANDROID: Update proxy-exec logic from v14 to v18 This updates the proxy-exec logic in android16-6.12 which was added at v14, to be synced with the v18 series of the patchset. v14 series: https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v14-6.12 v18 series: https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v18-6.12 Changes since v14: * Improved naming consistency and using the guard macro where appropriate * Improved comments * Build fixes for !CONFIG_SMP * Fixes for when sched_proxy_exec() is disabled * Renamed update_curr_se to update_se_times, as suggested by Steven Rostedt. * Use put_prev_set_next_task as suggested by K Prateek Nayak * Try to rework find_proxy_task() locking to use guard and proxy_deactivate_task() in the way Peter suggested. * Simplified changes to enqueue_task_rt to match deadline's logic, as pointed out by Peter * Get rid of preserve_need_resched flag and rework per Peter's suggestion * Rework find_proxy_task() to use guard to cleanup the exit gotos as Peter suggested. * Reworked the forced return-migration from find_proxy_task to use Peter’s dequeue+wakeup approach, which helps resolve the cpuhotplug issues I had also seen, caused by the manual return migration sending tasks to offline cpus. * A number of improvements to the commit messages and comments suggested by Juri Lelli and Peter Zijlstra * Added missing logic to put_prev_task_dl as pointed out by K Prateek Nayak * Add lockdep_assert_held_once and drop the READ_ONCE in __get_task_blocked_on(), as suggested by Juri Lelli * Moved update_curr_task logic into update_curr_se to simplify things * Renamed update_se_times to update_se, as suggested by Peter * Reworked logic to fix an issue Peter pointed out with thread group accounting being done on the donor, rather than the running execution context. 
* Fixed typos caught by Metin Kaya * Suleiman Souhlal noticed an inefficiency in that we evaluate if the lock owner’s task_cpu() is the current cpu, before we look to see if the lock owner is on_rq at all. With v17 this would result in us proxy-migrating a donor to a remote cpu, only to then realize the task wasn’t even on the runqueue, and doing the sleeping owner enqueueing. Suleiman suggested instead that we evaluate on_rq first, so we can immediately do sleeping owner enqueueing. Then only if the owner is on a runqueue do we proxy-migrate the donor (which requires the more costly lock juggling). While not a huge logical change, it did uncover other problems, which needed to be resolved. * One issue found was a race when do_activate_blocked_waiter() from the sleeping owner wakeup was delayed and the task had already been woken up elsewhere. It’s possible if that task was running and called into schedule() to be blocked, it would be dequeued from the runqueue, but before we switched to the new task, do_activate_blocked_waiter() might try to activate it on a different cpu. Clearly the do_activate_blocked_waiter() needed to check the task on_cpu value as well. * I found that we still can hit wakeups that end up skipping the BO_WAKING -> BO_RUNNABLE transition (causing find_proxy_task() to end up spinning waiting for that transition), so I re-added the logic to handle doing return migrations from find_proxy_task() if we hit that case. * Hupu suggested a tweak in ttwu_runnable() to evaluate proxy_needs_return() slightly earlier. * Kuyo Chang reported and isolated a fix for a problem with __task_is_pushable() in the !sched_proxy_exec case, which was folded into the “sched: Fix rt/dl load balancing via chain level balance” patch * Reworked some of the logic around releasing the rq->donor reference on migrations, using rq->idle directly. 
* Suleiman also pointed out that some added task_struct elements were not being initialized in the init_task code path, so that was good to fix. Bug: 427820735 Change-Id: I20ce778e474124a917dbf51378dc1301535ac858 Signed-off-by: John Stultz <jstultz@google.com>	2025-07-07 12:27:42 -07:00
Greg Kroah-Hartman	69f799168c	Merge 6.12.31 into android16-6.12-lts GKI (arm64) relevant 137 out of 624 changes, affecting 192 files +1647/-1035 `a4f865ecdb` nvmem: core: fix bit offsets of more than one byte [1 file, +17/-7] `4327479e55` nvmem: core: verify cell's raw_len [1 file, +12/-0] `410f8b72e0` nvmem: core: update raw_len if the bit reading is required [1 file, +3/-1] `7aea1517fb` scsi: ufs: Introduce quirk to extend PA_HIBERN8TIME for UFS devices [2 files, +35/-0] `b730cb1096` virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN [1 file, +1/-1] `2998813177` dma/mapping.c: dev_dbg support for dma_addressing_limited [1 file, +10/-1] `3eec42a17a` dma-mapping: avoid potential unused data compilation warning [1 file, +8/-4] `97edaa0ec6` cgroup: Fix compilation issue due to cgroup_mutex not being exported [1 file, +1/-1] `f93675793b` vhost_task: fix vhost_task_create() documentation [1 file, +1/-1] `e22034cbee` dma-mapping: Fix warning reported for missing prototype [1 file, +8/-8] `4f5553a08f` fs/buffer: split locking for pagecache lookups [1 file, +25/-16] `e138fc2316` fs/buffer: introduce sleeping flavors for pagecache lookups [2 files, +17/-0] `a49a4a87ce` fs/buffer: use sleeping version of __find_get_block() [1 file, +9/-2] `f1c5aa614b` fs/jbd2: use sleeping version of __find_get_block() [1 file, +9/-6] `9ece099e95` fs/ext4: use sleeping version of sb_find_get_block() [1 file, +2/-1] `64f505b08e` block: fix race between set_blocksize and read paths [4 files, +43/-1] `218c838d03` io_uring: don't duplicate flushing in io_req_post_cqe [1 file, +8/-3] `8014d3e56e` bpf: fix possible endless loop in BPF map iteration [1 file, +1/-1] `d40ca27602` fuse: Return EPERM rather than ENOSYS from link() [1 file, +2/-0] `bab0bd1389` exfat: call bh_read in get_block only when necessary [1 file, +77/-82] `01677e7ee1` io_uring/msg: initialise msg request opcode [1 file, +1/-0] `e506751b7d` arm64: Add support for HIP09 Spectre-BHB mitigation [2 files, +3/-0] 
`4f427ca9ed` tracing: Mark binary printing functions with __printf() attribute [4 files, +18/-21] `15787ab82a` mailbox: use error ret code of of_parse_phandle_with_args() [1 file, +4/-3] `f48ee562c0` Bluetooth: Disable SCO support if READ_VOICE_SETTING is unsupported/broken [1 file, +3/-0] `44b79041c4` dql: Fix dql->limit value when reset. [1 file, +1/-1] `ac30595154` lockdep: Fix wait context check on softirq for PREEMPT_RT [1 file, +18/-0] `e63b634806` PCI: dwc: ep: Ensure proper iteration over outbound map windows [1 file, +1/-1] `37ac2434aa` ext4: on a remount, only log the ro or r/w state when it has changed [1 file, +4/-3] `1d1e1efad1` libnvdimm/labels: Fix divide error in nd_label_data_init() [1 file, +2/-1] `123bcd8f42` pidfs: improve multi-threaded exec and premature thread-group leader exit polling [3 files, +9/-9] `8f82cf305e` cgroup/rstat: avoid disabling irqs for O(num_cpu) [1 file, +5/-7] `a5a507fa5f` blk-cgroup: improve policy registration error handling [1 file, +12/-10] `94c3cbc69a` ext4: reorder capability check last [1 file, +2/-2] `e658f2d94a` bpf: Return prog btf_id without capable check [1 file, +2/-2] `e2520cc19b` PCI: dwc: Use resource start as ioremap() input in dw_pcie_pme_turn_off() [1 file, +1/-1] `50452704ec` jbd2: do not try to recover wiped journal [1 file, +6/-5] `dab35f4921` tcp: reorganize tcp_in_ack_event() and tcp_count_delivered() [1 file, +32/-24] `555c0b713c` bpf: Allow pre-ordering for bpf cgroup progs [5 files, +30/-9] `572ed3fb99` kconfig: do not clear SYMBOL_VALID when reading include/config/auto.conf [1 file, +12/-7] `174dedce64` dm: restrict dm device size to 2^63-512 bytes [1 file, +4/-0] `2f5f326214` ext4: reject the 'data_err=abort' option in nojournal mode [1 file, +12/-0] `d0dc233fe2` posix-timers: Add cond_resched() to posix_timer_add() search loop [1 file, +1/-0] `ae22452d15` posix-timers: Ensure that timer initialization is fully visible [1 file, +14/-7] `3fb9ee05ec` timer_list: Don't use %pK through printk() [1 
file, +2/-2] `21153e0974` netfilter: conntrack: Bound nf_conntrack sysctl writes [1 file, +9/-3] `236a87e9d2` PNP: Expand length of fixup id string [1 file, +1/-1] `6215143ad3` arm64/mm: Check pmd_table() in pmd_trans_huge() [1 file, +12/-12] `8ad58a7eba` arm64/mm: Check PUD_TYPE_TABLE in pud_bad() [1 file, +2/-1] `28306c58da` mmc: sdhci: Disable SD card clock before changing parameters [1 file, +7/-2] `3a75fe58a1` usb: xhci: Don't change the status of stalled TDs on failed Stop EP [1 file, +11/-1] `101a3b9920` printk: Check CON_SUSPEND when unblanking a console [1 file, +12/-2] `faba68a86a` wifi: cfg80211: allow IR in 20 MHz configurations [5 files, +46/-25] `c1502fc84d` ipv6: save dontfrag in cork [2 files, +6/-4] `75ae2a3553` badblocks: Fix a nonsense WARN_ON() which checks whether a u64 variable < 0 [1 file, +3/-2] `7caad075ac` crypto: lzo - Fix compression buffer overrun [6 files, +106/-28] `73d01bcbf2` tcp: bring back NUMA dispersion in inet_ehash_locks_alloc() [1 file, +26/-11] `1c17190880` usb: xhci: set page size to the xHCI-supported size [2 files, +22/-20] `93f581d763` drm/gem: Test for imported GEM buffers with helper [2 files, +16/-2] `c4525b513d` net: phylink: use pl->link_interface in phylink_expects_phy() [1 file, +1/-1] `f29c876d72` perf/core: Clean up perf_try_init_event() [1 file, +38/-27] `af73c8fd73` ublk: enforce ublks_max only for unprivileged devices [1 file, +27/-15] `592ba27580` perf/hw_breakpoint: Return EOPNOTSUPP for unsupported breakpoint type [1 file, +3/-2] `3de322a98b` scsi: logging: Fix scsi_logging_level bounds [1 file, +3/-1] `f33b310eac` ipv4: fib: Move fib_valid_key_len() to rtm_to_fib_config(). 
[2 files, +16/-24] `564f03a797` block: mark bounce buffering as incompatible with integrity [2 files, +5/-2] `82209faa87` ublk: complete command synchronously on error [1 file, +6/-5] `b98aad5e5e` media: uvcvideo: Add sanity check to uvc_ioctl_xu_ctrl_map [1 file, +6/-0] `2d6231d5ce` media: uvcvideo: Handle uvc menu translation inside uvc_get_le_value [1 file, +32/-45] `e359d62886` perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters [1 file, +2/-2] `673dde8d3c` bpf: Search and add kfuncs in struct_ops prologue and epilogue [1 file, +24/-1] `083383aba0` cpuidle: menu: Avoid discarding useful information [1 file, +12/-1] `20a53c3689` loop: check in LO_FLAGS_DIRECT_IO in loop_default_blocksize [1 file, +1/-1] `b55a97d1bd` dm: fix unconditional IO throttle caused by REQ_PREFLUSH [1 file, +6/-2] `9f27b38771` crypto: ahash - Set default reqsize from ahash_alg [2 files, +7/-0] `897c98fb32` crypto: skcipher - Zap type in crypto_alloc_sync_skcipher [1 file, +1/-0] `4d9fa2ebc0` net: ipv6: Init tunnel link-netns before registering dev [4 files, +9/-7] `53f42776e4` genirq/msi: Store the IOMMU IOVA directly in msi_desc instead of iommu_cookie [2 files, +25/-36] `2b129e89b8` bpf: don't do clean_live_states when state->loop_entry->branches > 0 [1 file, +4/-0] `46ba5757a7` bpf: copy_verifier_state() should copy 'loop_entry' field [1 file, +3/-0] `82b54455b6` PCI: Fix old_size lower bound in calculate_iosize() too [1 file, +2/-4] `dc5f5c9d2b` hrtimers: Replace hrtimer_clock_to_base_table with switch-case [1 file, +12/-17] `000dd6e344` ASoC: ops: Enforce platform maximum on initial value [1 file, +28/-1] `c4260bf83b` ASoC: soc-dai: check return value at snd_soc_dai_set_tdm_slot() [1 file, +5/-3] `5b1b4cb46d` pinctrl: devicetree: do not goto err when probing hogs in pinctrl_dt_to_map [1 file, +8/-2] `69689d1138` media: v4l: Memset argument to 0 before calling get_mbus_config pad op [2 files, +5/-1] `e6e31b0182` sched: Reduce the default slice to avoid tasks 
getting an extra tick [1 file, +3/-3] `ef31dc41cf` phy: core: don't require set_mode() callback for phy_get_mode() to work [1 file, +4/-3] `06daedb443` xfrm: prevent high SEQ input in non-ESN mode [1 file, +12/-0] `9f2911868a` ip: fib_rules: Fetch net from fib_rule in fib[46]_rule_configure(). [2 files, +4/-4] `7fea5a9140` r8152: add vendor/device ID pair for Dell Alienware AW1022z [2 files, +2/-0] `16ddd67bb5` pstore: Change kmsg_bytes storage size to u32 [3 files, +9/-8] `73733c2fdb` ext4: don't write back data before punch hole in nojournal mode [1 file, +5/-13] `1d15319323` f2fs: introduce f2fs_base_attr for global sysfs entries [1 file, +52/-22] `ded26f9e4c` ipv4: ip_gre: Fix set but not used warning in ipgre_err() if IPv4-only [1 file, +10/-6] `76e56dbe50` net: flush_backlog() small changes [1 file, +8/-4] `58cdd1ee65` bridge: mdb: Allow replace of a host-joined group [2 files, +2/-2] `fcabb69674` rcu: handle unstable rdp in rcu_read_unlock_strict() [2 files, +11/-2] `d402437cde` rcu: fix header guard for rcu_all_qs() [1 file, +1/-1] `887e39ac47` perf: Avoid the read if the count is already updated [3 files, +24/-18] `c80b2d159c` bpf: Use kallsyms to find the function name of a struct_ops's stub function [1 file, +44/-54] `46f1c2b508` firmware: arm_scmi: Relax duplicate name constraint across protocol ids [1 file, +6/-13] `1351052877` drm/atomic: clarify the rules around drm_atomic_state->allow_modeset [1 file, +21/-2] `9fddd1f154` drm: Add valid clones check [1 file, +28/-0] `ff214b079d` nvme-pci: add quirks for device 126f:1001 [1 file, +3/-0] `6d196cae4b` nvme-pci: add quirks for WDC Blue SN550 15b7:5009 [1 file, +3/-0] `6a09b6bad0` ALSA: usb-audio: Fix duplicated name in MIDI substream names [1 file, +12/-4] `ad3e83a6c8` io_uring/fdinfo: annotate racy sq/cq head/tail reads [1 file, +2/-2] `7f7c8c03fe` btrfs: correct the order of prelim_ref arguments in btrfs__prelim_ref [1 file, +1/-1] `8cafd7266f` __legitimize_mnt(): check for MNT_SYNC_UMOUNT should be 
under mount_lock [1 file, +1/-5] `28756f22de` espintcp: fix skb leaks [3 files, +9/-3] `9cbca30102` espintcp: remove encap socket caching to avoid reference leak [4 files, +8/-94] `b1a687eb15` xfrm: Fix UDP GRO handling for some corner cases [2 files, +20/-16] `447c8f0c06` kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork() [1 file, +5/-4] `252f78a931` xfrm: Sanitize marks before insert [2 files, +6/-0] `7207effe47` driver core: Split devres APIs to device/devres.h [2 files, +125/-118] `1e8b7e96f7` Bluetooth: L2CAP: Fix not checking l2cap_chan security level [1 file, +8/-7] `cd7f022296` loop: don't require ->write_iter for writable files in loop_configure [1 file, +0/-3] `873ebaf3c1` io_uring: fix overflow resched cqe reordering [1 file, +1/-0] `689a205cd9` net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done [1 file, +5/-0] `adb05149a9` can: slcan: allow reception of short error messages [1 file, +20/-6] `cc55dd28c2` can: bcm: add locking for bcm_op runtime updates [1 file, +45/-21] `63567ecd99` can: bcm: add missing rcu read protection for procfs content [1 file, +9/-4] `bf85e49aaf` ALSA: pcm: Fix race of buffer access at PCM OSS layer [3 files, +14/-2] `e78908caf1` pmdomain: core: Fix error checking in genpd_dev_pm_attach_by_id() [1 file, +1/-1] `dc9bdfb9b0` drm/edid: fixed the bug that hdr metadata was not reset [1 file, +1/-0] `cb9a1019a6` Input: xpad - add more controllers [1 file, +3/-0] `9b8263cae6` highmem: add folio_test_partial_kmap() [2 files, +12/-5] `314bf771cb` memcg: always call cond_resched() after fn() [1 file, +2/-4] `9da33ce114` mm/page_alloc.c: avoid infinite retries caused by cpuset race [1 file, +8/-0] `9f9517f156` mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled [1 file, +2/-0] `94efb0d656` mm: vmalloc: actually use the in-place vrealloc region [1 file, +1/-0] `483ac74183` mm: vmalloc: only zero-init on vrealloc shrink [1 file, +7/-5] `1d45e0170c` spi: use container_of_cont() for 
to_spi_device() [1 file, +1/-4] `d28b0305f7` err.h: move IOMEM_ERR_PTR() to err.h [2 files, +3/-2] `80eb73778d` bpf: abort verification if env->cur_state->loop_entry != NULL [1 file, +4/-2] `85fb1edd05` drm/gem: Internally test import_attach for imported objects [1 file, +1/-2] Changes in 6.12.31 drm/amd/display: Configure DTBCLK_P with OPTC only for dcn401 drm/amd/display: Do not enable replay when vtotal update is pending. drm/amd/display: Correct timing_adjust_pending flag setting. drm/amd/display: Defer BW-optimization-blocked DRR adjustments i2c: designware: Use temporary variable for struct device i2c: designware: Fix an error handling path in i2c_dw_pci_probe() phy: renesas: rcar-gen3-usb2: Move IRQ request in probe phy: renesas: rcar-gen3-usb2: Lock around hardware registers and driver data phy: renesas: rcar-gen3-usb2: Assert PLL reset on PHY power off cpufreq: Add SM8650 to cpufreq-dt-platdev blocklist nvmem: rockchip-otp: Move read-offset into variant-data nvmem: rockchip-otp: add rk3576 variant data nvmem: core: fix bit offsets of more than one byte nvmem: core: verify cell's raw_len nvmem: core: update raw_len if the bit reading is required nvmem: qfprom: switch to 4-byte aligned reads scsi: target: iscsi: Fix timeout on deleted connection scsi: ufs: Introduce quirk to extend PA_HIBERN8TIME for UFS devices virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN dma/mapping.c: dev_dbg support for dma_addressing_limited intel_th: avoid using deprecated page->mapping, index fields mei: vsc: Use struct vsc_tp_packet as vsc-tp tx_buf and rx_buf type dma-mapping: avoid potential unused data compilation warning cgroup: Fix compilation issue due to cgroup_mutex not being exported vhost_task: fix vhost_task_create() documentation vhost-scsi: protect vq->log_used with vq->mutex scsi: mpi3mr: Add level check to control event logging net: enetc: refactor bulk flipping of RX buffers to separate function dma-mapping: Fix warning reported for missing 
prototype ima: process_measurement() needlessly takes inode_lock() on MAY_READ fs/buffer: split locking for pagecache lookups fs/buffer: introduce sleeping flavors for pagecache lookups fs/buffer: use sleeping version of __find_get_block() fs/ocfs2: use sleeping version of __find_get_block() fs/jbd2: use sleeping version of __find_get_block() fs/ext4: use sleeping version of sb_find_get_block() drm/amd/display: Enable urgent latency adjustment on DCN35 drm/amdgpu: Allow P2P access through XGMI selftests/bpf: Mitigate sockmap_ktls disconnect_after_delete failure block: fix race between set_blocksize and read paths io_uring: don't duplicate flushing in io_req_post_cqe bpf: fix possible endless loop in BPF map iteration samples/bpf: Fix compilation failure for samples/bpf on LoongArch Fedora kconfig: merge_config: use an empty file as initfile x86/fred: Fix system hang during S4 resume with FRED enabled s390/vfio-ap: Fix no AP queue sharing allowed message written to kernel log cifs: Add fallback for SMB2 CREATE without FILE_READ_ATTRIBUTES cifs: Fix querying and creating MF symlinks over SMB1 cifs: Fix negotiate retry functionality smb: client: Store original IO parameters and prevent zero IO sizes fuse: Return EPERM rather than ENOSYS from link() exfat: call bh_read in get_block only when necessary io_uring/msg: initialise msg request opcode NFSv4: Check for delegation validity in nfs_start_delegation_return_locked() NFS: Don't allow waiting for exiting tasks SUNRPC: Don't allow waiting for exiting tasks arm64: Add support for HIP09 Spectre-BHB mitigation iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability tracing: Mark binary printing functions with __printf() attribute ACPI: PNP: Add Intel OC Watchdog IDs to non-PNP device list tpm: Convert warn to dbg in tpm2_start_auth_session() mailbox: pcc: Use acpi_os_ioremap() instead of ioremap() mailbox: use error ret code of of_parse_phandle_with_args() riscv: Allow NOMMU kernels to access all of RAM fbdev: 
fsl-diu-fb: add missing device_remove_file() fbcon: Use correct erase colour for clearing in fbcon fbdev: core: tileblit: Implement missing margin clearing for tileblit cifs: Set default Netbios RFC1001 server name to hostname in UNC cifs: add validation check for the fields in smb_aces cifs: Fix establishing NetBIOS session for SMB2+ connection NFSv4: Treat ENETUNREACH errors as fatal for state recovery SUNRPC: rpc_clnt_set_transport() must not change the autobind setting SUNRPC: rpcbind should never reset the port to the value '0' spi-rockchip: Fix register out of bounds access ASoC: codecs: wsa884x: Correct VI sense channel mask ASoC: codecs: wsa883x: Correct VI sense channel mask mctp: Fix incorrect tx flow invalidation condition in mctp-i2c net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus thermal/drivers/mediatek/lvts: Start sensor interrupts disabled thermal/drivers/qoriq: Power down TMU on system suspend Bluetooth: btmtksdio: Prevent enabling interrupts after IRQ handler removal Bluetooth: Disable SCO support if READ_VOICE_SETTING is unsupported/broken dql: Fix dql->limit value when reset. 
lockdep: Fix wait context check on softirq for PREEMPT_RT objtool: Properly disable uaccess validation PCI: dwc: ep: Ensure proper iteration over outbound map windows r8169: disable RTL8126 ZRX-DC timeout tools/build: Don't pass test log files to linker pNFS/flexfiles: Report ENETDOWN as a connection error drm/amdgpu/discovery: check ip_discovery fw file available drm/amdkfd: set precise mem ops caps to disabled for gfx 11 and 12 PCI: vmd: Disable MSI remapping bypass under Xen xen/pci: Do not register devices with segments >= 0x10000 ext4: on a remount, only log the ro or r/w state when it has changed libnvdimm/labels: Fix divide error in nd_label_data_init() pidfs: improve multi-threaded exec and premature thread-group leader exit polling staging: vchiq_arm: Create keep-alive thread during probe mmc: host: Wait for Vdd to settle on card power off drm/amdgpu: Skip pcie_replay_count sysfs creation for VF cgroup/rstat: avoid disabling irqs for O(num_cpu) wifi: mt76: only mark tx-status-failed frames as ACKed on mt76x0/2 wifi: mt76: mt7996: fix SER reset trigger on WED reset wifi: mt76: mt7996: revise TXS size wifi: mt76: mt7925: load the appropriate CLC data based on hardware type wifi: mt76: mt7925: fix fails to enter low power mode in suspend state x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP x86/smpboot: Fix INIT delay assignment for extended Intel Families x86/microcode: Update the Intel processor flag scan check x86/mm: Check return value from memblock_phys_alloc_range() i2c: qup: Vote for interconnect bandwidth to DRAM i2c: pxa: fix call balance of i2c->clk handling routines btrfs: make btrfs_discard_workfn() block_group ref explicit btrfs: avoid linker error in btrfs_find_create_tree_block() btrfs: run btrfs_error_commit_super() early btrfs: fix non-empty delayed iputs list on unmount due to async workers btrfs: get zone unusable bytes while holding lock at 
btrfs_reclaim_bgs_work() btrfs: send: return -ENAMETOOLONG when attempting a path that is too long blk-cgroup: improve policy registration error handling drm/amdgpu: release xcp_mgr on exit drm/amd/display: Guard against setting dispclk low for dcn31x drm/amdgpu: adjust drm_firmware_drivers_only() handling i3c: master: svc: Fix missing STOP for master request s390/tlb: Use mm_has_pgste() instead of mm_alloc_pgste() dlm: make tcp still work in multi-link env clocksource/drivers/timer-riscv: Stop stimecmp when cpu hotplug um: Store full CSGSFS and SS register from mcontext um: Update min_low_pfn to match changes in uml_reserved wifi: mwifiex: Fix HT40 bandwidth issue. bnxt_en: Query FW parameters when the CAPS_CHANGE bit is set riscv: Call secondary mmu notifier when flushing the tlb ext4: reorder capability check last hypfs_create_cpu_files(): add missing check for hypfs_mkdir() failure scsi: st: Tighten the page format heuristics with MODE SELECT scsi: st: ERASE does not change tape location vfio/pci: Handle INTx IRQ_NOTCONNECTED bpf: Return prog btf_id without capable check PCI: dwc: Use resource start as ioremap() input in dw_pcie_pme_turn_off() jbd2: do not try to recover wiped journal tcp: reorganize tcp_in_ack_event() and tcp_count_delivered() rtc: rv3032: fix EERD location objtool: Fix error handling inconsistencies in check() thunderbolt: Do not add non-active NVM if NVM upgrade is disabled for retimer erofs: initialize decompression early spi: spi-mux: Fix coverity issue, unchecked return value ASoC: pcm6240: Drop bogus code handling IRQ as GPIO ASoC: mediatek: mt6359: Add stub for mt6359_accdet_enable_jack_detect bpf: Allow pre-ordering for bpf cgroup progs kbuild: fix argument parsing in scripts/config kconfig: do not clear SYMBOL_VALID when reading include/config/auto.conf crypto: octeontx2 - suppress auth failure screaming due to negative tests dm: restrict dm device size to 2^63-512 bytes net/smc: use the correct ndev to find pnetid by pnetid table 
xen: Add support for XenServer 6.1 platform device pinctrl-tegra: Restore SFSEL bit when freeing pins mfd: tps65219: Remove TPS65219_REG_TI_DEV_ID check drm/amdgpu/gfx12: don't read registers in mqd init drm/amdgpu/gfx11: don't read registers in mqd init drm/amdgpu: Update SRIOV video codec caps ASoC: sun4i-codec: support hp-det-gpios property clk: qcom: lpassaudiocc-sc7280: Add support for LPASS resets for QCM6490 ext4: reject the 'data_err=abort' option in nojournal mode ext4: do not convert the unwritten extents if data writeback fails RDMA/uverbs: Propagate errors from rdma_lookup_get_uobject() posix-timers: Add cond_resched() to posix_timer_add() search loop posix-timers: Ensure that timer initialization is fully visible net: stmmac: dwmac-rk: Validate GRF and peripheral GRF during probe net: hsr: Fix PRP duplicate detection timer_list: Don't use %pK through printk() wifi: rtw89: set force HE TB mode when connecting to 11ax AP netfilter: conntrack: Bound nf_conntrack sysctl writes PNP: Expand length of fixup id string phy: rockchip: usbdp: Only verify link rates/lanes/voltage when the corresponding set flags are set arm64/mm: Check pmd_table() in pmd_trans_huge() arm64/mm: Check PUD_TYPE_TABLE in pud_bad() mmc: dw_mmc: add exynos7870 DW MMC support mmc: sdhci: Disable SD card clock before changing parameters usb: xhci: Don't change the status of stalled TDs on failed Stop EP wifi: iwlwifi: mvm: fix setting the TK when associated hwmon: (dell-smm) Increment the number of fans iommu: Keep dev->iommu state consistent printk: Check CON_SUSPEND when unblanking a console wifi: iwlwifi: don't warn when if there is a FW error wifi: iwlwifi: w/a FW SMPS mode selection wifi: iwlwifi: fix debug actions order wifi: iwlwifi: mark Br device not integrated wifi: iwlwifi: fix the ECKV UEFI variable name wifi: mac80211: fix warning on disconnect during failed ML reconf wifi: mac80211_hwsim: Fix MLD address translation wifi: cfg80211: allow IR in 20 MHz configurations ipv6: 
save dontfrag in cork drm/amd/display: remove minimum Dispclk and apply oem panel timing. drm/amd/display: calculate the remain segments for all pipes drm/amd/display: not abort link train when bw is low drm/amd/display: Fix incorrect DPCD configs while Replay/PSR switch gfs2: Check for empty queue in run_queue auxdisplay: charlcd: Partially revert "Move hwidth and bwidth to struct hd44780_common" ASoC: qcom: sm8250: explicitly set format in sm8250_be_hw_params_fixup() badblocks: Fix a nonsense WARN_ON() which checks whether a u64 variable < 0 coresight-etb10: change etb_drvdata spinlock's type to raw_spinlock_t iommu/amd/pgtbl_v2: Improve error handling cpufreq: tegra186: Share policy per cluster watchdog: aspeed: Update bootstatus handling PCI: endpoint: pci-epf-test: Fix double free that causes kernel to oops misc: pci_endpoint_test: Give disabled BARs a distinct error code crypto: lzo - Fix compression buffer overrun crypto: mxs-dcp - Only set OTP_KEY bit for OTP key drm/amdkfd: Set per-process flags only once for gfx9/10/11/12 drm/amdkfd: Set per-process flags only once cik/vi drm/amdgpu: Fix missing drain retry fault the last entry arm64: tegra: p2597: Fix gpio for vdd-1v8-dis regulator arm64: tegra: Resize aperture for the IGX PCIe C5 slot powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7 ALSA: seq: Improve data consistency at polling tcp: bring back NUMA dispersion in inet_ehash_locks_alloc() rtc: ds1307: stop disabling alarms on probe ieee802154: ca8210: Use proper setters and getters for bitwise types drm/xe: Nuke VM's mapping upon close drm/xe: Retry BO allocation soc: samsung: include linux/array_size.h where needed ARM: tegra: Switch DSI-B clock parent to PLLD on Tegra114 media: c8sectpfe: Call of_node_put(i2c_bus) only once in c8sectpfe_probe() usb: xhci: set page size to the xHCI-supported size dm cache: prevent BUG_ON by blocking retries on failed device resumes soc: mediatek: mtk-mutex: Add DPI1 SOF/EOF to MT8188 mutex tables orangefs: 
Do not truncate file size drm/gem: Test for imported GEM buffers with helper net: phylink: use pl->link_interface in phylink_expects_phy() blk-throttle: don't take carryover for prioritized processing of metadata remoteproc: qcom_wcnss: Handle platforms with only single power domain drm/amdgpu: Do not program AGP BAR regs under SRIOV in gfxhub_v1_0.c drm/amd/display: Ensure DMCUB idle before reset on DCN31/DCN35 drm/amd/display: Skip checking FRL_MODE bit for PCON BW determination drm/amd/display: Fix DMUB reset sequence for DCN401 drm/amd/display: Fix p-state type when p-state is unsupported drm/amd/display: Request HW cursor on DCN3.2 with SubVP perf/core: Clean up perf_try_init_event() media: cx231xx: set device_caps for 417 pinctrl: bcm281xx: Use "unsigned int" instead of bare "unsigned" rcu: Fix get_state_synchronize_rcu_full() GP-start detection net: ethernet: ti: cpsw_new: populate netdev of_node net: phy: nxp-c45-tja11xx: add match_phy_device to TJA1103/TJA1104 dpll: Add an assertion to check freq_supported_num ublk: enforce ublks_max only for unprivileged devices iommufd: Disallow allocating nested parent domain with fault ID media: imx335: Set vblank immediately net: pktgen: fix mpls maximum labels list parsing perf/hw_breakpoint: Return EOPNOTSUPP for unsupported breakpoint type ALSA: hda/realtek: Enable PC beep passthrough for HP EliteBook 855 G7 scsi: logging: Fix scsi_logging_level bounds ipv4: fib: Move fib_valid_key_len() to rtm_to_fib_config(). 
drm/rockchip: vop2: Add uv swap for cluster window block: mark bounce buffering as incompatible with integrity ublk: complete command synchronously on error media: uvcvideo: Add sanity check to uvc_ioctl_xu_ctrl_map media: uvcvideo: Handle uvc menu translation inside uvc_get_le_value clk: imx8mp: inform CCF of maximum frequency of clocks x86/bugs: Make spectre user default depend on MITIGATION_SPECTRE_V2 hwmon: (gpio-fan) Add missing mutex locks ARM: at91: pm: fix at91_suspend_finish for ZQ calibration drm/mediatek: mtk_dpi: Add checks for reg_h_fre_con existence fpga: altera-cvp: Increase credit timeout perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters soc: apple: rtkit: Use high prio work queue soc: apple: rtkit: Implement OSLog buffers properly wifi: ath12k: Report proper tx completion status to mac80211 PCI: brcmstb: Expand inbound window size up to 64GB PCI: brcmstb: Add a softdep to MIP MSI-X driver firmware: arm_ffa: Set dma_mask for ffa devices drm/xe/vf: Retry sending MMIO request to GUC on timeout error drm/xe/pf: Create a link between PF and VF devices net/mlx5: Avoid report two health errors on same syndrome selftests/net: have `gro.sh -t` return a correct exit code pinctrl: sophgo: avoid to modify untouched bit when setting cv1800 pinconf drm/amdkfd: KFD release_work possible circular locking drm/xe: xe_gen_wa_oob: replace program_invocation_short_name leds: pwm-multicolor: Add check for fwnode_property_read_u32 net: ethernet: mtk_ppe_offload: Allow QinQ, double ETH_P_8021Q only net: xgene-v2: remove incorrect ACPI_PTR annotation bonding: report duplicate MAC address in all situations wifi: ath12k: Improve BSS discovery with hidden SSID in 6 GHz band soc: ti: k3-socinfo: Do not use syscon helper to build regmap bpf: Search and add kfuncs in struct_ops prologue and epilogue Octeontx2-af: RPM: Register driver with PCI subsys IDs x86/build: Fix broken copy command in genimage.sh when making isoimage drm/amd/display: handle 
max_downscale_src_width fail check drm/amd/display: fix dcn4x init failed drm/amd/display: Fix mismatch type comparison ASoC: mediatek: mt8188: Treat DMIC_GAINx_CUR as non-volatile ASoC: mediatek: mt8188: Add reference for dmic clocks x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus() vhost-scsi: Return queue full for page alloc failures during copy vdpa/mlx5: Fix mlx5_vdpa_get_config() endianness on big-endian machines cpuidle: menu: Avoid discarding useful information media: adv7180: Disable test-pattern control on adv7180 media: tc358746: improve calculation of the D-PHY timing registers net/mlx5e: Add correct match to check IPSec syndromes for switchdev mode scsi: mpi3mr: Update timestamp only for supervisor IOCs loop: check in LO_FLAGS_DIRECT_IO in loop_default_blocksize libbpf: Fix out-of-bound read dm: fix unconditional IO throttle caused by REQ_PREFLUSH scsi: scsi_debug: First fixes for tapes net/mlx5: Change POOL_NEXT_SIZE define value and make it global x86/kaslr: Reduce KASLR entropy on most x86 systems crypto: ahash - Set default reqsize from ahash_alg crypto: skcipher - Zap type in crypto_alloc_sync_skcipher net: ipv6: Init tunnel link-netns before registering dev drm/xe/oa: Ensure that polled read returns latest data MIPS: Use arch specific syscall name match function drm/amdgpu: remove all KFD fences from the BO on release x86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op() genirq/msi: Store the IOMMU IOVA directly in msi_desc instead of iommu_cookie MIPS: pm-cps: Use per-CPU variables as per-CPU, not per-core clocksource: mips-gic-timer: Enable counter when CPUs start PCI: epf-mhi: Update device ID for SA8775P scsi: mpt3sas: Send a diag reset if target reset fails wifi: rtw88: Fix rtw_init_vht_cap() for RTL8814AU wifi: rtw88: Fix rtw_init_ht_cap() for RTL8814AU wifi: rtw88: Fix rtw_desc_to_mcsrate() to handle MCS16-31 wifi: rtw89: fw: propagate error code from rtw89_h2c_tx() wifi: rtw89: fw: get 
sb_sel_ver via get_unaligned_le32() wifi: rtw89: fw: add blacklist to avoid obsolete secure firmware wifi: rtw89: 8922a: fix incorrect STA-ID in EHT MU PPDU net: pktgen: fix access outside of user given buffer in pktgen_thread_write() power: supply: axp20x_battery: Update temp sensor for AXP717 from device tree EDAC/ie31200: work around false positive build warning i3c: master: svc: Flush FIFO before sending Dynamic Address Assignment(DAA) mfd: axp20x: AXP717: Add AXP717_TS_PIN_CFG to writeable regs eeprom: ee1004: Check chip before probing irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector drm/amd/pm: Fetch current power limit from PMFW drm/amd/display: Add support for disconnected eDP streams drm/amd/display: Guard against setting dispclk low when active drm/amd/display: Fix BT2020 YCbCr limited/full range input drm/amd/display: Read LTTPR ALPM caps during link cap retrieval Revert "drm/amd/display: Request HW cursor on DCN3.2 with SubVP" drm/amd/display: Don't treat wb connector as physical in create_validate_stream_for_sink serial: mctrl_gpio: split disable_ms into sync and no_sync APIs RDMA/core: Fix best page size finding when it can cross SG entries pmdomain: imx: gpcv2: use proper helper for property detection can: c_can: Use of_property_present() to test existence of DT property bpf: don't do clean_live_states when state->loop_entry->branches > 0 bpf: copy_verifier_state() should copy 'loop_entry' field eth: mlx4: don't try to complete XDP frames in netpoll PCI: Fix old_size lower bound in calculate_iosize() too ACPI: HED: Always initialize before evged vxlan: Join / leave MC group after remote changes hrtimers: Replace hrtimer_clock_to_base_table with switch-case irqchip/riscv-imsic: Set irq_set_affinity() for IMSIC base media: test-drivers: vivid: don't call schedule in loop net/mlx5: Modify LSB bitmask in temperature event to include only the first bit net/mlx5: Apply rate-limiting to high temperature warning firmware: arm_ffa: 
Reject higher major version as incompatible firmware: arm_ffa: Handle the presence of host partition in the partition info firmware: xilinx: Dont send linux address to get fpga config get status ASoC: ops: Enforce platform maximum on initial value ASoC: tas2764: Add reg defaults for TAS2764_INT_CLK_CFG ASoC: tas2764: Mark SW_RESET as volatile ASoC: tas2764: Power up/down amp on mute ops ASoC: soc-dai: check return value at snd_soc_dai_set_tdm_slot() pinctrl: devicetree: do not goto err when probing hogs in pinctrl_dt_to_map smack: recognize ipv4 CIPSO w/o categories smack: Revert "smackfs: Added check catlen" kunit: tool: Use qboot on QEMU x86_64 media: i2c: imx219: Correct the minimum vblanking value media: v4l: Memset argument to 0 before calling get_mbus_config pad op net/mlx4_core: Avoid impossible mlx4_db_alloc() order value drm/xe: Stop ignoring errors from xe_ttm_stolen_mgr_init() drm/xe: Fix xe_tile_init_noalloc() error propagation clk: qcom: ipq5018: allow it to be bulid on arm32 clk: qcom: clk-alpha-pll: Do not use random stack value for recalc rate drm/xe/debugfs: fixed the return value of wedged_mode_set drm/xe/debugfs: Add missing xe_pm_runtime_put in wedge_mode_set x86/ibt: Handle FineIBT in handle_cfi_failure() x86/traps: Cleanup and robustify decode_bug() sched: Reduce the default slice to avoid tasks getting an extra tick serial: sh-sci: Update the suspend/resume support pinctrl: renesas: rzg2l: Add suspend/resume support for pull up/down phy: phy-rockchip-samsung-hdptx: Swap the definitions of LCPLL_REF and ROPLL_REF phy: core: don't require set_mode() callback for phy_get_mode() to work phy: exynos5-usbdrd: fix EDS distribution tuning (gs101) soundwire: amd: change the soundwire wake enable/disable sequence soundwire: cadence_master: set frame shape and divider based on actual clk freq net: stmmac: dwmac-loongson: Set correct {tx,rx}_fifo_size drm/amdgpu/mes11: fix set_hw_resources_1 calculation drm/amdkfd: fix missing L2 cache info in topology 
drm/amdgpu: Set snoop bit for SDMA for MI series drm/amd/display: pass calculated dram_speed_mts to dml2 drm/amd/display: Don't try AUX transactions on disconnected link drm/amdgpu: reset psp->cmd to NULL after releasing the buffer drm/amd/pm: Skip P2S load for SMU v13.0.12 drm/amd/display: Support multiple options during psr entry. Revert "drm/amd/display: Exit idle optimizations before attempt to access PHY" drm/amd/display: Update CR AUX RD interval interpretation drm/amd/display: Initial psr_version with correct setting drm/amd/display: Increase block_sequence array size drm/amd/display: Use Nominal vBlank If Provided Instead Of Capping It drm/amd/display: Populate register address for dentist for dcn401 drm/amdgpu: Use active umc info from discovery drm/amdgpu: enlarge the VBIOS binary size limit drm/amd/display/dm: drop hw_support check in amdgpu_dm_i2c_xfer() scsi: target: spc: Fix loop traversal in spc_rsoc_get_descr() net/mlx5: XDP, Enable TX side XDP multi-buffer support net/mlx5: Extend Ethtool loopback selftest to support non-linear SKB net/mlx5e: set the tx_queue_len for pfifo_fast net/mlx5e: reduce rep rxq depth to 256 for ECPF net/mlx5e: reduce the max log mpwrq sz for ECPF and reps drm/v3d: Add clock handling xfrm: prevent high SEQ input in non-ESN mode wifi: ath12k: fix the ampdu id fetch in the HAL_RX_MPDU_START TLV mptcp: pm: userspace: flags: clearer msg if no remote addr wifi: iwlwifi: use correct IMR dump variable wifi: iwlwifi: don't warn during reprobe wifi: mac80211: don't unconditionally call drv_mgd_complete_tx() wifi: mac80211: remove misplaced drv_mgd_complete_tx() call wifi: mac80211: set ieee80211_prep_tx_info::link_id upon Auth Rx net: fec: Refactor MAC reset to function powerpc/pseries/iommu: memory notifier incorrectly adds TCEs for pmemory powerpc/pseries/iommu: create DDW for devices with DMA mask less than 64-bits arch/powerpc/perf: Check the instruction type before creating sample with perf_mem_data_src ip: fib_rules: Fetch net 
from fib_rule in fib[46]_rule_configure(). r8152: add vendor/device ID pair for Dell Alienware AW1022z iio: adc: ad7944: don't use storagebits for sizing pstore: Change kmsg_bytes storage size to u32 leds: trigger: netdev: Configure LED blink interval for HW offload ext4: don't write back data before punch hole in nojournal mode ext4: remove writable userspace mappings before truncating page cache wifi: rtw88: Fix download_firmware_validate() for RTL8814AU wifi: rtw88: Fix __rtw_download_firmware() for RTL8814AU wifi: rtw89: coex: Assign value over than 0 to avoid firmware timer hang wifi: rtw89: fw: validate multi-firmware header before getting its size wifi: rtw89: fw: validate multi-firmware header before accessing wifi: rtw89: call power_on ahead before selecting firmware clk: qcom: camcc-sm8250: Use clk_rcg2_shared_ops for some RCGs net: page_pool: avoid false positive warning if NAPI was never added tools/power turbostat: Clustered Uncore MHz counters should honor show/hide options hwmon: (xgene-hwmon) use appropriate type for the latency value f2fs: introduce f2fs_base_attr for global sysfs entries media: qcom: camss: csid: Only add TPG v4l2 ctrl if TPG hardware is available media: qcom: camss: Add default case in vfe_src_pad_code drm/rockchip: vop2: Improve display modes handling on RK3588 HDMI0 eth: fbnic: set IFF_UNICAST_FLT to avoid enabling promiscuous mode when adding unicast addrs tools: ynl-gen: don't output external constants net/mlx5e: Avoid WARN_ON when configuring MQPRIO with HTB offload enabled cpufreq: amd-pstate: Remove unnecessary driver_lock in set_boost vxlan: Annotate FDB data races ipv4: ip_gre: Fix set but not used warning in ipgre_err() if IPv4-only r8169: don't scan PHY addresses > 0 net: flush_backlog() small changes bridge: mdb: Allow replace of a host-joined group ice: init flow director before RDMA ice: treat dyn_allowed only as suggestion rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y rcu: handle unstable rdp in 
rcu_read_unlock_strict() rcu: fix header guard for rcu_all_qs() perf: Avoid the read if the count is already updated ice: count combined queues using Rx/Tx count drm/xe/relay: Don't use GFP_KERNEL for new transactions net/mana: fix warning in the writer of client oob scsi: lpfc: Handle duplicate D_IDs in ndlp search-by D_ID routine scsi: lpfc: Ignore ndlp rport mismatch in dev_loss_tmo callbk scsi: lpfc: Free phba irq in lpfc_sli4_enable_msi() when pci_irq_vector() fails scsi: st: Restore some drive settings after reset wifi: ath12k: Avoid napi_sync() before napi_enable() HID: usbkbd: Fix the bit shift number for LED_KANA arm64: zynqmp: add clock-output-names property in clock nodes ASoC: codecs: pcm3168a: Allow for 24-bit in provider mode ASoC: rt722-sdca: Add some missing readable registers irqchip/riscv-aplic: Add support for hart indexes dm vdo indexer: prevent unterminated string warning dm vdo: use a short static string for thread name prefix drm/ast: Find VBIOS mode from regular display size bpf: Use kallsyms to find the function name of a struct_ops's stub function bpftool: Fix readlink usage in get_fd_type firmware: arm_scmi: Relax duplicate name constraint across protocol ids perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt perf/amd/ibs: Fix ->config to sample period calculation for OP PMU clk: renesas: rzg2l-cpg: Refactor Runtime PM clock validation wifi: rtl8xxxu: retry firmware download on error wifi: rtw88: Don't use static local variable in rtw8822b_set_tx_power_index_by_rate wifi: rtw89: add wiphy_lock() to work that isn't held wiphy_lock() yet spi: zynqmp-gqspi: Always acknowledge interrupts regulator: ad5398: Add device tree support wifi: ath12k: fix ath12k_hal_tx_cmd_ext_desc_setup() info1 override accel/qaic: Mask out SR-IOV PCI resources drm/xe/pf: Reset GuC VF config when unprovisioning critical resource wifi: ath9k: return by of_get_mac_address wifi: ath12k: Fetch regdb.bin file from board-2.bin wifi: ath12k: Fix end offset bit definition in 
monitor ring descriptor drm: bridge: adv7511: fill stream capabilities drm/nouveau: fix the broken marco GSP_MSG_MAX_SIZE wifi: ath11k: Use dma_alloc_noncoherent for rx_tid buffer allocation drm/xe: Move suballocator init to after display init drm/xe: Do not attempt to bootstrap VF in execlists mode wifi: rtw89: coex: Separated Wi-Fi connecting event from Wi-Fi scan event drm/xe/sa: Always call drm_suballoc_manager_fini() drm/xe: Reject BO eviction if BO is bound to current VM drm/atomic: clarify the rules around drm_atomic_state->allow_modeset drm/buddy: fix issue that force_merge cannot free all roots drm/panel-edp: Add Starry 116KHD024006 drm: Add valid clones check ASoC: imx-card: Adjust over allocation of memory in imx_card_parse_of() book3s64/radix: Fix compile errors when CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=n pinctrl: meson: define the pull up/down resistor value as 60 kOhm smb: server: smb2pdu: check return value of xa_store() platform/x86/intel: hid: Add Pantherlake support platform/x86: asus-wmi: Disable OOBE state after resume from hibernation platform/x86: ideapad-laptop: add support for some new buttons ASoC: cs42l43: Disable headphone clamps during type detection ASoC: Intel: bytcr_rt5640: Add DMI quirk for Acer Aspire SW3-013 ALSA: hda/realtek: Add quirk for HP Spectre x360 15-df1xxx nvme-pci: add quirks for device 126f:1001 nvme-pci: add quirks for WDC Blue SN550 15b7:5009 ALSA: usb-audio: Fix duplicated name in MIDI substream names nvmet-tcp: don't restore null sk_state_change io_uring/fdinfo: annotate racy sq/cq head/tail reads cifs: Fix and improve cifs_query_path_info() and cifs_query_file_info() cifs: Fix changing times and read-only attr over SMB1 smb_set_file_info() function ASoC: intel/sdw_utils: Add volume limit to cs42l43 speakers btrfs: compression: adjust cb->compressed_folios allocation type btrfs: correct the order of prelim_ref arguments in btrfs__prelim_ref btrfs: handle empty eb->folios in num_extent_folios() btrfs: avoid NULL 
pointer dereference if no valid csum tree tools: ynl-gen: validate 0 len strings from kernel block: only update request sector if needed wifi: iwlwifi: add support for Killer on MTL x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88 xenbus: Allow PVH dom0 a non-local xenstore drm/amd/display: Call FP Protect Before Mode Programming/Mode Support __legitimize_mnt(): check for MNT_SYNC_UMOUNT should be under mount_lock soundwire: bus: Fix race on the creation of the IRQ domain espintcp: fix skb leaks espintcp: remove encap socket caching to avoid reference leak xfrm: Fix UDP GRO handling for some corner cases dmaengine: idxd: Fix allowing write() from different address spaces x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL macro kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork() remoteproc: qcom_wcnss: Fix on platforms without fallback regulators clk: sunxi-ng: d1: Add missing divider for MMC mod clocks xfrm: Sanitize marks before insert dmaengine: idxd: Fix ->poll() return value dmaengine: fsl-edma: Fix return code for unhandled interrupts driver core: Split devres APIs to device/devres.h devres: Introduce devm_kmemdup_array() ASoC: SOF: Intel: hda: Fix UAF when reloading module irqchip/riscv-imsic: Start local sync timer on correct CPU perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq Bluetooth: L2CAP: Fix not checking l2cap_chan security level Bluetooth: btusb: use skb_pull to avoid unsafe access in QCA dump handling ptp: ocp: Limit signal/freq counts in summary output functions bridge: netfilter: Fix forwarding of fragmented packets ice: fix vf->num_mac count with port representors ice: Fix LACP bonds without SRIOV environment idpf: fix null-ptr-deref in idpf_features_check loop: don't require ->write_iter for writable files in loop_configure pinctrl: qcom: switch to devm_register_sys_off_handler() net: dwmac-sun8i: Use parsed internal PHY address instead of 1 net: lan743x: Restore SGMII CTRL register 
on resume io_uring: fix overflow resched cqe reordering idpf: fix idpf_vport_splitq_napi_poll() sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue() octeontx2-pf: Add AF_XDP non-zero copy support net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done octeontx2-af: Set LMT_ENA bit for APR table entries octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG clk: s2mps11: initialise clk_hw_onecell_data::num before accessing ::hws[] in probe() crypto: algif_hash - fix double free in hash_accept padata: do not leak refcount in reorder_work can: slcan: allow reception of short error messages can: bcm: add locking for bcm_op runtime updates can: bcm: add missing rcu read protection for procfs content ASoC: SOF: ipc4-control: Use SOF_CTRL_CMD_BINARY as numid for bytes_ext ASoC: SOF: Intel: hda-bus: Use PIO mode on ACE2+ platforms ASoc: SOF: topology: connect DAI to a single DAI link ASoC: SOF: ipc4-pcm: Delay reporting is only supported for playback direction ALSA: pcm: Fix race of buffer access at PCM OSS layer ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14ASP10 llc: fix data loss when reading from a socket in llc_ui_recvmsg() can: kvaser_pciefd: Continue parsing DMA buf after dropped RX can: kvaser_pciefd: Fix echo_skb race net: dsa: microchip: linearize skb for tail-tagging switches vmxnet3: update MTU after device quiesce pmdomain: renesas: rcar: Remove obsolete nullify checks pmdomain: core: Fix error checking in genpd_dev_pm_attach_by_id() platform/x86: dell-wmi-sysman: Avoid buffer overflow in current_password_store() thermal: intel: x86_pkg_temp_thermal: Fix bogus trip temperature drm/edid: fixed the bug that hdr metadata was not reset smb: client: Fix use-after-free in cifs_fill_dirent arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs smb: client: Reset all search buffer pointers when releasing buffer Revert "drm/amd: Keep display off while going into S4" Input: xpad - add more controllers highmem: add 
folio_test_partial_kmap() memcg: always call cond_resched() after fn() mm/page_alloc.c: avoid infinite retries caused by cpuset race mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled mm: vmalloc: actually use the in-place vrealloc region mm: vmalloc: only zero-init on vrealloc shrink nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs() Bluetooth: btmtksdio: Check function enabled before doing close Bluetooth: btmtksdio: Do close if SDIO card removed without close Revert "arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection" ksmbd: fix stream write failure platform/x86: think-lmi: Fix attribute name usage for non-compliant items spi: use container_of_cont() for to_spi_device() spi: spi-fsl-dspi: restrict register range for regmap access spi: spi-fsl-dspi: Halt the module after a new message transfer spi: spi-fsl-dspi: Reset SR flags before sending a new message err.h: move IOMEM_ERR_PTR() to err.h gcc-15: make 'unterminated string initialization' just a warning gcc-15: disable '-Wunterminated-string-initialization' entirely for now Fix mis-uses of 'cc-option' for warning disablement kbuild: Properly disable -Wunterminated-string-initialization for clang drm/amd/display: Exit idle optimizations before accessing PHY bpf: abort verification if env->cur_state->loop_entry != NULL serial: sh-sci: Save and restore more registers drm/amdkfd: Correct F8_MODE for gfx950 watchdog: aspeed: fix 64-bit division pinctrl: tegra: Fix off by one in tegra_pinctrl_get_group() i3c: master: svc: Fix implicit fallthrough in svc_i3c_master_ibi_work() x86/mm/init: Handle the special case of device private pages in add_pages(), to not increase max_pfn and trigger dma_addressing_limited() bounce buffers bounce buffers drm/gem: Internally test import_attach for imported objects Linux 6.12.31 Change-Id: I017795966fb764f9320a6a0df1571d19e5e631fe Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2025-07-03 07:19:01 +00:00
Kalesh Singh	030e00a2d7	ANDROID: 16K: Use vma_area slab cache for pad VMA Allocate the padding VMA from the vma slab cache; this makes it easier to debug slab leaks than allocating from the kmalloc slabs. Bug: 427145188 Change-Id: I24c5f5d0eb3b06acf506f18f5eb57cd497b13d6d Signed-off-by: Kalesh Singh <kaleshsingh@google.com>	2025-06-24 16:42:50 -07:00
David Hildenbrand	447c8f0c06	kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork() [ Upstream commit e9f180d7cfde23b9f8eebd60272465176373ab2c ] It is not intuitive, but vm_area_dup(), located in kernel/fork.c, is used not only for duplicating VMAs during fork(), but also for duplicating VMAs when splitting VMAs or when mremap()'ing them. VM_PFNMAP mappings can at least get ordinarily mremap()'ed (no change in size) and apparently also shrunk during mremap(), which implies duplicating the VMA in __split_vma() first. In case of ordinary mremap() (no change in size), we first duplicate the VMA in copy_vma_and_data()->copy_vma() to then call untrack_pfn_clear() on the old VMA: we effectively move the VM_PAT reservation. So the untrack_pfn_clear() call on the newly duplicated VMA is wrong in that context. Splitting of VMAs seems problematic, because we don't duplicate/adjust the reservation when splitting the VMA. Instead, in memtype_erase() -- called during zapping/munmap -- we shrink a reservation in case only the end address matches: Assume we split a VMA into A and B, both would share a reservation until B is unmapped. So when unmapping B, the reservation would be updated to cover only A. When unmapping A, we would properly remove the now-shrunk reservation. That scenario describes the mremap() shrinking (old_size > new_size), where we split + unmap B, and the untrack_pfn_clear() on the new VMA is wrong. What if we manage to split a VM_PFNMAP VMA into A and B and unmap A first? It would be broken because we would never free the reservation. Likely, there are ways to trigger such a VMA split outside of mremap(). Affecting other VMA duplication was not intended; vm_area_dup() being used outside of kernel/fork.c was an oversight. So let's fix that for now; how to handle VMA splits better should be investigated separately. 
With a simple reproducer that uses mprotect() to split such a VMA I can trigger x86/PAT: pat_mremap:26448 freeing invalid memtype [mem 0x00000000-0x00000fff] Link: https://lkml.kernel.org/r/20250422144942.2871395-1-david@redhat.com Fixes: dc84bc2aba85 ("x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-29 11:03:14 +02:00
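The distinction this commit draws can be sketched as a small userspace toy model. This is not kernel code: the `Vma` class, the flag values, and both function names are hypothetical and only illustrate the idea that a generic duplicate (split, mremap) keeps the VM_PAT state, while only the fork-specific path starts the copy without it.

```python
from dataclasses import dataclass, replace

# Hypothetical flag values, chosen only for this sketch.
VM_PFNMAP = 0x1
VM_PAT = 0x2


@dataclass
class Vma:
    start: int
    end: int
    flags: int


def vma_dup(vma: Vma) -> Vma:
    """Generic duplicate, as a split or mremap would use it.

    The flags are carried over untouched, so an existing VM_PAT
    reservation stays visible on the copy.
    """
    return replace(vma)


def vma_dup_for_fork(vma: Vma) -> Vma:
    """Fork-only duplicate.

    The child copy of a PFN mapping starts without VM_PAT; in this model
    the flag would be set later, once the child's own reservation succeeds.
    """
    new = vma_dup(vma)
    if new.flags & VM_PFNMAP:
        new.flags &= ~VM_PAT
    return new
```

In this model, calling `vma_dup()` on a VMA with `VM_PFNMAP | VM_PAT` returns a copy that still has `VM_PAT` set, whereas `vma_dup_for_fork()` returns one with the flag cleared and leaves the original untouched.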
xiaofeng	2a5729e149	ANDROID: vendor_hooks:vendor hook for mmput Add a vendor hook in mmput() for when mm_users has decreased to 0. Bug: 238821038 Change-Id: I42a717cbeeb3176bac14b4b2391fdb2366c972d3 Signed-off-by: xiaofeng <xiaofeng5@xiaomi.com>	2025-04-28 12:00:31 -07:00
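As a rough illustration of the pattern described in this entry (a callback that runs only when the last user reference goes away), here is a userspace toy model. The `Mm` class, the `put_mm()` function, and the hook list are hypothetical names for this sketch; they are not the kernel's mmput() or the Android vendor-hook API.

```python
from typing import Callable, List


class Mm:
    """Toy stand-in for an address space with a user reference count."""

    def __init__(self, users: int = 1) -> None:
        self.users = users


def put_mm(mm: Mm, hooks: List[Callable[[Mm], None]]) -> bool:
    """Drop one user reference.

    Registered hooks run only when the count reaches zero.
    Returns True if this call dropped the last reference.
    """
    mm.users -= 1
    if mm.users == 0:
        for hook in hooks:
            hook(mm)
        return True
    return False
```

With two users, the first `put_mm()` call returns `False` and no hook runs; the second returns `True` and each hook is called once.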
Greg Kroah-Hartman	0946c695bb	Merge `7d8dfc27d9` ("smb: client: Fix netns refcount imbalance causing leaks and use-after-free") into android16-6.12 Steps on the way to 6.12.23 Change-Id: I071040c57ea134f0a618ecc9e25db4a302dff4a8 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2025-04-24 08:30:10 -07:00
David Hildenbrand	8d6373f83f	x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range() [ Upstream commit dc84bc2aba85a1508f04a936f9f9a15f64ebfb31 ] When track_pfn_copy() fails, we have already added the dst VMA to the maple tree. As fork() fails, we'll clean up the maple tree, and stumble over the dst VMA for which we neither performed any reservation nor copied any page tables. Consequently untrack_pfn() will see VM_PAT and try obtaining the PAT information from the page table -- which fails because the page table was not copied. The easiest fix would be to simply clear the VM_PAT flag of the dst VMA if track_pfn_copy() fails. However, the whole idea of "simply" clearing the VM_PAT flag is shaky as well: if we passed track_pfn_copy() and performed a reservation, but copying the page tables fails, we'll simply clear the VM_PAT flag, not properly undoing the reservation ... which is also wrong. So let's fix it properly: set the VM_PAT flag only if the reservation succeeded (leaving it clear initially), and undo the reservation if anything goes wrong while copying the page tables: clearing the VM_PAT flag after undoing the reservation. Note that any copied page table entries will get zapped when the VMA gets removed later, after copy_page_range() succeeded; as VM_PAT is not set then, we won't try cleaning VM_PAT up once more and untrack_pfn() will be happy. Note that leaving these page tables in place without a reservation is not a problem, as we are aborting fork(); this process will never run. A reproducer can usually trigger this on the first try: https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/reproducers/pat_fork.c WARNING: CPU: 26 PID: 11650 at arch/x86/mm/pat/memtype.c:983 get_pat_info+0xf6/0x110 Modules linked in: ... CPU: 26 UID: 0 PID: 11650 Comm: repro3 Not tainted 6.12.0-rc5+ #92 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:get_pat_info+0xf6/0x110 ... Call Trace: <TASK> ... 
untrack_pfn+0x52/0x110 unmap_single_vma+0xa6/0xe0 unmap_vmas+0x105/0x1f0 exit_mmap+0xf6/0x460 __mmput+0x4b/0x120 copy_process+0x1bf6/0x2aa0 kernel_clone+0xab/0x440 __do_sys_clone+0x66/0x90 do_syscall_64+0x95/0x180 Likely this case was missed in: `d155df53f3` ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed") ... and instead of undoing the reservation we simply cleared the VM_PAT flag. Keep the documentation of these functions in include/linux/pgtable.h, one place is more than sufficient -- we should clean that up for the other functions like track_pfn_remap/untrack_pfn separately. Fixes: `d155df53f3` ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed") Fixes: `2ab640379a` ("x86: PAT: hooks in generic vm code to help archs to track pfnmap regions - v3") Reported-by: xingwei lee <xrivendell7@gmail.com> Reported-by: yuxin wang <wang1315768607@163.com> Reported-by: Marius Fleischer <fleischermarius@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20250321112323.153741-1-david@redhat.com Closes: https://lore.kernel.org/lkml/CABOYnLx_dnqzpCW99G81DmOr+2UzdmZMk=T3uxwNxwz+R1RAwg@mail.gmail.com/ Closes: https://lore.kernel.org/lkml/CAJg=8jwijTP5fre8woS4JVJQ8iUA6v+iNcsOgtj9Zfpc3obDOQ@mail.gmail.com/ Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:39:18 +02:00
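The ordering described in the commit above (set VM_PAT only after a successful reservation, and on a later failure undo the reservation before clearing the flag) can be illustrated with a toy model. This is a Python sketch under stated assumptions, not the kernel code: `VmaModel`, `copy_page_range_model`, and the boolean `reserve_ok` / `copy_ok` inputs are hypothetical names that stand in for track_pfn_copy() and the page-table copy.

```python
class VmaModel:
    """Hypothetical stand-in for the dst VMA; tracks only the two facts that matter."""

    def __init__(self):
        self.vm_pat = False    # VM_PAT flag, left clear initially
        self.reserved = False  # whether a PAT reservation is currently held


def copy_page_range_model(dst, reserve_ok, copy_ok):
    """Return True on success; keep the invariant 'vm_pat implies reserved'."""
    if not reserve_ok:
        # Reservation failed: VM_PAT was never set, so nothing to undo.
        return False
    dst.reserved = True
    dst.vm_pat = True          # set the flag only after the reservation succeeded
    if not copy_ok:
        dst.reserved = False   # undo the reservation first ...
        dst.vm_pat = False     # ... then clear the flag
        return False
    return True
```

In every outcome the flag is set only while a reservation is held, which is the property the fix relies on when the failed fork() later tears the VMA down.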
chenweitao	1dc69ebe20	ANDROID: vendor_hooks: Add hook for trace_android_vh_copy_process Add hook for trace_android_vh_copy_process, which gives the vendor a chance to monitor the total thread count of the system and the thread count under a particular process Bug: 325765508 Change-Id: Ibeb8aa571d44997ac10623321cd00d1686bde033 Signed-off-by: chenweitao <chenweitao@oppo.com>	2025-03-11 11:26:45 +08:00
Suren Baghdasaryan	3e74468f1e	FROMGIT: mm: make vma cache SLAB_TYPESAFE_BY_RCU To enable SLAB_TYPESAFE_BY_RCU for vma cache we need to ensure that object reuse before RCU grace period is over will be detected by lock_vma_under_rcu(). Current checks are sufficient as long as vma is detached before it is freed. The only place this is not currently happening is in exit_mmap(). Add the missing vma_mark_detached() in exit_mmap(). Another issue which might trick lock_vma_under_rcu() during vma reuse is vm_area_dup(), which copies the entire content of the vma into a new one, overriding new vma's vm_refcnt and temporarily making it appear as attached. This might trick a racing lock_vma_under_rcu() to operate on a reused vma if it found the vma before it got reused. To prevent this situation, we should ensure that vm_refcnt stays at detached state (0) when it is copied and advances to attached state only after it is added into the vma tree. Introduce vm_area_init_from() which preserves new vma's vm_refcnt and use it in vm_area_dup(). Since all vmas are in detached state with no current readers when they are freed, lock_vma_under_rcu() will not be able to take vm_refcnt after vma got detached even if vma is reused. vma_mark_attached() is modified to include a release fence to ensure all stores to the vma happen before vm_refcnt gets initialized. Finally, make vm_area_cachep SLAB_TYPESAFE_BY_RCU. This will facilitate vm_area_struct reuse and will minimize the number of call_rcu() calls. 
Link: https://lkml.kernel.org/r/20250213224655.1680278-18-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit f56ae9bc0002a2ff7bf3cdd27ed847fe6e9d686a https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) Bug: 322132947 Change-Id: I410c6fbce2e0d87ed5f7c19dc1f8806b2556837a Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2025-02-28 02:29:43 -08:00
Suren Baghdasaryan	540df3e90d	BACKPORT: FROMGIT: mm: replace vm_lock and detached flag with a reference count rw_semaphore is a sizable structure of 40 bytes and consumes considerable space for each vm_area_struct. However vma_lock has two important specifics which can be used to replace rw_semaphore with a simpler structure: 1. Readers never wait. They try to take the vma_lock and fall back to mmap_lock if that fails. 2. Only one writer at a time will ever try to write-lock a vma_lock because writers first take mmap_lock in write mode. Because of these requirements, full rw_semaphore functionality is not needed and we can replace rw_semaphore and the vma->detached flag with a refcount (vm_refcnt). When vma is in detached state, vm_refcnt is 0 and only a call to vma_mark_attached() can take it out of this state. Note that unlike before, now we enforce both vma_mark_attached() and vma_mark_detached() to be done only after vma has been write-locked. vma_mark_attached() changes vm_refcnt to 1 to indicate that it has been attached to the vma tree. When a reader takes read lock, it increments vm_refcnt, unless the top usable bit of vm_refcnt (0x40000000) is set, indicating presence of a writer. When writer takes write lock, it sets the top usable bit to indicate its presence. If there are readers, writer will wait using newly introduced mm->vma_writer_wait. Since all writers take mmap_lock in write mode first, there can be only one writer at a time. The last reader to release the lock will signal the writer to wake up. refcount might overflow if there are many competing readers, in which case read-locking will fail. Readers are expected to handle such failures. In summary: 1. all readers increment the vm_refcnt; 2. writer sets top usable (writer) bit of vm_refcnt; 3. readers cannot increment the vm_refcnt if the writer bit is set; 4. 
in the presence of readers, writer must wait for the vm_refcnt to drop to 1 (plus the VMA_LOCK_OFFSET writer bit), indicating an attached vma with no readers; 5. vm_refcnt overflow is handled by the readers. While this vm_lock replacement does not yet result in a smaller vm_area_struct (it stays at 256 bytes due to cacheline alignment), it allows for further size optimization by structure member regrouping to bring the size of vm_area_struct below 192 bytes. Link: https://lkml.kernel.org/r/20250213224655.1680278-13-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Matthew Wilcox <willy@infradead.org> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . 
McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 810c1edd93f29baa10142aa430f8d6c2909fcc25 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) [surenb: trivial merge conflicts in mm.h and vma_internal.h] Bug: 322132947 Change-Id: I4ef39de83b6b44b30c5bd2ff0cd34c0a84d10632 Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2025-02-28 02:29:43 -08:00
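The five vm_refcnt rules summarized in the commit above can be walked through with a small state model. What follows is a single-threaded Python sketch, not kernel code: the class and method names are hypothetical, the real implementation uses atomic operations and sleeps on mm->vma_writer_wait, and the exact overflow limit used here is an assumption. Only the 0x40000000 writer bit and the "0 = detached, 1 = attached with no readers" convention come from the commit message.

```python
VMA_LOCK_OFFSET = 0x40000000  # top usable bit of vm_refcnt, marks a writer


class VmaRefcntModel:
    """Toy model of the vm_refcnt states (hypothetical names, no real atomics)."""

    def __init__(self):
        self.refcnt = 0  # 0 means the vma is detached

    def mark_attached(self):
        assert self.refcnt == 0, "only a detached vma can be attached"
        self.refcnt = 1  # attached, no readers

    def try_read_lock(self):
        # Readers never wait: on failure the caller falls back to mmap_lock.
        if self.refcnt == 0:                    # detached
            return False
        if self.refcnt & VMA_LOCK_OFFSET:       # writer bit is set
            return False
        if self.refcnt + 1 >= VMA_LOCK_OFFSET:  # assumed overflow limit
            return False
        self.refcnt += 1
        return True

    def read_unlock(self):
        assert (self.refcnt & (VMA_LOCK_OFFSET - 1)) > 1, "no reader to release"
        self.refcnt -= 1

    def begin_write_lock(self):
        self.refcnt |= VMA_LOCK_OFFSET  # announce the writer

    def writer_may_proceed(self):
        # The writer waits until only the attached reference (1) plus its bit remain.
        return self.refcnt == VMA_LOCK_OFFSET + 1

    def end_write_lock(self):
        self.refcnt &= ~VMA_LOCK_OFFSET
```

A typical sequence is: attach, one reader locks (count 2), a writer sets its bit and finds it cannot proceed, the reader unlocks, and the writer then sees exactly `VMA_LOCK_OFFSET + 1`.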
Suren Baghdasaryan	5fcab29750	FROMGIT: mm: move mmap_init_lock() out of the header file mmap_init_lock() is used only from mm_init() in fork.c, therefore it does not have to reside in the header file. This move lets us avoid including additional headers in mmap_lock.h later, when mmap_init_lock() needs to initialize rcuwait object. Link: https://lkml.kernel.org/r/20250213224655.1680278-9-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . 
McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 9ab68ea874f31ea5b633d14095f7ec001495b11e https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) Bug: 322132947 Change-Id: I69aeecdd917bae33a429aa872643c3a11dfa0e32 Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2025-02-28 02:29:43 -08:00
Suren Baghdasaryan	74cc099459	BACKPORT: FROMGIT: mm: mark vma as detached until it's added into vma tree Current implementation does not set detached flag when a VMA is first allocated. This does not represent the real state of the VMA, which is detached until it is added into mm's VMA tree. Fix this by marking new VMAs as detached and resetting detached flag only after VMA is added into a tree. Introduce vma_mark_attached() to make the API more readable and to simplify possible future cleanup when vma->vm_mm might be used to indicate detached vma and vma_mark_attached() will need an additional mm parameter. Link: https://lkml.kernel.org/r/20250213224655.1680278-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . 
McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 286750a6443552abad64c66ac96e629c4516bb3b [surenb: resolved conflict due to the reattach_vmas() being moved from vma.h to vma.c] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) Bug: 322132947 Change-Id: I7361060f5e3ef392848f835db4c0c0f74de12ea7 Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2025-02-28 02:29:43 -08:00
Suren Baghdasaryan	e1e4842c07	FROMGIT: mm: move per-vma lock into vm_area_struct Back when per-vma locks were introduced, vm_lock was moved out of vm_area_struct in [1] because of the performance regression caused by false cacheline sharing. Recent investigation [2] revealed that the regression is limited to a rather old Broadwell microarchitecture and even there it can be mitigated by disabling adjacent cacheline prefetching, see [3]. Splitting a single logical structure into multiple ones leads to more complicated management, extra pointer dereferences and overall less maintainable code. When that split-away part is a lock, it complicates things even further. With no performance benefits, there are no reasons for this split. Merging the vm_lock back into vm_area_struct also allows vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset. Move vm_lock back into vm_area_struct, aligning it at the cacheline boundary and changing the cache to be cacheline-aligned as well. With kernel compiled using defconfig, this causes VMA memory consumption to grow from 160 (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes: slabinfo before: <name> ... <objsize> <objperslab> <pagesperslab> : ... vma_lock ... 40 102 1 : ... vm_area_struct ... 160 51 2 : ... slabinfo after moving vm_lock: <name> ... <objsize> <objperslab> <pagesperslab> : ... vm_area_struct ... 256 32 2 : ... Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages, which is 5.5MB per 100000 VMAs. Note that the size of this structure is dependent on the kernel configuration and typically the original size is higher than 160 bytes. Therefore these calculations are close to the worst case scenario. A more realistic vm_area_struct usage before this change is: <name> ... <objsize> <objperslab> <pagesperslab> : ... vma_lock ... 40 102 1 : ... vm_area_struct ... 176 46 2 : ... Aggregate VMA memory consumption per 1000 VMAs grows from 54 to 64 pages, which is 3.9MB per 100000 VMAs. 
This memory consumption growth can be addressed later by optimizing the vm_lock. [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/ [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/ [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20250213224655.1680278-3-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . 
McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit ad8786318a05a4c59fa9bc03a0e69d0b6b2170f9 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable) Bug: 322132947 Change-Id: Iefd3e6cfcd7a003d994eaa24b4a72593045e48b4 Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2025-02-28 02:29:43 -08:00
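The page counts quoted in the commit above (50, 54 and 64 pages per 1000 VMAs, and 5.5MB / 3.9MB per 100000 VMAs) can be reproduced from the slabinfo columns. The Python sketch below is a worked check, with assumptions stated: objects are rounded up to whole slabs, pages are 4 KiB, and "MB" is read as MiB. The helper names are hypothetical.

```python
import math

PAGE_SIZE = 4096  # assumed 4 KiB pages


def pages_for(n_objects, objs_per_slab, pages_per_slab):
    """Pages consumed when n_objects are packed into whole slabs."""
    return math.ceil(n_objects / objs_per_slab) * pages_per_slab


def growth_mib(pages_before, pages_after, scale=100):
    """Growth in MiB after scaling a per-1000-VMA page count to 100000 VMAs."""
    return (pages_after - pages_before) * scale * PAGE_SIZE / 2**20


# defconfig: vma_lock (102 per 1-page slab) + vm_area_struct (51 per 2-page slab)
before_defconfig = pages_for(1000, 102, 1) + pages_for(1000, 51, 2)
# more realistic config: 176-byte vm_area_struct (46 per 2-page slab)
before_realistic = pages_for(1000, 102, 1) + pages_for(1000, 46, 2)
# after the merge: 256-byte vm_area_struct (32 per 2-page slab)
after = pages_for(1000, 32, 2)

print(before_defconfig, before_realistic, after)           # 50 54 64
print(round(growth_mib(before_defconfig, after), 1),       # 5.5
      round(growth_mib(before_realistic, after), 1))       # 3.9
```

Under these assumptions the figures match the commit message exactly, which suggests the quoted megabyte values are binary (MiB) rather than decimal.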
Suren Baghdasaryan	90996df30f	UPSTREAM: mm: convert mm_lock_seq to a proper seqcount Convert mm_lock_seq to be seqcount_t and change all mmap_write_lock variants to increment it, in line with the usual seqcount usage pattern. This lets us check whether the mmap_lock is write-locked by checking mm_lock_seq.sequence counter (odd=locked, even=unlocked). This will be used when implementing mmap_lock speculation functions. As a result vm_lock_seq is also changed to be unsigned to match the type of mm_lock_seq.sequence. Link: https://lkml.kernel.org/r/20241122174416.1367052-2-surenb@google.com Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit e5e7fb278e5924f29ceab42bbbb891cde528f7cc) Bug: 322132947 Change-Id: I515a62599fa971935471bf61d314b0365c3e2926 Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2025-02-28 02:29:43 -08:00
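The odd/even convention mentioned in the commit above is the standard seqcount pattern: every write-lock and write-unlock increments the counter, so an odd value means a writer is active, and a reader can detect interference by comparing the value before and after its work. A minimal single-threaded Python sketch follows; the class and method names are hypothetical and do not correspond to kernel functions, and no memory barriers are modeled.

```python
class MmLockSeqModel:
    """Toy model of a seqcount guarding mmap_lock (hypothetical names)."""

    def __init__(self):
        self.sequence = 0  # even: unlocked

    def write_lock(self):
        self.sequence += 1  # becomes odd: write-locked

    def write_unlock(self):
        self.sequence += 1  # becomes even again

    def is_write_locked(self):
        return self.sequence % 2 == 1

    def speculate_begin(self):
        """Return (ok, seq); ok is False if a writer currently holds the lock."""
        seq = self.sequence
        return seq % 2 == 0, seq

    def speculate_retry(self, seq):
        """True if any writer ran since speculate_begin() returned seq."""
        return self.sequence != seq
```

A reader would call `speculate_begin()`, do its lockless work only if `ok` is true, and then discard the result if `speculate_retry(seq)` reports a change.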
Sooyong Suk	7b7404ab99	ANDROID: mm: export symbol for vendor module export symbols for vendor module for custom madvise behavior - mm_access, pidfd_get_pid, swp_swapcount Bug: 351175506 Change-Id: I55a48d09fa61b74a00eba32723eca16153d309ec Signed-off-by: Sooyong Suk <s.suk@samsung.corp-partner.google.com>	2025-02-11 14:44:05 -08:00
Liujie Xie	ad17f45365	ANDROID: vendor_hooks: Export the tracepoints task_rename Export the tracepoint task_rename so that vendors can identify specific new tasks, customize a task's util for power and performance, or optimize task scheduling parameters. Bug: 189985971 Change-Id: I3bb71eae316e3096d361e7b47012ba46ea4be509 Signed-off-by: Liujie Xie <xieliujie@oppo.com> (cherry picked from commit ed1e87e42cc2c4ed61ad6bc9d242e7e7a70c5b99)	2025-01-22 13:27:34 -08:00
Greg Kroah-Hartman	bba787badd	Merge 6.12.8 into android16-6.12 GKI (arm64) relevant 24 out of 115 changes, affecting 34 files +169/-94 `f4ab7d7424` bpf: Fix bpf_get_smp_processor_id() on !CONFIG_SMP [1 file, +5/-1] `8cdfb06569` fork: avoid inappropriate uprobe access to invalid mm [1 file, +6/-7] `2175b66c7f` mm/vmstat: fix a W=1 clang compiler warning [1 file, +1/-1] `35727f4506` tcp_bpf: Charge receive socket buffer in bpf_tcp_ingress() [2 files, +9/-3] `4aa5dcb389` tcp_bpf: Add sk_rmem_alloc related logic for tcp_bpf ingress redirection [3 files, +16/-5] `997cf2d8c2` bpf: Check negative offsets in __bpf_skb_min_len() [1 file, +15/-6] `a817e938a0` phy: core: Fix an OF node refcount leakage in _of_phy_get() [1 file, +5/-2] `479b6c2a5f` phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup() [1 file, +3/-1] `09f17bfb36` phy: core: Fix that API devm_phy_put() fails to release the phy [1 file, +1/-1] `f797151e84` phy: core: Fix that API devm_of_phy_provider_unregister() fails to unregister the phy provider [1 file, +3/-3] `7e7c8ffc01` phy: core: Fix that API devm_phy_destroy() fails to destroy the phy [1 file, +1/-1] `c180c3f42d` ALSA: memalloc: prefer dma_mapping_error() over explicit address checking [1 file, +1/-1] `a39ff5bf23` stddef: make __struct_group() UAPI C++-friendly [2 files, +21/-7] `68662d78af` tracing/kprobe: Make trace_kprobe's module callback called after jump_label update [1 file, +1/-1] `ca5995f805` regmap: Use correct format specifier for logging range errors [1 file, +2/-2] `fdaaf92943` bpf: Zero index arg error string for dynptr and iter [6 files, +29/-29] `92d5139b91` virtio-blk: don't keep queue frozen during system suspend [1 file, +5/-2] `16b54ee81d` blk-mq: register cpuhp callback after hctx is added to xarray table [1 file, +7/-8] `7d680f2f76` ublk: detach gendisk from ublk device if add_disk() fails [1 file, +17/-9] `79a47fd0f1` freezer, sched: Report frozen tasks as 'D' instead of 'R' [1 file, +2/-1] `a744146969` tracing: 
Constify string literal data member in struct trace_event_call [1 file, +1/-1] `1cca920af1` tracing: Prevent bad count for tracing_cpumask_write [1 file, +3/-0] `8e8494c83c` io_uring/sqpoll: fix sqpoll error handling races [1 file, +6/-0] `aed157301c` PCI/MSI: Handle lack of irqdomain gracefully [2 files, +9/-2] Changes in 6.12.8 media: dvb-frontends: dib3000mb: fix uninit-value in dib3000_write_reg ceph: allocate sparse_ext map only for sparse reads arm64: dts: broadcom: Fix L2 linesize for Raspberry Pi 5 bpf: Fix bpf_get_smp_processor_id() on !CONFIG_SMP fork: avoid inappropriate uprobe access to invalid mm mm/vmstat: fix a W=1 clang compiler warning selftests/bpf: Fix compilation error in get_uprobe_offset() smb: client: Deduplicate "select NETFS_SUPPORT" in Kconfig smb: fix bytes written value in /proc/fs/cifs/Stats tcp_bpf: Charge receive socket buffer in bpf_tcp_ingress() tcp_bpf: Add sk_rmem_alloc related logic for tcp_bpf ingress redirection bpf: Check negative offsets in __bpf_skb_min_len() nfsd: Revert "nfsd: release svc_expkey/svc_export with rcu_work" nfsd: restore callback functionality for NFSv4.0 mtd: diskonchip: Cast an operand to prevent potential overflow mtd: rawnand: arasan: Fix double assertion of chip-select mtd: rawnand: arasan: Fix missing de-registration of NAND phy: qcom-qmp: Fix register name in RX Lane config of SC8280XP phy: core: Fix an OF node refcount leakage in _of_phy_get() phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup() phy: core: Fix that API devm_phy_put() fails to release the phy phy: core: Fix that API devm_of_phy_provider_unregister() fails to unregister the phy provider phy: core: Fix that API devm_phy_destroy() fails to destroy the phy phy: usb: Toggle the PHY power during init phy: rockchip: samsung-hdptx: Set drvdata before enabling runtime PM phy: rockchip: naneng-combphy: fix phy reset ALSA: memalloc: prefer dma_mapping_error() over explicit address checking dmaengine: mv_xor: fix child node 
refcount handling in early exit dmaengine: dw: Select only supported masters for ACPI devices dmaengine: tegra: Return correct DMA status when paused dmaengine: amd: qdma: Remove using the private get and set dma_ops APIs dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd() dmaengine: apple-admac: Avoid accessing registers in probe dmaengine: at_xdmac: avoid null_prt_deref in at_xdmac_prep_dma_memset ASoC: SOF: Intel: hda-dai: Do not release the link DMA on STOP platform/chrome: cros_ec_lpc: fix product identity for early Framework Laptops mtd: rawnand: fix double free in atmel_pmecc_create_user() ASoC: amd: ps: Fix for enabling DMIC on acp63 platform via _DSD entry ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21QA and 21QB ASoC: dt-bindings: realtek,rt5645: Fix CPVDD voltage comment ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21Q6 and 21Q7 powerpc/pseries/vas: Add close() callback in vas_vm_ops struct power: supply: bq24190: Fix BQ24296 Vbus regulator support stddef: make __struct_group() UAPI C++-friendly tracing/kprobe: Make trace_kprobe's module callback called after jump_label update watchdog: it87_wdt: add PWRGD enable quirk for Qotom QCML04 watchdog: rzg2l_wdt: Power on the watchdog domain in the restart handler Revert "watchdog: s3c2410_wdt: use exynos_get_pmu_regmap_by_phandle() for PMU regs" watchdog: mediatek: Add support for MT6735 TOPRGU/WDT scsi: qla1280: Fix hw revision numbering for ISP1020/1040 scsi: megaraid_sas: Fix for a potential deadlock udf: Skip parent dir link count update if corrupted udf: Verify inode link counts before performing rename ALSA: ump: Don't open legacy substream for an inactive group ALSA: ump: Indicate the inactive group in legacy substream names ALSA: ump: Update legacy substream names upon FB info update ALSA: hda/conexant: fix Z60MR100 startup pop issue ALSA: sh: Use standard helper for buffer accesses smb: server: Fix building with GCC 15 regmap: Use correct format specifier for logging range 
errors LoongArch: Fix reserving screen info memory for above-4G firmware LoongArch: BPF: Adjust the parameter of emit_jirl() platform/x86: asus-nb-wmi: Ignore unknown event 0xCF bpf: Zero index arg error string for dynptr and iter spi: intel: Add Panther Lake SPI controller support scsi: mpt3sas: Diag-Reset when Doorbell-In-Use bit is set during driver load time scsi: mpi3mr: Synchronize access to ioctl data buffer scsi: mpi3mr: Fix corrupt config pages PHY state is switched in sysfs scsi: mpi3mr: Start controller indexing from 0 scsi: mpi3mr: Handling of fault code for insufficient power scsi: storvsc: Do not flag MAINTENANCE_IN return of SRB_STATUS_DATA_OVERRUN as an error ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A spi: omap2-mcspi: Fix the IS_ERR() bug for devm_clk_get_optional_enabled() drm/dp_mst: Ensure mst_primary pointer is valid in drm_dp_mst_handle_up_req() virtio-blk: don't keep queue frozen during system suspend blk-mq: register cpuhp callback after hctx is added to xarray table wifi: iwlwifi: be less noisy if the NIC is dead in S3 ublk: detach gendisk from ublk device if add_disk() fails drm/xe: Take PM ref in delayed snapshot capture worker drm/xe: Move the coredump registration to the worker thread objtool: Add bch2_trans_unlocked_error() to bcachefs noreturns freezer, sched: Report frozen tasks as 'D' instead of 'R' dmaengine: loongson2-apb: Change GENMASK to GENMASK_ULL perf/x86/intel/uncore: Add Clearwater Forest support tracing: Constify string literal data member in struct trace_event_call tracing: Prevent bad count for tracing_cpumask_write rtla/timerlat: Fix histogram ALL for zero samples io_uring/sqpoll: fix sqpoll error handling races i2c: microchip-core: actually use repeated sends x86/fred: Clear WFE in missing-ENDBRANCH #CPs virt: tdx-guest: Just leak decrypted memory on unrecoverable errors PCI/MSI: Handle lack of irqdomain gracefully perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC i2c: imx: add imx7d 
compatible string for applying erratum ERR007805 i2c: microchip-core: fix "ghost" detections perf/x86/intel/ds: Add PEBS format 6 power: supply: cros_charge-control: add mutex for driver data power: supply: cros_charge-control: allow start_threshold == end_threshold power: supply: cros_charge-control: hide start threshold on v2 cmd power: supply: gpio-charger: Fix set charge current limits btrfs: fix race with memory mapped writes when activating swap file btrfs: avoid monopolizing a core when activating a swap file btrfs: fix swap file activation failure due to extents that used to be shared btrfs: fix transaction atomicity bug when enabling simple quotas btrfs: sysfs: fix direct super block member reads btrfs: fix use-after-free when COWing tree bock and tracing is enabled btrfs: check folio mapping after unlock in put_file_data() btrfs: check folio mapping after unlock in relocate_one_folio() Bluetooth: btusb: mediatek: move Bluetooth power off command position Bluetooth: btusb: mediatek: add callback function in btusb_disconnect Bluetooth: btusb: mediatek: add intf release flow when usb disconnect Bluetooth: btusb: mediatek: change the conditions for ISO interface ALSA: ump: Shut up truncated string warning ALSA: sh: Fix wrong argument order for copy_from_iter() Linux 6.12.8 Change-Id: I2f5b46453984dde6ed8c381109655261a6bc3596 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2025-01-03 07:44:08 +00:00
Lorenzo Stoakes	8cdfb06569	fork: avoid inappropriate uprobe access to invalid mm [ Upstream commit 8ac662f5da19f5873fdd94c48a5cdb45b2e1b58f ] If dup_mmap() encounters an issue, currently uprobe is able to access the relevant mm via the reverse mapping (in build_map_info()), and if we are very unlucky with a race window, observe invalid XA_ZERO_ENTRY state which we establish as part of the fork error path. This occurs because uprobe_write_opcode() invokes anon_vma_prepare() which in turn invokes find_mergeable_anon_vma() that uses a VMA iterator, invoking vma_iter_load() which uses the advanced maple tree API and thus is able to observe XA_ZERO_ENTRY entries added to dup_mmap() in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()"). This change was made on the assumption that only process tear-down code would actually observe (and make use of) these values. However this very unlikely but still possible edge case with uprobes exists and unfortunately does make these observable. The uprobe operation prevents races against the dup_mmap() operation via the dup_mmap_sem semaphore, which is acquired via uprobe_start_dup_mmap() and dropped via uprobe_end_dup_mmap(), and held across register_for_each_vma() prior to invoking build_map_info() which does the reverse mapping lookup. Currently these are acquired and dropped within dup_mmap(), which exposes the race window prior to error handling in the invoking dup_mm() which tears down the mm. We can avoid all this by just moving the invocation of uprobe_start_dup_mmap() and uprobe_end_dup_mmap() up a level to dup_mm() and only release this lock once the dup_mmap() operation succeeds or clean up is done. This means that the uprobe code can never observe an incompletely constructed mm and resolves the issue in this case. 
Link: https://lkml.kernel.org/r/20241210172412.52995-1-lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: syzbot+2d788f4f7cb660dac4b7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6756d273.050a0220.2477f.003d.GAE@google.com/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-02 10:34:10 +01:00
kuyo chang	7b8d3e27a3	ANDROID: GKI: Add initial dynamically task vendor size flow UBSAN reported a load of an invalid value when CONFIG_PAGE_POISONING=y. The static vendor data is already initialized by android_init_vendor_data. Add an initialization flow that zeroes the memory contents before the vendor uses it. Bug: 383246978 Change-Id: Ic4351dfeda5b9d49cfddeaf0464f9250bed80ffe Signed-off-by: kuyo chang <kuyo.chang@mediatek.com> Signed-off-by: kuyo chang <kuyo.chang@mediatek.corp-partner.google.com> [jstultz: Minor cleanup to avoid ifdefs] Signed-off-by: John Stultz <jstultz@google.com>	2024-12-17 10:15:29 +08:00
Peter Zijlstra	f86b854c98	ANDROID: sched: Add deactivated (sleeping) owner handling to find_proxy_task() If the blocked_on chain resolves to a sleeping owner, deactivate the donor task, and enqueue it on the sleeping owner task. Then re-activate it later when the owner is woken up. NOTE: This has been particularly challenging to get working properly, and some of the locking is particularly awkward. I'd very much appreciate review and feedback for ways to simplify this. Cc: Joel Fernandes <joelaf@google.com> Cc: Qais Yousef <qyousef@layalina.io> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Zimuzo Ezeozue <zezeozue@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Metin Kaya <Metin.Kaya@arm.com> Cc: Xuewen Yan <xuewen.yan94@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: kernel-team@android.com Change-Id: Ib7e9a793c13465be06a60dbdaff7e97133091e44 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Connor O'Brien <connoro@google.com> [jstultz: This was broken out from the larger proxy() patch] Signed-off-by: John Stultz <jstultz@google.com> Bug: 306081722 --- v5: * Split out from larger proxy patch v6: * Major rework, replacing the single list head per task with per-task list head and nodes, creating a tree structure so we only wake up descendants of the task woken. 
* Reworked the locking to take the task->pi_lock, so we can avoid mid-chain wakeup races from try_to_wake_up() called by the ww_mutex logic. v7: * Drop unnecessary __nested lock annotation, as we already drop the lock prior. * Add comments on #else & #endif lines, and clearer function names, and commit message tweaks as suggested by Metin Kaya * Move activate_blocked_entities() call from ttwu_queue to try_to_wake_up() to simplify locking. Thanks to questions from Metin Kaya * Fix irqsave/irqrestore usage now we call this outside where the pi_lock is held * Fix activate_blocked_entities not preserving wake_cpu * Fix for UP builds v8: * Minor checkpatch fixup * Drop proxy_deactivate and cleanups suggested by Metin v9: * Fix bug causing possibly uninitialized cpu value to be used with activate_blocked_entities() * Improved comment around preserving wake_cpu suggested by Metin * Add additional lockdep asserts, suggested by Metin * Tweaked placement of lockdep assert, suggested by Metin * Fixed comment referring to structure entry name * Fix to call proxy_resched_idle() _prior_ to calling proxy_enqueue_on_owner() where we deactivate the task, this avoids stale references to rq_selected() when the task may have been migrated to another rq. * Fix to remove the blocked_head list at the start of activate_blocked_entities() so we only do a finite amount of work, avoiding a potential livelock of two cpus removing and adding tasks to the list at the same time if the owner went back to sleep while blocked entities were being woken. v11: * Big rework to get rid of recursion. Had to add another list item to the task_struct to do this as we are in atomic context and cannot allocate memory while activating blocked entities. Will need to watch carefully for bugs, as switching to a list_head in the task_struct instead of a pointer on the stack opens up the potential for races on the shared state, but I think I've got the locking sorted. 
* Moved proxy_set_task_cpu helper to earlier in the series * Minor rework for try_to_deactivate_task changes * Minor variable name cleanups suggested by Metin v13: * Switch to use donor from next for proxy_enqueue_on_owner * Switch to using block_task instead of deactivate_task v14: * Ensure we call block_task() last in proxy_enqueue_on_owner and not touch it again to avoid races where it might be activated on another cpu * Make sure we activate blocked_entities when we exit from ttwu * Fix to enqueue the last task in the chain (p) on the blocked owner instead of donor, so that we preserve the chain structure so mid-chain wakeups propagate properly * Rework of sleeping_owner handling so that we properly deal with delayed-dequeued (sched_delayed) tasks (also removes now unused proxy_deactivate() logic)	2024-12-13 10:01:53 -08:00
John Stultz	95c9e8505a	ANDROID: sched: Migrate whole chain in proxy_migrate_task() Instead of migrating one task each time through find_proxy_task(), we can walk up the blocked_donor ptrs and migrate the entire current chain in one go. This was broken out of earlier patches and held back while the series was being stabilized, but I wanted to re-introduce it. Cc: Joel Fernandes <joelaf@google.com> Cc: Qais Yousef <qyousef@layalina.io> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Zimuzo Ezeozue <zezeozue@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Metin Kaya <Metin.Kaya@arm.com> Cc: Xuewen Yan <xuewen.yan94@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: kernel-team@android.com Change-Id: Ia920b2d4161b47b10b5d0774fb1e3283e92bbf0f Signed-off-by: John Stultz <jstultz@google.com> Bug: 306081722 --- v12: * Earlier this was re-using blocked_node, but I hit a race with activating blocked entities, and to avoid it introduced a new migration_node listhead	2024-12-13 10:01:53 -08:00
Peter Zijlstra	465f85fe91	ANDROID: sched: Add blocked_donor link to task for smarter mutex handoffs Add link to the task this task is proxying for, and use it so the mutex owner can do an intelligent hand-off of the mutex to the task on whose behalf the owner is running. Cc: Joel Fernandes <joelaf@google.com> Cc: Qais Yousef <qyousef@layalina.io> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Zimuzo Ezeozue <zezeozue@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Metin Kaya <Metin.Kaya@arm.com> Cc: Xuewen Yan <xuewen.yan94@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: kernel-team@android.com Change-Id: Iad6f775f928b9e90e22d1d831aff26f60f37e773 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Connor O'Brien <connoro@google.com> [jstultz: This patch was split out from larger proxy patch] Signed-off-by: John Stultz <jstultz@google.com> Bug: 306081722 --- v5: * Split out from larger proxy patch v6: * Moved proxied value from earlier patch to this one where it is actually used * Rework logic to check sched_proxy_exec() instead of using ifdefs * Moved comment change to this patch where it makes sense v7: * Use a more descriptive term than "us" in comments, as suggested by Metin Kaya. 
* Minor typo fixup from Metin Kaya * Reworked proxied variable to prev_not_proxied to simplify usage v8: * Use helper for donor blocked_on_state transition v9: * Re-add mutex lock handoff in the unlock path, but only when we have a blocked donor * Slight reword of commit message suggested by Metin	2024-12-13 10:01:53 -08:00
Peter Zijlstra	484044f3c6	FROMLIST: locking/mutex: Rework task_struct::blocked_on Track the blocked-on relation for mutexes, to allow following this relation at schedule time: task --(blocked-on)--> mutex --(owner)--> task. Also add a blocked_on_state value so we can distinguish when a task is blocked_on a mutex, but is either blocked, waking up, or runnable (such that it can try to acquire the lock it's blocked on). This avoids some of the subtle & racy games where the blocked_on state gets cleared, only to have it re-added by the mutex_lock_slowpath call when it tries to acquire the lock on wakeup. Also add blocked_lock to the task_struct so we can safely serialize the blocked-on state. Finally add wrappers that are useful to provide correctness checks. Folded in from a patch by: Valentin Schneider <valentin.schneider@arm.com> This all will be used for tracking blocked-task/mutex chains with the proxy-execution patch in a similar fashion to how priority inheritance is done with rt_mutexes. Cc: Joel Fernandes <joelaf@google.com> Cc: Qais Yousef <qyousef@layalina.io> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Zimuzo Ezeozue <zezeozue@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Paul E. 
McKenney" <paulmck@kernel.org> Cc: Metin Kaya <Metin.Kaya@arm.com> Cc: Xuewen Yan <xuewen.yan94@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: kernel-team@android.com Change-Id: I3c88f64c5defe46b7f5ac468048d88dbbd2deb5e Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [minor changes while rebasing] Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Connor O'Brien <connoro@google.com> [jstultz: Fix blocked_on tracking in __mutex_lock_common in error paths] Signed-off-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/lkml/20241125195204.2374458-3-jstultz@google.com/ Bug: 306081722 --- v2: * Fixed blocked_on tracking in error paths that was causing crashes v4: * Ensure we clear blocked_on when waking ww_mutexes to die or wound. This is critical so we don't get circular blocked_on relationships that can't be resolved. v5: * Fix potential bug where the skip_wait path might clear blocked_on when that path never set it * Slight tweaks to where we set blocked_on to make it consistent, along with extra WARN_ON correctness checking * Minor comment changes v7: * Minor commit message change suggested by Metin Kaya * Fix WARN_ON conditionals in unlock path (as blocked_on might already be cleared), found while looking at issue Metin Kaya raised. 
* Minor tweaks to be consistent in what we do under the blocked_on lock, also tweaked variable name to avoid confusion with label, and comment typos, as suggested by Metin Kaya * Minor tweak for CONFIG_SCHED_PROXY_EXEC name change * Moved unused block of code to later in the series, as suggested by Metin Kaya * Switch to a tri-state to be able to distinguish from waking and runnable so we can later safely do return migration from ttwu * Folded together with related blocked_on changes v8: * Fix issue leaving task BO_BLOCKED when calling into optimistic spinning path. * Include helper to better handle BO_BLOCKED->BO_WAKING transitions v9: * Typo fixup pointed out by Metin * Cleanup BO_WAKING->BO_RUNNABLE transitions for the !proxy case * Many cleanups and simplifications suggested by Metin v11: * Whitespace fixup pointed out by Metin v13: * Refactor set_blocked_on helpers clean things up a bit v14: * Small build fixup with PREEMPT_RT	2024-12-13 10:01:53 -08:00
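The blocked-on relation described in the commit above (task, then the mutex it waits on, then that mutex's owner, and so on) can be pictured with a small userspace sketch. This is not the kernel implementation: the struct layouts, the name `find_chain_end()`, and the `max_depth` guard are assumptions made for illustration. Only the state names `BO_BLOCKED`, `BO_WAKING`, and `BO_RUNNABLE` come from the changelog.

```c
#include <assert.h>
#include <stddef.h>

/* Tri-state named in the changelog; the numeric values are arbitrary here. */
enum blocked_on_state { BO_RUNNABLE, BO_BLOCKED, BO_WAKING };

struct task_sketch;

/* Hypothetical stand-in for a mutex: we only model its owner. */
struct mutex_sketch {
    struct task_sketch *owner;        /* task holding the lock, or NULL */
};

/* Hypothetical stand-in for the blocked-on fields of task_struct. */
struct task_sketch {
    struct mutex_sketch *blocked_on;  /* mutex this task waits for, or NULL */
    enum blocked_on_state state;
};

/*
 * Follow task -> blocked_on mutex -> owner -> task ... until reaching a
 * task that is not in BO_BLOCKED. Returns NULL if a lock in the chain has
 * no owner, or if max_depth hops are exhausted (a guard against circular
 * blocked-on relationships).
 */
struct task_sketch *find_chain_end(struct task_sketch *t, int max_depth)
{
    assert(t != NULL);
    while (max_depth-- > 0) {
        if (t->blocked_on == NULL || t->state != BO_BLOCKED)
            return t;                 /* waking or runnable: end of chain */
        if (t->blocked_on->owner == NULL)
            return NULL;              /* lock has no owner to follow */
        t = t->blocked_on->owner;
    }
    return NULL;
}
```

With three tasks where A waits on a mutex owned by B and B waits on a mutex owned by C, the walk ends at C, which is the task a proxy-execution scheduler would want to run on A's behalf.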
Christian Brauner	13111945c2	Revert "fs: don't block i_writecount during exec" commit 3b832035387ff508fdcf0fba66701afc78f79e3d upstream. This reverts commit `2a010c4128`. Rui Ueyama <rui314@gmail.com> writes: > I'm the creator and the maintainer of the mold linker > (https://github.com/rui314/mold). Recently, we discovered that mold > started causing process crashes in certain situations due to a change > in the Linux kernel. Here are the details: > > - In general, overwriting an existing file is much faster than > creating an empty file and writing to it on Linux, so mold attempts to > reuse an existing executable file if it exists. > > - If a program is running, opening the executable file for writing > previously failed with ETXTBSY. If that happens, mold falls back to > creating a new file. > > - However, the Linux kernel recently changed the behavior so that > writing to an executable file is now always permitted > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a010c412853). > > That caused mold to write to an executable file even if there's a > process running that file. Since changes to mmap'ed files are > immediately visible to other processes, any processes running that > file would almost certainly crash in a very mysterious way. > Identifying the cause of these random crashes took us a few days. > > Rejecting writes to an executable file that is currently running is a > well-known behavior, and Linux had operated that way for a very long > time. So, I don’t believe relying on this behavior was our mistake; > rather, I see this as a regression in the Linux kernel. Quoting myself from commit `2a010c4128` ("fs: don't block i_writecount during exec") > Yes, someone in userspace could potentially be relying on this. It's not > completely out of the realm of possibility but let's find out if that's > actually the case and not guess. It seems we found out that someone is relying on this obscure behavior. So revert the change. 
Link: https://github.com/rui314/mold/issues/1361 Link: https://lore.kernel.org/r/4a2bc207-76be-4715-8e12-7fc45a76a125@leemhuis.info Cc: <stable@vger.kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-05 14:02:50 +01:00
Matthias Maennich	7fc0276001	Merge 'v6.12-rc6' into android-mainline Change-Id: I0c3f47fe0cae2b79dc90050b15d424ac8a56d089 Signed-off-by: Matthias Maennich <maennich@google.com>	2024-11-05 00:24:26 +00:00
Linus Torvalds	b019b4a670	Merge tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for posix CPU timers. When a thread is cloned, the posix CPU timers are not inherited. If the parent has a CPU timer armed the corresponding tick dependency in the task's tick_dep_mask is set and copied to the new thread, which means the new thread and all descendants will prevent the system from going into full NOHZ operation. Clear the tick dependency mask in copy_process() to fix this" * tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone	2024-11-03 08:22:21 -10:00
Liangliang Li	16151a687e	ANDROID: vendor_hooks: Add hooks to dup_task_struct Add hook to dup_task_struct for vendor data fields initialisation. Bug: 188004638 Change-Id: I4b58604ee822fb8d1e0cc37bec72e820e7318427 Signed-off-by: Liangliang Li <liliangliang@vivo.com> (cherry picked from commit f66d96b14aab5051fdf6b5054d87362c17a7b365) (cherry picked from commit bafafe0ec46160573bef46d3d0f5d6c65fadaa3b)	2024-10-30 00:42:30 +00:00
Lorenzo Stoakes	985da552a9	fork: only invoke khugepaged, ksm hooks if no error There is no reason to invoke these hooks early against an mm that is in an incomplete state. The change in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") makes this more pertinent as we may be in a state where entries in the maple tree are not yet consistent. Their placement early in dup_mmap() only appears to have been meaningful for early error checking, and since functionally it'd require a very small allocation to fail (in practice 'too small to fail') that'd only occur in the most dire circumstances, meaning the fork would fail or be OOM'd in any case. Since both khugepaged and KSM tracking are there to provide optimisations to memory performance rather than critical functionality, it doesn't really matter all that much if, under such dire memory pressure, we fail to register an mm with these. As a result, we follow the example of commit `d2081b2bf8` ("mm: khugepaged: make khugepaged_enter() void function") and make ksm_fork() a void function also. We only expose the mm to these functions once we are done with them and only if no error occurred in the fork operation. Link: https://lkml.kernel.org/r/e0cb8b840c9d1d5a6e84d4f8eff5f3f2022aa10c.1729014377.git.lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Jann Horn <jannh@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Lorenzo Stoakes	f64e67e5d3	fork: do not invoke uffd on fork if error occurs Patch series "fork: do not expose incomplete mm on fork". During fork we may place the virtual memory address space into an inconsistent state before the fork operation is complete. In addition, we may encounter an error during the fork operation that indicates that the virtual memory address space is invalidated. As a result, we should not be exposing it in any way to external machinery that might interact with the mm or VMAs, machinery that is not designed to deal with incomplete state. We specifically update the fork logic to defer khugepaged and ksm to the end of the operation and only to be invoked if no error arose, and disallow uffd from observing fork events should an error have occurred. This patch (of 2): Currently on fork we expose the virtual address space of a process to userland unconditionally if uffd is registered in VMAs, regardless of whether an error arose in the fork. This is performed in dup_userfaultfd_complete() which is invoked unconditionally, and performs two duties - invoking registered handlers for the UFFD_EVENT_FORK event via dup_fctx(), and clearing down userfaultfd_fork_ctx objects established in dup_userfaultfd(). This is problematic, because the virtual address space may not yet be correctly initialised if an error arose. The change in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") makes this more pertinent as we may be in a state where entries in the maple tree are not yet consistent. We address this by, on fork error, ensuring that we roll back state that we would otherwise expect to clean up through the event being handled by userland and perform the memory freeing duty otherwise performed by dup_userfaultfd_complete(). We do this by implementing a new function, dup_userfaultfd_fail(), which performs the same loop, only decrementing reference counts. 
Note that we perform mmgrab() on the parent and child mm's, however userfaultfd_ctx_put() will mmdrop() this once the reference count drops to zero, so we will avoid memory leaks correctly here. Link: https://lkml.kernel.org/r/cover.1729014377.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/d3691d58bb58712b6fb3df2be441d175bd3cdf07.1729014377.git.lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:38 -07:00
Benjamin Segall	b5413156ba	posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone When cloning a new thread, its posix_cputimers are not inherited, and are cleared by posix_cputimers_init(). However, this does not clear the tick dependency it creates in tsk->tick_dep_mask, and the handler does not reach the code to clear the dependency if there were no timers to begin with. Thus if a thread has a cputimer running before clone/fork, all descendants will prevent nohz_full unless they create a cputimer of their own. Fix this by entirely clearing the tick_dep_mask in copy_process(). (There is currently no inherited state that needs a tick dependency) Process-wide timers do not have this problem because fork does not copy signal_struct as a baseline, it creates one from scratch. Fixes: `b78783000d` ("posix-cpu-timers: Migrate to use new tick dependency mask model") Signed-off-by: Ben Segall <bsegall@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/xm26o737bq8o.fsf@google.com	2024-10-27 10:36:04 +01:00
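The inheritance problem fixed by the commit above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel's `copy_process()`: `struct task_sketch`, `clone_task_sketch()`, and the `nr_cputimers` field are hypothetical stand-ins, and the bit index given to `TICK_DEP_BIT_POSIX_TIMER` is illustrative.

```c
#include <assert.h>
#include <string.h>

#define TICK_DEP_BIT_POSIX_TIMER 0   /* illustrative bit index */

/* Hypothetical, heavily reduced stand-in for task_struct. */
struct task_sketch {
    unsigned long tick_dep_mask;     /* reasons this task needs the tick */
    int nr_cputimers;                /* stand-in for posix_cputimers state */
};

/*
 * Model of cloning a thread: the child starts as a byte copy of the
 * parent, its CPU timers are reset, and (the fix) the copied tick
 * dependency mask is cleared as well.
 */
void clone_task_sketch(struct task_sketch *child,
                       const struct task_sketch *parent)
{
    assert(child != NULL && parent != NULL);
    memcpy(child, parent, sizeof(*child));
    child->nr_cputimers = 0;         /* timers are not inherited */
    child->tick_dep_mask = 0;        /* nor is the tick dependency */
}
```

Without the last assignment the child would keep the parent's `TICK_DEP_BIT_POSIX_TIMER` bit while owning no timer that could ever clear it, which is the situation the commit message describes.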
Sai Harshini Nimmala	de863f65b8	ANDROID: GKI: Guard dynamic task_struct size feature with config option Ensure that dynamic task_struct size feature is enabled only for GKI platforms. With this patch, non-GKI platforms will not face build issues anymore due to incorrect configuration earlier. Bug: 233921394 Fixes: `5e9a8cb714` ("ANDROID: GKI: Add to task_struct size via cmdline") Change-Id: Ice341f4826baf8d20a3c846d55db5ea870753c7d Signed-off-by: Sai Harshini Nimmala <quic_snimmala@quicinc.com>	2024-10-17 15:05:31 -07:00
Sai Harshini Nimmala	5e9a8cb714	ANDROID: GKI: Add to task_struct size via cmdline To reduce the size of vendor data allocated in the task_struct, from 512 bytes to a significantly lower 48 bytes, the move to a dynamically sized task_struct is being made. As part of this effort, provide means for vendors to pass a size value via kernel cmdline. Use the passed value to dynamically add to the task_struct size to accommodate vendor data. The cmdline parameter to be used is 'android_task_struct_vendor_size'. For example, vendors can add the following to the bootargs section of their devicetree to add an extra 512 bytes to the task_struct: "android_task_struct_vendor_size=512" To access this additional memory, use the android_task_vendor_data function provided. Bug: 233921394 Change-Id: I6d5ab92080b82f29bbe9735d40f7d0b1e5bb5913 Signed-off-by: Sai Harshini Nimmala <quic_snimmala@quicinc.com>	2024-10-15 01:09:54 +00:00
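A userspace sketch of the idea in the commit above: a size taken from a `key=value` argument is added to the base structure size at allocation time, and an accessor returns the extra region. Everything except the parameter name `android_task_struct_vendor_size` is an assumption made for illustration (`struct task_sketch`, `parse_vendor_size()`, `alloc_task()`, and `task_vendor_data()` are hypothetical and do not mirror the kernel code). The sketch zeroes the allocation with `calloc()`, which is an assumption about what a caller would want rather than something this commit states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed part; vendor data is placed right after it. */
struct task_sketch {
    int pid;
};

static size_t vendor_size;   /* set from the command-line style argument */

/* Offset of the vendor region, rounded up to a safe alignment. */
static size_t vendor_offset(void)
{
    size_t a = _Alignof(max_align_t);
    return (sizeof(struct task_sketch) + a - 1) & ~(a - 1);
}

/* Parse "android_task_struct_vendor_size=<decimal>"; 0 on success, -1 otherwise. */
int parse_vendor_size(const char *arg)
{
    static const char key[] = "android_task_struct_vendor_size=";
    size_t klen = sizeof(key) - 1;
    char *end;
    unsigned long v;

    assert(arg != NULL);
    if (strncmp(arg, key, klen) != 0)
        return -1;
    v = strtoul(arg + klen, &end, 10);
    if (end == arg + klen || *end != '\0')
        return -1;               /* empty or non-numeric value */
    vendor_size = v;
    return 0;
}

/* Allocate the fixed part plus the vendor region, all zeroed. */
struct task_sketch *alloc_task(void)
{
    return calloc(1, vendor_offset() + vendor_size);
}

/* Pointer to the vendor region, or NULL if no extra size was requested. */
void *task_vendor_data(struct task_sketch *t)
{
    return vendor_size ? (char *)t + vendor_offset() : NULL;
}
```

Passing `"android_task_struct_vendor_size=512"` to `parse_vendor_size()` makes each subsequent `alloc_task()` reserve 512 extra bytes that `task_vendor_data()` exposes.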
Matthias Maennich	32fec317a6	Merge `8cf0b93919` ("Linux 6.12-rc2") into android-mainline Bug: 367265496 Change-Id: I5fec4dbf7e9cd941e3fcd8adca6e0d26ba6adbfe Signed-off-by: Matthias Maennich <maennich@google.com>	2024-10-07 17:20:05 +00:00
Matthias Maennich	0e65cf24a0	Merge `aa486552a1` ("Merge tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock") into android-mainline Steps on the way to 6.12-rc1 Bug: 367265496 Change-Id: I4a4b6fec7b7f189f30a2ce5c650c73d3dda6945d Signed-off-by: Matthias Maennich <maennich@google.com>	2024-10-03 20:41:35 +00:00
Matthias Maennich	662100c8e6	Merge `88264981f2` ("Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext") into android-mainline Steps on the way to 6.12-rc1 Bug: 367265496 Change-Id: If7725ee337ef04be805a9677090bbc38b9dc3358 Signed-off-by: Matthias Maennich <maennich@google.com>	2024-09-30 20:27:29 +00:00
Matthias Maennich	e9d92621d7	Merge `7856a56541` ("Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm") into android-mainline Steps on the way to 6.12-rc1 Bug: 367265496 Change-Id: Ia778d96b2e701765c170e2f4e920e850ceedec0e Signed-off-by: Matthias Maennich <maennich@google.com>	2024-09-30 16:20:19 +00:00
Al Viro	678379e1d4	close_range(): fix the logics in descriptor table trimming Cloning a descriptor table picks the size that would cover all currently opened files. That's fine for clone() and unshare(), but for close_range() there's an additional twist - we clone before we close, and it would be a shame to have close_range(3, ~0U, CLOSE_RANGE_UNSHARE) leave us with a huge descriptor table when we are not going to keep anything past stderr, just because some large file descriptor used to be open before our call has taken it out. Unfortunately, it had been dealt with in an inherently racy way - sane_fdtable_size() gets a "don't copy anything past that" argument (passed via unshare_fd() and dup_fd()), close_range() decides how much should be trimmed and passes that to unshare_fd(). The problem is, a range that used to extend to the end of descriptor table back when close_range() had looked at it might very well have stuff grown after it by the time dup_fd() has allocated a new files_struct and started to figure out the capacity of fdtable to be attached to that. That leads to interesting pathological cases; at the very least it's a QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes a snapshot of descriptor table one might have observed at some point. Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination of unshare(CLONE_FILES) with plain close_range(), ending up with a weird state that would never occur with unshare(2) is confusing, to put it mildly. It's not hard to get rid of - all it takes is passing both ends of the range down to sane_fdtable_size(). There we are under ->files_lock, so the race is trivially avoided. So we do the following: * switch close_files() from calling unshare_fd() to calling dup_fd(). 
* undo the calling convention change done to unshare_fd() in `60997c3d45` "close_range: add CLOSE_RANGE_UNSHARE" * introduce struct fd_range, pass a pointer to that to dup_fd() and sane_fdtable_size() instead of "trim everything past that point" they are currently getting. NULL means "we are not going to be punching any holes"; NR_OPEN_MAX is gone. * make sane_fdtable_size() use find_last_bit() instead of open-coding it; it's easier to follow that way. * while we are at it, have dup_fd() report errors by returning ERR_PTR(), no need to use a separate int *errorp argument. Fixes: `60997c3d45` "close_range: add CLOSE_RANGE_UNSHARE" Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2024-09-29 21:52:29 -04:00
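The sizing rule described above (find the highest descriptor that will remain open, skipping the range that is about to be closed) can be sketched in userspace C. This is a simplified model, not the kernel's `sane_fdtable_size()`: it uses a single 64-bit bitmap, `last_bit_below()` is a hand-written stand-in for `find_last_bit()`, `needed_fd_slots()` is a hypothetical name, and the kernel's rounding and minimum table size are left out. Only `struct fd_range` and the NULL-means-no-hole convention come from the commit message, and the field names `from` and `to` are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of descriptors that is about to be closed. */
struct fd_range {
    unsigned int from, to;
};

#define NO_BIT 64u   /* sentinel: no set bit found */

/* Highest set bit with index below size (size <= 64), or NO_BIT. */
unsigned int last_bit_below(uint64_t bits, unsigned int size)
{
    assert(size <= 64);
    while (size-- > 0)
        if (bits & (UINT64_C(1) << size))
            return size;
    return NO_BIT;
}

/*
 * Number of slots the copied table needs: one past the highest fd that
 * stays open. punch_hole == NULL means nothing is going to be closed.
 */
unsigned int needed_fd_slots(uint64_t open_fds, const struct fd_range *punch_hole)
{
    unsigned int last = last_bit_below(open_fds, 64);

    if (last == NO_BIT)
        return 0;
    if (punch_hole && punch_hole->from <= last && last <= punch_hole->to) {
        /* The highest open fd falls inside the hole: look below it. */
        last = last_bit_below(open_fds, punch_hole->from);
        if (last == NO_BIT)
            return 0;
    }
    return last + 1;
}
```

For a process with descriptors 0, 1, 2 and 40 open, a hole of `{3, ~0U}` (the `close_range(3, ~0U, CLOSE_RANGE_UNSHARE)` case from the commit message) needs only 3 slots, while no hole needs 41. Because both ends of the range are evaluated in one place, the kernel can make this decision under a single lock, which is the point of the change.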
Matthias Maennich	df2ebc4bcb	Merge `efdfcd40ad` ("Merge tag 'lkmm.2024.09.14b' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu") into android-mainline Steps on the way to 6.12-rc1 Bug: 367265496 Change-Id: I0a0d83175270f57ba857b91e7c1c403e939fa34f Signed-off-by: Matthias Maennich <maennich@google.com>	2024-09-27 01:47:34 +00:00
Linus Torvalds	aa486552a1	Merge tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - new memblock_estimated_nr_free_pages() helper to replace totalram_pages() which is less accurate when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set - fixes for memblock tests * tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: s390/mm: get estimated free pages by memblock api kernel/fork.c: get estimated free pages by memblock api mm/memblock: introduce a new helper memblock_estimated_nr_free_pages() memblock test: fix implicit declaration of function 'strscpy' memblock test: fix implicit declaration of function 'isspace' memblock test: fix implicit declaration of function 'memparse' memblock test: add the definition of __setup() memblock test: fix implicit declaration of function 'virt_to_phys' tools/testing: abstract two init.h into common include directory memblock tests: include export.h in linkage.h as kernel dose memblock tests: include memory_hotplug.h in mmzone.h as kernel dose	2024-09-25 11:35:19 -07:00
Matthias Maennich	b5aeebd6f1	Merge `c903327d32` ("Merge tag 'printk-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux") into android-mainline Steps on the way to 6.12-rc1 Bug: 367265496 Change-Id: I0d94aa9be16f183bf187f91dc4916add32722775 Signed-off-by: Matthias Maennich <maennich@google.com>	2024-09-25 08:51:49 +00:00
Linus Torvalds	88264981f2	Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext support from Tejun Heo: "This implements a new scheduler class called ‘ext_sched_class’, or sched_ext, which allows scheduling policies to be implemented as BPF programs. The goals of this are: - Ease of experimentation and exploration: Enabling rapid iteration of new scheduling policies. - Customization: Building application-specific schedulers which implement policies that are not applicable to general-purpose schedulers. - Rapid scheduler deployments: Non-disruptive swap outs of scheduling policies in production environments" See individual commits for more documentation, but also the cover letter for the latest series: Link: https://lore.kernel.org/all/20240618212056.2833381-1-tj@kernel.org/ * tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (110 commits) sched: Move update_other_load_avgs() to kernel/sched/pelt.c sched_ext: Don't trigger ops.quiescent/runnable() on migrations sched_ext: Synchronize bypass state changes with rq lock scx_qmap: Implement highpri boosting sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq() sched_ext: Compact struct bpf_iter_scx_dsq_kern sched_ext: Replace consume_local_task() with move_local_task_to_local_dsq() sched_ext: Move consume_local_task() upward sched_ext: Move sanity check and dsq_mod_nr() into task_unlink_from_dsq() sched_ext: Reorder args for consume_local/remote_task() sched_ext: Restructure dispatch_to_local_dsq() sched_ext: Fix processs_ddsp_deferred_locals() by unifying DTL_INVALID handling sched_ext: Make find_dsq_for_dispatch() handle SCX_DSQ_LOCAL_ON sched_ext: Refactor consume_remote_task() sched_ext: Rename scx_kfunc_set_sleepable to unlocked and relocate sched_ext: Add missing static to scx_dump_data sched_ext: Add missing static to scx_has_op[] sched_ext: Temporarily work around pick_task_scx() being called without balance_scx() 
sched_ext: Add a cgroup scheduler which uses flattened hierarchy sched_ext: Add cgroup support ...	2024-09-21 09:44:57 -07:00
Linus Torvalds	617a814f14	Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Along with the usual shower of singleton patches, notable patch series in this pull request are: - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. - "Some cleanups for shmem" from Baolin Wang. No functional changes - more code reuse, better function naming, logic simplifications. - "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. - "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new fields to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but particularly useful for "the dynamic kernel stack project". - "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leakage of percpu memory. - "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. - "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. - "mm, memcg: cg2 memory{.swap,}.peak write handlers" from David Finkel. 
Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. - "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalization and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. - "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). - "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. - "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature. - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have noticed yet. - "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. - "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. - "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. - "mm: batch free swaps for zap_pte_range()" from Barry Song. 
Greatly improves the performance of munmap() of swap-filled ptes. - "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into single-page folios when swapping out shmem. - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. - "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). - "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! - "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. - "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operation is occurring during an unrelated vma tree walk. - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making it cleaner, more testable and better tested. - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. - "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. - "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. 
- "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. - "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy. This was overprovisioning THPs in sparsely accessed memory areas. - "zram: introduce custom comp backends API" from Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. - "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. - "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memory. - "mm: enable large folios swap-in support" from Barry Song. 
Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios" * tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits) zram: free secondary algorithms names uprobes: turn xol_area->pages[2] into xol_area->page uprobes: introduce the global struct vm_special_mapping xol_mapping Revert "uprobes: use vm_special_mapping close() functionality" mm: support large folios swap-in for sync io devices mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios mm: fix swap_read_folio_zeromap() for large folios with partial zeromap mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries set_memory: add __must_check to generic stubs mm/vma: return the exact errno in vms_gather_munmap_vmas() memcg: cleanup with !CONFIG_MEMCG_V1 mm/show_mem.c: report alloc tags in human readable units mm: support poison recovery from copy_present_page() mm: support poison recovery from do_cow_fault() resource, kunit: add test case for region_intersects() resource: make alloc_free_mem_region() works for iomem_resource mm: z3fold: deprecate CONFIG_Z3FOLD vfio/pci: implement huge_fault support mm/arm64: support large pfn mappings mm/x86: support large pfn mappings ...	2024-09-21 07:29:05 -07:00
Linus Torvalds	78567e2bc7	Merge tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cpuset isolation improvements - cpuset cgroup1 support is split into its own file behind the new config option CONFIG_CPUSETS_V1. This makes it the second controller which makes cgroup1 support optional after memcg - Handling of unavailable v1 controllers improved during cgroup1 mount operations - union_find applied to cpuset. It makes code simpler and more efficient - Reduce spurious events in pids.events - Cleanups and other misc changes - Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes that further changes build upon * tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (34 commits) cgroup: Do not report unavailable v1 controllers in /proc/cgroups cgroup: Disallow mounting v1 hierarchies without controller implementation cgroup/cpuset: Expose cpuset filesystem with cpuset v1 only cgroup/cpuset: Move cpu.h include to cpuset-internal.h cgroup/cpuset: add sefltest for cpuset v1 cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1 cgroup/cpuset: rename functions shared between v1 and v2 cgroup/cpuset: move v1 interfaces to cpuset-v1.c cgroup/cpuset: move validate_change_legacy to cpuset-v1.c cgroup/cpuset: move legacy hotplug update to cpuset-v1.c cgroup/cpuset: add callback_lock helper cgroup/cpuset: move memory_spread to cpuset-v1.c cgroup/cpuset: move relax_domain_level to cpuset-v1.c cgroup/cpuset: move memory_pressure to cpuset-v1.c cgroup/cpuset: move common code to cpuset-internal.h cgroup/cpuset: introduce cpuset-v1.c selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs cgroup/cpuset: Account for boot time isolated CPUs cgroup/cpuset: remove use_parent_ecpus of cpuset cgroup/cpuset: remove fetch_xcpus ...	2024-09-18 06:39:03 +02:00
Oleg Nesterov	ed8d5b0ce1	Revert "uprobes: use vm_special_mapping close() functionality" This reverts commit `08e28de116`. A malicious application can munmap() its "[uprobes]" vma and in this case xol_mapping.close == uprobe_clear_state() will free the memory which can be used by another thread, or the same thread when it hits the uprobe bp afterwards. Link: https://lkml.kernel.org/r/20240911131320.GA3448@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-09-17 01:07:01 -07:00
Sven Schnelle	08e28de116	uprobes: use vm_special_mapping close() functionality The following KASAN splat was shown: [ 44.505448] ================================================================== 20:37:27 [3421/145075] [ 44.505455] BUG: KASAN: slab-use-after-free in special_mapping_close+0x9c/0xc8 [ 44.505471] Read of size 8 at addr 00000000868dac48 by task sh/1384 [ 44.505479] [ 44.505486] CPU: 51 UID: 0 PID: 1384 Comm: sh Not tainted 6.11.0-rc6-next-20240902-dirty #1496 [ 44.505503] Hardware name: IBM 3931 A01 704 (z/VM 7.3.0) [ 44.505508] Call Trace: [ 44.505511] [<000b0324d2f78080>] dump_stack_lvl+0xd0/0x108 [ 44.505521] [<000b0324d2f5435c>] print_address_description.constprop.0+0x34/0x2e0 [ 44.505529] [<000b0324d2f5464c>] print_report+0x44/0x138 [ 44.505536] [<000b0324d1383192>] kasan_report+0xc2/0x140 [ 44.505543] [<000b0324d2f52904>] special_mapping_close+0x9c/0xc8 [ 44.505550] [<000b0324d12c7978>] remove_vma+0x78/0x120 [ 44.505557] [<000b0324d128a2c6>] exit_mmap+0x326/0x750 [ 44.505563] [<000b0324d0ba655a>] __mmput+0x9a/0x370 [ 44.505570] [<000b0324d0bbfbe0>] exit_mm+0x240/0x340 [ 44.505575] [<000b0324d0bc0228>] do_exit+0x548/0xd70 [ 44.505580] [<000b0324d0bc1102>] do_group_exit+0x132/0x390 [ 44.505586] [<000b0324d0bc13b6>] __s390x_sys_exit_group+0x56/0x60 [ 44.505592] [<000b0324d0adcbd6>] do_syscall+0x2f6/0x430 [ 44.505599] [<000b0324d2f78434>] __do_syscall+0xa4/0x170 [ 44.505606] [<000b0324d2f9454c>] system_call+0x74/0x98 [ 44.505614] [ 44.505616] Allocated by task 1384: [ 44.505621] kasan_save_stack+0x40/0x70 [ 44.505630] kasan_save_track+0x28/0x40 [ 44.505636] __kasan_kmalloc+0xa0/0xc0 [ 44.505642] __create_xol_area+0xfa/0x410 [ 44.505648] get_xol_area+0xb0/0xf0 [ 44.505652] uprobe_notify_resume+0x27a/0x470 [ 44.505657] irqentry_exit_to_user_mode+0x15e/0x1d0 [ 44.505664] pgm_check_handler+0x122/0x170 [ 44.505670] [ 44.505672] Freed by task 1384: [ 44.505676] kasan_save_stack+0x40/0x70 [ 44.505682] kasan_save_track+0x28/0x40 [ 44.505687] 
kasan_save_free_info+0x4a/0x70 [ 44.505693] __kasan_slab_free+0x5a/0x70 [ 44.505698] kfree+0xe8/0x3f0 [ 44.505704] __mmput+0x20/0x370 [ 44.505709] exit_mm+0x240/0x340 [ 44.505713] do_exit+0x548/0xd70 [ 44.505718] do_group_exit+0x132/0x390 [ 44.505722] __s390x_sys_exit_group+0x56/0x60 [ 44.505727] do_syscall+0x2f6/0x430 [ 44.505732] __do_syscall+0xa4/0x170 [ 44.505738] system_call+0x74/0x98 The problem is that uprobe_clear_state() kfree's struct xol_area, which contains struct vm_special_mapping xol_mapping. This one is passed to _install_special_mapping() in xol_add_vma(). __mmput() reads: static inline void __mmput(struct mm_struct *mm) { VM_BUG_ON(atomic_read(&mm->mm_users)); uprobe_clear_state(mm); exit_aio(mm); ksm_exit(mm); khugepaged_exit(mm); /* must run before exit_mmap */ exit_mmap(mm); ... } So uprobe_clear_state() at the beginning frees the memory area containing the vm_special_mapping data, but exit_mmap() uses this address later via vma->vm_private_data (which was set in _install_special_mapping()). Fix this by moving uprobe_clear_state() to uprobes.c and use it as close() callback. 
[usama.anjum@collabora.com: remove unneeded condition] Link: https://lkml.kernel.org/r/20240906101825.177490-1-usama.anjum@collabora.com Link: https://lkml.kernel.org/r/20240903073629.2442754-1-svens@linux.ibm.com Fixes: `223febc6e5` ("mm: add optional close() to struct vm_special_mapping") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-09-09 16:39:14 -07:00
Lee Jones	6bf9b9c6e9	Merge tag 'v6.11-rc6' into android-mainline Linux 6.11-rc6 Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I321f364a91703f6814332ef96c1d9ae3747625af	2024-09-05 10:16:50 +00:00
Tejun Heo	649e980dad	Merge branch 'bpf/master' into for-6.12 Pull bpf/master to receive `baebe9aaba` ("bpf: allow passing struct bpf_iter_<type> as kfunc arguments") and related changes in preparation for the DSQ iterator patchset. Signed-off-by: Tejun Heo <tj@kernel.org>	2024-09-04 11:41:32 -10:00
Lee Jones	8e0dce3251	Merge tag 'v6.11-rc4' into android-mainline Linux 6.11-rc4 Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: Icd84f7f6bed0651850e3f9c98898d8ab444271da	2024-09-03 07:16:47 +00:00

1482 Commits