Commit Graph

209 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
fcc21a6112 Merge 11c7fa11fa ("net: stmmac: dwmac-loongson: Set correct {tx,rx}_fifo_size") into android16-6.12-lts
Steps on the way to 6.12.31

Resolves merge conflicts in:
	kernel/sched/fair.c

Change-Id: I545f90ce44822f1a0f940be224258533b6581077
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2025-07-02 07:08:57 +00:00
Andy Shevchenko
dc5f5c9d2b hrtimers: Replace hrtimer_clock_to_base_table with switch-case
[ Upstream commit 4441b976dfeff0d3579e8da3c0283300c618a553 ]

Clang and GCC complain about overlapped initialisers in the
hrtimer_clock_to_base_table definition. With `make W=1` and CONFIG_WERROR=y
(which is default nowadays) this breaks the build:

  CC      kernel/time/hrtimer.o
kernel/time/hrtimer.c:124:21: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
  124 |         [CLOCK_REALTIME]        = HRTIMER_BASE_REALTIME,

kernel/time/hrtimer.c:122:27: note: previous initialization is here
  122 |         [0 ... MAX_CLOCKS - 1]  = HRTIMER_MAX_CLOCK_BASES,

(and similar for CLOCK_MONOTONIC, CLOCK_BOOTTIME, and CLOCK_TAI).

hrtimer_clockid_to_base(), which uses the table, is only used in
__hrtimer_init(), which is not a hotpath.

Therefore replace the table lookup with a switch case in
hrtimer_clockid_to_base() to avoid this warning.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250214134424.3367619-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 11:02:46 +02:00
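
The table-to-switch change described in the hrtimers commit above can be sketched in plain user-space C. This is a minimal illustration, not the kernel code: every `demo_`/`DEMO_` name is hypothetical, the clock id values are stand-ins, and returning the sentinel for unknown ids is an assumption made for this sketch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's clock ids (demo values only). */
enum demo_clockid {
	DEMO_CLOCK_REALTIME  = 0,
	DEMO_CLOCK_MONOTONIC = 1,
	DEMO_CLOCK_BOOTTIME  = 7,
	DEMO_CLOCK_TAI       = 11,
};

/* Hypothetical stand-ins for the hrtimer clock base indices. */
enum demo_hrtimer_base {
	DEMO_BASE_MONOTONIC,
	DEMO_BASE_REALTIME,
	DEMO_BASE_BOOTTIME,
	DEMO_BASE_TAI,
	DEMO_MAX_CLOCK_BASES,	/* sentinel: "no such base" */
};

/*
 * The table form first fills every slot through a range designator,
 *     [0 ... MAX - 1] = DEFAULT,
 * and then overrides individual slots such as
 *     [DEMO_CLOCK_REALTIME] = DEMO_BASE_REALTIME,
 * which is the pattern -Winitializer-overrides reports. A switch
 * expresses the same mapping without any overlapping initialisers.
 */
int demo_clockid_to_base(int clock_id)
{
	switch (clock_id) {
	case DEMO_CLOCK_REALTIME:
		return DEMO_BASE_REALTIME;
	case DEMO_CLOCK_MONOTONIC:
		return DEMO_BASE_MONOTONIC;
	case DEMO_CLOCK_BOOTTIME:
		return DEMO_BASE_BOOTTIME;
	case DEMO_CLOCK_TAI:
		return DEMO_BASE_TAI;
	default:
		/* Assumption for this sketch: unknown ids yield the sentinel. */
		return DEMO_MAX_CLOCK_BASES;
	}
}

/* Small usage example: known ids map to a base, unknown ids do not. */
void demo_self_check(void)
{
	assert(demo_clockid_to_base(DEMO_CLOCK_MONOTONIC) == DEMO_BASE_MONOTONIC);
	assert(demo_clockid_to_base(42) == DEMO_MAX_CLOCK_BASES);
}
```

Since the commit message notes that the lookup is not on a hot path, the sketch trades the constant-time array index for a switch without worrying about cost.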
Greg Kroah-Hartman
db92d2187f Merge d5ca39d336 ("btrfs: fix two misuses of folio_shift()") into android16-6.12
Steps on the way to 6.12.20

Resolves merge conflicts in:
	fs/btrfs/extent_io.c
	kernel/futex/core.c

Change-Id: If0d75c0d1a638c34dea1a88e09ef07feed5af130
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2025-04-22 09:30:35 +00:00
Andy Shevchenko
ae5716b463 hrtimers: Mark is_migration_base() with __always_inline
[ Upstream commit 27af31e44949fa85550176520ef7086a0d00fd7b ]

When is_migration_base() is unused, it prevents kernel builds
with clang, `make W=1` and CONFIG_WERROR=y:

kernel/time/hrtimer.c:156:20: error: unused function 'is_migration_base' [-Werror,-Wunused-function]
  156 | static inline bool is_migration_base(struct hrtimer_clock_base *base)
      |                    ^~~~~~~~~~~~~~~~~

Fix this by marking it with __always_inline.

[ tglx: Use __always_inline instead of __maybe_unused and move it into the
  	usage sites conditional ]

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250116160745.243358-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22 12:54:14 -07:00
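
The note in the is_migration_base() commit above about moving the helper "into the usage sites conditional" can be illustrated with a small user-space sketch. All names here are hypothetical: `demo_always_inline` merely imitates the kernel's `__always_inline`, `DEMO_HAS_WAITERS` stands in for whatever configuration condition guards the real callers, and `demo_must_wait()` is an invented caller. The attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __always_inline (GCC/Clang syntax). */
#define demo_always_inline inline __attribute__((__always_inline__))

struct demo_clock_base {
	int index;
};

/* Hypothetical placeholder base, compared by address only. */
static struct demo_clock_base demo_migration_base = { .index = -1 };

/* Made-up switch standing in for the condition that guards the callers. */
#define DEMO_HAS_WAITERS 1

#if DEMO_HAS_WAITERS
/*
 * The helper lives inside the same conditional as its only caller, so
 * it is never compiled in a configuration where nothing uses it.
 */
static demo_always_inline bool
demo_is_migration_base(const struct demo_clock_base *base)
{
	return base == &demo_migration_base;
}

/* Invented caller: anything but the placeholder base must be waited on. */
bool demo_must_wait(const struct demo_clock_base *base)
{
	return !demo_is_migration_base(base);
}
#else
/* Without the feature there is no helper and nothing to wait for. */
bool demo_must_wait(const struct demo_clock_base *base)
{
	(void)base;
	return false;
}
#endif
```

Keeping the definition and its users under one condition means the helper cannot end up defined but unused when the feature is configured out.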
litao
4713550a59 ANDROID: vendor_hooks: add vendor hook for nanosleep syscall
This vendor hook provides a point, during the nanosleep syscall, at
which the currently running task can be checked to validate its
credentials and its nanosleep operations. This makes it possible to
filter out app tasks that may trigger a storm of nanosleep events.

Bug: 396244637
Bug: 341618050
Change-Id: I644ac3d217930aa0a50966996e8001e27ce8a501
Signed-off-by: litao <tao.li@vivo.corp-partner.google.com>
(cherry picked from commit 1d9ed15534d4988da3ac56dd7892d37bf5f96847)
2025-02-25 17:23:27 -08:00
Greg Kroah-Hartman
3d75a8d3e8 Merge 6.12.14 into android16-6.12
GKI (arm64) relevant 75 out of 419 changes, affecting 91 files +700/-304
  38a1aa02b9 exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case [2 files, +29/-4]
  e5ff8d825d sched: Don't try to catch up excess steal time. [1 file, +4/-2]
  404e5fd918 printk: Fix signed integer overflow when defining LOG_BUF_LEN_MAX [1 file, +1/-1]
  b006aadf72 drm/connector: add mutex to protect ELD from concurrent access [3 files, +11/-1]
  22a1a75818 ring-buffer: Make reading page consistent with the code logic [1 file, +3/-1]
  5c2b1d9386 tun: fix group permission check [1 file, +9/-5]
  f4b8bac3cf mmc: core: Respect quirk_max_rate for non-UHS SDIO card [1 file, +2/-0]
  e557b15ea2 HID: multitouch: Add quirk for Hantick 5288 touchpad [1 file, +5/-0]
  adcb8ce68d HID: Wacom: Add PCI Wacom device support [1 file, +5/-0]
  ebb90f23f0 Bluetooth: MGMT: Fix slab-use-after-free Read in mgmt_remove_adv_monitor_sync [1 file, +11/-1]
  c257c15845 tipc: re-order conditions in tipc_crypto_key_rcv() [1 file, +2/-2]
  90778f31ef ASoC: soc-pcm: don't use soc_pcm_ret() on .prepare callback [1 file, +28/-4]
  33a4a9f54a Input: allocate keycode for phone linking [1 file, +1/-0]
  57e07d10b3 sched/fair: Fix inaccurate h_nr_runnable accounting with delayed dequeue [1 file, +19/-0]
  bc85817e6b nvme: handle connectivity loss in nvme_set_queue_count [1 file, +7/-1]
  5eba53a9ea nvme: make nvme_tls_attrs_group static [1 file, +1/-1]
  83ebf741aa udp: gso: do not drop small packets when PMTU reduces [3 files, +30/-4]
  3139a7024e ethtool: rss: fix hiding unsupported fields in dumps [1 file, +2/-1]
  e40cb34b7f pfifo_tail_enqueue: Drop new packet when sch->limit == 0 [1 file, +3/-0]
  6312555249 netem: Update sch->q.qlen before qdisc_tree_reduce_backlog() [1 file, +1/-1]
  e36364d5d4 tun: revert fix group permission check [1 file, +5/-9]
  181b23ca2e net: sched: Fix truncation of offloaded action statistics [1 file, +1/-1]
  ac7b5f3e4d drm/client: Handle tiled displays better [1 file, +9/-0]
  f735c9d4dc fs/proc: do_task_stat: Fix ESP not readable during coredump [1 file, +1/-1]
  5a6520493c arm64/kvm: Configure HYP TCR.PS/DS based on host stage1 [1 file, +4/-4]
  c66e5205fd arm64/sme: Move storage of reg_smidr to __cpuinfo_store_cpu() [2 files, +10/-13]
  e5251ae5d3 arm64/mm: Reduce PA space to 48 bits when LPA2 is not enabled [3 files, +11/-7]
  de3ffeb212 KVM: arm64: timer: Always evaluate the need for a soft timer [1 file, +1/-3]
  f2f805ada6 KVM: Explicitly verify target vCPU is online in kvm_get_vcpu() [1 file, +9/-0]
  691218a50c Bluetooth: L2CAP: handle NULL sock pointer in l2cap_sock_alloc [1 file, +2/-1]
  ddfc234761 Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection [1 file, +2/-2]
  5a262628f4 seccomp: passthrough uretprobe systemcall without filtering [1 file, +12/-0]
  2ce09aabe0 blk-cgroup: Fix class @block_class's subsystem refcount leakage [1 file, +1/-0]
  ae959ab075 scsi: ufs: core: Fix the HIGH/LOW_TEMP Bit Definitions [1 file, +2/-2]
  45ad3c7d62 of: Correct child specifier used as input of the 2nd nexus node [1 file, +1/-1]
  e62c630810 of: address: Fix empty resource handling in __of_address_resource_bounds() [1 file, +5/-7]
  4e4b3d4926 of: Fix of_find_node_opts_by_path() handling of alias+path+options [1 file, +3/-3]
  5b91440ebe of: reserved-memory: Fix using wrong number of cells to get property 'alignment' [1 file, +2/-2]
  ed0ad04c68 ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg() [1 file, +7/-2]
  d0b81ea5a5 dm-crypt: don't update io->sector after kcryptd_crypt_write_io_submit() [1 file, +3/-11]
  68a25ceb11 dm-crypt: track tag_offset in convert_context [1 file, +7/-6]
  68f16d3034 block: don't revert iter for -EIOCBQUEUED [1 file, +3/-2]
  0a14a2b841 Revert "media: uvcvideo: Require entities to have a non-zero unique ID" [1 file, +27/-43]
  3d17a4bbf2 PCI: endpoint: Finish virtual EP removal in pci_epf_remove_vepf() [1 file, +1/-0]
  36786d1a45 PCI: dwc: ep: Write BAR_MASK before iATU registers in pci_epc_set_bar() [1 file, +15/-13]
  b5cacfd067 PCI: dwc: ep: Prevent changing BAR size/flags in pci_epc_set_bar() [1 file, +21/-1]
  9fbac83100 nvme-pci: Add TUXEDO InfinityFlex to Samsung sleep quirk [1 file, +2/-1]
  2c4cda456e nvme-pci: Add TUXEDO IBP Gen9 to Samsung sleep quirk [1 file, +1/-0]
  0c77c0d754 scsi: ufs: core: Fix use-after free in init error and remove paths [4 files, +30/-32]
  8db25d4c4a scsi: core: Do not retry I/Os during depopulation [1 file, +7/-2]
  c287f18f64 rv: Reset per-task monitors also for idle tasks [1 file, +4/-0]
  e456a88bdd hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING [2 files, +83/-21]
  2a54e8f118 kfence: skip __GFP_THISNODE allocations on NUMA systems [1 file, +2/-0]
  b64b773087 soc: qcom: smem_state: fix missing of_node_put in error path [1 file, +2/-1]
  5100391aca media: mc: fix endpoint iteration [1 file, +1/-1]
  d2eac8b14a media: uvcvideo: Fix crash during unbind if gpio unit is in use [2 files, +22/-7]
  4f534dd576 media: uvcvideo: Fix event flags in uvc_ctrl_send_events [1 file, +2/-2]
  ac7737ed9a media: uvcvideo: Support partial control reads [1 file, +21/-0]
  34fb9eb31d media: uvcvideo: Only save async fh if success [1 file, +11/-7]
  e8a650dbc7 media: uvcvideo: Remove redundant NULL assignment [1 file, +1/-3]
  438bda062b media: uvcvideo: Remove dangling pointers [3 files, +67/-3]
  a403eca86c mm: gup: fix infinite loop within __get_longterm_locked [1 file, +4/-10]
  4b69308314 mm/vmscan: accumulate nr_demoted for accurate demotion statistics [1 file, +4/-3]
  4491159774 mm/compaction: fix UBSAN shift-out-of-bounds warning [1 file, +2/-1]
  2c3109dcda nvmem: core: improve range check for nvmem_cell_write() [1 file, +2/-0]
  35ae7910c3 io_uring: fix multishots with selected buffers [1 file, +2/-0]
  be985aea92 io_uring/net: don't retry connect operation on EPOLLERR [2 files, +7/-0]
  e398619920 i3c: master: Fix missing 'ret' assignment in set_speed() [1 file, +1/-1]
  8b4120b3e0 misc: misc_minor_alloc to use ida for all dynamic/misc dynamic minors [1 file, +30/-9]
  19fc795e9d maple_tree: simplify split calculation [1 file, +6/-17]
  8441aea464 ptp: Ensure info->enable callback is always set [1 file, +8/-0]
  c6dd70e5b4 timers/migration: Fix off-by-one root mis-connection [1 file, +9/-1]
  45439a8b11 fs: prepend statmount.mnt_opts string with security_sb_mnt_opts() [1 file, +4/-0]
  7db0365ee6 fs: fix adding security options to statmount.mnt_opt [1 file, +14/-15]
  d49c64c1d7 statmount: let unset strings be empty [1 file, +12/-4]

Changes in 6.12.14
	irqchip/lan966x-oic: Make CONFIG_LAN966X_OIC depend on CONFIG_MCHP_LAN966X_PCI
	btrfs: fix assertion failure when splitting ordered extent after transaction abort
	btrfs: do not output error message if a qgroup has been already cleaned up
	btrfs: fix use-after-free when attempting to join an aborted transaction
	arm64/mm: Ensure adequate HUGE_MAX_HSTATE
	exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case
	s390/stackleak: Use exrl instead of ex in __stackleak_poison()
	btrfs: fix data race when accessing the inode's disk_i_size at btrfs_drop_extents()
	btrfs: convert BUG_ON in btrfs_reloc_cow_block() to proper error handling
	sched: Don't try to catch up excess steal time.
	x86: Convert unreachable() to BUG()
	locking/ww_mutex/test: Use swap() macro
	lockdep: Fix upper limit for LOCKDEP_*_BITS configs
	x86/amd_nb: Restrict init function to AMD-based systems
	drm/virtio: New fence for every plane update
	drm: Add panel backlight quirks
	drm: panel-backlight-quirks: Add Framework 13 matte panel
	drm: panel-backlight-quirks: Add Framework 13 glossy and 2.8k panels
	nvkm/gsp: correctly advance the read pointer of GSP message queue
	nvkm: correctly calculate the available space of the GSP cmdq buffer
	drm/tests: hdmi: handle empty modes in find_preferred_mode()
	drm/tests: hdmi: return meaningful value from set_connector_edid()
	drm/amd/display: Populate chroma prefetch parameters, DET buffer fix
	drm/amd/display: Overwriting dualDPP UBF values before usage
	printk: Fix signed integer overflow when defining LOG_BUF_LEN_MAX
	drm/connector: add mutex to protect ELD from concurrent access
	drm/bridge: anx7625: use eld_mutex to protect access to connector->eld
	drm/bridge: ite-it66121: use eld_mutex to protect access to connector->eld
	drm/amd/display: use eld_mutex to protect access to connector->eld
	drm/exynos: hdmi: use eld_mutex to protect access to connector->eld
	drm/radeon: use eld_mutex to protect access to connector->eld
	drm/sti: hdmi: use eld_mutex to protect access to connector->eld
	drm/vc4: hdmi: use eld_mutex to protect access to connector->eld
	drm/amd/display: Fix Mode Cutoff in DSC Passthrough to DP2.1 Monitor
	drm/amdgpu: Don't enable sdma 4.4.5 CTXEMPTY interrupt
	drm/amdkfd: Queue interrupt work to different CPU
	drm/bridge: it6505: Change definition MAX_HDCP_DOWN_STREAM_COUNT
	drm/bridge: it6505: fix HDCP Bstatus check
	drm/bridge: it6505: fix HDCP encryption when R0 ready
	drm/bridge: it6505: fix HDCP CTS compare V matching
	drm/bridge: it6505: fix HDCP CTS KSV list wait timer
	safesetid: check size of policy writes
	drm/amd/display: Increase sanitizer frame larger than limit when compile testing with clang
	drm/amd/display: Limit Scaling Ratio on DCN3.01
	ring-buffer: Make reading page consistent with the code logic
	wifi: rtw89: add crystal_cap check to avoid setting as overflow value
	tun: fix group permission check
	mmc: core: Respect quirk_max_rate for non-UHS SDIO card
	mmc: sdhci-esdhc-imx: enable 'SDHCI_QUIRK_NO_LED' quirk for S32G
	wifi: brcmsmac: add gain range check to wlc_phy_iqcal_gainparams_nphy()
	tomoyo: don't emit warning in tomoyo_write_control()
	mfd: lpc_ich: Add another Gemini Lake ISA bridge PCI device-id
	wifi: rtw88: add __packed attribute to efuse layout struct
	clk: qcom: Make GCC_8150 depend on QCOM_GDSC
	HID: multitouch: Add quirk for Hantick 5288 touchpad
	HID: Wacom: Add PCI Wacom device support
	net/mlx5: use do_aux_work for PHC overflow checks
	wifi: brcmfmac: Check the return value of of_property_read_string_index()
	wifi: iwlwifi: pcie: Add support for new device ids
	wifi: iwlwifi: avoid memory leak
	i2c: Force ELAN06FA touchpad I2C bus freq to 100KHz
	APEI: GHES: Have GHES honor the panic= setting
	Bluetooth: btusb: Add new VID/PID 13d3/3610 for MT7922
	Bluetooth: btusb: Add new VID/PID 13d3/3628 for MT7925
	Bluetooth: MGMT: Fix slab-use-after-free Read in mgmt_remove_adv_monitor_sync
	net: wwan: iosm: Fix hibernation by re-binding the driver around it
	HID: hid-asus: Disable OOBE mode on the ProArt P16
	mmc: sdhci-msm: Correctly set the load for the regulator
	octeon_ep: update tx/rx stats locally for persistence
	octeon_ep_vf: update tx/rx stats locally for persistence
	tipc: re-order conditions in tipc_crypto_key_rcv()
	selftests/net/ipsec: Fix Null pointer dereference in rtattr_pack()
	net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path
	ASoC: SOF: Intel: hda-dai: Ensure DAI widget is valid during params
	x86/kexec: Allocate PGD for x86_64 transition page tables separately
	ASoC: Intel: sof_sdw: Correct quirk for Lenovo Yoga Slim 7
	iommu/arm-smmu-qcom: add sdm670 adreno iommu compatible
	iommu/arm-smmu-v3: Clean up more on probe failure
	platform/x86: int3472: Check for adev == NULL
	platform/x86: acer-wmi: Add support for Acer PH14-51
	ASoC: soc-pcm: don't use soc_pcm_ret() on .prepare callback
	platform/x86: acer-wmi: Add support for Acer Predator PH16-72
	ASoC: amd: Add ACPI dependency to fix build error
	Input: allocate keycode for phone linking
	platform/x86: acer-wmi: add support for Acer Nitro AN515-58
	platform/x86: acer-wmi: Ignore AC events
	KVM: PPC: e500: Mark "struct page" dirty in kvmppc_e500_shadow_map()
	KVM: PPC: e500: Mark "struct page" pfn accessed before dropping mmu_lock
	KVM: PPC: e500: Use __kvm_faultin_pfn() to handle page faults
	KVM: e500: always restore irqs
	drm/amdgpu: Fix Circular Locking Dependency in AMDGPU GFX Isolation
	xfs: report realtime block quota limits on realtime directories
	xfs: don't over-report free space or inodes in statvfs
	tty: xilinx_uartps: split sysrq handling
	tty: Permit some TIOCL_SETSEL modes without CAP_SYS_ADMIN
	platform/x86: serdev_helpers: Check for serial_ctrl_uid == NULL
	sched/fair: Fix inaccurate h_nr_runnable accounting with delayed dequeue
	nvme: handle connectivity loss in nvme_set_queue_count
	firmware: iscsi_ibft: fix ISCSI_IBFT Kconfig entry
	gpu: drm_dp_cec: fix broken CEC adapter properties check
	ice: put Rx buffers after being done with current frame
	ice: gather page_count()'s of each frag right before XDP prog call
	ice: stop storing XDP verdict within ice_rx_buf
	nvme: make nvme_tls_attrs_group static
	nvme-fc: use ctrl state getter
	net: bcmgenet: Correct overlaying of PHY and MAC Wake-on-LAN
	ice: Add check for devm_kzalloc()
	vmxnet3: Fix tx queue race condition with XDP
	tg3: Disable tg3 PCIe AER on system reboot
	udp: gso: do not drop small packets when PMTU reduces
	drm/i915/dp: fix the Adaptive sync Operation mode for SDP
	ethtool: rss: fix hiding unsupported fields in dumps
	rxrpc: Fix the rxrpc_connection attend queue handling
	gpio: pca953x: Improve interrupt support
	net: atlantic: fix warning during hot unplug
	net: rose: lock the socket in rose_bind()
	gpio: sim: lock hog configfs items if present
	x86/xen: fix xen_hypercall_hvm() to not clobber %rbx
	x86/xen: add FRAME_END to xen_hypercall_hvm()
	ACPI: property: Fix return value for nval == 0 in acpi_data_prop_read()
	pfifo_tail_enqueue: Drop new packet when sch->limit == 0
	netem: Update sch->q.qlen before qdisc_tree_reduce_backlog()
	tun: revert fix group permission check
	net: sched: Fix truncation of offloaded action statistics
	rxrpc: Fix call state set to not include the SERVER_SECURING state
	cpufreq: fix using cpufreq-dt as module
	cpufreq: s3c64xx: Fix compilation warning
	leds: lp8860: Write full EEPROM, not only half of it
	ALSA: hda/realtek: Enable Mute LED on HP Laptop 14s-fq1xxx
	cifs: Remove intermediate object of failed create SFU call
	drm/modeset: Handle tiled displays in pan_display_atomic.
	drm/client: Handle tiled displays better
	smb: client: fix order of arguments of tracepoints
	smb: client: change lease epoch type from unsigned int to __u16
	md: reintroduce md-linear
	s390/futex: Fix FUTEX_OP_ANDN implementation
	arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
	m68k: vga: Fix I/O defines
	fs/proc: do_task_stat: Fix ESP not readable during coredump
	binfmt_flat: Fix integer overflow bug on 32 bit systems
	accel/ivpu: Fix Qemu crash when running in passthrough
	arm64/kvm: Configure HYP TCR.PS/DS based on host stage1
	arm64/mm: Override PARange for !LPA2 and use it consistently
	arm64/sme: Move storage of reg_smidr to __cpuinfo_store_cpu()
	arm64/mm: Reduce PA space to 48 bits when LPA2 is not enabled
	KVM: arm64: timer: Always evaluate the need for a soft timer
	drm/rockchip: cdn-dp: Use drm_connector_helper_hpd_irq_event()
	arm64: dts: rockchip: increase gmac rx_delay on rk3399-puma
	remoteproc: omap: Handle ARM dma_iommu_mapping
	KVM: Explicitly verify target vCPU is online in kvm_get_vcpu()
	kvm: defer huge page recovery vhost task to later
	KVM: s390: vsie: fix some corner-cases when grabbing vsie pages
	ksmbd: fix integer overflows on 32 bit systems
	drm/amd/display: Optimize cursor position updates
	drm/amd/pm: Mark MM activity as unsupported
	drm/amd/amdgpu: change the config of cgcg on gfx12
	drm/amdkfd: only flush the validate MES contex
	drm/amdkfd: Block per-queue reset when halt_if_hws_hang=1
	Revert "drm/amd/display: Use HW lock mgr for PSR1"
	drm/i915/guc: Debug print LRC state entries only if the context is pinned
	drm/i915: Fix page cleanup on DMA remap failure
	drm/komeda: Add check for komeda_get_layer_fourcc_list()
	drm/xe/devcoredump: Move exec queue snapshot to Contexts section
	drm/i915/dp: Iterate DSC BPP from high to low on all platforms
	drm/i915: Drop 64bpp YUV formats from ICL+ SDR planes
	drm/amdgpu: add a BO metadata flag to disable write compression for Vulkan
	drm/amd/display: Fix seamless boot sequence
	Bluetooth: L2CAP: handle NULL sock pointer in l2cap_sock_alloc
	Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection
	KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y
	clk: sunxi-ng: a100: enable MMC clock reparenting
	clk: mmp2: call pm_genpd_init() only after genpd.name is set
	media: i2c: ds90ub960: Fix UB9702 refclk register access
	clk: clk-loongson2: Fix the number count of clk provider
	clk: qcom: clk-alpha-pll: fix alpha mode configuration
	clk: qcom: gcc-sm8550: Do not turn off PCIe GDSCs during gdsc_disable()
	clk: qcom: gcc-sm8650: Do not turn off PCIe GDSCs during gdsc_disable()
	clk: qcom: gcc-sm6350: Add missing parent_map for two clocks
	clk: qcom: dispcc-sm6350: Add missing parent_map for a clock
	clk: qcom: gcc-mdm9607: Fix cmd_rcgr offset for blsp1_uart6 rcg
	clk: qcom: clk-rpmh: prevent integer overflow in recalc_rate
	clk: mediatek: mt2701-vdec: fix conversion to mtk_clk_simple_probe
	clk: mediatek: mt2701-aud: fix conversion to mtk_clk_simple_probe
	clk: mediatek: mt2701-bdp: add missing dummy clk
	clk: mediatek: mt2701-img: add missing dummy clk
	clk: mediatek: mt2701-mm: add missing dummy clk
	seccomp: passthrough uretprobe systemcall without filtering
	blk-cgroup: Fix class @block_class's subsystem refcount leakage
	efi: libstub: Use '-std=gnu11' to fix build with GCC 15
	perf bench: Fix undefined behavior in cmpworker()
	scsi: ufs: core: Fix the HIGH/LOW_TEMP Bit Definitions
	of: Correct child specifier used as input of the 2nd nexus node
	of: address: Fix empty resource handling in __of_address_resource_bounds()
	of: Fix of_find_node_opts_by_path() handling of alias+path+options
	of: reserved-memory: Fix using wrong number of cells to get property 'alignment'
	Input: bbnsm_pwrkey - add remove hook
	HID: hid-sensor-hub: don't use stale platform-data on remove
	ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg()
	atomic64: Use arch_spin_locks instead of raw_spin_locks
	wifi: rtlwifi: rtl8821ae: Fix media status report
	wifi: brcmfmac: fix NULL pointer dereference in brcmf_txfinalize()
	wifi: mt76: mt7921u: Add VID/PID for TP-Link TXE50UH
	wifi: rtw88: sdio: Fix disconnection after beacon loss
	wifi: mt76: mt7915: add module param to select 5 GHz or 6 GHz on MT7916
	wifi: rtw88: 8703b: Fix RX/TX issues
	usb: gadget: f_tcm: Translate error to sense
	usb: gadget: f_tcm: Decrement command ref count on cleanup
	usb: gadget: f_tcm: ep_autoconfig with fullspeed endpoint
	usb: gadget: f_tcm: Don't prepare BOT write request twice
	usbnet: ipheth: fix possible overflow in DPE length check
	usbnet: ipheth: use static NDP16 location in URB
	usbnet: ipheth: check that DPE points past NCM header
	usbnet: ipheth: refactor NCM datagram loop
	usbnet: ipheth: break up NCM header size computation
	usbnet: ipheth: fix DPE OoB read
	usbnet: ipheth: document scope of NCM implementation
	arm64: dts: qcom: x1e80100-asus-vivobook-s15: Fix USB QMP PHY supplies
	arm64: dts: qcom: x1e80100-qcp: Fix USB QMP PHY supplies
	arm64: dts: qcom: x1e78100-lenovo-thinkpad-t14s: Fix USB QMP PHY supplies
	arm64: dts: qcom: x1e80100-crd: Fix USB QMP PHY supplies
	arm64: dts: qcom: x1e80100-lenovo-yoga-slim7x: Fix USB QMP PHY supplies
	arm64: dts: qcom: x1e80100-microsoft-romulus: Fix USB QMP PHY supplies
	arm64: dts: qcom: x1e80100: Fix usb_2 controller interrupts
	ASoC: acp: Support microphone from Lenovo Go S
	soc: qcom: socinfo: Avoid out of bounds read of serial number
	serial: sh-sci: Drop __initdata macro for port_cfg
	serial: sh-sci: Do not probe the serial port if its slot in sci_ports[] is in use
	MIPS: Loongson64: remove ROM Size unit in boardinfo
	LoongArch: Extend the maximum number of watchpoints
	powerpc/pseries/eeh: Fix get PE state translation
	dm-crypt: don't update io->sector after kcryptd_crypt_write_io_submit()
	dm-crypt: track tag_offset in convert_context
	mips/math-emu: fix emulation of the prefx instruction
	MIPS: pci-legacy: Override pci_address_to_pio
	Revert "MIPS: csrc-r4k: Select HAVE_UNSTABLE_SCHED_CLOCK if SMP && 64BIT"
	block: don't revert iter for -EIOCBQUEUED
	Revert "media: uvcvideo: Require entities to have a non-zero unique ID"
	firmware: qcom: scm: Fix missing read barrier in qcom_scm_is_available()
	firmware: qcom: scm: Fix missing read barrier in qcom_scm_get_tzmem_pool()
	ALSA: hda/realtek: Enable headset mic on Positivo C6400
	ALSA: hda/realtek: Fix quirk matching for Legion Pro 7
	ALSA: hda: Fix headset detection failure due to unstable sort
	arm64: tegra: Fix Tegra234 PCIe interrupt-map
	s390/pci: Fix SR-IOV for PFs initially in standby
	PCI: Avoid putting some root ports into D3 on TUXEDO Sirius Gen1
	PCI: endpoint: Finish virtual EP removal in pci_epf_remove_vepf()
	PCI: dwc: ep: Write BAR_MASK before iATU registers in pci_epc_set_bar()
	PCI: dwc: ep: Prevent changing BAR size/flags in pci_epc_set_bar()
	nvme-pci: Add TUXEDO InfinityFlex to Samsung sleep quirk
	nvme-pci: Add TUXEDO IBP Gen9 to Samsung sleep quirk
	KVM: x86/mmu: Ensure NX huge page recovery thread is alive before waking
	scsi: st: Don't set pos_unknown just after device recognition
	scsi: qla2xxx: Move FCE Trace buffer allocation to user control
	scsi: ufs: qcom: Fix crypto key eviction
	scsi: ufs: core: Fix use-after free in init error and remove paths
	scsi: storvsc: Set correct data length for sending SCSI command without payload
	scsi: core: Do not retry I/Os during depopulation
	kbuild: Move -Wenum-enum-conversion to W=2
	rust: init: use explicit ABI to clean warning in future compilers
	x86: rust: set rustc-abi=x86-softfloat on rustc>=1.86.0
	x86/acpi: Fix LAPIC/x2APIC parsing order
	x86/boot: Use '-std=gnu11' to fix build with GCC 15
	ubi: Add a check for ubi_num
	ARM: dts: dra7: Add bus_dma_limit for l4 cfg bus
	ARM: dts: ti/omap: gta04: fix pm issues caused by spi module
	arm64: dts: mediatek: mt8183: Disable DPI display output by default
	arm64: dts: qcom: sdx75: Fix MPSS memory length
	arm64: dts: qcom: x1e80100: Fix ADSP memory base and length
	arm64: dts: qcom: x1e80100: Fix CDSP memory length
	arm64: dts: qcom: sm6115: Fix MPSS memory length
	arm64: dts: qcom: sm6115: Fix CDSP memory length
	arm64: dts: qcom: sm6115: Fix ADSP memory base and length
	arm64: dts: qcom: sm6350: Fix ADSP memory length
	arm64: dts: qcom: sm6350: Fix MPSS memory length
	arm64: dts: qcom: sm6350: Fix uart1 interconnect path
	arm64: dts: qcom: sm6375: Fix ADSP memory length
	arm64: dts: qcom: sm6375: Fix CDSP memory base and length
	arm64: dts: qcom: sm6375: Fix MPSS memory base and length
	arm64: dts: qcom: sm8350: Fix ADSP memory base and length
	arm64: dts: qcom: sm8350: Fix CDSP memory base and length
	arm64: dts: qcom: sm8350: Fix MPSS memory length
	arm64: dts: qcom: sm8450: Fix ADSP memory base and length
	arm64: dts: qcom: sm8450: Fix CDSP memory length
	arm64: dts: qcom: sm8450: Fix MPSS memory length
	arm64: dts: qcom: sm8550: Fix ADSP memory base and length
	arm64: dts: qcom: sm8550: Fix CDSP memory length
	arm64: dts: qcom: sm8550: Fix MPSS memory length
	arm64: dts: qcom: sm8650: Fix ADSP memory base and length
	arm64: dts: qcom: sm8650: Fix CDSP memory length
	arm64: dts: qcom: sm8650: Fix MPSS memory length
	arm64: dts: qcom: sm8550: correct MDSS interconnects
	arm64: dts: qcom: sm8650: correct MDSS interconnects
	crypto: qce - fix priority to be less than ARMv8 CE
	arm64: tegra: Fix typo in Tegra234 dce-fabric compatible
	arm64: tegra: Disable Tegra234 sce-fabric node
	parisc: Temporarily disable jump label support
	pwm: microchip-core: fix incorrect comparison with max period
	xfs: don't call remap_verify_area with sb write protection held
	xfs: Propagate errors from xfs_reflink_cancel_cow_range in xfs_dax_write_iomap_end
	xfs: Add error handling for xfs_reflink_cancel_cow_range
	accel/ivpu: Clear runtime_error after pm_runtime_resume_and_get() fails
	ACPI: PRM: Remove unnecessary strict handler address checks
	tpm: Change to kvalloc() in eventlog/acpi.c
	rv: Reset per-task monitors also for idle tasks
	hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING
	iommufd: Fix struct iommu_hwpt_pgfault init and padding
	kfence: skip __GFP_THISNODE allocations on NUMA systems
	media: ccs: Clean up parsed CCS static data on parse failure
	mm/hugetlb: fix avoid_reserve to allow taking folio from subpool
	iio: light: as73211: fix channel handling in only-color triggered buffer
	iommu/tegra241-cmdqv: Read SMMU IDR1.CMDQS instead of hardcoding
	iommufd/fault: Destroy response and mutex in iommufd_fault_destroy()
	iommufd/fault: Use a separate spinlock to protect fault->deliver list
	soc: samsung: exynos-pmu: Fix uninitialized ret in tensor_set_bits_atomic()
	soc: mediatek: mtk-devapc: Fix leaking IO map on error paths
	soc: mediatek: mtk-devapc: Fix leaking IO map on driver remove
	soc: qcom: llcc: Enable LLCC_WRCACHE at boot on X1
	soc: qcom: smem_state: fix missing of_node_put in error path
	media: mmp: Bring back registration of the device
	media: mc: fix endpoint iteration
	media: nuvoton: Fix an error check in npcm_video_ece_init()
	media: imx296: Add standby delay during probe
	media: intel/ipu6: remove cpu latency qos request on error
	media: ov5640: fix get_light_freq on auto
	media: stm32: dcmipp: correct dma_set_mask_and_coherent mask value
	media: ccs: Fix CCS static data parsing for large block sizes
	media: ccs: Fix cleanup order in ccs_probe()
	media: i2c: ds90ub9x3: Fix extra fwnode_handle_put()
	media: i2c: ds90ub960: Fix use of non-existing registers on UB9702
	media: i2c: ds90ub960: Fix UB9702 VC map
	media: i2c: ds90ub960: Fix logging SP & EQ status only for UB9702
	media: uvcvideo: Fix crash during unbind if gpio unit is in use
	media: uvcvideo: Fix event flags in uvc_ctrl_send_events
	media: uvcvideo: Support partial control reads
	media: uvcvideo: Only save async fh if success
	media: uvcvideo: Remove redundant NULL assignment
	media: uvcvideo: Remove dangling pointers
	mm: kmemleak: fix upper boundary check for physical address objects
	mm: gup: fix infinite loop within __get_longterm_locked
	mm/vmscan: accumulate nr_demoted for accurate demotion statistics
	mm/hugetlb: fix hugepage allocation for interleaved memory nodes
	mm/compaction: fix UBSAN shift-out-of-bounds warning
	ata: libata-sff: Ensure that we cannot write outside the allocated buffer
	irqchip/irq-mvebu-icu: Fix access to msi_data from irq_domain::host_data
	crypto: qce - fix goto jump in error path
	crypto: qce - unregister previously registered algos in error path
	ceph: fix memory leak in ceph_mds_auth_match()
	nvmem: qcom-spmi-sdam: Set size in struct nvmem_config
	nvmem: core: improve range check for nvmem_cell_write()
	nvmem: imx-ocotp-ele: simplify read beyond device check
	nvmem: imx-ocotp-ele: fix MAC address byte order
	nvmem: imx-ocotp-ele: fix reading from non zero offset
	nvmem: imx-ocotp-ele: set word length to 1
	io_uring: fix multishots with selected buffers
	io_uring/net: don't retry connect operation on EPOLLERR
	vfio/platform: check the bounds of read/write syscalls
	selftests: mptcp: connect: -f: no reconnect
	pnfs/flexfiles: retry getting layout segment for reads
	ocfs2: fix incorrect CPU endianness conversion causing mount failure
	ocfs2: handle a symlink read error correctly
	nilfs2: fix possible int overflows in nilfs_fiemap()
	nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it
	NFSD: Encode COMPOUND operation status on page boundaries
	mailbox: tegra-hsp: Clear mailbox before using message
	mailbox: zynqmp: Remove invalid __percpu annotation in zynqmp_ipi_probe()
	NFC: nci: Add bounds checking in nci_hci_create_pipe()
	fgraph: Fix set_graph_notrace with setting TRACE_GRAPH_NOTRACE_BIT
	i3c: master: Fix missing 'ret' assignment in set_speed()
	irqchip/apple-aic: Only handle PMC interrupt as FIQ when configured so
	mtd: onenand: Fix uninitialized retlen in do_otp_read()
	misc: misc_minor_alloc to use ida for all dynamic/misc dynamic minors
	misc: fastrpc: Deregister device nodes properly in error scenarios
	misc: fastrpc: Fix registered buffer page address
	misc: fastrpc: Fix copy buffer page size
	net/ncsi: wait for the last response to Deselect Package before configuring channel
	net: phy: c45-tjaxx: add delay between MDIO write and read in soft_reset
	maple_tree: simplify split calculation
	scripts/gdb: fix aarch64 userspace detection in get_current_task
	tracing/osnoise: Fix resetting of tracepoints
	rtla/osnoise: Distinguish missing workload option
	rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads
	rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads
	rtla: Add trace_instance_stop
	rtla/timerlat_hist: Stop timerlat tracer on signal
	rtla/timerlat_top: Stop timerlat tracer on signal
	pinctrl: samsung: fix fwnode refcount cleanup if platform_get_irq_optional() fails
	pinctrl: renesas: rzg2l: Fix PFC_MASK for RZ/V2H and RZ/G3E
	ptp: Ensure info->enable callback is always set
	RDMA/mlx5: Fix a race for an ODP MR which leads to CQE with error
	rtc: zynqmp: Fix optional clock name property
	timers/migration: Fix off-by-one root mis-connection
	s390/fpu: Add fpc exception handler / remove fixup section again
	MIPS: ftrace: Declare ftrace_get_parent_ra_addr() as static
	xfs: avoid nested calls to __xfs_trans_commit
	xfs: don't lose solo superblock counter update transactions
	xfs: separate dquot buffer reads from xfs_dqflush
	xfs: clean up log item accesses in xfs_qm_dqflush{,_done}
	xfs: attach dquot buffer to dquot log item buffer
	xfs: convert quotacheck to attach dquot buffers
	xfs: release the dquot buf outside of qli_lock
	xfs: lock dquot buffer before detaching dquot from b_li_list
	xfs: fix mount hang during primary superblock recovery failure
	spi: atmel-quadspi: Create `atmel_qspi_ops` to support newer SoC families
	spi: atmel-qspi: Memory barriers after memory-mapped I/O
	Revert "btrfs: avoid monopolizing a core when activating a swap file"
	btrfs: avoid monopolizing a core when activating a swap file
	mptcp: prevent excessive coalescing on receive
	x86/mm: Convert unreachable() to BUG()
	md/md-linear: Fix a NULL vs IS_ERR() bug in linear_add()
	md: Fix linear_set_limits()
	Revert "selftests/sched_ext: fix build after renames in sched_ext API"
	Revert "drm/amd/display: Fix green screen issue after suspend"
	drm/xe: Fix and re-enable xe_print_blob_ascii85()
	fs: prepend statmount.mnt_opts string with security_sb_mnt_opts()
	fs: fix adding security options to statmount.mnt_opt
	statmount: let unset strings be empty
	arm64: dts: rockchip: add reset-names for combphy on rk3568
	ocfs2: check dir i_size in ocfs2_find_entry
	Linux 6.12.14

Change-Id: Id4141bfc8ee9a6320b056561aa528228e7a3f1df
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2025-02-24 00:02:42 -08:00
Frederic Weisbecker
e456a88bdd hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING
commit 53dac345395c0d2493cbc2f4c85fe38aef5b63f5 upstream.

hrtimers are migrated away from the dying CPU to any online target at
the CPUHP_AP_HRTIMERS_DYING stage in order not to delay bandwidth timers
handling tasks involved in the CPU hotplug forward progress.

However wakeups can still be performed by the outgoing CPU after
CPUHP_AP_HRTIMERS_DYING. Those can result again in bandwidth timers being
armed. Depending on several considerations (crystal ball power management
based election, earliest timer already enqueued, timer migration enabled or
not), the target may eventually be the current CPU even if offline. If that
happens, the timer is eventually ignored.

The most notable example is RCU which had to deal with each and every one of
those wake-ups by deferring them to an online CPU, along with related
workarounds:

_ e787644caf (rcu: Defer RCU kthreads wakeup when CPU is dying)
_ 9139f93209 (rcu/nocb: Fix RT throttling hrtimer armed from offline CPU)
_ f7345ccc62 (rcu/nocb: Fix rcuog wake-up from offline softirq)

The problem isn't confined to RCU though as the stop machine kthread
(which runs CPUHP_AP_HRTIMERS_DYING) reports its completion at the end
of its work through cpu_stop_signal_done() and performs a wake up that
eventually arms the deadline server timer:

   WARNING: CPU: 94 PID: 588 at kernel/time/hrtimer.c:1086 hrtimer_start_range_ns+0x289/0x2d0
   CPU: 94 UID: 0 PID: 588 Comm: migration/94 Not tainted
   Stopper: multi_cpu_stop+0x0/0x120 <- stop_machine_cpuslocked+0x66/0xc0
   RIP: 0010:hrtimer_start_range_ns+0x289/0x2d0
   Call Trace:
   <TASK>
     start_dl_timer
     enqueue_dl_entity
     dl_server_start
     enqueue_task_fair
     enqueue_task
     ttwu_do_activate
     try_to_wake_up
     complete
     cpu_stopper_thread

Instead of providing yet another bandaid to work around the situation, fix
it in the hrtimers infrastructure instead: always migrate away a timer to
an online target whenever it is enqueued from an offline CPU.

This will also allow reverting all of the above disgraceful RCU hacks.
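
The rule described above (never leave a timer on an offline CPU's base) can be sketched as a small userspace model. This is not the kernel implementation: `NCPUS`, `struct hrtimer_base_model` and `pick_timer_target()` are hypothetical names made up for illustration, and the real code applies further target-selection criteria that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 8 /* hypothetical fixed CPU count for this sketch */

/* Hypothetical stand-in for a per-CPU hrtimer base. */
struct hrtimer_base_model {
	bool online; /* false once the CPU has passed its "dying" stage */
};

/*
 * Pick the CPU whose base should hold a newly enqueued timer.
 * If the enqueuing CPU is offline, the timer is never kept local:
 * the first online CPU is used instead. Returns -1 if no CPU is online.
 */
static int pick_timer_target(const struct hrtimer_base_model bases[NCPUS],
			     int this_cpu)
{
	assert(this_cpu >= 0 && this_cpu < NCPUS);

	if (bases[this_cpu].online)
		return this_cpu;

	for (int cpu = 0; cpu < NCPUS; cpu++)
		if (bases[cpu].online)
			return cpu;

	return -1;
}
```

In this model a wakeup issued by a dying CPU that arms a timer lands on some online base, so the timer cannot be silently ignored.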

Fixes: 5c0930ccaa ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Reported-by: Vlad Poenaru <vlad.wing@gmail.com>
Reported-by: Usama Arif <usamaarif642@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/all/20250117232433.24027-1-frederic@kernel.org
Closes: 20241213203739.1519801-1-usamaarif642@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17 10:05:31 +01:00
Greg Kroah-Hartman
6c86d6f4fe Merge 6.12.11 into android16-6.12
GKI (arm64) relevant 28 out of 122 changes, affecting 34 files +368/-94
  cccd51dd22 bpf: Fix bpf_sk_select_reuseport() memory leak [1 file, +18/-12]
  1654578a3b cpuidle: teo: Update documentation after previous changes [1 file, +48/-43]
  7a4fd3df85 net: make page_pool_ref_netmem work with net iovs [1 file, +1/-1]
  2b78cab481 netdev: avoid CFI problems with sock priv helpers [2 files, +25/-5]
  e19f31169f i2c: core: fix reference leak in i2c_register_adapter() [1 file, +1/-0]
  b856d2c138 mac802154: check local interfaces before deleting sdata list [1 file, +4/-0]
  7c37879b76 fs: fix missing declaration of init_files [1 file, +1/-0]
  3d46037625 netfs: Fix non-contiguous donation between completed reads [1 file, +5/-4]
  ac216ffa69 scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers [1 file, +6/-3]
  402ce16421 iomap: avoid avoid truncating 64-bit offset to 32 bits [1 file, +1/-1]
  621f95fa0b poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() [1 file, +9/-1]
  e98394f7bc sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE [1 file, +5/-1]
  902ef8f16d zram: fix potential UAF of zram table [1 file, +1/-0]
  6771e1279d vsock/bpf: return early if transport is not assigned [1 file, +9/-0]
  677579b641 vsock/virtio: discard packets if the transport changes [1 file, +5/-2]
  450aa12993 vsock/virtio: cancel close work in the destructor [1 file, +21/-8]
  01c178d690 vsock: reset socket state when de-assigning the transport [1 file, +9/-0]
  c23d1d4f8e vsock: prevent null-ptr-deref in vsock_*[has_data|has_space] [1 file, +9/-0]
  280f1fb89a filemap: avoid truncating 64-bit offset to 32 bits [1 file, +1/-1]
  310ac886d6 mm: clear uffd-wp PTE/PMD state on mremap() [4 files, +68/-2]
  c78b04977d mm: vmscan : pgdemote vmstat is not getting updated when MGLRU is enabled. [1 file, +3/-0]
  e96a2838d8 tracing: gfp: Fix the GFP enum values shown for user space tracing tools [1 file, +63/-0]
  115719a953 irqchip: Plug a OF node reference leak in platform_irqchip_probe() [1 file, +1/-3]
  44feb76129 irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly [1 file, +1/-1]
  93955a7788 irqchip/gic-v3-its: Don't enable interrupts in its_irq_set_vcpu_affinity() [1 file, +1/-1]
  38492f6ee8 hrtimers: Handle CPU state correctly on hotplug [3 files, +12/-2]
  12ead225b7 timers/migration: Fix another race between hotplug and idle entry/exit [1 file, +28/-1]
  6e641d499b timers/migration: Enforce group initialization visibility to tree walkers [1 file, +12/-2]

Changes in 6.12.11
	efi/zboot: Limit compression options to GZIP and ZSTD
	net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()
	bpf: Fix bpf_sk_select_reuseport() memory leak
	eth: bnxt: always recalculate features after XDP clearing, fix null-deref
	net: ravb: Fix max TX frame size for RZ/V2M
	openvswitch: fix lockup on tx to unregistering netdev with carrier
	pktgen: Avoid out-of-bounds access in get_imix_entries
	ice: Fix E825 initialization
	ice: Fix quad registers read on E825
	ice: Fix ETH56G FC-FEC Rx offset value
	ice: Introduce ice_get_phy_model() wrapper
	ice: Add ice_get_ctrl_ptp() wrapper to simplify the code
	ice: Use ice_adapter for PTP shared data instead of auxdev
	ice: Add correct PHY lane assignment
	cpuidle: teo: Update documentation after previous changes
	btrfs: add the missing error handling inside get_canonical_dev_path
	gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().
	gtp: Destroy device along with udp socket's netns dismantle.
	pfcp: Destroy device along with udp socket's netns dismantle.
	cpufreq: Move endif to the end of Kconfig file
	nfp: bpf: prevent integer overflow in nfp_bpf_event_output()
	net: xilinx: axienet: Fix IRQ coalescing packet count overflow
	net: fec: handle page_pool_dev_alloc_pages error
	net: make page_pool_ref_netmem work with net iovs
	net/mlx5: Fix RDMA TX steering prio
	net/mlx5: Fix a lockdep warning as part of the write combining test
	net/mlx5: SF, Fix add port error handling
	net/mlx5: Clear port select structure when fail to create
	net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel
	net/mlx5e: Rely on reqid in IPsec tunnel mode
	net/mlx5e: Always start IPsec sequence number from 1
	netdev: avoid CFI problems with sock priv helpers
	drm/tests: helpers: Fix compiler warning
	drm/vmwgfx: Unreserve BO on error
	drm/vmwgfx: Add new keep_resv BO param
	drm/v3d: Ensure job pointer is set to NULL after job completion
	reset: rzg2l-usbphy-ctrl: Assign proper of node to the allocated device
	soc: ti: pruss: Fix pruss APIs
	i2c: core: fix reference leak in i2c_register_adapter()
	platform/x86: dell-uart-backlight: fix serdev race
	platform/x86: lenovo-yoga-tab2-pro-1380-fastcharger: fix serdev race
	hwmon: (tmp513) Fix division of negative numbers
	Revert "mtd: spi-nor: core: replace dummy buswidth from addr to data"
	i2c: mux: demux-pinctrl: check initial mux selection, too
	i2c: rcar: fix NACK handling when being a target
	i2c: testunit: on errors, repeat NACK until STOP
	hwmon: (ltc2991) Fix mixed signed/unsigned in DIV_ROUND_CLOSEST
	smb: client: fix double free of TCP_Server_Info::hostname
	mac802154: check local interfaces before deleting sdata list
	hfs: Sanity check the root record
	fs/qnx6: Fix building with GCC 15
	fs: fix missing declaration of init_files
	kheaders: Ignore silly-rename files
	netfs: Fix non-contiguous donation between completed reads
	cachefiles: Parse the "secctx" immediately
	scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers
	gpio: virtuser: lock up configfs that an instantiated device depends on
	gpio: sim: lock up configfs that an instantiated device depends on
	selftests: tc-testing: reduce rshift value
	platform/x86/intel: power-domains: Add Clearwater Forest support
	platform/x86: ISST: Add Clearwater Forest to support list
	ACPI: resource: acpi_dev_irq_override(): Check DMI match last
	sched_ext: keep running prev when prev->scx.slice != 0
	iomap: avoid avoid truncating 64-bit offset to 32 bits
	afs: Fix merge preference rule failure condition
	poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
	selftests/sched_ext: fix build after renames in sched_ext API
	scx: Fix maximal BPF selftest prog
	RDMA/bnxt_re: Fix to export port num to ib_query_qp
	sched_ext: Fix dsq_local_on selftest
	nvmet: propagate npwg topology
	sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE
	x86/asm: Make serialize() always_inline
	ALSA: hda/realtek: Add support for Ayaneo System using CS35L41 HDA
	ALSA: hda/realtek: fixup ASUS GA605W
	ALSA: hda/realtek: fixup ASUS H7606W
	zram: fix potential UAF of zram table
	i2c: atr: Fix client detach
	mptcp: be sure to send ack when mptcp-level window re-opens
	mptcp: fix spurious wake-up on under memory pressure
	selftests: mptcp: avoid spurious errors on disconnect
	net: ethernet: xgbe: re-add aneg to supported features in PHY quirks
	vsock/bpf: return early if transport is not assigned
	vsock/virtio: discard packets if the transport changes
	vsock/virtio: cancel close work in the destructor
	vsock: reset socket state when de-assigning the transport
	vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]
	nouveau/fence: handle cross device fences properly
	drm/nouveau/disp: Fix missing backlight control on Macbook 5,1
	net/ncsi: fix locking in Get MAC Address handling
	filemap: avoid truncating 64-bit offset to 32 bits
	fs/proc: fix softlockup in __read_vmcore (part 2)
	gpio: xilinx: Convert gpio_lock to raw spinlock
	tools: fix atomic_set() definition to set the value correctly
	pmdomain: imx8mp-blk-ctrl: add missing loop break condition
	mm/kmemleak: fix percpu memory leak detection failure
	selftests/mm: set allocated memory to non-zero content in cow test
	drm/amd/display: Do not elevate mem_type change to full update
	mm: clear uffd-wp PTE/PMD state on mremap()
	mm: vmscan : pgdemote vmstat is not getting updated when MGLRU is enabled.
	tracing: gfp: Fix the GFP enum values shown for user space tracing tools
	irqchip: Plug a OF node reference leak in platform_irqchip_probe()
	irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly
	irqchip/gic-v3-its: Don't enable interrupts in its_irq_set_vcpu_affinity()
	hrtimers: Handle CPU state correctly on hotplug
	timers/migration: Fix another race between hotplug and idle entry/exit
	timers/migration: Enforce group initialization visibility to tree walkers
	x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache
	drm/i915/fb: Relax clear color alignment to 64 bytes
	drm/xe: Mark ComputeCS read mode as UC on iGPU
	drm/xe/oa: Add missing VISACTL mux registers
	drm/amdgpu/smu13: update powersave optimizations
	drm/amdgpu: fix fw attestation for MP0_14_0_{2/3}
	drm/amdgpu: disable gfxoff with the compute workload on gfx12
	drm/amdgpu: always sync the GFX pipe on ctx switch
	drm/amd/display: Fix PSR-SU not support but still call the amdgpu_dm_psr_enable
	drm/amd/display: Disable replay and psr while VRR is enabled
	drm/amd/display: Do not wait for PSR disable on vbl enable
	Revert "drm/amd/display: Enable urgent latency adjustments for DCN35"
	drm/amd/display: Validate mdoe under MST LCT=1 case as well
	apparmor: allocate xmatch for nullpdb inside aa_alloc_null
	Linux 6.12.11

Change-Id: I44fe9d80e5229632bb5dcbab4d39302fa03c099f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2025-01-27 00:06:30 -08:00
Koichiro Den
38492f6ee8 hrtimers: Handle CPU state correctly on hotplug
commit 2f8dea1692eef2b7ba6a256246ed82c365fdc686 upstream.

Consider a scenario where a CPU transitions from CPUHP_ONLINE to halfway
through a CPU hotunplug down to CPUHP_HRTIMERS_PREPARE, and then back to
CPUHP_ONLINE:

Since hrtimers_prepare_cpu() does not run, cpu_base.hres_active remains set
to 1 throughout. However, during a CPU unplug operation, the tick and the
clockevents are shut down at CPUHP_AP_TICK_DYING. On return to the online
state, for instance CFS incorrectly assumes that the hrtick is already
active, and the chance of the clockevent device to transition to oneshot
mode is also lost forever for the CPU, unless it goes back to a lower state
than CPUHP_HRTIMERS_PREPARE once.

This round-trip reveals another issue; cpu_base.online is not set to 1
after the transition, which appears as a WARN_ON_ONCE in enqueue_hrtimer().

Aside of that, the bulk of the per CPU state is not reset either, which
means there are dangling pointers in the worst case.

Address this by adding a corresponding startup() callback, which resets the
stale per CPU state and sets the online flag.
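
As a rough illustration of the fix, the following userspace sketch models a startup-style callback that clears stale per-CPU state and then sets the online flag. The struct layout and the name `hrtimers_cpu_starting_model()` are assumptions for this example only; the real per-CPU base has more fields than shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of the per-CPU hrtimer state. */
struct hrtimer_cpu_state_model {
	bool online;          /* CPU may enqueue timers locally */
	bool hres_active;     /* high resolution mode believed active */
	unsigned int nr_events;
	void *next_timer;     /* would dangle if left over from a previous life */
};

/*
 * Startup-style callback: runs every time the CPU comes (back) up,
 * including after an aborted hot-unplug, so stale state cannot survive.
 */
static int hrtimers_cpu_starting_model(struct hrtimer_cpu_state_model *st)
{
	assert(st != NULL);

	st->hres_active = false;
	st->nr_events = 0;
	st->next_timer = NULL;
	st->online = true;
	return 0;
}
```

The point of the design is that the reset is tied to the bring-up path rather than to the prepare step, which does not run on a partial round trip.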

[ tglx: Make the new callback unconditionally available, remove the online
  	modification in the prepare() callback and clear the remaining
  	state in the starting callback instead of the prepare callback ]

Fixes: 5c0930ccaa ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241220134421.3809834-1-koichiro.den@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-23 17:23:03 +01:00
Donghyeok Choe
d7aa7636d9 ANDROID: hrtimer: Export hrtimer_expire_entry/exit tracepoints
Export hrtimer_expire_entry/exit tracepoints, so that vendor modules
can register probes for these tracepoints.

To debug hrtimer-related issues, hrtimer_expire_entry/exit can serve as
core debugging data when an hrtimer bug is involved in a panic that
occurs on multiple cores at the same time.
hrtimer_expire_entry/exit also make it possible to check whether an hrtimer is operating normally.

Bug: 382155299
Change-Id: Ib960141e4af2c21f54efda3bd4b11644d48291ac
Signed-off-by: Donghyeok Choe <d7271.choe@samsung.com>
2024-12-11 02:20:16 -08:00
Linus Torvalds
2004cef11e Merge tag 'sched-core-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - Implement the SCHED_DEADLINE server infrastructure - Daniel Bristot
   de Oliveira's last major contribution to the kernel:

     "SCHED_DEADLINE servers can help fixing starvation issues of low
      priority tasks (e.g., SCHED_OTHER) when higher priority tasks
      monopolize CPU cycles. Today we have RT Throttling; DEADLINE
      servers should be able to replace and improve that."

   (Daniel Bristot de Oliveira, Peter Zijlstra, Joel Fernandes, Youssef
   Esmat, Huang Shijie)

 - Preparatory changes for sched_ext integration:
     - Use set_next_task(.first) where required
     - Fix up set_next_task() implementations
     - Clean up DL server vs. core sched
     - Split up put_prev_task_balance()
     - Rework pick_next_task()
     - Combine the last put_prev_task() and the first set_next_task()
     - Rework dl_server
     - Add put_prev_task(.next)

   (Peter Zijlstra, with a fix by Tejun Heo)

 - Complete the EEVDF transition and refine EEVDF scheduling:
     - Implement delayed dequeue
     - Allow shorter slices to wakeup-preempt
     - Use sched_attr::sched_runtime to set request/slice suggestion
     - Document the new feature flags
     - Remove unused and duplicate-functionality fields
     - Simplify & unify pick_next_task_fair()
     - Misc debuggability enhancements

   (Peter Zijlstra, with fixes/cleanups by Dietmar Eggemann, Valentin
   Schneider and Chuyi Zhou)

 - Initialize the vruntime of a new task when it is first enqueued,
   resulting in significant decrease in latency of newly woken tasks
   (Zhang Qiao)

 - Introduce SM_IDLE and an idle re-entry fast-path in __schedule()
   (K Prateek Nayak, Peter Zijlstra)

 - Clean up and clarify the usage of rt_task()
   (Qais Yousef)

 - Preempt SCHED_IDLE entities in strict cgroup hierarchies
   (Tianchen Ding)

 - Clarify the documentation of time units for deadline scheduler
   parameters (Christian Loehle)

 - Remove the HZ_BW chicken-bit feature flag introduced a year ago,
   the original change seems to be working fine (Phil Auld)

 - Misc fixes and cleanups (Chen Yu, Dan Carpenter, Huang Shijie,
   Peilin He, Qais Yousef and Vincent Guittot)

* tag 'sched-core-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  sched/cpufreq: Use NSEC_PER_MSEC for deadline task
  cpufreq/cppc: Use NSEC_PER_MSEC for deadline task
  sched/deadline: Clarify nanoseconds in uapi
  sched/deadline: Convert schedtool example to chrt
  sched/debug: Fix the runnable tasks output
  sched: Fix sched_delayed vs sched_core
  kernel/sched: Fix util_est accounting for DELAY_DEQUEUE
  kthread: Fix task state in kthread worker if being frozen
  sched/pelt: Use rq_clock_task() for hw_pressure
  sched/fair: Move effective_cpu_util() and effective_cpu_util() in fair.c
  sched/core: Introduce SM_IDLE and an idle re-entry fast-path in __schedule()
  sched: Add put_prev_task(.next)
  sched: Rework dl_server
  sched: Combine the last put_prev_task() and the first set_next_task()
  sched: Rework pick_next_task()
  sched: Split up put_prev_task_balance()
  sched: Clean up DL server vs core sched
  sched: Fixup set_next_task() implementations
  sched: Use set_next_task(.first) where required
  sched/fair: Properly deactivate sched_delayed task upon class change
  ...
2024-09-19 15:55:58 +02:00
Linus Torvalds
9ea925c806 Merge tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Core:

   - Overhaul of posix-timers in preparation of removing the workaround
     for periodic timers which have signal delivery ignored.

   - Remove the historical extra jiffie in msleep()

     msleep() adds an extra jiffie to the timeout value to ensure
     minimal sleep time. The timer wheel ensures minimal sleep time
     since the large rewrite to a non-cascading wheel, but the extra
     jiffie in msleep() remained unnoticed. Remove it.

   - Make the timer slack handling correct for realtime tasks.

     The procfs interface is inconsistent and neither reflects
     reality nor conforms to the man page. Show the correct 0 slack for
     real time tasks and enforce it at the core level instead of having
     inconsistent individual checks in various timer setup functions.

   - The usual set of updates and enhancements all over the place.

  Drivers:

   - Allow the ACPI PM timer to be turned off during suspend

   - No new drivers

   - The usual updates and enhancements in various drivers"

* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  ntp: Make sure RTC is synchronized when time goes backwards
  treewide: Fix wrong singular form of jiffies in comments
  cpu: Use already existing usleep_range()
  timers: Rename next_expiry_recalc() to be unique
  platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
  clocksource/drivers/jcore: Use request_percpu_irq()
  clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
  clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
  clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
  clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
  platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
  clocksource: acpi_pm: Add external callback for suspend/resume
  clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
  dt-bindings: timer: rockchip: Add rk3576 compatible
  timers: Annotate possible non critical data race of next_expiry
  timers: Remove historical extra jiffie for timeout in msleep()
  hrtimer: Use and report correct timerslack values for realtime tasks
  hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
  timers: Add sparse annotation for timer_sync_wait_running().
  signal: Replace BUG_ON()s
  ...
2024-09-17 07:25:37 +02:00
Anna-Maria Behnsen
bd7c8ff9fe treewide: Fix wrong singular form of jiffies in comments
Several comments all over the place use a wrong singular form of
jiffies.

Replace 'jiffie' by 'jiffy'. No functional change.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
2024-09-08 20:47:40 +02:00
Felix Moessbauer
ed4fb6d7ef hrtimer: Use and report correct timerslack values for realtime tasks
The timerslack_ns setting is used to specify how much the hardware
timers should be delayed, to potentially dispatch multiple timers in a
single interrupt. This is a performance optimization. Timers of
realtime tasks (having a realtime scheduling policy) should not be
delayed.

This logic was inconsistently applied to the hrtimers, leading to delays
of realtime tasks which used timed waits for events (e.g. condition
variables). Due to the downstream override of the slack for rt tasks,
the procfs reported incorrect (non-zero) timerslack_ns values.

This is changed by setting the timer_slack_ns task attribute to 0 for
all tasks with a rt policy. By that, downstream users do not need to
specially handle rt tasks (w.r.t. the slack), and the procfs entry
shows the correct value of "0". Setting non-zero slack values (either
via procfs or PR_SET_TIMERSLACK) on tasks with a rt policy is ignored,
as stated in "man 2 PR_SET_TIMERSLACK":

  Timer slack is not applied to threads that are scheduled under a
  real-time scheduling policy (see sched_setscheduler(2)).

The special handling of timerslack on rt tasks in downstream users
is removed as well.
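
The behaviour described above can be modelled in a few lines of userspace C. This is only a sketch: `struct task_slack_model` and both helpers are hypothetical names, and what happens to the slack when a task leaves the rt policy again is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two task attributes involved. */
struct task_slack_model {
	bool rt_policy;          /* task runs under a realtime policy */
	uint64_t timer_slack_ns; /* value that procfs would report */
};

/* Switching to an rt policy forces the stored slack to 0. */
static void set_policy_model(struct task_slack_model *t, bool rt_policy)
{
	assert(t != NULL);

	t->rt_policy = rt_policy;
	if (rt_policy)
		t->timer_slack_ns = 0;
}

/* Requests for a new slack value are ignored for rt tasks. */
static void set_timer_slack_model(struct task_slack_model *t, uint64_t slack_ns)
{
	assert(t != NULL);

	if (t->rt_policy)
		return;
	t->timer_slack_ns = slack_ns;
}
```

Because the stored attribute itself is 0 for rt tasks, callers that read it need no rt-specific special case.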

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240814121032.368444-2-felix.moessbauer@siemens.com
2024-08-23 20:13:02 +02:00
Caleb Sander Mateos
e68ac2b488 softirq: Remove unused 'action' parameter from action callback
When soft interrupt actions are called, they are passed a pointer to the
struct softirq_action which contains the action's function pointer.

This pointer isn't useful, as the action callback already knows what
function it is. And since each callback handles a specific soft interrupt,
the callback also knows which soft interrupt number is running.

No soft interrupt action callback actually uses this parameter, so remove
it from the function pointer signature. This clarifies that soft interrupt
actions are global routines and makes it slightly cheaper to call them.
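
A minimal userspace sketch of a dispatch table whose callbacks take no argument, in the spirit of the change described above. All names ending in `_model` are hypothetical and the pending-bit handling is greatly simplified compared to the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SOFTIRQS_MODEL 4 /* hypothetical number of soft interrupts */

/* Callback type without the unused 'action' parameter. */
typedef void (*softirq_fn_model)(void);

static softirq_fn_model softirq_vec_model[NR_SOFTIRQS_MODEL];
static unsigned int softirq_pending_model;

/* Register the handler for soft interrupt number nr. */
static void open_softirq_model(int nr, softirq_fn_model fn)
{
	assert(nr >= 0 && nr < NR_SOFTIRQS_MODEL);
	softirq_vec_model[nr] = fn;
}

/* Mark soft interrupt number nr as pending. */
static void raise_softirq_model(int nr)
{
	assert(nr >= 0 && nr < NR_SOFTIRQS_MODEL);
	softirq_pending_model |= 1u << nr;
}

/* Run and clear every pending soft interrupt; handlers get no argument. */
static void run_pending_softirqs_model(void)
{
	for (int nr = 0; nr < NR_SOFTIRQS_MODEL; nr++) {
		if (!(softirq_pending_model & (1u << nr)))
			continue;
		softirq_pending_model &= ~(1u << nr);
		if (softirq_vec_model[nr])
			softirq_vec_model[nr]();
	}
}

/* Example handler: it already knows which soft interrupt it serves. */
static int timer_softirq_runs_model;

static void timer_softirq_model(void)
{
	timer_softirq_runs_model++;
}
```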

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/20240815171549.3260003-1-csander@purestorage.com
2024-08-20 17:13:40 +02:00
Sebastian Andrzej Siewior
330dd6d9c0 hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
The two hrtimer_cpu_base_.*_expiry() functions are wrappers around the
locking functions and sparse complains about the missing counterpart.

Add sparse annotation to denote that this behaviour is expected.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240812105326.2240000-3-bigeasy@linutronix.de
2024-08-14 12:44:41 +02:00
Qais Yousef
ae04f69de0 sched/rt: Rename realtime_{prio, task}() to rt_or_dl_{prio, task}()
Some find the name realtime overloaded. Use rt_or_dl() as an
alternative, hopefully better, name.

Suggested-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Qais Yousef <qyousef@layalina.io>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240610192018.1567075-4-qyousef@layalina.io
2024-08-07 18:32:38 +02:00
Qais Yousef
130fd056dd sched/rt: Clean up usage of rt_task()
rt_task() checks if a task has RT priority. But depending on your
dictionary, this could mean it belongs to RT class, or is a 'realtime'
task, which includes RT and DL classes.

Since this has caused some confusion already on discussion [1], it
seemed a clean up is due.

I define the usage of rt_task() to be tasks that belong to RT class.
Make sure that it returns true only for RT class and audit the users and
replace the ones that required the old behavior with the new realtime_task()
which returns true for RT and DL classes. Introduce similar
realtime_prio() to create similar distinction to rt_prio() and update
the users that required the old behavior to use the new function.
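
The distinction between the two priority checks can be sketched as follows. This is an illustrative model, not the kernel headers: it assumes the usual priority layout in which deadline tasks sit below `MAX_DL_PRIO` (0) and RT tasks occupy the range from `MAX_DL_PRIO` up to, but not including, `MAX_RT_PRIO` (100). The `_model` suffixes mark the helpers as hypothetical.

```c
#include <stdbool.h>

#define MAX_DL_PRIO_MODEL 0   /* assumed: deadline tasks have prio < 0 */
#define MAX_RT_PRIO_MODEL 100 /* assumed: RT tasks have prio in [0, 100) */

/* True only for priorities in the RT class range. */
static bool rt_prio_model(int prio)
{
	return prio >= MAX_DL_PRIO_MODEL && prio < MAX_RT_PRIO_MODEL;
}

/* True for RT and DL priorities alike (the old, broader meaning). */
static bool realtime_prio_model(int prio)
{
	return prio < MAX_RT_PRIO_MODEL;
}
```

With these definitions a deadline priority of -1 counts as 'realtime' but not as RT, while a normal priority such as 120 is neither.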

Move MAX_DL_PRIO to prio.h so it can be used in the new definitions.

Document the functions to make it more obvious what is the difference
between them. PI-boosted tasks is a factor that must be taken into
account when choosing which function to use.

Rename task_is_realtime() to realtime_task_policy() as the old name is
confusing against the new realtime_task().

No functional changes were intended.

[1] https://lore.kernel.org/lkml/20240506100509.GL40213@noisy.programming.kicks-ass.net/

Signed-off-by: Qais Yousef <qyousef@layalina.io>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Reviewed-by: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240610192018.1567075-2-qyousef@layalina.io
2024-08-07 18:32:37 +02:00
Phil Chang
5a830bbce3 hrtimer: Prevent queuing of hrtimer without a function callback
The hrtimer function callback must not be NULL. It has to be specified by
the call site but it is not validated by the hrtimer code. When a hrtimer
is queued without a function callback, the kernel crashes with a null
pointer dereference when trying to execute the callback in __run_hrtimer().

Introduce a validation before queuing the hrtimer in
hrtimer_start_range_ns().
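
The validation can be pictured with a small userspace model. `struct timer_model` and `timer_start_model()` are hypothetical; returning `-EINVAL` is an assumption made for this sketch so the rejection is observable, and is not a statement about how the kernel function reports the condition.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced timer object. */
struct timer_model {
	void (*function)(struct timer_model *t); /* expiry callback */
	bool queued;
};

/*
 * Refuse to queue a timer that has no callback, so that the expiry
 * path can never end up calling through a NULL function pointer.
 */
static int timer_start_model(struct timer_model *t)
{
	assert(t != NULL);

	if (t->function == NULL)
		return -EINVAL; /* assumption: error code chosen for the sketch */

	t->queued = true;
	return 0;
}

/* Trivial callback used to show the accepted case. */
static void noop_expiry_model(struct timer_model *t)
{
	(void)t;
}
```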

[anna-maria: Rephrase commit message]

Signed-off-by: Phil Chang <phil.chang@mediatek.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
2024-06-25 16:54:27 +02:00
Jiapeng Chong
b7c8e1f8a7 hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active()
The function hrtimer_hres_active() is defined in the hrtimer.c file, but
not called elsewhere, so rename __hrtimer_hres_active() to
hrtimer_hres_active() and remove the old hrtimer_hres_active() function.

kernel/time/hrtimer.c:653:19: warning: unused function 'hrtimer_hres_active'.

Fixes: 82ccdf062a ("hrtimer: Remove unused function")
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Link: https://lore.kernel.org/r/20240418023000.130324-1-jiapeng.chong@linux.alibaba.com
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8778
2024-04-22 16:13:19 +02:00
Jiapeng Chong
82ccdf062a hrtimer: Remove unused function
The function is defined, but not called anywhere:

  kernel/time/hrtimer.c:1880:20: warning: unused function '__hrtimer_peek_ahead_timers'.

Remove it.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240322070441.29646-1-jiapeng.chong@linux.alibaba.com
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8611
2024-04-08 15:03:06 +02:00
Frederic Weisbecker
7988e5ae2b tick: Split nohz and highres features from nohz_mode
The nohz mode field tells about low resolution nohz mode or high
resolution nohz mode but it doesn't tell about high resolution non-nohz
mode.

In order to retrieve the latter state, tick_cancel_sched_timer() must
fiddle with struct hrtimer's internals to guess if the tick has been
initialized in high resolution.

Move instead the nohz mode field information into the tick flags and
provide two new bits: one to know if the tick is in nohz mode and
another one to know if the tick is in high resolution. The combination
of those two flags provides all the needed information to determine
which of the three tick modes is running.
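
How two flag bits can encode the three tick modes is sketched below. The flag and enum names are hypothetical, and treating "neither bit set" as a separate fourth outcome is an assumption of this example.

```c
/* Hypothetical flag bits, one per feature. */
#define TICK_FLAG_NOHZ_MODEL    (1u << 0)
#define TICK_FLAG_HIGHRES_MODEL (1u << 1)

enum tick_mode_model {
	TICK_MODE_NONE_MODEL,          /* neither bit set (assumed extra case) */
	TICK_MODE_LOWRES_NOHZ_MODEL,   /* low resolution nohz */
	TICK_MODE_HIGHRES_NOHZ_MODEL,  /* high resolution nohz */
	TICK_MODE_HIGHRES_TICK_MODEL,  /* high resolution, non-nohz */
};

/* Derive the running tick mode from the combination of the two flags. */
static enum tick_mode_model tick_mode_from_flags_model(unsigned int flags)
{
	int nohz = (flags & TICK_FLAG_NOHZ_MODEL) != 0;
	int highres = (flags & TICK_FLAG_HIGHRES_MODEL) != 0;

	if (nohz && highres)
		return TICK_MODE_HIGHRES_NOHZ_MODEL;
	if (nohz)
		return TICK_MODE_LOWRES_NOHZ_MODEL;
	if (highres)
		return TICK_MODE_HIGHRES_TICK_MODEL;
	return TICK_MODE_NONE_MODEL;
}
```

Unlike a single mode field, the high resolution non-nohz case is directly visible here, so no caller has to guess it from timer internals.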

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240225225508.11587-14-frederic@kernel.org
2024-02-26 11:37:32 +01:00
Frederic Weisbecker
f04e51220a tick: Move tick cancellation up to CPUHP_AP_TICK_DYING
The tick hrtimer is cancelled right before hrtimers are migrated. This
is done from the hrtimer subsystem even though it shouldn't know about
its actual users.

Move instead the tick hrtimer cancellation to the relevant CPU hotplug
state that aims at centralizing high level tick shutdown operations so
that the related flow is easy to follow.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240225225508.11587-9-frederic@kernel.org
2024-02-26 11:37:31 +01:00
Peng Liu
ffb7e01c4e tick/nohz: Remove duplicate between tick_nohz_switch_to_nohz() and tick_setup_sched_timer()
The ts->sched_timer initialization work of tick_nohz_switch_to_nohz()
is almost the same as that of tick_setup_sched_timer(), so adjust the
latter to get it reused by tick_nohz_switch_to_nohz().

This also makes the low resolution mode sched_timer benefit from the tick
skew boot option.

Signed-off-by: Peng Liu <liupeng17@lenovo.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240225225508.11587-2-frederic@kernel.org
2024-02-26 11:37:31 +01:00
Costa Shulyupin
56c2cb1012 hrtimer: Select housekeeping CPU during migration
During CPU-down hotplug, hrtimers may migrate to isolated CPUs,
compromising CPU isolation.

Address this issue by masking valid CPUs for hrtimers using
housekeeping_cpumask(HK_TYPE_TIMER).

Suggested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20240222200856.569036-1-costa.shul@redhat.com
2024-02-22 22:18:21 +01:00
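The masking idea in 56c2cb1012 can be sketched in user space by modelling CPU masks as 64-bit words. `pick_migration_target()` and the mask layout are hypothetical; the kernel operates on `struct cpumask` and `housekeeping_cpumask(HK_TYPE_TIMER)`, and this is not its code. The sketch only shows that a target is chosen from the intersection of the active CPUs and the housekeeping CPUs, so an isolated CPU is never selected.

```c
#include <stdint.h>

/*
 * Return the lowest-numbered CPU that is both active and a housekeeping
 * CPU, or -1 if the two masks do not intersect. Bit n represents CPU n.
 */
static int pick_migration_target(uint64_t active_mask,
				 uint64_t housekeeping_mask)
{
	uint64_t candidates = active_mask & housekeeping_mask;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (candidates & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;
}
```

For example, with CPUs 1 to 3 active (CPU 0 is going down) and CPUs 2 and 3 marked as housekeeping, the isolated CPU 1 is skipped and CPU 2 is chosen.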
Ingo Molnar
94bf12af35 Merge tag 'v6.8-rc5' into timers/core, to resolve conflict
There's a conflict between this recent upstream fix:

  dad6a09f31 ("hrtimer: Report offline hrtimer enqueue")

and a pending commit in the timers tree:

  1a4729ecaf ("hrtimers: Move hrtimer base related definitions into hrtimer_defs.h")

Resolve it by applying the upstream fix to the new <linux/hrtimer_defs.h> header.

 Conflict:
	include/linux/hrtimer.h
 Semantic conflict:
	include/linux/hrtimer_defs.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-02-19 22:27:57 +01:00
Anna-Maria Behnsen
ca2768bbf5 hrtimers: Update formatting of documentation
Documentation of functions lacks the annotations which are used by
kernel-doc and *.rst to make appearance in rendered documents more
user-friendly.

Use those annotations to improve user-friendliness. While at it prevent
duplication of comments and use a reference instead.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240123164702.55612-3-anna-maria@linutronix.de
2024-02-19 09:37:59 +01:00
Frederic Weisbecker
dad6a09f31 hrtimer: Report offline hrtimer enqueue
The hrtimers migration on CPU-down hotplug process has been moved
earlier, before the CPU actually goes to die. This leaves a small window
of opportunity to queue an hrtimer in a blind spot, leaving it ignored.

For example a practical case has been reported with RCU waking up a
SCHED_FIFO task right before the CPUHP_AP_IDLE_DEAD stage, queuing that
way a sched/rt timer to the local offline CPU.

Make sure such situations never go unnoticed and warn when that happens.

Fixes: 5c0930ccaa ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240129235646.3171983-4-boqun.feng@gmail.com
2024-02-06 10:56:35 +01:00
Thomas Gleixner
5c0930ccaa hrtimers: Push pending hrtimers away from outgoing CPU earlier
2b8272ff4a ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug")
solved the straightforward CPU hotplug deadlock vs. the scheduler
bandwidth timer. Yu discovered a more involved variant where a task which
has a bandwidth timer started on the outgoing CPU holds a lock and then
gets throttled. If the lock is required by one of the CPU hotplug callbacks,
the hotplug operation deadlocks because the unthrottling timer event is not
handled on the dying CPU and can only be recovered once the control CPU
reaches the hotplug state which pulls the pending hrtimers from the dead
CPU.

Solve this by pushing the hrtimers away from the dying CPU in the dying
callbacks. Nothing can queue a hrtimer on the dying CPU at that point because
all other CPUs spin in stop_machine() with interrupts disabled and once the
operation is finished the CPU is marked offline.

Reported-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Liu Tie <liutie4@huawei.com>
Link: https://lore.kernel.org/r/87a5rphara.ffs@tglx
2023-11-11 18:06:42 +01:00
Ben Dooks
ccaa4926c2 hrtimer: Add missing sparse annotations to hrtimer locking
Sparse warns about lock imbalance vs. the hrtimer_base lock due to missing
sparse annotations:

kernel/time/hrtimer.c:175:33: warning: context imbalance in 'lock_hrtimer_base' - wrong count at exit
kernel/time/hrtimer.c:1301:28: warning: context imbalance in 'hrtimer_start_range_ns' - unexpected unlock
kernel/time/hrtimer.c:1336:28: warning: context imbalance in 'hrtimer_try_to_cancel' - unexpected unlock
kernel/time/hrtimer.c:1457:9: warning: context imbalance in '__hrtimer_get_remaining' - unexpected unlock

Add the annotations to the relevant functions.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230621075928.394481-1-ben.dooks@codethink.co.uk
2023-06-22 10:32:37 +02:00
Davidlohr Bueso
0c52310f26 hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range()
While in theory the timer can be triggered before expires + delta, for the
cases of RT tasks they really have no business giving any lenience for
extra slack time, so override any passed value by the user and always use
zero for schedule_hrtimeout_range() calls. Furthermore, this is similar to
what the nanosleep(2) family already does with current->timer_slack_ns.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230123173206.6764-3-dave@stgolabs.net
2023-01-31 11:23:07 +01:00
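The slack override from 0c52310f26 amounts to a one-line decision, sketched here in user space. `effective_slack_ns()` and `expiry_window()` are hypothetical helpers, not kernel functions; the sketch assumes the timer may fire anywhere between `expires` and `expires + delta`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct expiry_window {
	uint64_t earliest_ns;  /* the timer may fire from here ... */
	uint64_t latest_ns;    /* ... up to here */
};

/* RT tasks get no slack, whatever the caller asked for. */
static uint64_t effective_slack_ns(bool task_is_rt, uint64_t requested_ns)
{
	return task_is_rt ? 0 : requested_ns;
}

/* Compute the window in which the timeout is allowed to expire. */
static struct expiry_window expiry_window(bool task_is_rt,
					  uint64_t expires_ns,
					  uint64_t delta_ns)
{
	uint64_t slack = effective_slack_ns(task_is_rt, delta_ns);
	struct expiry_window w;

	assert(expires_ns <= UINT64_MAX - slack);  /* no wrap-around */
	w.earliest_ns = expires_ns;
	w.latest_ns = expires_ns + slack;
	return w;
}
```

For an RT task the window collapses to a single point, so the timeout is not coalesced with other timers at the cost of extra latency.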
Davidlohr Bueso
c14fd3dcac hrtimer: Rely on rt_task() for DL tasks too
Checking dl_task() is redundant as rt_task() returns true for deadline
tasks too.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230123173206.6764-2-dave@stgolabs.net
2023-01-31 11:23:07 +01:00
Jann Horn
9f76d59173 timers: Prevent union confusion from unexpected restart_syscall()
The nanosleep syscalls use the restart_block mechanism, with a quirk:
The `type` and `rmtp`/`compat_rmtp` fields are set up unconditionally on
syscall entry, while the rest of the restart_block is only set up in the
unlikely case that the syscall is actually interrupted by a signal (or
pseudo-signal) that doesn't have a signal handler.

If the restart_block was set up by a previous syscall (futex(...,
FUTEX_WAIT, ...) or poll()) and hasn't been invalidated somehow since then,
this will clobber some of the union fields used by futex_wait_restart() and
do_restart_poll().

If userspace afterwards wrongly calls the restart_syscall syscall,
futex_wait_restart()/do_restart_poll() will read struct fields that have
been clobbered.

This doesn't actually lead to anything particularly interesting because
none of the union fields contain trusted kernel data, and
futex(..., FUTEX_WAIT, ...) and poll() aren't syscalls where it makes much
sense to apply seccomp filters to their arguments.

So the current consequences are just of the "if userspace does bad stuff,
it can damage itself, and that's not a problem" flavor.

But still, it seems like a hazard for future developers, so invalidate the
restart_block when partly setting it up in the nanosleep syscalls.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230105134403.754986-1-jannh@google.com
2023-01-11 19:31:47 +01:00
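The hazard and the fix in 9f76d59173 can be modelled in user space with a simplified restart block. All names below (`restart_block_sketch`, `no_restart()`, `nanosleep_entry()`) are hypothetical and the union layout is invented for illustration; the sketch only shows why resetting the restart function at syscall entry makes a stray restart harmless.

```c
#include <errno.h>

struct restart_block_sketch {
	long (*fn)(struct restart_block_sketch *);
	union {
		struct { unsigned int val; } futex;          /* set by a futex wait */
		struct { int type; void *rmtp; } nanosleep;  /* set by nanosleep */
	} u;
};

/* Restart function that refuses to restart anything. */
static long no_restart(struct restart_block_sketch *rb)
{
	(void)rb;
	return -EINTR;
}

/* Stand-in for a futex restart handler that reads its union member. */
static long futex_restart(struct restart_block_sketch *rb)
{
	return (long)rb->u.futex.val;
}

/*
 * nanosleep entry: the union fields are written unconditionally, so the
 * previous owner's restart function must be invalidated first.
 */
static void nanosleep_entry(struct restart_block_sketch *rb, int type,
			    void *rmtp)
{
	rb->fn = no_restart;
	rb->u.nanosleep.type = type;
	rb->u.nanosleep.rmtp = rmtp;
}

/* Model of an unexpected restart_syscall() from userspace. */
static long restart_syscall_sketch(struct restart_block_sketch *rb)
{
	return rb->fn(rb);
}
```

Without the `rb->fn = no_restart` line, a restart after `nanosleep_entry()` would still dispatch to `futex_restart()` and read union fields that nanosleep has overwritten.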
Peter Zijlstra
f5d39b0208 freezer,sched: Rewrite core freezer logic
Rewrite the core freezer to behave better wrt thawing and be simpler
in general.

By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.

As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).

Specifically; the current scheme works a little like:

	freezer_do_not_count();
	schedule();
	freezer_count();

And either the task is blocked, or it lands in try_to_freeze()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.

However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.

That is, thawing tries to thaw things in explicit order; kernel
threads and workqueues before bringing SMP back before userspace
etc. However, due to the above-mentioned races it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.

This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.

As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:

	TASK_FREEZABLE	-> TASK_FROZEN
	__TASK_STOPPED	-> TASK_FROZEN
	__TASK_TRACED	-> TASK_FROZEN

The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).

The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.

With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
2022-09-07 21:53:50 +02:00
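The state transitions listed in f5d39b0208 can be sketched as a small user-space state machine. The bit values and helper names (`ST_*`, `try_freeze_state()`, `wake_matches()`) are hypothetical and do not match the kernel's definitions. The sketch shows the two properties the message relies on: only freezable, stopped or traced states move to the frozen state, and a frozen task is not matched by an ordinary wakeup mask.

```c
#include <stdbool.h>

/* Hypothetical state bits, chosen only for this sketch. */
#define ST_INTERRUPTIBLE   0x01u
#define ST_UNINTERRUPTIBLE 0x02u
#define ST_STOPPED         0x04u
#define ST_TRACED          0x08u
#define ST_FREEZABLE       0x10u  /* modifier OR-ed onto a sleeping state */
#define ST_FROZEN          0x20u
#define ST_NORMAL          (ST_INTERRUPTIBLE | ST_UNINTERRUPTIBLE)

/* Move a blocked task to the frozen state if its state allows it. */
static bool try_freeze_state(unsigned int *state)
{
	if (!(*state & (ST_FREEZABLE | ST_STOPPED | ST_TRACED)))
		return false;
	*state = ST_FROZEN;  /* the original sleeping state is lost */
	return true;
}

/* A wakeup only affects tasks whose state intersects the wake mask. */
static bool wake_matches(unsigned int state, unsigned int wake_mask)
{
	return (state & wake_mask) != 0;
}
```

Because `ST_FROZEN` shares no bits with `ST_NORMAL`, a spurious normal wakeup cannot let a frozen task run early; only an explicit wakeup with the frozen mask does.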
Jason A. Donenfeld
151c8e499f wireguard: ratelimiter: use hrtimer in selftest
Using msleep() is problematic because it's compared against
ratelimiter.c's ktime_get_coarse_boottime_ns(), which means on systems
with slow jiffies (such as UML's forced HZ=100), the result is
inaccurate. So switch to using schedule_hrtimeout().

However, hrtimer gives us access only to the traditional posix timers,
and none of the _COARSE variants. So now, rather than being too
imprecise like jiffies, it's too precise.

One solution would be to give it a large "range" value, but this will
still fire early on a loaded system. A better solution is to align the
timeout to the actual coarse timer, and then round up to the nearest
tick, plus change.

So add the timeout to the current coarse time, and then
schedule_hrtimeout() until the absolute computed time.

This should hopefully reduce flakes in CI as well. Note that we keep the
retry loop in case the entire function is running behind, because the
test could still be scheduled out, by either the kernel or by the
hypervisor's kernel, in which case restarting the test and hoping to not
be scheduled out still helps.

Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-02 13:47:50 -07:00
Thomas Gleixner
f80e214895 hrtimer: Unbreak hrtimer_force_reprogram()
Since the recent consolidation of reprogramming functions,
hrtimer_force_reprogram() is affected by a check whether the new expiry
time is past the current expiry time.

This breaks the NOHZ logic as that relies on the fact that the tick hrtimer
is moved into the future. That means cpu_base->expires_next becomes stale
and subsequent reprogramming attempts fail as well until the situation is
cleaned up by an hrtimer interrupt.

For some yet unknown reason this leads to a complete stall, so for now
partially revert the offending commit to a known working state. The root
cause for the stall is still being investigated and will be fixed in a subsequent
commit.

Fixes: b14bca97c9 ("hrtimer: Consolidate reprogramming code")
Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mike Galbraith <efault@gmx.de>
Link: https://lore.kernel.org/r/8735recskh.ffs@tglx
2021-08-12 22:34:40 +02:00
Thomas Gleixner
9482fd71db hrtimer: Use raw_cpu_ptr() in clock_was_set()
clock_was_set() can be invoked from preemptible context. Use raw_cpu_ptr()
to check whether high resolution mode is active or not. It does not matter
whether the task migrates after acquiring the pointer.

Fixes: e71a4153b7 ("hrtimer: Force clock_was_set() handling for the HIGHRES=n, NOHZ=y case")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/875ywacsmb.ffs@tglx
2021-08-12 22:34:40 +02:00
Thomas Gleixner
1e7f7fbcd4 hrtimer: Avoid more SMP function calls in clock_was_set()
By unconditionally updating the offsets there are more indicators
whether the SMP function calls on clock_was_set() can be avoided:

  - When the offset update already happened on the remote CPU then the
    remote update attempt will yield the same sequence number and no
    IPI is required.

  - When the remote CPU is currently handling hrtimer_interrupt(). In
    that case the remote CPU will reevaluate the timer bases before
    reprogramming anyway, so nothing to do.

  - After updating it can be checked whether the first expiring timer in
    the affected clock bases moves before the first expiring (softirq)
    timer of the CPU. If that's not the case then sending the IPI is not
    required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.887322464@linutronix.de
2021-08-10 17:57:23 +02:00
Marcelo Tosatti
81d741d346 hrtimer: Avoid unnecessary SMP function calls in clock_was_set()
Setting of clocks triggers an unconditional SMP function call on all online
CPUs to reprogram the clock event device.

However, only some clocks have their offsets updated and therefore
potentially require a reprogram. That's CLOCK_REALTIME and CLOCK_TAI and in
the case of resume (delayed sleep time injection) also CLOCK_BOOTTIME.

Instead of sending an IPI unconditionally, check each per CPU hrtimer base
whether it has active timers in the affected clock bases which are
indicated by the caller in the @bases argument of clock_was_set().

If that's not the case, skip the IPI and update the offsets remotely which
ensures that any subsequently armed timers on the affected clocks are
evaluated with the correct offsets.

[ tglx: Adapted to the new bases argument, removed the softirq_active
  	check, added comment, fixed up stale comment ]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.787536542@linutronix.de
2021-08-10 17:57:23 +02:00
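The per-CPU check described in 81d741d346 can be sketched in user space as a mask intersection. The structure, the `BASE_*` bits and `cpus_needing_ipi()` are hypothetical names for illustration; the kernel's data structures and locking are omitted entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clock base bits, one per hrtimer clock base. */
#define BASE_MONOTONIC (1u << 0)
#define BASE_REALTIME  (1u << 1)
#define BASE_BOOTTIME  (1u << 2)
#define BASE_TAI       (1u << 3)

struct cpu_base_sketch {
	unsigned int active_bases;  /* bit set: that base has queued timers */
};

/*
 * Return a bitmask of the CPUs that must be interrupted: only those
 * with at least one active timer on a clock base affected by the clock
 * change. All other CPUs can have their offsets updated remotely.
 */
static uint32_t cpus_needing_ipi(const struct cpu_base_sketch *cpus,
				 int nr_cpus, unsigned int affected_bases)
{
	uint32_t mask = 0;

	assert(nr_cpus >= 0 && nr_cpus <= 32);  /* result is a 32-bit mask */
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpus[cpu].active_bases & affected_bases)
			mask |= UINT32_C(1) << cpu;
	}
	return mask;
}
```

A caller that sets CLOCK_REALTIME would pass `BASE_REALTIME | BASE_TAI` as `affected_bases`; CPUs that only hold CLOCK_MONOTONIC timers are then left alone.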
Thomas Gleixner
17a1b8826b hrtimer: Add bases argument to clock_was_set()
clock_was_set() unconditionally invokes retrigger_next_event() on all online
CPUs. This was necessary because that mechanism was also used for resume
from suspend to idle which is no longer the case.

The bases argument allows the callers of clock_was_set() to hand in a mask
which tells clock_was_set() which of the hrtimer clock bases are affected
by the clock setting. This mask will be used in the next step to check
whether a CPU base has timers queued on a clock base affected by the event
and avoid the SMP function call if there are none.

Add a @bases argument, provide defines for the active bases masking and
fixup all callsites.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.691083465@linutronix.de
2021-08-10 17:57:23 +02:00
Thomas Gleixner
a761a67f59 timekeeping: Distangle resume and clock-was-set events
Resuming timekeeping is a clock-was-set event and uses the clock-was-set
notification mechanism. This is in the way of making the clock-was-set
update for hrtimers selective so unnecessary IPIs are avoided when a CPU
base does not have timers queued which are affected by the clock setting.

Disentangle it by invoking hrtimer_resume() on each unfreezing CPU and invoke
the new timerfd_resume() function from timekeeping_resume() which is the
only place where this is needed.

Rename hrtimer_resume() to hrtimer_resume_local() to reflect the change.

With this the clock_was_set*() functions are no longer required to IPI all
CPUs unconditionally and can get some smarts to avoid them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.488853478@linutronix.de
2021-08-10 17:57:23 +02:00
Thomas Gleixner
e71a4153b7 hrtimer: Force clock_was_set() handling for the HIGHRES=n, NOHZ=y case
When CONFIG_HIGH_RES_TIMERS is disabled, but NOHZ is enabled then
clock_was_set() is not doing anything. With HIGHRES=n the kernel relies on
the periodic tick to update the clock offsets, but when NOHZ is enabled and
active then CPUs which are in a deep idle sleep do not have a periodic tick
which means the expiry of timers affected by clock_was_set() can be
arbitrarily delayed up to the point where the CPUs are brought out of idle
again.

Make the clock_was_set() logic unconditionally available so that idle CPUs
are kicked out of idle to handle the update.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.288697903@linutronix.de
2021-08-10 17:57:22 +02:00
Thomas Gleixner
8c3b5e6ec0 hrtimer: Ensure timerfd notification for HIGHRES=n
If high resolution timers are disabled the timerfd notification about a
clock was set event is not happening for all cases which use
clock_was_set_delayed() because that's a NOP for HIGHRES=n, which is wrong.

Make clock_was_set_delayed() unconditionally available to fix that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.196661266@linutronix.de
2021-08-10 17:57:22 +02:00
Peter Zijlstra
b14bca97c9 hrtimer: Consolidate reprogramming code
This code is mostly duplicated. The redundant store in the force reprogram
case does no harm and the in hrtimer interrupt condition cannot be true for
the force reprogram invocations.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.054424875@linutronix.de
2021-08-10 17:57:22 +02:00
Thomas Gleixner
627ef5ae2d hrtimer: Avoid double reprogramming in __hrtimer_start_range_ns()
If __hrtimer_start_range_ns() is invoked with an already armed hrtimer then
the timer has to be canceled first and then added back. If the timer is the
first expiring timer then on removal the clockevent device is reprogrammed
to the next expiring timer to avoid that the pending expiry fires needlessly.

If the new expiry time ends up being the first expiry again then the clock
event device has to be reprogrammed again.

Avoid this by checking whether the timer is the first to expire and in that
case, keep the timer on the current CPU and delay the reprogramming up to
the point where the timer has been enqueued again.

Reported-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135157.873137732@linutronix.de
2021-08-10 17:57:22 +02:00
Linus Torvalds
87dcebff92 Merge tag 'timers-core-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The time and timers updates contain:

  Core changes:

   - Allow runtime power management when the clocksource is changed.

   - A correctness fix for clock_adjtime32() so that the return value on
     success is not overwritten by the result of the copy to user.

   - Allow late installment of broadcast clockevent devices which was
     broken because nothing switched them over to oneshot mode. This
     went unnoticed so far because clockevent devices used to be built
     in, but now people started to make them modular.

   - Debugfs related simplifications

   - Small cleanups and improvements here and there

  Driver changes:

   - The usual set of device tree binding updates for a wide range of
     drivers/devices.

   - The usual updates and improvements for drivers all over the place
     but nothing outstanding.

   - No new clocksource/event drivers. They'll come back next time"

* tag 'timers-core-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  posix-timers: Preserve return value in clock_adjtime32()
  tick/broadcast: Allow late registered device to enter oneshot mode
  tick: Use tick_check_replacement() instead of open coding it
  time/timecounter: Mark 1st argument of timecounter_cyc2time() as const
  dt-bindings: timer: nuvoton,npcm7xx: Add wpcm450-timer
  clocksource/drivers/arm_arch_timer: Add __ro_after_init and __init
  clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940
  clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue
  clocksource/drivers/dw_apb_timer_of: Add handling for potential memory leak
  clocksource/drivers/npcm: Add support for WPCM450
  clocksource/drivers/sh_cmt: Don't use CMTOUT_IE with R-Car Gen2/3
  clocksource/drivers/pistachio: Fix trivial typo
  clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe()
  clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped
  clocksource/drivers/timer-ti-dm: Fix posted mode status check order
  dt-bindings: timer: renesas,cmt: Document R8A77961
  dt-bindings: timer: renesas,cmt: Add r8a779a0 CMT support
  clocksource/drivers/ingenic-ost: Add support for the JZ4760B
  clocksource/drivers/ingenic: Add support for the JZ4760
  dt-bindings: timer: ingenic: Add compatible strings for JZ4760(B)
  ...
2021-04-26 09:54:03 -07:00
Ingo Molnar
4bf07f6562 timekeeping, clocksource: Fix various typos in comments
Fix ~56 single-word typos in timekeeping & clocksource code comments.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-kernel@vger.kernel.org
2021-03-22 23:06:48 +01:00
Oleg Nesterov
5abbe51a52 kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()
Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT.

Add a new helper which sets restart_block->fn and calls a dummy
arch_set_restart_data() helper.

Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com
2021-03-16 22:13:10 +01:00
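The helper introduced by 5abbe51a52 can be sketched in user space as follows. The `_sketch` names, the `arch_data` field and the numeric value of the error code are assumptions made for illustration, and returning the restart error code from the helper is likewise an assumption, since the message only says that the helper sets `restart_block->fn` and calls a dummy `arch_set_restart_data()`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value, used only inside this sketch. */
#define ERESTART_RESTARTBLOCK_SKETCH 516

struct restart_block_sketch {
	long (*fn)(struct restart_block_sketch *);
	int arch_data;  /* hypothetical slot an architecture could fill */
};

/* Dummy architecture hook: the generic version does nothing. */
static void arch_set_restart_data_sketch(struct restart_block_sketch *rb)
{
	(void)rb;
}

/* Set the restart function in one place so the arch hook always runs. */
static long set_restart_fn_sketch(struct restart_block_sketch *rb,
				  long (*fn)(struct restart_block_sketch *))
{
	assert(rb != NULL && fn != NULL);
	rb->fn = fn;
	arch_set_restart_data_sketch(rb);
	return -ERESTART_RESTARTBLOCK_SKETCH;
}

/* Example restart function a syscall might register. */
static long nanosleep_restart_sketch(struct restart_block_sketch *rb)
{
	(void)rb;
	return 0;
}
```

Funnelling every assignment of the restart function through one helper gives an architecture a single place to record extra state, which is what the follow-up fix for x86 COMPAT needs.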
Anna-Maria Behnsen
46eb1701c0 hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
hrtimer_force_reprogram() and hrtimer_interrupt() invoke
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]expires_next to preserve reprogramming logic. That
needs to be done at the callsites.

hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context. hrtimer_interrupt() never updates
cpu_base::softirq_expires_next which is wrong too.

That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.

cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse is that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.

Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases separately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields. Split this functionality into a
separate function to be able to use it in hrtimer_interrupt() as well
without copy paste.

Fixes: 5da7016046 ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius <mikael.beckius@windriver.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikael Beckius <mikael.beckius@windriver.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de
2021-03-08 09:37:01 +01:00
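The separate evaluation of soft and hard bases described in 46eb1701c0 can be sketched with a simplified user-space model. The structure and `update_next_event_sketch()` are hypothetical; in particular, the two `*_first` fields stand in for the result of scanning the hard and soft clock bases, which the kernel does through `__hrtimer_get_next_event()`.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXPIRY_NONE INT64_MAX  /* no timer queued */

struct cpu_base_model {
	bool softirq_activated;        /* soft interrupt already raised */
	int64_t hard_first;            /* earliest hard timer expiry */
	int64_t soft_first;            /* earliest soft timer expiry */
	int64_t softirq_expires_next;  /* cached soft expiry */
};

/*
 * Refresh the cached soft expiry whenever the soft interrupt is not
 * already pending, then return the earlier of the hard and soft expiry
 * as the time to program into the clock event device.
 */
static int64_t update_next_event_sketch(struct cpu_base_model *base)
{
	int64_t soft = EXPIRY_NONE;
	int64_t next;

	if (!base->softirq_activated) {
		soft = base->soft_first;
		base->softirq_expires_next = soft;  /* never left stale */
	}

	next = base->hard_first;
	if (soft < next)
		next = soft;  /* a soft timer expires first */
	return next;
}
```

The important difference from the broken behaviour is that the cache is refreshed even when a hard timer expires first, so a later clock change cannot leave `softirq_expires_next` pointing at an outdated time.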
Linus Torvalds
533369b145 Merge tag 'timers-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timers and timekeeping updates from Thomas Gleixner:
 "Core:

   - Robustness improvements for the NOHZ tick management

   - Fixes and consolidation of the NTP/RTC synchronization code

   - Small fixes and improvements in various places

   - A set of function documentation updates and fixes

   Drivers:

   - Cleanups and improvements in various clocksource/event drivers

   - Removal of the EZChip NPS clocksource driver as the platform
     support was removed from ARC

   - The usual set of new device tree binding and json conversions

   - The RTC driver changes which have been acked by the RTC maintainer:

       * fix a long standing bug in the MC146818 library code which can
         cause reading garbage during the RTC internal update.

       * changes related to the NTP/RTC consolidation work"

* tag 'timers-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  ntp: Fix prototype in the !CONFIG_GENERIC_CMOS_UPDATE case
  tick/sched: Make jiffies update quick check more robust
  ntp: Consolidate the RTC update implementation
  ntp: Make the RTC sync offset less obscure
  ntp, rtc: Move rtc_set_ntp_time() to ntp code
  ntp: Make the RTC synchronization more reliable
  rtc: core: Make the sync offset default more realistic
  rtc: cmos: Make rtc_cmos sync offset correct
  rtc: mc146818: Reduce spinlock section in mc146818_set_time()
  rtc: mc146818: Prevent reading garbage
  clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM
  clocksource/drivers/arm_arch_timer: Correct fault programming of CNTKCTL_EL1.EVNTI
  clocksource/drivers/arm_arch_timer: Use stable count reader in erratum sne
  clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available
  clocksource/drivers/riscv: Make RISCV_TIMER depends on RISCV_SBI
  clocksource/drivers/ingenic: Fix section mismatch
  clocksource/drivers/cadence_ttc: Fix memory leak in ttc_setup_clockevent()
  dt-bindings: timer: renesas: tmu: Convert to json-schema
  dt-bindings: timer: renesas: tmu: Document r8a774e1 bindings
  clocksource/drivers/orion: Add missing clk_disable_unprepare() on error path
  ...
2020-12-14 18:21:14 -08:00