Merge a0c04bd55a ("Merge tag 'kbuild-fixes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild") into android-mainline

Steps on the way to v6.11-rc1

Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Idc35b6a19365ad7bf4bed9a0d24596a36e4732c6
Lee Jones
2024-08-21 16:57:27 +01:00
116 changed files with 4363 additions and 1291 deletions
+4 -3
@@ -14,9 +14,10 @@ Description:
event to its internal Informational Event log, updates the
Event Status register, and if configured, interrupts the host.
It is not an error to inject poison into an address that
already has poison present and no error is returned. The
inject_poison attribute is only visible for devices supporting
the capability.
already has poison present and no error is returned. If the
device returns 'Inject Poison Limit Reached' an -EBUSY error
is returned to the user. The inject_poison attribute is only
visible for devices supporting the capability.
What: /sys/kernel/debug/memX/clear_poison
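The -EBUSY behaviour described in the inject_poison hunk above could be surfaced by a userspace test tool roughly as follows. This is a minimal sketch, not part of the commit: the helper names ``poison_attr_write()`` and ``poison_result_str()`` are hypothetical, the attribute path is supplied by the caller rather than assumed, and the "0x..." text format written to the attribute is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a device physical address to an
 * inject_poison-style attribute. Returns 0 on success or a negative
 * errno value, so that -EBUSY can be told apart from other failures.
 */
static int poison_attr_write(const char *attr_path, unsigned long long dpa)
{
	char buf[32];
	ssize_t written;
	int len, fd, err = 0;

	assert(attr_path != NULL);

	/* Assumed text format: hexadecimal address followed by a newline. */
	len = snprintf(buf, sizeof(buf), "0x%llx\n", dpa);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -EINVAL;

	fd = open(attr_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, buf, (size_t)len);
	if (written < 0)
		err = -errno;	/* e.g. -EBUSY when the device limit is hit */
	else if (written != len)
		err = -EIO;	/* short write: treat as an I/O error */

	close(fd);
	return err;
}

/* Hypothetical helper: map the result to a short description. */
static const char *poison_result_str(int err)
{
	if (err == 0)
		return "injected";
	if (err == -EBUSY)
		return "device inject poison limit reached";
	return strerror(-err);
}
```

A caller would pass the debugfs attribute for the memdev under test and could, for instance, stop injecting further addresses once ``poison_result_str()`` reports the limit.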
+2
@@ -9,4 +9,6 @@ Compute Express Link
memory-devices
maturity-map
.. only:: subproject and html
@@ -0,0 +1,202 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
===========================================
Compute Express Link Subsystem Maturity Map
===========================================
The Linux CXL subsystem tracks the dynamic `CXL specification
<https://computeexpresslink.org/cxl-specification-landing-page>`_ that
continues to respond to new use cases with new features, capability
updates and fixes. At any given point some aspects of the subsystem are
more mature than others. While the periodic pull requests summarize the
`work being incorporated each merge window
<https://lore.kernel.org/linux-cxl/?q=s%3APULL+s%3ACXL+tc%3Atorvalds+NOT+s%3ARe>`_,
those do not always convey progress relative to a starting point and a
future end goal.
What follows is a coarse breakdown of the subsystem's major
responsibilities along with a maturity score. The expectation is that
the change-history of this document provides an overview summary of the
subsystem maturation over time.
The maturity scores are:
- [3] Mature: Work in this area is complete and no changes are on the horizon.
Note that this score can regress from one kernel release to the next
based on new test results or end user reports.
- [2] Stabilizing: Major functionality operational, common cases are
mature, but known corner cases are still a work in progress.
- [1] Initial: Capability that has exited the Proof of Concept phase, but
may still have significant gaps to close and fixes to apply as real
world testing occurs.
- [0] Known gap: Feature is on a medium to long term horizon to
implement. If the specification has a feature that does not even have
a '0' score in this document, there is a good chance that no one in
the linux-cxl@vger.kernel.org community has started to look at it.
- X: Out of scope for kernel enabling, or kernel enabling not required
Features and Capabilities
=========================
Enumeration / Provisioning
--------------------------
All of the fundamental enumeration and object model of the subsystem is
in place, but there are several corner cases that are pending closure.
* [2] CXL Window Enumeration
* [0] :ref:`Extended-linear memory-side cache <extended-linear>`
* [0] Low Memory-hole
* [0] Hetero-interleave
* [2] Switch Enumeration
* [0] CXL register enumeration link-up dependency
* [2] HDM Decoder Configuration
* [0] Decoder target and granularity constraints
* [2] Performance enumeration
* [3] Endpoint CDAT
* [3] Switch CDAT
* [1] CDAT to Core-mm integration
* [1] x86
* [0] Arm64
* [0] All other arch.
* [0] Shared link
* [2] Hotplug
(see CXL Window Enumeration)
* [0] Handle Soft Reserved conflicts
* [0] :ref:`RCH link status <rch-link-status>`
* [0] Fabrics / G-FAM (chapter 7)
* [0] Global Access Endpoint
RAS
---
In many ways CXL can be seen as a standardization of what would normally
be handled by custom EDAC drivers. The open development here is
mainly caused by the enumeration corner cases above.
* [3] Component events (OS)
* [2] Component events (FFM)
* [1] Endpoint protocol errors (OS)
* [1] Endpoint protocol errors (FFM)
* [0] Switch protocol errors (OS)
* [1] Switch protocol errors (FFM)
* [2] DPA->HPA Address translation
* [1] XOR Interleave translation
(see CXL Window Enumeration)
* [1] Memory Failure coordination
* [0] Scrub control
* [2] ACPI error injection EINJ
* [0] EINJ v2
* [X] Compliance DOE
* [2] Native error injection
* [3] RCH error handling
* [1] VH error handling
* [0] PPR
* [0] Sparing
* [0] Device built in test
Mailbox commands
----------------
* [3] Firmware update
* [3] Health / Alerts
* [1] :ref:`Background commands <background-commands>`
* [3] Sanitization
* [3] Security commands
* [3] RAW Command Debug Passthrough
* [0] CEL-only-validation Passthrough
* [0] Switch CCI
* [3] Timestamp
* [1] PMEM labels
* [0] PMEM GPF / Dirty Shutdown
* [0] Scan Media
PMU
---
* [1] Type 3 PMU
* [0] Switch USP/ DSP, Root Port
Security
--------
* [X] CXL Trusted Execution Environment Security Protocol (TSP)
* [X] CXL IDE (subsumed by TSP)
Memory-pooling
--------------
* [1] Hotplug of LDs (via PCI hotplug)
* [0] Dynamic Capacity Device (DCD) Support
Multi-host sharing
------------------
* [0] Hardware coherent shared memory
* [0] Software managed coherency shared memory
Multi-host memory
-----------------
* [0] Dynamic Capacity Device Support
* [0] Sharing
Accelerator
-----------
* [0] Accelerator memory enumeration HDM-D (CXL 1.1/2.0 Type-2)
* [0] Accelerator memory enumeration HDM-DB (CXL 3.0 Type-2)
* [0] CXL.cache 68b (CXL 2.0)
* [0] CXL.cache 256b Cache IDs (CXL 3.0)
User Flow Support
-----------------
* [0] HPA->DPA Address translation (need xormaps export solution)
Details
=======
.. _extended-linear:
* **Extended-linear memory-side cache**: An HMAT proposal to enumerate the presence of a
memory-side cache where the cache capacity extends the SRAT address
range capacity. `See the ECN
<https://lore.kernel.org/linux-cxl/6650e4f835a0e_195e294a8@dwillia2-mobl3.amr.corp.intel.com.notmuch/>`_
for more details.
.. _rch-link-status:
* **RCH Link Status**: RCH (Restricted CXL Host) topologies end up
hiding some standard registers like PCIe Link Status / Capabilities in
the CXL RCRB (Root Complex Register Block).
.. _background-commands:
* **Background commands**: The CXL background command mechanism is
awkward as the single slot is monopolized potentially indefinitely by
various commands. A `cancel on conflict
<http://lore.kernel.org/r/66035c2e8ba17_770232948b@dwillia2-xfh.jf.intel.com.notmuch>`_
facility is needed to make sure the kernel can ensure forward progress
of priority commands.
+1 -8
@@ -89,14 +89,7 @@ docs on :ref:`Building Linux with Clang/LLVM <kbuild_llvm>`.
Rust (optional)
---------------
A particular version of the Rust toolchain is required. Newer versions may or
may not work because the kernel depends on some unstable Rust features, for
the moment.
Each Rust toolchain comes with several "components", some of which are required
(like ``rustc``) and some that are optional. The ``rust-src`` component (which
is optional) needs to be installed to build the kernel. Other components are
useful for developing.
A recent version of the Rust compiler is required.
Please see Documentation/rust/quick-start.rst for instructions on how to
satisfy the build requirements of Rust support. In particular, the ``Makefile``
@@ -7,6 +7,14 @@ This document contains useful information to know when working with
the Rust support in the kernel.
``no_std``
----------
The Rust support in the kernel can link only `core <https://doc.rust-lang.org/core/>`_,
but not `std <https://doc.rust-lang.org/std/>`_. Crates for use in the
kernel must opt into this behavior using the ``#![no_std]`` attribute.
Code documentation
------------------
+100 -43
@@ -5,17 +5,93 @@ Quick Start
This document describes how to get started with kernel development in Rust.
There are a few ways to install a Rust toolchain needed for kernel development.
A simple way is to use the packages from your Linux distribution if they are
suitable -- the first section below explains this approach. An advantage of this
approach is that, typically, the distribution will match the LLVM used by Rust
and Clang.
Another way is using the prebuilt stable versions of LLVM+Rust provided on
`kernel.org <https://kernel.org/pub/tools/llvm/rust/>`_. These are the same slim
and fast LLVM toolchains from :ref:`Getting LLVM <getting_llvm>` with versions
of Rust added to them that Rust for Linux supports. Two sets are provided: the
"latest LLVM" and "matching LLVM" (please see the link for more information).
Alternatively, the next two "Requirements" sections explain each component and
how to install them through ``rustup``, the standalone installers from Rust
and/or building them.
The rest of the document explains other aspects on how to get started.
Distributions
-------------
Arch Linux
**********
Arch Linux provides recent Rust releases and thus it should generally work out
of the box, e.g.::
pacman -S rust rust-src rust-bindgen
Debian
******
Debian Unstable (Sid), outside of the freeze period, provides recent Rust
releases and thus it should generally work out of the box, e.g.::
apt install rustc rust-src bindgen rustfmt rust-clippy
Fedora Linux
************
Fedora Linux provides recent Rust releases and thus it should generally work out
of the box, e.g.::
dnf install rust rust-src bindgen-cli rustfmt clippy
Gentoo Linux
************
Gentoo Linux (and especially the testing branch) provides recent Rust releases
and thus it should generally work out of the box, e.g.::
USE='rust-src rustfmt clippy' emerge dev-lang/rust dev-util/bindgen
``LIBCLANG_PATH`` may need to be set.
Nix
***
Nix (unstable channel) provides recent Rust releases and thus it should
generally work out of the box, e.g.::
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
nativeBuildInputs = with pkgs; [ rustc rust-bindgen rustfmt clippy ];
RUST_LIB_SRC = "${pkgs.rust.packages.stable.rustPlatform.rustLibSrc}";
}
openSUSE
********
openSUSE Slowroll and openSUSE Tumbleweed provide recent Rust releases and thus
they should generally work out of the box, e.g.::
zypper install rust rust1.79-src rust-bindgen clang
Requirements: Building
----------------------
This section explains how to fetch the tools needed for building.
Some of these requirements might be available from Linux distributions
under names like ``rustc``, ``rust-src``, ``rust-bindgen``, etc. However,
at the time of writing, they are likely not to be recent enough unless
the distribution tracks the latest releases.
To easily check whether the requirements are met, the following target
can be used::
@@ -29,16 +105,15 @@ if that is the case.
rustc
*****
A particular version of the Rust compiler is required. Newer versions may or
may not work because, for the moment, the kernel depends on some unstable
Rust features.
A recent version of the Rust compiler is required.
If ``rustup`` is being used, enter the kernel build directory (or use
``--path=<build-dir>`` argument to the ``set`` sub-command) and run::
``--path=<build-dir>`` argument to the ``set`` sub-command) and run,
for instance::
rustup override set $(scripts/min-tool-version.sh rustc)
rustup override set stable
This will configure your working directory to use the correct version of
This will configure your working directory to use the given version of
``rustc`` without affecting your default toolchain.
Note that the override applies to the current working directory (and its
@@ -65,9 +140,9 @@ version later on requires re-adding the component.
Otherwise, if a standalone installer is used, the Rust source tree may be
downloaded into the toolchain's installation folder::
curl -L "https://static.rust-lang.org/dist/rust-src-$(scripts/min-tool-version.sh rustc).tar.gz" |
curl -L "https://static.rust-lang.org/dist/rust-src-$(rustc --version | cut -d' ' -f2).tar.gz" |
tar -xzf - -C "$(rustc --print sysroot)/lib" \
"rust-src-$(scripts/min-tool-version.sh rustc)/rust-src/lib/" \
"rust-src-$(rustc --version | cut -d' ' -f2)/rust-src/lib/" \
--strip-components=3
In this case, upgrading the Rust compiler version later on requires manually
@@ -101,26 +176,22 @@ bindgen
*******
The bindings to the C side of the kernel are generated at build time using
the ``bindgen`` tool. A particular version is required.
the ``bindgen`` tool.
Install it via (note that this will download and build the tool from source)::
Install it, for instance, via (note that this will download and build the tool
from source)::
cargo install --locked --version $(scripts/min-tool-version.sh bindgen) bindgen-cli
cargo install --locked bindgen-cli
``bindgen`` needs to find a suitable ``libclang`` in order to work. If it is
not found (or a different ``libclang`` than the one found should be used),
the process can be tweaked using the environment variables understood by
``clang-sys`` (the Rust bindings crate that ``bindgen`` uses to access
``libclang``):
``bindgen`` uses the ``clang-sys`` crate to find a suitable ``libclang`` (which
may be linked statically, dynamically or loaded at runtime). By default, the
``cargo`` command above will produce a ``bindgen`` binary that will load
``libclang`` at runtime. If it is not found (or a different ``libclang`` than
the one found should be used), the process can be tweaked, e.g. by using the
``LIBCLANG_PATH`` environment variable. For details, please see ``clang-sys``'s
documentation at:
* ``LLVM_CONFIG_PATH`` can be pointed to an ``llvm-config`` executable.
* Or ``LIBCLANG_PATH`` can be pointed to a ``libclang`` shared library
or to the directory containing it.
* Or ``CLANG_PATH`` can be pointed to a ``clang`` executable.
For details, please see ``clang-sys``'s documentation at:
https://github.com/KyleMayes/clang-sys#linking
https://github.com/KyleMayes/clang-sys#environment-variables
@@ -164,20 +235,6 @@ can be installed manually::
The standalone installers also come with ``clippy``.
cargo
*****
``cargo`` is the Rust native build system. It is currently required to run
the tests since it is used to build a custom standard library that contains
the facilities provided by the custom ``alloc`` in the kernel. The tests can
be run using the ``rusttest`` Make target.
If ``rustup`` is being used, all the profiles already install the tool,
thus nothing needs to be done.
The standalone installers also come with ``cargo``.
rustdoc
*******
+2 -3
@@ -131,9 +131,8 @@ Additionally, there are the ``#[test]`` tests. These can be run using the
make LLVM=1 rusttest
This requires the kernel ``.config`` and downloads external repositories. It
runs the ``#[test]`` tests on the host (currently) and thus is fairly limited in
what these tests can test.
This requires the kernel ``.config``. It runs the ``#[test]`` tests on the host
(currently) and thus is fairly limited in what these tests can test.
The Kselftests
--------------
+1
@@ -5613,6 +5613,7 @@ M: Ira Weiny <ira.weiny@intel.com>
M: Dan Williams <dan.j.williams@intel.com>
L: linux-cxl@vger.kernel.org
S: Maintained
F: Documentation/driver-api/cxl
F: drivers/cxl/
F: include/linux/einj-cxl.h
F: include/linux/cxl-event.h
+16 -14
@@ -463,17 +463,17 @@ KBUILD_USERLDFLAGS := $(USERLDFLAGS)
# host programs.
export rust_common_flags := --edition=2021 \
-Zbinary_dep_depinfo=y \
-Dunsafe_op_in_unsafe_fn -Drust_2018_idioms \
-Dunreachable_pub -Dnon_ascii_idents \
-Dunsafe_op_in_unsafe_fn \
-Dnon_ascii_idents \
-Wrust_2018_idioms \
-Wunreachable_pub \
-Wmissing_docs \
-Drustdoc::missing_crate_level_docs \
-Dclippy::correctness -Dclippy::style \
-Dclippy::suspicious -Dclippy::complexity \
-Dclippy::perf \
-Dclippy::let_unit_value -Dclippy::mut_mut \
-Dclippy::needless_bitwise_bool \
-Dclippy::needless_continue \
-Dclippy::no_mangle_with_rust_abi \
-Wrustdoc::missing_crate_level_docs \
-Wclippy::all \
-Wclippy::mut_mut \
-Wclippy::needless_bitwise_bool \
-Wclippy::needless_continue \
-Wclippy::no_mangle_with_rust_abi \
-Wclippy::dbg_macro
KBUILD_HOSTCFLAGS := $(KBUILD_USERHOSTCFLAGS) $(HOST_LFS_CFLAGS) \
@@ -511,7 +511,6 @@ RUSTDOC = rustdoc
RUSTFMT = rustfmt
CLIPPY_DRIVER = clippy-driver
BINDGEN = bindgen
CARGO = cargo
PAHOLE = pahole
RESOLVE_BTFIDS = $(objtree)/tools/bpf/resolve_btfids/resolve_btfids
LEX = flex
@@ -577,7 +576,7 @@ KBUILD_RUSTFLAGS := $(rust_common_flags) \
-Csymbol-mangling-version=v0 \
-Crelocation-model=static \
-Zfunction-sections=n \
-Dclippy::float_arithmetic
-Wclippy::float_arithmetic
KBUILD_AFLAGS_KERNEL :=
KBUILD_CFLAGS_KERNEL :=
@@ -605,7 +604,7 @@ endif
export RUSTC_BOOTSTRAP := 1
export ARCH SRCARCH CONFIG_SHELL BASH HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE LD CC HOSTPKG_CONFIG
export RUSTC RUSTDOC RUSTFMT RUSTC_OR_CLIPPY_QUIET RUSTC_OR_CLIPPY BINDGEN CARGO
export RUSTC RUSTDOC RUSTFMT RUSTC_OR_CLIPPY_QUIET RUSTC_OR_CLIPPY BINDGEN
export HOSTRUSTC KBUILD_HOSTRUSTFLAGS
export CPP AR NM STRIP OBJCOPY OBJDUMP READELF PAHOLE RESOLVE_BTFIDS LEX YACC AWK INSTALLKERNEL
export PERL PYTHON3 CHECK CHECKFLAGS MAKE UTS_MACHINE HOSTCXX
@@ -2005,9 +2004,12 @@ quiet_cmd_tags = GEN $@
tags TAGS cscope gtags: FORCE
$(call cmd,tags)
# IDE support targets
# Generate rust-project.json (a file that describes the structure of non-Cargo
# Rust projects) for rust-analyzer (an implementation of the Language Server
# Protocol).
PHONY += rust-analyzer
rust-analyzer:
$(Q)$(CONFIG_SHELL) $(srctree)/scripts/rust_is_available.sh
$(Q)$(MAKE) $(build)=rust $@
# Script to generate missing namespace dependencies
+1 -1
@@ -110,7 +110,7 @@ static inline void pgd_list_del(pgd_t *pgd)
#define UNSHARED_PTRS_PER_PGD \
(SHARED_KERNEL_PMD ? KERNEL_PGD_BOUNDARY : PTRS_PER_PGD)
#define MAX_UNSHARED_PTRS_PER_PGD \
max_t(size_t, KERNEL_PGD_BOUNDARY, PTRS_PER_PGD)
MAX_T(size_t, KERNEL_PGD_BOUNDARY, PTRS_PER_PGD)
static void pgd_set_mm(pgd_t *pgd, struct mm_struct *mm)
+1 -1
@@ -663,12 +663,12 @@ void del_gendisk(struct gendisk *disk)
*/
if (!test_bit(GD_DEAD, &disk->state))
blk_report_disk_dead(disk, false);
__blk_mark_disk_dead(disk);
/*
* Drop all partitions now that the disk is marked dead.
*/
mutex_lock(&disk->open_mutex);
__blk_mark_disk_dead(disk);
xa_for_each_start(&disk->part_tbl, idx, part, 1)
drop_partition(part);
mutex_unlock(&disk->open_mutex);
+4
@@ -3422,6 +3422,7 @@ void drbd_uuid_set_bm(struct drbd_device *device, u64 val) __must_hold(local)
/**
* drbd_bmio_set_n_write() - io_fn for drbd_queue_bitmap_io() or drbd_bitmap_io()
* @device: DRBD device.
* @peer_device: Peer DRBD device.
*
* Sets all bits in the bitmap and writes the whole bitmap to stable storage.
*/
@@ -3448,6 +3449,7 @@ int drbd_bmio_set_n_write(struct drbd_device *device,
/**
* drbd_bmio_clear_n_write() - io_fn for drbd_queue_bitmap_io() or drbd_bitmap_io()
* @device: DRBD device.
* @peer_device: Peer DRBD device.
*
* Clears all bits in the bitmap and writes the whole bitmap to stable storage.
*/
@@ -3501,6 +3503,7 @@ static int w_bitmap_io(struct drbd_work *w, int unused)
* @done: callback to be called after the bitmap IO was performed
* @why: Descriptive text of the reason for doing the IO
* @flags: Bitmap flags
* @peer_device: Peer DRBD device.
*
* While IO on the bitmap happens we freeze application IO thus we ensure
* that drbd_set_out_of_sync() can not be called. This function MAY ONLY be
@@ -3549,6 +3552,7 @@ void drbd_queue_bitmap_io(struct drbd_device *device,
* @io_fn: IO callback to be called when bitmap IO is possible
* @why: Descriptive text of the reason for doing the IO
* @flags: Bitmap flags
* @peer_device: Peer DRBD device.
*
* freezes application IO while that the actual IO operations runs. This
* functions MAY NOT be called from worker context.
+4 -1
@@ -48,6 +48,9 @@
#define UBLK_MINORS (1U << MINORBITS)
/* private ioctl command mirror */
#define UBLK_CMD_DEL_DEV_ASYNC _IOC_NR(UBLK_U_CMD_DEL_DEV_ASYNC)
/* All UBLK_F_* have to be included into UBLK_F_ALL */
#define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
| UBLK_F_URING_CMD_COMP_IN_TASK \
@@ -2903,7 +2906,7 @@ static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
case UBLK_CMD_DEL_DEV:
ret = ublk_ctrl_del_dev(&ub, true);
break;
case UBLK_U_CMD_DEL_DEV_ASYNC:
case UBLK_CMD_DEL_DEV_ASYNC:
ret = ublk_ctrl_del_dev(&ub, false);
break;
case UBLK_CMD_GET_QUEUE_AFFINITY:
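The ublk hunk above replaces a full ioctl-encoded opcode in a ``case`` label with its bare command number. A minimal sketch of why that matters, assuming (as the neighbouring ``UBLK_CMD_*`` labels suggest) that the dispatcher switches on the decoded command number. All ``DEMO_*`` names, the payload struct and the numeric values are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <sys/ioctl.h>

/* Hypothetical control payload, for illustration only. */
struct demo_ctrl_cmd {
	unsigned int dev_id;
	unsigned int len;
};

/* Hypothetical ioctl-encoded opcodes (direction, type, nr, size). */
#define DEMO_U_CMD_DEL_DEV		_IOWR('u', 0x05, struct demo_ctrl_cmd)
#define DEMO_U_CMD_DEL_DEV_ASYNC	_IOR('u', 0x14, struct demo_ctrl_cmd)

/* "Mirror" values: only the command number field of each opcode. */
#define DEMO_CMD_DEL_DEV		_IOC_NR(DEMO_U_CMD_DEL_DEV)
#define DEMO_CMD_DEL_DEV_ASYNC		_IOC_NR(DEMO_U_CMD_DEL_DEV_ASYNC)

/*
 * Dispatch on the decoded command number.
 * Returns 1 for synchronous delete, 0 for asynchronous delete and
 * -1 for an unknown command.
 */
static int demo_dispatch(unsigned int cmd_op)
{
	switch (_IOC_NR(cmd_op)) {
	case DEMO_CMD_DEL_DEV:
		return 1;
	case DEMO_CMD_DEL_DEV_ASYNC:
		return 0;
	default:
		return -1;
	}
}
```

Because the switch expression holds only the number field, a label written as the full encoded opcode (``DEMO_U_CMD_DEL_DEV_ASYNC`` here) could never match, which is the kind of mismatch the one-line change appears to correct.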
+65 -60
@@ -22,56 +22,42 @@ static const guid_t acpi_cxl_qtg_id_guid =
GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
/*
* Find a targets entry (n) in the host bridge interleave list.
* CXL Specification 3.0 Table 9-22
*/
static int cxl_xor_calc_n(u64 hpa, struct cxl_cxims_data *cximsd, int iw,
int ig)
{
int i = 0, n = 0;
u8 eiw;
/* IW: 2,4,6,8,12,16 begin building 'n' using xormaps */
if (iw != 3) {
for (i = 0; i < cximsd->nr_maps; i++)
n |= (hweight64(hpa & cximsd->xormaps[i]) & 1) << i;
}
/* IW: 3,6,12 add a modulo calculation to 'n' */
if (!is_power_of_2(iw)) {
if (ways_to_eiw(iw, &eiw))
return -1;
hpa &= GENMASK_ULL(51, eiw + ig);
n |= do_div(hpa, 3) << i;
}
return n;
}
static struct cxl_dport *cxl_hb_xor(struct cxl_root_decoder *cxlrd, int pos)
static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
{
struct cxl_cxims_data *cximsd = cxlrd->platform_data;
struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
struct cxl_decoder *cxld = &cxlsd->cxld;
int ig = cxld->interleave_granularity;
int iw = cxld->interleave_ways;
int n = 0;
u64 hpa;
int hbiw = cxlrd->cxlsd.nr_targets;
u64 val;
int pos;
if (dev_WARN_ONCE(&cxld->dev,
cxld->interleave_ways != cxlsd->nr_targets,
"misconfigured root decoder\n"))
return NULL;
/* No xormaps for host bridge interleave ways of 1 or 3 */
if (hbiw == 1 || hbiw == 3)
return hpa;
hpa = cxlrd->res->start + pos * ig;
/*
* For root decoders using xormaps (hbiw: 2,4,6,8,12,16) restore
* the position bit to its value before the xormap was applied at
* HPA->DPA translation.
*
* pos is the lowest set bit in an XORMAP
* val is the XORALLBITS(HPA & XORMAP)
*
* XORALLBITS: The CXL spec (3.1 Table 9-22) defines XORALLBITS
* as an operation that outputs a single bit by XORing all the
* bits in the input (hpa & xormap). Implement XORALLBITS using
* hweight64(). If the hamming weight is even the XOR of those
* bits results in val==0, if odd the XOR result is val==1.
*/
/* Entry (n) is 0 for no interleave (iw == 1) */
if (iw != 1)
n = cxl_xor_calc_n(hpa, cximsd, iw, ig);
for (int i = 0; i < cximsd->nr_maps; i++) {
if (!cximsd->xormaps[i])
continue;
pos = __ffs(cximsd->xormaps[i]);
val = (hweight64(hpa & cximsd->xormaps[i]) & 1);
hpa = (hpa & ~(1ULL << pos)) | (val << pos);
}
if (n < 0)
return NULL;
return cxlrd->cxlsd.target[n];
return hpa;
}
struct cxl_cxims_context {
@@ -361,7 +347,6 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
struct cxl_port *root_port = ctx->root_port;
struct cxl_cxims_context cxims_ctx;
struct device *dev = ctx->dev;
cxl_calc_hb_fn cxl_calc_hb;
struct cxl_decoder *cxld;
unsigned int ways, i, ig;
int rc;
@@ -389,13 +374,9 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
if (rc)
return rc;
if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_MODULO)
cxl_calc_hb = cxl_hb_modulo;
else
cxl_calc_hb = cxl_hb_xor;
struct cxl_root_decoder *cxlrd __free(put_cxlrd) =
cxl_root_decoder_alloc(root_port, ways, cxl_calc_hb);
cxl_root_decoder_alloc(root_port, ways);
if (IS_ERR(cxlrd))
return PTR_ERR(cxlrd);
@@ -434,6 +415,9 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
cxlrd->qos_class = cfmws->qtg_id;
if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR)
cxlrd->hpa_to_spa = cxl_xor_hpa_to_spa;
rc = cxl_decoder_add(cxld, target_map);
if (rc)
return rc;
@@ -482,6 +466,8 @@ struct cxl_chbs_context {
unsigned long long uid;
resource_size_t base;
u32 cxl_version;
int nr_versions;
u32 saved_version;
};
static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
@@ -490,22 +476,31 @@ static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
struct cxl_chbs_context *ctx = arg;
struct acpi_cedt_chbs *chbs;
if (ctx->base != CXL_RESOURCE_NONE)
return 0;
chbs = (struct acpi_cedt_chbs *) header;
if (ctx->uid != chbs->uid)
return 0;
ctx->cxl_version = chbs->cxl_version;
if (!chbs->base)
return 0;
if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
chbs->length != CXL_RCRB_SIZE)
return 0;
if (!chbs->base)
return 0;
if (ctx->saved_version != chbs->cxl_version) {
/*
* cxl_version cannot be overwritten before the next two
* checks, then use saved_version
*/
ctx->saved_version = chbs->cxl_version;
ctx->nr_versions++;
}
if (ctx->base != CXL_RESOURCE_NONE)
return 0;
if (ctx->uid != chbs->uid)
return 0;
ctx->cxl_version = chbs->cxl_version;
ctx->base = chbs->base;
return 0;
@@ -529,10 +524,19 @@ static int cxl_get_chbs(struct device *dev, struct acpi_device *hb,
.uid = uid,
.base = CXL_RESOURCE_NONE,
.cxl_version = UINT_MAX,
.saved_version = UINT_MAX,
};
acpi_table_parse_cedt(ACPI_CEDT_TYPE_CHBS, cxl_get_chbs_iter, ctx);
if (ctx->nr_versions > 1) {
/*
* Disclaim eRCD support given some component register may
* only be found via CHBCR
*/
dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");
}
return 0;
}
@@ -921,6 +925,7 @@ static void __exit cxl_acpi_exit(void)
/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
subsys_initcall(cxl_acpi_init);
module_exit(cxl_acpi_exit);
MODULE_DESCRIPTION("CXL ACPI: Platform Support");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
MODULE_IMPORT_NS(ACPI);
+4 -4
@@ -28,12 +28,12 @@ int cxl_region_init(void);
void cxl_region_exit(void);
int cxl_get_poison_by_endpoint(struct cxl_port *port);
struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
u64 cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
#else
static inline u64
cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, u64 dpa)
static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
const struct cxl_memdev *cxlmd, u64 dpa)
{
return ULLONG_MAX;
}
+2 -2
@@ -875,10 +875,10 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
guard(rwsem_read)(&cxl_region_rwsem);
guard(rwsem_read)(&cxl_dpa_rwsem);
dpa = le64_to_cpu(evt->common.phys_addr) & CXL_DPA_MASK;
dpa = le64_to_cpu(evt->media_hdr.phys_addr) & CXL_DPA_MASK;
cxlr = cxl_dpa_to_region(cxlmd, dpa);
if (cxlr)
hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
hpa = cxl_dpa_to_hpa(cxlr, cxlmd, dpa);
if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
trace_cxl_general_media(cxlmd, type, cxlr, hpa,
+4 -4
@@ -338,10 +338,6 @@ int cxl_dvsec_rr_decode(struct device *dev, int d,
if (rc)
return rc;
rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
if (rc)
return rc;
if (!(cap & CXL_DVSEC_MEM_CAPABLE)) {
dev_dbg(dev, "Not MEM Capable\n");
return -ENXIO;
@@ -368,6 +364,10 @@ int cxl_dvsec_rr_decode(struct device *dev, int d,
* disabled, and they will remain moot after the HDM Decoder
* capability is enabled.
*/
rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
if (rc)
return rc;
info->mem_enabled = FIELD_GET(CXL_DVSEC_MEM_ENABLE, ctrl);
if (!info->mem_enabled)
return 0;
+2 -19
@@ -1733,21 +1733,6 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
return 0;
}
struct cxl_dport *cxl_hb_modulo(struct cxl_root_decoder *cxlrd, int pos)
{
struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
struct cxl_decoder *cxld = &cxlsd->cxld;
int iw;
iw = cxld->interleave_ways;
if (dev_WARN_ONCE(&cxld->dev, iw != cxlsd->nr_targets,
"misconfigured root decoder\n"))
return NULL;
return cxlrd->cxlsd.target[pos % iw];
}
EXPORT_SYMBOL_NS_GPL(cxl_hb_modulo, CXL);
static struct lock_class_key cxl_decoder_key;
/**
@@ -1807,7 +1792,6 @@ static int cxl_switch_decoder_init(struct cxl_port *port,
* cxl_root_decoder_alloc - Allocate a root level decoder
* @port: owning CXL root of this decoder
* @nr_targets: static number of downstream targets
* @calc_hb: which host bridge covers the n'th position by granularity
*
* Return: A new cxl decoder to be registered by cxl_decoder_add(). A
* 'CXL root' decoder is one that decodes from a top-level / static platform
@@ -1815,8 +1799,7 @@ static int cxl_switch_decoder_init(struct cxl_port *port,
* topology.
*/
struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets,
cxl_calc_hb_fn calc_hb)
unsigned int nr_targets)
{
struct cxl_root_decoder *cxlrd;
struct cxl_switch_decoder *cxlsd;
@@ -1838,7 +1821,6 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
return ERR_PTR(rc);
}
cxlrd->calc_hb = calc_hb;
mutex_init(&cxlrd->range_lock);
cxld = &cxlsd->cxld;
@@ -2356,5 +2338,6 @@ static void cxl_core_exit(void)
subsys_initcall(cxl_core_init);
module_exit(cxl_core_exit);
MODULE_DESCRIPTION("CXL: Core Compute Express Link support");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
+71 -34
@@ -9,6 +9,7 @@
#include <linux/uuid.h>
#include <linux/sort.h>
#include <linux/idr.h>
#include <linux/memory-tiers.h>
#include <cxlmem.h>
#include <cxl.h>
#include "core.h"
@@ -1632,10 +1633,13 @@ static int cxl_region_attach_position(struct cxl_region *cxlr,
const struct cxl_dport *dport, int pos)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
struct cxl_decoder *cxld = &cxlsd->cxld;
int iw = cxld->interleave_ways;
struct cxl_port *iter;
int rc;
if (cxlrd->calc_hb(cxlrd, pos) != dport) {
if (dport != cxlrd->cxlsd.target[pos % iw]) {
dev_dbg(&cxlr->dev, "%s:%s invalid target position for %s\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
dev_name(&cxlrd->cxlsd.cxld.dev));
@@ -2310,6 +2314,7 @@ static void unregister_region(void *_cxlr)
int i;
unregister_memory_notifier(&cxlr->memory_notifier);
unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
device_del(&cxlr->dev);
/*
@@ -2386,14 +2391,23 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
return true;
}
static int cxl_region_nid(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
struct resource *res;
guard(rwsem_read)(&cxl_region_rwsem);
res = p->res;
if (!res)
return NUMA_NO_NODE;
return phys_to_target_node(res->start);
}
static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
unsigned long action, void *arg)
{
struct cxl_region *cxlr = container_of(nb, struct cxl_region,
memory_notifier);
struct cxl_region_params *p = &cxlr->params;
struct cxl_endpoint_decoder *cxled = p->targets[0];
struct cxl_decoder *cxld = &cxled->cxld;
struct memory_notify *mnb = arg;
int nid = mnb->status_change_nid;
int region_nid;
@@ -2401,7 +2415,7 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
return NOTIFY_DONE;
region_nid = phys_to_target_node(cxld->hpa_range.start);
region_nid = cxl_region_nid(cxlr);
if (nid != region_nid)
return NOTIFY_DONE;
@@ -2411,6 +2425,27 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
return NOTIFY_OK;
}
static int cxl_region_calculate_adistance(struct notifier_block *nb,
unsigned long nid, void *data)
{
struct cxl_region *cxlr = container_of(nb, struct cxl_region,
adist_notifier);
struct access_coordinate *perf;
int *adist = data;
int region_nid;
region_nid = cxl_region_nid(cxlr);
if (nid != region_nid)
return NOTIFY_OK;
perf = &cxlr->coord[ACCESS_COORDINATE_CPU];
if (mt_perf_to_adistance(perf, adist))
return NOTIFY_OK;
return NOTIFY_STOP;
}
/**
* devm_cxl_add_region - Adds a region to a decoder
* @cxlrd: root decoder
@@ -2453,6 +2488,10 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
cxlr->memory_notifier.priority = CXL_CALLBACK_PRI;
register_memory_notifier(&cxlr->memory_notifier);
cxlr->adist_notifier.notifier_call = cxl_region_calculate_adistance;
cxlr->adist_notifier.priority = 100;
register_mt_adistance_algorithm(&cxlr->adist_notifier);
rc = devm_add_action_or_reset(port->uport_dev, unregister_region, cxlr);
if (rc)
return ERR_PTR(rc);
@@ -2816,20 +2855,13 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
return ctx.cxlr;
}
static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos)
static bool cxl_is_hpa_in_chunk(u64 hpa, struct cxl_region *cxlr, int pos)
{
struct cxl_region_params *p = &cxlr->params;
int gran = p->interleave_granularity;
int ways = p->interleave_ways;
u64 offset;
/* Is the hpa within this region at all */
if (hpa < p->res->start || hpa > p->res->end) {
dev_dbg(&cxlr->dev,
"Addr trans fail: hpa 0x%llx not in region\n", hpa);
return false;
}
/* Is the hpa in an expected chunk for its pos(-ition) */
offset = hpa - p->res->start;
offset = do_div(offset, gran * ways);
@@ -2842,15 +2874,26 @@ static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos)
return false;
}
static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr,
struct cxl_endpoint_decoder *cxled)
u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa)
{
struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
u64 dpa_offset, hpa_offset, bits_upper, mask_upper, hpa;
struct cxl_region_params *p = &cxlr->params;
int pos = cxled->pos;
struct cxl_endpoint_decoder *cxled = NULL;
u16 eig = 0;
u8 eiw = 0;
int pos;
for (int i = 0; i < p->nr_targets; i++) {
cxled = p->targets[i];
if (cxlmd == cxled_to_memdev(cxled))
break;
}
if (!cxled || cxlmd != cxled_to_memdev(cxled))
return ULLONG_MAX;
pos = cxled->pos;
ways_to_eiw(p->interleave_ways, &eiw);
granularity_to_eig(p->interleave_granularity, &eig);
@@ -2884,29 +2927,23 @@ static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr,
/* Apply the hpa_offset to the region base address */
hpa = hpa_offset + p->res->start;
if (!cxl_is_hpa_in_range(hpa, cxlr, cxled->pos))
/* Root decoder translation overrides typical modulo decode */
if (cxlrd->hpa_to_spa)
hpa = cxlrd->hpa_to_spa(cxlrd, hpa);
if (hpa < p->res->start || hpa > p->res->end) {
dev_dbg(&cxlr->dev,
"Addr trans fail: hpa 0x%llx not in region\n", hpa);
return ULLONG_MAX;
}
/* Simple chunk check, by pos & gran, only applies to modulo decodes */
if (!cxlrd->hpa_to_spa && (!cxl_is_hpa_in_chunk(hpa, cxlr, pos)))
return ULLONG_MAX;
return hpa;
}
u64 cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa)
{
struct cxl_region_params *p = &cxlr->params;
struct cxl_endpoint_decoder *cxled = NULL;
for (int i = 0; i < p->nr_targets; i++) {
cxled = p->targets[i];
if (cxlmd == cxled_to_memdev(cxled))
break;
}
if (!cxled || cxlmd != cxled_to_memdev(cxled))
return ULLONG_MAX;
return cxl_dpa_to_hpa(dpa, cxlr, cxled);
}
static struct lock_class_key cxl_pmem_region_key;
static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
+18 -18
@@ -340,23 +340,23 @@ TRACE_EVENT(cxl_general_media,
),
TP_fast_assign(
CXL_EVT_TP_fast_assign(cxlmd, log, rec->hdr);
CXL_EVT_TP_fast_assign(cxlmd, log, rec->media_hdr.hdr);
__entry->hdr_uuid = CXL_EVENT_GEN_MEDIA_UUID;
/* General Media */
__entry->dpa = le64_to_cpu(rec->phys_addr);
__entry->dpa = le64_to_cpu(rec->media_hdr.phys_addr);
__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
/* Mask after flags have been parsed */
__entry->dpa &= CXL_DPA_MASK;
__entry->descriptor = rec->descriptor;
__entry->type = rec->type;
__entry->transaction_type = rec->transaction_type;
__entry->channel = rec->channel;
__entry->rank = rec->rank;
__entry->descriptor = rec->media_hdr.descriptor;
__entry->type = rec->media_hdr.type;
__entry->transaction_type = rec->media_hdr.transaction_type;
__entry->channel = rec->media_hdr.channel;
__entry->rank = rec->media_hdr.rank;
__entry->device = get_unaligned_le24(rec->device);
memcpy(__entry->comp_id, &rec->component_id,
CXL_EVENT_GEN_MED_COMP_ID_SIZE);
__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
__entry->validity_flags = get_unaligned_le16(&rec->media_hdr.validity_flags);
__entry->hpa = hpa;
if (cxlr) {
__assign_str(region_name);
@@ -440,19 +440,19 @@ TRACE_EVENT(cxl_dram,
),
TP_fast_assign(
CXL_EVT_TP_fast_assign(cxlmd, log, rec->hdr);
CXL_EVT_TP_fast_assign(cxlmd, log, rec->media_hdr.hdr);
__entry->hdr_uuid = CXL_EVENT_DRAM_UUID;
/* DRAM */
__entry->dpa = le64_to_cpu(rec->phys_addr);
__entry->dpa = le64_to_cpu(rec->media_hdr.phys_addr);
__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
__entry->dpa &= CXL_DPA_MASK;
__entry->descriptor = rec->descriptor;
__entry->type = rec->type;
__entry->transaction_type = rec->transaction_type;
__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
__entry->channel = rec->channel;
__entry->rank = rec->rank;
__entry->descriptor = rec->media_hdr.descriptor;
__entry->type = rec->media_hdr.type;
__entry->transaction_type = rec->media_hdr.transaction_type;
__entry->validity_flags = get_unaligned_le16(rec->media_hdr.validity_flags);
__entry->channel = rec->media_hdr.channel;
__entry->rank = rec->media_hdr.rank;
__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
__entry->bank_group = rec->bank_group;
__entry->bank = rec->bank;
@@ -704,8 +704,8 @@ TRACE_EVENT(cxl_poison,
if (cxlr) {
__assign_str(region);
memcpy(__entry->uuid, &cxlr->params.uuid, 16);
__entry->hpa = cxl_trace_hpa(cxlr, cxlmd,
__entry->dpa);
__entry->hpa = cxl_dpa_to_hpa(cxlr, cxlmd,
__entry->dpa);
} else {
__assign_str(region);
memset(__entry->uuid, 0, 16);
+6 -7
@@ -434,14 +434,13 @@ struct cxl_switch_decoder {
};
struct cxl_root_decoder;
typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
int pos);
typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
/**
* struct cxl_root_decoder - Static platform CXL address decoder
* @res: host / parent resource for region allocations
* @region_id: region id for next region provisioning event
* @calc_hb: which host bridge covers the n'th position by granularity
* @hpa_to_spa: translate CXL host-physical-address to Platform system-physical-address
* @platform_data: platform specific configuration data
* @range_lock: sync region autodiscovery by address range
* @qos_class: QoS performance class cookie
@@ -450,7 +449,7 @@ typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
struct cxl_root_decoder {
struct resource *res;
atomic_t region_id;
cxl_calc_hb_fn calc_hb;
cxl_hpa_to_spa_fn hpa_to_spa;
void *platform_data;
struct mutex range_lock;
int qos_class;
@@ -524,6 +523,7 @@ struct cxl_region_params {
* @params: active + config params for the region
* @coord: QoS access coordinates for the region
* @memory_notifier: notifier for setting the access coordinates to node
* @adist_notifier: notifier for calculating the abstract distance of node
*/
struct cxl_region {
struct device dev;
@@ -536,6 +536,7 @@ struct cxl_region {
struct cxl_region_params params;
struct access_coordinate coord[ACCESS_COORDINATE_MAX];
struct notifier_block memory_notifier;
struct notifier_block adist_notifier;
};
struct cxl_nvdimm_bridge {
@@ -774,9 +775,7 @@ bool is_root_decoder(struct device *dev);
bool is_switch_decoder(struct device *dev);
bool is_endpoint_decoder(struct device *dev);
struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets,
cxl_calc_hb_fn calc_hb);
struct cxl_dport *cxl_hb_modulo(struct cxl_root_decoder *cxlrd, int pos);
unsigned int nr_targets);
struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets);
int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
+2 -2
@@ -161,7 +161,7 @@ struct cxl_mbox_cmd {
C(FWRESET, -ENXIO, "FW failed to activate, needs cold reset"), \
C(HANDLE, -ENXIO, "one or more Event Record Handles were invalid"), \
C(PADDR, -EFAULT, "physical address specified is invalid"), \
C(POISONLMT, -ENXIO, "poison injection limit has been reached"), \
C(POISONLMT, -EBUSY, "poison injection limit has been reached"), \
C(MEDIAFAILURE, -ENXIO, "permanent issue with the media"), \
C(ABORT, -ENXIO, "background cmd was aborted by device"), \
C(SECURITY, -ENXIO, "not valid in the current security state"), \
@@ -563,7 +563,7 @@ enum cxl_opcode {
0x3b, 0x3f, 0x17)
#define DEFINE_CXL_VENDOR_DEBUG_UUID \
UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19, \
UUID_INIT(0x5e1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19, \
0x40, 0x3d, 0x86)
struct cxl_mbox_get_supported_logs {
+1
@@ -253,6 +253,7 @@ static struct cxl_driver cxl_mem_driver = {
module_cxl_driver(cxl_mem_driver);
MODULE_DESCRIPTION("CXL: Memory Expansion");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER);
+1
@@ -1066,5 +1066,6 @@ static void __exit cxl_pci_driver_exit(void)
module_init(cxl_pci_driver_init);
module_exit(cxl_pci_driver_exit);
MODULE_DESCRIPTION("CXL: PCI manageability");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
+1
@@ -453,6 +453,7 @@ static __exit void cxl_pmem_exit(void)
cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
}
MODULE_DESCRIPTION("CXL PMEM: Persistent Memory Support");
MODULE_LICENSE("GPL v2");
module_init(cxl_pmem_init);
module_exit(cxl_pmem_exit);
+1
@@ -209,6 +209,7 @@ static struct cxl_driver cxl_port_driver = {
};
module_cxl_driver(cxl_port_driver);
MODULE_DESCRIPTION("CXL: Port enumeration and services");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
MODULE_ALIAS_CXL(CXL_DEVICE_PORT);
+2 -2
@@ -109,8 +109,8 @@ static const u32 knl_interleave_list[] = {
0x104, 0x10c, 0x114, 0x11c, /* 20-23 */
};
#define MAX_INTERLEAVE \
(max_t(unsigned int, ARRAY_SIZE(sbridge_interleave_list), \
max_t(unsigned int, ARRAY_SIZE(ibridge_interleave_list), \
(MAX_T(unsigned int, ARRAY_SIZE(sbridge_interleave_list), \
MAX_T(unsigned int, ARRAY_SIZE(ibridge_interleave_list), \
ARRAY_SIZE(knl_interleave_list))))
struct interleave_pkg {
+1 -1
@@ -532,7 +532,7 @@ int drm_plane_create_color_properties(struct drm_plane *plane,
{
struct drm_device *dev = plane->dev;
struct drm_property *prop;
struct drm_prop_enum_list enum_list[max_t(int, DRM_COLOR_ENCODING_MAX,
struct drm_prop_enum_list enum_list[MAX_T(int, DRM_COLOR_ENCODING_MAX,
DRM_COLOR_RANGE_MAX)];
int i, len;
+3 -3
@@ -1776,7 +1776,7 @@ static void integrity_metadata(struct work_struct *w)
struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
char *checksums;
unsigned int extra_space = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0;
char checksums_onstack[max_t(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
char checksums_onstack[MAX_T(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
sector_t sector;
unsigned int sectors_to_process;
@@ -2064,7 +2064,7 @@ retry_kmap:
} while (++s < ic->sectors_per_block);
#ifdef INTERNAL_VERIFY
if (ic->internal_hash) {
char checksums_onstack[max_t(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
char checksums_onstack[MAX_T(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
integrity_sector_checksum(ic, logical_sector, mem + bv.bv_offset, checksums_onstack);
if (unlikely(memcmp(checksums_onstack, journal_entry_tag(ic, je), ic->tag_size))) {
@@ -2837,7 +2837,7 @@ static void do_journal_write(struct dm_integrity_c *ic, unsigned int write_start
unlikely(from_replay) &&
#endif
ic->internal_hash) {
char test_tag[max_t(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
char test_tag[MAX_T(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
integrity_sector_checksum(ic, sec + ((l - j) << ic->sb->log2_sectors_per_block),
(char *)access_journal_data(ic, i, l), test_tag);
+5 -4
@@ -390,7 +390,8 @@ int ubiblock_create(struct ubi_volume_info *vi)
ret = blk_mq_alloc_tag_set(&dev->tag_set);
if (ret) {
dev_err(disk_to_dev(dev->gd), "blk_mq_alloc_tag_set failed");
pr_err("ubiblock%d_%d: blk_mq_alloc_tag_set failed\n",
dev->ubi_num, dev->vol_id);
goto out_free_dev;
}
@@ -407,8 +408,8 @@ int ubiblock_create(struct ubi_volume_info *vi)
gd->minors = 1;
gd->first_minor = idr_alloc(&ubiblock_minor_idr, dev, 0, 0, GFP_KERNEL);
if (gd->first_minor < 0) {
dev_err(disk_to_dev(gd),
"block: dynamic minor allocation failed");
pr_err("ubiblock%d_%d: block: dynamic minor allocation failed\n",
dev->ubi_num, dev->vol_id);
ret = -ENODEV;
goto out_cleanup_disk;
}
@@ -669,7 +670,7 @@ err_unreg:
return ret;
}
void __exit ubiblock_exit(void)
void ubiblock_exit(void)
{
ubi_unregister_volume_notifier(&ubiblock_notifier);
ubiblock_remove_all();
+5 -2
@@ -112,7 +112,7 @@ static struct attribute *ubi_class_attrs[] = {
ATTRIBUTE_GROUPS(ubi_class);
/* Root UBI "class" object (corresponds to '/<sysfs>/class/ubi/') */
struct class ubi_class = {
const struct class ubi_class = {
.name = UBI_NAME_STR,
.class_groups = ubi_class_groups,
};
@@ -1372,7 +1372,7 @@ static int __init ubi_init(void)
/* See comment above re-ubi_is_module(). */
if (ubi_is_module())
goto out_slab;
goto out_debugfs;
}
register_mtd_user(&ubi_mtd_notifier);
@@ -1387,6 +1387,9 @@ static int __init ubi_init(void)
out_mtd_notifier:
unregister_mtd_user(&ubi_mtd_notifier);
ubiblock_exit();
out_debugfs:
ubi_debugfs_exit();
out_slab:
kmem_cache_destroy(ubi_wl_entry_slab);
out_dev_unreg:
+2 -2
@@ -598,9 +598,9 @@ int ubi_debugfs_init_dev(struct ubi_device *ubi)
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
n = snprintf(d->dfs_dir_name, UBI_DFS_DIR_LEN + 1, UBI_DFS_DIR_NAME,
n = snprintf(d->dfs_dir_name, UBI_DFS_DIR_LEN, UBI_DFS_DIR_NAME,
ubi->ubi_num);
if (n > UBI_DFS_DIR_LEN) {
if (n >= UBI_DFS_DIR_LEN) {
/* The array size is too small */
return -EINVAL;
}
+2 -1
@@ -1564,6 +1564,7 @@ int self_check_eba(struct ubi_device *ubi, struct ubi_attach_info *ai_fastmap,
GFP_KERNEL);
if (!fm_eba[i]) {
ret = -ENOMEM;
kfree(scan_eba[i]);
goto out_free;
}
@@ -1599,7 +1600,7 @@ int self_check_eba(struct ubi_device *ubi, struct ubi_attach_info *ai_fastmap,
}
out_free:
for (i = 0; i < num_volumes; i++) {
while (--i >= 0) {
if (!ubi->volumes[i])
continue;
+3 -3
@@ -6,7 +6,6 @@
/* UBI NVMEM provider */
#include "ubi.h"
#include <linux/nvmem-provider.h>
#include <asm/div64.h>
/* List of all NVMEM devices */
static LIST_HEAD(nvmem_devices);
@@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from,
struct ubi_nvmem *unv = priv;
struct ubi_volume_desc *desc;
uint32_t offs;
uint64_t lnum = from;
uint32_t lnum;
int err = 0;
desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
if (IS_ERR(desc))
return PTR_ERR(desc);
offs = do_div(lnum, unv->usable_leb_size);
offs = from % unv->usable_leb_size;
lnum = from / unv->usable_leb_size;
while (bytes_left) {
to_read = unv->usable_leb_size - offs;
+2 -2
@@ -420,7 +420,7 @@ struct ubi_debug_info {
unsigned int power_cut_min;
unsigned int power_cut_max;
unsigned int emulate_failures;
char dfs_dir_name[UBI_DFS_DIR_LEN + 1];
char dfs_dir_name[UBI_DFS_DIR_LEN];
struct dentry *dfs_dir;
struct dentry *dfs_chk_gen;
struct dentry *dfs_chk_io;
@@ -814,7 +814,7 @@ extern struct kmem_cache *ubi_wl_entry_slab;
extern const struct file_operations ubi_ctrl_cdev_operations;
extern const struct file_operations ubi_cdev_operations;
extern const struct file_operations ubi_vol_cdev_operations;
extern struct class ubi_class;
extern const struct class ubi_class;
extern struct mutex ubi_devices_mutex;
extern struct blocking_notifier_head ubi_notifiers;
@@ -2912,7 +2912,7 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
u32 channels_to_check = tx_channel_count > rx_channel_count ?
tx_channel_count : rx_channel_count;
u32 chan;
int status[max_t(u32, MTL_MAX_TX_QUEUES, MTL_MAX_RX_QUEUES)];
int status[MAX_T(u32, MTL_MAX_TX_QUEUES, MTL_MAX_RX_QUEUES)];
/* Make sure we never check beyond our status buffer. */
if (WARN_ON_ONCE(channels_to_check > ARRAY_SIZE(status)))
+7 -1
@@ -1876,12 +1876,18 @@ static void nvme_configure_pi_elbas(struct nvme_ns_head *head,
struct nvme_id_ns *id, struct nvme_id_ns_nvm *nvm)
{
u32 elbaf = le32_to_cpu(nvm->elbaf[nvme_lbaf_index(id->flbas)]);
u8 guard_type;
/* no support for storage tag formats right now */
if (nvme_elbaf_sts(elbaf))
return;
head->guard_type = nvme_elbaf_guard_type(elbaf);
guard_type = nvme_elbaf_guard_type(elbaf);
if ((nvm->pic & NVME_ID_NS_NVM_QPIFS) &&
guard_type == NVME_NVM_NS_QTYPE_GUARD)
guard_type = nvme_elbaf_qualified_guard_type(elbaf);
head->guard_type = guard_type;
switch (head->guard_type) {
case NVME_NVM_NS_64B_GUARD:
head->pi_size = sizeof(struct crc64_pi_tuple);
+2 -2
@@ -1403,10 +1403,10 @@ static void __nvmf_concat_opt_tokens(struct seq_file *seq_file)
tok = &opt_tokens[idx];
if (tok->token == NVMF_OPT_ERR)
continue;
seq_puts(seq_file, ",");
seq_putc(seq_file, ',');
seq_puts(seq_file, tok->pattern);
}
seq_puts(seq_file, "\n");
seq_putc(seq_file, '\n');
}
static int nvmf_dev_show(struct seq_file *seq_file, void *private)
+10 -2
@@ -863,7 +863,8 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
nvme_start_request(req);
return BLK_STS_OK;
out_unmap_data:
nvme_unmap_data(dev, req);
if (blk_rq_nr_phys_segments(req))
nvme_unmap_data(dev, req);
out_free_cmd:
nvme_cleanup_cmd(req);
return ret;
@@ -1309,7 +1310,7 @@ static void nvme_warn_reset(struct nvme_dev *dev, u32 csts)
dev_warn(dev->ctrl.device,
"Does your device have a faulty power saving mode enabled?\n");
dev_warn(dev->ctrl.device,
"Try \"nvme_core.default_ps_max_latency_us=0 pcie_aspm=off\" and report a bug\n");
"Try \"nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off\" and report a bug\n");
}
static enum blk_eh_timer_return nvme_timeout(struct request *req)
@@ -2968,6 +2969,13 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
return NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND;
}
/*
* NVMe SSD drops off the PCIe bus after system idle
* for 10 hours on a Lenovo N60z board.
*/
if (dmi_match(DMI_BOARD_NAME, "LXKT-ZXEG-N6"))
return NVME_QUIRK_NO_APST;
return 0;
}
+2 -3
@@ -233,13 +233,12 @@ static ssize_t nuse_show(struct device *dev, struct device_attribute *attr,
{
struct nvme_ns_head *head = dev_to_ns_head(dev);
struct gendisk *disk = dev_to_disk(dev);
struct block_device *bdev = disk->part0;
int ret;
if (nvme_disk_is_ns_head(bdev->bd_disk))
if (nvme_disk_is_ns_head(disk))
ret = ns_head_update_nuse(head);
else
ret = ns_update_nuse(bdev->bd_disk->private_data);
ret = ns_update_nuse(disk->private_data);
if (ret)
return ret;
+55 -10
@@ -17,6 +17,7 @@
#include <linux/writeback.h>
#include <linux/mount.h>
#include <linux/fs_context.h>
#include <linux/fs_parser.h>
#include <linux/namei.h>
#include "hostfs.h"
#include <init.h>
@@ -929,7 +930,6 @@ static const struct inode_operations hostfs_link_iops = {
static int hostfs_fill_super(struct super_block *sb, struct fs_context *fc)
{
struct hostfs_fs_info *fsi = sb->s_fs_info;
const char *host_root = fc->source;
struct inode *root_inode;
int err;
@@ -943,15 +943,6 @@ static int hostfs_fill_super(struct super_block *sb, struct fs_context *fc)
if (err)
return err;
/* NULL is printed as '(null)' by printf(): avoid that. */
if (fc->source == NULL)
host_root = "";
fsi->host_root_path =
kasprintf(GFP_KERNEL, "%s/%s", root_ino, host_root);
if (fsi->host_root_path == NULL)
return -ENOMEM;
root_inode = hostfs_iget(sb, fsi->host_root_path);
if (IS_ERR(root_inode))
return PTR_ERR(root_inode);
@@ -977,6 +968,58 @@ static int hostfs_fill_super(struct super_block *sb, struct fs_context *fc)
return 0;
}
enum hostfs_parma {
Opt_hostfs,
};
static const struct fs_parameter_spec hostfs_param_specs[] = {
fsparam_string_empty("hostfs", Opt_hostfs),
{}
};
static int hostfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
{
struct hostfs_fs_info *fsi = fc->s_fs_info;
struct fs_parse_result result;
char *host_root;
int opt;
opt = fs_parse(fc, hostfs_param_specs, param, &result);
if (opt < 0)
return opt;
switch (opt) {
case Opt_hostfs:
host_root = param->string;
if (!*host_root)
host_root = "";
fsi->host_root_path =
kasprintf(GFP_KERNEL, "%s/%s", root_ino, host_root);
if (fsi->host_root_path == NULL)
return -ENOMEM;
break;
}
return 0;
}
static int hostfs_parse_monolithic(struct fs_context *fc, void *data)
{
struct hostfs_fs_info *fsi = fc->s_fs_info;
char *host_root = (char *)data;
/* NULL is printed as '(null)' by printf(): avoid that. */
if (host_root == NULL)
host_root = "";
fsi->host_root_path =
kasprintf(GFP_KERNEL, "%s/%s", root_ino, host_root);
if (fsi->host_root_path == NULL)
return -ENOMEM;
return 0;
}
static int hostfs_fc_get_tree(struct fs_context *fc)
{
return get_tree_nodev(fc, hostfs_fill_super);
@@ -994,6 +1037,8 @@ static void hostfs_fc_free(struct fs_context *fc)
}
static const struct fs_context_operations hostfs_context_ops = {
.parse_monolithic = hostfs_parse_monolithic,
.parse_param = hostfs_parse_param,
.get_tree = hostfs_fc_get_tree,
.free = hostfs_fc_free,
};
+4 -4
@@ -1894,12 +1894,12 @@ init_cifs(void)
WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
if (!serverclose_wq) {
rc = -ENOMEM;
goto out_destroy_serverclose_wq;
goto out_destroy_deferredclose_wq;
}
rc = cifs_init_inodecache();
if (rc)
goto out_destroy_deferredclose_wq;
goto out_destroy_serverclose_wq;
rc = cifs_init_netfs();
if (rc)
@@ -1967,6 +1967,8 @@ out_destroy_netfs:
cifs_destroy_netfs();
out_destroy_inodecache:
cifs_destroy_inodecache();
out_destroy_serverclose_wq:
destroy_workqueue(serverclose_wq);
out_destroy_deferredclose_wq:
destroy_workqueue(deferredclose_wq);
out_destroy_cifsoplockd_wq:
@@ -1977,8 +1979,6 @@ out_destroy_decrypt_wq:
destroy_workqueue(decrypt_wq);
out_destroy_cifsiod_wq:
destroy_workqueue(cifsiod_wq);
out_destroy_serverclose_wq:
destroy_workqueue(serverclose_wq);
out_clean_proc:
cifs_proc_clean();
return rc;
+23 -1
@@ -2614,6 +2614,13 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
cifs_dbg(VFS, "Server does not support mounting with posix SMB3.11 extensions\n");
rc = -EOPNOTSUPP;
goto out_fail;
} else if (ses->server->vals->protocol_id == SMB10_PROT_ID)
if (cap_unix(ses))
cifs_dbg(FYI, "Unix Extensions requested on SMB1 mount\n");
else {
cifs_dbg(VFS, "SMB1 Unix Extensions not supported by server\n");
rc = -EOPNOTSUPP;
goto out_fail;
} else {
cifs_dbg(VFS,
"Check vers= mount option. SMB3.11 disabled but required for POSIX extensions\n");
@@ -3686,6 +3693,7 @@ error:
}
#endif
#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
/*
* Issue a TREE_CONNECT request.
*/
@@ -3807,11 +3815,25 @@ CIFSTCon(const unsigned int xid, struct cifs_ses *ses,
else
tcon->Flags = 0;
cifs_dbg(FYI, "Tcon flags: 0x%x\n", tcon->Flags);
}
/*
* reset_cifs_unix_caps calls QFSInfo, which requires
* need_reconnect to be false. reset_caps is only needed
* in the reconnect case, so check the need_reconnect flag
* here. The caller also clears need_reconnect once the
* tcon succeeds, but the flag must be cleared earlier
* for a unix extensions reconnect.
*/
if (tcon->need_reconnect && tcon->unix_ext) {
cifs_dbg(FYI, "resetting caps for %s\n", tcon->tree_name);
tcon->need_reconnect = false;
reset_cifs_unix_caps(xid, tcon, NULL, NULL);
}
}
cifs_buf_release(smb_buffer);
return rc;
}
#endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */
static void delayed_free(struct rcu_head *p)
{
+19 -1
@@ -1812,6 +1812,10 @@ smb2_copychunk_range(const unsigned int xid,
tcon = tlink_tcon(trgtfile->tlink);
trace_smb3_copychunk_enter(xid, srcfile->fid.volatile_fid,
trgtfile->fid.volatile_fid, tcon->tid,
tcon->ses->Suid, src_off, dest_off, len);
while (len > 0) {
pcchunk->SourceOffset = cpu_to_le64(src_off);
pcchunk->TargetOffset = cpu_to_le64(dest_off);
@@ -1863,6 +1867,9 @@ smb2_copychunk_range(const unsigned int xid,
le32_to_cpu(retbuf->ChunksWritten),
le32_to_cpu(retbuf->ChunkBytesWritten),
bytes_written);
trace_smb3_copychunk_done(xid, srcfile->fid.volatile_fid,
trgtfile->fid.volatile_fid, tcon->tid,
tcon->ses->Suid, src_off, dest_off, len);
} else if (rc == -EINVAL) {
if (ret_data_len != sizeof(struct copychunk_ioctl_rsp))
goto cchunk_out;
@@ -2046,7 +2053,9 @@ smb2_duplicate_extents(const unsigned int xid,
dup_ext_buf.ByteCount = cpu_to_le64(len);
cifs_dbg(FYI, "Duplicate extents: src off %lld dst off %lld len %lld\n",
src_off, dest_off, len);
trace_smb3_clone_enter(xid, srcfile->fid.volatile_fid,
trgtfile->fid.volatile_fid, tcon->tid,
tcon->ses->Suid, src_off, dest_off, len);
inode = d_inode(trgtfile->dentry);
if (inode->i_size < dest_off + len) {
rc = smb2_set_file_size(xid, tcon, trgtfile, dest_off + len, false);
@@ -2075,6 +2084,15 @@ smb2_duplicate_extents(const unsigned int xid,
cifs_dbg(FYI, "Non-zero response length in duplicate extents\n");
duplicate_extents_out:
if (rc)
trace_smb3_clone_err(xid, srcfile->fid.volatile_fid,
trgtfile->fid.volatile_fid,
tcon->tid, tcon->ses->Suid, src_off,
dest_off, len, rc);
else
trace_smb3_clone_done(xid, srcfile->fid.volatile_fid,
trgtfile->fid.volatile_fid, tcon->tid,
tcon->ses->Suid, src_off, dest_off, len);
return rc;
}
+7 -1
@@ -1562,8 +1562,14 @@ SMB2_sess_sendreceive(struct SMB2_sess_data *sess_data)
cifs_small_buf_release(sess_data->iov[0].iov_base);
if (rc == 0)
sess_data->ses->expired_pwd = false;
else if ((rc == -EACCES) || (rc == -EKEYEXPIRED) || (rc == -EKEYREVOKED))
else if ((rc == -EACCES) || (rc == -EKEYEXPIRED) || (rc == -EKEYREVOKED)) {
if (sess_data->ses->expired_pwd == false)
trace_smb3_key_expired(sess_data->server->hostname,
sess_data->ses->user_name,
sess_data->server->conn_id,
&sess_data->server->dstaddr, rc);
sess_data->ses->expired_pwd = true;
}
memcpy(&sess_data->iov[0], &rsp_iov, sizeof(struct kvec));
+150
@@ -206,6 +206,116 @@ DEFINE_SMB3_OTHER_ERR_EVENT(query_dir_err);
DEFINE_SMB3_OTHER_ERR_EVENT(zero_err);
DEFINE_SMB3_OTHER_ERR_EVENT(falloc_err);
/*
* For logging errors in reflink and copy_range ops e.g. smb2_copychunk_range
* and smb2_duplicate_extents
*/
DECLARE_EVENT_CLASS(smb3_copy_range_err_class,
TP_PROTO(unsigned int xid,
__u64 src_fid,
__u64 target_fid,
__u32 tid,
__u64 sesid,
__u64 src_offset,
__u64 target_offset,
__u32 len,
int rc),
TP_ARGS(xid, src_fid, target_fid, tid, sesid, src_offset, target_offset, len, rc),
TP_STRUCT__entry(
__field(unsigned int, xid)
__field(__u64, src_fid)
__field(__u64, target_fid)
__field(__u32, tid)
__field(__u64, sesid)
__field(__u64, src_offset)
__field(__u64, target_offset)
__field(__u32, len)
__field(int, rc)
),
TP_fast_assign(
__entry->xid = xid;
__entry->src_fid = src_fid;
__entry->target_fid = target_fid;
__entry->tid = tid;
__entry->sesid = sesid;
__entry->src_offset = src_offset;
__entry->target_offset = target_offset;
__entry->len = len;
__entry->rc = rc;
),
TP_printk("\txid=%u sid=0x%llx tid=0x%x source fid=0x%llx source offset=0x%llx target fid=0x%llx target offset=0x%llx len=0x%x rc=%d",
__entry->xid, __entry->sesid, __entry->tid, __entry->src_fid,
__entry->src_offset, __entry->target_fid, __entry->target_offset, __entry->len, __entry->rc)
)
#define DEFINE_SMB3_COPY_RANGE_ERR_EVENT(name) \
DEFINE_EVENT(smb3_copy_range_err_class, smb3_##name, \
TP_PROTO(unsigned int xid, \
__u64 src_fid, \
__u64 target_fid, \
__u32 tid, \
__u64 sesid, \
__u64 src_offset, \
__u64 target_offset, \
__u32 len, \
int rc), \
TP_ARGS(xid, src_fid, target_fid, tid, sesid, src_offset, target_offset, len, rc))
DEFINE_SMB3_COPY_RANGE_ERR_EVENT(clone_err);
/* TODO: Add SMB3_COPY_RANGE_ERR_EVENT(copychunk_err) */
DECLARE_EVENT_CLASS(smb3_copy_range_done_class,
TP_PROTO(unsigned int xid,
__u64 src_fid,
__u64 target_fid,
__u32 tid,
__u64 sesid,
__u64 src_offset,
__u64 target_offset,
__u32 len),
TP_ARGS(xid, src_fid, target_fid, tid, sesid, src_offset, target_offset, len),
TP_STRUCT__entry(
__field(unsigned int, xid)
__field(__u64, src_fid)
__field(__u64, target_fid)
__field(__u32, tid)
__field(__u64, sesid)
__field(__u64, src_offset)
__field(__u64, target_offset)
__field(__u32, len)
),
TP_fast_assign(
__entry->xid = xid;
__entry->src_fid = src_fid;
__entry->target_fid = target_fid;
__entry->tid = tid;
__entry->sesid = sesid;
__entry->src_offset = src_offset;
__entry->target_offset = target_offset;
__entry->len = len;
),
TP_printk("\txid=%u sid=0x%llx tid=0x%x source fid=0x%llx source offset=0x%llx target fid=0x%llx target offset=0x%llx len=0x%x",
__entry->xid, __entry->sesid, __entry->tid, __entry->src_fid,
__entry->src_offset, __entry->target_fid, __entry->target_offset, __entry->len)
)
#define DEFINE_SMB3_COPY_RANGE_DONE_EVENT(name) \
DEFINE_EVENT(smb3_copy_range_done_class, smb3_##name, \
TP_PROTO(unsigned int xid, \
__u64 src_fid, \
__u64 target_fid, \
__u32 tid, \
__u64 sesid, \
__u64 src_offset, \
__u64 target_offset, \
__u32 len), \
TP_ARGS(xid, src_fid, target_fid, tid, sesid, src_offset, target_offset, len))
DEFINE_SMB3_COPY_RANGE_DONE_EVENT(copychunk_enter);
DEFINE_SMB3_COPY_RANGE_DONE_EVENT(clone_enter);
DEFINE_SMB3_COPY_RANGE_DONE_EVENT(copychunk_done);
DEFINE_SMB3_COPY_RANGE_DONE_EVENT(clone_done);
/* For logging successful read or write */
DECLARE_EVENT_CLASS(smb3_rw_done_class,
@@ -1171,6 +1281,46 @@ DEFINE_EVENT(smb3_connect_err_class, smb3_##name, \
DEFINE_SMB3_CONNECT_ERR_EVENT(connect_err);
DECLARE_EVENT_CLASS(smb3_sess_setup_err_class,
TP_PROTO(char *hostname, char *username, __u64 conn_id,
const struct __kernel_sockaddr_storage *dst_addr, int rc),
TP_ARGS(hostname, username, conn_id, dst_addr, rc),
TP_STRUCT__entry(
__string(hostname, hostname)
__string(username, username)
__field(__u64, conn_id)
__array(__u8, dst_addr, sizeof(struct sockaddr_storage))
__field(int, rc)
),
TP_fast_assign(
struct sockaddr_storage *pss = NULL;
__entry->conn_id = conn_id;
__entry->rc = rc;
pss = (struct sockaddr_storage *)__entry->dst_addr;
*pss = *dst_addr;
__assign_str(hostname);
__assign_str(username);
),
TP_printk("rc=%d user=%s conn_id=0x%llx server=%s addr=%pISpsfc",
__entry->rc,
__get_str(username),
__entry->conn_id,
__get_str(hostname),
__entry->dst_addr)
)
#define DEFINE_SMB3_SES_SETUP_ERR_EVENT(name) \
DEFINE_EVENT(smb3_sess_setup_err_class, smb3_##name, \
TP_PROTO(char *hostname, \
char *username, \
__u64 conn_id, \
const struct __kernel_sockaddr_storage *addr, \
int rc), \
TP_ARGS(hostname, username, conn_id, addr, rc))
DEFINE_SMB3_SES_SETUP_ERR_EVENT(key_expired);
DECLARE_EVENT_CLASS(smb3_reconnect_class,
TP_PROTO(__u64 currmid,
__u64 conn_id,
+11
@@ -736,6 +736,17 @@ struct super_block *sget_fc(struct fs_context *fc,
struct user_namespace *user_ns = fc->global ? &init_user_ns : fc->user_ns;
int err;
/*
* Never allow s_user_ns != &init_user_ns when FS_USERNS_MOUNT is
* not set, as the filesystem is likely unprepared to handle it.
* This can happen when fsconfig() is called from init_user_ns with
* an fs_fd opened in another user namespace.
*/
if (user_ns != &init_user_ns && !(fc->fs_type->fs_flags & FS_USERNS_MOUNT)) {
errorfc(fc, "VFS: Mounting from non-initial user namespace is not allowed");
return ERR_PTR(-EPERM);
}
retry:
spin_lock(&sb_lock);
if (test) {
+2
@@ -82,6 +82,7 @@ struct ubifs_compressor *ubifs_compressors[UBIFS_COMPR_TYPES_CNT];
/**
* ubifs_compress - compress data.
* @c: UBIFS file-system description object
* @in_buf: data to compress
* @in_len: length of the data to compress
* @out_buf: output buffer where compressed data should be stored
@@ -140,6 +141,7 @@ no_compr:
/**
* ubifs_decompress - decompress data.
* @c: UBIFS file-system description object
* @in_buf: data to decompress
* @in_len: length of the data to decompress
* @out_buf: output buffer where decompressed data should
+2 -2
@@ -2827,9 +2827,9 @@ void dbg_debugfs_init_fs(struct ubifs_info *c)
const char *fname;
struct ubifs_debug_info *d = c->dbg;
n = snprintf(d->dfs_dir_name, UBIFS_DFS_DIR_LEN + 1, UBIFS_DFS_DIR_NAME,
n = snprintf(d->dfs_dir_name, UBIFS_DFS_DIR_LEN, UBIFS_DFS_DIR_NAME,
c->vi.ubi_num, c->vi.vol_id);
if (n > UBIFS_DFS_DIR_LEN) {
if (n >= UBIFS_DFS_DIR_LEN) {
/* The array size is too small */
return;
}
+4 -3
@@ -19,10 +19,11 @@ typedef int (*dbg_znode_callback)(struct ubifs_info *c,
/*
* The UBIFS debugfs directory name pattern and maximum name length (3 for "ubi"
* + 1 for "_" and plus 2x2 for 2 UBI numbers and 1 for the trailing zero byte.
* + 1 for "_" and 2 for UBI device numbers and 3 for volume number and 1 for
* the trailing zero byte.
*/
#define UBIFS_DFS_DIR_NAME "ubi%d_%d"
#define UBIFS_DFS_DIR_LEN (3 + 1 + 2*2 + 1)
#define UBIFS_DFS_DIR_LEN (3 + 1 + 2 + 3 + 1)
/**
* ubifs_debug_info - per-FS debugging information.
@@ -103,7 +104,7 @@ struct ubifs_debug_info {
unsigned int chk_fs:1;
unsigned int tst_rcvry:1;
char dfs_dir_name[UBIFS_DFS_DIR_LEN + 1];
char dfs_dir_name[UBIFS_DFS_DIR_LEN];
struct dentry *dfs_dir;
struct dentry *dfs_dump_lprops;
struct dentry *dfs_dump_budg;
+53 -38
@@ -71,8 +71,13 @@ static int inherit_flags(const struct inode *dir, umode_t mode)
* @is_xattr: whether the inode is xattr inode
*
* This function finds an unused inode number, allocates new inode and
* initializes it. Returns new inode in case of success and an error code in
* case of failure.
* initializes it. Non-xattr new inode may be written with xattrs(selinux/
* encryption) before writing dentry, which could cause inconsistent problem
* when powercut happens between two operations. To deal with it, non-xattr
* new inode is initialized with zero-nlink and added into orphan list, caller
* should make sure that inode is relinked later, and make sure that orphan
* removing and journal writing into an committing atomic operation. Returns
* new inode in case of success and an error code in case of failure.
*/
struct inode *ubifs_new_inode(struct ubifs_info *c, struct inode *dir,
umode_t mode, bool is_xattr)
@@ -163,9 +168,25 @@ struct inode *ubifs_new_inode(struct ubifs_info *c, struct inode *dir,
ui->creat_sqnum = ++c->max_sqnum;
spin_unlock(&c->cnt_lock);
if (!is_xattr) {
set_nlink(inode, 0);
err = ubifs_add_orphan(c, inode->i_ino);
if (err) {
ubifs_err(c, "ubifs_add_orphan failed: %i", err);
goto out_iput;
}
down_read(&c->commit_sem);
ui->del_cmtno = c->cmt_no;
up_read(&c->commit_sem);
}
if (encrypted) {
err = fscrypt_set_context(inode, NULL);
if (err) {
if (!is_xattr) {
set_nlink(inode, 1);
ubifs_delete_orphan(c, inode->i_ino);
}
ubifs_err(c, "fscrypt_set_context failed: %i", err);
goto out_iput;
}
@@ -320,12 +341,13 @@ static int ubifs_create(struct mnt_idmap *idmap, struct inode *dir,
if (err)
goto out_inode;
set_nlink(inode, 1);
mutex_lock(&dir_ui->ui_mutex);
dir->i_size += sz_change;
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0, 1);
if (err)
goto out_cancel;
mutex_unlock(&dir_ui->ui_mutex);
@@ -340,8 +362,8 @@ out_cancel:
dir->i_size -= sz_change;
dir_ui->ui_size = dir->i_size;
mutex_unlock(&dir_ui->ui_mutex);
set_nlink(inode, 0);
out_inode:
make_bad_inode(inode);
iput(inode);
out_fname:
fscrypt_free_filename(&nm);
@@ -386,7 +408,6 @@ static struct inode *create_whiteout(struct inode *dir, struct dentry *dentry)
return inode;
out_inode:
make_bad_inode(inode);
iput(inode);
out_free:
ubifs_err(c, "cannot create whiteout file, error %d", err);
@@ -470,6 +491,7 @@ static int ubifs_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
if (err)
goto out_inode;
set_nlink(inode, 1);
mutex_lock(&ui->ui_mutex);
insert_inode_hash(inode);
d_tmpfile(file, inode);
@@ -479,7 +501,7 @@ static int ubifs_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
mutex_unlock(&ui->ui_mutex);
lock_2_inodes(dir, inode);
err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0, 1);
if (err)
goto out_cancel;
unlock_2_inodes(dir, inode);
@@ -492,7 +514,6 @@ static int ubifs_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
out_cancel:
unlock_2_inodes(dir, inode);
out_inode:
make_bad_inode(inode);
if (!instantiated)
iput(inode);
out_budg:
@@ -760,10 +781,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir,
lock_2_inodes(dir, inode);
/* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */
if (inode->i_nlink == 0)
ubifs_delete_orphan(c, inode->i_ino);
inc_nlink(inode);
ihold(inode);
inode_set_ctime_current(inode);
@@ -771,7 +788,7 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir,
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0, inode->i_nlink == 1);
if (err)
goto out_cancel;
unlock_2_inodes(dir, inode);
@@ -785,8 +802,6 @@ out_cancel:
dir->i_size -= sz_change;
dir_ui->ui_size = dir->i_size;
drop_nlink(inode);
if (inode->i_nlink == 0)
ubifs_add_orphan(c, inode->i_ino);
unlock_2_inodes(dir, inode);
ubifs_release_budget(c, &req);
iput(inode);
@@ -846,7 +861,7 @@ static int ubifs_unlink(struct inode *dir, struct dentry *dentry)
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0, 0);
if (err)
goto out_cancel;
unlock_2_inodes(dir, inode);
@@ -950,7 +965,7 @@ static int ubifs_rmdir(struct inode *dir, struct dentry *dentry)
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0, 0);
if (err)
goto out_cancel;
unlock_2_inodes(dir, inode);
@@ -1017,6 +1032,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
if (err)
goto out_inode;
set_nlink(inode, 1);
mutex_lock(&dir_ui->ui_mutex);
insert_inode_hash(inode);
inc_nlink(inode);
@@ -1025,7 +1041,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0, 1);
if (err) {
ubifs_err(c, "cannot create directory, error %d", err);
goto out_cancel;
@@ -1042,8 +1058,8 @@ out_cancel:
dir_ui->ui_size = dir->i_size;
drop_nlink(dir);
mutex_unlock(&dir_ui->ui_mutex);
set_nlink(inode, 0);
out_inode:
make_bad_inode(inode);
iput(inode);
out_fname:
fscrypt_free_filename(&nm);
@@ -1102,22 +1118,25 @@ static int ubifs_mknod(struct mnt_idmap *idmap, struct inode *dir,
goto out_fname;
}
err = ubifs_init_security(dir, inode, &dentry->d_name);
if (err) {
kfree(dev);
goto out_inode;
}
init_special_inode(inode, inode->i_mode, rdev);
inode->i_size = ubifs_inode(inode)->ui_size = devlen;
ui = ubifs_inode(inode);
ui->data = dev;
ui->data_len = devlen;
err = ubifs_init_security(dir, inode, &dentry->d_name);
if (err)
goto out_inode;
set_nlink(inode, 1);
mutex_lock(&dir_ui->ui_mutex);
dir->i_size += sz_change;
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0, 1);
if (err)
goto out_cancel;
mutex_unlock(&dir_ui->ui_mutex);
@@ -1132,10 +1151,8 @@ out_cancel:
dir->i_size -= sz_change;
dir_ui->ui_size = dir->i_size;
mutex_unlock(&dir_ui->ui_mutex);
set_nlink(inode, 0);
out_inode:
/* Free inode->i_link before inode is marked as bad. */
fscrypt_free_inode(inode);
make_bad_inode(inode);
iput(inode);
out_fname:
fscrypt_free_filename(&nm);
@@ -1186,6 +1203,10 @@ static int ubifs_symlink(struct mnt_idmap *idmap, struct inode *dir,
goto out_fname;
}
err = ubifs_init_security(dir, inode, &dentry->d_name);
if (err)
goto out_inode;
ui = ubifs_inode(inode);
ui->data = kmalloc(disk_link.len, GFP_NOFS);
if (!ui->data) {
@@ -1210,17 +1231,14 @@ static int ubifs_symlink(struct mnt_idmap *idmap, struct inode *dir,
*/
ui->data_len = disk_link.len - 1;
inode->i_size = ubifs_inode(inode)->ui_size = disk_link.len - 1;
err = ubifs_init_security(dir, inode, &dentry->d_name);
if (err)
goto out_inode;
set_nlink(inode, 1);
mutex_lock(&dir_ui->ui_mutex);
dir->i_size += sz_change;
dir_ui->ui_size = dir->i_size;
inode_set_mtime_to_ts(dir,
inode_set_ctime_to_ts(dir, inode_get_ctime(inode)));
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0, 1);
if (err)
goto out_cancel;
mutex_unlock(&dir_ui->ui_mutex);
@@ -1234,10 +1252,10 @@ out_cancel:
dir->i_size -= sz_change;
dir_ui->ui_size = dir->i_size;
mutex_unlock(&dir_ui->ui_mutex);
set_nlink(inode, 0);
out_inode:
/* Free inode->i_link before inode is marked as bad. */
fscrypt_free_inode(inode);
make_bad_inode(inode);
iput(inode);
out_fname:
fscrypt_free_filename(&nm);
@@ -1405,14 +1423,10 @@ static int do_rename(struct inode *old_dir, struct dentry *old_dentry,
*/
err = ubifs_budget_space(c, &wht_req);
if (err) {
/*
* Whiteout inode can not be written on flash by
* ubifs_jnl_write_inode(), because it's neither
* dirty nor zero-nlink.
*/
iput(whiteout);
goto out_release;
}
set_nlink(whiteout, 1);
/* Add the old_dentry size to the old_dir size. */
old_sz -= CALC_DENT_SIZE(fname_len(&old_nm));
@@ -1491,7 +1505,7 @@ static int do_rename(struct inode *old_dir, struct dentry *old_dentry,
}
err = ubifs_jnl_rename(c, old_dir, old_inode, &old_nm, new_dir,
new_inode, &new_nm, whiteout, sync);
new_inode, &new_nm, whiteout, sync, !!whiteout);
if (err)
goto out_cancel;
@@ -1544,6 +1558,7 @@ out_cancel:
unlock_4_inodes(old_dir, new_dir, new_inode, whiteout);
if (whiteout) {
ubifs_release_budget(c, &wht_req);
set_nlink(whiteout, 0);
iput(whiteout);
}
out_release:
+1 -1
@@ -1027,7 +1027,7 @@ static int ubifs_writepage(struct folio *folio, struct writeback_control *wbc,
/* Is the folio fully inside i_size? */
if (folio_pos(folio) + len <= i_size) {
if (folio_pos(folio) >= synced_i_size) {
if (folio_pos(folio) + len > synced_i_size) {
err = inode->i_sb->s_op->write_inode(inode, NULL);
if (err)
goto out_redirty;
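Read side by side, the two conditions in this hunk select different folios: `folio_pos(folio) >= synced_i_size` is true only when the folio starts at or beyond the synced size, whereas `folio_pos(folio) + len > synced_i_size` is also true for a folio that starts below the synced size and extends past it. The sketch below restates both predicates over plain integers. The function names are hypothetical, and this is an illustration of the arithmetic only, not a statement of the patch author's rationale.

```c
#include <stdbool.h>

/* Old predicate: true only if the folio starts at or past the synced size. */
static bool starts_beyond_synced(long long pos, long long len, long long synced)
{
	(void)len; /* the old check ignored the folio length */
	return pos >= synced;
}

/* New predicate: true if any byte of the folio lies past the synced size. */
static bool extends_beyond_synced(long long pos, long long len, long long synced)
{
	return pos + len > synced;
}
```

For a folio at offset 0 with length 4096 and a synced size of 100, the old predicate is false and the new one is true, which is the straddling case the two checks disagree on.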
+4 -4
@@ -73,7 +73,7 @@ static int valuable(struct ubifs_info *c, const struct ubifs_lprops *lprops)
* @c: the UBIFS file-system description object
* @lprops: LEB properties to scan
* @in_tree: whether the LEB properties are in main memory
* @data: information passed to and from the caller of the scan
* @arg: information passed to and from the caller of the scan
*
* This function returns a code that indicates whether the scan should continue
* (%LPT_SCAN_CONTINUE), whether the LEB properties should be added to the tree
@@ -340,7 +340,7 @@ out:
* @c: the UBIFS file-system description object
* @lprops: LEB properties to scan
* @in_tree: whether the LEB properties are in main memory
* @data: information passed to and from the caller of the scan
* @arg: information passed to and from the caller of the scan
*
* This function returns a code that indicates whether the scan should continue
* (%LPT_SCAN_CONTINUE), whether the LEB properties should be added to the tree
@@ -581,7 +581,7 @@ out:
* @c: the UBIFS file-system description object
* @lprops: LEB properties to scan
* @in_tree: whether the LEB properties are in main memory
* @data: information passed to and from the caller of the scan
* @arg: information passed to and from the caller of the scan
*
* This function returns a code that indicates whether the scan should continue
* (%LPT_SCAN_CONTINUE), whether the LEB properties should be added to the tree
@@ -773,7 +773,7 @@ int ubifs_save_dirty_idx_lnums(struct ubifs_info *c)
* @c: the UBIFS file-system description object
* @lprops: LEB properties to scan
* @in_tree: whether the LEB properties are in main memory
* @data: information passed to and from the caller of the scan
* @arg: information passed to and from the caller of the scan
*
* This function returns a code that indicates whether the scan should continue
* (%LPT_SCAN_CONTINUE), whether the LEB properties should be added to the tree
+12 -4
@@ -359,7 +359,7 @@ static void wake_up_reservation(struct ubifs_info *c)
}
/**
* wake_up_reservation - add current task in queue or start queuing.
* add_or_start_queue - add current task in queue or start queuing.
* @c: UBIFS file-system description object
*
* This function starts queuing if queuing is not started, otherwise adds
@@ -643,6 +643,7 @@ static void set_dent_cookie(struct ubifs_info *c, struct ubifs_dent_node *dent)
* @inode: inode to update
* @deletion: indicates a directory entry deletion i.e unlink or rmdir
* @xent: non-zero if the directory entry is an extended attribute entry
* @in_orphan: indicates whether the @inode is in orphan list
*
* This function updates an inode by writing a directory entry (or extended
* attribute entry), the inode itself, and the parent directory inode (or the
@@ -664,7 +665,7 @@ static void set_dent_cookie(struct ubifs_info *c, struct ubifs_dent_node *dent)
*/
int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
const struct fscrypt_name *nm, const struct inode *inode,
int deletion, int xent)
int deletion, int xent, int in_orphan)
{
int err, dlen, ilen, len, lnum, ino_offs, dent_offs, orphan_added = 0;
int aligned_dlen, aligned_ilen, sync = IS_DIRSYNC(dir);
@@ -750,7 +751,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
if (err)
goto out_release;
if (last_reference) {
if (last_reference && !in_orphan) {
err = ubifs_add_orphan(c, inode->i_ino);
if (err) {
release_head(c, BASEHD);
@@ -806,6 +807,9 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
if (err)
goto out_ro;
if (in_orphan && inode->i_nlink)
ubifs_delete_orphan(c, inode->i_ino);
finish_reservation(c);
spin_lock(&ui->ui_lock);
ui->synced_i_size = ui->ui_size;
@@ -1336,6 +1340,7 @@ out_free:
* @new_nm: new name of the new directory entry
* @whiteout: whiteout inode
* @sync: non-zero if the write-buffer has to be synchronized
* @delete_orphan: indicates an orphan entry deletion for @whiteout
*
* This function implements the re-name operation which may involve writing up
* to 4 inodes(new inode, whiteout inode, old and new parent directory inodes)
@@ -1348,7 +1353,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
const struct inode *new_dir,
const struct inode *new_inode,
const struct fscrypt_name *new_nm,
const struct inode *whiteout, int sync)
const struct inode *whiteout, int sync, int delete_orphan)
{
void *p;
union ubifs_key key;
@@ -1565,6 +1570,9 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
goto out_ro;
}
if (delete_orphan)
ubifs_delete_orphan(c, whiteout->i_ino);
finish_reservation(c);
if (new_inode) {
mark_inode_clean(c, new_ui);
+1 -1
@@ -1005,7 +1005,7 @@ out:
* @c: the UBIFS file-system description object
* @lp: LEB properties to scan
* @in_tree: whether the LEB properties are in main memory
* @lst: lprops statistics to update
* @arg: lprops statistics to update
*
* This function returns a code that indicates whether the scan should continue
* (%LPT_SCAN_CONTINUE), whether the LEB properties should be added to the tree
+1
@@ -1918,6 +1918,7 @@ out_err:
* @pnode: where to keep a pnode
* @cnode: where to keep a cnode
* @in_tree: is the node in the tree in memory
* @ptr: union of node pointers
* @ptr.nnode: pointer to the nnode (if it is an nnode) which may be here or in
* the tree
* @ptr.pnode: ditto for pnode
+4 -1
@@ -67,10 +67,13 @@ static int mst_node_check_hash(const struct ubifs_info *c,
{
u8 calc[UBIFS_MAX_HASH_LEN];
const void *node = mst;
int ret;
crypto_shash_tfm_digest(c->hash_tfm, node + sizeof(struct ubifs_ch),
ret = crypto_shash_tfm_digest(c->hash_tfm, node + sizeof(struct ubifs_ch),
UBIFS_MST_NODE_SZ - sizeof(struct ubifs_ch),
calc);
if (ret)
return ret;
if (ubifs_check_hash(c, expected, calc))
return -EPERM;
+25 -130
@@ -42,24 +42,30 @@
static int dbg_check_orphans(struct ubifs_info *c);
static struct ubifs_orphan *orphan_add(struct ubifs_info *c, ino_t inum,
struct ubifs_orphan *parent_orphan)
/**
* ubifs_add_orphan - add an orphan.
* @c: UBIFS file-system description object
* @inum: orphan inode number
*
* Add an orphan. This function is called when an inodes link count drops to
* zero.
*/
int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
{
struct ubifs_orphan *orphan, *o;
struct rb_node **p, *parent = NULL;
orphan = kzalloc(sizeof(struct ubifs_orphan), GFP_NOFS);
if (!orphan)
return ERR_PTR(-ENOMEM);
return -ENOMEM;
orphan->inum = inum;
orphan->new = 1;
INIT_LIST_HEAD(&orphan->child_list);
spin_lock(&c->orphan_lock);
if (c->tot_orphans >= c->max_orphans) {
spin_unlock(&c->orphan_lock);
kfree(orphan);
return ERR_PTR(-ENFILE);
return -ENFILE;
}
p = &c->orph_tree.rb_node;
while (*p) {
@@ -73,7 +79,7 @@ static struct ubifs_orphan *orphan_add(struct ubifs_info *c, ino_t inum,
ubifs_err(c, "orphaned twice");
spin_unlock(&c->orphan_lock);
kfree(orphan);
return ERR_PTR(-EINVAL);
return -EINVAL;
}
}
c->tot_orphans += 1;
@@ -83,14 +89,9 @@ static struct ubifs_orphan *orphan_add(struct ubifs_info *c, ino_t inum,
list_add_tail(&orphan->list, &c->orph_list);
list_add_tail(&orphan->new_list, &c->orph_new);
if (parent_orphan) {
list_add_tail(&orphan->child_list,
&parent_orphan->child_list);
}
spin_unlock(&c->orphan_lock);
dbg_gen("ino %lu", (unsigned long)inum);
return orphan;
return 0;
}
static struct ubifs_orphan *lookup_orphan(struct ubifs_info *c, ino_t inum)
@@ -135,6 +136,7 @@ static void orphan_delete(struct ubifs_info *c, struct ubifs_orphan *orph)
if (orph->cmt) {
orph->del = 1;
rb_erase(&orph->rb, &c->orph_tree);
orph->dnext = c->orph_dnext;
c->orph_dnext = orph;
dbg_gen("delete later ino %lu", (unsigned long)orph->inum);
@@ -144,59 +146,6 @@ static void orphan_delete(struct ubifs_info *c, struct ubifs_orphan *orph)
__orphan_drop(c, orph);
}
/**
* ubifs_add_orphan - add an orphan.
* @c: UBIFS file-system description object
* @inum: orphan inode number
*
* Add an orphan. This function is called when an inodes link count drops to
* zero.
*/
int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
{
int err = 0;
ino_t xattr_inum;
union ubifs_key key;
struct ubifs_dent_node *xent, *pxent = NULL;
struct fscrypt_name nm = {0};
struct ubifs_orphan *xattr_orphan;
struct ubifs_orphan *orphan;
orphan = orphan_add(c, inum, NULL);
if (IS_ERR(orphan))
return PTR_ERR(orphan);
lowest_xent_key(c, &key, inum);
while (1) {
xent = ubifs_tnc_next_ent(c, &key, &nm);
if (IS_ERR(xent)) {
err = PTR_ERR(xent);
if (err == -ENOENT)
break;
kfree(pxent);
return err;
}
fname_name(&nm) = xent->name;
fname_len(&nm) = le16_to_cpu(xent->nlen);
xattr_inum = le64_to_cpu(xent->inum);
xattr_orphan = orphan_add(c, xattr_inum, orphan);
if (IS_ERR(xattr_orphan)) {
kfree(pxent);
kfree(xent);
return PTR_ERR(xattr_orphan);
}
kfree(pxent);
pxent = xent;
key_read(c, &xent->key, &key);
}
kfree(pxent);
return 0;
}
/**
* ubifs_delete_orphan - delete an orphan.
* @c: UBIFS file-system description object
@@ -206,7 +155,7 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
*/
void ubifs_delete_orphan(struct ubifs_info *c, ino_t inum)
{
struct ubifs_orphan *orph, *child_orph, *tmp_o;
struct ubifs_orphan *orph;
spin_lock(&c->orphan_lock);
@@ -219,11 +168,6 @@ void ubifs_delete_orphan(struct ubifs_info *c, ino_t inum)
return;
}
list_for_each_entry_safe(child_orph, tmp_o, &orph->child_list, child_list) {
list_del(&child_orph->child_list);
orphan_delete(c, child_orph);
}
orphan_delete(c, orph);
spin_unlock(&c->orphan_lock);
@@ -518,7 +462,6 @@ static void erase_deleted(struct ubifs_info *c)
dnext = orphan->dnext;
ubifs_assert(c, !orphan->new);
ubifs_assert(c, orphan->del);
rb_erase(&orphan->rb, &c->orph_tree);
list_del(&orphan->list);
c->tot_orphans -= 1;
dbg_gen("deleting orphan ino %lu", (unsigned long)orphan->inum);
@@ -570,51 +513,6 @@ int ubifs_clear_orphans(struct ubifs_info *c)
return 0;
}
/**
* insert_dead_orphan - insert an orphan.
* @c: UBIFS file-system description object
* @inum: orphan inode number
*
* This function is a helper to the 'do_kill_orphans()' function. The orphan
* must be kept until the next commit, so it is added to the rb-tree and the
* deletion list.
*/
static int insert_dead_orphan(struct ubifs_info *c, ino_t inum)
{
struct ubifs_orphan *orphan, *o;
struct rb_node **p, *parent = NULL;
orphan = kzalloc(sizeof(struct ubifs_orphan), GFP_KERNEL);
if (!orphan)
return -ENOMEM;
orphan->inum = inum;
p = &c->orph_tree.rb_node;
while (*p) {
parent = *p;
o = rb_entry(parent, struct ubifs_orphan, rb);
if (inum < o->inum)
p = &(*p)->rb_left;
else if (inum > o->inum)
p = &(*p)->rb_right;
else {
/* Already added - no problem */
kfree(orphan);
return 0;
}
}
c->tot_orphans += 1;
rb_link_node(&orphan->rb, parent, p);
rb_insert_color(&orphan->rb, &c->orph_tree);
list_add_tail(&orphan->list, &c->orph_list);
orphan->del = 1;
orphan->dnext = c->orph_dnext;
c->orph_dnext = orphan;
dbg_mnt("ino %lu, new %d, tot %d", (unsigned long)inum,
c->new_orphans, c->tot_orphans);
return 0;
}
/**
* do_kill_orphans - remove orphan inodes from the index.
* @c: UBIFS file-system description object
@@ -691,12 +589,12 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
for (i = 0; i < n; i++) {
union ubifs_key key1, key2;
union ubifs_key key;
inum = le64_to_cpu(orph->inos[i]);
ino_key_init(c, &key1, inum);
err = ubifs_tnc_lookup(c, &key1, ino);
ino_key_init(c, &key, inum);
err = ubifs_tnc_lookup(c, &key, ino);
if (err && err != -ENOENT)
goto out_free;
@@ -708,17 +606,10 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
dbg_rcvry("deleting orphaned inode %lu",
(unsigned long)inum);
lowest_ino_key(c, &key1, inum);
highest_ino_key(c, &key2, inum);
err = ubifs_tnc_remove_range(c, &key1, &key2);
err = ubifs_tnc_remove_ino(c, inum);
if (err)
goto out_ro;
}
err = insert_dead_orphan(c, inum);
if (err)
goto out_free;
}
*last_cmt_no = cmt_no;
@@ -925,8 +816,12 @@ static int dbg_orphan_check(struct ubifs_info *c, struct ubifs_zbranch *zbr,
inum = key_inum(c, &zbr->key);
if (inum != ci->last_ino) {
/* Lowest node type is the inode node, so it comes first */
if (key_type(c, &zbr->key) != UBIFS_INO_KEY)
/*
* Lowest node type is the inode node or xattr entry(when
* selinux/encryption is enabled), so it comes first
*/
if (key_type(c, &zbr->key) != UBIFS_INO_KEY &&
key_type(c, &zbr->key) != UBIFS_XENT_KEY)
ubifs_err(c, "found orphan node ino %lu, type %d",
(unsigned long)inum, key_type(c, &zbr->key));
ci->last_ino = inum;
+1
@@ -29,6 +29,7 @@
* @lnum: logical eraseblock number of the node
* @offs: node offset
* @len: node length
* @hash: node hash
* @deletion: non-zero if this entry corresponds to a node deletion
* @sqnum: node sequence number
* @list: links the replay list
+3 -3
@@ -91,17 +91,17 @@ static struct kset ubifs_kset = {
int ubifs_sysfs_register(struct ubifs_info *c)
{
int ret, n;
char dfs_dir_name[UBIFS_DFS_DIR_LEN+1];
char dfs_dir_name[UBIFS_DFS_DIR_LEN];
c->stats = kzalloc(sizeof(struct ubifs_stats_info), GFP_KERNEL);
if (!c->stats) {
ret = -ENOMEM;
goto out_last;
}
n = snprintf(dfs_dir_name, UBIFS_DFS_DIR_LEN + 1, UBIFS_DFS_DIR_NAME,
n = snprintf(dfs_dir_name, UBIFS_DFS_DIR_LEN, UBIFS_DFS_DIR_NAME,
c->vi.ubi_num, c->vi.vol_id);
if (n > UBIFS_DFS_DIR_LEN) {
if (n >= UBIFS_DFS_DIR_LEN) {
/* The array size is too small */
ret = -EINVAL;
goto out_free;
+2 -12
@@ -157,13 +157,6 @@
#define UBIFS_HMAC_ARR_SZ 0
#endif
/*
* The UBIFS sysfs directory name pattern and maximum name length (3 for "ubi"
* + 1 for "_" and plus 2x2 for 2 UBI numbers and 1 for the trailing zero byte.
*/
#define UBIFS_DFS_DIR_NAME "ubi%d_%d"
#define UBIFS_DFS_DIR_LEN (3 + 1 + 2*2 + 1)
/*
* Lockdep classes for UBIFS inode @ui_mutex.
*/
@@ -923,8 +916,6 @@ struct ubifs_budget_req {
* @rb: rb-tree node of rb-tree of orphans sorted by inode number
* @list: list head of list of orphans in order added
* @new_list: list head of list of orphans added since the last commit
* @child_list: list of xattr children if this orphan hosts xattrs, list head
* if this orphan is a xattr, not used otherwise.
* @cnext: next orphan to commit
* @dnext: next orphan to delete
* @inum: inode number
@@ -936,7 +927,6 @@ struct ubifs_orphan {
struct rb_node rb;
struct list_head list;
struct list_head new_list;
struct list_head child_list;
struct ubifs_orphan *cnext;
struct ubifs_orphan *dnext;
ino_t inum;
@@ -1803,7 +1793,7 @@ int ubifs_consolidate_log(struct ubifs_info *c);
/* journal.c */
int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
const struct fscrypt_name *nm, const struct inode *inode,
int deletion, int xent);
int deletion, int xent, int in_orphan);
int ubifs_jnl_write_data(struct ubifs_info *c, const struct inode *inode,
const union ubifs_key *key, const void *buf, int len);
int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode);
@@ -1820,7 +1810,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
const struct inode *new_dir,
const struct inode *new_inode,
const struct fscrypt_name *new_nm,
const struct inode *whiteout, int sync);
const struct inode *whiteout, int sync, int delete_orphan);
int ubifs_jnl_truncate(struct ubifs_info *c, const struct inode *inode,
loff_t old_size, loff_t new_size);
int ubifs_jnl_delete_xattr(struct ubifs_info *c, const struct inode *host,
+1 -1
@@ -149,7 +149,7 @@ static int create_xattr(struct ubifs_info *c, struct inode *host,
if (strcmp(fname_name(nm), UBIFS_XATTR_NAME_ENCRYPTION_CONTEXT) == 0)
host_ui->flags |= UBIFS_CRYPT_FL;
err = ubifs_jnl_update(c, host, nm, inode, 0, 1);
err = ubifs_jnl_update(c, host, nm, inode, 0, 1, 0);
if (err)
goto out_cancel;
ubifs_set_inode_flags(host);
+1
@@ -3352,6 +3352,7 @@ static void write_file(void)
fprintf(file, "};\n");
fprintf(file, "EXPORT_SYMBOL_GPL(utf8_data_table);");
fprintf(file, "\n");
fprintf(file, "MODULE_DESCRIPTION(\"UTF8 data table\");\n");
fprintf(file, "MODULE_LICENSE(\"GPL v2\");\n");
fclose(file);
}
+3 -2
@@ -14,8 +14,8 @@
#include "utf8n.h"
unsigned int failed_tests;
unsigned int total_tests;
static unsigned int failed_tests;
static unsigned int total_tests;
/* Tests will be based on this version. */
#define UTF8_LATEST UNICODE_AGE(12, 1, 0)
@@ -307,4 +307,5 @@ module_init(init_test_ucd);
module_exit(exit_test_ucd);
MODULE_AUTHOR("Gabriel Krisman Bertazi <krisman@collabora.co.uk>");
MODULE_DESCRIPTION("Kernel module for testing utf-8 support");
MODULE_LICENSE("GPL");
+1
@@ -4120,4 +4120,5 @@ struct utf8data_table utf8_data_table = {
.utf8data = utf8data,
};
EXPORT_SYMBOL_GPL(utf8_data_table);
MODULE_DESCRIPTION("UTF8 data table");
MODULE_LICENSE("GPL v2");
+19 -26
@@ -21,6 +21,21 @@ struct cxl_event_record_hdr {
u8 reserved[15];
} __packed;
struct cxl_event_media_hdr {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
u8 descriptor;
u8 type;
u8 transaction_type;
/*
* The meaning of Validity Flags from bit 2 is
* different across DRAM and General Media records
*/
u8 validity_flags[2];
u8 channel;
u8 rank;
} __packed;
#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
struct cxl_event_generic {
struct cxl_event_record_hdr hdr;
@@ -33,14 +48,7 @@ struct cxl_event_generic {
*/
#define CXL_EVENT_GEN_MED_COMP_ID_SIZE 0x10
struct cxl_event_gen_media {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
u8 descriptor;
u8 type;
u8 transaction_type;
u8 validity_flags[2];
u8 channel;
u8 rank;
struct cxl_event_media_hdr media_hdr;
u8 device[3];
u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
u8 reserved[46];
@@ -52,14 +60,7 @@ struct cxl_event_gen_media {
*/
#define CXL_EVENT_DER_CORRECTION_MASK_SIZE 0x20
struct cxl_event_dram {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
u8 descriptor;
u8 type;
u8 transaction_type;
u8 validity_flags[2];
u8 channel;
u8 rank;
struct cxl_event_media_hdr media_hdr;
u8 nibble_mask[3];
u8 bank_group;
u8 bank;
@@ -95,21 +96,13 @@ struct cxl_event_mem_module {
u8 reserved[0x3d];
} __packed;
/*
* General Media or DRAM Event Common Fields
* - provides common access to phys_addr
*/
struct cxl_event_common {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
} __packed;
union cxl_event {
struct cxl_event_generic generic;
struct cxl_event_gen_media gen_media;
struct cxl_event_dram dram;
struct cxl_event_mem_module mem_module;
struct cxl_event_common common;
/* dram & gen_media event header */
struct cxl_event_media_hdr media_hdr;
} __packed;
/*
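The refactoring above replaces the fields duplicated in the General Media and DRAM records with one embedded header struct, and exposes that header as a union member so common fields can be read without knowing the record type. The sketch below shows the same layout idea with heavily shortened, hypothetical userspace types (`media_hdr`, `gen_media`, `dram`, `event`); the field lists are placeholders, not the CXL record layout.

```c
#include <stdint.h>

/* Hypothetical shared header, embedded first in both record types. */
struct media_hdr {
	uint64_t phys_addr;
	uint8_t type;
} __attribute__((packed));

struct gen_media {
	struct media_hdr media_hdr;
	uint8_t device[3];
} __attribute__((packed));

struct dram {
	struct media_hdr media_hdr;
	uint8_t bank;
} __attribute__((packed));

union event {
	struct gen_media gen_media;
	struct dram dram;
	/* Common view: valid for both record types above. */
	struct media_hdr media_hdr;
};

/* Read the address through the shared header, whatever the record type. */
static uint64_t event_phys_addr(const union event *evt)
{
	return evt->media_hdr.phys_addr;
}
```

Because the header is the first member of each record, the union's `media_hdr` view overlays the same bytes in both cases, so one accessor serves both record types.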
+1 -4
@@ -404,7 +404,7 @@ struct io_ring_ctx {
spinlock_t napi_lock; /* napi_list lock */
/* napi busy poll default timeout */
unsigned int napi_busy_poll_to;
ktime_t napi_busy_poll_dt;
bool napi_prefer_busy_poll;
bool napi_enabled;
@@ -461,7 +461,6 @@ enum {
REQ_F_SUPPORT_NOWAIT_BIT,
REQ_F_ISREG_BIT,
REQ_F_POLL_NO_LAZY_BIT,
REQ_F_CANCEL_SEQ_BIT,
REQ_F_CAN_POLL_BIT,
REQ_F_BL_EMPTY_BIT,
REQ_F_BL_NO_RECYCLE_BIT,
@@ -536,8 +535,6 @@ enum {
REQ_F_HASH_LOCKED = IO_REQ_FLAG(REQ_F_HASH_LOCKED_BIT),
/* don't use lazy poll wake for this request */
REQ_F_POLL_NO_LAZY = IO_REQ_FLAG(REQ_F_POLL_NO_LAZY_BIT),
/* cancel sequence is set and valid */
REQ_F_CANCEL_SEQ = IO_REQ_FLAG(REQ_F_CANCEL_SEQ_BIT),
/* file is pollable */
REQ_F_CAN_POLL = IO_REQ_FLAG(REQ_F_CAN_POLL_BIT),
/* buffer list was empty after selection of buffer */
+11 -8
@@ -45,17 +45,20 @@
#define __cmp(op, x, y) ((x) __cmp_op_##op (y) ? (x) : (y))
#define __cmp_once(op, x, y, unique_x, unique_y) ({ \
typeof(x) unique_x = (x); \
typeof(y) unique_y = (y); \
#define __cmp_once_unique(op, type, x, y, ux, uy) \
({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
#define __cmp_once(op, type, x, y) \
__cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
#define __careful_cmp_once(op, x, y) ({ \
static_assert(__types_ok(x, y), \
#op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
__cmp(op, unique_x, unique_y); })
__cmp_once(op, __auto_type, x, y); })
#define __careful_cmp(op, x, y) \
__builtin_choose_expr(__is_constexpr((x) - (y)), \
__cmp(op, x, y), \
__cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
__cmp(op, x, y), __careful_cmp_once(op, x, y))
#define __clamp(val, lo, hi) \
((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
@@ -158,7 +161,7 @@
* @x: first value
* @y: second value
*/
#define min_t(type, x, y) __careful_cmp(min, (type)(x), (type)(y))
#define min_t(type, x, y) __cmp_once(min, type, x, y)
/**
* max_t - return maximum of two values, using the specified type
@@ -166,7 +169,7 @@
* @x: first value
* @y: second value
*/
#define max_t(type, x, y) __careful_cmp(max, (type)(x), (type)(y))
#define max_t(type, x, y) __cmp_once(max, type, x, y)
/*
* Do not check the array parameter using __must_be_array().
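The `__cmp_once_unique()` macro in this hunk copies each argument into a local of the requested type before comparing, so each argument expression is evaluated exactly once. The sketch below is a simplified userspace imitation, assuming GCC or Clang statement expressions. `MIN_ONCE`, `next_value()` and `min_with_side_effect()` are hypothetical names, and the fixed local names `ux_` and `uy_` stand in for the kernel's `__UNIQUE_ID()` machinery, so unlike the kernel macro this one should not be nested inside itself.

```c
/* Hypothetical userspace macro, not the kernel's definition. */
#define MIN_ONCE(type, x, y) \
	({ type ux_ = (x); type uy_ = (y); ux_ < uy_ ? ux_ : uy_; })

/* Counts how many times next_value() has been called. */
static int calls;

/* An argument with a side effect: each call returns a larger number. */
static int next_value(void)
{
	return ++calls;
}

/* Each argument expression is evaluated once, then the copies are compared. */
static int min_with_side_effect(int limit)
{
	return MIN_ONCE(int, next_value(), limit);
}
```

A naive `((x) < (y) ? (x) : (y))` expansion would call `next_value()` twice whenever it is the smaller operand; with the local copies the counter advances by one per use.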
+9
@@ -485,6 +485,9 @@ enum {
NVME_ID_NS_NVM_STS_MASK = 0x7f,
NVME_ID_NS_NVM_GUARD_SHIFT = 7,
NVME_ID_NS_NVM_GUARD_MASK = 0x3,
NVME_ID_NS_NVM_QPIF_SHIFT = 9,
NVME_ID_NS_NVM_QPIF_MASK = 0xf,
NVME_ID_NS_NVM_QPIFS = 1 << 3,
};
static inline __u8 nvme_elbaf_sts(__u32 elbaf)
@@ -497,6 +500,11 @@ static inline __u8 nvme_elbaf_guard_type(__u32 elbaf)
return (elbaf >> NVME_ID_NS_NVM_GUARD_SHIFT) & NVME_ID_NS_NVM_GUARD_MASK;
}
static inline __u8 nvme_elbaf_qualified_guard_type(__u32 elbaf)
{
return (elbaf >> NVME_ID_NS_NVM_QPIF_SHIFT) & NVME_ID_NS_NVM_QPIF_MASK;
}
struct nvme_id_ctrl_nvm {
__u8 vsl;
__u8 wzsl;
@@ -576,6 +584,7 @@ enum {
NVME_NVM_NS_16B_GUARD = 0,
NVME_NVM_NS_32B_GUARD = 1,
NVME_NVM_NS_64B_GUARD = 2,
NVME_NVM_NS_QTYPE_GUARD = 3,
};
static inline __u8 nvme_lbaf_index(__u8 flbas)
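The new `nvme_elbaf_qualified_guard_type()` helper uses the same shift-and-mask pattern as the existing accessors: shift the field down to bit 0, then mask off its width. The sketch below reproduces that pattern in userspace with the shift and mask values shown in the hunk (9 and 0xf). The short macro and function names are hypothetical stand-ins for the `NVME_ID_NS_NVM_QPIF_*` constants and the kernel helper.

```c
#include <stdint.h>

/* Values copied from the hunk; the names are shortened for illustration. */
#define QPIF_SHIFT 9
#define QPIF_MASK  0xf

/* Extract the 4-bit field that starts at bit 9 of the format word. */
static inline uint8_t elbaf_qualified_guard_type(uint32_t elbaf)
{
	return (elbaf >> QPIF_SHIFT) & QPIF_MASK;
}
```

For example, a word built as `(2u << 9) | (3u << 7)` yields 2 from this helper, because the two bits at positions 7 and 8 sit below the shifted field.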
+32 -14
@@ -5,6 +5,7 @@
#include <linux/fault-inject-usercopy.h>
#include <linux/instrumented.h>
#include <linux/minmax.h>
#include <linux/nospec.h>
#include <linux/sched.h>
#include <linux/thread_info.h>
@@ -138,13 +139,26 @@ __copy_to_user(void __user *to, const void *from, unsigned long n)
return raw_copy_to_user(to, from, n);
}
#ifdef INLINE_COPY_FROM_USER
/*
* Architectures that #define INLINE_COPY_TO_USER use this function
* directly in the normal copy_to/from_user(), the other ones go
* through an extern _copy_to/from_user(), which expands the same code
* here.
*
* Rust code always uses the extern definition.
*/
static inline __must_check unsigned long
_copy_from_user(void *to, const void __user *from, unsigned long n)
_inline_copy_from_user(void *to, const void __user *from, unsigned long n)
{
unsigned long res = n;
might_fault();
if (!should_fail_usercopy() && likely(access_ok(from, n))) {
/*
* Ensure that bad access_ok() speculation will not
* lead to nasty side effects *after* the copy is
* finished:
*/
barrier_nospec();
instrument_copy_from_user_before(to, from, n);
res = raw_copy_from_user(to, from, n);
instrument_copy_from_user_after(to, from, n, res);
@@ -153,14 +167,11 @@ _copy_from_user(void *to, const void __user *from, unsigned long n)
memset(to + (n - res), 0, res);
return res;
}
#else
extern __must_check unsigned long
_copy_from_user(void *, const void __user *, unsigned long);
#endif
#ifdef INLINE_COPY_TO_USER
static inline __must_check unsigned long
_copy_to_user(void __user *to, const void *from, unsigned long n)
_inline_copy_to_user(void __user *to, const void *from, unsigned long n)
{
might_fault();
if (should_fail_usercopy())
@@ -171,25 +182,32 @@ _copy_to_user(void __user *to, const void *from, unsigned long n)
}
return n;
}
#else
extern __must_check unsigned long
_copy_to_user(void __user *, const void *, unsigned long);
#endif
static __always_inline unsigned long __must_check
copy_from_user(void *to, const void __user *from, unsigned long n)
{
if (check_copy_size(to, n, false))
n = _copy_from_user(to, from, n);
return n;
if (!check_copy_size(to, n, false))
return n;
#ifdef INLINE_COPY_FROM_USER
return _inline_copy_from_user(to, from, n);
#else
return _copy_from_user(to, from, n);
#endif
}
static __always_inline unsigned long __must_check
copy_to_user(void __user *to, const void *from, unsigned long n)
{
if (check_copy_size(from, n, true))
n = _copy_to_user(to, from, n);
return n;
if (!check_copy_size(from, n, true))
return n;
#ifdef INLINE_COPY_TO_USER
return _inline_copy_to_user(to, from, n);
#else
return _copy_to_user(to, from, n);
#endif
}
#ifndef copy_mc_to_kernel
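The return-value contract of `_inline_copy_from_user()` above (the number of bytes that could not be copied, with that uncopied tail of the destination zeroed) can be modelled in userspace. The sketch below is a simulation under stated assumptions, not kernel code: `copy_from_user_sim` and its `readable` parameter are hypothetical, with `readable` standing in for the point at which the user access would fault.

```rust
/// Simulates the copy-then-zero-tail behaviour: copy up to `readable`
/// bytes from `src` into `dst`, zero whatever was not copied, and return
/// the number of bytes NOT copied (0 means complete success).
pub fn copy_from_user_sim(dst: &mut [u8], src: &[u8], readable: usize) -> usize {
    let n = dst.len();
    // Bytes that are actually transferred before the simulated fault.
    let copied = readable.min(n).min(src.len());
    dst[..copied].copy_from_slice(&src[..copied]);
    // Mirrors `memset(to + (n - res), 0, res)` in the hunk above.
    dst[copied..].fill(0);
    n - copied
}
```

Zeroing the tail means a caller that ignores a short copy still never observes stale kernel memory in the destination buffer.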
+4 -1
@@ -1924,7 +1924,10 @@ config RUSTC_VERSION_TEXT
config BINDGEN_VERSION_TEXT
string
depends on RUST
default $(shell,command -v $(BINDGEN) >/dev/null 2>&1 && $(BINDGEN) --version || echo n)
# The dummy parameter `workaround-for-0.69.0` is required to support 0.69.0
# (https://github.com/rust-lang/rust-bindgen/pull/2678). It can be removed when
# the minimum version is upgraded past that (0.69.1 already fixed the issue).
default $(shell,command -v $(BINDGEN) >/dev/null 2>&1 && $(BINDGEN) --version workaround-for-0.69.0 || echo n)
#
# Place an empty function call at each tracepoint site. Can be
+9 -4
@@ -1849,7 +1849,7 @@ fail:
} while (1);
/* avoid locking problems by failing it from a clean context */
if (ret < 0)
if (ret)
io_req_task_queue_fail(req, ret);
}
@@ -2416,12 +2416,14 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
if (uts) {
struct timespec64 ts;
ktime_t dt;
if (get_timespec64(&ts, uts))
return -EFAULT;
iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
io_napi_adjust_timeout(ctx, &iowq, &ts);
dt = timespec64_to_ktime(ts);
iowq.timeout = ktime_add(dt, ktime_get());
io_napi_adjust_timeout(ctx, &iowq, dt);
}
if (sig) {
@@ -3031,8 +3033,11 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
bool loop = false;
io_uring_drop_tctx_refs(current);
if (!tctx_inflight(tctx, !cancel_all))
break;
/* read completions before cancelations */
inflight = tctx_inflight(tctx, !cancel_all);
inflight = tctx_inflight(tctx, false);
if (!inflight)
break;
+1 -1
@@ -43,7 +43,7 @@ struct io_wait_queue {
ktime_t timeout;
#ifdef CONFIG_NET_RX_BUSY_POLL
unsigned int napi_busy_poll_to;
ktime_t napi_busy_poll_dt;
bool napi_prefer_busy_poll;
#endif
};
+3 -3
@@ -110,10 +110,10 @@ static struct io_kiocb *io_msg_get_kiocb(struct io_ring_ctx *ctx)
if (spin_trylock(&ctx->msg_lock)) {
req = io_alloc_cache_get(&ctx->msg_cache);
spin_unlock(&ctx->msg_lock);
if (req)
return req;
}
if (req)
return req;
return kmem_cache_alloc(req_cachep, GFP_KERNEL | __GFP_NOWARN);
return kmem_cache_alloc(req_cachep, GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO);
}
static int io_msg_data_remote(struct io_kiocb *req)
+29 -29
@@ -33,6 +33,12 @@ static struct io_napi_entry *io_napi_hash_find(struct hlist_head *hash_list,
return NULL;
}
static inline ktime_t net_to_ktime(unsigned long t)
{
/* napi approximating usecs, reverse busy_loop_current_time */
return ns_to_ktime(t << 10);
}
void __io_napi_add(struct io_ring_ctx *ctx, struct socket *sock)
{
struct hlist_head *hash_list;
@@ -102,14 +108,14 @@ static inline void io_napi_remove_stale(struct io_ring_ctx *ctx, bool is_stale)
__io_napi_remove_stale(ctx);
}
static inline bool io_napi_busy_loop_timeout(unsigned long start_time,
unsigned long bp_usec)
static inline bool io_napi_busy_loop_timeout(ktime_t start_time,
ktime_t bp)
{
if (bp_usec) {
unsigned long end_time = start_time + bp_usec;
unsigned long now = busy_loop_current_time();
if (bp) {
ktime_t end_time = ktime_add(start_time, bp);
ktime_t now = net_to_ktime(busy_loop_current_time());
return time_after(now, end_time);
return ktime_after(now, end_time);
}
return true;
@@ -124,7 +130,8 @@ static bool io_napi_busy_loop_should_end(void *data,
return true;
if (io_should_wake(iowq) || io_has_work(iowq->ctx))
return true;
if (io_napi_busy_loop_timeout(start_time, iowq->napi_busy_poll_to))
if (io_napi_busy_loop_timeout(net_to_ktime(start_time),
iowq->napi_busy_poll_dt))
return true;
return false;
@@ -181,10 +188,12 @@ static void io_napi_blocking_busy_loop(struct io_ring_ctx *ctx,
*/
void io_napi_init(struct io_ring_ctx *ctx)
{
u64 sys_dt = READ_ONCE(sysctl_net_busy_poll) * NSEC_PER_USEC;
INIT_LIST_HEAD(&ctx->napi_list);
spin_lock_init(&ctx->napi_lock);
ctx->napi_prefer_busy_poll = false;
ctx->napi_busy_poll_to = READ_ONCE(sysctl_net_busy_poll);
ctx->napi_busy_poll_dt = ns_to_ktime(sys_dt);
}
/*
@@ -217,11 +226,13 @@ void io_napi_free(struct io_ring_ctx *ctx)
int io_register_napi(struct io_ring_ctx *ctx, void __user *arg)
{
const struct io_uring_napi curr = {
.busy_poll_to = ctx->napi_busy_poll_to,
.busy_poll_to = ktime_to_us(ctx->napi_busy_poll_dt),
.prefer_busy_poll = ctx->napi_prefer_busy_poll
};
struct io_uring_napi napi;
if (ctx->flags & IORING_SETUP_IOPOLL)
return -EINVAL;
if (copy_from_user(&napi, arg, sizeof(napi)))
return -EFAULT;
if (napi.pad[0] || napi.pad[1] || napi.pad[2] || napi.resv)
@@ -230,7 +241,7 @@ int io_register_napi(struct io_ring_ctx *ctx, void __user *arg)
if (copy_to_user(arg, &curr, sizeof(curr)))
return -EFAULT;
WRITE_ONCE(ctx->napi_busy_poll_to, napi.busy_poll_to);
WRITE_ONCE(ctx->napi_busy_poll_dt, napi.busy_poll_to * NSEC_PER_USEC);
WRITE_ONCE(ctx->napi_prefer_busy_poll, !!napi.prefer_busy_poll);
WRITE_ONCE(ctx->napi_enabled, true);
return 0;
@@ -247,14 +258,14 @@ int io_register_napi(struct io_ring_ctx *ctx, void __user *arg)
int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg)
{
const struct io_uring_napi curr = {
.busy_poll_to = ctx->napi_busy_poll_to,
.busy_poll_to = ktime_to_us(ctx->napi_busy_poll_dt),
.prefer_busy_poll = ctx->napi_prefer_busy_poll
};
if (arg && copy_to_user(arg, &curr, sizeof(curr)))
return -EFAULT;
WRITE_ONCE(ctx->napi_busy_poll_to, 0);
WRITE_ONCE(ctx->napi_busy_poll_dt, 0);
WRITE_ONCE(ctx->napi_prefer_busy_poll, false);
WRITE_ONCE(ctx->napi_enabled, false);
return 0;
@@ -271,25 +282,14 @@ int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg)
* the NAPI timeout accordingly.
*/
void __io_napi_adjust_timeout(struct io_ring_ctx *ctx, struct io_wait_queue *iowq,
struct timespec64 *ts)
ktime_t to_wait)
{
unsigned int poll_to = READ_ONCE(ctx->napi_busy_poll_to);
ktime_t poll_dt = READ_ONCE(ctx->napi_busy_poll_dt);
if (ts) {
struct timespec64 poll_to_ts;
if (to_wait)
poll_dt = min(poll_dt, to_wait);
poll_to_ts = ns_to_timespec64(1000 * (s64)poll_to);
if (timespec64_compare(ts, &poll_to_ts) < 0) {
s64 poll_to_ns = timespec64_to_ns(ts);
if (poll_to_ns > 0) {
u64 val = poll_to_ns + 999;
do_div(val, 1000);
poll_to = val;
}
}
}
iowq->napi_busy_poll_to = poll_to;
iowq->napi_busy_poll_dt = poll_dt;
}
/*
@@ -318,7 +318,7 @@ int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx)
LIST_HEAD(napi_list);
bool is_stale = false;
if (!READ_ONCE(ctx->napi_busy_poll_to))
if (!READ_ONCE(ctx->napi_busy_poll_dt))
return 0;
if (list_empty_careful(&ctx->napi_list))
return 0;
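The NAPI hunks above move the busy-poll timeout from microseconds in an `unsigned int` to nanoseconds in a `ktime_t`. The arithmetic can be sketched in userspace as follows. This is an illustration only: the function names are hypothetical, plain `i64`/`u64` nanosecond counts stand in for `ktime_t`, and the `<< 10` conversion is taken from the comment on `net_to_ktime()` rather than from an inspection of `busy_loop_current_time()` itself.

```rust
/// Stand-in for `net_to_ktime()`: the busy-poll clock counts units of
/// 1024 ns ("approximate microseconds"), so shifting left by 10 recovers
/// nanoseconds.
pub fn net_to_ns(t: u64) -> u64 {
    t << 10
}

/// Stand-in for `__io_napi_adjust_timeout()`: clamp the configured busy-poll
/// duration to the caller's wait time; a zero `to_wait_ns` leaves it as is.
pub fn adjust_busy_poll_ns(poll_dt_ns: i64, to_wait_ns: i64) -> i64 {
    if to_wait_ns != 0 {
        poll_dt_ns.min(to_wait_ns)
    } else {
        poll_dt_ns
    }
}

/// Stand-in for `io_napi_busy_loop_timeout()`: with a non-zero budget the
/// loop ends once `now` is strictly after `start + budget`; with a zero
/// budget it ends immediately.
pub fn busy_loop_timed_out(start_ns: i64, bp_ns: i64, now_ns: i64) -> bool {
    if bp_ns != 0 {
        now_ns > start_ns + bp_ns
    } else {
        true
    }
}
```

Keeping every quantity in nanoseconds replaces the removed `timespec64` comparison and round-up division with a single `min()`.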
+5 -5
@@ -18,7 +18,7 @@ int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg);
void __io_napi_add(struct io_ring_ctx *ctx, struct socket *sock);
void __io_napi_adjust_timeout(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq, struct timespec64 *ts);
struct io_wait_queue *iowq, ktime_t to_wait);
void __io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq);
int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx);
@@ -29,11 +29,11 @@ static inline bool io_napi(struct io_ring_ctx *ctx)
static inline void io_napi_adjust_timeout(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
struct timespec64 *ts)
ktime_t to_wait)
{
if (!io_napi(ctx))
return;
__io_napi_adjust_timeout(ctx, iowq, ts);
__io_napi_adjust_timeout(ctx, iowq, to_wait);
}
static inline void io_napi_busy_loop(struct io_ring_ctx *ctx,
@@ -55,7 +55,7 @@ static inline void io_napi_add(struct io_kiocb *req)
struct io_ring_ctx *ctx = req->ctx;
struct socket *sock;
if (!READ_ONCE(ctx->napi_busy_poll_to))
if (!READ_ONCE(ctx->napi_busy_poll_dt))
return;
sock = sock_from_file(req->file);
@@ -88,7 +88,7 @@ static inline void io_napi_add(struct io_kiocb *req)
}
static inline void io_napi_adjust_timeout(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
struct timespec64 *ts)
ktime_t to_wait)
{
}
static inline void io_napi_busy_loop(struct io_ring_ctx *ctx,
+1 -1
@@ -639,7 +639,7 @@ void io_queue_linked_timeout(struct io_kiocb *req)
static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
bool cancel_all)
__must_hold(&req->ctx->timeout_lock)
__must_hold(&head->ctx->timeout_lock)
{
struct io_kiocb *req;
+1 -1
@@ -265,7 +265,7 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
req_set_fail(req);
io_req_uring_cleanup(req, issue_flags);
io_req_set_res(req, ret, 0);
return ret < 0 ? ret : IOU_OK;
return IOU_OK;
}
int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+4 -26
@@ -12,40 +12,18 @@
/* out-of-line parts */
#ifndef INLINE_COPY_FROM_USER
#if !defined(INLINE_COPY_FROM_USER) || defined(CONFIG_RUST)
unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n)
{
unsigned long res = n;
might_fault();
if (!should_fail_usercopy() && likely(access_ok(from, n))) {
/*
* Ensure that bad access_ok() speculation will not
* lead to nasty side effects *after* the copy is
* finished:
*/
barrier_nospec();
instrument_copy_from_user_before(to, from, n);
res = raw_copy_from_user(to, from, n);
instrument_copy_from_user_after(to, from, n, res);
}
if (unlikely(res))
memset(to + (n - res), 0, res);
return res;
return _inline_copy_from_user(to, from, n);
}
EXPORT_SYMBOL(_copy_from_user);
#endif
#ifndef INLINE_COPY_TO_USER
#if !defined(INLINE_COPY_TO_USER) || defined(CONFIG_RUST)
unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
{
might_fault();
if (should_fail_usercopy())
return n;
if (likely(access_ok(to, n))) {
instrument_copy_to_user(to, from, n);
n = raw_copy_to_user(to, from, n);
}
return n;
return _inline_copy_to_user(to, from, n);
}
EXPORT_SYMBOL(_copy_to_user);
#endif
+1 -1
@@ -44,7 +44,7 @@
#include <net/sock.h>
#include <net/raw.h>
#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
#define TCPUDP_MIB_MAX MAX_T(u32, UDP_MIB_MAX, TCP_MIB_MAX)
/*
* Report socket allocation statistics [mea@utu.fi]
+1 -1
@@ -27,7 +27,7 @@
#include <net/ipv6.h>
#define MAX4(a, b, c, d) \
max_t(u32, max_t(u32, a, b), max_t(u32, c, d))
MAX_T(u32, MAX_T(u32, a, b), MAX_T(u32, c, d))
#define SNMP_MIB_MAX MAX4(UDP_MIB_MAX, TCP_MIB_MAX, \
IPSTATS_MIB_MAX, ICMP_MIB_MAX)
+10 -64
@@ -44,17 +44,10 @@ rustc_sysroot := $(shell MAKEFLAGS= $(RUSTC) $(rust_flags) --print sysroot)
rustc_host_target := $(shell $(RUSTC) --version --verbose | grep -F 'host: ' | cut -d' ' -f2)
RUST_LIB_SRC ?= $(rustc_sysroot)/lib/rustlib/src/rust/library
ifeq ($(quiet),silent_)
cargo_quiet=-q
ifneq ($(quiet),)
rust_test_quiet=-q
rustdoc_test_quiet=--test-args -q
rustdoc_test_kernel_quiet=>/dev/null
else ifeq ($(quiet),quiet_)
rust_test_quiet=-q
rustdoc_test_quiet=--test-args -q
rustdoc_test_kernel_quiet=>/dev/null
else
cargo_quiet=--verbose
endif
core-cfgs = \
@@ -135,22 +128,21 @@ quiet_cmd_rustc_test_library = RUSTC TL $<
@$(objtree)/include/generated/rustc_cfg $(rustc_target_flags) \
--crate-type $(if $(rustc_test_library_proc),proc-macro,rlib) \
--out-dir $(objtree)/$(obj)/test --cfg testlib \
--sysroot $(objtree)/$(obj)/test/sysroot \
-L$(objtree)/$(obj)/test \
--crate-name $(subst rusttest-,,$(subst rusttestlib-,,$@)) $<
rusttestlib-build_error: $(src)/build_error.rs rusttest-prepare FORCE
rusttestlib-build_error: $(src)/build_error.rs FORCE
+$(call if_changed,rustc_test_library)
rusttestlib-macros: private rustc_target_flags = --extern proc_macro
rusttestlib-macros: private rustc_test_library_proc = yes
rusttestlib-macros: $(src)/macros/lib.rs rusttest-prepare FORCE
rusttestlib-macros: $(src)/macros/lib.rs FORCE
+$(call if_changed,rustc_test_library)
rusttestlib-bindings: $(src)/bindings/lib.rs rusttest-prepare FORCE
rusttestlib-bindings: $(src)/bindings/lib.rs FORCE
+$(call if_changed,rustc_test_library)
rusttestlib-uapi: $(src)/uapi/lib.rs rusttest-prepare FORCE
rusttestlib-uapi: $(src)/uapi/lib.rs FORCE
+$(call if_changed,rustc_test_library)
quiet_cmd_rustdoc_test = RUSTDOC T $<
@@ -159,7 +151,7 @@ quiet_cmd_rustdoc_test = RUSTDOC T $<
$(RUSTDOC) --test $(rust_common_flags) \
@$(objtree)/include/generated/rustc_cfg \
$(rustc_target_flags) $(rustdoc_test_target_flags) \
--sysroot $(objtree)/$(obj)/test/sysroot $(rustdoc_test_quiet) \
$(rustdoc_test_quiet) \
-L$(objtree)/$(obj)/test --output $(rustdoc_output) \
--crate-name $(subst rusttest-,,$@) $<
@@ -192,7 +184,6 @@ quiet_cmd_rustc_test = RUSTC T $<
$(RUSTC) --test $(rust_common_flags) \
@$(objtree)/include/generated/rustc_cfg \
$(rustc_target_flags) --out-dir $(objtree)/$(obj)/test \
--sysroot $(objtree)/$(obj)/test/sysroot \
-L$(objtree)/$(obj)/test \
--crate-name $(subst rusttest-,,$@) $<; \
$(objtree)/$(obj)/test/$(subst rusttest-,,$@) $(rust_test_quiet) \
@@ -200,60 +191,15 @@ quiet_cmd_rustc_test = RUSTC T $<
rusttest: rusttest-macros rusttest-kernel
# This prepares a custom sysroot with our custom `alloc` instead of
# the standard one.
#
# This requires several hacks:
# - Unlike `core` and `alloc`, `std` depends on more than a dozen crates,
# including third-party crates that need to be downloaded, plus custom
# `build.rs` steps. Thus hardcoding things here is not maintainable.
# - `cargo` knows how to build the standard library, but it is an unstable
# feature so far (`-Zbuild-std`).
# - `cargo` only considers the use case of building the standard library
# to use it in a given package. Thus we need to create a dummy package
# and pick the generated libraries from there.
# - The usual ways of modifying the dependency graph in `cargo` do not seem
# to apply for the `-Zbuild-std` steps, thus we have to mislead it
# by modifying the sources in the sysroot.
# - To avoid messing with the user's Rust installation, we create a clone
# of the sysroot. However, `cargo` ignores `RUSTFLAGS` in the `-Zbuild-std`
# steps, thus we use a wrapper binary passed via `RUSTC` to pass the flag.
#
# In the future, we hope to avoid the whole ordeal by either:
# - Making the `test` crate not depend on `std` (either improving upstream
# or having our own custom crate).
# - Making the tests run in kernel space (requires the previous point).
# - Making `std` and friends be more like a "normal" crate, so that
# `-Zbuild-std` and related hacks are not needed.
quiet_cmd_rustsysroot = RUSTSYSROOT
cmd_rustsysroot = \
rm -rf $(objtree)/$(obj)/test; \
mkdir -p $(objtree)/$(obj)/test; \
cp -a $(rustc_sysroot) $(objtree)/$(obj)/test/sysroot; \
echo '\#!/bin/sh' > $(objtree)/$(obj)/test/rustc_sysroot; \
echo "$(RUSTC) --sysroot=$(abspath $(objtree)/$(obj)/test/sysroot) \"\$$@\"" \
>> $(objtree)/$(obj)/test/rustc_sysroot; \
chmod u+x $(objtree)/$(obj)/test/rustc_sysroot; \
$(CARGO) -q new $(objtree)/$(obj)/test/dummy; \
RUSTC=$(objtree)/$(obj)/test/rustc_sysroot $(CARGO) $(cargo_quiet) \
test -Zbuild-std --target $(rustc_host_target) \
--manifest-path $(objtree)/$(obj)/test/dummy/Cargo.toml; \
rm $(objtree)/$(obj)/test/sysroot/lib/rustlib/$(rustc_host_target)/lib/*; \
cp $(objtree)/$(obj)/test/dummy/target/$(rustc_host_target)/debug/deps/* \
$(objtree)/$(obj)/test/sysroot/lib/rustlib/$(rustc_host_target)/lib
rusttest-prepare: FORCE
+$(call if_changed,rustsysroot)
rusttest-macros: private rustc_target_flags = --extern proc_macro
rusttest-macros: private rustdoc_test_target_flags = --crate-type proc-macro
rusttest-macros: $(src)/macros/lib.rs rusttest-prepare FORCE
rusttest-macros: $(src)/macros/lib.rs FORCE
+$(call if_changed,rustc_test)
+$(call if_changed,rustdoc_test)
rusttest-kernel: private rustc_target_flags = --extern alloc \
--extern build_error --extern macros --extern bindings --extern uapi
rusttest-kernel: $(src)/kernel/lib.rs rusttest-prepare \
rusttest-kernel: $(src)/kernel/lib.rs \
rusttestlib-build_error rusttestlib-macros rusttestlib-bindings \
rusttestlib-uapi FORCE
+$(call if_changed,rustc_test)
@@ -421,7 +367,7 @@ ifneq ($(or $(CONFIG_ARM64),$(and $(CONFIG_RISCV),$(CONFIG_64BIT))),)
endif
$(obj)/core.o: private skip_clippy = 1
$(obj)/core.o: private skip_flags = -Dunreachable_pub
$(obj)/core.o: private skip_flags = -Wunreachable_pub
$(obj)/core.o: private rustc_objcopy = $(foreach sym,$(redirect-intrinsics),--redefine-sym $(sym)=__rust$(sym))
$(obj)/core.o: private rustc_target_flags = $(core-cfgs)
$(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs FORCE
@@ -435,7 +381,7 @@ $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE
+$(call if_changed_dep,rustc_library)
$(obj)/alloc.o: private skip_clippy = 1
$(obj)/alloc.o: private skip_flags = -Dunreachable_pub
$(obj)/alloc.o: private skip_flags = -Wunreachable_pub
$(obj)/alloc.o: private rustc_target_flags = $(alloc-cfgs)
$(obj)/alloc.o: $(RUST_LIB_SRC)/alloc/src/lib.rs $(obj)/compiler_builtins.o FORCE
+$(call if_changed_dep,rustc_library)
+1
@@ -30,4 +30,5 @@ const gfp_t RUST_CONST_HELPER_GFP_KERNEL = GFP_KERNEL;
const gfp_t RUST_CONST_HELPER_GFP_KERNEL_ACCOUNT = GFP_KERNEL_ACCOUNT;
const gfp_t RUST_CONST_HELPER_GFP_NOWAIT = GFP_NOWAIT;
const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
+1
@@ -24,6 +24,7 @@
unsafe_op_in_unsafe_fn
)]
#[allow(dead_code)]
mod bindings_raw {
// Use glob import here to expose all helpers.
// Symbols defined within the module will take precedence to the glob import.
+34
@@ -26,6 +26,8 @@
#include <linux/device.h>
#include <linux/err.h>
#include <linux/errname.h>
#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/mutex.h>
#include <linux/refcount.h>
#include <linux/sched/signal.h>
@@ -40,6 +42,20 @@ __noreturn void rust_helper_BUG(void)
}
EXPORT_SYMBOL_GPL(rust_helper_BUG);
unsigned long rust_helper_copy_from_user(void *to, const void __user *from,
unsigned long n)
{
return copy_from_user(to, from, n);
}
EXPORT_SYMBOL_GPL(rust_helper_copy_from_user);
unsigned long rust_helper_copy_to_user(void __user *to, const void *from,
unsigned long n)
{
return copy_to_user(to, from, n);
}
EXPORT_SYMBOL_GPL(rust_helper_copy_to_user);
void rust_helper_mutex_lock(struct mutex *lock)
{
mutex_lock(lock);
@@ -81,6 +97,24 @@ int rust_helper_signal_pending(struct task_struct *t)
}
EXPORT_SYMBOL_GPL(rust_helper_signal_pending);
struct page *rust_helper_alloc_pages(gfp_t gfp_mask, unsigned int order)
{
return alloc_pages(gfp_mask, order);
}
EXPORT_SYMBOL_GPL(rust_helper_alloc_pages);
void *rust_helper_kmap_local_page(struct page *page)
{
return kmap_local_page(page);
}
EXPORT_SYMBOL_GPL(rust_helper_kmap_local_page);
void rust_helper_kunmap_local(const void *addr)
{
kunmap_local(addr);
}
EXPORT_SYMBOL_GPL(rust_helper_kunmap_local);
refcount_t rust_helper_REFCOUNT_INIT(int n)
{
return (refcount_t)REFCOUNT_INIT(n);
+16 -1
@@ -20,6 +20,13 @@ pub struct AllocError;
#[derive(Clone, Copy)]
pub struct Flags(u32);
impl Flags {
/// Get the raw representation of this flag.
pub(crate) fn as_raw(self) -> u32 {
self.0
}
}
impl core::ops::BitOr for Flags {
type Output = Self;
fn bitor(self, rhs: Self) -> Self::Output {
@@ -52,6 +59,14 @@ pub mod flags {
/// This is normally or'd with other flags.
pub const __GFP_ZERO: Flags = Flags(bindings::__GFP_ZERO);
/// Allow the allocation to be in high memory.
///
/// Allocations in high memory may not be mapped into the kernel's address space, so this can't
/// be used with `kmalloc` and other similar methods.
///
/// This is normally or'd with other flags.
pub const __GFP_HIGHMEM: Flags = Flags(bindings::__GFP_HIGHMEM);
/// Users can not sleep and need the allocation to succeed.
///
/// A lower watermark is applied to allow access to "atomic reserves". The current
@@ -66,7 +81,7 @@ pub mod flags {
/// The same as [`GFP_KERNEL`], except the allocation is accounted to kmemcg.
pub const GFP_KERNEL_ACCOUNT: Flags = Flags(bindings::GFP_KERNEL_ACCOUNT);
/// Ror kernel allocations that should not stall for direct reclaim, start physical IO or
/// For kernel allocations that should not stall for direct reclaim, start physical IO or
/// use any filesystem callback. It is very likely to fail to allocate memory, even for very
/// small allocations.
pub const GFP_NOWAIT: Flags = Flags(bindings::GFP_NOWAIT);
+4 -9
@@ -843,11 +843,8 @@ where
let val = unsafe { &mut *slot };
// SAFETY: `slot` is considered pinned.
let val = unsafe { Pin::new_unchecked(val) };
(self.1)(val).map_err(|e| {
// SAFETY: `slot` was initialized above.
unsafe { core::ptr::drop_in_place(slot) };
e
})
// SAFETY: `slot` was initialized above.
(self.1)(val).inspect_err(|_| unsafe { core::ptr::drop_in_place(slot) })
}
}
@@ -941,11 +938,9 @@ where
// SAFETY: All requirements fulfilled since this function is `__init`.
unsafe { self.0.__pinned_init(slot)? };
// SAFETY: The above call initialized `slot` and we still have unique access.
(self.1)(unsafe { &mut *slot }).map_err(|e| {
(self.1)(unsafe { &mut *slot }).inspect_err(|_|
// SAFETY: `slot` was initialized above.
unsafe { core::ptr::drop_in_place(slot) };
e
})
unsafe { core::ptr::drop_in_place(slot) })
}
}
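Both hunks above swap a `map_err` closure that runs a side effect and returns the error unchanged for `Result::inspect_err`. A small userspace sketch of why the two forms behave the same; the function names and the `Vec` used to record the side effect are hypothetical, and `inspect_err` is assumed to be available (it is stable in Rust 1.76 and later).

```rust
/// Old shape: run a side effect, then hand the error back unchanged.
pub fn with_map_err(r: Result<u32, i32>, log: &mut Vec<i32>) -> Result<u32, i32> {
    r.map_err(|e| {
        log.push(e); // stands in for `drop_in_place(slot)`
        e
    })
}

/// New shape: `inspect_err` only borrows the error, so it cannot alter it.
pub fn with_inspect_err(r: Result<u32, i32>, log: &mut Vec<i32>) -> Result<u32, i32> {
    r.inspect_err(|e| log.push(*e))
}
```

In both versions the closure runs only on the `Err` path, so the cleanup is skipped when initialization succeeds.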
+2
@@ -40,6 +40,7 @@ pub mod ioctl;
pub mod kunit;
#[cfg(CONFIG_NET)]
pub mod net;
pub mod page;
pub mod prelude;
pub mod print;
mod static_assert;
@@ -50,6 +51,7 @@ pub mod sync;
pub mod task;
pub mod time;
pub mod types;
pub mod uaccess;
pub mod workqueue;
#[doc(hidden)]
+250
@@ -0,0 +1,250 @@
// SPDX-License-Identifier: GPL-2.0
//! Kernel page allocation and management.
use crate::{
alloc::{AllocError, Flags},
bindings,
error::code::*,
error::Result,
uaccess::UserSliceReader,
};
use core::ptr::{self, NonNull};
/// A bitwise shift for the page size.
pub const PAGE_SHIFT: usize = bindings::PAGE_SHIFT as usize;
/// The number of bytes in a page.
pub const PAGE_SIZE: usize = bindings::PAGE_SIZE;
/// A bitmask that gives the page containing a given address.
pub const PAGE_MASK: usize = !(PAGE_SIZE - 1);
/// A pointer to a page that owns the page allocation.
///
/// # Invariants
///
/// The pointer is valid, and has ownership over the page.
pub struct Page {
page: NonNull<bindings::page>,
}
// SAFETY: Pages have no logic that relies on them staying on a given thread, so moving them across
// threads is safe.
unsafe impl Send for Page {}
// SAFETY: Pages have no logic that relies on them not being accessed concurrently, so accessing
// them concurrently is safe.
unsafe impl Sync for Page {}
impl Page {
/// Allocates a new page.
///
/// # Examples
///
/// Allocate memory for a page.
///
/// ```
/// use kernel::page::Page;
///
/// # fn dox() -> Result<(), kernel::alloc::AllocError> {
/// let page = Page::alloc_page(GFP_KERNEL)?;
/// # Ok(()) }
/// ```
///
/// Allocate memory for a page and zero its contents.
///
/// ```
/// use kernel::page::Page;
///
/// # fn dox() -> Result<(), kernel::alloc::AllocError> {
/// let page = Page::alloc_page(GFP_KERNEL | __GFP_ZERO)?;
/// # Ok(()) }
/// ```
pub fn alloc_page(flags: Flags) -> Result<Self, AllocError> {
// SAFETY: Depending on the value of `flags`, this call may sleep. Other than that, it
// is always safe to call this method.
let page = unsafe { bindings::alloc_pages(flags.as_raw(), 0) };
let page = NonNull::new(page).ok_or(AllocError)?;
// INVARIANT: We just successfully allocated a page, so we now have ownership of the newly
// allocated page. We transfer that ownership to the new `Page` object.
Ok(Self { page })
}
/// Returns a raw pointer to the page.
pub fn as_ptr(&self) -> *mut bindings::page {
self.page.as_ptr()
}
/// Runs a piece of code with this page mapped to an address.
///
/// The page is unmapped when this call returns.
///
/// # Using the raw pointer
///
/// It is up to the caller to use the provided raw pointer correctly. The pointer is valid for
/// `PAGE_SIZE` bytes and for the duration in which the closure is called. The pointer might
/// only be mapped on the current thread, and when that is the case, dereferencing it on other
/// threads is UB. Other than that, the usual rules for dereferencing a raw pointer apply: don't
/// cause data races, the memory may be uninitialized, and so on.
///
/// If multiple threads map the same page at the same time, then they may reference it with
/// different addresses. However, even if the addresses are different, the underlying memory is
/// still the same for these purposes (e.g., it's still a data race if they both write to the
/// same underlying byte at the same time).
fn with_page_mapped<T>(&self, f: impl FnOnce(*mut u8) -> T) -> T {
// SAFETY: `page` is valid due to the type invariants on `Page`.
let mapped_addr = unsafe { bindings::kmap_local_page(self.as_ptr()) };
let res = f(mapped_addr.cast());
// This unmaps the page mapped above.
//
// SAFETY: Since this API takes the user code as a closure, it can only be used in a manner
// where the pages are unmapped in reverse order. This is as required by `kunmap_local`.
//
// In other words, if this call to `kunmap_local` happens when a different page should be
// unmapped first, then there must necessarily be a call to `kmap_local_page` other than the
// call just above in `with_page_mapped` that made that possible. In this case, it is the
// unsafe block that wraps that other call that is incorrect.
unsafe { bindings::kunmap_local(mapped_addr) };
res
}
/// Runs a piece of code with a raw pointer to a slice of this page, with bounds checking.
///
/// If `f` is called, then it will be called with a pointer that points at `off` bytes into the
/// page, and the pointer will be valid for at least `len` bytes. The pointer is only valid on
/// this task, as this method uses a local mapping.
///
/// If `off` and `len` refer to a region outside of this page, then this method returns
/// [`EINVAL`] and does not call `f`.
///
/// # Using the raw pointer
///
/// It is up to the caller to use the provided raw pointer correctly. The pointer is valid for
/// `len` bytes and for the duration in which the closure is called. The pointer might only be
/// mapped on the current thread, and when that is the case, dereferencing it on other threads
/// is UB. Other than that, the usual rules for dereferencing a raw pointer apply: don't cause
/// data races, the memory may be uninitialized, and so on.
///
/// If multiple threads map the same page at the same time, then they may reference it with
/// different addresses. However, even if the addresses are different, the underlying memory is
/// still the same for these purposes (e.g., it's still a data race if they both write to the
/// same underlying byte at the same time).
fn with_pointer_into_page<T>(
&self,
off: usize,
len: usize,
f: impl FnOnce(*mut u8) -> Result<T>,
) -> Result<T> {
let bounds_ok = off <= PAGE_SIZE && len <= PAGE_SIZE && (off + len) <= PAGE_SIZE;
if bounds_ok {
self.with_page_mapped(move |page_addr| {
// SAFETY: The `off` integer is at most `PAGE_SIZE`, so this pointer offset will
// result in a pointer that is in bounds or one off the end of the page.
f(unsafe { page_addr.add(off) })
})
} else {
Err(EINVAL)
}
}
/// Maps the page and reads from it into the given buffer.
///
/// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
/// outside of the page, then this call returns [`EINVAL`].
///
/// # Safety
///
/// * Callers must ensure that `dst` is valid for writing `len` bytes.
/// * Callers must ensure that this call does not race with a write to the same page that
/// overlaps with this read.
pub unsafe fn read_raw(&self, dst: *mut u8, offset: usize, len: usize) -> Result {
self.with_pointer_into_page(offset, len, move |src| {
// SAFETY: If `with_pointer_into_page` calls into this closure, then
// it has performed a bounds check and guarantees that `src` is
// valid for `len` bytes.
//
// The caller guarantees that there is no data race.
unsafe { ptr::copy_nonoverlapping(src, dst, len) };
Ok(())
})
}
/// Maps the page and writes into it from the given buffer.
///
/// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
/// outside of the page, then this call returns [`EINVAL`].
///
/// # Safety
///
/// * Callers must ensure that `src` is valid for reading `len` bytes.
/// * Callers must ensure that this call does not race with a read or write to the same page
/// that overlaps with this write.
pub unsafe fn write_raw(&self, src: *const u8, offset: usize, len: usize) -> Result {
self.with_pointer_into_page(offset, len, move |dst| {
// SAFETY: If `with_pointer_into_page` calls into this closure, then it has performed a
// bounds check and guarantees that `dst` is valid for `len` bytes.
//
// The caller guarantees that there is no data race.
unsafe { ptr::copy_nonoverlapping(src, dst, len) };
Ok(())
})
}
/// Maps the page and zeroes the given slice.
///
/// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
/// outside of the page, then this call returns [`EINVAL`].
///
/// # Safety
///
/// Callers must ensure that this call does not race with a read or write to the same page that
/// overlaps with this write.
pub unsafe fn fill_zero_raw(&self, offset: usize, len: usize) -> Result {
self.with_pointer_into_page(offset, len, move |dst| {
// SAFETY: If `with_pointer_into_page` calls into this closure, then it has performed a
// bounds check and guarantees that `dst` is valid for `len` bytes.
//
// The caller guarantees that there is no data race.
unsafe { ptr::write_bytes(dst, 0u8, len) };
Ok(())
})
}
/// Copies data from userspace into this page.
///
/// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
/// outside of the page, then this call returns [`EINVAL`].
///
/// Like the other `UserSliceReader` methods, data races are allowed on the userspace address.
/// However, they are not allowed on the page you are copying into.
///
/// # Safety
///
/// Callers must ensure that this call does not race with a read or write to the same page that
/// overlaps with this write.
pub unsafe fn copy_from_user_slice_raw(
&self,
reader: &mut UserSliceReader,
offset: usize,
len: usize,
) -> Result {
self.with_pointer_into_page(offset, len, move |dst| {
// SAFETY: If `with_pointer_into_page` calls into this closure, then it has performed a
// bounds check and guarantees that `dst` is valid for `len` bytes. Furthermore, we have
// exclusive access to the slice since the caller guarantees that there are no races.
reader.read_raw(unsafe { core::slice::from_raw_parts_mut(dst.cast(), len) })
})
}
}
impl Drop for Page {
fn drop(&mut self) {
// SAFETY: By the type invariants, we have ownership of the page and can free it.
unsafe { bindings::__free_pages(self.page.as_ptr(), 0) };
}
}
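The bounds rule documented on the raw page helpers above (reject any `offset .. offset+len` that leaves the page) can be tried out in userspace. The following is a minimal sketch, not the kernel implementation: `FakePage`, `Invalid` (standing in for `EINVAL`), the fixed `PAGE_SIZE` of 4096 and the `byte` accessor are all hypothetical, and safe `&mut self` borrows replace the raw-pointer closure used by the real code.

```rust
/// Assumed page size for this sketch; the kernel value is architecture-dependent.
pub const PAGE_SIZE: usize = 4096;

/// Hypothetical stand-in for the kernel's `EINVAL` error code.
#[derive(Debug, PartialEq, Eq)]
pub struct Invalid;

/// Hypothetical userspace model of a page: just a heap buffer of `PAGE_SIZE` bytes.
pub struct FakePage {
    data: Vec<u8>,
}

impl FakePage {
    /// Creates a page filled with `0xff` so that zeroing is observable.
    pub fn new() -> Self {
        Self { data: vec![0xff; PAGE_SIZE] }
    }

    /// Validates `offset .. offset + len`, rejecting overflow and out-of-page ranges.
    fn range(offset: usize, len: usize) -> Result<core::ops::Range<usize>, Invalid> {
        let end = offset.checked_add(len).ok_or(Invalid)?;
        if end > PAGE_SIZE {
            return Err(Invalid);
        }
        Ok(offset..end)
    }

    /// Copies `src` into the page at `offset` after the bounds check.
    pub fn write(&mut self, src: &[u8], offset: usize) -> Result<(), Invalid> {
        let range = Self::range(offset, src.len())?;
        self.data[range].copy_from_slice(src);
        Ok(())
    }

    /// Zeroes `len` bytes at `offset` after the bounds check.
    pub fn fill_zero(&mut self, offset: usize, len: usize) -> Result<(), Invalid> {
        let range = Self::range(offset, len)?;
        self.data[range].fill(0);
        Ok(())
    }

    /// Returns the byte at index `i`, if it lies inside the page.
    pub fn byte(&self, i: usize) -> Option<u8> {
        self.data.get(i).copied()
    }
}
```

Using `checked_add` before comparing against `PAGE_SIZE` means that a huge `offset` cannot wrap around and pass the check by accident.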
+64
View File
@@ -409,3 +409,67 @@ pub enum Either<L, R> {
/// Constructs an instance of [`Either`] containing a value of type `R`.
Right(R),
}
/// Types for which any bit pattern is valid.
///
/// Not all types are valid for all values. For example, a `bool` must be either zero or one, so
/// reading arbitrary bytes into something that contains a `bool` is not okay.
///
/// It's okay for the type to have padding, as initializing those bytes has no effect.
///
/// # Safety
///
/// All bit-patterns must be valid for this type. This type must not have interior mutability.
pub unsafe trait FromBytes {}
// SAFETY: All bit patterns are acceptable values of the types below.
unsafe impl FromBytes for u8 {}
unsafe impl FromBytes for u16 {}
unsafe impl FromBytes for u32 {}
unsafe impl FromBytes for u64 {}
unsafe impl FromBytes for usize {}
unsafe impl FromBytes for i8 {}
unsafe impl FromBytes for i16 {}
unsafe impl FromBytes for i32 {}
unsafe impl FromBytes for i64 {}
unsafe impl FromBytes for isize {}
// SAFETY: If all bit patterns are acceptable for individual values in an array, then all bit
// patterns are also acceptable for arrays of that type.
unsafe impl<T: FromBytes> FromBytes for [T] {}
unsafe impl<T: FromBytes, const N: usize> FromBytes for [T; N] {}
/// Types that can be viewed as an immutable slice of initialized bytes.
///
/// If a struct implements this trait, then it is okay to copy it byte-for-byte to userspace. This
/// means that it should not have any padding, as padding bytes are uninitialized. Reading
/// uninitialized memory is not just undefined behavior, it may even lead to leaking sensitive
/// information on the stack to userspace.
///
/// The struct should also not hold kernel pointers, as kernel pointer addresses are also considered
/// sensitive. However, leaking kernel pointers is not considered undefined behavior by Rust, so
/// this is a correctness requirement, but not a safety requirement.
///
/// # Safety
///
/// Values of this type may not contain any uninitialized bytes. This type must not have interior
/// mutability.
pub unsafe trait AsBytes {}
// SAFETY: Instances of the following types have no uninitialized portions.
unsafe impl AsBytes for u8 {}
unsafe impl AsBytes for u16 {}
unsafe impl AsBytes for u32 {}
unsafe impl AsBytes for u64 {}
unsafe impl AsBytes for usize {}
unsafe impl AsBytes for i8 {}
unsafe impl AsBytes for i16 {}
unsafe impl AsBytes for i32 {}
unsafe impl AsBytes for i64 {}
unsafe impl AsBytes for isize {}
unsafe impl AsBytes for bool {}
unsafe impl AsBytes for char {}
unsafe impl AsBytes for str {}
// SAFETY: If individual values in an array have no uninitialized portions, then the array itself
// does not have any uninitialized portions either.
unsafe impl<T: AsBytes> AsBytes for [T] {}
unsafe impl<T: AsBytes, const N: usize> AsBytes for [T; N] {}
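To illustrate why the `FromBytes` contract matters, here is a small userspace sketch that re-declares a marker trait with the same safety requirement and uses it to justify an `assume_init`. The helper `read_from_prefix` is hypothetical and is not part of the kernel crate; only the trait name and its safety contract come from the hunk above.

```rust
use core::mem::{size_of, MaybeUninit};

/// Userspace re-declaration of the marker trait, for illustration only.
///
/// # Safety
///
/// Every bit pattern must be a valid value of the implementing type.
pub unsafe trait FromBytes: Sized {}

// SAFETY: All bit patterns are valid values of these integer types.
unsafe impl FromBytes for u8 {}
unsafe impl FromBytes for u32 {}
// SAFETY: If every bit pattern is valid for `T`, the same holds for arrays of `T`.
unsafe impl<T: FromBytes, const N: usize> FromBytes for [T; N] {}

/// Hypothetical helper: builds a `T` from the first `size_of::<T>()` bytes of `bytes`.
///
/// Returns `None` if `bytes` is too short.
pub fn read_from_prefix<T: FromBytes>(bytes: &[u8]) -> Option<T> {
    let len = size_of::<T>();
    if bytes.len() < len {
        return None;
    }
    let mut out = MaybeUninit::<T>::uninit();
    // SAFETY: `bytes` is valid for reading `len` bytes, `out` is valid for writing
    // `len` bytes, and the two regions cannot overlap because `out` is a fresh local.
    unsafe {
        core::ptr::copy_nonoverlapping(bytes.as_ptr(), out.as_mut_ptr().cast::<u8>(), len);
    }
    // SAFETY: All `len` bytes of `out` were initialized above, and `T: FromBytes`
    // promises that any bit pattern is a valid `T`.
    Some(unsafe { out.assume_init() })
}
```

A type such as `bool` must not implement this trait: the copy would succeed, but the final `assume_init` could produce an invalid value.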
+388
View File
@@ -0,0 +1,388 @@
// SPDX-License-Identifier: GPL-2.0
//! Slices to user space memory regions.
//!
//! C header: [`include/linux/uaccess.h`](srctree/include/linux/uaccess.h)
use crate::{
alloc::Flags,
bindings,
error::Result,
prelude::*,
types::{AsBytes, FromBytes},
};
use alloc::vec::Vec;
use core::ffi::{c_ulong, c_void};
use core::mem::{size_of, MaybeUninit};
/// The type used for userspace addresses.
pub type UserPtr = usize;
/// A pointer to an area in userspace memory, which can be either read-only or read-write.
///
/// All methods on this struct are safe: attempting to read or write on bad addresses (either out of
/// the bound of the slice or unmapped addresses) will return [`EFAULT`]. Concurrent access,
/// *including data races to/from userspace memory*, is permitted, because fundamentally another
/// userspace thread/process could always be modifying memory at the same time (in the same way that
/// userspace Rust's [`std::io`] permits data races with the contents of files on disk). In the
/// presence of a race, the exact byte values read/written are unspecified but the operation is
/// well-defined. Kernelspace code should validate its copy of data after completing a read, and not
/// expect that multiple reads of the same address will return the same value.
///
/// These APIs are designed to make it difficult to accidentally write TOCTOU (time-of-check to
/// time-of-use) bugs. Every time a memory location is read, the reader's position is advanced by
/// the read length and the next read will start from there. This helps prevent accidentally reading
/// the same location twice and causing a TOCTOU bug.
///
/// Creating a [`UserSliceReader`] and/or [`UserSliceWriter`] consumes the `UserSlice`, helping
/// ensure that there aren't multiple readers or writers to the same location.
///
/// If double-fetching a memory location is necessary for some reason, then that is done by creating
/// multiple readers to the same memory location, e.g. using [`clone_reader`].
///
/// # Examples
///
/// Takes a region of userspace memory from the current process and modifies it by adding one to
/// every byte in the region.
///
/// ```no_run
/// use alloc::vec::Vec;
/// use core::ffi::c_void;
/// use kernel::error::Result;
/// use kernel::uaccess::{UserPtr, UserSlice};
///
/// fn bytes_add_one(uptr: UserPtr, len: usize) -> Result<()> {
/// let (read, mut write) = UserSlice::new(uptr, len).reader_writer();
///
/// let mut buf = Vec::new();
/// read.read_all(&mut buf, GFP_KERNEL)?;
///
/// for b in &mut buf {
/// *b = b.wrapping_add(1);
/// }
///
/// write.write_slice(&buf)?;
/// Ok(())
/// }
/// ```
///
/// Example illustrating a TOCTOU (time-of-check to time-of-use) bug.
///
/// ```no_run
/// use alloc::vec::Vec;
/// use core::ffi::c_void;
/// use kernel::error::{code::EINVAL, Result};
/// use kernel::uaccess::{UserPtr, UserSlice};
///
/// /// Returns whether the data in this region is valid.
/// fn is_valid(uptr: UserPtr, len: usize) -> Result<bool> {
/// let read = UserSlice::new(uptr, len).reader();
///
/// let mut buf = Vec::new();
/// read.read_all(&mut buf, GFP_KERNEL)?;
///
/// todo!()
/// }
///
/// /// Returns the bytes behind this user pointer if they are valid.
/// fn get_bytes_if_valid(uptr: UserPtr, len: usize) -> Result<Vec<u8>> {
/// if !is_valid(uptr, len)? {
/// return Err(EINVAL);
/// }
///
/// let read = UserSlice::new(uptr, len).reader();
///
/// let mut buf = Vec::new();
/// read.read_all(&mut buf, GFP_KERNEL)?;
///
/// // THIS IS A BUG! The bytes could have changed since we checked them.
/// //
/// // To avoid this kind of bug, don't call `UserSlice::new` multiple
/// // times with the same address.
/// Ok(buf)
/// }
/// ```
///
/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
/// [`clone_reader`]: UserSliceReader::clone_reader
pub struct UserSlice {
ptr: UserPtr,
length: usize,
}
impl UserSlice {
/// Constructs a user slice from a raw pointer and a length in bytes.
///
/// Constructing a [`UserSlice`] performs no checks on the provided address and length; it can
/// safely be constructed inside a kernel thread with no current userspace process. Reads and
/// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
/// of the current process and enforce that the address range is within the user range (no
/// additional calls to `access_ok` are needed). Validity of the pointer is checked when you
/// attempt to read or write, not in the call to `UserSlice::new`.
///
/// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
/// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
/// at most once.
pub fn new(ptr: UserPtr, length: usize) -> Self {
UserSlice { ptr, length }
}
/// Reads the entirety of the user slice, appending it to the end of the provided buffer.
///
/// Fails with [`EFAULT`] if the read happens on a bad address.
pub fn read_all(self, buf: &mut Vec<u8>, flags: Flags) -> Result {
self.reader().read_all(buf, flags)
}
/// Constructs a [`UserSliceReader`].
pub fn reader(self) -> UserSliceReader {
UserSliceReader {
ptr: self.ptr,
length: self.length,
}
}
/// Constructs a [`UserSliceWriter`].
pub fn writer(self) -> UserSliceWriter {
UserSliceWriter {
ptr: self.ptr,
length: self.length,
}
}
/// Constructs both a [`UserSliceReader`] and a [`UserSliceWriter`].
///
/// Usually when this is used, you will first read the data, and then overwrite it afterwards.
pub fn reader_writer(self) -> (UserSliceReader, UserSliceWriter) {
(
UserSliceReader {
ptr: self.ptr,
length: self.length,
},
UserSliceWriter {
ptr: self.ptr,
length: self.length,
},
)
}
}
/// A reader for [`UserSlice`].
///
/// Used to incrementally read from the user slice.
pub struct UserSliceReader {
ptr: UserPtr,
length: usize,
}
impl UserSliceReader {
/// Skip the provided number of bytes.
///
/// Returns an error if skipping more than the length of the buffer.
pub fn skip(&mut self, num_skip: usize) -> Result {
// Update `self.length` first since that's the fallible part of this operation.
self.length = self.length.checked_sub(num_skip).ok_or(EFAULT)?;
self.ptr = self.ptr.wrapping_add(num_skip);
Ok(())
}
/// Create a reader that can access the same range of data.
///
/// Reading from the clone does not advance the current reader.
///
/// The caller should take care to not introduce TOCTOU issues, as described in the
/// documentation for [`UserSlice`].
pub fn clone_reader(&self) -> UserSliceReader {
UserSliceReader {
ptr: self.ptr,
length: self.length,
}
}
/// Returns the number of bytes left to be read from this reader.
///
/// Note that even reading less than this number of bytes may fail.
pub fn len(&self) -> usize {
self.length
}
/// Returns `true` if no data is available in the io buffer.
pub fn is_empty(&self) -> bool {
self.length == 0
}
/// Reads raw data from the user slice into a kernel buffer.
///
/// For a version that uses `&mut [u8]`, please see [`UserSliceReader::read_slice`].
///
/// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
/// bounds of this [`UserSliceReader`]. This call may modify `out` even if it returns an error.
///
/// # Guarantees
///
/// After a successful call to this method, all bytes in `out` are initialized.
pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
let len = out.len();
let out_ptr = out.as_mut_ptr().cast::<c_void>();
if len > self.length {
return Err(EFAULT);
}
let Ok(len_ulong) = c_ulong::try_from(len) else {
return Err(EFAULT);
};
// SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
// that many bytes to it.
let res =
unsafe { bindings::copy_from_user(out_ptr, self.ptr as *const c_void, len_ulong) };
if res != 0 {
return Err(EFAULT);
}
self.ptr = self.ptr.wrapping_add(len);
self.length -= len;
Ok(())
}
/// Reads raw data from the user slice into a kernel buffer.
///
/// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
/// bounds of this [`UserSliceReader`]. This call may modify `out` even if it returns an error.
pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
// SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
// `out`.
let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
self.read_raw(out)
}
/// Reads a value of the specified type.
///
/// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
/// bounds of this [`UserSliceReader`].
pub fn read<T: FromBytes>(&mut self) -> Result<T> {
let len = size_of::<T>();
if len > self.length {
return Err(EFAULT);
}
let Ok(len_ulong) = c_ulong::try_from(len) else {
return Err(EFAULT);
};
let mut out: MaybeUninit<T> = MaybeUninit::uninit();
// SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
//
// By using the _copy_from_user variant, we skip the check_object_size check that verifies
// the kernel pointer. This mirrors the logic on the C side that skips the check when the
// length is a compile-time constant.
let res = unsafe {
bindings::_copy_from_user(
out.as_mut_ptr().cast::<c_void>(),
self.ptr as *const c_void,
len_ulong,
)
};
if res != 0 {
return Err(EFAULT);
}
self.ptr = self.ptr.wrapping_add(len);
self.length -= len;
// SAFETY: The read above has initialized all bytes in `out`, and since `T` implements
// `FromBytes`, any bit-pattern is a valid value for this type.
Ok(unsafe { out.assume_init() })
}
/// Reads the entirety of the user slice, appending it to the end of the provided buffer.
///
/// Fails with [`EFAULT`] if the read happens on a bad address.
pub fn read_all(mut self, buf: &mut Vec<u8>, flags: Flags) -> Result {
let len = self.length;
VecExt::<u8>::reserve(buf, len, flags)?;
// The call to `reserve` was successful, so the spare capacity is at least `len` bytes
// long.
self.read_raw(&mut buf.spare_capacity_mut()[..len])?;
// SAFETY: Since the call to `read_raw` was successful, the next `len` bytes of the
// vector have been initialized.
unsafe { buf.set_len(buf.len() + len) };
Ok(())
}
}
/// A writer for [`UserSlice`].
///
/// Used to incrementally write into the user slice.
pub struct UserSliceWriter {
ptr: UserPtr,
length: usize,
}
impl UserSliceWriter {
/// Returns the amount of space remaining in this buffer.
///
/// Note that even writing less than this number of bytes may fail.
pub fn len(&self) -> usize {
self.length
}
/// Returns `true` if no more data can be written to this buffer.
pub fn is_empty(&self) -> bool {
self.length == 0
}
/// Writes raw data to this user pointer from a kernel buffer.
///
/// Fails with [`EFAULT`] if the write happens on a bad address, or if the write goes out of
/// bounds of this [`UserSliceWriter`]. This call may modify the associated userspace slice even
/// if it returns an error.
pub fn write_slice(&mut self, data: &[u8]) -> Result {
let len = data.len();
let data_ptr = data.as_ptr().cast::<c_void>();
if len > self.length {
return Err(EFAULT);
}
let Ok(len_ulong) = c_ulong::try_from(len) else {
return Err(EFAULT);
};
// SAFETY: `data_ptr` points into an immutable slice of length `len_ulong`, so we may read
// that many bytes from it.
let res = unsafe { bindings::copy_to_user(self.ptr as *mut c_void, data_ptr, len_ulong) };
if res != 0 {
return Err(EFAULT);
}
self.ptr = self.ptr.wrapping_add(len);
self.length -= len;
Ok(())
}
/// Writes the provided Rust value to this userspace pointer.
///
/// Fails with [`EFAULT`] if the write happens on a bad address, or if the write goes out of
/// bounds of this [`UserSliceWriter`]. This call may modify the associated userspace slice even
/// if it returns an error.
pub fn write<T: AsBytes>(&mut self, value: &T) -> Result {
let len = size_of::<T>();
if len > self.length {
return Err(EFAULT);
}
let Ok(len_ulong) = c_ulong::try_from(len) else {
return Err(EFAULT);
};
// SAFETY: The reference points to a value of type `T`, so it is valid for reading
// `size_of::<T>()` bytes.
//
// By using the _copy_to_user variant, we skip the check_object_size check that verifies the
// kernel pointer. This mirrors the logic on the C side that skips the check when the length
// is a compile-time constant.
let res = unsafe {
bindings::_copy_to_user(
self.ptr as *mut c_void,
(value as *const T).cast::<c_void>(),
len_ulong,
)
};
if res != 0 {
return Err(EFAULT);
}
self.ptr = self.ptr.wrapping_add(len);
self.length -= len;
Ok(())
}
}
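The cursor behaviour described for `UserSliceReader` (every read or skip advances the position, so the same bytes are not fetched twice by accident) can be modelled without a kernel. The sketch below is a userspace stand-in under stated assumptions: `SliceReader` and `Fault` are hypothetical names, "user memory" is an ordinary borrowed slice, and `Fault` plays the role of `EFAULT`.

```rust
/// Hypothetical stand-in for the kernel's `EFAULT` error code.
#[derive(Debug, PartialEq, Eq)]
pub struct Fault;

/// Userspace model of a forward-only reader over a byte region.
pub struct SliceReader<'a> {
    /// The bytes that have not been consumed yet.
    rest: &'a [u8],
}

impl<'a> SliceReader<'a> {
    /// Creates a reader positioned at the start of `mem`.
    pub fn new(mem: &'a [u8]) -> Self {
        Self { rest: mem }
    }

    /// Returns the number of bytes left to be read.
    pub fn len(&self) -> usize {
        self.rest.len()
    }

    /// Returns `true` if no bytes are left.
    pub fn is_empty(&self) -> bool {
        self.rest.is_empty()
    }

    /// Skips `n` bytes, failing without side effects if fewer than `n` remain.
    pub fn skip(&mut self, n: usize) -> Result<(), Fault> {
        let rest: &'a [u8] = self.rest;
        self.rest = rest.get(n..).ok_or(Fault)?;
        Ok(())
    }

    /// Fills `out` from the current position and advances past the bytes read.
    pub fn read_slice(&mut self, out: &mut [u8]) -> Result<(), Fault> {
        let rest: &'a [u8] = self.rest;
        if out.len() > rest.len() {
            return Err(Fault);
        }
        let (head, tail) = rest.split_at(out.len());
        out.copy_from_slice(head);
        self.rest = tail;
        Ok(())
    }
}
```

Because there is no way to rewind, re-reading a location requires constructing a second reader on purpose, which is the property the `UserSlice` documentation relies on to make double fetches visible in code review.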
+9 -7
View File
@@ -482,24 +482,26 @@ pub unsafe trait HasWork<T, const ID: u64 = 0> {
/// use kernel::sync::Arc;
/// use kernel::workqueue::{self, impl_has_work, Work};
///
/// struct MyStruct {
/// work_field: Work<MyStruct, 17>,
/// struct MyStruct<'a, T, const N: usize> {
/// work_field: Work<MyStruct<'a, T, N>, 17>,
/// f: fn(&'a [T; N]),
/// }
///
/// impl_has_work! {
/// impl HasWork<MyStruct, 17> for MyStruct { self.work_field }
/// impl{'a, T, const N: usize} HasWork<MyStruct<'a, T, N>, 17>
/// for MyStruct<'a, T, N> { self.work_field }
/// }
/// ```
#[macro_export]
macro_rules! impl_has_work {
($(impl$(<$($implarg:ident),*>)?
($(impl$({$($generics:tt)*})?
HasWork<$work_type:ty $(, $id:tt)?>
for $self:ident $(<$($selfarg:ident),*>)?
for $self:ty
{ self.$field:ident }
)*) => {$(
// SAFETY: The implementation of `raw_get_work` only compiles if the field has the right
// type.
unsafe impl$(<$($implarg),*>)? $crate::workqueue::HasWork<$work_type $(, $id)?> for $self $(<$($selfarg),*>)? {
unsafe impl$(<$($generics)+>)? $crate::workqueue::HasWork<$work_type $(, $id)?> for $self {
const OFFSET: usize = ::core::mem::offset_of!(Self, $field) as usize;
#[inline]
@@ -515,7 +517,7 @@ macro_rules! impl_has_work {
pub use impl_has_work;
impl_has_work! {
impl<T> HasWork<Self> for ClosureWork<T> { self.work }
impl{T} HasWork<Self> for ClosureWork<T> { self.work }
}
unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Arc<T>
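The switch from `impl<...>` to `impl{...}` in `impl_has_work!` lets the macro accept arbitrary generic parameter lists (lifetimes, types and const generics) as raw token trees and take the target as a full `$self:ty`. The sketch below shows the same matching technique on a toy trait; `HasName`, `impl_has_name!`, `Plain` and `Wrapper` are hypothetical names chosen for illustration, not kernel items.

```rust
/// Toy trait standing in for `HasWork` in this sketch.
pub trait HasName {
    fn name() -> &'static str;
}

/// Accepts `impl HasName for Ty { "..." }` and `impl{generics} HasName for Ty<...> { "..." }`.
macro_rules! impl_has_name {
    ($(impl$({$($generics:tt)*})?
       HasName for $self:ty
       { $name:literal }
    )*) => {$(
        // The braces around the generics are replaced by angle brackets on expansion.
        impl$(<$($generics)*>)? HasName for $self {
            fn name() -> &'static str {
                $name
            }
        }
    )*};
}

pub struct Plain;

pub struct Wrapper<'a, T, const N: usize> {
    pub items: &'a [T; N],
}

impl_has_name! {
    impl HasName for Plain { "plain" }
    impl{'a, T, const N: usize} HasName for Wrapper<'a, T, N> { "wrapper" }
}
```

Braces are used in the invocation because a `macro_rules!` matcher cannot treat `<` and `>` as a delimited group, whereas `{ ... }` is a single token tree whose contents can be captured with `$($generics:tt)*`.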
+39 -6
View File
@@ -35,6 +35,7 @@ use proc_macro::TokenStream;
/// author: "Rust for Linux Contributors",
/// description: "My very own kernel module!",
/// license: "GPL",
/// alias: ["alternate_module_name"],
/// }
///
/// struct MyModule;
@@ -55,13 +56,45 @@ use proc_macro::TokenStream;
/// }
/// ```
///
/// ## Firmware
///
/// The following example shows how to declare a kernel module that needs
/// to load binary firmware files. You need to specify the file names of
/// the firmware in the `firmware` field. The information is embedded
/// in the `modinfo` section of the kernel module. For example, a tool to
/// build an initramfs uses this information to put the firmware files into
/// the initramfs image.
///
/// ```ignore
/// use kernel::prelude::*;
///
/// module!{
/// type: MyDeviceDriverModule,
/// name: "my_device_driver_module",
/// author: "Rust for Linux Contributors",
/// description: "My device driver requires firmware",
/// license: "GPL",
/// firmware: ["my_device_firmware1.bin", "my_device_firmware2.bin"],
/// }
///
/// struct MyDeviceDriverModule;
///
/// impl kernel::Module for MyDeviceDriverModule {
/// fn init() -> Result<Self> {
/// Ok(Self)
/// }
/// }
/// ```
///
/// # Supported argument types
/// - `type`: type which implements the [`Module`] trait (required).
/// - `name`: byte array of the name of the kernel module (required).
/// - `author`: byte array of the author of the kernel module.
/// - `description`: byte array of the description of the kernel module.
/// - `license`: byte array of the license of the kernel module (required).
/// - `alias`: byte array of alias name of the kernel module.
/// - `name`: ASCII string literal of the name of the kernel module (required).
/// - `author`: string literal of the author of the kernel module.
/// - `description`: string literal of the description of the kernel module.
/// - `license`: ASCII string literal of the license of the kernel module (required).
/// - `alias`: array of ASCII string literals of the alias names of the kernel module.
/// - `firmware`: array of ASCII string literals of the firmware files of
/// the kernel module.
#[proc_macro]
pub fn module(ts: TokenStream) -> TokenStream {
module::module(ts)
@@ -312,7 +345,7 @@ pub fn pinned_drop(args: TokenStream, input: TokenStream) -> TokenStream {
///
/// Currently supported modifiers are:
/// * `span`: change the span of concatenated identifier to the span of the specified token. By
/// default the span of the `[< >]` group is used.
/// default the span of the `[< >]` group is used.
/// * `lower`: change the identifier to lower case.
/// * `upper`: change the identifier to upper case.
///
+16 -2
View File
@@ -97,14 +97,22 @@ struct ModuleInfo {
author: Option<String>,
description: Option<String>,
alias: Option<Vec<String>>,
firmware: Option<Vec<String>>,
}
impl ModuleInfo {
fn parse(it: &mut token_stream::IntoIter) -> Self {
let mut info = ModuleInfo::default();
const EXPECTED_KEYS: &[&str] =
&["type", "name", "author", "description", "license", "alias"];
const EXPECTED_KEYS: &[&str] = &[
"type",
"name",
"author",
"description",
"license",
"alias",
"firmware",
];
const REQUIRED_KEYS: &[&str] = &["type", "name", "license"];
let mut seen_keys = Vec::new();
@@ -131,6 +139,7 @@ impl ModuleInfo {
"description" => info.description = Some(expect_string(it)),
"license" => info.license = expect_string_ascii(it),
"alias" => info.alias = Some(expect_string_array(it)),
"firmware" => info.firmware = Some(expect_string_array(it)),
_ => panic!(
"Unknown key \"{}\". Valid keys are: {:?}.",
key, EXPECTED_KEYS
@@ -186,6 +195,11 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
modinfo.emit("alias", &alias);
}
}
if let Some(firmware) = info.firmware {
for fw in firmware {
modinfo.emit("firmware", &fw);
}
}
// Built-in modules also export the `file` modinfo string.
let file =
+1
View File
@@ -14,6 +14,7 @@
#![cfg_attr(test, allow(unsafe_op_in_unsafe_fn))]
#![allow(
clippy::all,
dead_code,
missing_docs,
non_camel_case_types,
non_upper_case_globals,
+1 -1
View File
@@ -5,4 +5,4 @@
# -mstack-protector-guard-reg, added by
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708
echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
+1 -1
View File
@@ -1,4 +1,4 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
+1 -1
View File
@@ -74,7 +74,7 @@ ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}/lib/modules/%{KERNELRELEA
echo "/lib/modules/%{KERNELRELEASE}"
for x in alias alias.bin builtin.alias.bin builtin.bin dep dep.bin \
devname softdep symbols symbols.bin; do
devname softdep symbols symbols.bin weakdep; do
echo "%ghost /lib/modules/%{KERNELRELEASE}/modules.${x}"
done
+1 -1
View File
@@ -50,6 +50,6 @@ fi
cat << EOF
%changelog
* $(LC_ALL=C; date +'%a %b %d %Y') ${name} <${email}>
* $(LC_ALL=C date +'%a %b %d %Y') ${name} <${email}>
- Custom built Linux kernel.
EOF
+17 -16
View File
@@ -117,20 +117,16 @@ if [ "$rust_compiler_cversion" -lt "$rust_compiler_min_cversion" ]; then
echo >&2 "***"
exit 1
fi
if [ "$rust_compiler_cversion" -gt "$rust_compiler_min_cversion" ]; then
echo >&2 "***"
echo >&2 "*** Rust compiler '$RUSTC' is too new. This may or may not work."
echo >&2 "*** Your version: $rust_compiler_version"
echo >&2 "*** Expected version: $rust_compiler_min_version"
echo >&2 "***"
warning=1
fi
# Check that the Rust bindings generator is suitable.
#
# Non-stable and distributions' versions may have a version suffix, e.g. `-dev`.
#
# The dummy parameter `workaround-for-0.69.0` is required to support 0.69.0
# (https://github.com/rust-lang/rust-bindgen/pull/2678). It can be removed when
# the minimum version is upgraded past that (0.69.1 already fixed the issue).
rust_bindings_generator_output=$( \
LC_ALL=C "$BINDGEN" --version 2>/dev/null
LC_ALL=C "$BINDGEN" --version workaround-for-0.69.0 2>/dev/null
) || rust_bindings_generator_code=$?
if [ -n "$rust_bindings_generator_code" ]; then
echo >&2 "***"
@@ -165,13 +161,18 @@ if [ "$rust_bindings_generator_cversion" -lt "$rust_bindings_generator_min_cvers
echo >&2 "***"
exit 1
fi
if [ "$rust_bindings_generator_cversion" -gt "$rust_bindings_generator_min_cversion" ]; then
echo >&2 "***"
echo >&2 "*** Rust bindings generator '$BINDGEN' is too new. This may or may not work."
echo >&2 "*** Your version: $rust_bindings_generator_version"
echo >&2 "*** Expected version: $rust_bindings_generator_min_version"
echo >&2 "***"
warning=1
if [ "$rust_bindings_generator_cversion" -eq 6600 ] ||
[ "$rust_bindings_generator_cversion" -eq 6601 ]; then
# Distributions may have patched the issue (e.g. Debian did).
if ! "$BINDGEN" $(dirname $0)/rust_is_available_bindgen_0_66.h >/dev/null; then
echo >&2 "***"
echo >&2 "*** Rust bindings generator '$BINDGEN' versions 0.66.0 and 0.66.1 may not"
echo >&2 "*** work due to a bug (https://github.com/rust-lang/rust-bindgen/pull/2567),"
echo >&2 "*** unless patched (like Debian's)."
echo >&2 "*** Your version: $rust_bindings_generator_version"
echo >&2 "***"
warning=1
fi
fi
# Check that the `libclang` used by the Rust bindings generator is suitable.
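The comparison against `6600` and `6601` above works on a canonical integer form of the version string. The following Rust sketch shows one way such a value can be derived; it assumes the encoding `100000 * major + 100 * minor + patch`, which is consistent with 0.66.0 mapping to 6600 and 0.66.1 to 6601, and both function names are hypothetical.

```rust
/// Folds "MAJOR.MINOR.PATCH" into one integer (assumed encoding, see lead-in).
///
/// Returns `None` for malformed input or on arithmetic overflow.
pub fn canonical_version(version: &str) -> Option<u32> {
    let mut parts = version.split('.').map(|p| p.parse::<u32>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        // More than three components is not a plain MAJOR.MINOR.PATCH string.
        return None;
    }
    major
        .checked_mul(100_000)?
        .checked_add(minor.checked_mul(100)?)?
        .checked_add(patch)
}

/// Returns `true` for the two bindgen versions singled out by the check above.
pub fn is_affected_bindgen(cversion: u32) -> bool {
    matches!(cversion, 6600 | 6601)
}
```

Version strings with a suffix such as `-dev` would need to be stripped before parsing; this sketch simply rejects them.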

Some files were not shown because too many files have changed in this diff