This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.
net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...
When the above sysctls are set, will report to userspace that a route is
dead and will no longer resolve to this nexthop when performing a fib
lookup. This will signal to userspace that the route will not be
selected. The signalling of a RTNH_F_DEAD is only passed to userspace
if the sysctl is enabled and link is down. This was done as without it
the netlink listeners would have no idea whether or not a nexthop would
be selected. The kernel only sets RTNH_F_DEAD internally if the
interface has IFF_UP cleared.
With the new sysctl set, the following behavior can be observed
(interface p8p1 is link-down):
default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1
cache
local 80.0.0.1 dev lo src 80.0.0.1
cache <local>
80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15
cache
While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down. Now interface p8p1 is linked-up:
default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1
cache
local 80.0.0.1 dev lo src 80.0.0.1
cache <local>
80.0.0.2 dev p8p1 src 80.0.0.1
cache
and the output changes to what one would expect.
If the sysctl is not set, the following output would be expected when
p8p1 is down:
default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
Since the dead flag does not appear, there should be no expectation that
the kernel would skip using this route due to link being down.
v2: Split kernel changes into 2 patches, this actually makes a
behavioral change if the sysctl is set. Also took suggestion from Alex
to simplify code by only checking sysctl during fib lookup and
suggestion from Scott to add a per-interface sysctl.
v3: Code clean-ups to make it more readable and efficient as well as a
reverse path check fix.
v4: Drop binary sysctl
v5: Whitespace fixups from Dave
v6: Style changes from Dave and checkpatch suggestions
v7: One more checkpatch fixup
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a fib flag called RTNH_F_LINKDOWN to any ipv4 nexthops that are
reachable via an interface where carrier is off. No action is taken,
but additional flags are passed to userspace to indicate carrier status.
This also includes a cleanup to fib_disable_ip to more clearly indicate
what event made the function call to replace the more cryptic force
option previously used.
v2: Split out kernel functionality into 2 patches, this patch simply
sets and clears new nexthop flag RTNH_F_LINKDOWN.
v3: Cleanups suggested by Alex as well as a bug noticed in
fib_sync_down_dev and fib_sync_up when multipath was not enabled.
v5: Whitespace and variable declaration fixups suggested by Dave.
v6: Style fixups noticed by Dave; ran checkpatch to be sure I got them
all.
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull HSI updates from Sebastian Reichel:
"Misc fixes"
* tag 'hsi-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
HSI: nokia-modem: use flags argument of devm_gpiod_get to set direction
HSI: nokia-modem: Reduce missing driver message to debug level
HSI: cmt_speech: fix timestamp interface
Pull rdma updates from Doug Ledford:
- a large cleanup of how device capabilities are checked for various
features
- additional cleanups in the MAD processing
- update to the srp driver
- creation and use of centralized log message helpers
- add const to a number of args to calls and clean up call chain
- add support for extended cq create verb
- add support for timestamps on cq completion
- add support for processing OPA MAD packets
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (92 commits)
IB/mad: Add final OPA MAD processing
IB/mad: Add partial Intel OPA MAD support
IB/mad: Add partial Intel OPA MAD support
IB/core: Add OPA MAD core capability flag
IB/mad: Add support for additional MAD info to/from drivers
IB/mad: Convert allocations from kmem_cache to kzalloc
IB/core: Add ability for drivers to report an alternate MAD size.
IB/mad: Support alternate Base Versions when creating MADs
IB/mad: Create a generic helper for DR forwarding checks
IB/mad: Create a generic helper for DR SMP Recv processing
IB/mad: Create a generic helper for DR SMP Send processing
IB/mad: Split IB SMI handling from MAD Recv handler
IB/mad cleanup: Generalize processing of MAD data
IB/mad cleanup: Clean up function params -- find_mad_agent
IB/mlx4: Add support for CQ time-stamping
IB/mlx4: Add mmap call to map the hardware clock
IB/core: Pass hardware specific data in query_device
IB/core: Add timestamp_mask and hca_core_clock to query_device
IB/core: Extend ib_uverbs_create_cq
IB/core: Add CQ creation time-stamping flag
...
While IEEE and CEE use the same structure to store apps, the selector
and priority fields for both are different. Only the priority field is
explained, add documentation explaining how the selector field differs
for both.
cgdcbxd code shows an example of how selector fields differ.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull crypto update from Herbert Xu:
"Here is the crypto update for 4.2:
API:
- Convert RNG interface to new style.
- New AEAD interface with one SG list for AD and plain/cipher text.
All external AEAD users have been converted.
- New asymmetric key interface (akcipher).
Algorithms:
- Chacha20, Poly1305 and RFC7539 support.
- New RSA implementation.
- Jitter RNG.
- DRBG is now seeded with both /dev/random and Jitter RNG. If kernel
pool isn't ready then DRBG will be reseeded when it is.
- DRBG is now the default crypto API RNG, replacing krng.
- 842 compression (previously part of powerpc nx driver).
Drivers:
- Accelerated SHA-512 for arm64.
- New Marvell CESA driver that supports DMA and more algorithms.
- Updated powerpc nx 842 support.
- Added support for SEC1 hardware to talitos"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (292 commits)
crypto: marvell/cesa - remove COMPILE_TEST dependency
crypto: algif_aead - Temporarily disable all AEAD algorithms
crypto: af_alg - Forbid the use internal algorithms
crypto: echainiv - Only hold RNG during initialisation
crypto: seqiv - Add compatibility support without RNG
crypto: eseqiv - Offer normal cipher functionality without RNG
crypto: chainiv - Offer normal cipher functionality without RNG
crypto: user - Add CRYPTO_MSG_DELRNG
crypto: user - Move cryptouser.h to uapi
crypto: rng - Do not free default RNG when it becomes unused
crypto: skcipher - Allow givencrypt to be NULL
crypto: sahara - propagate the error on clk_disable_unprepare() failure
crypto: rsa - fix invalid select for AKCIPHER
crypto: picoxcell - Update to the current clk API
crypto: nx - Check for bogus firmware properties
crypto: marvell/cesa - add DT bindings documentation
crypto: marvell/cesa - add support for Kirkwood and Dove SoCs
crypto: marvell/cesa - add support for Orion SoCs
crypto: marvell/cesa - add allhwsupport module parameter
crypto: marvell/cesa - add support for all armada SoCs
...
The user interface for timestamps in the new cmt_speech
driver is broken in multiple ways:
- The layout is incompatible between 32-bit and 64-bit user
space, because of the size differences in 'struct timespec'.
This means that the driver can not work when used with 32-bit
user space on a 64-bit kernel.
- As there are plans to change 32-bit user space to use
a 64-bit time_t type in the future, it will also be
incompatible with new 32-bit user space.
- It is using ktime_get_ts under it's deprecated alias
(do_posix_clock_monotonic_gettime).
To keep support for the user space tools written for this driver (which
have lived many years out-of-tree), the interface has been hardened to
unsigned 32-bit values.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Pull perf updates from Ingo Molnar:
"Kernel side changes mostly consist of work on x86 PMU drivers:
- x86 Intel PT (hardware CPU tracer) improvements (Alexander
Shishkin)
- x86 Intel CQM (cache quality monitoring) improvements (Thomas
Gleixner)
- x86 Intel PEBSv3 support (Peter Zijlstra)
- x86 Intel PEBS interrupt batching support for lower overhead
sampling (Zheng Yan, Kan Liang)
- x86 PMU scheduler fixes and improvements (Peter Zijlstra)
There's too many tooling improvements to list them all - here are a
few select highlights:
'perf bench':
- Introduce new 'perf bench futex' benchmark: 'wake-parallel', to
measure parallel waker threads generating contention for kernel
locks (hb->lock). (Davidlohr Bueso)
'perf top', 'perf report':
- Allow disabling/enabling events dynamicaly in 'perf top':
a 'perf top' session can instantly become a 'perf report'
one, i.e. going from dynamic analysis to a static one,
returning to a dynamic one is possible, to toogle the
modes, just press 'f' to 'freeze/unfreeze' the sampling. (Arnaldo Carvalho de Melo)
- Make Ctrl-C stop processing on TUI, allowing interrupting the load of big
perf.data files (Namhyung Kim)
'perf probe': (Masami Hiramatsu)
- Support glob wildcards for function name
- Support $params special probe argument: Collect all function arguments
- Make --line checks validate C-style function name.
- Add --no-inlines option to avoid searching inline functions
- Greatly speed up 'perf probe --list' by caching debuginfo.
- Improve --filter support for 'perf probe', allowing using its arguments
on other commands, as --add, --del, etc.
'perf sched':
- Add option in 'perf sched' to merge like comms to lat output (Josef Bacik)
Plus tons of infrastructure work - in particular preparation for
upcoming threaded perf report support, but also lots of other work -
and fixes and other improvements. See (much) more details in the
shortlog and in the git log"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (305 commits)
perf tools: Configurable per thread proc map processing time out
perf tools: Add time out to force stop proc map processing
perf report: Fix sort__sym_cmp to also compare end of symbol
perf hists browser: React to unassigned hotkey pressing
perf top: Tell the user how to unfreeze events after pressing 'f'
perf hists browser: Honour the help line provided by builtin-{top,report}.c
perf hists browser: Do not exit when 'f' is pressed in 'report' mode
perf top: Replace CTRL+z with 'f' as hotkey for enable/disable events
perf annotate: Rename source_line_percent to source_line_samples
perf annotate: Display total number of samples with --show-total-period
perf tools: Ensure thread-stack is flushed
perf top: Allow disabling/enabling events dynamicly
perf evlist: Add toggle_enable() method
perf trace: Fix race condition at the end of started workloads
perf probe: Speed up perf probe --list by caching debuginfo
perf probe: Show usage even if the last event is skipped
perf tools: Move libtraceevent dynamic list to separated LDFLAGS variable
perf tools: Fix a problem when opening old perf.data with different byte order
perf tools: Ignore .config-detected in .gitignore
perf probe: Fix to return error if no probe is added
...
ASoC: Updates for v4.2
The big thing this release has been Liam's addition of topology support
to the core. We've also seen quite a bit of driver work and the
continuation of Lars' refactoring for component support.
- Support for loading ASoC topology maps from firmware, intended to be
used to allow self-describing DSP firmware images to be built which
can map controls added by the DSP to userspace without the kernel
needing to know about individual DSP firmwares.
- Lots of refactoring to avoid direct access to snd_soc_codec where
it's not needed supporting future refactoring.
- Big refactoring and cleanup serieses for the Wolfson ADSP and TI
TAS2552 drivers.
- Support for TI TAS571x power amplifiers.
- Support for Qualcomm APQ8016 and ZTE ZX296702 SoCs.
- Support for x86 systems with RT5650 and Qualcomm Storm.
# gpg: Signature made Mon 08 Jun 2015 18:48:37 BST using RSA key ID 5D5487D0
# gpg: Oops: keyid_from_fingerprint: no pubkey
# gpg: Good signature from "Mark Brown <broonie@sirena.org.uk>"
# gpg: aka "Mark Brown <broonie@debian.org>"
# gpg: aka "Mark Brown <broonie@kernel.org>"
# gpg: aka "Mark Brown <broonie@tardis.ed.ac.uk>"
# gpg: aka "Mark Brown <broonie@linaro.org>"
# gpg: aka "Mark Brown <Mark.Brown@linaro.org>"
This patch adds a new crypto_user command that allows the admin to
delete the crypto system RNG. Note that this can only be done if
the RNG is currently not in use. The next time it is used a new
system RNG will be allocated.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds getsockopt(SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS) to
retrieve all groups a socket is a member of. Currently, we have to use
getsockname() and look at the nl.nl_groups bitmask. However, this mask is
limited to 32 groups. Hence, similar to NETLINK_ADD_MEMBERSHIP and
NETLINK_DROP_MEMBERSHIP, this adds a separate sockopt to manager higher
groups IDs than 32.
This new NETLINK_LIST_MEMBERSHIPS option takes a pointer to __u32 and the
size of the array. The array is filled with the full membership-set of the
socket, and the required array size is returned in optlen. Hence,
user-space can retry with a properly sized array in case it was too small.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
System wide sampling like 'perf top' or 'perf record -a' read all
threads /proc/xxx/maps before sampling. If there are any threads which
generating a keeping growing huge maps, perf will do infinite loop
during synthesizing. Nothing will be sampled.
This patch fixes this issue by adding per-thread timeout to force stop
this kind of endless proc map processing.
PERF_RECORD_MISC_PROC_MAP_PARSE_TIME_OUT is introduced to indicate that
the mmap record are truncated by time out. User will get warning
notification when truncated mmap records are detected.
Reported-by: Ying Huang <ying.huang@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This pulls the full hook netfilter definitions from all those that include
net_namespace.h.
Instead let's just include the bare minimum required in the new
linux/netfilter_defs.h file, and use it from the netfilter netns header files.
I also needed to include in.h and in6.h from linux/netfilter.h otherwise we hit
this compilation error:
In file included from include/linux/netfilter_defs.h:4:0,
from include/net/netns/netfilter.h:4,
from include/net/net_namespace.h:22,
from include/linux/netdevice.h:43,
from net/netfilter/nfnetlink_queue_core.c:23:
include/uapi/linux/netfilter.h:76:17: error: field ‘in’ has incomplete type struct in_addr in;
And also explicit include linux/netfilter.h in several spots.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
This tells userspace that it's safe to use the RADEON_VA_UNMAP operation
of the DRM_RADEON_GEM_VA ioctl.
Cc: stable@vger.kernel.org
(NOTE: Backporting this commit requires at least backports of commits
26d4d129b6,
48afbd70ac and
c29c0876ec as well, otherwise using
RADEON_VA_UNMAP runs into trouble)
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
The colorspace argument was compared against a V4L2_XFER_FUNC define instead
of against a V4L2_COLORSPACE define, returning the wrong answer.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
xt_socket is useful for matching sockets with IP_TRANSPARENT and
taking some action on the matching packets. However, it lacks the
ability to match only a small subset of transparent sockets.
Suppose there are 2 applications, each with its own set of transparent
sockets. The first application wants all matching packets dropped,
while the second application wants them forwarded somewhere else.
Add the ability to retore the skb->mark from the sk_mark. The mark
is only restored if a matching socket is found and the transparent /
nowildcard conditions are satisfied.
Now the 2 hypothetical applications can differentiate their sockets
based on a mark value set with SO_MARK.
iptables -t mangle -I PREROUTING -m socket --transparent \
--restore-skmark -j action
iptables -t mangle -A action -m mark --mark 10 -j action2
iptables -t mangle -A action -m mark --mark 11 -j action3
Signed-off-by: Harout Hedeshian <harouth@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds an additional attribute when sending
packet information via netlink in netfilter_queue module.
It will send additional security context data, so that
userspace applications can verify this context against
their own security databases.
Signed-off-by: Roman Kubiak <r.kubiak@samsung.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD,
kvm_vm_ioctl_check_extension_generic()
|
+ switch (arg) {
+ ...
+ #ifdef CONFIG_HAVE_KVM_IRQFD
+ case KVM_CAP_IRQFD:
+ #endif
+ ...
+ return 1;
+ ...
+ }
|
+ kvm_vm_ioctl_check_extension()
So its not necessary to check this in arch again, and also fix one typo,
s/emlation/emulation.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
This get_info handler will simply dispatch to the appropriate
existing inet protocol handler.
This patch also includes a new netlink attribute
(INET_DIAG_PROTOCOL). This attribute is currently only used
for multicast messages. Without this attribute, there is no
way of knowing the IP protocol used by the socket information
being broadcast. This attribute is not necessary in the 'dump'
variant of this protocol (though it could easily be added)
because dump requests are issued for specific family/protocol
pairs.
Tested: ss -E (note, the -E option has not yet been merged into
the upstream version of ss).
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These groups will contain socket-destruction events for
AF_INET/AF_INET6, IPPROTO_TCP/IPPROTO_UDP.
Near the end of socket destruction, a check for listeners is
performed. In the presence of a listener, rather than completely
cleanup the socket, a unit of work will be added to a private
work queue which will first broadcast information about the socket
and then finish the cleanup operation.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ndo_get_vf_stats where the PF retrieves and fills the VFs traffic
statistics. We encode the VF stats in a nested manner to allow for
future extensions.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz says:
====================
NFC 4.2 pull request
This is the NFC pull request for 4.2.
- NCI drivers can now define their own handlers for processing
proprietary NCI responses and notifications.
- NFC vendors can use a dedicated netlink API to send their own
proprietary commands, like e.g. all commands needed to implement
vendor specific manufacturing tools.
- A new generic NCI over UART driver against which any NCI chipset
running on top of a serial interface can register.
- The st21nfcb driver is renamed to st-nci as it can and will support
most of ST Microelectronics NCI chipsets.
- The st21nfcb driver can put its CLF in hibernate mode and save
significant amount of power.
- A few st21nfcb minor fixes.
- The NXP NCI driver now supports ACPI enumeration.
- The Marvell NCI driver now supports both USB and serial
physical interfaces.
- The Marvell NCI drivers also supports NCI frames being muxed
over HCI. This is a setting that can be defined by a DT property.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Export the partner_oper_port_state of each port via sysfs and netlink.
In 802.3ad mode it is valuable for the user to be able to check the
partner_oper state, it is already exported via bond's proc entry.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Export the actor_oper_port_state of each port via sysfs and netlink.
In 802.3ad mode it is valuable for the user to be able to check the
actor_oper state, it is already exported via bond's proc entry.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
eBPF programs attached to kprobes need to filter based on
current->pid, uid and other fields, so introduce helper functions:
u64 bpf_get_current_pid_tgid(void)
Return: current->tgid << 32 | current->pid
u64 bpf_get_current_uid_gid(void)
Return: current_gid << 32 | current_uid
bpf_get_current_comm(char *buf, int size_of_buf)
stores current->comm into buf
They can be used from the programs attached to TC as well to classify packets
based on current task fields.
Update tracex2 example to print histogram of write syscalls for each process
instead of aggregated for all.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device is part of the hook configuration, so instead of a global
configuration per table, set it to each of the basechain that we create.
This patch reworks ebddf1a8d7 ("netfilter: nf_tables: allow to bind table to
net_device").
Note that this adds a dev_name field in the nft_base_chain structure which is
required the netdev notification subscription that follows up in a patch to
handle gone net_devices.
Suggested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In order to expose timestamp we need to expose two new attributes in
query_device to be used for CQ completion time-stamping:
timestamp_mask - how many bits are valid in the timestamp, where timestamp
values could be 64bits the most.
hca_core_clock - timestamp is given in HW cycles, the frequency in KHZ units
of the HCA, necessary in order to convert cycles to seconds.
This is added both to ib_query_device and its respective uverbs counterpart.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
ib_uverbs_ex_create_cq follows the extension verbs
mechanism. New features (for example, CQ creation flags
field which is added in a downstream patch) could used
via user-space libraries without breaking the ABI.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Samsung updates for v4.2
- add failure(exception) handling
: of_iomap(), of_find_device_by_node() and kstrdup()
- add common poweroff to use PS_HOLD based for all of exynos SoCs
- add exnos_get/set_boot_addr() helper
- constify platform_device_id and irq_domain_ops
- get current parent clock for power domain on/off
- use core_initcall to register power domain driver
- make exynos_core_restart() less verbose
- add support coupled CPUidle for exynos3250
- fix exynos_boot_secondary() return value on timeout
- fix clk_enable() in s3c24xx adc
- fix missing of_node_put() for power domains
* tag 'samsung-mach-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: (301 commits)
ARM: EXYNOS: register power domain driver from core_initcall
ARM: EXYNOS: use PS_HOLD based poweroff for all supported SoCs
ARM: SAMSUNG: Constify platform_device_id
ARM: EXYNOS: Constify irq_domain_ops
ARM: EXYNOS: add coupled cpuidle support for Exynos3250
ARM: EXYNOS: add exynos_get_boot_addr() helper
ARM: EXYNOS: add exynos_set_boot_addr() helper
ARM: EXYNOS: make exynos_core_restart() less verbose
ARM: EXYNOS: fix exynos_boot_secondary() return value on timeout
ARM: EXYNOS: Get current parent clock for power domain on/off
ARM: SAMSUNG: fix clk_enable() WARNing in S3C24XX ADC
ARM: EXYNOS: Add missing of_node_put() when parsing power domains
ARM: EXYNOS: Handle of_find_device_by_node() and kstrdup() failures
ARM: EXYNOS: Handle of of_iomap() failure
Linux 4.1-rc4
....
Some NFC controller supports UART as host interface.
As with SPI, a lot of code can be shared between vendor
drivers. This patch add the generic support of UART and
provides some extension API for vendor specific needs.
This code is strongly inspired by the Bluetooth HCI ldisc
implementation. NCI UART vendor drivers will have to register
themselves to this layer via nci_uart_register.
Underlying tty will have to be configured from user land
thanks to an ioctl.
Signed-off-by: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Fixes userspace compilation errors like:
error: unknown type name ‘uint32_t’
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Using fb modifier flag, support NV12MT format in MDP4.
v2:
- rework the modifier's description [Daniel Vetter's comment]
- drop .set_mode_config() callback [Rob Clark's comment]
v3:
- change VENDOR's name and restrict usage to NV12 [pointed by Daniel]
Signed-off-by: Rob Clark <robdclark@gmail.com>
This adds create/remove window ioctls to create and remove DMA windows.
sPAPR defines a Dynamic DMA windows capability which allows
para-virtualized guests to create additional DMA windows on a PCI bus.
The existing linux kernels use this new window to map the entire guest
memory and switch to the direct DMA operations saving time on map/unmap
requests which would normally happen in a big amounts.
This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and
VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows.
Up to 2 windows are supported now by the hardware and by this driver.
This changes VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional
information such as a number of supported windows and maximum number
levels of TCE tables.
DDW is added as a capability, not as a SPAPR TCE IOMMU v2 unique feature
as we still want to support v2 on platforms which cannot do DDW for
the sake of TCE acceleration in KVM (coming soon).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The existing implementation accounts the whole DMA window in
the locked_vm counter. This is going to be worse with multiple
containers and huge DMA windows. Also, real-time accounting would requite
additional tracking of accounted pages due to the page size difference -
IOMMU uses 4K pages and system uses 4K or 64K pages.
Another issue is that actual pages pinning/unpinning happens on every
DMA map/unmap request. This does not affect the performance much now as
we spend way too much time now on switching context between
guest/userspace/host but this will start to matter when we add in-kernel
DMA map/unmap acceleration.
This introduces a new IOMMU type for SPAPR - VFIO_SPAPR_TCE_v2_IOMMU.
New IOMMU deprecates VFIO_IOMMU_ENABLE/VFIO_IOMMU_DISABLE and introduces
2 new ioctls to register/unregister DMA memory -
VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY -
which receive user space address and size of a memory region which
needs to be pinned/unpinned and counted in locked_vm.
New IOMMU splits physical pages pinning and TCE table update
into 2 different operations. It requires:
1) guest pages to be registered first
2) consequent map/unmap requests to work only with pre-registered memory.
For the default single window case this means that the entire guest
(instead of 2GB) needs to be pinned before using VFIO.
When a huge DMA window is added, no additional pinning will be
required, otherwise it would be guest RAM + 2GB.
The new memory registration ioctls are not supported by
VFIO_SPAPR_TCE_IOMMU. Dynamic DMA window and in-kernel acceleration
will require memory to be preregistered in order to work.
The accounting is done per the user process.
This advertises v2 SPAPR TCE IOMMU and restricts what the userspace
can do with v1 or v2 IOMMUs.
In order to support memory pre-registration, we need a way to track
the use of every registered memory region and only allow unregistration
if a region is not in use anymore. So we need a way to tell from what
region the just cleared TCE was from.
This adds a userspace view of the TCE table into iommu_table struct.
It contains userspace address, one per TCE entry. The table is only
allocated when the ownership over an IOMMU group is taken which means
it is only used from outside of the powernv code (such as VFIO).
As v2 IOMMU supports IODA2 and pre-IODA2 IOMMUs (which do not support
DDW API), this creates a default DMA window for IODA2 for consistency.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use a table for the Demux output. No new information added
here. They were all merged inside the table.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
There are several anonymous enums here, used via a typedef.
Well, we don't like typedefs on Kernel, so let's de-anonimize
those enums. Then, latter, we may be able to get rid of the
typedefs, at least from Kernelspace.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
The comment for struct dvb_frontend_parameters is weird, as it
mixes delivery system name (ATSC) with modulation names
(QPSK, QAM, OFDM).
Use delivery system names there on the frequency comment, as this
is clearer, specially after 2GEN delivery systems.
While here, add comments at the union, to make live easier for ones
that may try to understand the convention used by the legacy API.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
The description of struct dtv_stats has a spmall typo:
FE_SCALE_DECIBELS instead of FE_SCALE_DECIBEL
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
In order to better organize the header file, move the legacy
API (DVBv3) support to the end, just before the ioctl definitions.
This way, we can use just one #if for all of them.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>