This patch moves the global blkif_io_lock to the per-device structure. The
spinlock seems to exist for two reasons: to disable IRQs when in the interrupt
handlers for blkfront, and to protect the blkfront VBDs when a detachment is
requested.
Having a global blkif_io_lock doesn't make sense given the use case, and it
drastically hinders performance due to contention. All VBDs with pending IOs
have to take the lock in order to get work done, which serializes everything
pretty badly.
Signed-off-by: Steven Noonan <snoonan@amazon.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We should hang onto bdev until we're done with it.
Signed-off-by: Andrew Jones <drjones@redhat.com>
[v1: Fixed up git commit description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
After the previous patch to cfq, there's no ioc_get_changed() user
left. This patch yanks out ioc_{ioprio|cgroup|get}_changed() and all
related stuff.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
cfq caches the associated cfqq's for a given cic. The cache needs to
be flushed if the cic's ioprio or blkcg has changed. It is currently
done by requiring the changing action to set the respective
ICQ_*_CHANGED bit in the icq and testing it from cfq_set_request(),
which involves iterating through all the affected icqs.
All cfq wants to know is whether ioprio and/or blkcg have changed
since the last flush, and this can be easily achieved by just
remembering the current ioprio and blkcg ID in cic.
This patch adds cic->{ioprio|blkcg_id}, updates all ioprio users to
use the remembered value instead, and updates the cfq_set_request()
path such that, instead of using icq_get_changed(), the current values
are compared against the remembered ones and the appropriate flush
action is triggered if they differ. Condition tests are moved inside
both _changed functions, which are now named check_ioprio_changed()
and check_blkcg_changed().
ioprio.h::task_ioprio*() can't be used anymore and are replaced with
an open-coded IOPRIO_CLASS_NONE case in cfq_async_queue_prio().
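The remember-and-compare scheme described above can be sketched in
userspace C as follows. This is only an illustration: struct cached_ctx,
its fields and the function signatures are hypothetical stand-ins, not
the kernel's cfq_io_cq or the real check_*_changed() prototypes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-context cache (not cfq_io_cq). */
struct cached_ctx {
	int ioprio;		/* ioprio remembered at the last flush */
	uint64_t blkcg_id;	/* blkcg ID remembered at the last flush */
	unsigned int flushes;	/* flush counter, for illustration only */
};

/* Flush and remember the new ioprio only if it differs from the cached one. */
bool check_ioprio_changed(struct cached_ctx *c, int cur_ioprio)
{
	if (c->ioprio == cur_ioprio)
		return false;	/* nothing changed, keep cached queues */

	c->flushes++;		/* stands in for dropping cached queues */
	c->ioprio = cur_ioprio;
	return true;
}

/* Same pattern for the blkcg identity test. */
bool check_blkcg_changed(struct cached_ctx *c, uint64_t cur_id)
{
	if (c->blkcg_id == cur_id)
		return false;

	c->flushes++;
	c->blkcg_id = cur_id;
	return true;
}
```

The point of the design is that the request path pays only for an
integer comparison, and the changing side no longer has to walk every
affected icq to set a flag.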
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Now that io_cq is managed by block core and guaranteed to exist for
any in-flight request, it is easier and carries more information to
pass around cfq_io_cq than io_context.
This patch updates cfq_init_prio_data(), cfq_find_alloc_queue() and
cfq_get_queue() to take @cic instead of @ioc. This change removes a
duplicate cfq_cic_lookup() from cfq_find_alloc_queue().
This change enables the use of cic-cached ioprio in the next patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add 64bit unique id to blkcg. This will be used by policies which
want blkcg identity test to tell whether the associated blkcg has
changed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With recent plug merge updates, all non-percpu stat updates happen
under queue_lock making stats_lock unnecessary to synchronize stat
updates. The only synchronization necessary is stat reading, which
can be done using u64_stats_sync instead.
This patch removes blkio_group->stats_lock and adds
blkio_group_stats->syncp for reader synchronization.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Restructure blkio_get_stat() to prepare for removal of stats_lock.
* Define BLKIO_STAT_ARR_NR explicitly to denote which stats have
subtypes instead of using BLKIO_STAT_QUEUED.
* Separate out stat acquisition and printing. After this, there are
only two users of blkio_fill_stat(). Just open code it.
* The code was mixing MAX_KEY_LEN and MAX_KEY_LEN - 1. There's no
need to subtract one. Use MAX_KEY_LEN consistently.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blkiocg_reset_stats() implements stat reset for blkio.reset_stats
cgroupfs file. This feature is very unconventional and something
which shouldn't have been merged. It's only useful when there's only
one user or tool looking at the stats. As soon as multiple users
and/or tools are involved, it becomes useless as resetting disrupts
other usages. There are very good reasons why all other stats expect
readers to read values at the start and end of a period and subtract
to determine delta over the period.
The implementation is rather complex - some fields shouldn't be
cleared, so it saves those fields, resets the whole structure and then
restores them for some reason. Reset of percpu stats is also racy. The
comment points to 64bit store atomicity as the reason, but even without
that, stores of zero can simply race with other CPUs doing RMW and get
clobbered.
Simplify reset by
* Clearing selectively instead of resetting and restoring.
* Grouping debug stat fields to be reset and using memset() over them.
* Not caring about stats_lock.
* Using memset() to reset percpu stats.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With recent plug merge updates, merged stats are no longer called for
plug merges and now only updated while holding queue_lock. As
stats_lock is scheduled to be removed, there's no reason to use percpu
for merged stats. Don't use percpu for merged stats.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The current per-cpu stat allocation assumes the GFP_KERNEL allocation
flag, but in the IO path there are times when we want GFP_NOIO semantics.
As there is no way to pass the allocation flags to alloc_percpu(), this
patch delays the allocation of stats using a worker thread.
v2-> tejun suggested following changes. Changed the patch accordingly.
- move alloc_node location in structure
- reduce the size of names of some of the fields
- Reduce the scope of locking of alloc_list_lock
- Simplified stat_alloc_fn() by allocating stats for all
policies in one go and then assigning these to a group.
v3 -> Andrew suggested to put some comments in the code. Also raised
concerns about trying to allocate infinitely in case of allocation
failure. I have changed the logic to sleep for 10ms before retrying.
That should take care of non-preemptible UP kernels.
v4 -> Tejun had more suggestions.
- drop list_for_each_entry_all()
- instead of msleep() use queue_delayed_work()
- Some cleanups related to more compact coding.
v5-> tejun suggested more cleanups leading to more compact code.
tj: - Relocated pcpu_stats into blkio_stat_alloc_fn().
- Minor comment update.
- This also fixes suspicious RCU usage warning caused by invoking
cgroup_path() from blkg_alloc() without holding RCU read lock.
Now that blkg_alloc() doesn't require sleepable context, RCU
read lock from blkg_lookup_create() is maintained throughout
blkg_alloc().
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch changes the page allocation in gfs2_block_truncate_page
and two others to GFP_NOFS to avoid deadlock in low-memory conditions.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
As kvm_notify_acked_irq calls kvm_assigned_dev_ack_irq under
rcu_read_lock, we cannot use a mutex in the latter function. Switch to a
spin lock to address this.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
kvm_write_tsc() converts from guest TSC to microseconds, not nanoseconds
as intended. The result is that the window for matching is 1000 seconds,
not 1 second.
Microsecond precision is enough for checking whether the TSC write delta
is within the heuristic values, so use it instead of nanoseconds.
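The unit mismatch can be illustrated with a small userspace sketch. It
is not the code from kvm_write_tsc(): the function names are
hypothetical, and it assumes the bug amounts to comparing a
microsecond delta against a nanosecond-scaled one-second threshold.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define USEC_PER_SEC 1000000ULL

/*
 * Buggy check: the delta is in microseconds but the threshold is one
 * second expressed in nanoseconds, so deltas up to 1000 s still match.
 */
bool within_window_buggy(uint64_t delta_usec)
{
	return delta_usec < NSEC_PER_SEC;
}

/* Fixed check: delta and threshold are both in microseconds. */
bool within_window_fixed(uint64_t delta_usec)
{
	return delta_usec < USEC_PER_SEC;
}
```

With these definitions a delta of 999 seconds still passes the buggy
check, while the fixed check rejects anything of one second or more.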
Noted by Avi Kivity.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with up-to-date ones before the CPU uses
them.
Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
Also move restore_sched_clock_state before __restore_processor_state,
since the latter calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
Thanks to Igor Mammedov for tracking it down.
Fixes suspend-to-disk with kvmclock.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Broken monitors and/or broken graphic boards may send erroneous or no
EDID data. This also applies to broken KVM devices that are unable to
correctly forward the EDID data of the connected monitor but invent
their own fantasy data.
This patch allows specifying an EDID data set to be used instead of
probing the monitor for it. It contains built-in data sets of frequently
used screen resolutions. In addition, a particular EDID data set may be
provided in the /lib/firmware directory and loaded via the firmware
interface. The name is passed to the kernel as a module parameter of the
drm_kms_helper module, either when the module is loaded
options drm_kms_helper edid_firmware=edid/1280x1024.bin
or as a kernel command-line parameter
drm_kms_helper.edid_firmware=edid/1280x1024.bin
It is also possible to restrict the usage of a specified EDID data set
to a particular connector. This is done by prepending the name of the
connector to the name of the EDID data set using the syntax
edid_firmware=[<connector>:]<edid>
such as, for example,
edid_firmware=DVI-I-1:edid/1920x1080.bin
in which case no other connector will be affected.
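How the [<connector>:]<edid> syntax could be interpreted is sketched
below in userspace C. The helper edid_name_for() is hypothetical and is
not the function used by drm_kms_helper; it only shows the rule that a
value without a connector prefix applies to every connector, while a
prefixed value applies to the named connector alone.

```c
#include <string.h>

/*
 * Return the EDID data set name that applies to `connector`, or NULL
 * if `param` is restricted to a different connector.
 */
const char *edid_name_for(const char *param, const char *connector)
{
	const char *colon = strchr(param, ':');
	size_t len;

	if (!colon)
		return param;	/* no prefix: applies to all connectors */

	len = (size_t)(colon - param);
	if (strlen(connector) == len && strncmp(param, connector, len) == 0)
		return colon + 1;	/* prefix matches this connector */

	return NULL;		/* restricted to some other connector */
}
```

For example, with the value DVI-I-1:edid/1920x1080.bin the helper
returns edid/1920x1080.bin for connector DVI-I-1 and NULL for any other
connector name.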
The built-in data sets are
Resolution    Name
--------------------------------
1024x768      edid/1024x768.bin
1280x1024     edid/1280x1024.bin
1680x1050     edid/1680x1050.bin
1920x1080     edid/1920x1080.bin
They are ignored if a file with the same name is available in the
/lib/firmware directory.
The built-in EDID data sets are based on standard timings that may not
apply to a particular monitor and may even crash it. Ideally, the EDID
data of the connected monitor should be used. It may be obtained through
the drm/cardX/cardX-<connector>/edid entry in the /sys/devices PCI
directory of a correctly working graphics adapter.
It is even possible to specify the name of an EDID data set on-the-fly
via the /sys/module interface, e.g.
echo edid/myedid.bin >/sys/module/drm_kms_helper/parameters/edid_firmware
The new screen mode is considered when the related kernel function is
called for the first time after the change. Such calls are made when the
X server is started or when the display settings dialog is opened in an
already running X server.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Conflicts:
arch/arm/mach-exynos/clock-exynos4.c
arch/arm/mach-exynos/clock.c
The cleanup moves the exynos4 clock implementation away, while
the other branch modifies the file with the old name.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In omap_gpio_runtime_suspend/resume() the context save/restore should
be independent of bank->enabled_non_wakeup_gpios. This was preventing
context restore of GPIO lines which are not wakeup enabled.
Reported-by: Govindraj Raja <govindraj.raja@ti.com>
Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
There are two functions, _set_gpio_dataout_reg() and _set_gpio_dataout_mask(),
which write to the dataout register, so the dataout context must be saved in
both. The save is missing in the first function, _set_gpio_dataout_reg(). Fix
this.
Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
This function should be capable of both enabling and disabling interrupts
based upon the *enable* parameter. Right now the function only enables
the interrupt and *enable* is not used at all. So add the interrupt
disable capability also using the parameter.
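The intended behaviour can be sketched as a pure function over an
interrupt-enable word. The name irq_enable_update() and the idea of a
single enable bitmask are assumptions made for illustration; the real
driver writes hardware registers instead.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the new interrupt-enable word after
 * setting (enable == true) or clearing (enable == false) the bits
 * selected by `mask`.
 */
uint32_t irq_enable_update(uint32_t irqenable, uint32_t mask, bool enable)
{
	if (enable)
		return irqenable | mask;	/* enable the selected lines */

	return irqenable & ~mask;		/* disable the selected lines */
}
```

Before the fix, the behaviour corresponded to always taking the first
branch regardless of the parameter.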
Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
The GPIO trigger parameter is of type unsigned.
enum {
IRQ_TYPE_NONE = 0x00000000,
IRQ_TYPE_EDGE_RISING = 0x00000001,
IRQ_TYPE_EDGE_FALLING = 0x00000002,
IRQ_TYPE_EDGE_BOTH = (IRQ_TYPE_EDGE_FALLING | IRQ_TYPE_EDGE_RISING),
IRQ_TYPE_LEVEL_HIGH = 0x00000004,
IRQ_TYPE_LEVEL_LOW = 0x00000008,
IRQ_TYPE_LEVEL_MASK = (IRQ_TYPE_LEVEL_LOW | IRQ_TYPE_LEVEL_HIGH),
IRQ_TYPE_SENSE_MASK = 0x0000000f,
IRQ_TYPE_PROBE = 0x00000010,
...
};
Even though gpio_irq_type(struct irq_data *d, unsigned type) has the right type
of parameter, the subsequently called functions set_gpio_triggering() and
set_gpio_trigger() wrongly take it as a signed integer. Fix this.
Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
There are two ways through which wakeup_en register can be programmed
using gpiolib APIs as shown below. It is seen that in the second case
in _set_gpio_wakeup(), even though bank->suspend_wakeup is updated
correctly, its value is not programmed in wakeup_en register. Fix this.
irq_set_type()->gpio_irq_type()->_set_gpio_triggering()->set_gpio_trigger()
irq_set_wake()->gpio_wake_enable()->_set_gpio_wakeup()
Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Conflicts:
arch/arm/mach-shmobile/timer.c
This resolves a nonobvious merge conflict between the renesas
timer changes in the global timer changes branch, those
from the renesas soc branch, and last-minute bug fixes that
went into v3.3.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The exynos drm driver has several subdrvs. Each of them can be built as a
module, but that leaves the probe order of the exynos drm driver and each
subdrv undefined. It also requires some awkward code such as
exynos_drm_fbdev_reinit and exynos_drm_mode_group_reinit. By no longer
building each subdrv as a module, this patch removes that awkward code and
cleans up the rest.
It also removes unnecessary module-related code.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
These functions will be used by drm-based 2d acceleration drivers
to get/put a dma address through a gem handle.
When exynos_drm_get_dma_address is called, the reference count of the
gem object is increased so that it is not released by gem close, and
when exynos_drm_put_dma_address is called, the reference count of
the gem object is decreased so that it can be released.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With this patch, we can allocate physically contiguous or non-contiguous
memory. It also creates a scatterlist for iommu support so that the allocated
memory region can be mapped to the iommu page table using the scatterlist.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch adds a mode_fixup feature to the hdmi module so that
a specific driver can change the current mode to the mode it
desires.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Exynos series from Exynos4X12 onward support HDMI version 1.4. Which
version to use is distinguished via platform data. This patch supports
only the default features of HDMI version 1.4 (3D, sound, etc. are not
supported yet).
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since commit 299b0767 (ipv6: Fix IPsec slowpath fragmentation problem),
in ip6_append_data(), after the call to skb_put(skb, fraglen + dst_exthdrlen),
skb->len includes dst_exthdrlen, and dst_exthdrlen is never subtracted again.
This makes fraggap > 0 in the next iteration of the while loop and makes the
size of the skb incorrect.
Fix this by reserving headroom for dst_exthdrlen.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The one fix didn't make the cut for 3.3, so we're putting it
into v3.4. Tony tells me "There are more patches needed to make
multiple smsc91x instances work, but we need to hear from people
with such boards first. Then those can be tagged for stable.",
so we don't mark this patch stable yet.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The mapping is not set up correctly if the page fault begins at an
intermediate address.
[The driver prefaults the mapping, so we need to work from the correct
base address, not the faulting address; otherwise the map appears offset by
the fault offset]
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Firmware reports the following ARP offload related information
when sending the target statistics event to the host.
* Number of ARP packets received.
* Number of packets matched with the device IP addr.
* Number of ARP response packets sent to the remote.
This patch adds additional debug prints to the debugfs
entry tgt_stats. They are useful for checking the ARP offload
execution status.
Signed-off-by: Raja Mani <rmani@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
* tag 'drm-intel-next-2012-03-01' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Only clear the GPU domains upon a successful finish
drm/i915: reenable gmbus on gen3+ again
drm/i915: i2c: unconditionally set up gpio fallback
drm/i915: merge gmbus and gpio i2c adpater into one
drm/i915: merge struct intel_gpio into struct intel_gmbus
i2c: export bit-banging algo functions
drm/nouveau: do a better job at hiding the NIH i2c bit-banging algo
drm/i915: add dev_priv to intel_gmbus
drm/i915: Fix single msg gmbus_xfers writes
drm/i915: error_buffer->ring should be signed
drm/i915: Silence the error message from i915_wait_request()
drm/i915: use the new hdmi_force_audio enum more
drm/i915: No need to search again after retiring requests
drm/i915: Only bump refcnt on objects scheduled for eviction
drm/i915/bios: Downgrade the "signature missing" DRM_ERROR to debug
drm/i915: Ignore LVDS on hp t5745 and hp st5747 thin client
drm/i915: Fixes distorted external screen image on HP 2730p
The hardware only takes 27 bits for the offset, so larger offsets are
truncated, and the display shows random bits other than the intended ones.
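The effect of the truncation can be sketched with plain bit arithmetic.
The helper names are hypothetical, and the sketch assumes that the
hardware keeps the low 27 bits of the offset and drops the rest.

```c
#include <stdbool.h>
#include <stdint.h>

#define OFFSET_BITS 27
#define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Offset as the hardware would see it: only the low 27 bits survive. */
uint64_t hw_visible_offset(uint64_t offset)
{
	return offset & OFFSET_MASK;
}

/* True if the offset can be programmed without losing any bits. */
bool offset_fits(uint64_t offset)
{
	return (offset & ~OFFSET_MASK) == 0;
}
```

Under this assumption an offset of exactly 1 << 27 would be seen by the
hardware as 0, which is why the display ends up scanning out the wrong
memory.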
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>