The check was only needed for the DMA-API implementation in the AMD
IOMMU driver, which no longer exists.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200429133712.31431-6-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The Intel VT-d driver already has a matching function to determine the
default domain type for a device. Wire it up in intel_iommu_ops.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200429133712.31431-5-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Some devices are reqired to use a specific type (identity or dma)
of default domain when they are used with a vendor iommu. When the
system level default domain type is different from it, the vendor
iommu driver has to request a new default domain with
iommu_request_dma_domain_for_dev() and iommu_request_dm_for_dev()
in the add_dev() callback. Unfortunately, these two helpers only
work when the group hasn't been assigned to any other devices,
hence, some vendor iommu driver has to use a private domain if
it fails to request a new default one.
This adds def_domain_type() callback in the iommu_ops, so that
any special requirement of default domain for a device could be
aware by the iommu generic layer.
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
[ jroedel@suse.de: Added iommu_get_def_domain_type() function and use
it to allocate the default domain ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20200429133712.31431-3-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move the code out of iommu_group_get_for_dev() into a separate
function.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20200429133712.31431-2-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since 'value' is declared as unsigned long, the following statement is
always false.
value < 0
So let's remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
When both CONFIG_DEBUG_FS and CONFIG_PM_SLEEP are disabled, the
functions that got moved out of the #ifdef section now cause
a warning:
drivers/platform/x86/intel_pmc_core.c:654:13: error: 'pmc_core_lpm_display' defined but not used [-Werror=unused-function]
654 | static void pmc_core_lpm_display(struct pmc_dev *pmcdev, struct device *dev,
| ^~~~~~~~~~~~~~~~~~~~
drivers/platform/x86/intel_pmc_core.c:617:13: error: 'pmc_core_slps0_display' defined but not used [-Werror=unused-function]
617 | static void pmc_core_slps0_display(struct pmc_dev *pmcdev, struct device *dev,
| ^~~~~~~~~~~~~~~~~~~~~~
Rather than add even more #ifdefs here, remove them entirely and
let the compiler work it out, it can actually get rid of all the
debugfs calls without problems as long as the struct member is
there.
The two PM functions just need a __maybe_unused annotations to avoid
another warning instead of the #ifdef.
Fixes: aae43c2bcd ("platform/x86: intel_pmc_core: Relocate pmc_core_*_display() to outside of CONFIG_DEBUG_FS")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
asus-nb-wmi does not add any extra functionality on these Asus
Transformer books. They have detachable keyboards, so the hotkeys are
send through a HID device (and handled by the hid-asus driver) and also
the rfkill functionality is not used on these devices.
Besides not adding any extra functionality, initializing the WMI interface
on these devices actually has a negative side-effect. For some reason
the \_SB.ATKD.INIT() function which asus_wmi_platform_init() calls drives
GPO2 (INT33FC:02) pin 8, which is connected to the front facing webcam LED,
high and there is no (WMI or other) interface to drive this low again
causing the LED to be permanently on, even during suspend.
This commit adds a blacklist of DMI system_ids on which not to load the
asus-nb-wmi and adds these Transformer books to this list. This fixes
the webcam LED being permanently on under Linux.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Jasper Lake uses Icelake PCH IPs and the S0ix debug interfaces are same as
Icelake. It uses SLP_S0_DBG register latch/read interface from Icelake
generation. It doesn't use Tiger Lake LPM debug registers. Change the
Jasper Lake S0ix debug interface to use the ICL reg map.
Fixes: 16292bed9c ("platform/x86: intel_pmc_core: Add Atom based Jasper Lake (JSL) platform support")
Signed-off-by: Archana Patni <archana.patni@intel.com>
Acked-by: David E. Box <david.e.box@intel.com>
Tested-by: Divagar Mohandass <divagar.mohandass@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
RK3399 has a Video decoder, define the node in the dtsi. We also add
the missing power-domain in mmu node and enable the block.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Link: https://lore.kernel.org/r/20200403221345.16702-6-ezequiel@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
The LPASS hardware allows configuring the MI2S SD lines to use
when playing/recording audio. However, at the moment the lpass-cpu
driver has SD0 hard-coded for mono/stereo (or additional fixed
SD lines for more channels).
For weird reasons there seems to be hardware that uses one of the
other SD lines for mono/stereo. For example, some Samsung devices
use an external Speaker amplifier connected to Quaternary MI2S.
For some reason, the SD line for audio playback was connected to
SD1 rather than SD0. (I have no idea why...)
At the moment, the lpass-cpu driver cannot be configured to work
for the Speaker on these devices.
The q6afe driver already allows configuring the MI2S SD lines
through the "qcom,sd-lines" device tree property, but this works
only when routing audio through the ADSP.
This commit adds a very similar configuration for the lpass-cpu driver.
It is now possible to add additional subnodes to the lpass device in
the device tree, to configure the SD lines for playback and/or capture.
E.g. for the Samsung devices mentioned above:
&lpass {
dai@3 {
reg = <MI2S_QUATERNARY>;
qcom,playback-sd-lines = <1>;
};
};
qcom,playback/capture-sd-lines takes a list of SD lines (0-3)
in the same format as the q6afe driver. (The difference here is that
q6afe has separate DAIs for playback/capture, while lpass-cpu has one
for both...)
For backwards compatibility with older device trees, the lpass-cpu driver
defaults to LPAIF_I2SCTL_MODE_8CH if the subnode for a DAI is missing.
This is equivalent to the previous behavior: Up to 8 channels can be
configured, and SD0/QUAT01 will be chosen when setting up a stream
with fewer channels.
This allows the speaker to work on Samsung MSM8916 devices
that use an external speaker amplifier.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200425184657.121991-2-stephan@gerhold.net
Signed-off-by: Mark Brown <broonie@kernel.org>
The lpass-cpu driver now allows configuring the MI2S SD lines
by defining subnodes for one of the DAIs.
Document this in the device tree bindings.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Link: https://lore.kernel.org/r/20200425184657.121991-1-stephan@gerhold.net
Signed-off-by: Mark Brown <broonie@kernel.org>
The sdmmc1 peripheral is connected on SD-card on STM32MP1-ED1 board.
Add the UHS features the controller is able to manage.
Those features require a level shifter on the board, and the support of
the voltage switch in driver, which is done in Linux v5.7.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Yann Gautier <yann.gautier@st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
Userspace can severely fragment rb_hole_addr rbtree by manipulating
alignment while allocating buffers. Fragmented rb_hole_addr rbtree
would result in large delays while allocating buffer object for a
userspace application. It takes long time to find suitable hole
because if we fail to find a suitable hole in the first attempt
then we look for neighbouring nodes using rb_prev()/rb_next().
Traversing rbtree using rb_prev()/rb_next() can take really long
time if the tree is fragmented.
This patch improves searches in fragmented rb_hole_addr rbtree by
modifying it to an augmented rbtree which will store an extra field
in drm_mm_node, subtree_max_hole. Each drm_mm_node now stores maximum
hole size for its subtree in drm_mm_node->subtree_max_hole. Using
drm_mm_node->subtree_max_hole, it is possible to eliminate a complete
subtree if that subtree is unable to serve a request hence reducing
number of rb_prev()/rb_next() used.
With this patch applied, 1 million bo allocs on amdgpu took ~8 sec,
compared to 50k bo allocs which took 28 sec without it.
partial test code:
int test_fragmentation(void)
{
int i = 0;
uint32_t minor_version;
uint32_t major_version;
struct amdgpu_bo_alloc_request request = {};
amdgpu_bo_handle vram_handle[MAX_ALLOC] = {};
amdgpu_device_handle device_handle;
request.alloc_size = 4096;
request.phys_alignment = 8192;
request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM;
int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
amdgpu_device_initialize(fd, &major_version, &minor_version,
&device_handle);
for (i = 0; i < MAX_ALLOC; i++) {
amdgpu_bo_alloc(device_handle, &request, &vram_handle[i]);
}
for (i = 0; i < MAX_ALLOC; i++)
amdgpu_bo_free(vram_handle[i]);
return 0;
}
v2:
Use RB_DECLARE_CALLBACKS_MAX to maintain subtree_max_hole
v3:
insert_hole_addr() should be static a function
fix return value of next_hole_high_addr()/next_hole_low_addr()
Reported-by: kbuild test robot <lkp@intel.com>
v4:
Fix commit message.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/364341/
Signed-off-by: Christian König <christian.koenig@amd.com>
It was removed in:
Author: Christian König <christian.koenig@amd.com>
Date: Wed Sep 25 11:38:50 2019 +0200
drm/ttm: remove pointers to globals
Signed-off-by: Maya Rashish <coypu@sdf.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/360750/
Signed-off-by: Christian König <christian.koenig@amd.com>
Here's an assortment of drive-by fixes for the new AMD SPI driver.
All of them are compile-tested only.
Lukas Wunner (5):
spi: amd: Fix duplicate iounmap in error path
spi: amd: Pass probe errors back to driver core
spi: amd: Drop duplicate driver data assignments
spi: amd: Fix refcount underflow on remove
spi: amd: Drop superfluous member from struct amd_spi
drivers/spi/spi-amd.c | 27 +++++----------------------
1 file changed, 5 insertions(+), 22 deletions(-)
--
2.26.2
When unbinding pci_epf_test, pci_epf_test_clean_dma_chan() is called in
pci_epf_test_unbind() even though epf_test->dma_supported is false.
As a result, dma_release_channel() will trigger a NULL pointer
dereference because dma_chan is not set.
Avoid calling dma_release_channel() if epf_test->dma_supported
is false.
Link: https://lore.kernel.org/r/1587540287-10458-1-git-send-email-hayashi.kunihiko@socionext.com
Fixes: 5ebf3fc59b ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
MADV_DONTNEED holds mmap_sem in read mode and that implies a
parallel page fault is possible and the kernel can end up with a level 1 PTE
entry (THP entry) converted to a level 0 PTE entry without flushing
the THP TLB entry.
Most architectures including POWER have issues with kernel instantiating a level
0 PTE entry while holding level 1 TLB entries.
The code sequence I am looking at is
down_read(mmap_sem) down_read(mmap_sem)
zap_pmd_range()
zap_huge_pmd()
pmd lock held
pmd_cleared
table details added to mmu_gather
pmd_unlock()
insert a level 0 PTE entry()
tlb_finish_mmu().
Fix this by forcing a tlb flush before releasing pmd lock if this is
not a fullmm invalidate. We can safely skip this invalidate for
task exit case (fullmm invalidate) because in that case we are sure
there can be no parallel fault handlers.
This do change the Qemu guest RAM del/unplug time as below
128 core, 496GB guest:
Without patch:
munmap start: timer = 196449 ms, PID=6681
munmap finish: timer = 196488 ms, PID=6681 - delta = 39ms
With patch:
munmap start: timer = 196345 ms, PID=6879
munmap finish: timer = 196714 ms, PID=6879 - delta = 369ms
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-23-aneesh.kumar@linux.ibm.com
Now that all the lockless page table walk is careful w.r.t the PTE
address returned, we can now revert
commit: 13bd817bb8 ("powerpc/thp: Serialize pmd clear against a linux page table walk.")
We also drop the equivalent IPI from other pte updates routines. We still keep
IPI in hash pmdp collapse and that is to take care of parallel hash page table
insert. The radix pmdp collapse flush can possibly be removed once I am sure
generic code doesn't have the any expectations around parallel gup walk.
This speeds up Qemu guest RAM del/unplug time as below
128 core, 496GB guest:
Without patch:
munmap start: timer = 13162 ms, PID=7684
munmap finish: timer = 95312 ms, PID=7684 - delta = 82150 ms
With patch:
munmap start: timer = 196449 ms, PID=6681
munmap finish: timer = 196488 ms, PID=6681 - delta = 39ms
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-21-aneesh.kumar@linux.ibm.com
This adds _PAGE_PTE check and makes sure we validate the pte value returned via
find_kvm_host_pte.
NOTE: this also considers _PAGE_INVALID to the software valid bit.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-20-aneesh.kumar@linux.ibm.com
Current code just hold rmap lock to ensure parallel page table update is
prevented. That is not sufficient. The kernel should also check whether
a mmu_notifer callback was running in parallel.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-16-aneesh.kumar@linux.ibm.com
Since kvmppc_do_h_enter can get called in realmode use low level
arch_spin_lock which is safe to be called in realmode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-15-aneesh.kumar@linux.ibm.com
The locking rules for walking nested shadow linux page table is different from process
scoped table. Hence add a helper for nested page table walk and also
add check whether we are holding the right locks.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-11-aneesh.kumar@linux.ibm.com
The locking rules for walking partition scoped table is different from process
scoped table. Hence add a helper for secondary linux page table walk and also
add check whether we are holding the right locks.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-10-aneesh.kumar@linux.ibm.com
These functions can get called in realmode. Hence use low level
arch_spin_lock which is safe to be called in realmode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-9-aneesh.kumar@linux.ibm.com
read_user_stack_slow is called with interrupts soft disabled and it copies contents
from the page which we find mapped to a specific address. To convert
userspace address to pfn, the kernel now uses lockless page table walk.
The kernel needs to make sure the pfn value read remains stable and is not released
and reused for another process while the contents are read from the page. This
can only be achieved by holding a page reference.
One of the first approaches I tried was to check the pte value after the kernel
copies the contents from the page. But as shown below we can still get it wrong
CPU0 CPU1
pte = READ_ONCE(*ptep);
pte_clear(pte);
put_page(page);
page = alloc_page();
memcpy(page_address(page), "secret password", nr);
memcpy(buf, kaddr + offset, nb);
put_page(page);
handle_mm_fault()
page = alloc_page();
set_pte(pte, page);
if (pte_val(pte) != pte_val(*ptep))
Hence switch to __get_user_pages_fast.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-8-aneesh.kumar@linux.ibm.com
A lockless page table walk should be safe against parallel THP collapse, THP
split and madvise(MADV_DONTNEED)/parallel fault. This patch makes sure kernel
won't reload the pteval when checking for different conditions. The patch also added
a check for pte_present to make sure the kernel is indeed operating
on a PTE and not a pointer to level 0 table page.
The pfn value we find here can be different from the actual pfn on which
machine check happened. This can happen if we raced with a parallel update
of the page table. In such a scenario we end up isolating a wrong pfn. But that
doesn't have any other side effect.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-7-aneesh.kumar@linux.ibm.com
Don't fetch the pte value using lockless page table walk. Instead use the value from the
caller. hash_preload is called with ptl lock held. So it is safe to use the
pte_t address directly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-6-aneesh.kumar@linux.ibm.com
This is only used with init_mm currently. Walking init_mm is much simpler
because we don't need to handle concurrent page table like other mm_context
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-5-aneesh.kumar@linux.ibm.com
This makes the pte_present check stricter by checking for additional _PAGE_PTE
bit. A level 1 pte pointer (THP pte) can be switched to a pointer to level 0 pte
page table page by following two operations.
1) THP split.
2) madvise(MADV_DONTNEED) in parallel to page fault.
A lockless page table walk need to make sure we can handle such changes
gracefully.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-4-aneesh.kumar@linux.ibm.com
If multiple threads in userspace keep changing the protection keys
mapping a range, there can be a scenario where kernel takes a key fault
but the pkey value found in the siginfo struct is a permissive one.
This can confuse the userspace as shown in the below test case.
/* use this to control the number of test iterations */
static void pkeyreg_set(int pkey, unsigned long rights)
{
unsigned long reg, shift;
shift = (NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
asm volatile("mfspr %0, 0xd" : "=r"(reg));
reg &= ~(((unsigned long) PKEY_BITS_MASK) << shift);
reg |= (rights & PKEY_BITS_MASK) << shift;
asm volatile("mtspr 0xd, %0" : : "r"(reg));
}
static unsigned long pkeyreg_get(void)
{
unsigned long reg;
asm volatile("mfspr %0, 0xd" : "=r"(reg));
return reg;
}
static int sys_pkey_mprotect(void *addr, size_t len, int prot, int pkey)
{
return syscall(SYS_pkey_mprotect, addr, len, prot, pkey);
}
static int sys_pkey_alloc(unsigned long flags, unsigned long access_rights)
{
return syscall(SYS_pkey_alloc, flags, access_rights);
}
static int sys_pkey_free(int pkey)
{
return syscall(SYS_pkey_free, pkey);
}
static int faulting_pkey;
static int permissive_pkey;
static pthread_barrier_t pkey_set_barrier;
static pthread_barrier_t mprotect_barrier;
static void pkey_handle_fault(int signum, siginfo_t *sinfo, void *ctx)
{
unsigned long pkeyreg;
/* FIXME: printf is not signal-safe but for the current purpose,
it gets the job done. */
printf("pkey: exp = %d, got = %d\n", faulting_pkey, sinfo->si_pkey);
fflush(stdout);
assert(sinfo->si_code == SEGV_PKUERR);
assert(sinfo->si_pkey == faulting_pkey);
/* clear pkey permissions to let the faulting instruction continue */
pkeyreg_set(faulting_pkey, 0x0);
}
static void *do_mprotect_fault(void *p)
{
unsigned long rights, pkeyreg, pgsize;
unsigned int i;
void *region;
int pkey;
srand(time(NULL));
pgsize = sysconf(_SC_PAGESIZE);
rights = PKEY_DISABLE_WRITE;
region = p;
/* allocate key, no permissions */
assert((pkey = sys_pkey_alloc(0, PKEY_DISABLE_ACCESS)) > 0);
pkeyreg_set(4, 0x0);
/* cache the pkey here as the faulting pkey for future reference
in the signal handler */
faulting_pkey = pkey;
printf("%s: faulting pkey = %d\n", __func__, faulting_pkey);
/* try to allocate, mprotect and free pkeys repeatedly */
for (i = 0; i < NUM_ITERATIONS; i++) {
/* sync up with the other thread here */
pthread_barrier_wait(&pkey_set_barrier);
/* make sure that the pkey used by the non-faulting thread
is made permissive for this thread's context too so that
no faults are triggered because it still might have been
set to a restrictive value */
// pkeyreg_set(permissive_pkey, 0x0);
/* sync up with the other thread here */
pthread_barrier_wait(&mprotect_barrier);
/* perform mprotect */
assert(!sys_pkey_mprotect(region, pgsize, PROT_READ | PROT_WRITE, pkey));
/* choose a random byte from the protected region and
attempt to write to it, this will generate a fault */
*((char *) region + (rand() % pgsize)) = rand();
/* restore pkey permissions as the signal handler may have
cleared the bit out for the sake of continuing */
pkeyreg_set(pkey, PKEY_DISABLE_WRITE);
}
/* free pkey */
sys_pkey_free(pkey);
return NULL;
}
static void *do_mprotect_nofault(void *p)
{
unsigned long pgsize;
unsigned int i, j;
void *region;
int pkey;
pgsize = sysconf(_SC_PAGESIZE);
region = p;
/* try to allocate, mprotect and free pkeys repeatedly */
for (i = 0; i < NUM_ITERATIONS; i++) {
/* allocate pkey, all permissions */
assert((pkey = sys_pkey_alloc(0, 0)) > 0);
permissive_pkey = pkey;
/* sync up with the other thread here */
pthread_barrier_wait(&pkey_set_barrier);
pthread_barrier_wait(&mprotect_barrier);
/* perform mprotect on the common page, no faults will
be triggered as this is most permissive */
assert(!sys_pkey_mprotect(region, pgsize, PROT_READ | PROT_WRITE, pkey));
/* free pkey */
assert(!sys_pkey_free(pkey));
}
return NULL;
}
int main(int argc, char **argv)
{
pthread_t fault_thread, nofault_thread;
unsigned long pgsize;
struct sigaction act;
pthread_attr_t attr;
cpu_set_t fault_cpuset, nofault_cpuset;
unsigned int i;
void *region;
/* allocate memory region to protect */
pgsize = sysconf(_SC_PAGESIZE);
assert(region = memalign(pgsize, pgsize));
CPU_ZERO(&fault_cpuset);
CPU_SET(0, &fault_cpuset);
CPU_ZERO(&nofault_cpuset);
CPU_SET(8, &nofault_cpuset);
assert(!pthread_attr_init(&attr));
/* setup sigsegv signal handler */
act.sa_handler = 0;
act.sa_sigaction = pkey_handle_fault;
assert(!sigprocmask(SIG_SETMASK, 0, &act.sa_mask));
act.sa_flags = SA_SIGINFO;
act.sa_restorer = 0;
assert(!sigaction(SIGSEGV, &act, NULL));
/* setup barrier for the two threads */
pthread_barrier_init(&pkey_set_barrier, NULL, 2);
pthread_barrier_init(&mprotect_barrier, NULL, 2);
/* setup and start threads */
assert(!pthread_create(&fault_thread, &attr, &do_mprotect_fault, region));
assert(!pthread_setaffinity_np(fault_thread, sizeof(cpu_set_t), &fault_cpuset));
assert(!pthread_create(&nofault_thread, &attr, &do_mprotect_nofault, region));
assert(!pthread_setaffinity_np(nofault_thread, sizeof(cpu_set_t), &nofault_cpuset));
/* cleanup */
assert(!pthread_attr_destroy(&attr));
assert(!pthread_join(fault_thread, NULL));
assert(!pthread_join(nofault_thread, NULL));
assert(!pthread_barrier_destroy(&pkey_set_barrier));
assert(!pthread_barrier_destroy(&mprotect_barrier));
free(region);
puts("PASS");
return EXIT_SUCCESS;
}
The above test can result the below failure without this patch.
pkey: exp = 3, got = 3
pkey: exp = 3, got = 4
a.out: pkey-siginfo-race.c💯 pkey_handle_fault: Assertion `sinfo->si_pkey == faulting_pkey' failed.
Aborted
Check for vma access before considering this a key fault. If vma pkey allow
access retry the acess again.
Test case is written by Sandipan Das <sandipan@linux.ibm.com> hence added SOB
from him.
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-3-aneesh.kumar@linux.ibm.com
Fetch pkey from vma instead of linux page table. Also document the fact that in
some cases the pkey returned in siginfo won't be the same as the one we took
keyfault on. Even with linux page table walk, we can end up in a similar scenario.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-2-aneesh.kumar@linux.ibm.com
- Fix a regression introduced in the last merge window, which results
in guests in HPT mode dying randomly.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJeni/pAAoJEJ2a6ncsY3GfTRoIANAQjIZi96AfJcfnrYQ4yUF7
scxawTiJ9VavvsEJLJ7vsozrJ4xxmvmA0fFWC84uw9+BwPqoLFFvZTjazbGEDVvF
FGwNBR/k7nfFVMIHS3K9iy9KjvYL3xkL26AgFTDJFq8hmOO9pH0txuk4r7SXb+NX
bGG0mScAD/Dg/HwAHAS6EP3jT35QtGTK62p8foqVTziTNcmBn9Ywtg0lEzAcq2iY
Y1BUD4Ov3cggshMI9SqHE8Yyq0XA2Wi6ggcyz/gVzvcbdFQmtg57Tri8nN8661LX
XKh+VTpYSIxNs5GgjwlNesJzJ9h6CSynJF556qrjQ0XsXcNqvn8fcZdNQ+hnRYw=
=Y19W
-----END PGP SIGNATURE-----
Merge tag 'kvm-ppc-fixes-5.7-1' into topic/ppc-kvm
This brings in a fix from the kvm-ppc tree that was merged to mainline
after rc2, and so isn't in the base of our topic branch. We'd like it
in the topic branch because it interacts with patches we plan to carry
in this branch.
There have been cases where seemingly innocuous patches have broken the
uAPI by changing the memory layout of the parameter struct. Generally such
changes also introduce a change in the size of the entire struct. This
patch adds a sanity check to avoid such cases happening in the future.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Tested-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Move the alignment attribute of struct ipu3_uapi_awb_fr_config_s to the
field in struct ipu3_uapi_4a_config, the other location where the struct
is used.
Fixes: commit c9d52c114a ("media: staging: imgu: Address a compiler warning on alignment")
Reported-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Bingbu Cao <bingbu.cao@intel.com>
Cc: stable@vger.kernel.org # for v5.3 and up
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
This reverts commit c9d52c114a.
The patch being reverted changed the memory layout of struct
ipu3_uapi_acc_param. Revert it, and address the compiler warning issues in
further patches.
Fixes: commit c9d52c114a ("media: staging: imgu: Address a compiler warning on alignment")
Reported-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Bingbu Cao <bingbu.cao@intel.com>
Cc: stable@vger.kernel.org # for v5.3 and up
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Add Bingbu Cao and Tian Shu Qiu as reviewers for the IPU3 ImgU driver.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Bingbu Cao <bingbu.cao@intel.com>
Cc: Tian Shu Qiu <tian.shu.qiu@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Add some explanation of the ImgU running mode and add more information
about firmware selection and running mode usage.
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
According to the v4l spec s_fmt must return EBUSY while the
particular queue is streaming. Add such check in encoder and
decoder s_fmt methods.
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
According to stateful Codec API the decoder will process all
remaining buffers from before the source change event in
dynamic-resolution-change state and mark the last buffer with
V4L2_BUF_FLAG_LAST.
In Venus case the firmware doesn't mark that last buffer and
some mechanism have to be created in v4l decoder driver.
Fortunately the firmware interface (HFI) claims that the
decoder output buffers will be returned to v4l decoder
driver before it send the insufficient event.
In order to do that we save last queued in the driver capture
buffer in the event_notify and issue flush on output firmware
buffers queue. Once the saved buffer is returned (as a result of
flush command) we mark it as LAST. For all that possible we
extend HFI flush command with one more argument and one more
flush_done HFI driver callback.
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>