linux-xiaomi-chiron

Author	SHA1	Message	Date
Cédric Le Goater	ba418a0278	KVM: PPC: Book3S HV: Use the new IRQ chip to detect passthrough interrupts Passthrough PCI MSI interrupts are detected in KVM with a check on a specific EOI handler (P8) or on XIVE (P9). We can now check the PCI-MSI IRQ chip which is cleaner. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-14-clg@kaod.org	2021-08-10 23:14:58 +10:00
Cédric Le Goater	0fcfe2247e	powerpc/powernv/pci: Add MSI domains This is very similar to the MSI domains of the pSeries platform. The MSI allocator is directly handled under the Linux PHB in the in-the-middle "PNV-MSI" domain. Only the XIVE (P9/P10) parent domain is supported for now. Support for XICS will come later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-13-clg@kaod.org	2021-08-10 23:14:58 +10:00
Cédric Le Goater	2c50d7e99e	powerpc/powernv/pci: Introduce __pnv_pci_ioda_msi_setup() It will be used as a 'compose_msg' handler of the MSI domain introduced later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-12-clg@kaod.org	2021-08-10 23:14:58 +10:00
Cédric Le Goater	174db9e7f7	powerpc/pseries/pci: Add support of MSI domains to PHB hotplug Simply allocate or release the MSI domains when a PHB is inserted in or removed from the machine. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-11-clg@kaod.org	2021-08-10 23:14:58 +10:00
Cédric Le Goater	9a014f4568	powerpc/pseries/pci: Add a msi_free() handler to clear XIVE data The MSI domain clears the IRQ with msi_domain_free(), which calls irq_domain_free_irqs_top(), which clears the handler data. This is a problem for the XIVE controller since we need to unmap MMIO pages and free a specific XIVE structure. The 'msi_free()' handler is called before irq_domain_free_irqs_top() when the handler data is still available. Use that to clear the XIVE controller data. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-10-clg@kaod.org	2021-08-10 23:14:58 +10:00
Cédric Le Goater	07817a578a	powerpc/pseries/pci: Add a domain_free_irqs() handler The RTAS firmware can not disable one MSI at a time. It's all or nothing. We need a custom free IRQ handler for that. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-9-clg@kaod.org	2021-08-10 23:14:58 +10:00
Cédric Le Goater	292145a6e5	powerpc/xive: Remove irqd_is_started() check when setting the affinity In the early days of XIVE support, commit `cffb717ceb` ("powerpc/xive: Ensure active irqd when setting affinity") tried to fix an issue related to interrupt migration. If the root cause was related to CPU unplug, it should have been fixed and there is no reason to keep the irqd_is_started() check. This test is also breaking affinity setting of MSIs, which can be set before starting the associated IRQ. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-8-clg@kaod.org	2021-08-10 23:14:57 +10:00
Cédric Le Goater	5690bcae18	powerpc/xive: Drop unmask of MSIs at startup That was a workaround in the XIVE domain because of the lack of MSI domain. This is now handled. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-7-clg@kaod.org	2021-08-10 23:14:57 +10:00
Cédric Le Goater	a5f3d2c17b	powerpc/pseries/pci: Add MSI domains Two IRQ domains are added on top of the default machine IRQ domain. First, the top level "pSeries-PCI-MSI" domain deals with the MSI specificities. In this domain, the HW IRQ numbers are generated by the PCI MSI layer; they compose a unique ID for an MSI source from the PCI device identifier and the MSI vector number. These numbers can be quite large on a pSeries machine running under the IBM Hypervisor and /sys/kernel/irq/ and /proc/interrupts will require small fixes to show them correctly. The second domain is the in-the-middle "pSeries-MSI" domain, which acts as a proxy between the PCI MSI subsystem and the machine IRQ subsystem. It usually allocates the MSI vector numbers but, on pSeries machines, this is done by the RTAS FW and RTAS returns IRQ numbers in the IRQ number space of the machine. This is why the in-the-middle "pSeries-MSI" domain has the same HW IRQ numbers as its parent domain. Only the XIVE (P9/P10) parent domain is supported for now. We still need to add support for IRQ domain hierarchy under XICS. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-6-clg@kaod.org	2021-08-10 23:14:57 +10:00
Cédric Le Goater	6c2ab2a5d6	powerpc/xive: Ease debugging of xive_irq_set_affinity() pr_debug() is easier to activate and it helps to know how the kernel configures the HW when tweaking the IRQ subsystem. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-5-clg@kaod.org	2021-08-10 23:14:57 +10:00
Cédric Le Goater	14be098c53	powerpc/xive: Add support for IRQ domain hierarchy This adds handlers to allocate/free IRQs in a domain hierarchy. We could try to use xive_irq_domain_map() in xive_irq_domain_alloc() but we rely on xive_irq_alloc_data() to set the IRQ handler data and duplicating the code is simpler. xive_irq_free_data() needs to be called when IRQs are freed to clear the MMIO mappings and free the XIVE handler data (the xive_irq_data structure). This is going to be a problem with MSI domains which we will address later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-4-clg@kaod.org	2021-08-10 23:14:57 +10:00
Cédric Le Goater	e812020073	powerpc/pseries/pci: Introduce rtas_prepare_msi_irqs() This splits the routine setting the MSIs in two parts: allocation of MSIs for the PCI device at the FW level (RTAS) and the actual mapping and activation of the IRQs. rtas_prepare_msi_irqs() will serve as a handler for the PCI MSI domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-3-clg@kaod.org	2021-08-10 23:14:57 +10:00
Cédric Le Goater	786e5b102a	powerpc/pseries/pci: Introduce __find_pe_total_msi() It will help to size the PCI MSI domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-2-clg@kaod.org	2021-08-10 23:14:56 +10:00
Alexey Kardashevskiy	2ac78e0c00	KVM: PPC: Use arch_get_random_seed_long instead of powernv variant The powernv_get_random_long() does not work in nested KVM (which is pseries) and produces a crash when accessing in_be64(rng->regs) in powernv_get_random_long(). This replaces powernv_get_random_long with the ppc_md machine hook wrapper. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210805075649.2086567-1-aik@ozlabs.ru	2021-08-10 23:14:56 +10:00
Anton Blanchard	9b49f979b3	powerpc/configs: Disable legacy ptys on microwatt defconfig We shouldn't need legacy ptys, and disabling the option improves boot time by about 0.5 seconds. Signed-off-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210805112005.3cb1f412@kryten.localdomain	2021-08-10 23:14:56 +10:00
Jordan Niethe	27fd111105	powerpc: Always inline radix_enabled() to fix build failure This is the same as commit `acdad8fb4a` ("powerpc: Force inlining of mmu_has_feature to fix build failure") but for radix_enabled(). The config in the linked bugzilla causes the following build failure: LD .tmp_vmlinux.kallsyms1 powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function `.__ptep_set_access_flags': pgtable.c:(.text+0x17c): undefined reference to `.radix__ptep_set_access_flags' powerpc64-linux-ld: arch/powerpc/mm/pageattr.o: in function `.change_page_attr': pageattr.c:(.text+0xc0): undefined reference to `.radix__flush_tlb_kernel_range' etc. This is due to radix_enabled() not being inlined. See extract from building with -Winline: In file included from arch/powerpc/include/asm/lppaca.h:46, from arch/powerpc/include/asm/paca.h:17, from arch/powerpc/include/asm/current.h:13, from include/linux/thread_info.h:23, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:78, from include/linux/spinlock.h:51, from include/linux/mmzone.h:8, from include/linux/gfp.h:6, from arch/powerpc/mm/pgtable.c:21: arch/powerpc/include/asm/book3s/64/pgtable.h: In function '__ptep_set_access_flags': arch/powerpc/include/asm/mmu.h:327:20: error: inlining failed in call to 'radix_enabled': call is unlikely and code size would grow [-Werror=inline] The code relies on constant folding of MMU_FTRS_POSSIBLE at build time and elimination of non-possible parts of code at compile time. For this to work radix_enabled() must be inlined, so make it __always_inline. Reported-by: Erhard F. <erhard_f@mailbox.org> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jordan Niethe <jniethe5@gmail.com> [mpe: Trimmed error messages in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://bugzilla.kernel.org/show_bug.cgi?id=213803 Link: https://lore.kernel.org/r/20210804013724.514468-1-jniethe5@gmail.com	2021-08-10 23:14:56 +10:00
Sebastian Andrzej Siewior	5ae36401ca	powerpc: Replace deprecated CPU-hotplug functions. The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210803141621.780504-4-bigeasy@linutronix.de	2021-08-10 23:14:56 +10:00
kernel test robot	c00103abf7	powerpc/kexec: fix for_each_child.cocci warning for_each_node_by_type should have of_node_put() before return. Generated by: scripts/coccinelle/iterators/for_each_child.cocci Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2108031654080.17639@hadrien	2021-08-10 23:14:55 +10:00
Laurent Dufour	bd1dd4c5f5	powerpc/pseries: Prevent free CPU ids being reused on another node When a CPU is hot added, the CPU ids are taken from the available mask from the lower possible set. If that set of values was previously used for a CPU attached to a different node, it appears to an application as if these CPUs have migrated from one node to another node, which is not expected. To prevent this, the CPU ids used for each node need to be recorded and not reused on another node. However, to prevent CPU hot plug from failing when a node is starved of CPU ids, the capability to reuse other nodes' free CPU ids is kept. A warning is displayed in such a case to warn the user. A new CPU bit mask (node_recorded_ids_map) is introduced for each possible node. It is populated with the CPUs onlined at boot time, and then when a CPU is hot plugged to a node. The bits in that mask remain set when the CPU is hot unplugged, to record that these CPU ids have been used for this node. If no id set was found, a retry is made without removing the ids used on the other nodes to try reusing them. This is the way ids have been allocated prior to this patch. The effect of this patch can be seen by removing and adding CPUs using the Qemu monitor. In the following case, the first CPU from the node 2 is removed, then the first one from the node 1 is removed too. Later, the first CPU of the node 2 is added back. Without that patch, the kernel will number these CPUs using the first CPU ids available, which are the ones freed when removing the first CPU of the node 1. This leads to the CPU ids 16-23 moving from the node 1 to the node 2. With the patch applied, the CPU ids 32-39 are used since they are the lowest free ones which have not been used on another node. At boot time: [root@vm40 ~]# numactl -H | grep cpus node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 Vanilla kernel, after the CPU hot unplug/plug operations: [root@vm40 ~]# numactl -H | grep cpus node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 1 cpus: 24 25 26 27 28 29 30 31 node 2 cpus: 16 17 18 19 20 21 22 23 40 41 42 43 44 45 46 47 Patched kernel, after the CPU hot unplug/plug operations: [root@vm40 ~]# numactl -H | grep cpus node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 1 cpus: 24 25 26 27 28 29 30 31 node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210429174908.16613-1-ldufour@linux.ibm.com	2021-08-10 23:14:55 +10:00
Laurent Dufour	d144f4d5a8	pseries/drmem: update LMBs after LPM After an LPM, the device tree node ibm,dynamic-reconfiguration-memory may be updated by the hypervisor in the case the NUMA topology of the LPAR's memory is updated. This is handled by the kernel, but the memory's node is not updated because there is no way to move a memory block between nodes from the Linux kernel point of view. If a memory block is later added or removed, drmem_update_dt() is called and overwrites the DT node ibm,dynamic-reconfiguration-memory to match the added or removed LMB. But the LMB's associativity node has not been updated after the DT node update, and thus the node is overwritten with Linux's topology instead of the hypervisor's. Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is updated to force an update of the LMB's associativity. However, ignore the call to that hook when the update has been triggered by drmem_update_dt(), because in that case the LMB tree has been used to set the DT property and thus it doesn't need to be updated back. Since drmem_update_dt() is called under the protection of the device_hotplug_lock and the hook is called in the same context, use a simple boolean variable to detect that call. Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210517090606.56930-1-ldufour@linux.ibm.com	2021-08-10 23:14:55 +10:00
Laurent Dufour	9c7248bb8d	powerpc/numa: Consider the max NUMA node for migratable LPAR When an LPAR is migratable, we should consider the maximum possible NUMA node instead of the number of NUMA nodes from the actual system. The DT property 'ibm,current-associativity-domains' defines the maximum number of nodes the LPAR can see when running on that box. But if the LPAR is being migrated to another box, it may see up to the nodes defined by 'ibm,max-associativity-domains'. So if an LPAR is migratable, that value should be used. Unfortunately, there is no easy way to know if an LPAR is migratable or not. The hypervisor exports the property 'ibm,migratable-partition' in the case it is set to migrate partitions, but that would not mean that the current partition is migratable. Without this patch, when an LPAR is started on a 2 node box and then migrated to a 3 node box, the hypervisor may spread the LPAR's CPUs on the 3rd node. In that case if a CPU from that 3rd node is added to the LPAR, it will be wrongly assigned to the node because the kernel has been set to use up to 2 nodes (the configuration of the departure node). With this patch applied, the CPU is correctly added to the 3rd node. Fixes: `f9f130ff2e` ("powerpc/numa: Detect support for coregroup") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210511073136.17795-1-ldufour@linux.ibm.com	2021-08-10 23:14:55 +10:00
Christophe Leroy	c8a6d91005	powerpc/non-smp: Unconditionaly call smp_mb() on switch_mm Commit `3ccfebedd8` ("powerpc, membarrier: Skip memory barrier in switch_mm()") added some logic to skip the smp_mb() in switch_mm_irqs_off() before the call to switch_mmu_context(). However, on non-SMP, smp_mb() is just a compiler barrier and doing it unconditionally is simpler than the logic used to check whether the barrier is needed or not. After the patch: 00000000 <switch_mm_irqs_off>: ... c: 7c 04 18 40 cmplw r4,r3 10: 81 24 00 24 lwz r9,36(r4) 14: 91 25 04 c8 stw r9,1224(r5) 18: 4d 82 00 20 beqlr 1c: 48 00 00 00 b 1c <switch_mm_irqs_off+0x1c> 1c: R_PPC_REL24 switch_mmu_context Before the patch: 00000000 <switch_mm_irqs_off>: ... c: 7c 04 18 40 cmplw r4,r3 10: 81 24 00 24 lwz r9,36(r4) 14: 91 25 04 c8 stw r9,1224(r5) 18: 4d 82 00 20 beqlr 1c: 81 24 00 28 lwz r9,40(r4) 20: 71 29 00 0a andi. r9,r9,10 24: 40 82 00 34 bne 58 <switch_mm_irqs_off+0x58> 28: 48 00 00 00 b 28 <switch_mm_irqs_off+0x28> 28: R_PPC_REL24 switch_mmu_context ... 58: 2c 03 00 00 cmpwi r3,0 5c: 41 82 ff cc beq 28 <switch_mm_irqs_off+0x28> 60: 48 00 00 00 b 60 <switch_mm_irqs_off+0x60> 60: R_PPC_REL24 switch_mmu_context Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e9d501da0c59f60ca767b1b3ea4603fce6d02b9e.1625486440.git.christophe.leroy@csgroup.eu	2021-08-10 23:14:55 +10:00
Christophe Leroy	09ca497528	powerpc: Remove in_kernel_text() Last user of in_kernel_text() stopped using it with commit `549e8152de` ("powerpc: Make the 64-bit kernel as a position-independent executable"). Generic function is_kernel_text() does the same. So remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2a3a5b6f8cc0ef4e854d7b764f66aa8d2ee270d2.1624813698.git.christophe.leroy@csgroup.eu	2021-08-10 23:14:55 +10:00
Nicholas Piggin	cf9c615cde	powerpc/64s/perf: Always use SIAR for kernel interrupts If an interrupt is taken in kernel mode, always use SIAR for it rather than looking at regs_sipr. This prevents samples piling up around interrupt enable (hard enable or interrupt replay via soft enable) in PMUs / modes where the PR sample indication is not in sync with SIAR. This results in better sampling of interrupt entry and exit in particular. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210720141504.420110-1-npiggin@gmail.com	2021-08-04 10:53:39 +10:00
Parth Shah	e9ef81e107	powerpc/smp: Use existing L2 cache_map cpumask to find L3 cache siblings On POWER10 systems, the "ibm,thread-groups" property "2" indicates the cpus in thread-group share both L2 and L3 caches. Hence, use cache_property = 2 itself to find both the L2 and L3 cache siblings. Hence, create a new thread_group_l3_cache_map to keep list of L3 siblings, but fill the mask using same property "2" array. Signed-off-by: Parth Shah <parth@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210728175607.591679-4-parth@linux.ibm.com	2021-08-04 10:53:39 +10:00
Gautham R. Shenoy	69aa8e0785	powerpc/cacheinfo: Remove the redundant get_shared_cpu_map() The helper function get_shared_cpu_map() was added in 'commit `500fe5f550` ("powerpc/cacheinfo: Report the correct shared_cpu_map on big-cores")' and subsequently expanded upon in 'commit `0be47634db` ("powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cache")' in order to help report the correct groups of threads sharing these caches on big-core systems where groups of threads within a core can share different sets of caches. Now that powerpc/cacheinfo is aware of "ibm,thread-groups" property, cache->shared_cpu_map contains the correct set of thread-siblings sharing the cache. Hence we no longer need the function get_shared_cpu_map(). This patch removes this function. We also remove the helper function index_dir_to_cpu() which was only called by get_shared_cpu_map(). With these functions removed, we can still see the correct cache-sibling map/list for L1 and L2 caches on systems with L1 and L2 caches distributed among groups of threads in a core. With this patch, on a SMT8 POWER10 system where the L1 and L2 caches are split between the two groups of threads in a core, for CPUs 8,9, the L1-Data, L1-Instruction, L2, L3 cache CPU sibling list is as follows: $ grep . /sys/devices/system/cpu/cpu[89]/cache/index[0123]/shared_cpu_list /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8,10,12,14 /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8,10,12,14 /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8,10,12,14 /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8-15 /sys/devices/system/cpu/cpu9/cache/index0/shared_cpu_list:9,11,13,15 /sys/devices/system/cpu/cpu9/cache/index1/shared_cpu_list:9,11,13,15 /sys/devices/system/cpu/cpu9/cache/index2/shared_cpu_list:9,11,13,15 /sys/devices/system/cpu/cpu9/cache/index3/shared_cpu_list:8-15 $ ppc64_cpu --smt=4 $ grep . /sys/devices/system/cpu/cpu[89]/cache/index[0123]/shared_cpu_list /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8,10 /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8,10 /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8,10 /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8-11 /sys/devices/system/cpu/cpu9/cache/index0/shared_cpu_list:9,11 /sys/devices/system/cpu/cpu9/cache/index1/shared_cpu_list:9,11 /sys/devices/system/cpu/cpu9/cache/index2/shared_cpu_list:9,11 /sys/devices/system/cpu/cpu9/cache/index3/shared_cpu_list:8-11 $ ppc64_cpu --smt=2 $ grep . /sys/devices/system/cpu/cpu[89]/cache/index[0123]/shared_cpu_list /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8 /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8 /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8 /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8-9 /sys/devices/system/cpu/cpu9/cache/index0/shared_cpu_list:9 /sys/devices/system/cpu/cpu9/cache/index1/shared_cpu_list:9 /sys/devices/system/cpu/cpu9/cache/index2/shared_cpu_list:9 /sys/devices/system/cpu/cpu9/cache/index3/shared_cpu_list:8-9 $ ppc64_cpu --smt=1 $ grep . /sys/devices/system/cpu/cpu[89]/cache/index[0123]/shared_cpu_list /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8 /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8 /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8 /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210728175607.591679-3-parth@linux.ibm.com	2021-08-04 10:53:39 +10:00
Gautham R. Shenoy	a4bec516b9	powerpc/cacheinfo: Lookup cache by dt node and thread-group id Currently the cacheinfo code on powerpc indexes the "cache" objects (modelling the L1/L2/L3 caches) where the key is the device-tree node corresponding to that cache. On some of the POWER server platforms, thread-groups within the core share different sets of caches (e.g. on SMT8 POWER9 systems, threads 0,2,4,6 of a core share L1 cache and threads 1,3,5,7 of the same core share another L1 cache). On such platforms, there is a single device-tree node corresponding to that cache and the cache-configuration within the threads of the core is indicated via the "ibm,thread-groups" device-tree property. Since the current code is not aware of the "ibm,thread-groups" property, on the aforementioned systems, the cacheinfo code still treats all the threads in the core as sharing the cache because of the single device-tree node (in the earlier example, the cacheinfo code would say CPUs 0-7 share L1 cache). In this patch, we make the powerpc cacheinfo code aware of the "ibm,thread-groups" property. We index the "cache" objects by the key-pair (device-tree node, thread-group id). For any CPUX, for a given level of cache, the thread-group id is defined to be the first CPU in the "ibm,thread-groups" cache-group containing CPUX. For levels of cache which are not represented in the "ibm,thread-groups" property, the thread-group id is -1. [parth: Remove "static" keyword for the definition of "thread_group_l1_cache_map" and "thread_group_l2_cache_map" to get rid of the compile error.] Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Parth Shah <parth@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210728175607.591679-2-parth@linux.ibm.com	2021-08-04 10:53:39 +10:00
Masahiro Yamada	86ff0bce2e	powerpc: move the install rule to arch/powerpc/Makefile Currently, the install target in arch/powerpc/Makefile descends into arch/powerpc/boot/Makefile to invoke the shell script, but there is no good reason to do so. arch/powerpc/Makefile can run the shell script directly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729141937.445051-3-masahiroy@kernel.org	2021-08-04 10:53:39 +10:00
Masahiro Yamada	9bef456b20	powerpc: make the install target not depend on any build artifact The install target should not depend on any build artifact. The reason is explained in commit `19514fc665` ("arm, kbuild: make "make install" not depend on vmlinux"). Change the PowerPC installation code in a similar way. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729141937.445051-2-masahiroy@kernel.org	2021-08-04 10:53:38 +10:00
Masahiro Yamada	156ca4e650	powerpc: remove unused zInstall target from arch/powerpc/boot/Makefile Commit `c913e5f95e` ("powerpc/boot: Don't install zImage.* from make install") added the zInstall target to arch/powerpc/boot/Makefile, but you cannot use it since the corresponding hook is missing in arch/powerpc/Makefile. It has never worked since its addition. Nobody has complained about it for 7 years, which means this code was unneeded. With this removal, the install.sh will be passed in with 4 parameters. Simplify the shell script. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729141937.445051-1-masahiroy@kernel.org	2021-08-04 10:53:38 +10:00
Nathan Chancellor	d04691d373	cpuidle: pseries: Mark pseries_idle_proble() as __init After commit 7cbd631d4dec ("cpuidle: pseries: Fixup CEDE0 latency only for POWER10 onwards"), pseries_idle_probe() is no longer inlined when compiling with clang, which causes a modpost warning: WARNING: modpost: vmlinux.o(.text+0xc86a54): Section mismatch in reference from the function pseries_idle_probe() to the function .init.text:fixup_cede0_latency() The function pseries_idle_probe() references the function __init fixup_cede0_latency(). This is often because pseries_idle_probe lacks a __init annotation or the annotation of fixup_cede0_latency is wrong. pseries_idle_probe() is a non-init function, which calls fixup_cede0_latency(), which is an init function, explaining the mismatch. pseries_idle_probe() is only called from pseries_processor_idle_init(), which is an init function, so mark pseries_idle_probe() as __init so there is no more warning. Fixes: `054e44ba99` ("cpuidle: pseries: Add function to parse extended CEDE records") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210803211547.1093820-1-nathan@kernel.org	2021-08-04 10:53:38 +10:00
Michal Suchanek	a6cae77f1b	powerpc/stacktrace: Include linux/delay.h Commit `7c6986ade6` ("powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()") introduces a udelay() call without including the linux/delay.h header. This may happen to work on master but the header that declares the function should be included nonetheless. Fixes: `7c6986ade6` ("powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729180103.15578-1-msuchanek@suse.de	2021-08-03 22:33:37 +10:00
Gautham R. Shenoy	71737a6c2a	cpuidle: pseries: Do not cap the CEDE0 latency in fixup_cede0_latency() Currently in the fixup_cede0_latency() code, we perform the fixup of the CEDE(0) exit latency value only if the minimum advertised extended CEDE latency values are less than 10us. This was done so as to not break the expected behaviour on POWER8 platforms where the advertised latency was higher than the default 10us, which would delay the SMT folding on the core. However, after the earlier patch "cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards", we can be sure that the fixup of CEDE0 latency is going to happen only from POWER10 onwards. Hence unconditionally use the minimum exit latency provided by the platform. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1626676399-15975-3-git-send-email-ego@linux.vnet.ibm.com	2021-08-03 22:33:37 +10:00
Gautham R. Shenoy	50741b70b0	cpuidle: pseries: Fixup CEDE0 latency only for POWER10 onwards Commit `d947fb4c96` ("cpuidle: pseries: Fixup exit latency for CEDE(0)") sets the exit latency of CEDE(0) based on the latency values of the Extended CEDE states advertised by the platform On POWER9 LPARs, the firmwares advertise a very low value of 2us for CEDE1 exit latency on a Dedicated LPAR. The latency advertized by the PHYP hypervisor corresponds to the latency required to wakeup from the underlying hardware idle state. However the wakeup latency from the LPAR perspective should include 1. The time taken to transition the CPU from the Hypervisor into the LPAR post wakeup from platform idle state 2. Time taken to send the IPI from the source CPU (waker) to the idle target CPU (wakee). 1. can be measured via timer idle test, where we queue a timer, say for 1ms, and enter the CEDE state. When the timer fires, in the timer handler we compute how much extra timer over the expected 1ms have we consumed. On a a POWER9 LPAR the numbers are CEDE latency measured using a timer (numbers in ns) N Min Median Avg 90%ile 99%ile Max Stddev 400 2601 5677 5668.74 5917 6413 9299 455.01 1. and 2. combined can be determined by an IPI latency test where we send an IPI to an idle CPU and in the handler compute the time difference between when the IPI was sent and when the handler ran. We see the following numbers on POWER9 LPAR. CEDE latency measured using an IPI (numbers in ns) N Min Median Avg 90%ile 99%ile Max Stddev 400 711 7564 7369.43 8559 9514 9698 1200.01 Suppose, we consider the 99th percentile latency value measured using the IPI to be the wakeup latency, the value would be 9.5us This is in the ballpark of the default value of 10us. Hence, use the exit latency of CEDE(0) based on the latency values advertized by platform only from POWER10 onwards. The values advertized on POWER10 platforms is more realistic and informed by the latency measurements. 
For earlier platforms stick to the default value of 10us. The fix was suggested by Michael Ellerman. Fixes: `d947fb4c96` ("cpuidle: pseries: Fixup exit latency for CEDE(0)") Reported-by: Enrico Joedecke <joedecke@de.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1626676399-15975-2-git-send-email-ego@linux.vnet.ibm.com	2021-08-03 22:33:19 +10:00
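The selection rule described in the two cpuidle commits above can be modelled in a few lines. This is only a sketch in Python, not the kernel code: the function name, the `cap_at_default` flag, and the list of advertised latencies are hypothetical, introduced here to contrast the capped behaviour with the uncapped one.

```python
from typing import Sequence

DEFAULT_CEDE0_LATENCY_US = 10  # default exit latency named in the commit messages


def cede0_exit_latency_us(
    power10_or_later: bool,
    advertised_us: Sequence[int],
    cap_at_default: bool = False,
) -> int:
    """Pick a CEDE(0) exit latency in microseconds (illustrative model only)."""
    # Before POWER10 the advertised values are ignored and the default is kept.
    if not power10_or_later or not advertised_us:
        return DEFAULT_CEDE0_LATENCY_US

    lowest = min(advertised_us)

    # Capped variant: apply the fixup only when it lowers the latency.
    if cap_at_default and lowest >= DEFAULT_CEDE0_LATENCY_US:
        return DEFAULT_CEDE0_LATENCY_US

    # Uncapped variant: always trust the platform's minimum.
    return lowest
```

With `cap_at_default=False` the function follows the later commit (use the platform minimum unconditionally on POWER10 and newer); with `cap_at_default=True` it follows the earlier "less than 10us" condition.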
Hari Bathini	8119cefd9a	powerpc/kexec: blacklist functions called in real mode for kprobe As kprobe does not handle events happening in real mode, blacklist the functions that only get called in real mode or in kexec sequence with MMU turned off. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/162626687834.155313.4692863392927831843.stgit@hbathini-workstation.ibm.com	2021-07-26 20:38:51 +10:00
Michael Ellerman	e1ab9a730b	Merge branch 'fixes' into next Merge our fixes branch, which contains some fixes that didn't make it into rc2 but which we'd like in next.	2021-07-26 20:37:53 +10:00
Nicholas Piggin	d9c57d3ed5	KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state The H_ENTER_NESTED hypercall is handled by the L0, and it is a request by the L1 to switch the context of the vCPU over to that of its L2 guest, and return with an interrupt indication. The L1 is responsible for switching some registers to guest context, and the L0 switches others (including all the hypervisor privileged state). If the L2 MSR has TM active, then the L1 is responsible for recheckpointing the L2 TM state. Then the L1 exits to L0 via the H_ENTER_NESTED hcall, and the L0 saves the TM state as part of the exit, and then it recheckpoints the TM state as part of the nested entry and finally HRFIDs into the L2 with TM active MSR. Not efficient, but about the simplest approach for something that's horrendously complicated. Problems arise if the L1 exits to the L0 with a TM state which does not match the L2 TM state being requested. For example if the L1 is transactional but the L2 MSR is non-transactional, or vice versa. The L0's HRFID can take a TM Bad Thing interrupt and crash. Fix this by disallowing H_ENTER_NESTED in TM[T] state entirely, and then ensuring that if the L1 is suspended then the L2 must have TM active, and if the L1 is not suspended then the L2 must not have TM active. Fixes: `360cae3137` ("KVM: PPC: Book3S HV: Nested guest entry via hypercall") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2021-07-23 16:19:38 +10:00
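The consistency rule stated in the H_ENTER_NESTED commit above can be written as a small predicate. This is a hedged sketch, not the KVM implementation: the `TMState` enum, its numeric values, and the function name are hypothetical, and "TM active" is assumed here to mean any state other than non-transactional.

```python
from enum import Enum


class TMState(Enum):
    # Placeholder values for illustration; not the MSR bit encoding.
    NON_TRANSACTIONAL = 0
    SUSPENDED = 1
    TRANSACTIONAL = 2


def h_enter_nested_tm_ok(l1: TMState, l2: TMState) -> bool:
    """Return True if the L1/L2 TM states are a combination the fix allows."""
    # The hcall is disallowed entirely while the L1 is transactional (TM[T]).
    if l1 is TMState.TRANSACTIONAL:
        return False

    # Assumption: "TM active" covers both suspended and transactional.
    l2_tm_active = l2 is not TMState.NON_TRANSACTIONAL

    # L1 suspended requires L2 TM active; L1 not suspended requires the opposite.
    return (l1 is TMState.SUSPENDED) == l2_tm_active
```

Any combination rejected by this predicate corresponds to a mismatch that, per the commit message, could lead the L0's HRFID to take a TM Bad Thing interrupt.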
Nicholas Piggin	f62f3c2064	KVM: PPC: Book3S: Fix H_RTAS rets buffer overflow The kvmppc_rtas_hcall() sets the host rtas_args.rets pointer based on the rtas_args.nargs that was provided by the guest. That guest nargs value is not range checked, so the guest can cause the host rets pointer to be pointed outside the args array. The individual rtas function handlers check the nargs and nrets values to ensure they are correct, but if they are not, the handlers store a -3 (0xfffffffd) failure indication in rets[0] which corrupts host memory. Fix this by testing up front whether the guest supplied nargs and nret would exceed the array size, and fail the hcall directly without storing a failure indication to rets[0]. Also expand on a comment about why we kill the guest and try not to return errors directly if we have a valid rets[0] pointer. Fixes: `8e591cb720` ("KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls") Cc: stable@vger.kernel.org # v3.10+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2021-07-23 16:14:31 +10:00
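The up-front range check described in the H_RTAS commit above might look roughly like the following model. It is a sketch under stated assumptions: the array length of 16 and the function name `rets_index` are hypothetical, and the real code works on big-endian fields and C pointers rather than Python integers.

```python
from typing import Optional

ARGS_LEN = 16  # assumed size of the args array; the commit message does not state it


def rets_index(nargs: int, nrets: int, args_len: int = ARGS_LEN) -> Optional[int]:
    """Return the index of rets[0] inside the args array, or None to reject the call."""
    if nargs < 0 or nrets < 0:
        return None

    # rets starts right after the arguments, so both regions must fit in the array.
    if nargs + nrets > args_len:
        return None

    # Keep room for rets[0] even when nrets is 0, since a handler may store
    # a failure indication there.
    if nargs >= args_len:
        return None

    return nargs
```

Returning `None` stands for failing the hcall directly, before anything is written through a rets pointer that could lie outside the array.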
Linus Torvalds	2734d6c1b1	Linux 5.14-rc2	2021-07-18 14:13:49 -07:00
Linus Torvalds	8c25c44764	perf tools fixes for v5.14: 1st batch - Skip invalid hybrid PMU on hybrid systems when the atom (little) CPUs are offlined. - Fix 'perf test' problems related to the recently added hybrid (BIG/little) code. - Split ARM's coresight (hw tracing) decode by aux records to avoid fatal decoding errors. - Fix add event failure in 'perf probe' when running 32-bit perf in a 64-bit kernel. - Fix 'perf sched record' failure when CONFIG_SCHEDSTATS is not set. - Fix memory and refcount leaks detected by ASAn when running 'perf test', should be clean of warnings now. - Remove broken definition of __LITTLE_ENDIAN from tools' linux/kconfig.h, which was breaking the build in some systems. - Cast PTHREAD_STACK_MIN to int as it may turn into 'long sysconf(__SC_THREAD_STACK_MIN_VALUE), breaking the build in some systems. - Fix libperf build error with LIBPFM4=1. - Sync UAPI files changed by the memfd_secret new syscall. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Merge tag 'perf-tools-fixes-for-v5.14-2021-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Skip invalid hybrid PMU on hybrid systems when the atom (little) CPUs are offlined. - Fix 'perf test' problems related to the recently added hybrid (BIG/little) code. - Split ARM's coresight (hw tracing) decode by aux records to avoid fatal decoding errors. - Fix add event failure in 'perf probe' when running 32-bit perf in a 64-bit kernel. - Fix 'perf sched record' failure when CONFIG_SCHEDSTATS is not set. - Fix memory and refcount leaks detected by ASAn when running 'perf test', should be clean of warnings now. 
- Remove broken definition of __LITTLE_ENDIAN from tools' linux/kconfig.h, which was breaking the build in some systems. - Cast PTHREAD_STACK_MIN to int as it may turn into 'long sysconf(__SC_THREAD_STACK_MIN_VALUE), breaking the build in some systems. - Fix libperf build error with LIBPFM4=1. - Sync UAPI files changed by the memfd_secret new syscall. * tag 'perf-tools-fixes-for-v5.14-2021-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (35 commits) perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel perf data: Close all files in close_dir() perf probe-file: Delete namelist in del_events() on the error path perf test bpf: Free obj_buf perf trace: Free strings in trace__parse_events_option() perf trace: Free syscall tp fields in evsel->priv perf trace: Free syscall->arg_fmt perf trace: Free malloc'd trace fields on exit perf lzma: Close lzma stream on exit perf script: Fix memory 'threads' and 'cpus' leaks on exit perf script: Release zstd data perf session: Cleanup trace_event perf inject: Close inject.output on exit perf report: Free generated help strings for sort option perf env: Fix memory leak of cpu_pmu_caps perf test maps__merge_in: Fix memory leak of maps perf dso: Fix memory leak in dso__new_map() perf test event_update: Fix memory leak of unit perf test event_update: Fix memory leak of evlist ...	2021-07-18 12:20:27 -07:00
Linus Torvalds	f0eb870a84	Fixes for 5.14-rc: * Fix shrink eligibility checking when sparse inode clusters enabled. * Reset '..' directory entries when unlinking directories to prevent verifier errors if fs is shrinked later. * Don't report unusable extent size hints to FSGETXATTR. * Don't warn when extent size hints are unusable because the sysadmin configured them that way. * Fix insufficient parameter validation in GROWFSRT ioctl. * Fix integer overflow when adding rt volumes to filesystem. Merge tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "A few fixes for issues in the new online shrink code, additional corrections for my recent bug-hunt w.r.t. extent size hints on realtime, and improved input checking of the GROWFSRT ioctl. IOW, the usual 'I somehow got bored during the merge window and resumed auditing the farther reaches of xfs': - Fix shrink eligibility checking when sparse inode clusters enabled - Reset '..' 
directory entries when unlinking directories to prevent verifier errors if fs is shrinked later - Don't report unusable extent size hints to FSGETXATTR - Don't warn when extent size hints are unusable because the sysadmin configured them that way - Fix insufficient parameter validation in GROWFSRT ioctl - Fix integer overflow when adding rt volumes to filesystem" * tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: detect misaligned rtinherit directory extent size hints xfs: fix an integer overflow error in xfs_growfs_rt xfs: improve FSGROWFSRT precondition checking xfs: don't expose misaligned extszinherit hints to userspace xfs: correct the narrative around misaligned rtinherit/extszinherit dirs xfs: reset child dir '..' entry when unlinking child xfs: check for sparse inode clusters that cross new EOAG when shrinking	2021-07-18 11:27:25 -07:00
Linus Torvalds	fbf1bddc4e	Fixes for 5.14-rc: * Fix KASAN warnings due to integer overflow in SEEK_DATA/SEEK_HOLE. * Fix assertion errors when using inlinedata files on gfs2. Merge tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull iomap fixes from Darrick Wong: "A handful of bugfixes for the iomap code. There's nothing especially exciting here, just fixes for UBSAN (not KASAN as I erroneously wrote in the tag message) warnings about undefined behavior in the SEEK_DATA/SEEK_HOLE code, and some reshuffling of per-page block state info to fix some problems with gfs2. - Fix KASAN warnings due to integer overflow in SEEK_DATA/SEEK_HOLE - Fix assertion errors when using inlinedata files on gfs2" * tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: Don't create iomap_page objects in iomap_page_mkwrite_actor iomap: Don't create iomap_page objects for inline files iomap: Permit pages without an iop to enter writeback iomap: remove the length variable in iomap_seek_hole iomap: remove the length variable in iomap_seek_data	2021-07-18 11:17:06 -07:00
Linus Torvalds	6750691a82	Kbuild fixes for v5.14 - Restore the original behavior of scripts/setlocalversion when LOCALVERSION is set to empty. - Show Kconfig prompts even for 'make -s' - Fix the combination of CONFIG_LTO_CLANG=y and CONFIG_MODVERSIONS=y for older GNU Make versions Merge tag 'kbuild-fixes-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Restore the original behavior of scripts/setlocalversion when LOCALVERSION is set to empty. - Show Kconfig prompts even for 'make -s' - Fix the combination of CONFIG_LTO_CLANG=y and CONFIG_MODVERSIONS=y for older GNU Make versions * tag 'kbuild-fixes-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: Documentation: Fix intiramfs script name Kbuild: lto: fix module versionings mismatch in GNU make 3.X kbuild: do not suppress Kconfig prompts for silent build scripts/setlocalversion: fix a bug when LOCALVERSION is empty	2021-07-18 11:10:30 -07:00
Robert Richter	5e60f363b3	Documentation: Fix intiramfs script name Documentation was not changed when renaming the script in commit `80e715a06c` ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh"). Fixing this. Basically does: $ sed -i -e s/gen_initramfs_list.sh/gen_initramfs.sh/g $(git grep -l gen_initramfs_list.sh) Fixes: `80e715a06c` ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh") Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2021-07-18 23:48:14 +09:00
Lecopzer Chen	1d11053dc6	Kbuild: lto: fix module versionings mismatch in GNU make 3.X When building modules (CONFIG_...=m), I found that some module versions are incorrect and set to 0. This can be seen in the build log of the first clean build, which shows WARNING: EXPORT symbol "XXXX" [drivers/XXX/XXX.ko] version generation failed, symbol will not be versioned. But in the second (incremental) build, the WARNING disappears and the module version becomes a valid CRC, so anyone who wants to change modules without updating the kernel image can't insert their modules. The problematic code is + $(foreach n, $(filter-out FORCE,$^), \ + $(if $(wildcard $(n).symversions), \ + ; cat $(n).symversions >> $@.symversions)) For example: rm -f fs/notify/built-in.a.symversions ; rm -f fs/notify/built-in.a; \ llvm-ar cDPrST fs/notify/built-in.a fs/notify/fsnotify.o \ fs/notify/notification.o fs/notify/group.o ... `foreach n` finds nothing to `cat` because `if $(wildcard $(n).symversions)` returns nothing, although the files do exist by the time this line is executed. -rw-r--r-- 1 root root 168580 Jun 13 19:10 fs/notify/fsnotify.o -rw-r--r-- 1 root root 111 Jun 13 19:10 fs/notify/fsnotify.o.symversions The reason is that the $(n).symversions files are generated at runtime, but the Makefile wildcard function expands and checks whether the file exists while parsing the Makefile. Fix this by using the `test` shell command to check for the file's existence at runtime. Rebased from both: 1. [https://lore.kernel.org/lkml/20210616080252.32046-1-lecopzer.chen@mediatek.com/] 2. [https://lore.kernel.org/lkml/20210702032943.7865-1-lecopzer.chen@mediatek.com/] Fixes: `38e8918490` ("kbuild: lto: fix module versioning") Co-developed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2021-07-18 23:48:14 +09:00
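The timing problem this commit message describes (an existence check evaluated before the files are generated, versus one evaluated when the recipe runs) can be reproduced outside of make. The following Python sketch only models that ordering; it does not emulate GNU make, and the function and file names are made up for illustration.

```python
import os
import tempfile
from typing import List


def build(check_at_parse_time: bool) -> List[str]:
    """Return the objects whose .symversions files would be concatenated."""
    with tempfile.TemporaryDirectory() as tmp:
        objs = [os.path.join(tmp, name) for name in ("fsnotify.o", "group.o")]

        # "Parse" phase: nothing has been built yet, so an existence check
        # made here (like the $(wildcard ...) in the message) finds nothing.
        found_at_parse = [o for o in objs if os.path.exists(o + ".symversions")]

        # "Run" phase: earlier build steps generate the .symversions files.
        for o in objs:
            with open(o + ".symversions", "w") as f:
                f.write("symbol versions\n")

        if check_at_parse_time:
            chosen = found_at_parse
        else:
            # Check again when the step actually runs, like a shell `test`.
            chosen = [o for o in objs if os.path.exists(o + ".symversions")]

        return [os.path.basename(o) for o in chosen]
```

Calling `build(True)` models the failing first build, where no symbol versions are collected; `build(False)` models the fixed behaviour.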
Masahiro Yamada	d952cfaf0c	kbuild: do not suppress Kconfig prompts for silent build When a new CONFIG option is available, Kbuild shows a prompt to get the user input. $ make [ snip ] Core Scheduling for SMT (SCHED_CORE) [N/y/?] (NEW) This is the only interactive place in the build process. Commit `174a1dcc96` ("kbuild: sink stdout from cmd for silent build") suppressed Kconfig prompts as well because syncconfig is invoked by the 'cmd' macro. You cannot notice the fact that Kconfig is waiting for the user input. Use 'kecho' to show the equivalent short log without suppressing stdout from sub-make. Fixes: `174a1dcc96` ("kbuild: sink stdout from cmd for silent build") Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	2021-07-18 23:48:14 +09:00
Mikulas Patocka	5df99bec21	scripts/setlocalversion: fix a bug when LOCALVERSION is empty The commit `042da426f8` ("scripts/setlocalversion: simplify the short version part") reduces indentation. Unfortunately, it also changes behavior in a subtle way - if the user has empty "LOCALVERSION" variable, the plus sign is appended to the kernel version. It wasn't appended before. This patch reverts to the old behavior - we append the plus sign only if the LOCALVERSION variable is not set. Fixes: `042da426f8` ("scripts/setlocalversion: simplify the short version part") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2021-07-18 23:48:14 +09:00
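The distinction this fix relies on is between a variable that is unset and one that is set to the empty string. A minimal Python sketch of that distinction follows; it ignores every other condition the real script checks, and both function names are hypothetical.

```python
from typing import Mapping


def plus_suffix_fixed(env: Mapping[str, str]) -> str:
    """Append '+' only when LOCALVERSION is not set at all."""
    # Unset means the key is missing; empty means the key exists with value "".
    return "+" if "LOCALVERSION" not in env else ""


def plus_suffix_regressed(env: Mapping[str, str]) -> str:
    """Model of the regression: an empty value is treated like an unset one."""
    return "+" if not env.get("LOCALVERSION") else ""
```

For `{"LOCALVERSION": ""}` the first function returns an empty string while the second returns `"+"`, which is the behavioural difference the commit message reports.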
Yang Jihong	b0f008551f	perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set The tracepoints trace_sched_stat_{wait, sleep, iowait} are not exposed to user if CONFIG_SCHEDSTATS is not set, "perf sched record" records the three events. As a result, the command fails. Before: #perf sched record sleep 1 event syntax error: 'sched:sched_stat_wait' \___ unknown tracepoint Error: File /sys/kernel/tracing/events/sched/sched_stat_wait not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?. Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events Solution: Check whether schedstat tracepoints are exposed. If no, these events are not recorded. After: # perf sched record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.163 MB perf.data (1091 samples) ] # perf sched report run measurement overhead: 4736 nsecs sleep measurement overhead: 9059979 nsecs the run test took 999854 nsecs the sleep test took 8945271 nsecs nr_run_events: 716 nr_sleep_events: 785 nr_wakeup_events: 0 ... ------------------------------------------------------------ Fixes: `2a09b5de23` ("sched/fair: do not expose some tracepoints to user if CONFIG_SCHEDSTATS is not set") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Yafang Shao <laoar.shao@gmail.com> Link: http://lore.kernel.org/lkml/20210713112358.194693-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-18 09:36:37 -03:00
Yang Jihong	22a665513b	perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel The "address" member of "struct probe_trace_point" uses long data type. If kernel is 64-bit and perf program is 32-bit, size of "address" variable is 32 bits. As a result, upper 32 bits of address read from kernel are truncated, an error occurs during address comparison in kprobe_warn_out_range(). Before: # perf probe -a schedule schedule is out of .text, skip it. Error: Failed to add events. Solution: Change data type of "address" variable to u64 and change corresponding address printing and value assignment. After: # perf.new.new probe -a schedule Added new event: probe:schedule (on schedule) You can now use it in all perf tools, such as: perf record -e probe:schedule -aR sleep 1 # perf probe -l probe:schedule (on schedule@kernel/sched/core.c) # perf record -e probe:schedule -aR sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.156 MB perf.data (1366 samples) ] # perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 1K of event 'probe:schedule' # Event count (approx.): 1366 # # Overhead Command Shared Object Symbol # ........ ............... ................. ............ 
# 6.22% migration/0 [kernel.kallsyms] [k] schedule 6.22% migration/1 [kernel.kallsyms] [k] schedule 6.22% migration/2 [kernel.kallsyms] [k] schedule 6.22% migration/3 [kernel.kallsyms] [k] schedule 6.15% migration/10 [kernel.kallsyms] [k] schedule 6.15% migration/11 [kernel.kallsyms] [k] schedule 6.15% migration/12 [kernel.kallsyms] [k] schedule 6.15% migration/13 [kernel.kallsyms] [k] schedule 6.15% migration/14 [kernel.kallsyms] [k] schedule 6.15% migration/15 [kernel.kallsyms] [k] schedule 6.15% migration/4 [kernel.kallsyms] [k] schedule 6.15% migration/5 [kernel.kallsyms] [k] schedule 6.15% migration/6 [kernel.kallsyms] [k] schedule 6.15% migration/7 [kernel.kallsyms] [k] schedule 6.15% migration/8 [kernel.kallsyms] [k] schedule 6.15% migration/9 [kernel.kallsyms] [k] schedule 0.22% rcu_sched [kernel.kallsyms] [k] schedule ... # # (Cannot load tips.txt file, please install perf!) # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jianlin Lv <jianlin.lv@arm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lore.kernel.org/lkml/20210715063723.11926-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-18 09:31:15 -03:00
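The truncation described in this perf probe commit is easy to reproduce with plain integer arithmetic. The sketch below uses Python masks to stand in for a 32-bit `long`; the addresses and the text-section bounds are invented for illustration and do not come from the commit.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical 64-bit kernel text range and symbol address.
STEXT = 0xFFFF800010000000
ETEXT = 0xFFFF800011000000
SCHEDULE_ADDR = 0xFFFF800010ABCDEF


def as_32bit(addr: int) -> int:
    """Keep only the low 32 bits, as storing into a 32-bit variable would."""
    return addr & MASK32


def in_text(addr: int, stext: int = STEXT, etext: int = ETEXT) -> bool:
    """Range check comparable to an 'is this address inside .text' test."""
    return stext <= addr < etext
```

Here `in_text(SCHEDULE_ADDR)` holds, but `in_text(as_32bit(SCHEDULE_ADDR))` does not, which matches the "out of .text" symptom; keeping the address in a 64-bit type avoids the loss.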
Riccardo Mancini	d4b3eedce1	perf data: Close all files in close_dir() When using 'perf report' in directory mode, the first file is not closed on exit, causing a memory leak. The problem is caused by the iterating variable never reaching 0. Fixes: `1455206311` ("perf data: Add perf_data__(create_dir\|close_dir) functions") Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: http://lore.kernel.org/lkml/20210716141122.858082-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-18 09:27:49 -03:00
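The off-by-one described in this commit (an index that counts down but never reaches 0) can be illustrated with a short model. This is a sketch only: the exact loop in perf is not quoted in the message, so the pre-decrement form and the `lowest_index` parameter are assumptions made for the example.

```python
from typing import List


def close_dir(files: List[str], lowest_index: int) -> List[str]:
    """Return the files that get closed when counting down to lowest_index."""
    closed = []
    nr = len(files)
    while True:
        nr -= 1  # pre-decrement before the comparison
        if nr < lowest_index:
            break
        closed.append(files[nr])
    return closed
```

With `lowest_index=1` the entry at index 0 is never visited, so the first file stays open; with `lowest_index=0` every file is closed.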

1029903 commits