linux-xiaomi-chiron

Author	SHA1	Message	Date
Hongbo Yao	1f399fc797	drivers/net: netdevsim depends on INET If CONFIG_INET is not set and CONFIG_NETDEVSIM=y. Building drivers/net/netdevsim/fib.o will get the following error: drivers/net/netdevsim/fib.o: In function `nsim_fib4_rt_hw_flags_set': fib.c:(.text+0x12b): undefined reference to `fib_alias_hw_flags_set' drivers/net/netdevsim/fib.o: In function `nsim_fib4_rt_destroy': fib.c:(.text+0xb11): undefined reference to `free_fib_info' Correct the Kconfig for netdevsim. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: `48bb9eb47b` ("netdevsim: fib: Add dummy implementation for FIB offload") Signed-off-by: Hongbo Yao <yaohongbo@huawei.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 11:27:49 +01:00
Rafael J. Wysocki	854e334903	Update devfreq for 5.6 Detailed description for this pull request: 1. Update devfreq core - Add new 'name' attribute of sysfs to show the device name : /sys/class/devfreq/devfreqX/name - Make 'trans_stat' sysfs resetting by entering zero(0) as following : echo 0 > /sys/class/devfreq/devfreqX/trans_stat - Add debugfs support with 'devfreq_summary' file to show the summary : /sys/kernel/debug/devfreq/devfreq_summary - Change the type of time variable to 64bit to prevent the overflow - Make the separate devfreq_stats including the statistics information - Fix the minor coding-style like indentation and kernel-doc warnings 2. Update devfreq driver - Add new imx8m-ddrc.c devfreq driver for dynamic scaling of DDR frequency. It changes the DDR frequency by using ARM SMCCC(SMC Calling Convention) interface to control TF-A firmware. - Add COMPILE_TEST dependency for rk3399_dmc.c - Clean-up code for exynos-bus.c and rk3399_dmc.c without behavior changes 3. Update devfreq-event driver - Fix excessive stack usage of exynos-ppmu.c and clean-up code of rockchip-dfi.c without behavior changes -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEsSpuqBtbWtRe4rLGnM3fLN7rz1MFAl4hO3sWHGN3MDAuY2hv aUBzYW1zdW5nLmNvbQAKCRCczd8s3uvPU54AD/0Qf33Q+93uI4LNW8qfW37OHpb9 nyuzeTu9Gj+c/SDxz5BPqeWPFOjIJAL3fWGeNkObBYRYIwbVkC6+lxLE07UH7B+w UOsEHcmBUypIBPC9i34CqciyjvKQFEdMk1/NoS79G9OXW8wnAooQF4n6fxJkbhwh ZoY+AzhnadNLP6rKzi2JBYrMaEvXpPLL7aY336wCt+LXKPyFkLziCK5nni0Zouil UPA+reX/2cv6OHebDp2mgK1QKh43ohWNuX33ik3x2mFtJ8BiZ/jiKWN0O4lraz0O 6P3/fZIMNzBqUafQdVymAr3vRykt1JlfaRhinDKXmEk6VNuX1vO2qCPc1Hc3etJX Zwbgc2Oe9r/88zt6N6ZRg50ehRGv6RNGBlRgyXZ2OUbPQz9jEHqZ0OiUcFrh3emd QD4RKFA+yc9r2vC9UF6ZwMynnxeM+nAP8EapzOJFAZa41JNWKWdP6V3PhjpzDz3+ cNobPQUgsTAGuw30uNSuMSKLU8L1Hb1r0OkebvGBoSxT/qMkUEDp2fFjhAUYyQ/e vyUia6yl9hPbzwd5QBSV7FKhwoEonwRtOlrb3MjJscOGNW3TIfzsMePVYj4M7P0/ rdHSCcGHKLsbm8cbRDOeJRNOHFFh17M3vVRjpRUUWZ/laLcRk2hNqEI3V4KFeUmK tEAERSzsVot1tpCidQ== =m9LV -----END PGP SIGNATURE----- Merge tag 'devfreq-next-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux Pull devfreq updates for v5.6 from Chanwoo Choi: "1. Update devfreq core - Add new 'name' attribute of sysfs to show the device name : /sys/class/devfreq/devfreqX/name - Make 'trans_stat' sysfs reset by entering zero(0) : echo 0 > /sys/class/devfreq/devfreqX/trans_stat - Add debugfs support with 'devfreq_summary' to show the summary : /sys/kernel/debug/devfreq/devfreq_summary - Change the type of time variable to 64bit to avoid overflows. - Make separate devfreq_stats including the statistics information. - Fix minor coding-style like indentation and kernel-doc warnings. 2. Update devfreq drivers - Add new imx8m-ddrc.c devfreq driver for dynamic scaling of DDR frequency. It changes the DDR frequency by using ARM SMCCC(SMC Calling Convention) interface to control TF-A firmware. - Add COMPILE_TEST dependency for rk3399_dmc.c. - Clean-up code for exynos-bus.c and rk3399_dmc.c without behavior changes 3. Update devfreq-event drivers - Fix excessive stack usage of exynos-ppmu.c and clean-up code of rockchip-dfi.c without behavior changes." * tag 'devfreq-next-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: (24 commits) PM / devfreq: Add debugfs support with devfreq_summary file PM / devfreq: exynos: Rename Exynos to lowercase PM / devfreq: imx8m-ddrc: Fix inconsistent IS_ERR and PTR_ERR PM / devfreq: exynos-bus: Add error log when fail to get devfreq-event PM / devfreq: exynos-bus: Disable devfreq-event device when fails PM / devfreq: rk3399_dmc: Disable devfreq-event device when fails PM / devfreq: imx8m-ddrc: Remove unused defines PM / devfreq: exynos-bus: Reduce goto statements and remove unused headers PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency PM / devfreq: rockchip-dfi: Convert to devm_platform_ioremap_resource PM / devfreq: rk3399_dmc: Add missing of_node_put() PM / devfreq: rockchip-dfi: Add missing of_node_put() PM / devfreq: Fix multiple kernel-doc warnings PM / devfreq: exynos-bus: Extract exynos_bus_profile_init_passive() PM / devfreq: exynos-bus: Extract exynos_bus_profile_init() PM / devfreq: Move declaration of DEVICE_ATTR_RW(min_freq) PM / devfreq: Move statistics to separate struct devfreq_stats PM / devfreq: Add clearing transitions stats PM / devfreq: Change time stats to 64-bit PM / devfreq: Add new name attribute for sysfs ...	2020-01-17 11:19:45 +01:00
Thomas Gleixner	11e31f608b	watchdog/softlockup: Enforce that timestamp is valid on boot Robert reported that during boot the watchdog timestamp is set to 0 for one second which is the indicator for a watchdog reset. The reason for this is that the timestamp is in seconds and the time is taken from sched clock and divided by ~1e9. sched clock starts at 0 which means that for the first second during boot the watchdog timestamp is 0, i.e. reset. Use ULONG_MAX as the reset indicator value so the watchdog works correctly right from the start. ULONG_MAX would only conflict with a real timestamp if the system reaches an uptime of 136 years on 32bit and almost eternity on 64bit. Reported-by: Robert Richter <rrichter@marvell.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/87o8v3uuzl.fsf@nanos.tec.linutronix.de	2020-01-17 11:19:22 +01:00
Yonglong Liu	49edd6a2c4	net: hns: fix soft lockup when there is not enough memory When there is not enough memory and napi_alloc_skb() return NULL, the HNS driver will print error message, and than try again, if the memory is not enough for a while, huge error message and the retry operation will cause soft lockup. When napi_alloc_skb() return NULL because of no memory, we can get a warn_alloc() call trace, so this patch deletes the error message. We already use polling mode to handle irq, but the retry operation will render the polling weight inactive, this patch just return budget when the rx is not completed to avoid dead loop. Fixes: `36eedfde1a` ("net: hns: Optimize hns_nic_common_poll for better performance") Fixes: `b5996f11ea` ("net: add Hisilicon Network Subsystem basic ethernet support") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 11:19:12 +01:00
Florian Fainelli	080bb352fa	net: phy: Maintain MDIO device and bus statistics We maintain global statistics for an entire MDIO bus, as well as broken down, per MDIO bus address statistics. Given that it is possible for MDIO devices such as switches to access MDIO bus addresses for which there is not a mdio_device instance created (therefore not a a corresponding device directory in sysfs either), we also maintain per-address statistics under the statistics folder. The layout looks like this: /sys/class/mdio_bus/../statistics/ transfers errrors writes reads transfers_<addr> errors_<addr> writes_<addr> reads_<addr> When a mdio_device instance is registered, a statistics/ folder is created with the tranfers, errors, writes and reads attributes which point to the appropriate MDIO bus statistics structure. Statistics are 64-bit unsigned quantities and maintained through the u64_stats_sync.h helper functions. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 11:12:44 +01:00
Petr Mladek	f46e49a9cc	livepatch: Handle allocation failure in the sample of shadow variable API klp_shadow_alloc() is not handled in the sample of shadow variable API. It is not strictly necessary because livepatch_fix1_dummy_free() is able to handle the potential failure. But it is an example and it should use the API a clean way. Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2020-01-17 11:12:06 +01:00
Petr Mladek	be6da98425	livepatch/samples/selftest: Use klp_shadow_alloc() API correctly The commit `e91c2518a5` ("livepatch: Initialize shadow variables safely by a custom callback") leads to the following static checker warning: samples/livepatch/livepatch-shadow-fix1.c:86 livepatch_fix1_dummy_alloc() error: 'klp_shadow_alloc()' 'leak' too small (4 vs 8) It is because klp_shadow_alloc() is used a wrong way: int *leak; shadow_leak = klp_shadow_alloc(d, SV_LEAK, sizeof(leak), GFP_KERNEL, shadow_leak_ctor, leak); The code is supposed to store the "leak" pointer into the shadow variable. 3rd parameter correctly passes size of the data (size of pointer). But the 5th parameter is wrong. It should pass pointer to the data (pointer to the pointer) but it passes the pointer directly. It works because shadow_leak_ctor() handle "ctor_data" as the data instead of pointer to the data. But it is semantically wrong and confusing. The same problem is also in the module used by selftests. In this case, "pvX" variables are introduced. They represent the data stored in the shadow variables. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2020-01-17 11:12:06 +01:00
Petr Mladek	c24c57a4cc	livepatch/selftest: Clean up shadow variable names and type The shadow variable selftest is quite tricky. Especially it is problematic to understand what values are stored, returned, and printed. Make it easier to understand by using "int var, sv" variables consistently everywhere instead of the generic "void ", "ret", and "ctor_data". Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2020-01-17 11:12:06 +01:00
Petr Mladek	8f6b88662c	livepatch/sample: Use the right type for the leaking data pointer The "leak" pointer, in the sample of shadow variable API, is allocated as sizeof(int). Let's help developers and static analyzers with understanding the code by using the appropriate pointer type. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2020-01-17 11:12:06 +01:00
Johan Hovold	fdb838efa3	USB: serial: suppress driver bind attributes USB-serial drivers must not be unbound from their ports before the corresponding USB driver is unbound from the parent interface so suppress the bind and unbind attributes. Unbinding a serial driver while it's port is open is a sure way to trigger a crash as any driver state is released on unbind while port hangup is handled on the parent USB interface level. Drivers for multiport devices where ports share a resource such as an interrupt endpoint also generally cannot handle individual ports going away. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable <stable@vger.kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>	2020-01-17 11:11:26 +01:00
Adrian Huang	9646674878	iommu/amd: Remove unused struct member Commit `c805b428f2` ("iommu/amd: Remove amd_iommu_pd_list") removes the global list for the allocated protection domains. The corresponding member 'list' of the protection_domain struct is not used anymore, so it can be removed. Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-01-17 11:05:56 +01:00
Adrian Huang	62dcee7160	iommu/amd: Replace two consecutive readl calls with one readq Optimize the reigster reading by using readq instead of the two consecutive readl calls. Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-01-17 11:05:56 +01:00
Cong Wang	53d374979e	net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key() syzbot reported some bogus lockdep warnings, for example bad unlock balance in sch_direct_xmit(). They are due to a race condition between slow path and fast path, that is qdisc_xmit_lock_key gets re-registered in netdev_update_lockdep_key() on slow path, while we could still acquire the queue->_xmit_lock on fast path in this small window: CPU A CPU B __netif_tx_lock(); lockdep_unregister_key(qdisc_xmit_lock_key); __netif_tx_unlock(); lockdep_register_key(qdisc_xmit_lock_key); In fact, unlike the addr_list_lock which has to be reordered when the master/slave device relationship changes, queue->_xmit_lock is only acquired on fast path and only when NETIF_F_LLTX is not set, so there is likely no nested locking for it. Therefore, we can just get rid of re-registration of qdisc_xmit_lock_key. Reported-by: syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com Fixes: `ab92d68fc2` ("net: core: add generic lockdep keys") Cc: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 11:02:44 +01:00
Joerg Roedel	6855d1ba75	Arm SMMU updates for 5.6 - Support for building, and {un,}loading the SMMU drivers as modules - Minor cleanups - SMMUv3: * Non-critical fix to encoding of TLBI_NH_VA invalidation command * Fix broken sanity check on size of MMIO resource during probe * Support for Substream IDs which will soon be provided by PCI PASIDs - io-pgtable: * Finish off the TTBR1 preparation work partially merged last cycle * Ensure correct memory attributes for non-cacheable mappings - SMMU: * Namespace public #defines to avoid collisions with arch/arm64/ * Avoid using valid SMR register when probing mask size -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl4gMWoQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNIWOCACz8HHWv3RuLdDe/4gEG0hwZrBuQMGYCZmy 6y+T/1E1pwDplgEC+xn18EubZZBZ9ZDr2V4+p+hABOxxwyQ9nMMbfaw1o6PkghWU pRh8DtmIjR//DPi0Yim49BHQJnRGQqkYW++vwybew5AmHJqdzKQsDoo8cdZ9eNft EwTxXgKRw8Pc7S43x3pq196eWVAJh6urXUJNL1aLkGNCmOgY8TASRUyMZqsjT+oT ZfJ8f2s34CQQFT6ZfLFxMYZQEuXo8TIVQGH98eWmQrtsGrg7K3OP/yG3I8L9ckY9 dhkjAXdGcxjOyuUjPulDvWfHA+EbhZVdadih6UD+QL9p+BhW9aDD =lfaz -----END PGP SIGNATURE----- Merge tag 'arm-smmu-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu Arm SMMU updates for 5.6 - Support for building, and {un,}loading the SMMU drivers as modules - Minor cleanups - SMMUv3: * Non-critical fix to encoding of TLBI_NH_VA invalidation command * Fix broken sanity check on size of MMIO resource during probe * Support for Substream IDs which will soon be provided by PCI PASIDs - io-pgtable: * Finish off the TTBR1 preparation work partially merged last cycle * Ensure correct memory attributes for non-cacheable mappings - SMMU: * Namespace public #defines to avoid collisions with arch/arm64/ * Avoid using valid SMR register when probing mask size	2020-01-17 11:01:01 +01:00
Eric Dumazet	41cdc74104	netdevsim: fix nsim_fib6_rt_create() error path It seems nsim_fib6_rt_create() intent was to return either a valid pointer or an embedded error code. BUG: unable to handle page fault for address: fffffffffffffff4 PGD 9870067 P4D 9870067 PUD 9872067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 22851 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:jhash2 include/linux/jhash.h:125 [inline] RIP: 0010:rhashtable_jhash2+0x76/0x2c0 lib/rhashtable.c:963 Code: b9 00 00 00 00 00 fc ff df 48 c1 e8 03 0f b6 14 08 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 30 02 00 00 49 8d 7e 04 <41> 8b 06 48 be 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f b6 RSP: 0018:ffffc90016127190 EFLAGS: 00010246 RAX: 0000000000000007 RBX: 00000000dfb3ab49 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffffffff839ba7c8 RDI: fffffffffffffff8 RBP: ffffc900161271c0 R08: ffff8880951f8640 R09: ffffed1015d0703d R10: ffffed1015d0703c R11: ffff8880ae8381e3 R12: 00000000dfb3ab49 R13: 00000000dfb3ab49 R14: fffffffffffffff4 R15: 0000000000000007 FS: 00007f40bfbc6700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffffffffffff4 CR3: 0000000093660000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rht_key_get_hash include/linux/rhashtable.h:133 [inline] rht_key_hashfn include/linux/rhashtable.h:159 [inline] rht_head_hashfn include/linux/rhashtable.h:174 [inline] __rhashtable_insert_fast.constprop.0+0xe15/0x1180 include/linux/rhashtable.h:723 rhashtable_insert_fast include/linux/rhashtable.h:832 [inline] nsim_fib6_rt_add drivers/net/netdevsim/fib.c:603 [inline] nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:658 [inline] nsim_fib6_event drivers/net/netdevsim/fib.c:719 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:744 [inline] nsim_fib_event_nb+0x1b16/0x2600 drivers/net/netdevsim/fib.c:772 notifier_call_chain+0xc2/0x230 kernel/notifier.c:83 __atomic_notifier_call_chain+0xa6/0x1a0 kernel/notifier.c:173 atomic_notifier_call_chain+0x2e/0x40 kernel/notifier.c:183 call_fib_notifiers+0x173/0x2a0 net/core/fib_notifier.c:35 call_fib6_notifiers+0x4b/0x60 net/ipv6/fib6_notifier.c:22 call_fib6_entry_notifiers+0xfb/0x150 net/ipv6/ip6_fib.c:399 fib6_add_rt2node net/ipv6/ip6_fib.c:1216 [inline] fib6_add+0x20cd/0x3ec0 net/ipv6/ip6_fib.c:1471 __ip6_ins_rt+0x54/0x80 net/ipv6/route.c:1315 ip6_ins_rt+0x96/0xd0 net/ipv6/route.c:1325 __ipv6_dev_ac_inc+0x76f/0xb20 net/ipv6/anycast.c:324 ipv6_sock_ac_join+0x4c1/0x790 net/ipv6/anycast.c:139 do_ipv6_setsockopt.isra.0+0x3908/0x4290 net/ipv6/ipv6_sockglue.c:670 ipv6_setsockopt+0xff/0x180 net/ipv6/ipv6_sockglue.c:944 udpv6_setsockopt+0x68/0xb0 net/ipv6/udp.c:1564 sock_common_setsockopt+0x94/0xd0 net/core/sock.c:3149 __sys_setsockopt+0x261/0x4c0 net/socket.c:2130 __do_sys_setsockopt net/socket.c:2146 [inline] __se_sys_setsockopt net/socket.c:2143 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:2143 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45aff9 Fixes: `48bb9eb47b` ("netdevsim: fib: Add dummy implementation for FIB offload") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 11:00:57 +01:00
Eric Dumazet	44c23d7159	net/sched: act_ife: initalize ife->metalist earlier It seems better to init ife->metalist earlier in tcf_ife_init() to avoid the following crash : kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10483 Comm: syz-executor216 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:_tcf_ife_cleanup net/sched/act_ife.c:412 [inline] RIP: 0010:tcf_ife_cleanup+0x6e/0x400 net/sched/act_ife.c:431 Code: 48 c1 ea 03 80 3c 02 00 0f 85 94 03 00 00 49 8b bd f8 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8d 67 e8 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 5c 03 00 00 48 bb 00 00 00 00 00 fc ff df 48 8b RSP: 0018:ffffc90001dc6d00 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffffffff864619c0 RCX: ffffffff815bfa09 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000 RBP: ffffc90001dc6d50 R08: 0000000000000004 R09: fffff520003b8d8e R10: fffff520003b8d8d R11: 0000000000000003 R12: ffffffffffffffe8 R13: ffff8880a79fc000 R14: ffff88809aba0e00 R15: 0000000000000000 FS: 0000000001b51880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000563f52cce140 CR3: 0000000093541000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tcf_action_cleanup+0x62/0x1b0 net/sched/act_api.c:119 __tcf_action_put+0xfa/0x130 net/sched/act_api.c:135 __tcf_idr_release net/sched/act_api.c:165 [inline] __tcf_idr_release+0x59/0xf0 net/sched/act_api.c:145 tcf_idr_release include/net/act_api.h:171 [inline] tcf_ife_init+0x97c/0x1870 net/sched/act_ife.c:616 tcf_action_init_1+0x6b6/0xa40 net/sched/act_api.c:944 tcf_action_init+0x21a/0x330 net/sched/act_api.c:1000 tcf_action_add+0xf5/0x3b0 net/sched/act_api.c:1410 tc_ctl_action+0x390/0x488 net/sched/act_api.c:1465 rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5424 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x58c/0x7d0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:659 ____sys_sendmsg+0x753/0x880 net/socket.c:2330 ___sys_sendmsg+0x100/0x170 net/socket.c:2384 __sys_sendmsg+0x105/0x1d0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg net/socket.c:2424 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2424 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `11a94d7fd8` ("net/sched: act_ife: validate the control action inside init()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 10:58:15 +01:00
jimyan	53291622e2	iommu/vt-d: Don't reject Host Bridge due to scope mismatch On a system with two host bridges(0000:00:00.0,0000:80:00.0), iommu initialization fails with DMAR: Device scope type does not match for 0000:80:00.0 This is because the DMAR table reports this device as having scope 2 (ACPI_DMAR_SCOPE_TYPE_BRIDGE): but the device has a type 0 PCI header: 80:00.0 Class 0600: Device 8086:2020 (rev 06) 00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00 30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00 VT-d works perfectly on this system, so there's no reason to bail out on initialization due to this apparent scope mismatch. Add the class 0x06 ("PCI_BASE_CLASS_BRIDGE") as a heuristic for allowing DMAR initialization for non-bridge PCI devices listed with scope bridge. Signed-off-by: jimyan <jimyan@baidu.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-01-17 10:57:30 +01:00
Madhuparna Bhowmik	f3265971de	net: xen-netback: hash.c: Use built-in RCU list checking list_for_each_entry_rcu has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 10:57:22 +01:00
David S. Miller	a72b6a1ee4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter updates for net The following patchset contains Netfilter fixes for net: 1) Fix use-after-free in ipset bitmap destroy path, from Cong Wang. 2) Missing init netns in entry cleanup path of arp_tables, from Florian Westphal. 3) Fix WARN_ON in set destroy path due to missing cleanup on transaction error. 4) Incorrect netlink sanity check in tunnel, from Florian Westphal. 5) Missing sanity check for erspan version netlink attribute, also from Florian. 6) Remove WARN in nft_request_module() that can be triggered from userspace, from Florian Westphal. 7) Memleak in NFTA_HOOK_DEVS netlink parser, from Dan Carpenter. 8) List poison from commit path for flowtables that are added and deleted in the same batch, from Florian Westphal. 9) Fix NAT ICMP packet corruption, from Eyal Birger. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-17 10:36:07 +01:00
Waiman Long	f5bfdc8e39	locking/osq: Use optimized spinning loop for arm64 Arm64 has a more optimized spinning loop (atomic_cond_read_acquire) using wfe for spinlock that can boost performance of sibling threads by putting the current cpu to a wait state that is broken only when the monitored variable changes or an external event happens. OSQ has a more complicated spinning loop. Besides the lock value, it also checks for need_resched() and vcpu_is_preempted(). The check for need_resched() is not a problem as it is only set by the tick interrupt handler. That will be detected by the spinning cpu right after iret. The vcpu_is_preempted() check, however, is a problem as changes to the preempt state of of previous node will not affect the wait state. For ARM64, vcpu_is_preempted is not currently defined and so is a no-op. Will has indicated that he is planning to para-virtualize wfe instead of defining vcpu_is_preempted for PV support. So just add a comment in arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted() should not be defined as suggested. On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking microbenchmark was run for 10s with and without the patch. The performance numbers before patch were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 316/123,143/2,121,269 Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s After patch, the numbers were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 334/147,836/1,304,787 Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s So there was about 20% performance improvement. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com	2020-01-17 10:19:30 +01:00
Waiman Long	57097124cb	locking/qspinlock: Fix inaccessible URL of MCS lock paper It turns out that the URL of the MCS lock paper listed in the source code is no longer accessible. I did got question about where the paper was. This patch updates the URL to BZ 206115 which contains a copy of the paper from https://www.cs.rochester.edu/u/scott/papers/1991_TOCS_synch.pdf Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200107174914.4187-1-longman@redhat.com	2020-01-17 10:19:30 +01:00
Waiman Long	a030f9767d	locking/lockdep: Fix lockdep_stats indentation problem It was found that two lines in the output of /proc/lockdep_stats have indentation problem: # cat /proc/lockdep_stats : in-process chains: 25057 stack-trace entries: 137827 [max: 524288] number of stack traces: 7973 number of stack hash chains: 6355 combined max dependencies: 1356414598 hardirq-safe locks: 57 hardirq-unsafe locks: 1286 : All the numbers displayed in /proc/lockdep_stats except the two stack trace numbers are formatted with a field with of 11. To properly align all the numbers, a field width of 11 is now added to the two stack trace numbers. Fixes: `8c779229d0` ("locking/lockdep: Report more stack trace statistics") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lkml.kernel.org/r/20191211213139.29934-1-longman@redhat.com	2020-01-17 10:19:30 +01:00
Waiman Long	39e7234f00	locking/rwsem: Fix kernel crash when spinning on RWSEM_OWNER_UNKNOWN The commit `91d2a812df` ("locking/rwsem: Make handoff writer optimistically spin on owner") will allow a recently woken up waiting writer to spin on the owner. Unfortunately, if the owner happens to be RWSEM_OWNER_UNKNOWN, the code will incorrectly spin on it leading to a kernel crash. This is fixed by passing the proper non-spinnable bits to rwsem_spin_on_owner() so that RWSEM_OWNER_UNKNOWN will be treated as a non-spinnable target. Fixes: `91d2a812df` ("locking/rwsem: Make handoff writer optimistically spin on owner") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200115154336.8679-1-longman@redhat.com	2020-01-17 10:19:27 +01:00
Kim Phillips	5738891229	perf/x86/amd: Add support for Large Increment per Cycle Events Description of hardware operation --------------------------------- The core AMD PMU has a 4-bit wide per-cycle increment for each performance monitor counter. That works for most events, but now with AMD Family 17h and above processors, some events can occur more than 15 times in a cycle. Those events are called "Large Increment per Cycle" events. In order to count these events, two adjacent h/w PMCs get their count signals merged to form 8 bits per cycle total. In addition, the PERF_CTR count registers are merged to be able to count up to 64 bits. Normally, events like instructions retired, get programmed on a single counter like so: PERF_CTL0 (MSR 0xc0010200) 0x000000000053ff0c # event 0x0c, umask 0xff PERF_CTR0 (MSR 0xc0010201) 0x0000800000000001 # r/w 48-bit count The next counter at MSRs 0xc0010202-3 remains unused, or can be used independently to count something else. When counting Large Increment per Cycle events, such as FLOPs, however, we now have to reserve the next counter and program the PERF_CTL (config) register with the Merge event (0xFFF), like so: PERF_CTL0 (msr 0xc0010200) 0x000000000053ff03 # FLOPs event, umask 0xff PERF_CTR0 (msr 0xc0010201) 0x0000800000000001 # rd 64-bit cnt, wr lo 48b PERF_CTL1 (msr 0xc0010202) 0x0000000f004000ff # Merge event, enable bit PERF_CTR1 (msr 0xc0010203) 0x0000000000000000 # wr hi 16-bits count The count is widened from the normal 48-bits to 64 bits by having the second counter carry the higher 16 bits of the count in its lower 16 bits of its counter register. The odd counter, e.g., PERF_CTL1, is programmed with the enabled Merge event before the even counter, PERF_CTL0. The Large Increment feature is available starting with Family 17h. For more details, search any Family 17h PPR for the "Large Increment per Cycle Events" section, e.g., section 2.1.15.3 on p. 173 in this version: https://www.amd.com/system/files/TechDocs/56176_ppr_Family_17h_Model_71h_B0_pub_Rev_3.06.zip Description of software operation --------------------------------- The following steps are taken in order to support reserving and enabling the extra counter for Large Increment per Cycle events: 1. In the main x86 scheduler, we reduce the number of available counters by the number of Large Increment per Cycle events being scheduled, tracked by a new cpuc variable 'n_pair' and a new amd_put_event_constraints_f17h(). This improves the counter scheduler success rate. 2. In perf_assign_events(), if a counter is assigned to a Large Increment event, we increment the current counter variable, so the counter used for the Merge event is removed from assignment consideration by upcoming event assignments. 3. In find_counter(), if a counter has been found for the Large Increment event, we set the next counter as used, to prevent other events from using it. 4. We perform steps 2 & 3 also in the x86 scheduler fastpath, i.e., we add Merge event accounting to the existing used_mask logic. 5. Finally, we add on the programming of Merge event to the neighbouring PMC counters in the counter enable/disable{_all} code paths. Currently, software does not support a single PMU with mixed 48- and 64-bit counting, so Large increment event counts are limited to 48 bits. In set_period, we zero-out the upper 16 bits of the count, so the hardware doesn't copy them to the even counter's higher bits. Simple invocation example showing counting 8 FLOPs per 256-bit/%ymm vaddps instruction executed in a loop 100 million times: perf stat -e cpu/fp_ret_sse_avx_ops.all/,cpu/instructions/ <workload> Performance counter stats for '<workload>': 800,000,000 cpu/fp_ret_sse_avx_ops.all/u 300,042,101 cpu/instructions/u Prior to this patch, the reported SSE/AVX FLOPs retired count would be wrong. [peterz: lots of renames and edits to the code] Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2020-01-17 10:19:26 +01:00
Kim Phillips	471af006a7	perf/x86/amd: Constrain Large Increment per Cycle events AMD Family 17h processors and above gain support for Large Increment per Cycle events. Unfortunately there is no CPUID or equivalent bit that indicates whether the feature exists or not, so we continue to determine eligibility based on a CPU family number comparison. For Large Increment per Cycle events, we add a f17h-and-compatibles get_event_constraints_f17h() that returns an even counter bitmask: Large Increment per Cycle events can only be placed on PMCs 0, 2, and 4 out of the currently available 0-5. The only currently public event that requires this feature to report valid counts is PMCx003 "Retired SSE/AVX Operations". Note that the CPU family logic in amd_core_pmu_init() is changed so as to be able to selectively add initialization for features available in ranges of backward-compatible CPU families. This Large Increment per Cycle feature is expected to be retained in future families. A side-effect of assigning a new get_constraints function for f17h disables calling the old (prior to f15h) amd_get_event_constraints implementation left enabled by commit `e40ed1542d` ("perf/x86: Add perf support for AMD family-17h processors"), which is no longer necessary since those North Bridge event codes are obsoleted. Also fix a spelling mistake whilst in the area (calulating -> calculating). Fixes: `e40ed1542d` ("perf/x86: Add perf support for AMD family-17h processors") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191114183720.19887-2-kim.phillips@amd.com	2020-01-17 10:19:26 +01:00
Harry Pan	1e0f17724a	perf/x86/intel/rapl: Add Comet Lake support Comet Lake supports the same RAPL counters like Kaby Lake and Skylake. After this, on CML machine the energy counters appear in perf list. Signed-off-by: Harry Pan <harry.pan@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191227171944.1.Id6f3ab98474d7d1dba5b95390b24e0a67368d364@changeid	2020-01-17 10:19:25 +01:00
Valentin Schneider	ccf74128d6	sched/topology: Assert non-NUMA topology masks don't (partially) overlap topology.c::get_group() relies on the assumption that non-NUMA domains do not partially overlap. Zeng Tao pointed out in [1] that such topology descriptions, while completely bogus, can end up being exposed to the scheduler. In his example (8 CPUs, 2-node system), we end up with: MC span for CPU3 == 3-7 MC span for CPU4 == 4-7 The first pass through get_group(3, sdd@MC) will result in the following sched_group list: 3 -> 4 -> 5 -> 6 -> 7 ^ / `----------------' And a later pass through get_group(4, sdd@MC) will "corrupt" that to: 3 -> 4 -> 5 -> 6 -> 7 ^ / `-----------' which will completely break things like 'while (sg != sd->groups)' when using CPU3's base sched_domain. There already are some architecture-specific checks in place such as x86/kernel/smpboot.c::topology.sane(), but this is something we can detect in the core scheduler, so it seems worthwhile to do so. Warn and abort the construction of the sched domains if such a broken topology description is detected. Note that this is somewhat expensive (O(t.c²), 't' non-NUMA topology levels and 'c' CPUs) and could be gated under SCHED_DEBUG if deemed necessary. Testing ======= Dietmar managed to reproduce this using the following qemu incantation: $ qemu-system-aarch64 -kernel ./Image -hda ./qemu-image-aarch64.img \ -append 'root=/dev/vda console=ttyAMA0 loglevel=8 sched_debug' -smp \ cores=8 --nographic -m 512 -cpu cortex-a53 -machine virt -numa \ node,cpus=0-2,nodeid=0 -numa node,cpus=3-7,nodeid=1 alongside the following drivers/base/arch_topology.c hack (AIUI wouldn't be needed if '-smp cores=X, sockets=Y' would work with qemu): 8<--- @@ -465,6 +465,9 @@ void update_siblings_masks(unsigned int cpuid) if (cpuid_topo->package_id != cpu_topo->package_id) continue; + if ((cpu < 4 && cpuid > 3) \|\| (cpu > 3 && cpuid < 4)) + continue; + cpumask_set_cpu(cpuid, &cpu_topo->core_sibling); cpumask_set_cpu(cpu, &cpuid_topo->core_sibling); 8<--- [1]: https://lkml.kernel.org/r/1577088979-8545-1-git-send-email-prime.zeng@hisilicon.com Reported-by: Zeng Tao <prime.zeng@hisilicon.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200115160915.22575-1-valentin.schneider@arm.com	2020-01-17 10:19:23 +01:00
Hewenliang	3e0de271ff	idle: fix spelling mistake "iterrupts" -> "interrupts" There is a spelling misake in comments of cpuidle_idle_call. Fix it. Signed-off-by: Hewenliang <hewenliang4@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20200110025604.34373-1-hewenliang4@huawei.com	2020-01-17 10:19:22 +01:00
Vincent Guittot	a4f9a0e51b	sched/fair: Remove redundant call to cpufreq_update_util() With commit `bef69dd878` ("sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util()") update_load_avg() has become the central point for calling cpufreq (not including the update of blocked load). This change helps to simplify further the number of calls to cpufreq_update_util() and to remove last redundant ones. With update_load_avg(), we are now sure that cpufreq_update_util() will be called after every task attachment to a cfs_rq and especially after propagating this event down to the util_avg of the root cfs_rq, which is the level that is used by cpufreq governors like schedutil to set the frequency of a CPU. The SCHED_CPUFREQ_MIGRATION flag forces an early call to cpufreq when the migration happens in a cgroup whereas util_avg of root cfs_rq is not yet updated and this call is duplicated with the one that happens immediately after when the migration event reaches the root cfs_rq. The dedicated flag SCHED_CPUFREQ_MIGRATION is now useless and can be removed. The interface of attach_entity_load_avg() can also be simplified accordingly. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/1579083620-24943-1-git-send-email-vincent.guittot@linaro.org	2020-01-17 10:19:22 +01:00
Wang Long	3d817689a6	sched/psi: create /proc/pressure and /proc/pressure/{io\|memory\|cpu} only when psi enabled when CONFIG_PSI_DEFAULT_DISABLED set to N or the command line set psi=0, I think we should not create /proc/pressure and /proc/pressure/{io\|memory\|cpu}. In the future, user maybe determine whether the psi feature is enabled by checking the existence of the /proc/pressure dir or /proc/pressure/{io\|memory\|cpu} files. Signed-off-by: Wang Long <w@laoqinren.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lkml.kernel.org/r/1576672698-32504-1-git-send-email-w@laoqinren.net	2020-01-17 10:19:22 +01:00
Peng Liu	4c58f57fa6	sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP commit `bf475ce0a3` ("sched/fair: Add per-CPU min capacity to sched_group_capacity") introduced per-cpu min_capacity. commit `e3d6d0cb66` ("sched/fair: Add sched_group per-CPU max capacity") introduced per-cpu max_capacity. In the SD_OVERLAP case, the local variable 'capacity' represents the sum of CPU capacity of all CPUs in the first sched group (sg) of the sched domain (sd). It is erroneously used to calculate sg's min and max CPU capacity. To fix this use capacity_of(cpu) instead of 'capacity'. The code which achieves this via cpu_rq(cpu)->sd->groups->sgc->capacity (for rq->sd != NULL) can be removed since it delivers the same value as capacity_of(cpu) which is currently only used for the (!rq->sd) case (see update_cpu_capacity()). An sg of the lowest sd (rq->sd or sd->child == NULL) represents a single CPU (and hence sg->sgc->capacity == capacity_of(cpu)). Signed-off-by: Peng Liu <iwtbavbm@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20200104130828.GA7718@iZj6chx1xj0e0buvshuecpZ	2020-01-17 10:19:21 +01:00
Peng Wang	fe71bbb21e	sched/fair: calculate delta runnable load only when it's needed Move the code of calculation for delta_sum/delta_avg to where it is really needed to be done. Signed-off-by: Peng Wang <rocking@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20200103114400.17668-1-rocking@linux.alibaba.com	2020-01-17 10:19:21 +01:00
Alex Shi	9dec1b6949	sched/cputime: move rq parameter in irqtime_account_process_tick Every time we call irqtime_account_process_tick() is in a interrupt, Every caller will get and assign a parameter rq = this_rq(), This is unnecessary and increase the code size a little bit. Move the rq getting action to irqtime_account_process_tick internally is better. base with this patch cputime.o 578792 bytes 577888 bytes Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1577959674-255537-1-git-send-email-alex.shi@linux.alibaba.com	2020-01-17 10:19:21 +01:00
Yangtao Li	35f4cd96f5	stop_machine: Make stop_cpus() static The function stop_cpus() is only used internally by the stop_machine for stop multiple cpus. Make it static. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191228161912.24082-1-tiny.windzz@gmail.com	2020-01-17 10:19:21 +01:00
Wei Li	02d4ac5885	sched/debug: Reset watchdog on all CPUs while processing sysrq-t Lengthy output of sysrq-t may take a lot of time on slow serial console with lots of processes and CPUs. So we need to reset NMI-watchdog to avoid spurious lockup messages, and we also reset softlockup watchdogs on all other CPUs since another CPU might be blocked waiting for us to process an IPI or stop_machine. Add to sysrq_sched_debug_show() as what we did in show_state_filter(). Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20191226085224.48942-1-liwei391@huawei.com	2020-01-17 10:19:20 +01:00
Li Guanglei	dcd6dffb0a	sched/core: Fix size of rq::uclamp initialization rq::uclamp is an array of struct uclamp_rq, make sure we clear the whole thing. Fixes: `69842cba9a` ("sched/uclamp: Add CPU's clamp buckets refcountinga") Signed-off-by: Li Guanglei <guanglei.li@unisoc.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Link: https://lkml.kernel.org/r/1577259844-12677-1-git-send-email-guangleix.li@gmail.com	2020-01-17 10:19:20 +01:00
Qais Yousef	7226017ad3	sched/uclamp: Fix a bug in propagating uclamp value in new cgroups When a new cgroup is created, the effective uclamp value wasn't updated with a call to cpu_util_update_eff() that looks at the hierarchy and update to the most restrictive values. Fix it by ensuring to call cpu_util_update_eff() when a new cgroup becomes online. Without this change, the newly created cgroup uses the default root_task_group uclamp values, which is 1024 for both uclamp_{min, max}, which will cause the rq to to be clamped to max, hence cause the system to run at max frequency. The problem was observed on Ubuntu server and was reproduced on Debian and Buildroot rootfs. By default, Ubuntu and Debian create a cpu controller cgroup hierarchy and add all tasks to it - which creates enough noise to keep the rq uclamp value at max most of the time. Imitating this behavior makes the problem visible in Buildroot too which otherwise looks fine since it's a minimal userspace. Fixes: `0b60ba2dd3` ("sched/uclamp: Propagate parent clamps") Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Doug Smythies <dsmythies@telus.net> Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/	2020-01-17 10:19:20 +01:00
Viresh Kumar	323af6deaf	sched/fair: Load balance aggressively for SCHED_IDLE CPUs The fair scheduler performs periodic load balance on every CPU to check if it can pull some tasks from other busy CPUs. The duration of this periodic load balance is set to sd->balance_interval for the idle CPUs and is calculated by multiplying the sd->balance_interval with the sd->busy_factor (set to 32 by default) for the busy CPUs. The multiplication is done for busy CPUs to avoid doing load balance too often and rather spend more time executing actual task. While that is the right thing to do for the CPUs busy with SCHED_OTHER or SCHED_BATCH tasks, it may not be the optimal thing for CPUs running only SCHED_IDLE tasks. With the recent enhancements in the fair scheduler around SCHED_IDLE CPUs, we now prefer to enqueue a newly-woken task to a SCHED_IDLE CPU instead of other busy or idle CPUs. The same reasoning should be applied to the load balancer as well to make it migrate tasks more aggressively to a SCHED_IDLE CPU, as that will reduce the scheduling latency of the migrated (SCHED_OTHER) tasks. This patch makes minimal changes to the fair scheduler to do the next load balance soon after the last non SCHED_IDLE task is dequeued from a runqueue, i.e. making the CPU SCHED_IDLE. Also the sd->busy_factor is ignored while calculating the balance_interval for such CPUs. This is done to avoid delaying the periodic load balance by few hundred milliseconds for SCHED_IDLE CPUs. This is tested on ARM64 Hikey620 platform (octa-core) with the help of rt-app and it is verified, using kernel traces, that the newly SCHED_IDLE CPU does load balancing shortly after it becomes SCHED_IDLE and pulls tasks from other busy CPUs. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/e485827eb8fe7db0943d6f3f6e0f5a4a70272781.1578471925.git.viresh.kumar@linaro.org	2020-01-17 10:19:20 +01:00
Vincent Guittot	5f68eb19b5	sched/fair : Improve update_sd_pick_busiest for spare capacity case Similarly to calculate_imbalance() and find_busiest_group(), using the number of idle CPUs when there is only 1 CPU in the group is not efficient because we can't make a difference between a CPU running 1 task and a CPU running dozens of small tasks competing for the same CPU but not enough to overload it. More generally speaking, we should use the number of running tasks when there is the same number of idle CPUs in a group instead of blindly select the 1st one. When the groups have spare capacity and the same number of idle CPUs, we compare the number of running tasks to select the busiest group. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1576839893-26930-1-git-send-email-vincent.guittot@linaro.org	2020-01-17 10:19:19 +01:00
Jisheng Zhang	db5793c599	watchdog: Remove soft_lockup_hrtimer_cnt and related code After commit `9cf57731b6` ("watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work"), the percpu soft_lockup_hrtimer_cnt is not used any more, so remove it and related code. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191218131720.4146aea2@xhacker.debian	2020-01-17 10:19:19 +01:00
Steven Rostedt (VMware)	31537cf8f3	tracing: Initialize ret in syscall_enter_define_fields() If syscall_enter_define_fields() is called on a system call with no arguments, the return code variable "ret" will never get initialized. Initialize it to zero. Fixes: `04ae87a520` ("ftrace: Rework event_create_dir()") Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/0FA8C6E3-D9F5-416D-A1B0-5E4CD583A101@lca.pw	2020-01-17 10:19:18 +01:00
Lokesh Vutla	3f03a58b25	arm64: dts: ti: k3-j721e-main: Add missing power-domains for smmu Add power-domains entry for smmu, so that the it is accessible as long as the driver is active. Without this device shutdown is throwing the below warning: "[ 44.736348] arm-smmu-v3 36600000.smmu: failed to clear cr0" Reported-by: Suman Anna <s-anna@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com>	2020-01-17 10:19:51 +02:00
Grygorii Strashko	f2965b9979	arm64: dts: ti: k3-am65-mcu: add system control module node The MCU System control module support is added to the device tree to allow drivers to access to their System control module registers. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com>	2020-01-17 10:19:51 +02:00
Vignesh Raghavendra	ca3be22dd0	arm64: dts: k3-am654-base-board: Add IRQ line for GPIO expander Add IRQ line for IO expander present on wkup_i2c bus on AM654 EVM Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com>	2020-01-17 10:19:51 +02:00
Vignesh Raghavendra	07481770e8	arm64: dts: ti: k3-am65: Add OSPI DT node AM654 SoC has two Cadence OSPI controller instances under Flash subsystem (FSS). Add DT nodes for the same. Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com>	2020-01-17 10:19:51 +02:00
Vignesh Raghavendra	cb27354b38	arm64: dts: ti: k3-j721e: Add DT nodes for few peripherials Enable I2Cs, ADCs, OSPIs and UFS peripherals present on J721e. Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com>	2020-01-17 10:19:51 +02:00
YueHaibing	d18fddff06	gpiolib: Remove duplicated function gpio_do_set_config() gpio_set_config() simply call gpio_do_set_config(), so remove the duplicated function. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20200116142927.58908-1-yuehaibing@huawei.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2020-01-17 09:11:41 +01:00
Linus Walleij	319d5cce72	intel-pinctrl for v5.5-3 * Fix Interrupt Status register offset for Intel Sunrisepoint PCH-H. The following is an automated git shortlog grouped by driver: sunrisepoint: - Add missing Interrupt Status register offset -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl4XC7AACgkQb7wzTHR8 rChhnA//e/BHEYPHwQpbZfl90sdDFDX6VhMko1ApjZveFCFYbgonwVSghRCAj0vx h0diyT/doswJZ+sHRY0eTFeuOLMigB4ZX1rV0ON7jKGVQFtHk0UNYdm0LqLqxbHW H6RvJ0tdFsoweTtAwMeuI6uIWG0KV9+zyNMEtfHwVT7M36eFx2pfhGq71J7Caz9+ Iem6sDYjAkdo1UWVNV4bdn6RzUKuj1GTJNZHVbgp91Y6ApF8Y+DSk2BfjvUU4wqn 4E5M1MVR1M4GX7ak+7hkN7QX/stR+G5DzuK5+QDPyOQOypd2vBkRQxTAG7wLEp6W UR2VT0696DB8yyYuT/C36RctzW6jdyTb9GxYeCIlNM2z/zHPwaxmDwgtLcrr2E01 oKbmHD5KFnoYJ9RJC6QJxnhpXs1zcZRd3V5YnbMM8Jx5bDrP8vwKnOXokIEHrkxO JEn5aoNkmpvefB2+LxNJ+zodi+FTDKZNmQY9uCCtf1OtqeOPze8b6m+hAYmkzGnV 8RSVYE+leBCRUYSp13uVTq3FFS8uclhxveCvyUwolhDsEjfUOTIEwH7V7xOn/x+y 5DCBUyf3dipDIDVCnwiwB9J4Y7YI46yTLpHBT3E57KuqZ1NOYS5RYsX6Ks8IpbJL iP9qHgEMIJlqzh4E2mTFQ/ZZlG5ko1uLK81n/oaEzBudnzslp9U= =7pOQ -----END PGP SIGNATURE----- Merge tag 'intel-pinctrl-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel into fixes intel-pinctrl for v5.5-3 * Fix Interrupt Status register offset for Intel Sunrisepoint PCH-H. The following is an automated git shortlog grouped by driver: sunrisepoint: - Add missing Interrupt Status register offset	2020-01-17 09:07:26 +01:00
Linus Walleij	8b844d78a7	Merge branch 'fixup-thunderx-hierarchy' into devel	2020-01-17 09:00:35 +01:00
Linus Walleij	6a77de2596	Linux 5.5-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl4bv/IeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGKWEIAImoqyoGKU+ZcBPI PCrqWtNgrZvVDs/K/IinETZSwclvuQCJicN9DYu9g//3uxf9z+i4c5/Oq9veP1lw ikVjfwgo74SqNwO9L+78oUG+2rzUDwf/LTVhlqO17fxmT5WumJjC7Y6/TejwpOd5 xmZ5NopLQTx95OJWaK0rrDUTkG1LzlQZINGbu1K8sRpbppcSc31Egh2n09kaOnmn 6xRRuxFnk2dXuCCdcCdb6rW1vNzd1IRbPjqAktRCalp04hzIFDgXbj9pfOl/iu6O nl+xDSwziW3DzKxdkw3WZTYbPPK8e7s16qF23QLVgwlIKE0qL1h6uiEnAMARGfLm bVFIUbo= =Tdp+ -----END PGP SIGNATURE----- Merge tag 'v5.5-rc6' into devel Linux 5.5-rc6	2020-01-17 08:59:29 +01:00

... 70 71 72 73 74 ...

900375 commits