linux-xiaomi-chiron

Author	SHA1	Message	Date
Alex Elder	a3d3e759a4	net: ipa: get rid of extra clock reference Suspending the IPA hardware is now managed by the runtime PM core code. The ->runtime_idle callback returns a non-zero value, so it will never suspend except when forced. As a result, there's no need to take an extra "do not suspend" clock reference. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	63de79f031	net: ipa: use runtime PM core Use the runtime power management core to cause hardware suspend and resume to occur. Enable it in ipa_clock_init() (without autosuspend), and disable it in ipa_clock_exit(). Use ipa_runtime_suspend() as the ->runtime_suspend power operation, and arrange for it to be called by having ipa_clock_get() call pm_runtime_get_sync() when the first clock reference is taken. Similarly, use ipa_runtime_resume() as the ->runtime_resume power operation, and pm_runtime_put() when the last IPA clock reference is dropped. Introduce ipa_runtime_idle() as the ->runtime_idle power operation, and have it return a non-zero value; this way suspend will never occur except when forced. Use pm_runtime_force_suspend() and pm_runtime_force_resume() as the system suspend and resume callbacks, and remove ipa_suspend() and ipa_resume(). Store a pointer to the device structure passed to ipa_clock_init(), so it can be used by ipa_clock_exit() to disable runtime power management. For now we preserve IPA clock reference counting. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	2abb0c7f98	net: ipa: resume in ipa_clock_get() Introduce ipa_runtime_suspend() and ipa_runtime_resume(), which encapsulate the activities necessary for suspending and resuming the IPA hardware. Call these functions from ipa_clock_get() and ipa_clock_put() when the first reference is taken or last one is dropped. When the very first clock reference is taken (for ipa_config()), setup isn't complete yet, so (as before) only the core clock gets enabled. When the last clock reference is dropped (after ipa_deconfig()), ipa_teardown() will have made the setup_complete flag false, so there too, the core clock will be stopped without affecting GSI or the endpoints. Otherwise these new functions will perform the desired suspend and resume actions once setup is complete. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	1016c6b8c6	net: ipa: disable clock in suspend Disable the IPA clock rather than dropping a reference to it in the system suspend callback. This forces the suspend to occur without affecting existing references. Similarly, enable the clock rather than taking a reference in ipa_resume(), forcing a resume without changing the reference count. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:55 +01:00
Alex Elder	7ebd168c3b	net: ipa: have ipa_clock_get() return a value We currently assume no errors occur when enabling or disabling the IPA core clock and interconnects. And although this commit exposes errors that could occur, we generally assume this won't happen in practice. This commit changes ipa_clock_get() and ipa_clock_put() so each returns a value. The values returned are meant to mimic what the runtime power management functions return, so we can set up error handling here before we make the switch. Have ipa_clock_get() increment the reference count even if it returns an error, to match the behavior of pm_runtime_get(). More details follow. When taking a reference in ipa_clock_get(), return 0 for the first reference, 1 for subsequent references, or a negative error code if an error occurs. Note that if ipa_clock_get() returns an error, we must not touch hardware; in some cases such errors now cause entire blocks of code to be skipped. When dropping a reference in ipa_clock_put(), we return 0 or an error code. The error would come from ipa_clock_disable(), which now returns what ipa_interconnect_disable() returns (either 0 or a negative error code). For now, callers ignore the return value; if an error occurs, a message will have already been logged, and little more can actually be done to improve the situation. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:55 +01:00
satya priya	f03f5c75f5	dt-bindings: pinctrl: qcom-pmic-gpio: Remove the interrupts property Remove the interrupts property as we no longer specify it. Signed-off-by: satya priya <skakit@codeaurora.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/1627910464-19363-4-git-send-email-skakit@codeaurora.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2021-08-11 13:59:30 +02:00
satya priya	328fb93a84	dt-bindings: pinctrl: qcom-pmic-gpio: Convert qcom pmic gpio bindings to YAML Convert Qualcomm PMIC GPIO bindings from .txt to .yaml format. Signed-off-by: satya priya <skakit@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1627910464-19363-3-git-send-email-skakit@codeaurora.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2021-08-11 13:55:43 +02:00
Amir Goldstein	e43de7f086	fsnotify: optimize the case of no marks of any type Add a simple check in the inline helpers to avoid calling fsnotify() and __fsnotify_parent() in case there are no marks of any type (inode/sb/mount) for an inode's sb, so there can be no objects of any type interested in the event. Link: https://lore.kernel.org/r/20210810151220.285179-5-amir73il@gmail.com Reviewed-by: Matthew Bobrowski <repnop@google.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2021-08-11 13:50:53 +02:00
Geert Uytterhoeven	e888fa7bb8	memblock: Check memory add/cap ordering For memblock_cap_memory_range() to work properly, it should be called after memory is detected and added to memblock with memblock_add() or memblock_add_node(). If memblock_cap_memory_range() would be called before memory is registered, we may silently corrupt memory later because the crash kernel will see all memory as available. Print a warning and bail out if ordering is not satisfied. Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/aabc5bad008d49f07d542815c6c8d28ec90bb09e.1628672091.git.geert+renesas@glider.be	2021-08-11 14:50:50 +03:00
Amir Goldstein	ec44610fe2	fsnotify: count all objects with attached connectors Rename s_fsnotify_inode_refs to s_fsnotify_connectors and count all objects with attached connectors, not only inodes with attached connectors. This will be used to optimize fsnotify() calls on sb without any type of marks. Link: https://lore.kernel.org/r/20210810151220.285179-4-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Matthew Bobrowski <repnop@google.com> Signed-off-by: Jan Kara <jack@suse.cz>	2021-08-11 13:50:48 +02:00
Geert Uytterhoeven	00974b9a83	memblock: Add missing debug code to memblock_add_node() All other memblock APIs built on top of memblock_add_range() contain debug code to print their parameters. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/c45e5218b6fcf0e3aeb63d9a9d9792addae0bb7a.1628672041.git.geert+renesas@glider.be	2021-08-11 14:50:42 +03:00
Amir Goldstein	11fa333b58	fsnotify: count s_fsnotify_inode_refs for attached connectors Instead of incrementing s_fsnotify_inode_refs when detaching connector from inode, increment it earlier when attaching connector to inode. Next patch is going to use s_fsnotify_inode_refs to count all objects with attached connectors. Link: https://lore.kernel.org/r/20210810151220.285179-3-amir73il@gmail.com Reviewed-by: Matthew Bobrowski <repnop@google.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2021-08-11 13:50:42 +02:00
Amir Goldstein	09ddbe69c9	fsnotify: replace igrab() with ihold() on attach connector We must have a reference on inode, so ihold is cheaper. Link: https://lore.kernel.org/r/20210810151220.285179-2-amir73il@gmail.com Reviewed-by: Matthew Bobrowski <repnop@google.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2021-08-11 13:50:28 +02:00
James Morse	331ebe4c43	x86/resctrl: Walk the resctrl schema list instead of an arch list When parsing a schema configuration value from user-space, resctrl walks the architectures rdt_resources_all[] array to find a matching struct rdt_resource. Once the CDP resources are merged there will be one resource in use by two schemata. Anything walking rdt_resources_all[] on behalf of a user-space request should walk the list of struct resctrl_schema instead. Change the users of for_each_alloc_enabled_rdt_resource() to walk the schema instead. Schemata were only created for alloc_enabled resources so these two lists are currently equivalent. schemata_list_create() and rdt_kill_sb() are ignored. The first creates the schema list, and will eventually loop over the resource indexes using an arch helper to retrieve the resource. rdt_kill_sb() will eventually make use of an arch 'reset everything' helper. After the filesystem code is moved, rdtgroup_pseudo_locked_in_hierarchy() remains part of the x86 specific hooks to support pseudo lock. This code walks each domain, and still does this after the separate resources are merged. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-7-james.morse@arm.com	2021-08-11 13:20:43 +02:00
James Morse	208ab16847	x86/resctrl: Label the resources with their configuration type The names of resources are used for the schema name presented to user-space. The name used is rooted in a structure provided by the architecture code because the names are different when CDP is enabled. x86 implements this by swapping between two sets of resource structures based on their alloc_enabled flag. The type of configuration in-use is encoded in the name (and cbm_idx_offset). Once the CDP behaviour is moved into the parts of resctrl that will move to /fs/, there will be two struct resctrl_schema for one struct rdt_resource. The schema describes the type of configuration being applied to the resource. The name of the schema should be generated by resctrl, based on the type of configuration. To do this struct resctrl_schema needs to store the type of configuration in use for a schema. Create an enum resctrl_conf_type describing the options, and add it to struct resctrl_schema. The underlying resources are still separate, as cbm_idx_offset is still in use. Temporarily label all the entries in rdt_resources_all[] and copy that value to struct resctrl_schema. Copying the value ensures there is no mismatch while the filesystem parts of resctrl are modified to use the schema. Once the resources are merged, the filesystem code can assign this value based on the schema being created. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-6-james.morse@arm.com	2021-08-11 13:13:18 +02:00
Eli Cohen	879753c816	vdpa/mlx5: Fix queue type selection logic get_queue_type() comments that split virtqueue is preferred, however, the actual logic preferred packed virtqueues. Since firmware has not supported packed virtqueues we ended up using split virtqueues as was desired. Since we do not advertise support for packed virtqueues, we add a check to verify split virtqueues are indeed supported. Fixes: `1a86b377aa` ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210811053759.66752-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:43 -04:00
Eli Cohen	08dbd56602	vdpa/mlx5: Avoid destroying MR on empty iotlb The current code treats an empty iotlb provided in set_map() as a special case and destroys the memory region object. This must not be done since the virtqueue objects reference this MR. Doing so will cause the driver unload to emit errors and log timeouts caused by the firmware complaining on busy resources. This patch treats an empty iotlb as any other change of mapping. In this case, mlx5_vdpa_create_mr() will fail and the entire set_map() call will fail. This issue has not been encountered before but was seen to occur in a non-official version of qemu. Since qemu is a userspace program, the driver must protect against such case. Fixes: `94abbccdf2` ("vdpa/mlx5: Add shared memory registration code") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210811053713.66658-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:41 -04:00
Michael S. Tsirkin	a24ce06c70	tools/virtio: fix build We use a spinlock now so add a stub. Ignore bogus uninitialized variable warnings. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:24 -04:00
Michael S. Tsirkin	f8ce72632f	virtio_ring: pull in spinlock header we use a spinlock now pull in the correct header to make virtio_ring.c self sufficient. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:24 -04:00
Michael S. Tsirkin	ea2f6af165	vringh: pull in spinlock header we use a spinlock now pull in the correct header to make vring.h self sufficient. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:24 -04:00
Xie Yongji	82e89ea077	virtio-blk: Add validation for block size in config space An untrusted device might present an invalid block size in configuration space. This tries to add validation for it in the validate callback and clear the VIRTIO_BLK_F_BLK_SIZE feature bit if the value is out of the supported range. And we also double check the value in virtblk_probe() in case that it's changed after the validation. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210809101609.148-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-08-11 06:44:24 -04:00
Neeraj Upadhyay	e74cfa91f4	vringh: Use wiov->used to check for read/write desc order As __vringh_iov() traverses a descriptor chain, it populates each descriptor entry into either read or write vring iov and increments that iov's ->used member. So, as we iterate over a descriptor chain, at any point, (riov/wriov)->used value gives the number of descriptor entries available, which are to be read or written by the device. As all read iovs must precede the write iovs, wiov->used should be zero when we are traversing a read descriptor. Current code checks for wiov->i, to figure out whether any previous entry in the current descriptor chain was a write descriptor. However, iov->i is only incremented, when these vring iovs are consumed, at a later point, and remain 0 in __vringh_iov(). So, correct the check for read and write descriptor order, to use wiov->used. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Link: https://lore.kernel.org/r/1624591502-4827-1-git-send-email-neeraju@codeaurora.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:24 -04:00
Vincent Whitchurch	cb5d2c1f6c	virtio_vdpa: reject invalid vq indices Do not call vDPA drivers' callbacks with vq indices larger than what the drivers indicate that they support. vDPA drivers do not bounds check the indices. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Link: https://lore.kernel.org/r/20210701114652.21956-1-vincent.whitchurch@axis.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-08-11 06:44:23 -04:00
Xie Yongji	c8d182bd38	vdpa: Add documentation for vdpa_alloc_device() macro The return value of vdpa_alloc_device() macro is not very clear, so that most of callers did the wrong check. Let's add some comments to better document it. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210715080026.242-4-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-08-11 06:44:23 -04:00
Xie Yongji	1057afa012	vDPA/ifcvf: Fix return value check for vdpa_alloc_device() The vdpa_alloc_device() returns an error pointer upon failure, not NULL. To handle the failure correctly, this replaces NULL check with IS_ERR() check and propagate the error upwards. Fixes: `5a2414bc45` ("virtio: Intel IFC VF driver for VDPA") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210715080026.242-3-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-08-11 06:44:23 -04:00
Xie Yongji	9632e78e82	vp_vdpa: Fix return value check for vdpa_alloc_device() The vdpa_alloc_device() returns an error pointer upon failure, not NULL. To handle the failure correctly, this replaces NULL check with IS_ERR() check and propagate the error upwards. Fixes: `64b9f64f80` ("vdpa: introduce virtio pci driver") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210715080026.242-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-08-11 06:44:23 -04:00
Xie Yongji	2b847f2114	vdpa_sim: Fix return value check for vdpa_alloc_device() The vdpa_alloc_device() returns an error pointer upon failure, not NULL. To handle the failure correctly, this replaces NULL check with IS_ERR() check and propagate the error upwards. Fixes: `2c53d0f64c` ("vdpasim: vDPA device simulator") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210715080026.242-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-08-11 06:44:23 -04:00
Xie Yongji	f7ad318ea0	vhost: Fix the calculation in vhost_overflow() This fixes the incorrect calculation for integer overflow when the last address of iova range is 0xffffffff. Fixes: `ec33d031a1` ("vhost: detect 32 bit integer wrap around") Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210728130756.97-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-11 06:44:15 -04:00
James Morse	f259449230	x86/resctrl: Pass the schema in info dir's private pointer Many of resctrl's per-schema files return a value from struct rdt_resource, which they take as their 'priv' pointer. Moving properties that resctrl exposes to user-space into the core 'fs' code, (e.g. the name of the schema), means some of the functions that back the filesystem need the schema struct (to where the properties are moved), but currently take struct rdt_resource. For example, once the CDP resources are merged, struct rdt_resource no longer reflects all the properties of the schema. For the info dirs that represent a control, the information needed will be accessed via struct resctrl_schema, as this is how the resource is being used. For the monitors, it's still struct rdt_resource as the monitors aren't described as schema. This difference means the type of the private pointers varies between control and monitor info dirs. Change the 'priv' pointer to point to struct resctrl_schema for the per-schema files that represent a control. The type can be determined from the fflags field. If the flags are RF_MON_INFO, it's a struct rdt_resource. If the flags are RF_CTRL_INFO, it's a struct resctrl_schema. No entry in res_common_files[] has both flags. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-5-james.morse@arm.com	2021-08-11 12:41:19 +02:00
Quentin Perret	64a80fb766	KVM: arm64: Make __pkvm_create_mappings static The __pkvm_create_mappings() function is no longer used outside of nvhe/mm.c, make it static. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-22-qperret@google.com	2021-08-11 11:39:52 +01:00
Quentin Perret	66c57edd3b	KVM: arm64: Restrict EL2 stage-1 changes in protected mode The host kernel is currently able to change EL2 stage-1 mappings without restrictions thanks to the __pkvm_create_mappings() hypercall. But in a world where the host is no longer part of the TCB, this clearly poses a problem. To fix this, introduce a new hypercall to allow the host to share a physical memory page with the hypervisor, and remove the __pkvm_create_mappings() variant. The new hypercall implements ownership and permission checks before allowing the sharing operation, and it annotates the shared page in the hypervisor stage-1 and host stage-2 page-tables. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-21-qperret@google.com	2021-08-11 11:39:52 +01:00
Quentin Perret	f9370010e9	KVM: arm64: Refactor protected nVHE stage-1 locking Refactor the hypervisor stage-1 locking in nVHE protected mode to expose a new pkvm_create_mappings_locked() function. This will be used in later patches to allow walking and changing the hypervisor stage-1 without releasing the lock. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-20-qperret@google.com	2021-08-11 11:39:51 +01:00
Quentin Perret	ad0e0139a8	KVM: arm64: Remove __pkvm_mark_hyp Now that we mark memory owned by the hypervisor in the host stage-2 during __pkvm_init(), we no longer need to rely on the host to explicitly mark the hyp sections later on. Remove the __pkvm_mark_hyp() hypercall altogether. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-19-qperret@google.com	2021-08-11 11:39:51 +01:00
Quentin Perret	2c50166c62	KVM: arm64: Mark host bss and rodata section as shared As the hypervisor maps the host's .bss and .rodata sections in its stage-1, make sure to tag them as shared in hyp and host page-tables. But since the hypervisor relies on the presence of these mappings, we cannot let the host in complete control of the memory regions -- it must not unshare or donate them to another entity for example. To prevent this, let's transfer the ownership of those ranges to the hypervisor itself, and share the pages back with the host. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-18-qperret@google.com	2021-08-11 11:39:51 +01:00
Quentin Perret	9024b3d006	KVM: arm64: Enable retrieving protections attributes of PTEs Introduce helper functions in the KVM stage-2 and stage-1 page-table manipulation library allowing to retrieve the enum kvm_pgtable_prot of a PTE. This will be useful to implement custom walkers outside of pgtable.c. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-17-qperret@google.com	2021-08-11 11:39:51 +01:00
Quentin Perret	e009dce129	KVM: arm64: Introduce addr_is_memory() Introduce a helper usable in nVHE protected mode to check whether a physical address is in a RAM region or not. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-16-qperret@google.com	2021-08-11 11:39:51 +01:00
Quentin Perret	2d77e238ba	KVM: arm64: Expose pkvm_hyp_id Allow references to the hypervisor's owner id from outside mem_protect.c. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-15-qperret@google.com	2021-08-11 11:39:51 +01:00
Quentin Perret	39257da0e0	KVM: arm64: Expose host stage-2 manipulation helpers We will need to manipulate the host stage-2 page-table from outside mem_protect.c soon. Introduce two functions allowing this, and make them usable to users of mem_protect.h. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-14-qperret@google.com	2021-08-11 11:39:50 +01:00
Quentin Perret	ec250a67ea	KVM: arm64: Add helpers to tag shared pages in SW bits We will soon start annotating shared pages in page-tables in nVHE protected mode. Define all the states in which a page can be (owned, shared and owned, shared and borrowed), and provide helpers allowing to convert this into SW bits annotations using the matching prot attributes. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-13-qperret@google.com	2021-08-11 11:39:50 +01:00
Quentin Perret	4505e9b624	KVM: arm64: Allow populating software bits Introduce infrastructure allowing to manipulate software bits in stage-1 and stage-2 page-tables using additional entries in the kvm_pgtable_prot enum. This is heavily inspired by Marc's implementation of a similar feature in the NV patch series, but adapted to allow stage-1 changes as well: https://lore.kernel.org/kvmarm/20210510165920.1913477-56-maz@kernel.org/ Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-12-qperret@google.com	2021-08-11 11:39:50 +01:00
Quentin Perret	5651311941	KVM: arm64: Enable forcing page-level stage-2 mappings Much of the stage-2 manipulation logic relies on being able to destroy block mappings if e.g. installing a smaller mapping in the range. The rationale for this behaviour is that stage-2 mappings can always be re-created lazily. However, this gets more complicated when the stage-2 page-table is used to store metadata about the underlying pages. In such cases, destroying a block mapping may lead to losing part of the state, and confuse the user of those metadata (such as the hypervisor in nVHE protected mode). To avoid this, introduce a callback function in the pgtable struct which is called during all map operations to determine whether the mappings can use blocks, or should be forced to page granularity. This is used by the hypervisor when creating the host stage-2 to force page-level mappings when using non-default protection attributes. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-11-qperret@google.com	2021-08-11 11:39:50 +01:00
Quentin Perret	b53846c5f2	KVM: arm64: Tolerate re-creating hyp mappings to set software bits The current hypervisor stage-1 mapping code doesn't allow changing an existing valid mapping. Relax this condition by allowing changes that only target software bits, as that will soon be needed to annotate shared pages. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-10-qperret@google.com	2021-08-11 11:39:50 +01:00
Quentin Perret	8a0282c681	KVM: arm64: Don't overwrite software bits with owner id We will soon start annotating page-tables with new flags to track shared pages and such, and we will do so in valid mappings using software bits in the PTEs, as provided by the architecture. However, it is possible that we will need to use those flags to annotate invalid mappings as well in the future, similar to what we do to track page ownership in the host stage-2. In order to facilitate the annotation of invalid mappings with such flags, it would be preferable to re-use the same bits as for valid mappings (bits [58-55]), but these are currently used for ownership encoding. Since we have plenty of bits left to use in invalid mappings, move the ownership bits further down the PTE to avoid the conflict. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-9-qperret@google.com	2021-08-11 11:39:50 +01:00
Quentin Perret	178cac08d5	KVM: arm64: Rename KVM_PTE_LEAF_ATTR_S2_IGNORED The ignored bits for both stage-1 and stage-2 page and block descriptors are in [55:58], so rename KVM_PTE_LEAF_ATTR_S2_IGNORED to make it applicable to both. And while at it, since these bits are more commonly known as 'software' bits, rename accordingly. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-8-qperret@google.com	2021-08-11 11:39:49 +01:00
Quentin Perret	c4f0935e4d	KVM: arm64: Optimize host memory aborts The kvm_pgtable_stage2_find_range() function is used in the host memory abort path to try and look for the largest block mapping that can be used to map the faulting address. In order to do so, the function currently walks the stage-2 page-table and looks for existing incompatible mappings within the range of the largest possible block. If incompatible mappings are found, it tries the same procedure again, but using a smaller block range, and repeats until a matching range is found (potentially up to page granularity). While this approach has benefits (mostly in the fact that it proactively coalesces host stage-2 mappings), it can be slow if the ranges are fragmented, and it isn't optimized to deal with CPUs faulting on the same IPA as all of them will do all the work every time. To avoid these issues, remove kvm_pgtable_stage2_find_range(), and walk the page-table only once in the host_mem_abort() path to find the closest leaf to the input address. With this, use the corresponding range if it is invalid and not owned by another entity. If a valid leaf is found, return -EAGAIN similar to what is done in the kvm_pgtable_stage2_map() path to optimize concurrent faults. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-7-qperret@google.com	2021-08-11 11:39:49 +01:00
Quentin Perret	51add45773	KVM: arm64: Expose page-table helpers The KVM pgtable API exposes the kvm_pgtable_walk() function to allow the definition of walkers outside of pgtable.c. However, it is not easy to implement any of those walkers without some of the low-level helpers. Move some of them to the header file to allow re-use from other places. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-6-qperret@google.com	2021-08-11 11:39:49 +01:00
Quentin Perret	1bac49d490	KVM: arm64: Provide the host_stage2_try() helper macro We currently unmap all MMIO mappings from the host stage-2 to recycle the pages whenever we run out. In order to make this pattern easy to re-use from other places, factor the logic out into a dedicated macro. While at it, apply the macro for the kvm_pgtable_stage2_set_owner() calls. They're currently only called early on and are guaranteed to succeed, but making them robust to the -ENOMEM case doesn't hurt and will avoid painful debugging sessions later on. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-4-qperret@google.com	2021-08-11 11:39:36 +01:00
Quentin Perret	8e049e0daf	KVM: arm64: Introduce hyp_assert_lock_held() Introduce a poor man's lockdep implementation at EL2 which allows to BUG() whenever a hyp spinlock is not held when it should. Hide this feature behind a new Kconfig option that targets the EL2 object specifically, instead of piggy backing on the existing CONFIG_LOCKDEP. EL2 cannot WARN() cleanly to report locking issues, hence BUG() is the only option and it is not clear whether we want this widely enabled. This is most likely going to be useful for local testing until the EL2 WARN() situation has improved. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-3-qperret@google.com	2021-08-11 11:39:35 +01:00
Will Deacon	d21292f13f	KVM: arm64: Add hyp_spin_is_locked() for basic locking assertions at EL2 Introduce hyp_spin_is_locked() so that functions can easily assert that a given lock is held (albeit possibly by another CPU!) without having to drag full lockdep support up to EL2. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-2-qperret@google.com	2021-08-11 11:39:35 +01:00
James Morse	cdb9ebc917	x86/resctrl: Add a separate schema list for resctrl Resctrl exposes schemata to user-space, which allow the control values to be specified for a group of tasks. User-visible properties of the interface, (such as the schemata names and how the values are parsed) are rooted in a struct provided by the architecture code. (struct rdt_hw_resource). Once a second architecture uses resctrl, this would allow user-visible properties to diverge between architectures. These properties should come from the resctrl code that will be common to all architectures. Resctrl has no per-schema structure, only struct rdt_{hw_,}resource. Create a struct resctrl_schema to hold the rdt_resource. Before a second architecture can be supported, this structure will also need to hold the schema name visible to user-space and the type of configuration values for resctrl. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-4-james.morse@arm.com	2021-08-11 12:28:01 +02:00

1042671 commits