__kernel_sync_dicache_p5() is an alternative to
__kernel_sync_dicache() when cpu has CPU_FTR_COHERENT_ICACHE
Remove this alternative function and merge
__kernel_sync_dicache_p5() into __kernel_sync_dicache() using
standard CPU feature fixup.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4c7dcc6544882761b2b0249d7a8ec2c3a8088cb5.1601197618.git.christophe.leroy@csgroup.eu
This is copied from arm64.
Instead of using runtime generated signal trampoline offsets,
get offsets at buildtime.
If the said trampoline doesn't exist, build will fail. So no
need to check whether the trampoline exists or not in the VDSO.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f8bfd6812c3e3678b1cdb4d55a52f9eb022b40d3.1601197618.git.christophe.leroy@csgroup.eu
All other architectures but s390 use a void pointer named 'vdso'
to reference the VDSO mapping.
In a following patch, the VDSO data page will be put in front of
text, vdso_base will then not anymore point to VDSO text.
To avoid confusion between vdso_base and VDSO text, rename vdso_base
into vdso and make it a void __user *.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8e6cefe474aa4ceba028abb729485cd46c140990.1601197618.git.christophe.leroy@csgroup.eu
Copied from commit 2fea7f6c98 ("arm64: vdso: move to
_install_special_mapping and remove arch_vma_name").
Use the new _install_special_mapping() API added by
commit a62c34bd2a ("x86, mm: Improve _install_special_mapping
and fix x86 vdso naming") which obsolete install_special_mapping().
And remove arch_vma_name() as the name is handled by the new API.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: kernel test robot <lkp@intel.com>
[mpe: Squash fix to use PTR_ERR_OR_ZERO() from lkp]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e7e5dfe0f93234e31051f2a610b4b07f50b0082f.1601197618.git.christophe.leroy@csgroup.eu
Today vdso_data structure has:
- syscall_map_32[] and syscall_map_64[] on PPC64
- syscall_map_32[] on PPC32
On PPC32, syscall_map_32[] is populated using sys_call_table[].
On PPC64, syscall_map_64[] is populated using sys_call_table[]
and syscal_map_32[] is populated using compat_sys_call_table[].
To simplify vdso_setup_syscall_map(),
- On PPC32 rename syscall_map_32[] into syscall_map[],
- On PPC64 rename syscall_map_64[] into syscall_map[],
- On PPC64 rename syscall_map_32[] into compat_syscall_map[].
That way, syscall_map[] gets populated using sys_call_table[] and
compat_syscall_map[] gets population using compat_sys_call_table[].
Also define an empty compat_syscall_map[] on PPC32 to avoid ifdefs.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/472734be0d9991eee320a06824219a5b2663736b.1601197618.git.christophe.leroy@csgroup.eu
Instead of including extern references locally in
vdso_setup_syscall_map(), add the missing headers.
sys_ni_syscall() being a function, cast its address to
an unsigned long instead of declaring it as a fake
unsigned long object.
At the same time, remove a comment which paraphrases the
function name.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b4afedce748ed2858299ceab5ae29b52109263ef.1601197618.git.christophe.leroy@csgroup.eu
Since commit 24b659a138 ("powerpc: Use unstripped VDSO image for
more accurate profiling data"), only the unstripped VDSO image
has been used.
Partially revert commit 8150caad02 ("[POWERPC] powerpc vDSO: install
unstripped copies on disk") to avoid building the stripped version.
And the unstripped version in $(MODLIB)/vdso/ is not required
anymore as it is the one embedded in the kernel image.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5986ca25be44fe6e9790486304507f240077d8c4.1601197618.git.christophe.leroy@csgroup.eu
Change those two functions to be used within a user access block.
For that, change save_general_regs() to and unsafe_save_general_regs(),
then replace all user accesses by unsafe_ versions.
This series leads to a reduction from 2.55s to 1.73s of
the system CPU time with the following microbench app
on an mpc832x with KUAP (approx 32%)
Without KUAP, the difference is in the noise.
void sigusr1(int sig) { }
int main(int argc, char **argv)
{
int i = 100000;
signal(SIGUSR1, sigusr1);
for (;i--;)
raise(SIGUSR1);
exit(0);
}
An additional 0.10s reduction is achieved by removing
CONFIG_PPC_FPU, as the mpc832x has no FPU.
A bit less spectacular on an 8xx as KUAP is less heavy, prior to
the series (with KUAP) it ran in 8.10 ms. Once applies the removal
of FPU regs handling, we get 7.05s. With the full series, we get 6.9s.
If artificially re-activating FPU regs handling with the full series,
we get 7.6s.
So for the 8xx, the removal of the FPU regs copy is what makes the
difference, but the rework of handle_signal also have a benefit.
Same as above, without KUAP the difference is in the noise.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fixup typo in SPE handling]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c7b37b385ccf9666066452e58f018a86573f83e8.1597770847.git.christophe.leroy@csgroup.eu
Reorder actions in save_user_regs() and save_tm_user_regs() to
regroup copies together in order to switch to user_access_begin()
logic in a later patch.
Move non-copy actions into new functions called
prepare_save_user_regs() and prepare_save_tm_user_regs().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f6eac65781b4a57220477c8864bca2b57f29a5d5.1597770847.git.christophe.leroy@csgroup.eu
put_sigset_t() calls copy_to_user() for copying two words.
This is terribly inefficient for copying two words.
By switching to unsafe_put_user(), we end up with something as
simple as:
3cc: 81 3d 00 00 lwz r9,0(r29)
3d0: 91 26 00 b4 stw r9,180(r6)
3d4: 81 3d 00 04 lwz r9,4(r29)
3d8: 91 26 00 b8 stw r9,184(r6)
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/06def97e87ac1c4ae8e3197e0982e1fab7b3c8ae.1597770847.git.christophe.leroy@csgroup.eu
Implement 'unsafe' version of put_compat_sigset()
For the bigendian, use unsafe_put_user() directly
to avoid intermediate copy through the stack.
For the littleendian, use a straight unsafe_copy_to_user().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/537c7082ee309a0bb9c67a50c5d9dd929aedb82d.1597770847.git.christophe.leroy@csgroup.eu
On the same way as handle_signal32(), replace all user
accesses with equivalent unsafe_ versions, and move the
trampoline code icache flush outside the user access block.
Functions that have no unsafe_ equivalent also remains outside
the access block.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/2974314226256f958e2984912b48883ef1754185.1597770847.git.christophe.leroy@csgroup.eu
Those two functions are similar and serving the same purpose.
To ease refactorisation, move them close to each other.
This is pure move, no code change, no cosmetic. Yes, checkpatch is
not happy, most will clear later.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/dbce67900bf566bcf40179467bf1eb500814c405.1597770847.git.christophe.leroy@csgroup.eu
If something is bad in the frame, there is no point in
knowing which part of the frame exactly is wrong as it
got allocated as a single block.
Always print the root address of the frame in case of
failed user access, just like handle_signal32().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/691895bd31fee89a2d8370befd66ad4eff5b63f2.1597770847.git.christophe.leroy@csgroup.eu
Instead of calling get_tm_stackpointer() from the caller, call it
directly from get_sigframe(). This avoids a double call and
allows get_tm_stackpointer() to become static and be inlined
into get_sigframe() by GCC.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/abfdc105b8b28c4eb3ab9a26297d17f302b600ea.1597770847.git.christophe.leroy@csgroup.eu
get_clean_sp() is only used once in kernel/signal.c .
GCC is smart enough to see that x & 0xffffffff is a nop
calculation on PPC32, no need of a special PPC32 trivial version.
Include the logic from the PPC64 version of get_clean_sp() directly
in get_sigframe().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/13ef6510ce30a4867e043157b93af5bb8c67fb3b.1597770847.git.christophe.leroy@csgroup.eu
There is no point in copying floating point regs when there
is no FPU and MATH_EMULATION is not selected.
Create a new CONFIG_PPC_FPU_REGS bool that is selected by
CONFIG_MATH_EMULATION and CONFIG_PPC_FPU, and use it to
opt out everything related to fp_state in thread_struct.
The asm const used only by fpu.S are opted out with CONFIG_PPC_FPU
as fpu.S build is conditionnal to CONFIG_PPC_FPU.
The following app spends approx 8.1 seconds system time on an 8xx
without the patch, and 7.0 seconds with the patch (13.5% reduction).
On an 832x, it spends approx 2.6 seconds system time without
the patch and 2.1 seconds with the patch (19% reduction).
void sigusr1(int sig) { }
int main(int argc, char **argv)
{
int i = 100000;
signal(SIGUSR1, sigusr1);
for (;i--;)
raise(SIGUSR1);
exit(0);
}
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7569070083e6cd5b279bb5023da601aba3c06f3c.1597770847.git.christophe.leroy@csgroup.eu