Commit graph

227569 commits

Author SHA1 Message Date
Ingo Molnar
050c6c9b89 sched: Remove debugging check
Linus reported that the new warning introduced by commit f26f9aff6a
"Sched: fix skip_clock_update optimization" triggers. The need_resched
flag can be set by other CPUs asynchronously so this debug check is
bogus - remove it.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <AANLkTinJ8hAG1TpyC+CSYPR47p48+1=E7fiC45hMXT_1@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 23:24:27 +01:00
Jonas Aaberg
3c5728edbe ux500: platsmp: Fix section mismatch
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 21:31:49 +01:00
Sundar Iyer
556fb03869 mach-ux500: add STMPE1601 platform data
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
[Minor fixups to GPIO enumerators]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 20:54:55 +01:00
Sundar Iyer
e43abe6f98 mach-ux500: move keymaps to new file
Move keylayouts to a dedicated file and plug these keylayouts
for input platform data. This will make addition of new and custom
keylayouts localized.

Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 20:51:37 +01:00
Linus Torvalds
55ec86f848 Merge branches 'x86-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32: Make sure we can map all of lowmem if we need to
  x86, vt-d: Handle previous faults after enabling fault handling
  x86: Enable the intr-remap fault handling after local APIC setup
  x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode
  x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
  x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem()
  bootmem: Add alloc_bootmem_align()
  x86, gcc-4.6: Use gcc -m options when building vdso
  x86: HPET: Chose a paranoid safe value for the ETIME check
  x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Fix off by one in perf_swevent_init()
  perf: Fix duplicate events with multiple-pmu vs software events
  ftrace: Have recordmcount honor endianness in fn_ELF_R_INFO
  scripts/tags.sh: Add magic for trace-events
  tracing: Fix panic when lseek() called on "trace" opened for writing
2010-12-19 10:44:54 -08:00
Linus Torvalds
21228e4557 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix the irqtime code for 32bit
  sched: Fix the irqtime code to deal with u64 wraps
  nohz: Fix get_next_timer_interrupt() vs cpu hotplug
  Sched: fix skip_clock_update optimization
  sched: Cure more NO_HZ load average woes
2010-12-19 10:37:37 -08:00
Sundar Iyer
593e9d70fb mfd/tc3589x: add suspend/resume support
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:58 +01:00
Sundar Iyer
523bc3820f mfd/tc3589x: undo gpio module reset during chip init
Skip putting the GPIO module into a reset during the chip init.  This makes
sure to preserve any existing GPIO configurations done by pre-kernel boot code.

Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:55 +01:00
Sundar Iyer
bd77efd0ce mfd/tc3589x: fix random interrupt misses
On the TC35892, a random delayed interrupt clear (GPIO IC) write locks up the
child interrupts. In such a case, the original interrupt is active and not yet
acknowledged. Re-check the IRQST bit for any pending interrupts and handle
those.

Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:52 +01:00
Sundar Iyer
611b7590af mfd/tc3589x: add block identifier for multiple child devices
Add block identifier to be able to add multiple mfd clients
to the mfd core

Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:49 +01:00
Sundar Iyer
20406ebff4 mfd/tc3589x: rename tc35892 structs/registers to tc359x
Most of the register layout, client IRQ numbers on the TC35892 is shared also
by other variants. Make this generic as tc3589x

Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:46 +01:00
Sundar Iyer
f4e8afdc7a mfd/tc35892: rename tc35892 core driver to tc3589x
Rename the tc35892 core/gpio drivers to tc3589x to include
new variants in the same mfd core

Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:42 +01:00
Sundar Iyer
c6eda6c5ee mfd/tc35892: rename tc35892 header to tc3589x
Rename the header file to include further variants within
the same mfd core driver

Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:39 +01:00
Sundar Iyer
d5d228158e mach-ux500: deprecate spi support for ab8500
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19 19:27:36 +01:00
Nicolas Pitre
6d3e6d3640 ARM: fix cache-feroceon-l2 after stack based kmap_atomic()
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is actively
wrong to rely on fixed kmap type indices (namely KM_L2_CACHE) as
kmap_atomic() totally ignores them and a concurrent instance of it may
happily reuse any slot for any purpose.  Because kmap_atomic() is now
able to deal with reentrancy, we can get rid of the ad hoc mapping here.

While the code is made much simpler, there is a needless cache flush
introduced by the usage of __kunmap_atomic().  It is not clear if the
performance difference to remove that is worth the cost in code
maintenance (I don't think there are that many highmem users on that
platform anyway) but that should be reconsidered when/if someone cares
enough to do some measurements.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19 12:57:16 -05:00
Nicolas Pitre
25cbe45440 ARM: fix cache-xsc3l2 after stack based kmap_atomic()
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is actively
wrong to rely on fixed kmap type indices (namely KM_L2_CACHE) as
kmap_atomic() totally ignores them and a concurrent instance of it may
happily reuse any slot for any purpose.  Because kmap_atomic() is now
able to deal with reentrancy, we can get rid of the ad hoc mapping here,
and we even don't have to disable IRQs anymore (highmem case).

While the code is made much simpler, there is a needless cache flush
introduced by the usage of __kunmap_atomic().  It is not clear if the
performance difference to remove that is worth the cost in code
maintenance (I don't think there are that many highmem users on that
platform if at all anyway).

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19 12:57:08 -05:00
Nicolas Pitre
39af22a792 ARM: get rid of kmap_high_l1_vipt()
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is no longer
necessary to carry an ad hoc version of kmap_atomic() added in commit
7e5a69e83b "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope
with reentrancy.

In fact, it is now actively wrong to rely on fixed kmap type indices
(namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a
concurrent instance of it may reuse any slot for any purpose.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19 12:56:46 -05:00
Linus Walleij
991a86e182 ARM: 6530/1: mmci: partially revert clock divisor code
I misread the datasheet as if bypass mode was not available at all
on the ux500's, I was wrong. It is there, the datasheet just
states that you should not have to use it.

Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19 16:01:25 +00:00
Linus Walleij
b70a67f938 ARM: 6526/1: mmci: corrected calculation of clock div for ux500
The Ux500 variant of this block has a different divider.
The value used right now is too big and which means a loss
in performance. This fix corrects it. Also expand the math
comments a bit so it's clear what's happening. Further
the Ux500 variant does not like if we use the BYPASS bit,
instead we are supposed to set the clock divider to zero.

Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19 16:01:24 +00:00
Russell King
132b16325f ARM: AT91: update clock source registration
In d7e81c2 (clocksource: Add clocksource_register_hz/khz interface) new
interfaces were added which simplify (and optimize) the selection of the
divisor shift/mult constants.  Switch over to using this new interface.

Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19 15:44:53 +00:00
Russell King
40cc524400 ARM: clockevents: fix IOP clock events initialization
Ensure that no interrupt is pending before registering the clock
event device, and properly initialize the periodic tick in the
->set_mode callback.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19 15:44:53 +00:00
Paul Turner
19e5eebb8e sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight
Mike Galbraith reported poor interactivity[*] when the new shares distribution
code was combined with autogroups.

The root cause turns out to be a mis-ordering of accounting accrued execution
time and shares updates.  Since update_curr() is issued hierarchically,
updating the parent entity weights to reflect child enqueue/dequeue results in
the parent's unaccounted execution time then being accrued (vs vruntime) at the
new weight as opposed to the weight present at accumulation.

While this doesn't have much effect on processes with timeslices that cross a
tick, it is particularly problematic for an interactive process (e.g. Xorg)
which incurs many (tiny) timeslices.  In this scenario almost all updates are
at dequeue which can result in significant fairness perturbation (especially if
it is the only thread, resulting in potential {tg->shares, MIN_SHARES}
transitions).

Correct this by ensuring unaccounted time is accumulated prior to manipulating
an entity's weight.

[*] http://xkcd.com/619/ is perversely Nostradamian here.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20101216031038.159704378@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:36:30 +01:00
Paul Turner
43365bd7ff sched: Move periodic share updates to entity_tick()
Long running entities that do not block (dequeue) require periodic updates to
maintain accurate share values.  (Note: group entities with several threads are
quite likely to be non-blocking in many circumstances).

By virtue of being long-running however, we will see entity ticks (otherwise
the required update occurs in dequeue/put and we are done).  Thus we can move
the detection (and associated work) for these updates into the periodic path.

This restores the 'atomicity' of update_curr() with respect to accounting.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101216031038.067028969@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:36:22 +01:00
Ingo Molnar
ca680888d5 Merge commit 'v2.6.37-rc6' into sched/core
Merge reason: Update to the latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:35:14 +01:00
Robert Richter
da169f5df2 oprofile, x86: Add support for 6 counters (AMD family 15h)
This patch adds support for up to 6 hardware counters for AMD family
15h cpus. There is a new MSR range for hardware counters beginning at
MSRC001_0200 Performance Event Select (PERF_CTL0).

Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-12-19 11:43:08 +01:00
Robert Richter
30570bced1 oprofile, x86: Add support for AMD family 15h
This patch adds support for AMD family 15h (Interlagos/Valencia/
Zambezi) cpus.

Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-12-19 11:43:04 +01:00
stephen hemminger
d1ed113f16 ipv6: remove duplicate neigh_ifdown
When device is being set to down, neigh_ifdown was being called
twice. Once from addrconf notifier and once from ndisc notifier.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-18 22:01:17 -08:00
stephen hemminger
bc3ef6605e ipv6: fib6_ifdown cleanup
Remove (unnecessary) casts to make code cleaner.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-18 22:01:16 -08:00
Linus Torvalds
0a59228168 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/tile: handle rt_sigreturn() more cleanly
  arch/tile: handle CLONE_SETTLS in copy_thread(), not user space
2010-12-18 10:28:54 -08:00
Linus Torvalds
2ba16c4f45 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Fix build errors in sc-mips.c
2010-12-18 10:23:29 -08:00
Linus Torvalds
46bdfe6a50 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86: avoid high BIOS area when allocating address space
  x86: avoid E820 regions when allocating address space
  x86: avoid low BIOS area when allocating address space
  resources: add arch hook for preventing allocation in reserved areas
  Revert "resources: support allocating space within a region from the top down"
  Revert "PCI: allocate bus resources from the top down"
  Revert "x86/PCI: allocate space from the end of a region, not the beginning"
  Revert "x86: allocate space within a region top-down"
  Revert "PCI: fix pci_bus_alloc_resource() hang, prefer positive decode"
  PCI: Update MCP55 quirk to not affect non HyperTransport variants
2010-12-18 10:13:24 -08:00
Santosh Shilimkar
b89cd71a15 omap4: l2x0: Enable early BRESP bit
The AXI protocol specifies that the write response can only
be sent back to an AXI master when the last write data has been
accepted. This optimization enables the PL310 to send the write
response of certain write transactions as soon as the store buffer
accepts the write address. This behavior is not compatible with
the AXI protocol and is disabled by default. You enable this
optimization by setting the Early BRESP Enable bit in the
Auxiliary Control Register (bit [30]).

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Mans Rullgard <mans@mansr.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:33:01 -08:00
Santosh Shilimkar
b0f20ff9d7 omap4: l2x0: Set share override bit
Clearing bit 22 in the PL310 Auxiliary Control register (shared
attribute override enable) has the side effect of transforming Normal
Shared Non-cacheable reads into Cacheable no-allocate reads.

Coherent DMA buffers in Linux always have a Cacheable alias via the
kernel linear mapping and the processor can speculatively load cache
lines into the PL310 controller. With bit 22 cleared, Non-cacheable
reads would unexpectedly hit such cache lines leading to buffer
corruption

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:32:55 -08:00
Mans Rullgard
11e0264046 omap4: l2x0: enable instruction and data prefetching
Enabling L2 prefetching improves performance as shown on Panda
ES2.1 board with mem test, and it has measurable impact on
performances. I think we should consider it, even though it damages
"writes" a bit. (rebased to k.org)
Usually the prefetch is used at both levels together L1 + L2, however,
to enable the CP15 prefetch engines, these are under security, and on
GP devices, we cannot enable it(e.g. on PandaBoard). However, just
enabling PL310 prefetch seems to provide performance improvement,
as shown in the data below (from Ubuntu) and would be a great thing
to pull in.

What prefetch does is enable automatic next line prefetching. With this
enabled, whenever the PL310 receives a cachable read request, it
automatically prefetches the following cache line as well.

Measurement Data:
==
STOCK 10.10 WITHOUT PATCH

========================
~# ./memspeed
size    8388608 8192k 8M
offset  8388608, 0
buffers 0x2aaad000 0x2b2ad000
copy  libc          133 MB/s
copy  Android v5    273 MB/s
copy  Android NEON  235 MB/s
copy  INT32         116 MB/s
copy  ASM ARM       187 MB/s
copy  ASM VLDM 64   204 MB/s
copy  ASM VLDM 128  173 MB/s
copy  ASM VLD1      216 MB/s
read  ASM ARM       286 MB/s
read  ASM VLDM      242 MB/s
read  ASM VLD1      286 MB/s
write libc         1947 MB/s
write ASM ARM      1943 MB/s
write ASM VSTM     1942 MB/s
write ASM VST1     1935 MB/s

10.10 + PATCH
=============
~# ./memspeed
size    8388608 8192k 8M
offset  8388608, 0
buffers 0x2ab17000 0x2b317000
copy  libc          129 MB/s
copy  Android v5    256 MB/s
copy  Android NEON  356 MB/s
copy  INT32         127 MB/s
copy  ASM ARM       321 MB/s
copy  ASM VLDM 64   337 MB/s
copy  ASM VLDM 128  321 MB/s
copy  ASM VLD1      350 MB/s
read  ASM ARM       496 MB/s
read  ASM VLDM      470 MB/s
read  ASM VLD1      488 MB/s
write libc         1701 MB/s
write ASM ARM      1682 MB/s
write ASM VSTM     1693 MB/s
write ASM VST1     1681 MB/s

Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:32:41 -08:00
Santosh Shilimkar
1773e60a81 omap4: l2x0: Construct the AUXCTRL value using defines
This patch removes the hardcoded value of auxctrl value and
construct it using bitfields

Bit 25 is reserved and is always set to 1. Same value
of this bit is retained in this patch

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:31:59 -08:00
Santosh Shilimkar
0aaa6f8f1d ARM: l2x0: Add aux control register bitfields
This patch adds the PL310 Auxiliary Control Register bitfields
so that SOC's can use these bit fields to construct the AUXCTRL
value to be passed/programmed instead of hardcoding it.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18 09:30:15 -08:00
Christoph Lameter
7c83912062 vmstat: User per cpu atomics to avoid interrupt disable / enable
Currently the operations to increment vm counters must disable interrupts
in order to not mess up their housekeeping of counters.

So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
count on preremption being disabled we still have some minor issues.
The fetching of the counter thresholds is racy.
A threshold from another cpu may be applied if we happen to be
rescheduled on another cpu.  However, the following vmstat operation
will then bring the counter again under the threshold limit.

The operations for __xxx_zone_state are not changed since the caller
has taken care of the synchronization needs (and therefore the cycle
count is even less than the optimized version for the irq disable case
provided here).

The optimization using this_cpu_cmpxchg will only be used if the arch
supports efficient this_cpu_ops (must have CONFIG_CMPXCHG_LOCAL set!)

The use of this_cpu_cmpxchg reduces the cycle count for the counter
operations by %80 (inc_zone_page_state goes from 170 cycles to 32).

Signed-off-by: Christoph Lameter <cl@linux.com>
2010-12-18 15:54:49 +01:00
Christoph Lameter
20b876918c irq_work: Use per cpu atomics instead of regular atomics
The irq work queue is a per cpu object and it is sufficient for
synchronization if per cpu atomics are used. Doing so simplifies
the code and reduces the overhead of the code.

Before:

christoph@linux-2.6$ size kernel/irq_work.o
   text	   data	    bss	    dec	    hex	filename
    451	      8	      1	    460	    1cc	kernel/irq_work.o

After:

christoph@linux-2.6$ size kernel/irq_work.o 
   text	   data	    bss	    dec	    hex	filename
    438	      8	      1	    447	    1bf	kernel/irq_work.o

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Christoph Lameter <cl@linux.com>
2010-12-18 15:54:48 +01:00
Tejun Heo
05c2d088d0 Merge branch 'this_cpu_ops' into for-2.6.38 2010-12-18 15:54:36 +01:00
Christoph Lameter
8270137a0d cpuops: Use cmpxchg for xchg to avoid lock semantics
Use cmpxchg instead of xchg to realize this_cpu_xchg.

xchg will cause LOCK overhead since LOCK is always implied but cmpxchg
will not.

Baselines:

xchg()		= 18 cycles (no segment prefix, LOCK semantics)
__this_cpu_xchg = 1 cycle

(simulated using this_cpu_read/write, two prefixes. Looks like the
cpu can use loop optimization to get rid of most of the overhead)

Cycles before:

this_cpu_xchg	 = 37 cycles (segment prefix and LOCK (implied by xchg))

After:

this_cpu_xchg	= 11 cycle (using cmpxchg without lock semantics)

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18 15:54:04 +01:00
Christoph Lameter
7296e08aba x86: this_cpu_cmpxchg and this_cpu_xchg operations
Provide support as far as the hardware capabilities of the x86 cpus
allow.

Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for
fast cpuops implementations.

V1->V2:
	- Take out the definition for this_cpu_cmpxchg_8 and move it into
	  a separate patch.

tj: - Reordered ops to better follow this_cpu_* organization.
    - Renamed macro temp variables similar to their existing
      neighbours.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18 15:54:04 +01:00
Christoph Lameter
2b71244285 percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
Generic code to provide new per cpu atomic features

	this_cpu_cmpxchg
	this_cpu_xchg

Fallback occurs to functions using interrupts disable/enable
to ensure correct per cpu atomicity.

Fallback to regular cmpxchg and xchg is not possible since per cpu atomic
semantics include the guarantee that the current cpus per cpu data is
accessed atomically. Use of regular cmpxchg and xchg requires the
determination of the address of the per cpu data before regular cmpxchg
or xchg which therefore cannot be atomically included in an xchg or
cmpxchg without segment override.

tj: - Relocated new ops to conform better to the general organization.
    - This patch contains a trivial comment fix.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18 15:54:04 +01:00
Russell King
9326845f45 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nico/orion into devel-stable 2010-12-18 14:28:31 +00:00
Russell King
2f841ed13b Merge branch 'hw-breakpoint' of git://repo.or.cz/linux-2.6/linux-wd into devel-stable 2010-12-18 14:27:55 +00:00
Russell King
1ae1b5f053 ARM: smp: avoid incrementing mm_users on CPU startup
We should not be incrementing mm_users when we startup a secondary
CPU - doing so results in mm_users incrementing by one each time we
hotplug a CPU, which will eventually wrap, and will cause problems.

Other architectures such as x86 do not increment mm_users, but only
mm_count, so we follow that pattern.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-18 13:57:00 +00:00
Haojian Zhuang
a79a9ad94a ARM: pxa: sanitize IRQ registers access based on offset
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
2010-12-18 21:02:17 +08:00
Haojian Zhuang
3f408fa071 ARM: mmp: select CPU_PJ4
Since CPU_PJ4 is shared between PXA95x and MMP2, select CPU_PJ4 in MMP2
configuration.

Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
2010-12-18 21:02:16 +08:00
Haojian Zhuang
c9b5ef47dc ARM: pxa: support saarb platform
Saarb platform is a handheld platform that supports Marvell PXA955 silicon.

Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
2010-12-18 21:02:15 +08:00
Haojian Zhuang
a4553358d9 ARM: pxa: support pxa95x
The core of PXA955 is PJ4. Add new PJ4 support. And add new macro
CONFIG_PXA95x.

Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
2010-12-18 21:02:14 +08:00
Eric Miao
aae8224ddd ARM: pxa: introduce pxa3xx_clock_sysclass for clock suspend/resume
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
2010-12-18 21:02:03 +08:00