Commit graph

494793 commits

Author SHA1 Message Date
Eugenia Emantayev
ddae0349fd net/mlx4: Change QP allocation scheme
When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields
in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset.

The current Ethernet driver code reserves a Tx QP range with 256b alignment.

This is wrong because if there are more than 64 Tx QPs in use,
QPNs >= base + 65 will have bits 6/7 set.

This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.

The new mechanism introduced here will support reservation for
"Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:

1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
   and request "BF enabled QPs" if BF is supported for the function

2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0]  - number of QPs
b. param1[31-24] - flags controlling QPs reservation

Bit 31 refers to Eth blueflame supported QPs. Those QPs must have
bits 6 and 7 unset in order to be used in Ethernet.

Bits 24-30 of the flags are currently reserved.

When a function tries to allocate a QP, it states the required attributes
for this QP. Those attributes are considered "best-effort". If an attribute,
such as Ethernet BF enabled QP, is a must-have attribute, the function has
to check that attribute is supported before trying to do the allocation.

In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those attributes
and masks out unsupported attributes as well. In order to notify VFs which
attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's
mailbox is filled by the PF, which notifies which QP allocation attributes
it supports.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:35 -05:00
Matan Barak
3dca0f42c7 net/mlx4_core: Use tasklet for user-space CQ completion events
Previously, we've fired all our completion callbacks straight from our ISR.

Some of those callbacks were lightweight (for example, mlx4_en's and
IPoIB napi callbacks), but some of them did more work (for example,
the user-space RDMA stack uverbs' completion handler). Besides that,
doing more than the minimal work in ISR is generally considered wrong,
it could even lead to a hard lockup of the system. Since when a lot
of completion events are generated by the hardware, the loop over those
events could be so long, that we'll get into a hard lockup by the system
watchdog.

In order to avoid that, add a new way of invoking completion events
callbacks. In the interrupt itself, we add the CQs which receive completion
event to a per-EQ list and schedule a tasklet. In the tasklet context
we loop over all the CQs in the list and invoke the user callback.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:34 -05:00
Or Gerlitz
383677da43 net/mlx4_core: Mask out host side virtualization features for guests
When VFs (guests in this context) issue the QUERY_DEV_CAP command, they
need not be told that host side virtualization features such as VST, FSM
(MAC anti-spoofing) and running > 80 VFs are supported by the device.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:34 -05:00
Or Gerlitz
c58942f252 net/mlx4_en: Set csum level for encapsulated packets
This was dropped by mistake for the napi_gro_frags flow, fix that.

Fixes: dd65beac48 ('net/mlx4_en: Extend usage of napi_gro_frags')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:34 -05:00
Marcel Holtmann
417287de88 Bluetooth: Fix check for support for page scan related commands
The Read Page Scan Activity and Read Page Scan Type commands are not
supported by all controllers. Move the execution of both commands
into the 3rd phase of the init procedure. And then check the bit
mask of supported commands before adding them to the init sequence.

With this re-ordering of the init sequence, the extra check for
AVM BlueFritz! controllers is no longer needed. They will report
that these two commands are not supported.

This fixes an issue with the Microsoft Corp. Wireless Transceiver
for Bluetooth 2.0 (ID 045e:009c).

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-11 21:42:11 +02:00
Linus Torvalds
e28870f9b3 - Clean-up leaky resources; pwm_bl
- Simplify Device Tree initialisation; lp855x_bl
  - Add Regulator support; lp855x
  - Remove Bryan from the Maintainer list -- new baby, no time :)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUhsq0AAoJEFGvii+H/Hdh8HgQAKcv+1jK3Eouh7YJPBLOQu73
 qNBD6nwcCRjcf2gjW9DkoGxTyZVs2+ndXxG85z3CBOdhr84YeyF1ity76wLs7Dd+
 dQaR1zP+0H0sh0jMS+SGdEBnF5eSV/iBVvR2u8q0Wl8/m7zOJE1PIVEv6P7/+wNJ
 jv/MdzLvp8LEwANwaaknvePCGPnnbLcBcEonivx4u2lePF1Y1Vtk6tHWW8zm/GEG
 p7DrOwWGkCWJwFeROnbzy+oaR88oA5Ezrt5b56u+AMvcnRoSZqPF+cAV7U72AsnH
 wXiKtAE/oBsgMKQcXyeGiGD8/3uwNZPxO3h2kLme7Cw/oL25Z7D/ru0308/82ozo
 gK/9nYiXC8NhEWEhed9+3+Rp7mLGy6BaqZ8GX7uK2jeLhDqNSKDXCOpi7QTEl1mn
 z4mbXi5phTvbcSwcLyytzVIuFfOPAA7WzBVK6U+n0BkGMHrECCRyAEroO5wy/HST
 3B3Q49NNclrqg60/IMFxfaqDLnAw0DWUEshRJP5ggCfyWE9iK/NUSpKtzp3nzWKf
 7WcpiOwjC17emjH8nIDOu5xrbakGbNWLP3Z/keyzhtIP8bDqEvHrL7kFZhirFWOm
 g8z/9he6m93T15B7BwRi+O/gtsdsBp/mTxPNz35elQEvi4pKp0jmUp6F+GIQL3Vh
 xecI7iLhjiiLLB+wJDTU
 =0PNI
 -----END PGP SIGNATURE-----

Merge tag 'backlight-for-linus-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 - Clean-up leaky resources; pwm_bl
 - Simplify Device Tree initialisation; lp855x_bl
 - Add Regulator support; lp855x
 - Remove Bryan from the Maintainer list -- new baby, no time :)

* tag 'backlight-for-linus-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  MAINTAINERS: Remove my name from Backlight subsystem
  backlight: lp855x: Add supply regulator to lp855x
  backlight: lp855x: Refactor DT parsing code
  backlight: pwm: Clean-up pwm requested using legacy API
2014-12-11 11:39:03 -08:00
Sriharsha Basavapatna
630f4b7056 be2net: Export tunnel offloads only when a VxLAN tunnel is created
The encapsulated offload flags shouldn't be unconditionally exported
to the stack. The stack expects offloading to work across all tunnel
types when those flags are set. This would break other tunnels (like
GRE) since be2net currently supports tunnel offload for VxLAN only.

Also, with VxLANs Skyhawk-R can offload only 1 UDP dport. If more
than 1 UDP port is added, we should disable offloads in that case too.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:37:24 -05:00
Kevin Hao
0a4b5a2488 gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
We need to use dma_mapping_error() to check the dma address returned
by dma_map_single/page(). Otherwise we would get warning like this:
  WARNING: at lib/dma-debug.c:1140
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-next-20141029 #196
  task: c0834300 ti: effe6000 task.ti: c0874000
  NIP: c02b2c98 LR: c02b2c98 CTR: c030abc4
  REGS: effe7d70 TRAP: 0700   Not tainted  (3.18.0-rc2-next-20141029)
  MSR: 00021000 <CE,ME>  CR: 22044022  XER: 20000000

  GPR00: c02b2c98 effe7e20 c0834300 00000098 00021000 00000000 c030b898 00000003
  GPR08: 00000001 00000000 00000001 749eec9d 22044022 1001abe0 00000020 ef278678
  GPR16: ef278670 ef278668 ef278660 070a8040 c087f99c c08cdc60 00029000 c0840d44
  GPR24: c08be6e8 c0840000 effe7e78 ef041340 00000600 ef114e10 00000000 c08be6e0
  NIP [c02b2c98] check_unmap+0x51c/0x9e4
  LR [c02b2c98] check_unmap+0x51c/0x9e4
  Call Trace:
  [effe7e20] [c02b2c98] check_unmap+0x51c/0x9e4 (unreliable)
  [effe7e70] [c02b31d8] debug_dma_unmap_page+0x78/0x8c
  [effe7ed0] [c03d1640] gfar_clean_rx_ring+0x208/0x488
  [effe7f40] [c03d1a9c] gfar_poll_rx_sq+0x3c/0xa8
  [effe7f60] [c04f8714] net_rx_action+0xc0/0x178
  [effe7f90] [c00435a0] __do_softirq+0x100/0x1fc
  [effe7fe0] [c0043958] irq_exit+0xa4/0xc8
  [effe7ff0] [c000d14c] call_do_irq+0x24/0x3c
  [c0875e90] [c00048a0] do_IRQ+0x8c/0xf8
  [c0875eb0] [c000ed10] ret_from_except+0x0/0x18

For TX, we need to unmap the pages which has already been mapped and
free the skb before return.

For RX, move the dma mapping and error check to gfar_new_skb(). We
would reuse the original skb in the rx ring when either allocating
skb failure or dma mapping error.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:27:14 -05:00
Hariprasad Shenai
666224d4d5 cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
Remove use of calls into t4_fw_hello() with MASTER_MUST, which results in
FW_HELLO_CMD_MASTERFORCE being set. The firmware doesn't support this and of
course any existing PF Drivers will totally go for a toss.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:25:17 -05:00
Linus Torvalds
c1b30e4d94 Pin control changes for the v3.19 series:
- Force conversion of the ux500 pin control device trees
   and parsers to use the generic pin control bindings.
 - New driver and device tree bindings for the Qualcomm
   PMIC MPP pin controller and GPIO.
 - Some ACPI infrastructure for pin controllers.
 - New driver for the Intel CherryView/Braswell pin controller,
   the first Intel pin controller to fully take advantage of
   the pin control subsystem.
 - Support the Freescale i.MX VF610 variant.
 - Support the sunxi A80 variant.
 - Support the Samsung Exynos 4415 and Exynos 7 variants.
 - Split out Intel pin controllers to their own subdirectory.
 - A large slew of rockchip pin control updates, including
   suspend/resume support.
 - A large slew of Samsung Exynos pin controller updates.
 - Various minor updates and fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUhrHUAAoJEEEQszewGV1zPZsQAMzWjGKcZhyBDWyTsHM/E9nN
 csRIcVdXs+OggH0nr2YNm2AAh+nRlp4DAQCB7S83SLfKFHF4oWT8SlornEl7WKdN
 zcVUbV29LtHkotjtVoGQZmjuJx+uvHlWJt7moTKJsAMTeNyXv25jEp0LGETji24A
 xsIQ+Bp+G9IYZqK1dlJFPva1YMjjt9sBhJqKnOhh5Z+wjj3YdT7z5LW1x001GPju
 kwKumgxOL7qKjvyaI7n2z+9VhGu9zAvoxK2gLOgjgtFQODASLS/gk2oCuRi/fIpn
 RqE+YyfrNSeMKpOjZOXc/R0SRtOkhyvMBYbgQrAX04nio4pbT6x2XgclAe6v7O5Q
 T3GmOR2JZblwrzEPRs5mGBC9p7fd488ToHAPg5ojNH5F70hDkC8wSYYJZmaL+ORw
 umyxRlRjIbQ4vs6cZMlz/NksqpQyqCTMuBRLllo/jsSQlk0Vo3Gdci5J/T10lKd2
 ciX6AxlRKaRyRo+W6/i01xcX7SzzmNZoOCMXWSjsPv7Th+Gm7vIKyVeNOUkiqUXH
 1fVjw/M0AhIttVRbx1qTPsqFaDI/WPPk9EUvVm3W7DFuf0/w9B0HkZe6KpXdp33K
 GV6gEMvmTObvUpwYrYEi7hhKVl+cJ902ZMR/LSmK0QdADhI98pjsokDrigl+Jy93
 U1OepT70fw4mgJnqnevZ
 =sxpe
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control changes from Linus Walleij:
 "Here is a stash of pin control changes I have collected for the v3.19
  series.  Mainly new hardware support, with Intels new embedded SoC as
  the especially interesting thing standing out, fully using the
  subsystem.

   - Force conversion of the ux500 pin control device trees and parsers
     to use the generic pin control bindings.
   - New driver and device tree bindings for the Qualcomm PMIC MPP pin
     controller and GPIO.
   - Some ACPI infrastructure for pin controllers.
   - New driver for the Intel CherryView/Braswell pin controller, the
     first Intel pin controller to fully take advantage of the pin
     control subsystem.
   - Support the Freescale i.MX VF610 variant.
   - Support the sunxi A80 variant.
   - Support the Samsung Exynos 4415 and Exynos 7 variants.
   - Split out Intel pin controllers to their own subdirectory.
   - A large slew of rockchip pin control updates, including
     suspend/resume support.
   - A large slew of Samsung Exynos pin controller updates.
   - Various minor updates and fixes"

* tag 'pinctrl-v3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (49 commits)
  pinctrl: at91: enhance (debugfs) at91_gpio_dbg_show
  pinctrl: meson: add device tree bindings documentation
  gpio: tz1090: Fix error handling of irq_of_parse_and_map
  pinctrl: tz1090-pinctrl.txt: Fix typo in binding
  pinctrl: pinconf-generic: Declare dt_params/conf_items const
  pinctrl: exynos: Add support for Exynos4415
  pinctrl: exynos: Add initial driver data for Exynos7
  pinctrl: exynos: Add irq_chip instance for Exynos7 wakeup interrupts
  pinctrl: exynos: Consolidate irq domain callbacks
  pinctrl: exynos: Generalize the eint16_31 demux code
  pinctrl: samsung: Separate per-bank init and runtime data
  pinctrl: samsung: Constify samsung_pin_ctrl struct
  pinctrl: samsung: Constify samsung_pin_bank_type struct
  pinctrl: samsung: Drop unused label field in samsung_pin_ctrl struct
  pinctrl: samsung: Make samsung_pinctrl_get_soc_data use ERR_PTR()
  pinctrl: Add Intel Cherryview/Braswell pin controller support
  gpio / ACPI: Add knowledge about pin controllers to acpi_get_gpiod()
  pinctrl: Fix path error in documentation
  pinctrl: rockchip: save and restore gpio6_c6 pinmux in suspend/resume
  pinctrl: rockchip: add suspend/resume functions
  ...
2014-12-11 10:43:14 -08:00
Michael S. Tsirkin
de2b48d581 virtio_pci_common.h: drop VIRTIO_PCI_NO_LEGACY
Legacy drivers use virtio_pci_common.h too, we should not
define VIRTIO_PCI_NO_LEGACY there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-11 20:04:39 +02:00
Michael S. Tsirkin
3d2667826c virtio_config: fix virtio_cread_bytes
virtio_cread_bytes is implemented incorrectly in case length happens to
be 2,4 or 8 bytes: transports and devices will assume it's an integer
value that has to be converted to LE format.

Let's just do multiple 1-byte reads: this also makes life easier
for transports who only need to implement 1,2,4 and 8 byte reads.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-11 20:04:38 +02:00
Michael S. Tsirkin
30683a8cce virtio: set VIRTIO_CONFIG_S_FEATURES_OK on restore
virtio 1.0 devices require that drivers set VIRTIO_CONFIG_S_FEATURES_OK
after finalizing features.
virtio core missed doing this on restore, fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-11 20:04:38 +02:00
Rafael J. Wysocki
ae5056e857 Merge branch 'fixes'
* fixes:
  PM / OPP: remove double calls to find_device_opp()
  PM / OPP: set new_opp->dev_opp to a valid dev_opp
  leds: leds-gpio: Fix the "default-state" property check
2014-12-11 17:54:30 +01:00
Arnaldo Carvalho de Melo
3a351127cb tools lib fs: Adopt filename__read_int from tools/perf/
Will be useful for new helpers to read sysctl values.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-11 13:17:46 -03:00
Maurizio Lombardi
fcbf6a087a bio: modify __bio_add_page() to accept pages that don't start a new segment
The original behaviour is to refuse to add a new page if the maximum
number of segments has been reached, regardless of the fact the page we
are going to add can be merged into the last segment or not.

Unfortunately, when the system runs under heavy memory fragmentation
conditions, a driver may try to add multiple pages to the last segment.
The original code won't accept them and EBUSY will be reported to
userspace.

This patch modifies the function so it refuses to add a page only in case
the latter starts a new segment and the maximum number of segments has
already been reached.

The bug can be easily reproduced with the st driver:

1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE  to 16
2) modprobe st buffer_kbs=1024
3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10
   dd: error writing `/dev/st0': Device or resource busy

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Jet Chen <jet.chen@intel.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-11 09:11:52 -07:00
Indraneel M
285dffc910 NVMe: Fix FS mount issue (hot-remove followed by hot-add)
After Hot-remove of a device with a mounted partition,
when the device is hot-added again, the new node reappears
as nvme0n1. Mounting this new node fails with the error:

mount: mount /dev/nvme0n1p1 on /mnt failed: File exists.

The old nodes's FS entries still exist and the kernel can't re-create
procfs and sysfs entries for the new node with the same name.
The patch fixes this issue.

Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Indraneel M <indraneel.m@samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-11 08:24:18 -07:00
Steven Rostedt (Red Hat)
1fb8915b98 printk: Do not disable preemption for accessing printk_func
As printk_func will either be the default function, or a per_cpu function
for the current CPU, there's no reason to disable preemption to access
it from printk. That's because if the printk_func is not the default
then the caller had better disabled preemption as they were the one to
change it.

Link: http://lkml.kernel.org/r/CA+55aFz5-_LKW4JHEBoWinN9_ouNcGRWAF2FUA35u46FRN-Kxw@mail.gmail.com

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-11 09:12:01 -05:00
Jaganath Kanakkassery
5c1a4c8f28 Bluetooth: Fix missing hci_dev_lock/unlock in hci_event
mgmt_pending_remove() should be called with hci_dev_lock protection and
all hci_event.c functions which calls mgmt_complete() (which eventually
calls mgmt_pending_remove()) should hold the lock.
So this patch fixes the same

Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-11 15:09:04 +01:00
Imre Deak
c352d1ba1e drm/i915: vlv: fix IRQ masking when uninstalling interrupts
irq_mask should include all IRQ bits that we want to mask, but atm we
set it incorrectly to the inverse of this. If the mask is used
subsequently to enable/disable some IRQ bits, we may unintentionally
unmask unrelated IRQs. I can't see any way that this can lead to a real
problem in the current -nightly code, since the first place the mask
will be used next (after a suspend/resume cycle) is in
valleyview_irq_postinstall(), but the mask is reset there to its proper
value.

This causes a problem in the upstream kernel though, where - due to another
issue - the mask is used in the above way to disable only the display IRQs.
This other issue is fixed by:

commit 950eabaf5a
Author: Imre Deak <imre.deak@intel.com>
Date:   Mon Sep 8 15:21:09 2014 +0300

    drm/i915: vlv: fix display IRQ enable/disable

Interestingly, even with the above two bugs, we shouldn't in theory have
any real problems (arguably a famous last sentence:). That's because
even if we unmask something unintentionally via the VLV_IMR/VLV_IER
register the master IRQ masking bit in VLV_MASTER_IER is still set and
should prevent all i915 interrupts. According to my testing on an ASUS
T100 with DSI output this isn't the case at least with the
MIPIA_INTERRUPT. Leaving this one unmasked in IMR/IER, while having
VLV_MASTER_IER set to 0 may lead to a lockup during system suspend as
shown in the bugzilla ticket below. This fix should get rid of the
problem reported there in upstream and older kernels.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85920
Cc: stable@vger.kernel.org (v3.15+)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-11 16:06:10 +02:00
Jesse Barnes
9f49c37635 drm/i915: save/restore GMBUS freq across suspend/resume on gen4
Should probably just init this in the GMbus code all the time, based on
the cdclk and HPLL like we do on newer platforms.  Ville has code for
that in a rework branch, but until then we can fix this bug fairly
easily.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76301
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Nikolay <mar.kolya@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-11 15:31:59 +02:00
Jaganath Kanakkassery
3ad675827f Bluetooth: Fix missing hci_dev_lock/unlock in mgmt req_complete()
mgmt_pending_remove() should be called with hci_dev_lock protection
and currently the rule to take dev lock is that all mgmt req_complete
functions should take dev lock. So this patch fixes the same in the
missing functions

Without this patch there is a chance of invalid memory access while
accessing the mgmt_pending list like below

bluetoothd:  392] [0] Backtrace:
bluetoothd:  392] [0] [<c04ec770>] (pending_eir_or_class+0x0/0x68) from [<c04f1830>] (add_uuid+0x34/0x1c4)
bluetoothd:  392] [0] [<c04f17fc>] (add_uuid+0x0/0x1c4) from [<c04f3cc4>] (mgmt_control+0x204/0x274)
bluetoothd:  392] [0] [<c04f3ac0>] (mgmt_control+0x0/0x274) from [<c04f609c>] (hci_sock_sendmsg+0x80/0x308)
bluetoothd:  392] [0] [<c04f601c>] (hci_sock_sendmsg+0x0/0x308) from [<c03d4d68>] (sock_aio_write+0x144/0x174)
bluetoothd:  392] [0]  r8:00000000 r7 7c1be90 r6 7c1be18 r5:00000017 r4 a90ea80
bluetoothd:  392] [0] [<c03d4c24>] (sock_aio_write+0x0/0x174) from [<c00e2d4c>] (do_sync_write+0xb0/0xe0)
bluetoothd:  392] [0] [<c00e2c9c>] (do_sync_write+0x0/0xe0) from [<c00e371c>] (vfs_write+0x134/0x13c)
bluetoothd:  392] [0]  r8:00000000 r7 7c1bf70 r6:beeca5c8 r5:00000017 r4 7c05900
bluetoothd:  392] [0] [<c00e35e8>] (vfs_write+0x0/0x13c) from [<c00e3910>] (sys_write+0x44/0x70)
bluetoothd:  392] [0]  r8:00000000 r7:00000004 r6:00000017 r5:beeca5c8 r4 7c05900
bluetoothd:  392] [0] [<c00e38cc>] (sys_write+0x0/0x70) from [<c000e3c0>] (ret_fast_syscall+0x0/0x30)
bluetoothd:  392] [0]  r9 7c1a000 r8:c000e568 r6:400b5f10 r5:403896d8 r4:beeca604
bluetoothd:  392] [0] Code: e28cc00c e152000c 0a00000f e3a00001 (e1d210b8)
bluetoothd:  392] [0] ---[ end trace 67b6ac67435864c4 ]---
bluetoothd:  392] [0] Kernel panic - not syncing: Fatal exception

Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-11 14:08:47 +01:00
Benjamin Gaignard
f78e772a2c drm: sti: correctly cleanup CRTC and planes
When bind failed make sure that CRTC and planes are
completely clean up to avoid properties duplication.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 14:00:19 +01:00
Benjamin Gaignard
4fdbc678fe drm: sti: add HQVDP plane
High Quality Video Data Plane is hardware IP dedicated
to video rendering. Compare to GPD (graphic planes) it
have better scaler capabilities.

HQVDP use VID layer to push data into hardware compositor
without going into DDR. From data flow point of view HQVDP
and VID are nested so HQVPD update/disable VID.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 14:00:13 +01:00
Benjamin Gaignard
96006a770d drm: sti: add cursor plane
stih407 SoC have a dedicated hardware cursor plane,
this patch enable it.
The hardware have a color look up table, fix it to
be able to use ARGB8888.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 14:00:09 +01:00
Benjamin Gaignard
5e03abc52c drm: sti: enable auxiliary CRTC
For stih407 SoC enable the second mixer to get two CRTC.
Allow GPD planes and encoders to be connected to this new CRTC.
Cursor plane can only be set on first CRTC.
GPD clocks needed change the parent clock depending on which
CRTC GPD are used.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 14:00:04 +01:00
Benjamin Gaignard
7f2d479c01 drm: sti: fix delay in VTG programming
The HDMI path introduce a delay of 6 pixels.
This delay should be take into account while programming
VTG for the HDMI. Without this delay, the HDMI active
window area is shift of 6 pixel on the right.

Set also timing for DVO output.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:59:58 +01:00
Benjamin Gaignard
ca279601fb drm: sti: prepare sti_tvout to support auxiliary crtc
Change some functions prototype to prepare the introduction of
auxiliary crtc. It will also help to have a DVO encoder.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:59:45 +01:00
Benjamin Gaignard
ca614aadd7 drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
Make sure that vblank is enabled when crtc commit is call.
Replace drm_vblank_off() by drm_crtc_vblank_off()

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:59:36 +01:00
Benjamin Gaignard
589b347b54 drm: sti: fix hdmi avi infoframe
The hardware expect to have the infoframe checksum in the first byte.
In consequence shift all infoframe on one byte.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:59:31 +01:00
Benjamin Gaignard
eb929dc4d3 drm: sti: remove event lock while disabling vblank
Stop use event_lock in vblank disable function.
This was creating a dead lock.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:58:35 +01:00
Benjamin Gaignard
a51fe84d1d drm: sti: simplify gdp code
Store the physical address at node creation time
to avoid use of virt_to_dma and dma_to_virt everywhere

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:58:28 +01:00
Benjamin Gaignard
2f7d0e82ce drm: sti: clear all mixer control
Make sure that mixer control register is correctly reset
before use it.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:58:19 +01:00
Benjamin Gaignard
765692078f drm: sti: remove gpio for HDMI hot plug detection
gpio used for HDMI hot plug detection is useless,
HDMI_STI register contains an hot plug detection status bit.
Fix binding documentation.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:58:12 +01:00
Benjamin Gaignard
41a14623bd drm: sti: allow to change hdmi ddc i2c adapter
Depending of the board configuration i2c for ddc could change,
this patch allow to use a phandle to specify which i2c controller to use.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
2014-12-11 13:57:59 +01:00
Mark Rutland
fb59d007a0 arm64: mm: dump: don't skip final region
If the final page table entry we walk is a valid mapping, the page table
dumping code will not log the region this entry is part of, as the final
note_page call in ptdump_show will trigger an early return. Luckily this
isn't seen on contemporary systems as they typically don't have enough
RAM to extend the linear mapping right to the end of the address space.

In note_page, we log a region  when we reach its end (i.e. we hit an
entry immediately afterwards which has different prot bits or is
invalid). The final entry has no subsequent entry, so we will not log
this immediately. We try to cater for this with a subsequent call to
note_page in ptdump_show, but this returns early as 0 < LOWEST_ADDR, and
hence we will skip a valid mapping if it spans to the final entry we
note.

Unlike 32-bit ARM, the pgd with the kernel mapping is never shared with
user mappings, so we do not need the check to ensure we don't log user
page tables. Due to the way addr is constructed in the walk_* functions,
it can never be less than LOWEST_ADDR when walking the page tables, so
it is not necessary to avoid dereferencing invalid table addresses. The
existing checks for st->current_prot and st->marker[1].start_address are
sufficient to ensure we will not print and/or dereference garbage when
trying to log information.

This patch removes the unnecessary check against LOWEST_ADDR, ensuring
we log all regions in the kernel page table, including those which span
right to the end of the address space.

Cc: Kees Cook <keescook@chromium.org>
Acked-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-12-11 12:08:07 +00:00
Mark Rutland
35545f0ccb arm64: mm: dump: fix shift warning
When building with 48-bit VAs, it's possible to get the following
warning when building the arm64 page table dumping code:

arch/arm64/mm/dump.c: In function ‘walk_pgd’:
arch/arm64/mm/dump.c:266:2: warning: right shift count >= width of type
  pgd_t *pgd = pgd_offset(mm, 0);
  ^

As pgd_offset is a macro and the second argument is not cast to any
particular type, the zero will be given integer type by the compiler.
As pgd_offset passes the pargument to pgd_index, we then try to shift
the 32-bit integer by at least 39 bits (for 4k pages).

Elsewhere the pgd_offset is passed a second argument of unsigned long
type, so let's do the same here by passing '0UL' rather than '0'.

Cc: Kees Cook <keescook@chromium.org>
Acked-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-12-11 12:08:07 +00:00
Krzysztof Kozlowski
e5e62d4752 arm64: psci: Fix build breakage without PM_SLEEP
Fix build failure of defconfig when PM_SLEEP is disabled (e.g. by
disabling SUSPEND) and CPU_IDLE enabled:

arch/arm64/kernel/psci.c:543:2: error: unknown field ‘cpu_suspend’ specified in initializer
  .cpu_suspend = cpu_psci_cpu_suspend,
  ^
arch/arm64/kernel/psci.c:543:2: warning: initialization from incompatible pointer type [enabled by default]
arch/arm64/kernel/psci.c:543:2: warning: (near initialization for ‘cpu_psci_ops.cpu_prepare’) [enabled by default]
make[1]: *** [arch/arm64/kernel/psci.o] Error 1

The cpu_operations.cpu_suspend field exists only if ARM64_CPU_SUSPEND is
defined, not CPU_IDLE.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-12-11 12:08:06 +00:00
Juergen Gross
cdfa0badfc xen: switch to post-init routines in xen mmu.c earlier
With the virtual mapped linear p2m list the post-init mmu operations
must be used for setting up the p2m mappings, as in case of
CONFIG_FLATMEM the init routines may trigger BUGs.

paging_init() sets up all infrastructure needed to switch to the
post-init mmu ops done by xen_post_allocator_init(). With the virtual
mapped linear p2m list we need some mmu ops during setup of this list,
so we have to switch to the correct mmu ops as soon as possible.

The p2m list is usable from the beginning, just expansion requires to
have established the new linear mapping. So the call of
xen_remap_memory() had to be introduced, but this is not due to the
mmu ops requiring this.

Summing it up: calling xen_post_allocator_init() not directly after
paging_init() was conceptually wrong in the beginning, it just didn't
matter up to now as no functions used between the two calls needed
some critical mmu ops (e.g. alloc_pte). This has changed now, so I
corrected it.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-12-11 12:05:53 +00:00
David Vrabel
dbdd74763f Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"
This reverts commit 2c3fc8d26d.

This commit broke on x86 PV because entries in the generic SWIOTLB are
indexed using (pseudo-)physical address not DMA address and these are
not the same in a x86 PV guest.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-12-11 12:05:01 +00:00
Nadav Amit
ab646f54f4 KVM: x86: em_ret_far overrides cpl
commit d50eaa1803 ("KVM: x86: Perform limit checks when assigning EIP")
mistakenly used zero as cpl on em_ret_far. Use the actual one.

Fixes: d50eaa1803
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-11 12:27:32 +01:00
Bandan Das
78051e3b7e KVM: nVMX: Disable unrestricted mode if ept=0
If L0 has disabled EPT, don't advertise unrestricted
mode at all since it depends on EPT to run real mode code.

Fixes: 92fbc7b195
Cc: stable@vger.kernel.org
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-11 12:26:15 +01:00
Borislav Petkov
be9d1738b1 x86/asm: Unify segment selector defines
Those are identical on 32- and 64-bit, unify them. No functional
change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1418127959-29902-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:45:03 +01:00
Borislav Petkov
5de2b61a63 x86/asm: Guard against building the 32/64-bit versions of the asm-offsets*.c file directly
Sometimes it is helpful to build a kernel compilation unit
directly, i.e.:

  make .../<filename>.i

in order to look at compiler output.

Since asm-offsets_{32,64}.c are included by asm-offsets.c and
building them directly doesn't evaluate the macros used (thus
making the preprocessor output not very useful), error out when
an attempt is made to build them. Issue a hint for the user to
build asm-offsets.c instead.

Suggested-by: Michael Matz <matz@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1418139917-12722-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:43:56 +01:00
Andy Lutomirski
f647d7c155 x86_64, switch_to(): Load TLS descriptors before switching DS and ES
Otherwise, if buggy user code points DS or ES into the TLS
array, they would be corrupted after a context switch.

This also significantly improves the comments and documents some
gotchas in the code.

Before this patch, the both tests below failed.  With this
patch, the es test passes, although the gsbase test still fails.

 ----- begin es test -----

/*
 * Copyright (c) 2014 Andy Lutomirski
 * GPL v2
 */

static unsigned short GDT3(int idx)
{
	return (idx << 3) | 3;
}

static int create_tls(int idx, unsigned int base)
{
	struct user_desc desc = {
		.entry_number    = idx,
		.base_addr       = base,
		.limit           = 0xfffff,
		.seg_32bit       = 1,
		.contents        = 0, /* Data, grow-up */
		.read_exec_only  = 0,
		.limit_in_pages  = 1,
		.seg_not_present = 0,
		.useable         = 0,
	};

	if (syscall(SYS_set_thread_area, &desc) != 0)
		err(1, "set_thread_area");

	return desc.entry_number;
}

int main()
{
	int idx = create_tls(-1, 0);
	printf("Allocated GDT index %d\n", idx);

	unsigned short orig_es;
	asm volatile ("mov %%es,%0" : "=rm" (orig_es));

	int errors = 0;
	int total = 1000;
	for (int i = 0; i < total; i++) {
		asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
		usleep(100);

		unsigned short es;
		asm volatile ("mov %%es,%0" : "=rm" (es));
		asm volatile ("mov %0,%%es" : : "rm" (orig_es));
		if (es != GDT3(idx)) {
			if (errors == 0)
				printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
				       GDT3(idx), es);
			errors++;
		}
	}

	if (errors) {
		printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
		return 1;
	} else {
		printf("[OK]\tES was preserved\n");
		return 0;
	}
}

 ----- end es test -----

 ----- begin gsbase test -----

/*
 * gsbase.c, a gsbase test
 * Copyright (c) 2014 Andy Lutomirski
 * GPL v2
 */

static unsigned char *testptr, *testptr2;

static unsigned char read_gs_testvals(void)
{
	unsigned char ret;
	asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
	return ret;
}

int main()
{
	int errors = 0;

	testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
	if (testptr == MAP_FAILED)
		err(1, "mmap");

	testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
	if (testptr2 == MAP_FAILED)
		err(1, "mmap");

	*testptr = 0;
	*testptr2 = 1;

	if (syscall(SYS_arch_prctl, ARCH_SET_GS,
		    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
		err(1, "ARCH_SET_GS");

	usleep(100);

	if (read_gs_testvals() == 1) {
		printf("[OK]\tARCH_SET_GS worked\n");
	} else {
		printf("[FAIL]\tARCH_SET_GS failed\n");
		errors++;
	}

	asm volatile ("mov %0,%%gs" : : "r" (0));

	if (read_gs_testvals() == 0) {
		printf("[OK]\tWriting 0 to gs worked\n");
	} else {
		printf("[FAIL]\tWriting 0 to gs failed\n");
		errors++;
	}

	usleep(100);

	if (read_gs_testvals() == 0) {
		printf("[OK]\tgsbase is still zero\n");
	} else {
		printf("[FAIL]\tgsbase was corrupted\n");
		errors++;
	}

	return errors == 0 ? 0 : 1;
}

 ----- end gsbase test -----

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: <stable@vger.kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:40:08 +01:00
Xishi Qiu
29258cf49e x86/mm: Use min() instead of min_t() in the e820 printout code
The type of "MAX_DMA_PFN" and "xXx_pfn" are both unsigned long
now, so use min() instead of min_t().

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Linux MM <linux-mm@kvack.org>
Cc: <dave@sr71.net>
Cc: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/5487AB3F.7050807@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:35:02 +01:00
Xishi Qiu
c072b90c8d x86/mm: Fix zone ranges boot printout
This is the usual physical memory layout boot printout:
	...
	[    0.000000] Zone ranges:
	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
	[    0.000000]   Normal   [mem 0x100000000-0xc3fffffff]
	[    0.000000] Movable zone start for each node
	[    0.000000] Early memory node ranges
	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
	[    0.000000]   node   0: [mem 0x00100000-0xbf78ffff]
	[    0.000000]   node   0: [mem 0x100000000-0x63fffffff]
	[    0.000000]   node   1: [mem 0x640000000-0xc3fffffff]
	...

This is the log when we set "mem=2G" on the boot cmdline:
	...
	[    0.000000] Zone ranges:
	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]  // should be 0x7fffffff, right?
	[    0.000000]   Normal   empty
	[    0.000000] Movable zone start for each node
	[    0.000000] Early memory node ranges
	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
	...

This patch fixes the printout, the following log shows the right
ranges:
	...
	[    0.000000] Zone ranges:
	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
	[    0.000000]   DMA32    [mem 0x01000000-0x7fffffff]
	[    0.000000]   Normal   empty
	[    0.000000] Movable zone start for each node
	[    0.000000] Early memory node ranges
	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
	...

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Linux MM <linux-mm@kvack.org>
Cc: <dave@sr71.net>
Cc: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/5487AB3D.6070306@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:35:02 +01:00
Luis R. Rodriguez
96ed4cd0aa x86/doc: Update documentation after file shuffling
While at it, also refer to the 32 bit entry file.

Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: linux-doc@vger.kernel.org
Cc: bpoirier@suse.de
Link: http://lkml.kernel.org/r/1418165684-6226-1-git-send-email-mcgrof@do-not-panic.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:33:22 +01:00
Jiri Olsa
9fc81d8742 perf: Fix events installation during moving group
We allow PMU driver to change the cpu on which the event
should be installed to. This happened in patch:

  e2d37cd213 ("perf: Allow the PMU driver to choose the CPU on which to install events")

This patch also forces all the group members to follow
the currently opened events cpu if the group happened
to be moved.

This and the change of event->cpu in perf_install_in_context()
function introduced in:

  0cda4c0231 ("perf: Introduce perf_pmu_migrate_context()")

forces group members to change their event->cpu,
if the currently-opened-event's PMU changed the cpu
and there is a group move.

Above behaviour causes problem for breakpoint events,
which uses event->cpu to touch cpu specific data for
breakpoints accounting. By changing event->cpu, some
breakpoints slots were wrongly accounted for given
cpu.

Vinces's perf fuzzer hit this issue and caused following
WARN on my setup:

   WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
   Can't find any breakpoint slot
   [...]

This patch changes the group moving code to keep the event's
original cpu.

Reported-by: Vince Weaver <vince@deater.net>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vince@deater.net>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:24:15 +01:00
Jiri Olsa
af91568e76 perf/x86/intel/uncore: Make sure only uncore events are collected
The uncore_collect_events functions assumes that event group
might contain only uncore events which is wrong, because it
might contain any type of events.

This bug leads to uncore framework touching 'not' uncore events,
which could end up all sorts of bugs.

One was triggered by Vince's perf fuzzer, when the uncore code
touched breakpoint event private event space as if it was uncore
event and caused BUG:

   BUG: unable to handle kernel paging request at ffffffff82822068
   IP: [<ffffffff81020338>] uncore_assign_events+0x188/0x250
   ...

The code in uncore_assign_events() function was looking for
event->hw.idx data while the event was initialized as a
breakpoint with different members in event->hw union.

This patch forces uncore_collect_events() to collect only uncore
events.

Reported-by: Vince Weaver <vince@deater.net>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1418243031-20367-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-11 11:24:14 +01:00