Commit graph

737480 commits

Author SHA1 Message Date
Darrick J. Wong
f6d5fc21fd xfs: cross-reference refcount btree during scrub
During metadata btree scrub, we should cross-reference with the
reference counts.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:46 -08:00
Darrick J. Wong
dbde19da96 xfs: cross-reference the rmapbt data with the refcountbt
Cross reference the refcount data with the rmap data to check that the
number of rmaps for a given block match the refcount of that block, and
that CoW blocks (which are owned entirely by the refcountbt) are tracked
as well.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
d852657ccf xfs: cross-reference reverse-mapping btree
When scrubbing various btrees, we should cross-reference the records
with the reverse mapping btree and ensure that traversing the btree
finds the same number of blocks that the rmapbt thinks are owned by
that btree.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
2e6f27561b xfs: cross-reference inode btrees during scrub
Cross-reference the inode btrees with the other metadata when we
scrub the filesystem.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
e1134b12fd xfs: cross-reference bnobt records with cntbt
Scrub should make sure that each bnobt record has a corresponding
cntbt record.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
52dc4b44af xfs: cross-reference with the bnobt
When we're scrubbing various btrees, cross-reference the records with
the bnobt to ensure that we don't also think the space is free.
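
The check described above can be modelled in user space as an overlap test. This is only a sketch: `struct free_extent` and `extent_has_free_space()` are hypothetical names invented for the illustration, and a flat array stands in for the bnobt, which the kernel walks with a btree cursor instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one free-space (bnobt) record. */
struct free_extent {
	uint32_t start;	/* first free block */
	uint32_t len;	/* number of free blocks */
};

/*
 * Return true if any block in [start, start + len) is covered by a
 * free-space record.  A scrubber would treat that as a cross-reference
 * failure, because space used by metadata must not also be marked free.
 */
bool extent_has_free_space(const struct free_extent *recs, size_t nr,
			   uint32_t start, uint32_t len)
{
	uint64_t end = (uint64_t)start + len;

	for (size_t i = 0; i < nr; i++) {
		uint64_t fstart = recs[i].start;
		uint64_t fend = fstart + recs[i].len;

		/* Two half-open ranges overlap if each starts before the other ends. */
		if (fstart < end && start < fend)
			return true;
	}
	return false;
}
```

A linear scan keeps the sketch short; it says nothing about how the real lookup is performed.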

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
166d76410d xfs: introduce scrubber cross-referencing stubs
Create some stubs that will be used to cross-reference metadata records.
The actual cross-referencing will be filled in by subsequent patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
858333dcf0 xfs: check btree block ownership with bnobt/rmapbt when scrubbing btree
When scanning a metadata btree block, cross-reference the block location
with the free space btree and the reverse mapping btree to ensure that
the rmapbt knows about the block and the bnobt does not.  Add a
mechanism to defer checks when we happen to be scanning the bnobt/rmapbt
itself because it's less efficient to repeatedly clone and destroy the
cursor.

This patch provides the framework to make btree block owner checks
happen; the actual meat will be added in subsequent patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
9a7e269566 xfs: fix a few erroneous process_error calls in the scrubbers
There are a few places where we make a libxfs api call on behalf of some
object other than the one we're scrubbing but inadvertently call the
regular process_error function.  When this happens we mark the object
corrupt even though it was corruption in /some other/ object that
actually produced the -EFSCORRUPTED code.  The correct output flag for
these situations is SCRUB_OFLAG_XFAIL, not SCRUB_OFLAG_CORRUPT, so fix
this now that we also have a helper to set these.
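
The distinction between the two output flags can be sketched as follows. Only the names process_error, SCRUB_OFLAG_CORRUPT, SCRUB_OFLAG_XFAIL and EFSCORRUPTED come from the message above; the flag values, the error number, `struct scrub_state` and `process_xref_error()` are assumptions made up for this user-space model.

```c
#include <stdbool.h>

/* Hypothetical values; only the flag names appear in the commit message. */
#define SCRUB_OFLAG_CORRUPT	(1u << 0)
#define SCRUB_OFLAG_XFAIL	(1u << 1)
#define MODEL_EFSCORRUPTED	117	/* placeholder error number */

/* Hypothetical scrub context holding the output flags. */
struct scrub_state {
	unsigned int flags;
};

/*
 * The error came from the object being scrubbed, so a corruption
 * error means that object is corrupt.  Returns true if scrubbing
 * can continue.
 */
bool process_error(struct scrub_state *sc, int error)
{
	if (error == 0)
		return true;
	if (error == -MODEL_EFSCORRUPTED)
		sc->flags |= SCRUB_OFLAG_CORRUPT;
	return false;
}

/*
 * The error came from some other object consulted on behalf of the
 * one being scrubbed.  Record that the cross-check could not be
 * completed, without blaming the object under scrub.
 */
bool process_xref_error(struct scrub_state *sc, int error)
{
	if (error == 0)
		return true;
	if (error == -MODEL_EFSCORRUPTED)
		sc->flags |= SCRUB_OFLAG_XFAIL;
	return false;
}
```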

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:45 -08:00
Darrick J. Wong
64b12563b2 xfs: set up scrub cross-referencing helpers
Create some helper functions that we'll use later to deal with problems
we might encounter while cross referencing metadata with other metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:44 -08:00
Darrick J. Wong
49db55eca5 xfs: add scrub cross-referencing helpers for the refcount btrees
Add a couple of functions to the refcount btrees that will be used
to cross-reference metadata against the refcountbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:44 -08:00
Darrick J. Wong
ed7c52d4bf xfs: add scrub cross-referencing helpers for the rmap btrees
Add a couple of functions to the rmap btrees that will be used
to cross-reference metadata against the rmapbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:44 -08:00
Darrick J. Wong
2e001266b6 xfs: add scrub cross-referencing helpers for the inode btrees
Add a couple of functions to the inode btrees that will be used
to cross-reference metadata against the inobt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:44 -08:00
Darrick J. Wong
ce1d802e6a xfs: add scrub cross-referencing helpers for the free space btrees
Add a couple of functions to the free space btrees that will be used
to cross-reference metadata against the bnobt/cntbt, and a generic
btree function that provides the real implementation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17 21:00:44 -08:00
Nicholas Piggin
f2ac428e0e powerpc/pseries/cpuidle: add polling idle for shared processor guests
For shared processor guests (e.g., KVM), add an idle polling mode rather
than immediately returning to the hypervisor when the guest CPU goes
idle.

Test setup is a 2 socket POWER9 with 4 guests running, each with vCPUs
equal to 1/2 of the real CPUs. Saturated each guest with tbench. Using
polling idle gives about 1.4x throughput.

Kernel compile speed was not changed significantly.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:43:44 +11:00
Nicholas Piggin
ced54c08d8 cpuidle/powernv: avoid double irq enable coming out of idle
Since e1689795a7 ("cpuidle: Add common time keeping and irq enabling"),
cpuidle drivers are expected to return from ->enter with irqs disabled.

Update the cpuidle-powernv snooze and cede loops to disable irqs before
returning.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:43:44 +11:00
Nicholas Piggin
f1343d0446 cpuidle/powernv: avoid double irq enable coming out of idle
Since e1689795a7 ("cpuidle: Add common time keeping and irq enabling"),
cpuidle drivers are expected to return from ->enter with irqs disabled.

Update the cpuidle-powernv snooze loop to disable irqs before returning.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:43:43 +11:00
Nicholas Piggin
c16bee4bde powerpc: define __ARCH_IRQ_EXIT_IRQS_DISABLED
powerpc calls irq_exit() with local irqs disabled, therefore it
can define __ARCH_IRQ_EXIT_IRQS_DISABLED.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:43:43 +11:00
Nicholas Piggin
47712a921b powerpc/watchdog: remove arch_trigger_cpumask_backtrace
The powerpc NMI IPIs may not be recoverable if they are taken in
some sections of code, and also there have been and still are issues
with taking NMIs (in KVM guest code, in firmware, etc) which makes them
a bit dangerous to use.

Generic code like softlockup detector and rcu stall detectors really
hammer on trigger_*_backtrace, which has led to further problems
because we've implemented it with the NMI.

So stop providing NMI backtraces for now. Importantly, the powerpc code
uses NMI IPIs in crash/debug, and the SMP hardlockup watchdog. So if the
softlockup and rcu hang detection traces are not being printed because
the CPU is stuck with interrupts off, then the hard lockup watchdog
should get it with the NMI IPI.

Fixes: 2104180a53 ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:43:43 +11:00
Nicholas Piggin
1af19331a3 powerpc/64s: Relax PACA address limitations
Book3S PACA memory allocation is restricted by the RMA limit and also
must not take SLB faults when accessed in virtual mode. Currently a
fixed 256MB limit is used for this, which is imprecise and sub-optimal.

Update the paca allocation limits to use the ppc64_rma_size for RMA
limit, and share the safe_stack_limit() that is currently used for stack
allocations that must not take virtual mode faults.

The safe_stack_limit() name is changed to ppc64_bolted_size() to match
ppc64_rma_size and some comments are updated. We also need to use
early_mmu_has_feature() because we are now calling this function prior
to the jump label patching that enables mmu_has_feature().

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Change mmu_has_feature() to early_mmu_has_feature()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:42:48 +11:00
Ming Lei
23d4ee19e7 blk-mq: don't dispatch request in blk_mq_request_direct_issue if queue is busy
If blk_mq_request_direct_issue() is called while the queue is busy, we
don't want to dispatch the request into hctx->dispatch_list.  Instead,
return the queue-busy status to the caller so that the caller can
handle it.

Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
Reported-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17 21:38:52 -07:00
Paul Mackerras
d075745d89 KVM: PPC: Book3S HV: Improve handling of debug-trigger HMIs on POWER9
Hypervisor maintenance interrupts (HMIs) are generated by various
causes, signalled by bits in the hypervisor maintenance exception
register (HMER).  In most cases calling OPAL to handle the interrupt
is the correct thing to do, but the "debug trigger" HMIs signalled by
PPC bit 17 (bit 46) of HMER are used to invoke software workarounds
for hardware bugs, and OPAL does not have any code to handle this
cause.  The debug trigger HMI is used in POWER9 DD2.0 and DD2.1 chips
to work around a hardware bug in executing vector load instructions to
cache inhibited memory.  In POWER9 DD2.2 chips, it is generated when
conditions are detected relating to threads being in TM (transactional
memory) suspended mode when the core SMT configuration needs to be
reconfigured.

The kernel currently has code to detect the vector CI load condition,
but only when the HMI occurs in the host, not when it occurs in a
guest.  If a HMI occurs in the guest, it is always passed to OPAL, and
then we always re-sync the timebase, because the HMI cause might have
been a timebase error, for which OPAL would re-sync the timebase, thus
removing the timebase offset which KVM applied for the guest.  Since
we don't know what OPAL did, we don't know whether to subtract the
timebase offset from the timebase, so instead we re-sync the timebase.

This adds code to determine explicitly what the cause of a debug
trigger HMI will be.  This is based on a new device-tree property
under the CPU nodes called ibm,hmi-special-triggers, if it is
present, or otherwise based on the PVR (processor version register).
The handling of debug trigger HMIs is pulled out into a separate
function which can be called from the KVM guest exit code.  If this
function handles and clears the HMI, and no other HMI causes remain,
then we skip calling OPAL and we proceed to subtract the guest
timebase offset from the timebase.

The overall handling for HMIs that occur in the host (i.e. not in a
KVM guest) is largely unchanged, except that we now don't set the flag
for the vector CI load workaround on DD2.2 processors.

This also removes a BUG_ON in the KVM code.  BUG_ON is generally not
useful in KVM guest entry/exit code since it is difficult to handle
the resulting trap gracefully.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:31:25 +11:00
Dave Airlie
92eb5f0c00 Merge tag 'drm-amdkfd-next-fixes-2018-01-15' of git://people.freedesktop.org/~gabbayo/linux into drm-next
- fix NULL pointer dereference
- fix compiler warning on large define values
- remove unnecessary call to execute_queues_cpsch

* tag 'drm-amdkfd-next-fixes-2018-01-15' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: Fix potential NULL pointer dereferences
  drm/amdkfd: add ull suffix to 64bit defines
  drm/amdkfd: don't always call execute_queues_cpsch()
  drm/amdkfd: Fix return value 0 when execute_queues_cpsch fails
2018-01-18 13:30:48 +10:00
Dave Airlie
75f195f46f Merge tag 'drm-misc-fixes-2018-01-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Final 4.15 drm-misc pull:

Just 3 sun4i patches to fix clock computation/checks.

* tag 'drm-misc-fixes-2018-01-17' of git://anongit.freedesktop.org/drm/drm-misc:
  drm/sun4i: hdmi: Add missing rate halving check in sun4i_tmds_determine_rate
  drm/sun4i: hdmi: Fix incorrect assignment in sun4i_tmds_determine_rate
  drm/sun4i: hdmi: Check for unset best_parent in sun4i_tmds_determine_rate
2018-01-18 13:30:22 +10:00
Dave Airlie
894219d7d2 Merge branch 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux into drm-fixes
Last minute fixes for vmwgfx.
One fix for a drm helper warning introduced in 4.15
One important fix for a longer standing memory corruption issue on older
hardware versions.

* 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: fix memory corruption with legacy/sou connectors
  drm/vmwgfx: Fix a boot time warning
2018-01-18 13:29:24 +10:00
Rafael J. Wysocki
a7f2766ac7 Merge branches 'acpi-gpio', 'acpi-button', 'acpi-battery' and 'acpi-video'
* acpi-gpio:
  gpio: merrifield: Add support of ACPI enabled platforms
  ACPI: utils: Introduce acpi_dev_get_first_match_name()

* acpi-button:
  ACPI: button: Add a LID switch blacklist and add 1 model to it
  ACPI: button: Add a debug message when we're sending a LID event

* acpi-battery:
  ACPI / battery: Add quirk for Asus GL502VSK and UX305LA
  ACPI: battery: Drop redundant test for failure

* acpi-video:
  ACPI / video: Default lcd_only to true on Win8-ready and newer machines
2018-01-18 03:02:16 +01:00
Rafael J. Wysocki
0c81e26e86 Merge branches 'acpi-x86', 'acpi-apei' and 'acpi-ec'
* acpi-x86:
  ACPI / x86: boot: Propagate error code in acpi_gsi_to_irq()
  ACPI / x86: boot: Don't setup SCI on HW-reduced platforms
  ACPI / x86: boot: Use INVALID_ACPI_IRQ instead of 0 for acpi_sci_override_gsi
  ACPI / x86: boot: Get rid of ACPI_INVALID_GSI
  ACPI / x86: boot: Swap variables in condition in acpi_register_gsi_ioapic()

* acpi-apei:
  ACPI / APEI: remove redundant variables len and node_len
  ACPI: APEI: call into AER handling regardless of severity
  ACPI: APEI: handle PCIe AER errors in separate function

* acpi-ec:
  ACPI: EC: Fix debugfs_create_*() usage
2018-01-18 03:01:55 +01:00
Rafael J. Wysocki
13c35c8388 Merge branches 'acpi-numa', 'acpi-sysfs', 'acpi-pmic', 'acpi-soc' and 'acpi-ged'
* acpi-numa:
  ACPI / NUMA: ia64: Parse all entries of SRAT memory affinity table

* acpi-sysfs:
  ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs

* acpi-pmic:
  ACPI / PMIC: Convert to use builtin_platform_driver() macro
  ACPI / PMIC: constify platform_device_id

* acpi-soc:
  ACPI / LPSS: Do not instiate platform_dev for devs without MMIO resources
  ACPI / LPSS: Add device link for CHT SD card dependency on I2C

* acpi-ged:
  ACPI: GED: unregister interrupts during shutdown
2018-01-18 03:01:38 +01:00
Rafael J. Wysocki
2a2bafcb3b Merge branch 'acpica'
* acpica: (40 commits)
  ACPICA: Update version to 20171215
  ACPICA: trivial style fix, no functional change
  ACPICA: Fix a couple memory leaks during package object resolution
  ACPICA: Recognize the Windows 10 version 1607 and 1703 OSI strings
  ACPICA: DT compiler: prevent error if optional field at the end of table is not present
  ACPICA: Rename a global variable, no functional change
  ACPICA: Create and deploy safe version of strncpy
  ACPICA: Cleanup the global variables and update comments
  ACPICA: Debugger: fix slight indentation issue
  ACPICA: Fix a regression in the acpi_evaluate_object_type() interface
  ACPICA: Update for a few debug output statements
  ACPICA: Debug output, no functional change
  ACPICA: Update information in MAINTAINERS
  ACPICA: Rename variable to match upstream
  ACPICA: Update version to 20171110
  ACPICA: ACPI 6.2: Additional PPTT flags
  ACPICA: Update linkage for get mutex name interface
  ACPICA: Update mutex error messages, no functional change
  ACPICA: Debugger: add "background" command for method execution
  ACPICA: Small typo fix, no functional change
  ...
2018-01-18 03:01:07 +01:00
Rafael J. Wysocki
ee43730d65 Merge branches 'pm-opp', 'pm-devfreq', 'pm-avs' and 'pm-tools'
* pm-opp:
  OPP: Introduce "required-opp" property
  OPP: Allow OPP table to be used for power-domains

* pm-devfreq:
  PM / devfreq: Fix potential NULL pointer dereference in governor_store
  PM / devfreq: Propagate error from devfreq_add_device()

* pm-avs:
  PM / AVS: rockchip-io: account for const type of of_device_id.data

* pm-tools:
  tools/power/x86/intel_pstate_tracer: Free the trace buffer memory
  cpupower: Remove FSF address
2018-01-18 02:56:04 +01:00
Rafael J. Wysocki
bcaea4678f Merge branches 'acpi-pm' and 'pm-sleep'
* acpi-pm:
  platform/x86: surfacepro3: Support for wakeup from suspend-to-idle
  ACPI / PM: Use Low Power S0 Idle on more systems
  ACPI / PM: Make it possible to ignore the system sleep blacklist

* pm-sleep:
  PM / hibernate: Drop unused parameter of enough_swap
  block, scsi: Fix race between SPI domain validation and system suspend
  PM / sleep: Make lock/unlock_system_sleep() available to kernel modules
  PM: hibernate: Do not subtract NR_FILE_MAPPED in minimum_image_size()
2018-01-18 02:55:28 +01:00
Rafael J. Wysocki
4b67157f04 Merge branch 'pm-core'
* pm-core: (29 commits)
  dmaengine: rcar-dmac: Make DMAC reinit during system resume explicit
  PM / runtime: Allow no callbacks in pm_runtime_force_suspend|resume()
  PM / runtime: Check ignore_children in pm_runtime_need_not_resume()
  PM / runtime: Rework pm_runtime_force_suspend/resume()
  PM / wakeup: Print warn if device gets enabled as wakeup source during sleep
  PM / core: Propagate wakeup_path status flag in __device_suspend_late()
  PM / core: Re-structure code for clearing the direct_complete flag
  PM: i2c-designware-platdrv: Optimize power management
  PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE
  PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  PCI / PM: Use SMART_SUSPEND and LEAVE_SUSPENDED flags for PCIe ports
  PM / wakeup: Add device_set_wakeup_path() helper to control wakeup path
  PM / core: Assign the wakeup_path status flag in __device_prepare()
  PM / wakeup: Do not fail dev_pm_attach_wake_irq() unnecessarily
  PM / core: Direct DPM_FLAG_LEAVE_SUSPENDED handling
  PM / core: Direct DPM_FLAG_SMART_SUSPEND optimization
  PM / core: Add helpers for subsystem callback selection
  PM / wakeup: Drop redundant check from device_init_wakeup()
  PM / wakeup: Drop redundant check from device_set_wakeup_enable()
  PM / wakeup: only recommend "call"ing device_init_wakeup() once
  ...
2018-01-18 02:55:09 +01:00
Rafael J. Wysocki
f9b736f64a Merge branches 'pm-domains', 'pm-kconfig', 'pm-cpuidle' and 'powercap'
* pm-domains:
  PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume()
  PM / domains: Don't skip driver's ->suspend|resume_noirq() callbacks
  PM / Domains: Remove obsolete "samsung,power-domain" check

* pm-kconfig:
  bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate
  PM: Provide a config snippet for disabling PM

* pm-cpuidle:
  cpuidle: Avoid NULL argument in cpuidle_switch_governor()

* powercap:
  powercap: intel_rapl: Fix trailing semicolon
  powercap: add suspend and resume mechanism for SOC power limit
  powercap: Simplify powercap_init()
2018-01-18 02:54:45 +01:00
Rafael J. Wysocki
f31c376025 Merge branch 'pm-cpufreq'
* pm-cpufreq: (36 commits)
  cpufreq: scpi: remove arm_big_little dependency
  drivers: psci: remove cluster terminology and dependency on physical_package_id
  cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin
  cpufreq: intel_pstate: Add Skylake servers support
  cpufreq: intel_pstate: Replace bxt_funcs with core_funcs
  cpufreq: imx6q: add 696MHz operating point for i.mx6ul
  ARM: dts: imx6ul: add 696MHz operating point
  cpufreq: stats: Change return type of cpufreq_stats_update() as void
  powernv-cpufreq: Treat pstates as opaque 8-bit values
  powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  powernv-cpufreq: Add helper to extract pstate from PMSR
  cpu_cooling: Remove static-power related documentation
  cpufreq: imx6q: switch to Use clk_bulk_get() to refine clk operations
  PM / OPP: Make local function ti_opp_supply_set_opp() static
  PM / OPP: Add ti-opp-supply driver
  dt-bindings: opp: Introduce ti-opp-supply bindings
  cpufreq: ti-cpufreq: Add support for multiple regulators
  cpufreq: ti-cpufreq: Convert to module_platform_driver
  cpufreq: Add DVFS support for Armada 37xx
  MAINTAINERS: add new entries for Armada 37xx cpufreq driver
  ...
2018-01-18 02:52:56 +01:00
Rafael J. Wysocki
f06970f4b0 Merge branch 'pm-cpufreq-thermal' into pm-cpufreq
* pm-cpufreq-thermal:
  cpu_cooling: Remove static-power related documentation
  cpu_cooling: Drop static-power related stuff
  cpu_cooling: Keep only one of_cpufreq*cooling_register() helper
  cpu_cooling: Remove unused cpufreq_power_cooling_register()
  cpu_cooling: Make of_cpufreq_power_cooling_register() parse DT
2018-01-18 02:52:42 +01:00
Luis de Bethencourt
bee344cb70 PCI / PM: Remove spurious semicolon
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.

Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-18 02:50:03 +01:00
Paul Mackerras
00608e1f00 KVM: PPC: Book3S HV: Allow HPT and radix on the same core for POWER9 v2.2
POWER9 chip versions starting with "Nimbus" v2.2 can support running
with some threads of a core in HPT mode and others in radix mode.
This means that we don't have to prohibit independent-threads mode
when running a HPT guest on a radix host, and we don't have to do any
of the synchronization between threads that was introduced in commit
c01015091a ("KVM: PPC: Book3S HV: Run HPT guests on POWER9 radix
hosts", 2017-10-19).

Rather than using up another CPU feature bit, we just do an
explicit test on the PVR (processor version register) at module
startup time to determine whether we have to take steps to avoid
having some threads in HPT mode and some in radix mode (so-called
"mixed mode").  We test for "Nimbus" (indicated by 0 or 1 in the top
nibble of the lower 16 bits) v2.2 or later, or "Cumulus" (indicated by
2 or 3 in that nibble) v1.1 or later.
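
A minimal sketch of such a PVR test, as a standalone function. The chip-type nibble follows the description above. The assumptions are that the revision sits in the low 12 bits with the major number in bits 8-11 and the minor number in bits 0-7 (so v2.2 is 0x202), and that the caller already knows this is a POWER9. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the PVR describes a chip that can run HPT and radix
 * threads on the same core.
 *
 * Assumption: revision encoded as (major << 8) | minor in the low
 * 12 bits, e.g. v2.2 == 0x202 and v1.1 == 0x101.
 */
bool pvr_allows_mixed_mode(uint32_t pvr)
{
	unsigned int chip = (pvr >> 12) & 0xf;	/* top nibble of lower 16 bits */
	unsigned int rev = pvr & 0xfff;

	if (chip <= 1)		/* 0 or 1: "Nimbus" */
		return rev >= 0x202;
	if (chip <= 3)		/* 2 or 3: "Cumulus" */
		return rev >= 0x101;

	/* Unknown chip type: this sketch conservatively says no. */
	return false;
}
```

Treating unknown chip types as unable to mix modes is a choice made for this sketch, not something the message above specifies.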

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-18 12:05:19 +11:00
Yonghong Song
eefa864a81 bpf: change fake_ip for bpf_trace_printk helper
Currently, for bpf_trace_printk helper, fake ip address 0x1
is used with comments saying that fake ip will not be printed.
This is indeed true for 4.12 and earlier versions, but for
4.13 and later versions, the ip address will be printed if
it cannot be resolved with kallsym. Run the samples/bpf/tracex5
program and you will see the following in the debugfs
trace_pipe output:
  ...
  <...>-1819  [003] ....   443.497877: 0x00000001: mmap
  <...>-1819  [003] ....   443.498289: 0x00000001: syscall=102 (one of get/set uid/pid/gid)
  ...

The kernel commit that changed this behavior is:
  commit feaf1283d1
  Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
  Date:   Thu Jun 22 17:04:55 2017 -0400

      tracing: Show address when function names are not found
  ...

This patch changed the comment and also altered the fake ip
address to 0x0 as users may think 0x1 has some special meaning
while it doesn't. The new output:
  ...
  <...>-1799  [002] ....    25.953576: 0: mmap
  <...>-1799  [002] ....    25.953865: 0: read(fd=0, buf=00000000053936b5, size=512)
  ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 01:51:42 +01:00
Jesper Dangaard Brouer
e2e3224122 samples/bpf: xdp2skb_meta comment explain why pkt-data pointers are invalidated
Improve the 'unknown reason' comment with an actual explanation of why
the ctx pkt-data pointers need to be loaded after the helper function
bpf_xdp_adjust_meta().  Based on the explanation Daniel gave.

Fixes: 36e04a2d78 ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 01:49:09 +01:00
Xiongfeng Wang
321cb0308a Kbuild: suppress packed-not-aligned warning for default setting only
gcc-8 reports many -Wpacked-not-aligned warnings. For example:

./include/linux/ceph/msgr.h:67:1: warning: alignment 1 of 'struct
ceph_entity_addr' is less than 8 [-Wpacked-not-aligned]
 } __attribute__ ((packed));

This patch suppresses this kind of warning for the default setting.

Signed-off-by: Xiongfeng Wang <xiongfeng.wang@linaro.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:53 +09:00
Masahiro Yamada
ab9ce9feed fixdep: use existing helper to check modular CONFIG options
str_ends_with() tests if the given token ends with a particular string.
Currently, it is used to check file paths without $(srctree).

Actually, we have one more place where this helper is useful.  Use it
to check if CONFIG option ends with _MODULE.
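
A suffix test of this kind could look like the sketch below. The name str_ends_with() comes from the message above, but the signature shown here and the `is_modular_config()` wrapper are assumptions for illustration, not necessarily what fixdep uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the slen-byte token s ends with the string sub. */
bool str_ends_with(const char *s, size_t slen, const char *sub)
{
	size_t sublen = strlen(sub);

	/* A suffix longer than the token can never match. */
	if (sublen > slen)
		return false;

	return memcmp(s + slen - sublen, sub, sublen) == 0;
}

/* Hypothetical wrapper: does this CONFIG token name a modular option? */
bool is_modular_config(const char *tok, size_t len)
{
	return str_ends_with(tok, len, "_MODULE");
}
```

Passing the token length explicitly lets the helper work on tokens that are not NUL-terminated inside a larger buffer.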

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:39 +09:00
Masahiro Yamada
87b95a8135 fixdep: refactor parse_dep_file()
parse_dep_file() has too much indentation, and puts the code far to
the right.  This commit refactors the code and removes one level
of indentation.

strrcmp() computes 'slen' by itself, but the caller already knows the
length of the token, so 'slen' can be passed via function argument.
With this, we can swap the order of strrcmp() and "*p = \0;"

Also, strrcmp() is an ambiguous function name.  Flip the logic and
rename it to str_ends_with().

I added a new helper is_ignored_file() - this returns 1 if the token
represents a file that should be ignored.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:39 +09:00
Masahiro Yamada
5d1ef76f5a fixdep: move global variables to local variables of main()
I do not mind global variables where they are useful enough.  In this
case, I do not see a good reason to use global variables since they
are just referenced in shallow places.  It is easy to pass them via
function arguments.

I squashed print_cmdline() into main() since it is just one line of code.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:38 +09:00
Masahiro Yamada
ccfe78873c fixdep: remove unneeded memcpy() in parse_dep_file()
Each token in the depfile is copied to the temporary buffer 's' to
terminate the token with zero.  We do not need to do this any more
because the parsed buffer is now writable.  Insert '\0' directly in
the buffer without calling memcpy().
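
The idea of terminating tokens in place can be sketched as below. This is a simplified illustration, not the parse_dep_file() logic: `split_in_place()` is a hypothetical helper that only splits on whitespace and ignores the other details of depfile syntax.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Split a writable, NUL-terminated buffer into whitespace-separated
 * tokens without copying: the first whitespace byte after each token
 * is overwritten with '\0'.  Stores up to max token pointers in toks
 * and returns how many were found.
 */
size_t split_in_place(char *buf, char **toks, size_t max)
{
	size_t n = 0;
	char *p = buf;

	while (*p && n < max) {
		/* Skip leading whitespace. */
		while (*p && isspace((unsigned char)*p))
			p++;
		if (!*p)
			break;

		toks[n++] = p;

		/* Advance to the end of the token and terminate it. */
		while (*p && !isspace((unsigned char)*p))
			p++;
		if (*p)
			*p++ = '\0';
	}
	return n;
}
```

Each entry in toks points into the original buffer, so every token can be handed to printf("%s") with no temporary copy.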

<limits.h> is no longer necessary. (It was needed for PATH_MAX).

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:38 +09:00
Masahiro Yamada
4003fd80cb fixdep: factor out common code for reading files
Now, do_config_files() and print_deps() are almost the same.  The only
difference is the parser function called (parse_config_file vs
parse_dep_file).

We can reduce the code duplication by factoring out the common code
into read_file() - this function allocates a buffer and loads a file
to it.  It returns the pointer to the allocated buffer.  (As before,
it bails out by exit(2) for any error.)  The caller must free the
buffer when done.

Having empty source files is possible; fixdep should simply skip them.
I deleted the "st.st_size == 0" check, so read_file() allocates a 1-byte
buffer for an empty file.  strstr() will immediately return NULL, and
this is what we expect.
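
A helper along these lines might look like the sketch below. It follows the behaviour described above (allocate, load, zero-terminate, exit(2) on error, caller frees), but the exact signature and error messages are assumptions rather than the code from the patch.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Load a whole file into a freshly allocated, NUL-terminated buffer.
 * Exits with status 2 on any error.  The caller must free() the result.
 */
char *read_file(const char *filename)
{
	struct stat st;
	size_t off = 0;
	char *buf;
	int fd;

	fd = open(filename, O_RDONLY);
	if (fd < 0) {
		perror(filename);
		exit(2);
	}
	if (fstat(fd, &st) < 0) {
		perror(filename);
		exit(2);
	}

	/* One extra byte for the terminator; an empty file gets 1 byte. */
	buf = malloc((size_t)st.st_size + 1);
	if (!buf) {
		perror("malloc");
		exit(2);
	}

	/* read() may return short counts, so loop until done or EOF. */
	while (off < (size_t)st.st_size) {
		ssize_t r = read(fd, buf + off, (size_t)st.st_size - off);

		if (r < 0) {
			perror(filename);
			exit(2);
		}
		if (r == 0)
			break;
		off += (size_t)r;
	}
	buf[off] = '\0';
	close(fd);

	return buf;
}
```

With the terminator in place, an empty file yields an empty string, so a strstr() search over it returns NULL at once.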

On the other hand, an empty dep_file should be treated as an error.
In this case, parse_dep_file() will error out with "no targets found"
and it is a correct error message.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:38 +09:00
Masahiro Yamada
01b5cbe701 fixdep: use malloc() and read() to load dep_file to buffer
Commit dee81e9886 ("fixdep: faster CONFIG_ search") changed how to
read files in which CONFIG options are searched.  It used malloc()
and read() instead of mmap() because it needed to zero-terminate the
buffer in order to use strstr().  print_deps() was left untouched
since there was no reason to change it.

Now, I have two motivations to change it in the same way.

 - do_config_file() and print_deps() do quite similar things; they
   open a file, load it into memory, and pass it to a parser function.
   If we use malloc() and read() for print_deps() too, we can factor
   out the common code.  (I will do this in the next commit.)

 - parse_dep_file() copies each token to a temporary buffer because
   it needs to zero-terminate it to be passed to printf().  It is not
   possible to modify the buffer directly because it is mmap'ed with
   O_RDONLY.  If we load the file content into a malloc'ed buffer, we
   can insert '\0' after each token, and save memcpy().  (I will do
   this in the commit after next.)

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:37 +09:00
Masahiro Yamada
41f92cffba fixdep: remove unnecessary <arpa/inet.h> inclusion
<arpa/inet.h> was included for ntohl(), but it was removed by
commit dee81e9886 ("fixdep: faster CONFIG_ search").

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-01-18 09:37:37 +09:00
Daniel Borkmann
cda18e9726 Merge branch 'bpf-dump-and-disasm-nfp-jit'
Jakub Kicinski says:

====================
Jiong says:

Currently bpftool can disassemble a host jited image, for example x86_64,
using libbfd. However, it cannot disassemble an offload jited image.

There are two reasons:

  1. bpf_obj_get_info_by_fd/struct bpf_prog_info couldn't get the address
     of the jited image and the image's length.

  2. Even after issue 1 is resolved, bpftool couldn't figure out the
     offload arch from bpf_prog_info, and therefore can't drive the libbfd
     disassembler correctly.

  This patch set resolves issue 1 by introducing two new fields "jited_len"
and "jited_image" in bpf_dev_offload. These two fields serve as the generic
interface to communicate the jited image address and length for all offload
targets to higher level callers. For example, bpf_obj_get_info_by_fd could
use them to fill the userspace visible fields jited_prog_len and
jited_prog_insns.

  This patch set resolves issue 2 by getting the bfd backend name through
"ifindex", i.e. the network interface index.

v1:
 - Deduce bfd arch name through ifindex, i.e. the network interface index.
   First, map ifindex to devname through ifindex_to_name_ns, then get
   pci id through /sys/class/dev/DEVNAME/device/vendor. (Daniel, Alexei)
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 01:26:16 +01:00
Jiong Wang
e65935969d tools: bpftool: improve architecture detection by using ifindex
The current architecture detection method in bpftool is designed for the
host case.

For the offload case, we can't use the architecture of "bpftool" itself.
Instead, we can call the existing "ifindex_to_name_ns" to get DEVNAME,
read the pci id from /sys/class/dev/DEVNAME/device/vendor, and then map the
vendor id to a bfd arch name, which is used to select the bfd backend
for the disassembler.
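
The last step, mapping a vendor id to a bfd arch name, can be sketched as below. Everything specific here is an assumption for illustration: the function name is hypothetical, the input is taken to be the text read from the sysfs vendor file (for example "0x19ee\n"), and the single table entry pairs the Netronome PCI vendor id with an arch string that may not match the one bpftool uses.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Map the contents of a sysfs "vendor" file to a bfd arch name.
 * Returns NULL when the vendor has no known offload disassembler.
 */
const char *bfd_arch_from_vendor(const char *vendor_str)
{
	/* Base 16 accepts an optional "0x" prefix and stops at '\n'. */
	unsigned long vendor = strtoul(vendor_str, NULL, 16);

	switch (vendor) {
	case 0x19ee:			/* assumed: Netronome */
		return "NFP-6xxx";	/* assumed arch name */
	default:
		return NULL;
	}
}
```

Returning NULL for unknown vendors lets the caller fall back to the host architecture or report that no disassembler is available.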

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 01:26:15 +01:00
Jiong Wang
eb1d7db927 nfp: bpf: set new jit info fields
This patch sets the new jit info fields introduced in this patch set.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 01:26:15 +01:00