Commit Graph

874965 Commits

Author SHA1 Message Date
Ilias Apalodimas
111cf1ab4d net: ethernet: ti: introduce cpsw switchdev based driver part 2 - switch
CPSW switchdev based driver which is operating in dual-emac mode by
default, thus working as 2 individual network interfaces. The Switch mode
can be enabled by configuring devlink driver parameter "switch_mode" to 1:

	devlink dev param set platform/48484000.switch \
	name switch_mode value 1 cmode runtime

This can be done regardless of the state of Port's netdevs - UP/DOWN, but
Port's netdev devices have to be UP before joining the bridge to avoid
overwriting of bridge configuration as CPSW switch driver completely
reloads its configuration when first Port changes its state to UP.
When the both interfaces joined the bridge - CPSW switch driver will start
marking packets with offload_fwd_mark flag unless "ale_bypass=0".

All configuration is implemented via switchdev API and notifiers.
Supported:
 - SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS
 - SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: BR_MCAST_FLOOD
 - SWITCHDEV_ATTR_ID_PORT_STP_STATE
 - SWITCHDEV_OBJ_ID_PORT_VLAN
 - SWITCHDEV_OBJ_ID_PORT_MDB
 - SWITCHDEV_OBJ_ID_HOST_MDB

Hence CPSW switchdev driver supports:
- FDB offloading
- MDB offloading
- VLAN filtering and offloading
- STP

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Ilias Apalodimas
ed3525eda4 net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac
Part 1:
 Introduce basic CPSW dual_mac driver (cpsw_new.c) which is operating in
dual-emac mode by default, thus working as 2 individual network interfaces.
Main differences from legacy CPSW driver are:

 - optimized promiscuous mode: The P0_UNI_FLOOD (both ports) is enabled in
addition to ALLMULTI (current port) instead of ALE_BYPASS. So, Ports in
promiscuous mode will keep possibility of mcast and vlan filtering, which
is provides significant benefits when ports are joined to the same bridge,
but without enabling "switch" mode, or to different bridges.
 - learning disabled on ports as it make not too much sense for
   segregated ports - no forwarding in HW.
 - enabled basic support for devlink.

	devlink dev show
		platform/48484000.switch

	devlink dev param show
	 platform/48484000.switch:
	name ale_bypass type driver-specific
	 values:
		cmode runtime value false

 - "ale_bypass" devlink driver parameter allows to enable
ALE_CONTROL(4).BYPASS mode for debug purposes.
 - updated DT bindings.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko
ef63fe72f6 dt-bindings: net: ti: add new cpsw switch driver bindings
Add bindings for the new TI CPSW switch driver. Comparing to the legacy
bindings (net/cpsw.txt):
- ports definition follows DSA bindings (net/dsa/dsa.txt) and ports can be
marked as "disabled" if not physically wired.
- all deprecated properties dropped;
- all legacy propertiies dropped which represent constant HW cpapbilities
(cpdma_channels, ale_entries, bd_ram_size, mac_control, slaves,
active_slave)
- TI CPTS DT properties are reused as is, but grouped in "cpts" sub-node
- TI Davinci MDIO DT bindings are reused as is, because Davinci MDIO is
reused.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko
c5013ac1dd net: ethernet: ti: cpsw: move set of common functions in cpsw_priv
As a preparatory patch to add support for a switchdev based cpsw driver,
move common functions to cpsw-priv.c so that they can be used across both
drivers.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko
51a9533797 net: ethernet: ti: cpsw: resolve build deps of cpsw drivers
A following patches introduce new CPSW switchdev driver which uses common
code with legacy CPSW driver. This will introduce build dependency between
CPSW switchdev and CPSW legacy drivers related to for_each_slave() and
cpsw_slave_index() - they can be compiled both, but only one of them will
be not functional depending in Kconfig settings due to duffrences in Slave
Ports indexes calculation.

To fix this make for_each_slave() local (it's used now only by legacy CPSW
driver) and convert cpsw_slave_index() to be a function pointer which is
assigned in probe. Driver to probe is defined by DT.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Ilias Apalodimas
e85c143707 net: ethernet: ti: ale: modify vlan/mdb api for switchdev
A following patch introduces switchdev functionality, so modify
ALE engine VLANs/MDBs API:
- cpsw_ale_del_mcast(): update so it will remove only selected ports from
mcast port_mask or delete whole mcast record if !port_mask
- cpsw_ale_del_vlan(): update so it will remove only selected ports from
all VLAN record's masks or delete whole VLAN record if !port_mask
- add cpsw_ale_vlan_add_modify() to add or modify existing VLAN record's
masks
- add cpsw_ale_set_unreg_mcast() for enabling unreg mcast on port VLANs

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko
4b41d34367 net: ethernet: ti: cpsw: allow untagged traffic on host port
Now untagged vlan traffic is not support on Host P0 port. This patch adds
in ALE context bitmap of VLANs for which Host P0 port bit set in Force
Untagged Packet Egress bitmask in VLANs ALE entries, and adds corresponding
check in VLAN incapsulation header parsing function cpsw_rx_vlan_encap().

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko
7fe579dfb9 net: ethernet: ti: ale: clean ale tbl on init and intf restart
Clean CPSW ALE on init and intf restart (up/down) to avoid reading obsolete
or garbage entries from ALE table.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
David S. Miller
b9242da6f6 Merge branch 'nf_tables_offload-vlan-matching-support'
Pablo Neira Ayuso says:

====================
nf_tables_offload: vlan matching support

The following patchset contains Netfilter support for vlan matching
offloads:

1) Constify nft_reg_load() as a preparation patch.
2) Restrict rule matching to ingress interface type ARPHRD_ETHER.
3) Add new vlan_tci field to flow_dissector_key_vlan structure,
   to allow to set up vlan_id, vlan_dei and vlan_priority in one go.
4) C-VLAN matching support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:21:35 -08:00
Pablo Neira Ayuso
89d8fd44ab netfilter: nft_payload: add C-VLAN offload support
Match on h_vlan_encapsulated_proto and set up protocol dependency. Check
for protocol dependency before accessing the tci field. Allow to match
on the encapsulated ethertype too.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:21:34 -08:00
Pablo Neira Ayuso
a82055af59 netfilter: nft_payload: add VLAN offload support
Match on ethertype and set up protocol dependency. Check for protocol
dependency before accessing the tci field. Allow to match on the
encapsulated ethertype too.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:21:34 -08:00
Pablo Neira Ayuso
8819efc943 netfilter: nf_tables_offload: allow ethernet interface type only
Hardware offload support at this stage assumes an ethernet device in
place. The flow dissector provides the intermediate representation to
express this selector, so extend it to allow to store the interface
type. Flower does not uses this, so skb_flow_dissect_meta() is not
extended to match on this new field.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:21:34 -08:00
Pablo Neira Ayuso
7cd9a58d68 netfilter: nf_tables: constify nft_reg_load{8, 16, 64}()
This patch constifies the pointer to source register data that is passed
as an input parameter.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:21:34 -08:00
Xin Long
2f1d370b99 lwtunnel: add support for multiple geneve opts
geneve RFC (draft-ietf-nvo3-geneve-14) allows a geneve packet to carry
multiple geneve opts, so it's necessary for lwtunnel to support adding
multiple geneve opts in one lwtunnel route. But vxlan and erspan opts
are still only allowed to add one option.

With this patch, iproute2 could make it like:

  # ip r a 1.1.1.0/24 encap ip id 1 geneve_opts 0:0:12121212,1:2:12121212 \
    dst 10.1.0.2 dev geneve1

  # ip r a 1.1.1.0/24 encap ip id 1 vxlan_opts 456 \
    dst 10.1.0.2 dev erspan1

  # ip r a 1.1.1.0/24 encap ip id 1 erspan_opts 1:123:0:0 \
    dst 10.1.0.2 dev erspan1

Which are pretty much like cls_flower and act_tunnel_key.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 10:15:13 -08:00
Rahul Lakkireddy
272630feb4 cxgb4: remove unneeded semicolon for switch block
Semicolon is not required at the end of switch block. So, remove it.

Addresses coccinelle warning:
drivers/net/ethernet/chelsio/cxgb4/sge.c:2260:2-3: Unneeded semicolon

Fixes: 4846d5330d ("cxgb4: add Tx and Rx path for ETHOFLD traffic")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19 16:39:51 -08:00
Vladimir Oltean
b8fc7177d8 net: dsa: felix: Fix CPU port assignment when not last port
On the NXP LS1028A, there are 2 Ethernet links between the Felix switch
and the ENETC:
- eno2 <-> swp4, at 2.5G
- eno3 <-> swp5, at 1G

Only one of the above Ethernet port pairs can act as a DSA link for
tagging.

When adding initial support for the driver, it was tested only on the 1G
eno3 <-> swp5 interface, due to the necessity of using PHYLIB initially
(which treats fixed-link interfaces as emulated C22 PHYs, so it doesn't
support fixed-link speeds higher than 1G).

After making PHYLINK work, it appears that swp4 still can't act as CPU
port. So it looks like ocelot_set_cpu_port was being called for swp4,
but then it was called again for swp5, overwriting the CPU port assigned
in the DT.

It appears that when you call dsa_upstream_port for a port that is not
defined in the device tree (such as swp5 when using swp4 as CPU port),
its dp->cpu_dp pointer is not initialized by dsa_tree_setup_default_cpu,
and this trips up the following condition in dsa_upstream_port:

	if (!cpu_dp)
		return port;

So the moral of the story is: don't call dsa_upstream_port for a port
that is not defined in the device tree, and therefore its dsa_port
structure is not completely initialized (ds->num_ports is still 6).

Fixes: 5605194877 ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19 15:21:45 -08:00
Colin Ian King
c21709e744 net: phy: dp83869: fix return of uninitialized variable ret
In the case where the call to phy_interface_is_rgmii returns zero
the variable ret is left uninitialized and this is returned at
the end of the function dp83869_configure_rgmii.  Fix this by
returning 0 instead of the uninitialized value in ret.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 01db923e83 ("net: phy: dp83869: Add TI dp83869 phy")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:23:44 -08:00
Xin Long
3132174b4b lwtunnel: change to use nla_put_u8 for LWTUNNEL_IP_OPT_ERSPAN_VER
LWTUNNEL_IP_OPT_ERSPAN_VER is u8 type, and nla_put_u8 should have
been used instead of nla_put_u32(). This is a copy-paste error.

Fixes: b0a21810bd ("lwtunnel: add options setting and dumping for erspan")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:19:56 -08:00
David S. Miller
bec39a9fbb Merge branch 'bnxt_en-Updates'
Michael Chan says:

====================
bnxt_en: Updates.

This series has the firmware interface update that changes the aRFS/ntuple
interface on 57500 chips.  The 2nd patch adds a counter and improves
the hardware buffer error handling on the 57500 chips.  The rest of the
series is mainly enhancements on error recovery and firmware reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:29 -08:00
Pavan Chebbi
642aebdee4 bnxt_en: Abort waiting for firmware response if there is no heartbeat.
This is especially beneficial during the NVRAM related firmware
commands that have longer timeouts.  If the BNXT_STATE_FW_FATAL_COND
flag gets set while waiting for firmware response, abort and return
error.

Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:29 -08:00
Vasundhara Volam
a2b31e27f6 bnxt_en: Add a warning message for driver initiated reset
During loss of heartbeat, log this warning message.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Vasundhara Volam
05069dd4c5 bnxt_en: Return proper error code for non-existent NVM variable
For NVM params that are not supported in the current NVM
configuration, return the error as -EOPNOTSUPP.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Vasundhara Volam
e4e38237d7 bnxt_en: Report health status update after reset is done
Report health status update to devlink health reporter, once
reset is completed.

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Vasundhara Volam
e633a32935 bnxt_en: Set MASTER flag during driver registration.
The Linux driver is capable of being the master function to handle
resets, so we set the flag to let firmware know.  Some other
drivers, such as DPDK, is not capable and will not set the flag.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Vasundhara Volam
0a3f4e4f34 bnxt_en: Extend ETHTOOL_RESET to hot reset driver.
If firmware supports hot reset, extend ETHTOOL_RESET to support
hot reset driver which does not require a driver reload after
ETHTOOL_RESET.  The driver will go through the same coordinated
reset sequence as a firmware initiated fatal/non-fatal reset.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Vasundhara Volam
5b306bde2b bnxt_en: Increase firmware response timeout for coredump commands.
Use the larger HWRM_COREDUMP_TIMEOUT value for coredump related
data response from the firmware.  These commands take longer than
normal commands.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Michael Chan
19b3751ffa bnxt_en: Improve RX buffer error handling.
When hardware reports RX buffer errors, the latest 57500 chips do not
require reset.  The packet is discarded by the hardware and the
ring will continue to operate.

Also, add an rx_buf_errors counter for this type of error.  It can help
the user to identify if the aggregation ring is too small.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
Michael Chan
41136ab358 bnxt_en: Update firmware interface spec to 1.10.1.12.
The aRFS ring table interface has changed for the 57500 chips.  Updating
it accordingly so it will work with the latest production firmware.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:13:28 -08:00
David S. Miller
c4154cffa3 Merge branch 'selftests-Add-ethtool-and-scale-tests'
Ido Schimmel says:

====================
selftests: Add ethtool and scale tests

This patch set adds generic ethtool tests and a mlxsw-specific router
scale test for Spectrum-2.

Patches #1-#2 from Danielle add the router scale test for Spectrum-2. It
re-uses the same test as Spectrum-1, but it is invoked with a different
scale, according to what it is queried from devlink-resource.

Patches #3-#5 from Amit are a re-work of the ethtool tests that were
posted in the past [1]. Patches #3-#4 add the necessary library
routines, whereas patch #5 adds the test itself. The test checks both
good and bad flows with autoneg on and off. The test plan it detailed in
the commit message.

Last time Andrew and Florian (copied) provided very useful feedback that
is incorporated in this set. Namely:

* Parse the value of the different link modes from
  /usr/include/linux/ethtool.h
* Differentiate between supported and advertised speeds and use the
  latter in autoneg tests
* Make the test generic and move it to net/forwarding/ instead of being
  mlxsw-specific

[1] https://patchwork.ozlabs.org/cover/1112903/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:11:54 -08:00
Amit Cohen
64916b57c0 selftests: forwarding: Add speed and auto-negotiation test
Check configurations and packets transference with different variations
of autoneg and speed.

Test plan:
1. Test force of same speed with autoneg off
2. Test force of different speeds with autoneg off (should fail)
3. One side is autoneg on and other side sets force of common speeds
4. One side is autoneg on and other side only advertises a subset of the
   common speeds (one speed of the subset)
5. One side is autoneg on and other side only advertises a subset of the
   common speeds. Check that highest speed is negotiated
6. Test autoneg on, but each side advertises different speeds (should
   fail)

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:11:54 -08:00
Amit Cohen
8f72a9cf36 selftests: forwarding: lib.sh: Add wait for dev with timeout
Add a function that waits for device with maximum number of iterations.
It enables to limit the waiting and prevent infinite loop.

This will be used by the subsequent patch which will set two ports to
different speeds in order to make sure they cannot negotiate a link.

Waiting for all the setup is limited with 10 minutes for each device.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:11:54 -08:00
Amit Cohen
646cf7ed9a selftests: forwarding: Add ethtool_lib.sh
Functions:
1. speeds_arr_get
	The function returns an array of speed values from
        /usr/include/linux/ethtool.h The array looks as follows:
	[10baseT/Half] = 0,
	[10baseT/Full] = 1,
	...

2. ethtool_set:
	params: cmd
	The function runs ethtool by cmd (ethtool -s cmd) and checks if
	there was an error in configuration

3. dev_speeds_get:
	params: dev, with_mode (0 or 1), adver (0 or 1)
	return value: Array of supported/Advertised link modes
	with/without mode

	* Example 1:
	speeds_get swp1 0 0
	return: 1000 10000 40000
	* Example 2:
	speeds_get swp1 1 1
	return: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full

4. common_speeds_get:
	params: dev1, dev2, with_mode (0 or 1), adver (0 or 1)
	return value: Array of common speeds of dev1 and dev2

	* Example:
	common_speeds_get swp1 swp2 0 0
	return: 1000 10000
	Assuming that swp1 supports 1000 10000 40000 and swp2 supports
	1000 10000

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:11:54 -08:00
Danielle Ratson
b22b0b0b10 selftests: mlxsw: Check devlink device before running test
The scale test for Spectrum-2 should only be invoked for Spectrum-2.
Skip the test otherwise.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:11:54 -08:00
Danielle Ratson
0fed96fa83 selftests: mlxsw: Add router scale test for Spectrum-2
Same as for Spectrum-1, test the ability to add the maximum number of
routes possible to the switch.

Invoke the test from the 'resource_scale' wrapper script.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:11:54 -08:00
David S. Miller
6960f7e3b2 Merge branch 'page_pool-followup-changes-to-restore-tracepoint-features'
Jesper Dangaard says:

====================
page_pool: followup changes to restore tracepoint features

This patchset is a followup to Jonathan patch, that do not release
pool until inflight == 0. That changed page_pool to be responsible for
its own delayed destruction instead of relying on xdp memory model.

As the page_pool maintainer, I'm promoting the use of tracepoint to
troubleshoot and help driver developers verify correctness when
converting at driver to use page_pool. The role of xdp:mem_disconnect
have changed, which broke my bpftrace tools for shutdown verification.
With these changes, the same capabilities are regained.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:03:18 -08:00
Jesper Dangaard Brouer
832ccf6f80 page_pool: extend tracepoint to also include the page PFN
The MM tracepoint for page free (called kmem:mm_page_free) doesn't provide
the page pointer directly, instead it provides the PFN (Page Frame Number).
This is annoying when writing a page_pool leak detector in BPF.

This patch change page_pool tracepoints to also provide the PFN.
The page pointer is still provided to allow other kinds of
troubleshooting from BPF.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:03:18 -08:00
Jesper Dangaard Brouer
7c9e69428d page_pool: add destroy attempts counter and rename tracepoint
When Jonathan change the page_pool to become responsible to its
own shutdown via deferred work queue, then the disconnect_cnt
counter was removed from xdp memory model tracepoint.

This patch change the page_pool_inflight tracepoint name to
page_pool_release, because it reflects the new responsability
better.  And it reintroduces a counter that reflect the number of
times page_pool_release have been tried.

The counter is also used by the code, to only empty the alloc
cache once.  With a stuck work queue running every second and
counter being 64-bit, it will overrun in approx 584 billion
years. For comparison, Earth lifetime expectancy is 7.5 billion
years, before the Sun will engulf, and destroy, the Earth.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:03:18 -08:00
Jesper Dangaard Brouer
c491eae8f9 xdp: remove memory poison on free for struct xdp_mem_allocator
When looking at the details I realised that the memory poison in
__xdp_mem_allocator_rcu_free doesn't make sense. This is because the
SLUB allocator uses the first 16 bytes (on 64 bit), for its freelist,
which overlap with members in struct xdp_mem_allocator, that were
updated.  Thus, SLUB already does the "poisoning" for us.

I still believe that poisoning memory make sense in other cases.
Kernel have gained different use-after-free detection mechanism, but
enabling those is associated with a huge overhead. Experience is that
debugging facilities can change the timing so much, that that a race
condition will not be provoked when enabled. Thus, I'm still in favour
of poisoning memory where it makes sense.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 17:03:17 -08:00
Russell King
b95e86d846 net: phy: avoid matching all-ones clause 45 PHY IDs
We currently match clause 45 PHYs using any ID read from a MMD marked
as present in the "Devices in package" registers 5 and 6.  However,
this is incorrect.  45.2 says:

  "The definition of the term package is vendor specific and could be
   a chip, module, or other similar entity."

so a package could be more or less than the whole PHY - a PHY could be
made up of several modules instantiated onto a single chip such as the
Marvell 88x3310, or some of the MMDs could be disabled according to
chip configuration, such as the Broadcom 84881.

In the case of Broadcom 84881, the "Devices in package" registers
contain 0xc000009b, meaning that there is a PHYXS present in the
package, but all registers in MMD 4 return 0xffff.  This leads to our
matching code incorrectly binding this PHY to one of our generic PHY
drivers.

This patch changes the way we determine whether to attempt to match a
MMD identifier, or use it to request a module - if the identifier is
all-ones, then we skip over it. When reading the identifiers, we
initialise phydev->c45_ids.device_ids to all-ones, only reading the
device ID if the "Devices in package" registers indicates we should.

This avoids the generic drivers incorrectly matching on a PHY ID of
0xffffffff.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 16:58:13 -08:00
David S. Miller
e64dbb1ac0 Merge branch 'Add-support-for-SFPs-behind-PHYs'
Russell King says:

====================
Add support for SFPs behind PHYs

This series adds partial support for SFP cages connected to PHYs,
specifically optical SFPs.

We add core infrastructure to phylib for this, and arrange for
minimal code in the PHY driver - currently, this is code to verify
that the module is one that we can support for Marvell 10G PHYs.

v2: add yaml binding patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 16:56:13 -08:00
Russell King
36023da1c7 net: phy: marvell10g: add SFP+ support
Add support for SFP+ cages to the Marvell 10G PHY driver. This is
slightly complicated by the way phylib works in that we need to use
a multi-step process to attach the SFP bus, and we also need to track
the phylink state machine to know when the module's transmit disable
signal should change state.

With appropriate DT changes, this allows the SFP+ canges on the
Macchiatobin platform to be functional.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 16:56:13 -08:00
Russell King
298e54fa81 net: phy: add core phylib sfp support
Add core phylib help for supporting SFP sockets on PHYs.  This provides
a mechanism to inform the SFP layer about PHY up/down events, and also
unregister the SFP bus when the PHY is going away.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 16:56:13 -08:00
Russell King
fb3d8bcde6 dt-bindings: net: add ethernet controller and phy sfp property
Document the missing sfp property for ethernet controllers (which
has existed for some time) which is being extended to ethernet PHYs.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 16:56:13 -08:00
David S. Miller
99638e9d6c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Wildcard support for the net,iface set from Kristian Evensen.

2) Offload support for matching on the input interface.

3) Simplify matching on vlan header fields.

4) Add nft_payload_rebuild_vlan_hdr() function to rebuild the vlan
   header from the vlan sk_buff metadata.

5) Pass extack to nft_flow_cls_offload_setup().

6) Add C-VLAN matching support.

7) Use time64_t in xt_time to fix y2038 overflow, from Arnd Bergmann.

8) Use time_t in nft_meta to fix y2038 overflow, also from Arnd.

9) Add flow_action_entry_next() helper function to flowtable offload
   infrastructure.

10) Add IPv6 support to the flowtable offload infrastructure.

11) Support for input interface matching from postrouting,
    from Phil Sutter.

12) Missing check for ndo callback in flowtable offload, from wenxu.

13) Remove conntrack parameter from flow_offload_fill_dir(), from wenxu.

14) Do not pass flow_rule object for rule removal, cookie is sufficient
    to achieve this.

15) Release flow_rule object in case of error from the offload commit
    path.

16) Undo offload ruleset updates if transaction fails.

17) Check for error when binding flowtable callbacks, from wenxu.

18) Always unbind flowtable callbacks when unregistering hooks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18 16:43:05 -08:00
David S. Miller
19b7e21c55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Lots of overlapping changes and parallel additions, stuff
like that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-16 21:51:42 -08:00
Linus Torvalds
1d4c79ed32 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This reverts a number of changes to the khwrng thread which feeds the
  kernel random number pool from hwrng drivers. They were trying to fix
  issues with suspend-and-resume but ended up causing regressions"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  Revert "hwrng: core - Freeze khwrng thread during suspend"
2019-11-16 18:14:32 -08:00
Herbert Xu
08e97aec70 Revert "hwrng: core - Freeze khwrng thread during suspend"
This reverts commit 03a3bb7ae6 ("hwrng: core - Freeze khwrng
thread during suspend"), ff296293b3 ("random: Support freezable
kthreads in add_hwgenerator_randomness()") and 59b569480d ("random:
Use wait_event_freezable() in add_hwgenerator_randomness()").

These patches introduced regressions and we need more time to
get them ready for mainline.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 08:48:17 +08:00
Linus Torvalds
fe30021c36 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two fixes: disable unreliable HPET on Intel Coffe Lake platforms, and
  fix a lockdep splat in the resctrl code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix potential lockdep warning
  x86/quirks: Disable HPET on Intel Coffe Lake platforms
2019-11-16 16:10:59 -08:00
Linus Torvalds
3278b3b678 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "Fix integer truncation bug in __do_adjtimex()"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ntp/y2038: Remove incorrect time_t truncation
2019-11-16 16:08:46 -08:00
Linus Torvalds
5ffaf037e7 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes: a handful of AUX event handling related fixes, a Sparse
  fix and two ABI fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix missing static inline on perf_cgroup_switch()
  perf/core: Consistently fail fork on allocation failures
  perf/aux: Disallow aux_output for kernel events
  perf/core: Reattach a misplaced comment
  perf/aux: Fix the aux_output group inheritance fix
  perf/core: Disallow uncore-cgroup events
2019-11-16 15:56:01 -08:00