Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from bpf, wireless, netfilter and
wireguard trees.
The bpf vs lockdown+audit fix is the most notable.
Things haven't slowed down just yet, both in terms of regressions in
current release and largish fixes for older code, but we usually see a
slowdown only after -rc5.
Current release - regressions:
- virtio-net: fix page faults and crashes when XDP is enabled
- mlx5e: fix HW timestamping with CQE compression, and make sure they
are only allowed to coexist with capable devices
- stmmac:
- fix kernel panic due to NULL pointer dereference of
mdio_bus_data
- fix double clk unprepare when no PHY device is connected
Current release - new code bugs:
- mt76: a few fixes for the recent MT7921 devices and runtime power
management
Previous releases - regressions:
- ice:
- track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx
- fix allowing VF to request more/less queues via virtchnl
- correct supported and advertised autoneg by using PHY
capabilities
- allow all LLDP packets from PF to Tx
- kbuild: quote OBJCOPY var to avoid a pahole call break the build
Previous releases - always broken:
- bpf, lockdown, audit: fix buggy SELinux lockdown permission checks
- mt76: address the recent FragAttack vulnerabilities not covered by
generic fixes
- ipv6: fix KASAN: slab-out-of-bounds Read in
fib6_nh_flush_exceptions
- Bluetooth:
- fix the erroneous flush_work() order, to avoid double free
- use correct lock to prevent UAF of hdev object
- nfc: fix NULL ptr dereference in llcp_sock_getname() after failed
connect
- ieee802154: multiple fixes to error checking and return values
- igb: fix XDP with PTP enabled
- intel: add correct exception tracing for XDP
- tls: fix use-after-free when TLS offload device goes down and back
up
- ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
- netfilter: nft_ct: skip expectations for confirmed conntrack
- mptcp: fix falling back to TCP in presence of out of order packets
early in connection lifetime
- wireguard: switch from O(n) to a O(1) algorithm for maintaining
peers, fixing stalls and a large memory leak in the process
Misc:
- devlink: correct VIRTUAL port to not have phys_port attributes
- Bluetooth: fix VIRTIO_ID_BT assigned number
- net: return the correct errno code ENOBUF -> ENOMEM
- wireguard:
- peer: allocate in kmem_cache saving 25% on peer memory
- do not use -O3"
* tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
cxgb4: avoid link re-train during TC-MQPRIO configuration
sch_htb: fix refcount leak in htb_parent_to_leaf_offload
wireguard: allowedips: free empty intermediate nodes when removing single node
wireguard: allowedips: allocate nodes in kmem_cache
wireguard: allowedips: remove nodes in O(1)
wireguard: allowedips: initialize list head in selftest
wireguard: peer: allocate in kmem_cache
wireguard: use synchronize_net rather than synchronize_rcu
wireguard: do not use -O3
wireguard: selftests: make sure rp_filter is disabled on vethc
wireguard: selftests: remove old conntrack kconfig value
virtchnl: Add missing padding to virtchnl_proto_hdrs
ice: Allow all LLDP packets from PF to Tx
ice: report supported and advertised autoneg using PHY capabilities
ice: handle the VF VSI rebuild failure
ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared
ice: Fix allowing VF to request more/less queues via virtchnl
virtio-net: fix for skb_over_panic inside big mode
ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions
fib: Return the correct errno code
...
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix NULL pointer dereference in 'perf probe' when handling
DW_AT_const_value when looking for a variable, which is valid.
- Fix for capability querying of perf_event_attr.cgroup support in
older kernels.
- Add missing cloning of evsel->use_config_name.
- Honor event config name on --no-merge in 'perf stat'.
- Fix some memory leaks found using ASAN.
- Fix the perf entry for perf_event_attr setup with make LIBPFM4=1 on
s390 z/VM.
- Update MIPS UAPI perf_regs.h file.
- Fix 'perf stat' BPF counter load return check.
* tag 'perf-tools-fixes-for-v5.13-2021-06-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf env: Fix memory leak of bpf_prog_info_linear member
perf symbol-elf: Fix memory leak by freeing sdt_note.args
perf stat: Honor event config name on --no-merge
perf evsel: Add missing cloning of evsel->use_config_name
perf test: Test 17 fails with make LIBPFM4=1 on s390 z/VM
perf stat: Fix error return code in bperf__load()
perf record: Move probing cgroup sampling support
perf probe: Fix NULL pointer dereference in convert_variable_location()
perf tools: Copy uapi/asm/perf_regs.h from the kernel for MIPS
Pull PCI fixes from Bjorn Helgaas:
- Fix MSIs for platforms with "msi-map" device-tree property, which we
broke in v5.13-rc1 (Jean-Philippe Brucker)
- Add Krzysztof Wilczyński as PCI reviewer (Lorenzo Pieralisi)
* tag 'pci-v5.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/MSI: Fix MSIs for generic hosts that use device-tree's "msi-map"
MAINTAINERS: Add Krzysztof as PCI host/endpoint controllers reviewer
When configuring TC-MQPRIO offload, only turn off netdev carrier and
don't bring physical link down in hardware. Otherwise, when the
physical link is brought up again after configuration, it gets
re-trained and stalls ongoing traffic.
Also, when firmware is no longer accessible or crashed, avoid sending
FLOWC and waiting for reply that will never come.
Fix following hung_task_timeout_secs trace seen in these cases.
INFO: task tc:20807 blocked for more than 122 seconds.
Tainted: G S 5.13.0-rc3+ #122
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:tc state:D stack:14768 pid:20807 ppid: 19366 flags:0x00000000
Call Trace:
__schedule+0x27b/0x6a0
schedule+0x37/0xa0
schedule_preempt_disabled+0x5/0x10
__mutex_lock.isra.14+0x2a0/0x4a0
? netlink_lookup+0x120/0x1a0
? rtnl_fill_ifinfo+0x10f0/0x10f0
__netlink_dump_start+0x70/0x250
rtnetlink_rcv_msg+0x28b/0x380
? rtnl_fill_ifinfo+0x10f0/0x10f0
? rtnl_calcit.isra.42+0x120/0x120
netlink_rcv_skb+0x4b/0xf0
netlink_unicast+0x1a0/0x280
netlink_sendmsg+0x216/0x440
sock_sendmsg+0x56/0x60
__sys_sendto+0xe9/0x150
? handle_mm_fault+0x6d/0x1b0
? do_user_addr_fault+0x1c5/0x620
__x64_sys_sendto+0x1f/0x30
do_syscall_64+0x3c/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7f73218321
RSP: 002b:00007ffd19626208 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 000055b7c0a8b240 RCX: 00007f7f73218321
RDX: 0000000000000028 RSI: 00007ffd19626210 RDI: 0000000000000003
RBP: 000055b7c08680ff R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000055b7c085f5f6
R13: 000055b7c085f60a R14: 00007ffd19636470 R15: 00007ffd196262a0
Fixes: b1396c2bd6 ("cxgb4: parse and configure TC-MQPRIO offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit ae81feb733 ("sch_htb: fix null pointer dereference
on a null new_q") fixes a NULL pointer dereference bug, but it
is not correct.
Because htb_graft_helper properly handles the case when new_q
is NULL, and after the previous patch by skipping this call
which creates an inconsistency : dev_queue->qdisc will still
point to the old qdisc, but cl->parent->leaf.q will point to
the new one (which will be noop_qdisc, because new_q was NULL).
The code is based on an assumption that these two pointers are
the same, so it can lead to refcount leaks.
The correct fix is to add a NULL pointer check to protect
qdisc_refcount_inc inside htb_parent_to_leaf_offload.
Fixes: ae81feb733 ("sch_htb: fix null pointer dereference on a null new_q")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Suggested-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-06-04
This series contains updates to virtchnl header file and ice driver.
Brett fixes VF being unable to request a different number of queues then
allocated and adds clearing of VF_MBX_ATQLEN register for VF reset.
Haiyue handles error of rebuilding VF VSI during reset.
Paul fixes reporting of autoneg to use the PHY capabilities.
Dave allows LLDP packets without priority of TC_PRIO_CONTROL to be
transmitted.
Geert Uytterhoeven adds explicit padding to virtchnl_proto_hdrs
structure in the virtchnl header file.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason A. Donenfeld says:
====================
wireguard fixes for 5.13-rc5
Here are bug fixes to WireGuard for 5.13-rc5:
1-2,6) These are small, trivial tweaks to our test harness.
3) Linus thinks -O3 is still dangerous to enable. The code gen wasn't so
much different with -O2 either.
4) We were accidentally calling synchronize_rcu instead of
synchronize_net while holding the rtnl_lock, resulting in some rather
large stalls that hit production machines.
5) Peer allocation was wasting literally hundreds of megabytes on real
world deployments, due to oddly sized large objects not fitting
nicely into a kmalloc slab.
7-9) We move from an insanely expensive O(n) algorithm to a fast O(1)
algorithm, and cleanup a massive memory leak in the process, in
which allowed ips churn would leave danging nodes hanging around
without cleanup until the interface was removed. The O(1) algorithm
eliminates packet stalls and high latency issues, in addition to
bringing operations that took as much as 10 minutes down to less
than a second.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing single nodes, it's possible that that node's parent is an
empty intermediate node, in which case, it too should be removed.
Otherwise the trie fills up and never is fully emptied, leading to
gradual memory leaks over time for tries that are modified often. There
was originally code to do this, but was removed during refactoring in
2016 and never reworked. Now that we have proper parent pointers from
the previous commits, we can implement this properly.
In order to reduce branching and expensive comparisons, we want to keep
the double pointer for parent assignment (which lets us easily chain up
to the root), but we still need to actually get the parent's base
address. So encode the bit number into the last two bits of the pointer,
and pack and unpack it as needed. This is a little bit clumsy but is the
fastest and less memory wasteful of the compromises. Note that we align
the root struct here to a minimum of 4, because it's embedded into a
larger struct, and we're relying on having the bottom two bits for our
flag, which would only be 16-bit aligned on m68k.
The existing macro-based helpers were a bit unwieldy for adding the bit
packing to, so this commit replaces them with safer and clearer ordinary
functions.
We add a test to the randomized/fuzzer part of the selftests, to free
the randomized tries by-peer, refuzz it, and repeat, until it's supposed
to be empty, and then then see if that actually resulted in the whole
thing being emptied. That combined with kmemcheck should hopefully make
sure this commit is doing what it should. Along the way this resulted in
various other cleanups of the tests and fixes for recent graphviz.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous commit moved from O(n) to O(1) for removal, but in the
process introduced an additional pointer member to a struct that
increased the size from 60 to 68 bytes, putting nodes in the 128-byte
slab. With deployed systems having as many as 2 million nodes, this
represents a significant doubling in memory usage (128 MiB -> 256 MiB).
Fix this by using our own kmem_cache, that's sized exactly right. This
also makes wireguard's memory usage more transparent in tools like
slabtop and /proc/slabinfo.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, deleting peers would require traversing the entire trie in
order to rebalance nodes and safely free them. This meant that removing
1000 peers from a trie with a half million nodes would take an extremely
long time, during which we're holding the rtnl lock. Large-scale users
were reporting 200ms latencies added to the networking stack as a whole
every time their userspace software would queue up significant removals.
That's a serious situation.
This commit fixes that by maintaining a double pointer to the parent's
bit pointer for each node, and then using the already existing node list
belonging to each peer to go directly to the node, fix up its pointers,
and free it with RCU. This means removal is O(1) instead of O(n), and we
don't use gobs of stack.
The removal algorithm has the same downside as the code that it fixes:
it won't collapse needlessly long runs of fillers. We can enhance that
in the future if it ever becomes a problem. This commit documents that
limitation with a TODO comment in code, a small but meaningful
improvement over the prior situation.
Currently the biggest flaw, which the next commit addresses, is that
because this increases the node size on 64-bit machines from 60 bytes to
68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up
using twice as much memory per node, because of power-of-two
allocations, which is a big bummer. We'll need to figure something out
there.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The randomized trie tests weren't initializing the dummy peer list head,
resulting in a NULL pointer dereference when used. Fix this by
initializing it in the randomized trie test, just like we do for the
static unit test.
While we're at it, all of the other strings like this have the word
"self-test", so add it to the missing place here.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With deployments having upwards of 600k peers now, this somewhat heavy
structure could benefit from more fine-grained allocations.
Specifically, instead of using a 2048-byte slab for a 1544-byte object,
we can now use 1544-byte objects directly, thus saving almost 25%
per-peer, or with 600k peers, that's a savings of 303 MiB. This also
makes wireguard's memory usage more transparent in tools like slabtop
and /proc/slabinfo.
Fixes: 8b5553ace8 ("wireguard: queueing: get rid of per-peer ring buffers")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many of the synchronization points are sometimes called under the rtnl
lock, which means we should use synchronize_net rather than
synchronize_rcu. Under the hood, this expands to using the expedited
flavor of function in the event that rtnl is held, in order to not stall
other concurrent changes.
This fixes some very, very long delays when removing multiple peers at
once, which would cause some operations to take several minutes.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some distros may enable strict rp_filter by default, which will prevent
vethc from receiving the packets with an unrouteable reverse path address.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark bus as suspended during system suspend to block the future
transfers. Implement geni_i2c_resume_noirq() to resume the bus.
Fixes: 37692de5d5 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
If the hardware is still accessing memory after SMMU translation
is disabled (as part of smmu shutdown callback), then the
IOVAs (I/O virtual address) which it was using will go on the bus
as the physical addresses which will result in unknown crashes
like NoC/interconnect errors.
So, implement shutdown callback for i2c driver to suspend the bus
during system "reboot" or "shutdown".
Fixes: 37692de5d5 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
It causes mlxreg-hotplug probing failure: request_threaded_irq()
returns -EINVAL due to true value of condition:
((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN))
after flag "IRQF_NO_AUTOEN" has been added to:
err = devm_request_irq(&pdev->dev, priv->irq,
mlxreg_hotplug_irq_handler, IRQF_TRIGGER_FALLING
| IRQF_SHARED | IRQF_NO_AUTOEN,
"mlxreg-hotplug", priv);
This reverts commit bee3ecfed0 ("platform/mellanox: mlxreg-hotplug: move
to use request_irq by IRQF_NO_AUTOEN flag").
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210603172827.2599908-1-c_mykolak@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Pull sound fixes from Takashi Iwai:
"A couple of small fixes are found in the ALSA core side at this time;
a fix in the new LED handling code and a long-standing (and likely no
one would notice) ioctl bug.
The rest are usual HD-audio fixes, mostly device-specific quirks but
also one major regression fix that was introduced in 5.13"
* tag 'sound-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda: update the power_state during the direct-complete
ALSA: timer: Fix master timer notification
ALSA: control led: fix memory leak in snd_ctl_led_register
ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx
ALSA: hda/cirrus: Set Initial DMIC volume to -26 dB
ALSA: hda: Fix a regression in Capture Switch mixer read
ALSA: hda: Add AlderLake-M PCI ID
The first two bits of the CPUID leaf 0x8000001F EAX indicate whether SEV
or SME is supported, respectively. It's better to check whether SEV or
SME is actually supported before accessing the MSR_AMD64_SEV to check
whether SEV or SME is enabled.
This is both a bare-metal issue and a guest/VM issue. Since the first
generation Hygon Dhyana CPU doesn't support the MSR_AMD64_SEV, reading that
MSR results in a #GP - either directly from hardware in the bare-metal
case or via the hypervisor (because the RDMSR is actually intercepted)
in the guest/VM case, resulting in a failed boot. And since this is very
early in the boot phase, rdmsrl_safe()/native_read_msr_safe() can't be
used.
So check the CPUID bits first, before accessing the MSR.
[ tlendacky: Expand and improve commit message. ]
[ bp: Massage commit message. ]
Fixes: eab696d8e8 ("x86/sev: Do not require Hypervisor CPUID bit for SEV guests")
Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@vger.kernel.org> # v5.10+
Link: https://lkml.kernel.org/r/20210602070207.2480-1-puwen@hygon.cn
Pull drm fixes from Dave Airlie:
"Two big regression reverts in here, one for fbdev and one i915.
Otherwise it's mostly amdgpu display fixes, and tegra fixes.
fb:
- revert broken fb_defio patch
amdgpu:
- Display fixes
- FRU EEPROM error handling fix
- RAS fix
- PSP fix
- Releasing pinned BO fix
i915:
- Revert conversion to io_mapping_map_user() which lead to BUG_ON()
- Fix check for error valued returns in a selftest
tegra:
- SOR power domain race condition fix
- build warning fix
- runtime pm ref leak fix
- modifier fix"
* tag 'drm-fixes-2021-06-04-1' of git://anongit.freedesktop.org/drm/drm:
amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomic
drm/amdgpu: make sure we unpin the UVD BO
drm/amd/amdgpu:save psp ring wptr to avoid attack
drm/amd/display: Fix potential memory leak in DMUB hw_init
drm/amdgpu: Don't query CE and UE errors
drm/amd/display: Fix overlay validation by considering cursors
drm/amdgpu: refine amdgpu_fru_get_product_info
drm/amdgpu: add judgement for dc support
drm/amd/display: Fix GPU scaling regression by FS video support
drm/amd/display: Allow bandwidth validation for 0 streams.
Revert "i915: use io_mapping_map_user"
drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest()
Revert "fb_defio: Remove custom address_space_operations"
drm/tegra: Correct DRM_FORMAT_MOD_NVIDIA_SECTOR_LAYOUT
drm/tegra: sor: Fix AUX device reference leak
drm/tegra: Get ref for DP AUX channel, not its ddc adapter
drm/tegra: Fix shift overflow in tegra_shared_plane_atomic_update
drm/tegra: sor: Fully initialize SOR before registration
gpu: host1x: Split up client initalization and registration
drm/tegra: sor: Do not leak runtime PM reference
On m68k (Coldfire M547x):
CC drivers/net/ethernet/intel/i40e/i40e_main.o
In file included from drivers/net/ethernet/intel/i40e/i40e_prototype.h:9,
from drivers/net/ethernet/intel/i40e/i40e.h:41,
from drivers/net/ethernet/intel/i40e/i40e_main.c:12:
include/linux/avf/virtchnl.h:153:36: warning: division by zero [-Wdiv-by-zero]
153 | { virtchnl_static_assert_##X = (n)/((sizeof(struct X) == (n)) ? 1 : 0) }
| ^
include/linux/avf/virtchnl.h:844:1: note: in expansion of macro ‘VIRTCHNL_CHECK_STRUCT_LEN’
844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/avf/virtchnl.h:844:33: error: enumerator value for ‘virtchnl_static_assert_virtchnl_proto_hdrs’ is not an integer constant
844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs);
| ^~~~~~~~~~~~~~~~~~~
On m68k, integers are aligned on addresses that are multiples of two,
not four, bytes. Hence the size of a structure containing integers may
not be divisible by 4.
Fix this by adding explicit padding.
Fixes: 1f7ea1cd6a ("ice: Enable FDIR Configure for AVF")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently in the ice driver, the check whether to
allow a LLDP packet to egress the interface from the
PF_VSI is being based on the SKB's priority field.
It checks to see if the packets priority is equal to
TC_PRIO_CONTROL. Injected LLDP packets do not always
meet this condition.
SCAPY defaults to a sk_buff->protocol value of ETH_P_ALL
(0x0003) and does not set the priority field. There will
be other injection methods (even ones used by end users)
that will not correctly configure the socket so that
SKB fields are correctly populated.
Then ethernet header has to have to correct value for
the protocol though.
Add a check to also allow packets whose ethhdr->h_proto
matches ETH_P_LLDP (0x88CC).
Fixes: 0c3a6101ff ("ice: Allow egress control packets from PF_VSI")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Ethtool incorrectly reported supported and advertised auto-negotiation
settings for a backplane PHY image which did not support auto-negotiation.
This can occur when using media or PHY type for reporting ethtool
supported and advertised auto-negotiation settings.
Remove setting supported and advertised auto-negotiation settings based
on PHY type in ice_phy_type_to_ethtool(), and MAC type in
ice_get_link_ksettings().
Ethtool supported and advertised auto-negotiation settings should be
based on the PHY image using the AQ command get PHY capabilities with
media. Add setting supported and advertised auto-negotiation settings
based get PHY capabilities with media in ice_get_link_ksettings().
Fixes: 48cb27f2fd ("ice: Implement handlers for ethtool PHY/link operations")
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
VSI rebuild can be failed for LAN queue config, then the VF's VSI will
be NULL, the VF reset should be stopped with the VF entering into the
disable state.
Fixes: 12bb018c53 ("ice: Refactor VF reset")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Some AVF drivers expect the VF_MBX_ATQLEN register to be cleared for any
type of VFR/VFLR. Fix this by clearing the VF_MBX_ATQLEN register at the
same time as VF_MBX_ARQLEN.
Fixes: 82ba01282c ("ice: clear VF ARQLEN register on reset")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Commit 12bb018c53 ("ice: Refactor VF reset") caused a regression
that removes the ability for a VF to request a different amount of
queues via VIRTCHNL_OP_REQUEST_QUEUES. This prevents VF drivers to
either increase or decrease the number of queue pairs they are
allocated. Fix this by using the variable vf->num_req_qs when
determining the vf->num_vf_qs during VF VSI creation.
Fixes: 12bb018c53 ("ice: Refactor VF reset")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
ASan reported a memory leak caused by info_linear not being deallocated.
The info_linear was allocated during in perf_event__synthesize_one_bpf_prog().
This patch adds the corresponding free() when bpf_prog_info_node
is freed in perf_env__purge_bpf().
$ sudo ./perf record -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]
=================================================================
==297735==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 7688 byte(s) in 19 object(s) allocated from:
#0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f)
#1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16
#2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16
#3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9
#4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8
#5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8
#6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8
#7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3
#11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
__bad_area_nosemaphore() calls both force_sig_pkuerr() and
force_sig_fault() when handling SEGV_PKUERR. This does not cause
problems because the second signal is filtered by the legacy_queue()
check in __send_signal() because in both cases, the signal is SIGSEGV,
the second one seeing that the first one is already pending.
This causes the kernel to do unnecessary work so send the signal only
once for SEGV_PKUERR.
[ bp: Massage commit message. ]
Fixes: 9db812dbb2 ("signal/x86: Call force_sig_pkuerr from __bad_area_nosemaphore")
Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiashuo Liang <liangjs@pku.edu.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://lkml.kernel.org/r/20210601085203.40214-1-liangjs@pku.edu.cn