The debug message in tcp_retransmit_timer() is slightly wrong, because
they could be printed even if we did not receive a new ACK packet from
the remote peer.
Change it to probing zero-window, as it is a expected case now. The
description may be not correct.
Adding the duration since the last ACK we received, and the duration of
the retransmission, which are useful for debugging.
And the message now like this:
Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 209ms ago, lasting 209ms
Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 404ms ago, lasting 408ms
Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 812ms ago, lasting 1224ms
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In tcp_retransmit_timer(), a window shrunk connection will be regarded
as timeout if 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX'. This is not
right all the time.
The retransmits will become zero-window probes in tcp_retransmit_timer()
if the 'snd_wnd==0'. Therefore, the icsk->icsk_rto will come up to
TCP_RTO_MAX sooner or later.
However, the timer can be delayed and be triggered after 122877ms, not
TCP_RTO_MAX, as I tested.
Therefore, 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX' is always true
once the RTO come up to TCP_RTO_MAX, and the socket will die.
Fix this by replacing the 'tcp_jiffies32' with '(u32)icsk->icsk_timeout',
which is exact the timestamp of the timeout.
However, "tp->rcv_tstamp" can restart from idle, then tp->rcv_tstamp
could already be a long time (minutes or hours) in the past even on the
first RTO. So we double check the timeout with the duration of the
retransmission.
Meanwhile, making "2 * TCP_RTO_MAX" as the timeout to avoid the socket
dying too soon.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/netdev/CADxym3YyMiO+zMD4zj03YPM3FBi-1LHi6gSD2XT8pyAMM096pg@mail.gmail.com/
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fow now, an ACK can update the window in following case, according to
the tcp_may_update_window():
1. the ACK acknowledged new data
2. the ACK has new data
3. the ACK expand the window and the seq of it is valid
Now, we allow the ACK update the window if the window is 0, and the
seq/ack of it is valid. This is for the case that the receiver replies
an zero-window ACK when it is under memory stress and can't queue the new
data.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For now, skb will be dropped when no memory, which makes client keep
retrans util timeout and it's not friendly to the users.
In this patch, we reply an ACK with zero-window in this case to update
the snd_wnd of the sender to 0. Therefore, the sender won't timeout the
connection and will probe the zero-window with the retransmits.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
i40e: Replace one-element arrays with flexible-array members
Replace one-element arrays with flexible-array members in multiple
structures.
This results in no differences in binary output.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
i40e: Replace one-element array with flex-array member in struct i40e_profile_aq_section
i40e: Replace one-element array with flex-array member in struct i40e_section_table
i40e: Replace one-element array with flex-array member in struct i40e_profile_segment
i40e: Replace one-element array with flex-array member in struct i40e_package_header
====================
Link: https://lore.kernel.org/r/20230810175302.1964182-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré says:
====================
net: dsa: rzn1-a5psw: add support for vlan and .port_bridge_flags
this series enables vlan support in Renesas RZN1 internal ethernet switch,
and is a follow up to the work initiated by Clement Leger a few months ago,
who handed me over the topic.
This new revision aims to iron the last few points raised by Vladimir to
ensure that the driver is in line with switch drivers expectations, and is
based on the lengthy discussion in [1] (thanks Vladimir for the valuable
explanations)
[1] https://lore.kernel.org/netdev/20230314163651.242259-1-clement.leger@bootlin.com/
----
V5:
- ensure that flooding can be enabled only if port belongs to a bridge
- enable learning in a5psw_port_stp_state_set() only if port has learning
enabled
- toggle vlan tagging on vlan filtering in
- removed reviewed-by on second patch since its modified
RESEND V4:
- Resent due to net-next being closed
V4:
- Fix missing CPU port bit in a5psw->bridged_ports
- Use unsigned int for vlan_res_id parameters
- Rename a5psw_get_vlan_res_entry() to a5psw_new_vlan_res_entry()
- In a5psw_port_vlan_add(), return -ENOSPC when no VLAN entry is found
- In a5psw_port_vlan_filtering(), compute "val" from "mask"
V3:
- Target net-next tree and correct version...
V2:
- Fixed a few formatting errors
- Add .port_bridge_flags implementation
====================
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for vlan operation (add, del, filtering) on the RZN1
driver. The a5psw switch supports up to 32 VLAN IDs with filtering,
tagged/untagged VLANs and PVID for each ports.
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running vlan test (bridge_vlan_aware/unaware.sh), there were some
failure due to the lack .port_bridge_flag function to disable port
flooding. Implement this operation for BR_LEARNING, BR_FLOOD,
BR_MCAST_FLOOD and BR_BCAST_FLOOD.
Since .port_bridge_flags affects the bits disabling learning for a port,
ensure that any other modification on the same register done by
a5psw_port_stp_state_set is in sync by using the port learning state to
enable/disable learning on the port.
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
.port_bridge_flags will be added and allows to modify the flood mask
independently for each port. Keeping the existing bridged_ports write
in a5psw_flooding_set_resolution() would potentially messed up this.
Use a read-modify-write to set that value and move bridged_ports
handling in bridge_port_join/leave.
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current implementation does not allow the user to enable both
hw-tc-offload and ntuple features on the interface. These checks
are added as TC flower offload and ntuple features internally configures
the same hardware resource MCAM. But TC HTB offload configures the
transmit scheduler which can be safely enabled on the interface with
ntuple feature.
This patch adds the same and ensures only TC flower offload and ntuple
features are mutually exclusive.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhengchao Shao says:
====================
bonding: do some cleanups in bond driver
Do some cleanups in bond driver.
---
v2: use IS_ERR instead of NULL check in patch 2/5, update commit
information in patch 3/5, remove inline modifier in patch 4/5
====================
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The free_percpu function also could check whether "rr_tx_counter"
parameter is NULL. Therefore, remove NULL check in bond_destructor.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bond_reset_slave_arr(), values are assigned and memory is released only
when the variables "usable" and "all" are not NULL. But even if the
"usable" and "all" variables are NULL, they can still work, because value
will be checked in kfree_rcu. Therefore, use bond_set_slave_arr() and set
the input parameters "usable_slaves" and "all_slaves" to NULL to simplify
the code in bond_reset_slave_arr(). And the same to bond_uninit().
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because debugfs_create_dir returns ERR_PTR, so bonding_debug_root will
never be NULL. Remove redundant NULL check for bonding_debug_root in
debugfs function. The later debugfs_create_dir/debugfs_remove_recursive
/debugfs_remove_recursive functions will check the dentry with IS_ERR().
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because debugfs_create_dir returns ERR_PTR, so IS_ERR should be used to
check whether the directory is successfully created.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some functions are only used in initialization and exit functions, so add
the __init/__net_init and __net_exit modifiers to these functions.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'type' is an enum, thus cast of pointer on 64-bit compile test with
W=1 causes:
mvmdio.c:272:9: error: cast to smaller integer type 'enum orion_mdio_bus_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
'enet_id' is an enum, thus cast of pointer on 64-bit compile test with
W=1 causes:
xgene_enet_main.c:2044:20: error: cast to smaller integer type 'enum xgene_enet_id' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
Extended performance counter stats in 'ethtool -S <interface>'
for MANA VF to include GDMA tx LSO packets and bytes count.
Tested-on: Ubuntu22
Testcases:
1. LISA testcase:
PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic
2. LISA testcase:
PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV
3. Validated the GDMA stat packets and byte counters
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement control plane mailbox versions for host and firmware.
Versions are published in info area of control mailbox bar4
memory structure.Firmware will publish minimum and maximum
supported versions.Control plane mailbox apis will check for
firmware version before sending any control commands to firmware.
Notifications from firmware will similarly be checked for host
version compatibility.
Signed-off-by: Sathesh Edara <sedara@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Accept TC offload classifier rule only if SPI field
can be extracted by HW.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If netdev_mc_count() is not zero and not IFF_ALLMULTI, filter
incoming multicast packets. The chip has a Multicast Address Hash Table
for allowed multicast addresses, so we fill it.
Implement .ndo_set_rx_mode and recalculate multicast hash table. Also
observe change of IFF_PROMISC and IFF_ALLMULTI netdev flags.
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When gso.hdr_len is zero and a packet is transmitted via write() or
writev(), all payload is treated as header which requires a contiguous
memory allocation. This allocation request is harder to satisfy, and may
even fail if there is enough fragmentation.
Note that sendmsg() code path limits the linear copy length, so this change
makes write()/writev() and sendmsg() paths more consistent.
Signed-off-by: Tahsin Erdogan <trdgn@amazon.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230809164753.2247594-1-trdgn@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross merge x86 fixes to fix clang linking errors:
ld.lld: error: ./arch/x86/kernel/vmlinux.lds:221: at least one side of the expression must be absolute
These will hopefully be downstream by the time we ship
the next batch of fixes.
* 'x86/bugs' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Move gds_ucode_mitigated() declaration to header
x86/speculation: Add cpu_show_gds() prototype
driver core: cpu: Make cpu_show_not_affected() static
x86/srso: Fix build breakage with the LLVM linker
Documentation/srso: Document IBPB aspect and fix formatting
driver core: cpu: Unify redundant silly stubs
Documentation/hw-vuln: Unify filename specification in index
Link: https://lore.kernel.org/all/CAHk-=wj_b+FGTnevQSBAtCWuhCk=0oQ_THvthBW2hzqpOTLFmg@mail.gmail.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andrew Lunn says:
====================
Support offload LED blinking to PHY.
Allow offloading of the LED trigger netdev to PHY drivers and
implement it for the Marvell PHY driver. Additionally, correct the
handling of when the initial state of the LED cannot be represented by
the trigger, and so an error is returned. As with ledtrig-timer,
disable offload when the trigger is deactivate, or replaced by another
trigger.
====================
Link: https://lore.kernel.org/r/20230808210436.838995-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linux LEDs can be requested to perform hardware accelerated blinking
to indicate link, RX, TX etc. Pass the rules for blinking to the PHY
driver, if it implements the ops needed to determine if a given
pattern can be offloaded, to offload it, and what the current offload
is. Additionally implement the op needed to get what device the LED is
for.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/20230808210436.838995-3-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the netdev trigger is activates, it tries to determine what
device the LED blinks for, and what the current blink mode is.
The documentation for hw_control_get() says:
* Return 0 on success, a negative error number on failing parsing the
* initial mode. Error from this function is NOT FATAL as the device
* may be in a not supported initial state by the attached LED trigger.
*/
For the Marvell PHY and the Armada 370-rd board, the initial LED blink
mode is not supported by the trigger, so it returns an error. This
resulted in not getting the device the LED is blinking for. As a
result, the device is unknown and offloaded is never performed.
Change to condition to always get the device if offloading is
supported, and reduce the scope of testing for an error from
hw_control_get() to skip setting trigger internal state if there is an
error.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/20230808210436.838995-2-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When using a fixed-link setup, certain devices like the SJA1105 require a
small pause in the TXC clock line to enable their internal tunable
delay line (TDL).
To satisfy this requirement, this patch temporarily disables the TX clock,
and restarts it after a required period. This provides the required
silent interval on the clock line for SJA1105 to complete the frequency
transition and enable the internal TDLs. This action occurs before the link
is built up, so it does not impact a normal device too. There is no
need to identify if the connected device is an SJA1105 alike or not during
the implementation.
So far we have only enabled this feature on the i.MX93 platform.
Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Reviewed-by: Frank Li <frank.li@nxp.com>
Link: https://lore.kernel.org/r/20230807160716.259072-3-shenwei.wang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Martin KaFai Lau says:
====================
pull-request: bpf-next 2023-08-09
We've added 19 non-merge commits during the last 6 day(s) which contain
a total of 25 files changed, 369 insertions(+), 141 deletions(-).
The main changes are:
1) Fix array-index-out-of-bounds access when detaching from an
already empty mprog entry from Daniel Borkmann.
2) Adjust bpf selftest because of a recent llvm change
related to the cpu-v4 ISA from Eduard Zingerman.
3) Add uprobe support for the bpf_get_func_ip helper from Jiri Olsa.
4) Fix a KASAN splat due to the kernel incorrectly accepted
an invalid program using the recent cpu-v4 instruction from
Yonghong Song.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
bpf: btf: Remove two unused function declarations
bpf: lru: Remove unused declaration bpf_lru_promote()
selftests/bpf: relax expected log messages to allow emitting BPF_ST
selftests/bpf: remove duplicated functions
bpf, docs: Fix small typo and define semantics of sign extension
selftests/bpf: Add bpf_get_func_ip test for uprobe inside function
selftests/bpf: Add bpf_get_func_ip tests for uprobe on function entry
bpf: Add support for bpf_get_func_ip helper for uprobe program
selftests/bpf: Add a movsx selftest for sign-extension of R10
bpf: Fix an incorrect verification success with movsx insn
bpf, docs: Formalize type notation and function semantics in ISA standard
bpf: change bpf_alu_sign_string and bpf_movsx_string to static
libbpf: Use local includes inside the library
bpf: fix bpf_dynptr_slice() to stop return an ERR_PTR.
bpf: fix inconsistent return types of bpf_xdp_copy_buf().
selftests/bpf: fix the incorrect verification of port numbers.
selftests/bpf: Add test for detachment on empty mprog entry
bpf: Fix mprog detachment for empty mprog entry
bpf: bpf_struct_ops: Remove unnecessary initial values of variables
====================
Link: https://lore.kernel.org/r/20230810055123.109578-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wireless and bpf.
Still trending up in size but the good news is that the "current"
regressions are resolved, AFAIK.
We're getting weirdly many fixes for Wake-on-LAN and suspend/resume
handling on embedded this week (most not merged yet), not sure why.
But those are all for older bugs.
Current release - regressions:
- tls: set MSG_SPLICE_PAGES consistently when handing encrypted data
over to TCP
Current release - new code bugs:
- eth: mlx5: correct IDs on VFs internal to the device (IPU)
Previous releases - regressions:
- phy: at803x: fix WoL support / reporting on AR8032
- bonding: fix incorrect deletion of ETH_P_8021AD protocol VID from
slaves, leading to BUG_ON()
- tun: prevent tun_build_skb() from exceeding the packet size limit
- wifi: rtw89: fix 8852AE disconnection caused by RX full flags
- eth/PCI: enetc: fix probing after 6fffbc7ae1 ("PCI: Honor
firmware's device disabled status"), keep PCI devices around even
if they are disabled / not going to be probed to be able to apply
quirks on them
- eth: prestera: fix handling IPv4 routes with nexthop IDs
Previous releases - always broken:
- netfilter: re-work garbage collection to avoid races between
user-facing API and timeouts
- tunnels: fix generating ipv4 PMTU error on non-linear skbs
- nexthop: fix infinite nexthop bucket dump when using maximum
nexthop ID
- wifi: nl80211: fix integer overflow in nl80211_parse_mbssid_elems()
Misc:
- unix: use consistent error code in SO_PEERPIDFD
- ipv6: adjust ndisc_is_useropt() to include PREFIX_INFO, in prep for
upcoming IETF RFC"
* tag 'net-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
net: hns3: fix strscpy causing content truncation issue
net: tls: set MSG_SPLICE_PAGES consistently
ibmvnic: Ensure login failure recovery is safe from other resets
ibmvnic: Do partial reset on login failure
ibmvnic: Handle DMA unmapping of login buffs in release functions
ibmvnic: Unmap DMA login rsp buffer on send login fail
ibmvnic: Enforce stronger sanity checks on login response
net: mana: Fix MANA VF unload when hardware is unresponsive
netfilter: nf_tables: remove busy mark and gc batch API
netfilter: nft_set_hash: mark set element as dead when deleting from packet path
netfilter: nf_tables: adapt set backend to use GC transaction API
netfilter: nf_tables: GC transaction API to avoid race with control plane
selftests/bpf: Add sockmap test for redirecting partial skb data
selftests/bpf: fix a CI failure caused by vsock sockmap test
bpf, sockmap: Fix bug that strp_done cannot be called
bpf, sockmap: Fix map type error in sock_map_del_link
xsk: fix refcount underflow in error path
ipv6: adjust ndisc_is_useropt() to also return true for PIO
selftests: forwarding: bridge_mdb: Make test more robust
selftests: forwarding: bridge_mdb_max: Fix failing test with old libnet
...
hns3_dbg_fill_content()/hclge_dbg_fill_content() is aim to integrate some
items to a string for content, and we add '\n' and '\0' in the last
two bytes of content.
strscpy() will add '\0' in the last byte of destination buffer(one of
items), it result in finishing content print ahead of schedule and some
dump content truncation.
One Error log shows as below:
cat mac_list/uc
UC MAC_LIST:
Expected:
UC MAC_LIST:
FUNC_ID MAC_ADDR STATE
pf 00:2b:19:05:03:00 ACTIVE
The destination buffer is length-bounded and not required to be
NUL-terminated, so just change strscpy() to memcpy() to fix it.
Fixes: 1cf3d5567f ("net: hns3: fix strncpy() not using dest-buf length as length issue")
Signed-off-by: Hao Chen <chenhao418@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://lore.kernel.org/r/20230809020902.1941471-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We used to change the flags for the last segment, because
non-last segments had the MSG_SENDPAGE_NOTLAST flag set.
That flag is no longer a thing so remove the setting.
Since flags most likely don't have MSG_SPLICE_PAGES set
this avoids passing parts of the sg as splice and parts
as non-splice. Before commit under Fixes we'd have called
tcp_sendpage() which would add the MSG_SPLICE_PAGES.
Why this leads to trouble remains unclear but Tariq
reports hitting the WARN_ON(!sendpage_ok()) due to
page refcount of 0.
Fixes: e117dcfd64 ("tls: Inline do_tcp_sendpages()")
Reported-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/all/4c49176f-147a-4283-f1b1-32aac7b4b996@gmail.com/
Tested-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20230808180917.1243540-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull dmaengine fixes from Vinod Koul:
- HAS_IOMEM fixes for fsl edma and intel idma
- return-value fix, interrupt vector setting and typo fix for xilinx
xdma
- email updates for codeaurora email domain move
- correct pause status for pl330 driver
- idxd clear flag on disable fix
- function documentation fix for owl dma
- potential un-allocated memory fix for mcf driver
* tag 'dmaengine-fix-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: xilinx: xdma: Fix typo
dmaengine: xilinx: xdma: Fix interrupt vector setting
dmaengine: owl-dma: Modify mismatched function name
dmaengine: idxd: Clear PRS disable flag when disabling IDXD device
dmaengine: pl330: Return DMA_PAUSED when transaction is paused
dmaengine: qcom_hidma: Update codeaurora email domain
dmaengine: mcf-edma: Fix a potential un-allocated memory access
dmaengine: xilinx: xdma: Fix Judgment of the return value
idmaengine: make FSL_EDMA and INTEL_IDMA64 depends on HAS_IOMEM
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The existing attempt to resolve races between control plane and GC work
is error prone, as reported by Bien Pham <phamnnb@sea.com>, some places
forgot to call nft_set_elem_mark_busy(), leading to double-deactivation
of elements.
This series contains the following patches:
1) Do not skip expired elements during walk otherwise elements might
never decrement the reference counter on data, leading to memleak.
2) Add a GC transaction API to replace the former attempt to deal with
races between control plane and GC. GC worker sets on NFT_SET_ELEM_DEAD_BIT
on elements and it creates a GC transaction to remove the expired
elements, GC transaction could abort in case of interference with
control plane and retried later (GC async). Set backends such as
rbtree and pipapo also perform GC from control plane (GC sync), in
such case, element deactivation and removal is safe because mutex
is held then collected elements are released via call_rcu().
3) Adapt existing set backends to use the GC transaction API.
4) Update rhash set backend to set on _DEAD bit to report deleted
elements from datapath for GC.
5) Remove old GC batch API and the NFT_SET_ELEM_BUSY_BIT.
* tag 'nf-23-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: remove busy mark and gc batch API
netfilter: nft_set_hash: mark set element as dead when deleting from packet path
netfilter: nf_tables: adapt set backend to use GC transaction API
netfilter: nf_tables: GC transaction API to avoid race with control plane
netfilter: nf_tables: don't skip expired elements during walk
====================
Link: https://lore.kernel.org/r/20230810070830.24064-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>