linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-09 04:21:03 -04:00

Author	SHA1	Message	Date
Cosmin Ratiu	107a034d5c	net/mlx5: qos: Store rate groups in a qos domain Groups are currently maintained as a list in their corresponding eswitch, protected by the esw state_lock. The upcoming cross-eswitch scheduling feature cannot work with this approach, as it would require acquiring multiple eswitch locks (in the correct order) in order to maintain group membership. This commit moves the rate groups into a new 'qos domain' struct and adds explicit qos init/cleanup steps to the eswitch init/cleanup. Upcoming patches will expand the qos domain struct and allow it to be shared between eswitches. For now, qos domains are private to each esw so there's only an extra indirection. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:12:00 +02:00
Cosmin Ratiu	43f9011a3d	net/mlx5: qos: Rename rate group 'list' as 'parent_entry' 'list' is not very descriptive, I prefer list membership to clearly specify which list the entry belongs to. This commit renames the list entry into the esw groups list as 'parent_entry' to make the code more readable. This is a no-op change. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:12:00 +02:00
Cosmin Ratiu	0c4cf09eca	net/mlx5: qos: Add an explicit 'dev' to vport trace calls vport qos trace calls used vport->dev implicitly as the device to which the command was sent (and thus the device logged in traces). But that will no longer be the case for cross-esw scheduling, where the commands have to be sent to the group esw device instead. This commit corrects that. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:12:00 +02:00
Cosmin Ratiu	b9cfe193eb	net/mlx5: qos: Store the eswitch in a mlx5_esw_rate_group The rate groups are about to be moved out of eswitches, so store a reference to the eswitch they belong to so things can still work later. This allows dropping the esw parameter from a couple of functions and simplifying some of the code. Use this opportunity to make sure that vport scheduling element commands are always sent to the group eswitch, because that will be relevant for cross-esw scheduling. For now though, the eswitches are not different. There is no functionality change here. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	e9fa32f110	net/mlx5: qos: Drop 'esw' param from vport qos functions The vport has a pointer to its own eswitch in vport->dev->priv.eswitch, so passing the same eswitch as a parameter to the various functions manipulating vport qos is superfluous at best and prone to errors at worst. More importantly, with the upcoming cross-esw scheduling changes, the eswitch that should receive the various scheduling element commands is NOT the same as the vport's eswitch, so the current code's assumptions will break. To avoid confusion and bugs, this commit drops the 'esw' parameter from all vport qos functions and uses the vport's own eswitch pointer instead. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	a87a561b80	net/mlx5: qos: Always create group0 All vports not explicitly members of a group with QoS enabled are part of the internal esw group0, except when the hw reports that groups aren't supported (log_esw_max_sched_depth == 0). This creates corner cases in the code, which has to make sure that this case is supported. Additionally, the groups are about to be moved out of eswitches, and group0 being NULL creates additional complications there. This patch makes sure to always create group0, even if max sched depth is 0. In that case, a software-only group0 is created referencing the root TSAR. Vports can point to this group when their QoS is enabled and they'll be attached to the root TSAR directly. This eliminates corner cases in the code by offering the guarantee that if qos is enabled, vport->qos.group is non-NULL. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	d3a3b0765e	net/mlx5: qos: Maintain rate group vport members in a list Previously, finding group members was done by iterating over all vports of an eswitch and comparing their group with the required one, but that approach will break down when a group can contain vports from multiple eswitches. Solve that by maintaining a list of vport members. Instead of iterating over esw vports, loop over the members list. Use this opportunity to provide two new functions to allocate and free a group, so that the number of state transitions is smaller. This will also be used in a future patch. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	8746eeb7f8	net/mlx5: qos: Refactor and document bw_share calculation The previous function (esw_qos_calculate_group_min_rate_divider) had two completely different modes of execution, depending on the 'group_level' parameter. Split it into two separate functions: - esw_qos_calculate_min_rate_divider - computes min across groups. - esw_qos_calculate_group_min_rate_divider - computes min in a group. Fold the divider calculation into the corresponding normalize functions to avoid having the caller compute the corresponding divider. Also rename the normalize functions to better indicate what level they're operating on. Finally, document everything so that this topic can more easily be understood by future maintainers. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	16efefde21	net/mlx5: qos: Consistently name vport vars as 'vport' The current mixture of 'vport' and 'evport' can be improved. There is no functional change. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	158205ca4b	net/mlx5: qos: Rename vport 'tsar' into 'sched_elem'. Vports do not use TSARs (Transmit Scheduling ARbiters), which are used for grouping multiple entities together. Use the correct name in variables and functions for clarity. Also move the scheduling context to a local variable in the esw_qos_sched_elem_config function instead of an empty parameter that needs to be provided by all callers. There is no functional change here. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Cosmin Ratiu	016f426a14	net/mlx5: qos: Flesh out element_attributes in mlx5_ifc.h This is used for multiple purposes, depending on the scheduling element created. There are a few helper struct defined a long time ago, but they are not easy to find in the file and they are about to get new members. This commit cleans up this area a bit by: - moving the helper structs closer to where they are relevant. - defining a helper union to include all of them to help discoverability. - making use of it everywhere element_attributes is used. - using a consistent 'attr' name. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 13:11:59 +02:00
Paolo Abeni	d9d28b6f6a	Merge branch 'eth-fbnic-add-timestamping-support' Vadim Fedorenko says: ==================== eth: fbnic: add timestamping support The series is to add timestamping support for Meta's NIC driver. Changelog: v3 -> v4: - use adjust_by_scaled_ppm() instead of open coding it - adjust cached value of high bits of timestamp to be sure it is older then incoming timestamps v2 -> v3: - rebase on top of net-next - add doc to describe retur value of fbnic_ts40_to_ns() v1 -> v2: - adjust comment about using u64 stats locking primitive - fix typo in the first patch - Cc Richard ==================== Link: https://patch.msgid.link/20241008181436.4120604-1-vadfed@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 12:52:29 +02:00
Vadim Fedorenko	96f358f75d	eth: fbnic: add ethtool timestamping statistics Add counters of packets with HW timestamps requests and lost timestamps with no associated skbs. Use ethtool interface to report these counters. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 12:52:12 +02:00
Vadim Fedorenko	ad3d9f8bc6	eth: fbnic: add TX packets timestamping support Add TX configuration to ethtool interface. Add processing of TX timestamp completions as well as configuration to request HW to create TX timestamp completion. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 12:52:11 +02:00
Vadim Fedorenko	6a2b3ede95	eth: fbnic: add RX packets timestamping support Add callbacks to support timestamping configuration via ethtool. Add processing of RX timestamps. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 12:52:11 +02:00
Vadim Fedorenko	ad8e66a4d9	eth: fbnic: add initial PHC support Create PHC device and provide callbacks needed for ptp_clock device. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 12:52:11 +02:00
Vadim Fedorenko	be65bfc957	eth: fbnic: add software TX timestamping support Add software TX timestamping support. RX software timestamping is implemented in the core and there is no need to provide special flag in the driver anymore. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 12:52:11 +02:00
Breno Leitao	9e542ff8b7	net: Remove likely from l3mdev_master_ifindex_by_index The likely() annotation in l3mdev_master_ifindex_by_index() has been found to be incorrect 100% of the time in real-world workloads (e.g., web servers). Annotated branches shows the following in these servers: correct incorrect % Function File Line 0 169053813 100 l3mdev_master_ifindex_by_index l3mdev.h 81 This is happening because l3mdev_master_ifindex_by_index() is called from __inet_check_established(), which calls l3mdev_master_ifindex_by_index() passing the socked bounded interface. l3mdev_master_ifindex_by_index(net, sk->sk_bound_dev_if); Since most sockets are not going to be bound to a network device, the likely() is giving the wrong assumption. Remove the likely() annotation to ensure more accurate branch prediction. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241008163205.3939629-1-leitao@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-10 11:57:34 +02:00
Jakub Kicinski	09cf85ef18	Merge branch 'ipv4-namespacify-ipv4-address-hash-table' Kuniyuki Iwashima says: ==================== ipv4: Namespacify IPv4 address hash table. This is a prep of per-net RTNL conversion for RTM_(NEW\|DEL\|SET)ADDR. Currently, each IPv4 address is linked to the global hash table, and this needs to be protected by another global lock or namespacified to support per-net RTNL. Adding a global lock will cause deadlock in the rtnetlink path and GC, rtnetlink check_lifetime \|- rtnl_net_lock(net) \|- acquire the global lock \|- acquire the global lock \|- check ifa's netns `- put ifa into hash table `- rtnl_net_lock(net) so we need to namespacify the hash table. The IPv6 one is already namespacified, let's follow that. v2: https://lore.kernel.org/netdev/20241004195958.64396-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20241001024837.96425-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20241008172906.1326-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 20:08:10 -07:00
Kuniyuki Iwashima	99ee348e6a	ipv4: Retire global IPv4 hash table inet_addr_lst. No one uses inet_addr_lst anymore, so let's remove it. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-5-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 20:08:08 -07:00
Kuniyuki Iwashima	1675f38521	ipv4: Namespacify IPv4 address GC. Each IPv4 address could have a lifetime, which is useful for DHCP, and GC is periodically executed as check_lifetime_work. check_lifetime() does the actual GC under RTNL. 1. Acquire RTNL 2. Iterate inet_addr_lst 3. Remove IPv4 address if expired 4. Release RTNL Namespacifying the GC is required for per-netns RTNL, but using the per-netns hash table will shorten the time on the hash bucket iteration under RTNL. Let's add per-netns GC work and use the per-netns hash table. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-4-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 20:08:08 -07:00
Kuniyuki Iwashima	49e6131942	ipv4: Use per-netns hash table in inet_lookup_ifaddr_rcu(). Now, all IPv4 addresses are put in the per-netns hash table. Let's use it in inet_lookup_ifaddr_rcu(). Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 20:08:08 -07:00
Kuniyuki Iwashima	87173021f1	ipv4: Link IPv4 address to per-netns hash table. As a prep for per-netns RTNL conversion, we want to namespacify the IPv4 address hash table and the GC work. Let's allocate the per-netns IPv4 address hash table to net->ipv4.inet_addr_lst and link IPv4 addresses into it. The actual users will be converted later. Note that the IPv6 address hash table is already namespacified. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 20:08:07 -07:00
Jakub Kicinski	22ee378eb6	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-10-08 (ice, iavf, igb, e1000e, e1000) This series contains updates to ice, iavf, igb, e1000e, and e1000 drivers. For ice: Wojciech adds support for ethtool reset. Paul adds support for hardware based VF mailbox limits for E830 devices. Jake adjusts to a common iterator in ice_vc_cfg_qs_msg() and moves storing of max_frame and rx_buf_len from VSI struct to the ring structure. Hongbo Li uses assign_bit() to replace an open-coded instance. Markus Elfring adjusts a couple of PTP error paths to use a common, shared exit point. Yue Haibing removes unused declarations. For iavf: Yue Haibing removes unused declarations. For igb: Yue Haibing removes unused declarations. For e1000e: Takamitsu Iwai removes unneccessary writel() calls. Joe Damato adds support for netdev-genl support to query IRQ, NAPI, and queue information. For e1000: Joe Damato adds support for netdev-genl support to query IRQ, NAPI, and queue information. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000: Link NAPI instances to queues and IRQs e1000e: Link NAPI instances to queues and IRQs e1000e: Remove duplicated writel() in e1000_configure_tx/rx() igb: Cleanup unused declarations iavf: Remove unused declarations ice: Cleanup unused declarations ice: Use common error handling code in two functions ice: Make use of assign_bit() API ice: store max_frame and rx_buf_len only in ice_rx_ring ice: consistently use q_idx in ice_vc_cfg_qs_msg() ice: add E830 HW VF mailbox message limit support ice: Implement ethtool reset support ==================== Link: https://patch.msgid.link/20241008233441.928802-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 20:04:43 -07:00
Alexander Zubkov	80c549cd1a	Fix misspelling of "accept" in net Several files have "accept" misspelled as "accpet*" in the comments. Fix all such occurrences. Signed-off-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008162756.22618-2-green@qrator.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:55:40 -07:00
Eric Dumazet	e4650d7ae4	net_sched: sch_sfq: handle bigger packets SFQ has an assumption on dealing with packets smaller than 64KB. Even before BIG TCP, TCA_STAB can provide arbitrary big values in qdisc_pkt_len(skb) It is time to switch (struct sfq_slot)->allot to a 32bit field. sizeof(struct sfq_slot) is now 64 bytes, giving better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241008111603.653140-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:50:31 -07:00
Minda Chen	0a316b16a6	net: stmmac: Add DW QoS Eth v4/v5 ip payload error statistics Add DW QoS Eth v4/v5 ip payload error statistics, and rename descriptor bit macro because v4/v5 descriptor IPCE bit claims ip checksum error or TCP/UDP/ICMP segment length error. Here is bit description from DW QoS Eth data book(Part 19.6.2.2) bit7 IPCE: IP Payload Error When this bit is programmed, it indicates either of the following: 1).The 16-bit IP payload checksum (that is, the TCP, UDP, or ICMP checksum) calculated by the MAC does not match the corresponding checksum field in the received segment. 2).The TCP, UDP, or ICMP segment length does not match the payload length value in the IP Header field. 3).The TCP, UDP, or ICMP segment length is less than minimum allowed segment length for TCP, UDP, or ICMP. Signed-off-by: Minda Chen <minda.chen@starfivetech.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Link: https://patch.msgid.link/20241008111443.81467-1-minda.chen@starfivetech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:48:58 -07:00
Tobias Klauser	3a1beabe11	ipv6: Remove redundant unlikely() IS_ERR_OR_NULL() already implies unlikely(). Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008085454.8087-1-tklauser@distanz.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:40:46 -07:00
Eric Dumazet	4daf4dc275	ipv6: switch inet6_acaddr_hash() to less predictable hash commit `2384d02520` ("net/ipv6: Add anycast addresses to a global hashtable") added inet6_acaddr_hash(), using ipv6_addr_hash() and net_hash_mix() to get hash spreading for typical users. However ipv6_addr_hash() is highly predictable and a malicious user could abuse a specific hash bucket. Switch to __ipv6_addr_jhash(). We could use a dedicated secret, or reuse net_hash_mix() as I did in this patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008121307.800040-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:33:57 -07:00
Eric Dumazet	4a0ec2aa07	ipv6: switch inet6_addr_hash() to less predictable hash In commit `3f27fb2321` ("ipv6: addrconf: add per netns perturbation in inet6_addr_hash()"), I added net_hash_mix() in inet6_addr_hash() to get better hash dispersion, at a time all netns were sharing the hash table. Since then, commit `21a216a8fc` ("ipv6/addrconf: allocate a per netns hash table") made the hash table per netns. We could remove the net_hash_mix() from inet6_addr_hash(), but there is still an issue with ipv6_addr_hash(). It is highly predictable and a malicious user can easily create thousands of IPv6 addresses all stored in the same hash bucket. Switch to __ipv6_addr_jhash(). We could use a dedicated secret, or reuse net_hash_mix() as I did in this patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008120101.734521-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:33:46 -07:00
Lorenzo Bianconi	2518b11963	net: airoha: Fix EGRESS_RATE_METER_EN_MASK definition Fix typo in EGRESS_RATE_METER_EN_MASK mask definition. This bus in not introducing any user visible problem since, even if we are setting EGRESS_RATE_METER_EN_MASK bit in REG_EGRESS_RATE_METER_CFG register, egress QoS metering is not supported yet since we are missing some other hw configurations (e.g token bucket rate, token bucket size). Introduced by commit `23020f0493` ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241009-airoha-fixes-v2-1-18af63ec19bf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:29:11 -07:00
Dr. David Alan Gilbert	3325964e99	net: liquidio: Remove unused cn23xx_dump_pf_initialized_regs cn23xx_dump_pf_initialized_regs() was added in 2016's commit `72c0091293` ("liquidio: CN23XX device init and sriov config") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241009003841.254853-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 19:25:34 -07:00
Jakub Kicinski	652c5017e2	Merge branch 'qca_spi-improvements-to-qca7000-sync' Stefan Wahren says: ==================== qca_spi: Improvements to QCA7000 sync This series contains patches which improve the QCA7000 sync behavior. ==================== Link: https://patch.msgid.link/20241007113312.38728-1-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 18:00:29 -07:00
Stefan Wahren	c81cdba640	qca_spi: Improve reset mechanism The commit `92717c2356` ("net: qca_spi: Avoid high load if QCA7000 is not available") fixed the high load in case the QCA7000 is not available but introduced sync delays for some corner cases like buffer errors. So add the reset requests to the atomics flags, which are polled by the SPI thread. As a result reset requests and sync state are now separated. This has the nice benefit to make the code easier to understand. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241007113312.38728-3-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 18:00:27 -07:00
Stefan Wahren	234b526896	qca_spi: Count unexpected WRBUF_SPC_AVA after reset After a reset of the QCA7000, the amount of available write buffer space should match QCASPI_HW_BUF_LEN. If this is not the case this error should be counted as such. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241007113312.38728-2-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 18:00:26 -07:00
xin.guo	d35bd24cea	tcp: remove unnecessary update for tp->write_seq in tcp_connect() Commit `783237e8da` ("net-tcp: Fast Open client - sending SYN-data") introduces tcp_connect_queue_skb() and it would overwrite tcp->write_seq, so it is no need to update tp->write_seq before invoking tcp_connect_queue_skb(). Signed-off-by: xin.guo <guoxin0309@gmail.com> Link: https://patch.msgid.link/1728289544-4611-1-git-send-email-guoxin0309@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:58:49 -07:00
Donald Hunter	54b771e6c6	doc: net: Fix .rst rendering of net_cachelines pages The doc pages under /networking/net_cachelines are unreadable because they lack .rst formatting for the tabular text. Add simple table markup and tidy up the table contents: - remove dashes that represent empty cells because they render as bullets and are not needed - replace 'struct_' with 'struct ' in the first column so that sphinx can render links for any structs that appear in the docs Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008165329.45647-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:34:49 -07:00
Jakub Kicinski	c786a2a8bc	Merge branch 'ipv4-convert-__fib_validate_source-and-its-callers-to-dscp_t' Guillaume Nault says: ==================== ipv4: Convert __fib_validate_source() and its callers to dscp_t. This patch series continues to prepare users of ->flowi4_tos to a future conversion of this field (__u8 to dscp_t). This time, we convert __fib_validate_source() and its call chain. The objective is to eventually make all users of ->flowi4_tos use a dscp_t value. Making ->flowi4_tos a dscp_t field will help avoiding regressions where ECN bits are erroneously interpreted as DSCP bits. ==================== Link: https://patch.msgid.link/cover.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	3768b40273	ipv4: Convert __fib_validate_source() to dscp_t. Pass a dscp_t variable to __fib_validate_source(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only fib_validate_source() actually calls __fib_validate_source(). Since it already has a dscp_t variable to pass as parameter, we only need to remove the inet_dscp_to_dsfield() conversion. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/8206b0a64a21a208ed94774e261a251c8d7bc251.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	d36236ab52	ipv4: Convert fib_validate_source() to dscp_t. Pass a dscp_t variable to fib_validate_source(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. All callers of fib_validate_source() already have a dscp_t variable to pass as parameter. We just need to remove the inet_dscp_to_dsfield() conversions. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/08612a4519bc5a3578bb493fbaad82437ebb73dc.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	d329764087	ipv4: Convert ip_mc_validate_source() to dscp_t. Pass a dscp_t variable to ip_mc_validate_source(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Callers of ip_mc_validate_source() to consider are: * ip_route_input_mc() which already has a dscp_t variable to pass as parameter. We just need to remove the inet_dscp_to_dsfield() conversion. * udp_v4_early_demux() which gets the DSCP directly from the IPv4 header and can simply use the ip4h_dscp() helper. Also, stop including net/inet_dscp.h in udp.c as we don't use any of its declarations anymore. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/c91b2cca04718b7ee6cf5b9c1d5b40507d65a8d4.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	1a7c292617	ipv4: Convert ip_route_input_mc() to dscp_t. Pass a dscp_t variable to ip_route_input_mc(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only ip_route_input_rcu() actually calls ip_route_input_mc(). Since it already has a dscp_t variable to pass as parameter, we only need to remove the inet_dscp_to_dsfield() conversion. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/0cc653ef59bbc0a28881f706d34896c61eba9e01.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	0936c67191	ipv4: Convert __mkroute_input() to dscp_t. Pass a dscp_t variable to __mkroute_input(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only ip_mkroute_input() actually calls __mkroute_input(). Since it already has a dscp_t variable to pass as parameter, we only need to remove the inet_dscp_to_dsfield() conversion. While there, reorganise the function parameters to fill up horizontal space. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/40853c720aee4d608e6b1b204982164c3b76697d.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	34f28ffd62	ipv4: Convert ip_mkroute_input() to dscp_t. Pass a dscp_t variable to ip_mkroute_input(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only ip_route_input_slow() actually calls ip_mkroute_input(). Since it already has a dscp_t variable to pass as parameter, we only need to remove the inet_dscp_to_dsfield() conversion. While there, reorganise the function parameters to fill up horizontal space. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/6aa71e28f9ff681cbd70847080e1ab6b526f94f1.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Guillaume Nault	2b78d30620	ipv4: Convert ip_route_use_hint() to dscp_t. Pass a dscp_t variable to ip_route_use_hint(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only ip_rcv_finish_core() actually calls ip_route_use_hint(). Use the ip4h_dscp() helper to get the DSCP from the IPv4 header. While there, modify the declaration of ip_route_use_hint() in include/net/route.h so that it matches the prototype of its implementation in net/ipv4/route.c. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/c40994fdf804db7a363d04fdee01bf48dddda676.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 17:31:40 -07:00
Shradha Gupta	6607c17c6c	net: mana: Enable debugfs files for MANA device Implement debugfs in MANA driver to be able to view RX,TX,EQ queue specific attributes and dump their gdma queues. These dumps can be used by other userspace utilities to improve debuggability and troubleshooting Following files are added in debugfs: /sys/kernel/debug/mana/ \|-------------- 1 \|--------------- EQs \| \|------- eq0 \| \| \|---head \| \| \|---tail \| \| \|---eq_dump \| \|------- eq1 \| . \| . \| \|--------------- adapter-MTU \|--------------- vport0 \|------- RX-0 \| \|---cq_budget \| \|---cq_dump \| \|---cq_head \| \|---cq_tail \| \|---rq_head \| \|---rq_nbuf \| \|---rq_tail \| \|---rxq_dump \|------- RX-1 . . \|------- TX-0 \| \|---cq_budget \| \|---cq_dump \| \|---cq_head \| \|---cq_tail \| \|---sq_head \| \|---sq_pend_skb_qlen \| \|---sq_tail \| \|---txq_dump \|------- TX-1 . . Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-10-09 13:42:04 +01:00
Heiner Kallweit	1ffcc8d413	r8169: add support for the temperature sensor being available from RTL8125B This adds support for the temperature sensor being available from RTL8125B. Register information was taken from r8125 vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-10-09 13:38:19 +01:00
David S. Miller	2a80d89256	Merge branch 'net-improve-multicast-group-join-performance' Jonas Rebmann says: ==================== improve multicast join group performance This series seeks to improve performance on updating igmp group memberships such as with IP_ADD_MEMBERSHIP or MCAST_JOIN_SOURCE_GROUP. Our use case was to add 2000 multicast memberships on a TQMLS1046A which took about 3.6 seconds for the membership additions alone. Our userspace reproducer tool was instrumented to log runtimes of the individual setsockopt invocations which clearly indicated quadratic complexity of setting up the membership with regard to the total number of multicast groups to be joined. We used perf to locate the hotspots and subsequently optimized the most costly sections of code. This series includes a patch to Linux igmp handling as well as a patch to the DPAA/Freescale driver. With both patches applied, our memberships can be set up in only about 87 miliseconds, which corresponds to a speedup of around 40. While we have acheived practically linear run-time complexity on the kernel side, a small quadratic factor remains in parts of the freescale driver code which we haven't yet optimized. We have by now payed little attention to the optimization potential in dropping group memberships, yet the dpaa patch applies to joining and leaving groups alike. Overall, this patch series brings great improvements in use cases involving large numbers of multicast groups, particularly when using the fsl_dpa driver, without noteworthy drawbacks in other scenarios. ==================== Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-10-09 12:50:11 +01:00
Jonas Rebmann	298f70b371	net: dpaa: use __dev_mc_sync in dpaa_set_rx_mode() The original driver first unregisters then re-registers all multicast addresses in the struct net_device_ops::ndo_set_rx_mode() callback. As the networking stack calls ndo_set_rx_mode() if a single multicast address change occurs, a significant amount of time may be used to first unregister and then re-register unchanged multicast addresses. This leads to performance issues when tracking large numbers of multicast addresses. Replace the unregister and register loop and the hand crafted mc_addr_list list handling with __dev_mc_sync(), to only update entries which have changed. On profiling with an fsl_dpa NIC, this patch presented a speedup of around 40 when successively setting up 2000 multicast groups using setsockopt(), without drawbacks on smaller numbers of multicast groups. Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Reviewed-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-10-09 12:50:11 +01:00
Jonas Rebmann	69a3272d78	net: ipv4: igmp: optimize ____ip_mc_inc_group() using mc_hash The runtime cost of joining a single multicast group in the current implementation of ____ip_mc_inc_group grows linearly with the number of existing memberships. This is caused by the linear search for an existing group record in the multicast address list. This linear complexity results in quadratic complexity when successively adding memberships, which becomes a performance bottleneck when setting up large numbers of multicast memberships. If available, use the existing multicast hash map mc_hash to quickly search for an existing group membership record. This leads to near-constant complexity on the addition of a new multicast record, significantly improving performance for workloads involving many multicast memberships. On profiling with a loopback device, this patch presented a speedup of around 6 when successively setting up 2000 multicast groups using setsockopt without measurable drawbacks on smaller numbers of multicast groups. Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-10-09 12:50:11 +01:00

1 2 3 4 5 ...

1309142 Commits