linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 18:25:26 -04:00

Author	SHA1	Message	Date
Jakub Kicinski	2eecd3a41e	eth: fbnic: fix reporting of alloc_failed qstats Rx processing under normal circumstances has 3 rings - 2 buffer rings (heads, payloads) and a completion ring. All the rings have a struct fbnic_ring. Make sure we expose alloc_failed counter from the buffer rings, previously only the alloc_failed from the completion ring was reported, even tho all ring types may increment this counter (buffer rings in __fbnic_fill_bdq()). This makes the pp_alloc_fail.py test pass, it expects the qstat to be incrementing as page pool injections happen. Reviewed-by: Simon Horman <horms@kernel.org> Fixes: `67dc4eb5fc` ("eth: fbnic: report software Rx queue stats") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251007232653.2099376-7-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-09 11:10:02 +02:00
Jakub Kicinski	858b78b24a	eth: fbnic: fix saving stats from XDP_TX rings on close When rings are freed - stats get added to the device level stat structs. Save the stats from the XDP_TX ring just as Tx stats. Previously they would be saved to Rx and Tx stats. So we'd not see XDP_TX packets as Rx during runtime but after an down/up cycle the packets would appear in stats. Correct the helper used by ethtool code which does a runtime config switch. Reviewed-by: Simon Horman <horms@kernel.org> Fixes: `5213ff0863` ("eth: fbnic: Collect packet statistics for XDP") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251007232653.2099376-4-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-09 11:10:02 +02:00
Jakub Kicinski	613e9e8dcb	eth: fbnic: fix accounting of XDP packets Make XDP-handled packets appear in the Rx stats. The driver has been counting XDP_TX packets on the Tx ring, but there wasn't much accounting on the Rx side (the Rx bytes appear to be incremented on XDP_TX but XDP_DROP / XDP_ABORT are only counted as Rx drops). Counting XDP_TX packets (not just bytes) in Rx stats looks like a simple bug of omission. The XDP_DROP handling appears to be intentional. Whether XDP_DROP packets should be counted in interface-level Rx stats is a bit unclear historically. When we were defining qstats, however, we clarified based on operational experience that in this context: name: rx-packets doc: \| Number of wire packets successfully received and passed to the stack. For drivers supporting XDP, XDP is considered the first layer of the stack, so packets consumed by XDP are still counted here. fbnic does not obey this requirement. Since XDP support has been added in current release cycle, instead of splitting interface and qstat handling - make them both follow the qstat definition. Another small tweak here is that we count bytes as received on the wire rather than post-XDP bytes (xdp_get_buff_len() vs skb->len). Reviewed-by: Simon Horman <horms@kernel.org> Fixes: `5213ff0863` ("eth: fbnic: Collect packet statistics for XDP") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251007232653.2099376-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-09 11:10:02 +02:00
Jakub Kicinski	7e617d57f2	eth: fbnic: fix missing programming of the default descriptor XDP_TX typically uses no offloads. To optimize XDP we added a "default descriptor" feature to the chip, which allows us to send XDP frames with just the buffer descriptors (DMA address + length). All the metadata descriptors are derived from the queue config. Commit under Fixes missed adding setting the defaults up when transplanting the code from the prototype driver. Importantly after reset the "request completion" bit is not set. Packets still get sent but there's no completion, so ring is not cleaned up. We can send one ring's worth of packets and then will start dropping all frames that got the XDP_TX action from the XDP prog. Reviewed-by: Simon Horman <horms@kernel.org> Fixes: `168deb7b31` ("eth: fbnic: Add support for XDP_TX action") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251007232653.2099376-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-09 11:10:01 +02:00
Mohsin Bashir	20a2e46f9e	eth: fbnic: Add support to read lane count We are reporting the lane count in the link settings but the flag is not set to indicate that the driver supports lanes. Set the flag to report lane count. ~]# ethtool eth0 \| grep Lanes Lanes: 2 Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250924184445.2293325-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-26 17:27:35 -07:00
Vadim Fedorenko	cc2f081299	ethtool: add FEC bins histogram report IEEE 802.3ck-2022 defines counters for FEC bins and 802.3df-2024 clarifies it a bit further. Implement reporting interface through as addition to FEC stats available in ethtool. Drivers can leave bin counter uninitialized if per-lane values are provided. In this case the core will recalculate summ for the bin. Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20250924124037.1508846-2-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-26 16:49:18 -07:00
Mohsin Bashir	bb6a22651b	eth: fbnic: Read module EEPROM Add support to read module EEPROM for fbnic. Towards this, add required support to issue a new command to the firmware and to receive the response to the corresponding command. Create a local copy of the data in the completion struct before writing to ethtool_module_eeprom to avoid writing to data in case it is freed. Given that EEPROM pages are small, the overhead of additional copy is negligible. Do not block API with explicit checks since API has appropriate checks in place for length, offset, and page. Explicitly check bank, page, offset, and length in fbnic_fw_parse_qsfp_read_resp() to match EEPROM read responses to the correct request. This is important because if the driver times out waiting for an EEPROM read response, a subsequent read request with different values is susceptible to receiving an erroneous response (i.e., the response to the previous request). Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250922231855.3717483-1-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-25 11:58:49 +02:00
Jakub Kicinski	e6afcd60c2	eth: fbnic: add OTP health reporter OTP memory ("fuses") are used for secure boot and anti-rollback protection. The OTP memory is ECC protected. Check for its health periodically to notice when the chip is starting to go bad. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-10-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	6da8344f92	eth: fbnic: report FW uptime in health diagnose FW crashes are detected based on uptime going back, expose the uptime via devlink health diagnose. $ devlink -j health diagnose pci/0000:01:00.0 reporter fw {"last_heartbeat":{"fw_uptime":{"sec":201,"msec":76}}} $ devlink -j health diagnose pci/0000:01:00.0 reporter fw last_heartbeat: fw_uptime: sec: 201 msec: 76 Reviewed-by: Lee Trager <lee@trager.us> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-9-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	005a54722e	eth: fbnic: add FW health reporter Add a health reporter to catch FW crashes. Dumping the reporter if FW has not crashed will create a snapshot of FW memory. Reviewed-by: Lee Trager <lee@trager.us> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-8-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	5df1d0a084	eth: fbnic: support FW communication for core dump To read FW core dump we need to issue two commands to FW: - first get the FW core dump info - second read the dump chunk by chunk Implement these two FW commands. Subsequent commits will use them to expose FW dump via devlink heath. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-7-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	a8896d14fc	eth: fbnic: support allocating FW completions with extra space Support allocating extra space after the FW completion. This makes it easy to pass extra variable size buffer space to FW response handlers without worrying about synchronization (completion itself is already refcounted). Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-6-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	6ae7da8e9e	eth: fbnic: reprogram TCAMs after FW crash FW may mess with the TCAM after it boots, to try to restore the traffic flow to the BMC (it may not be aware that the host is already up). Make sure that we reprogram the TCAMs after detecting a crash. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-5-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	504f8b7119	eth: fbnic: factor out clearing the action TCAM We'll want to wipe the driver TCAM state after FW crash, to force a re-programming. Factor out the clearing logic. Remove the micro- -optimization to skip clearing the BMC entry twice, it doesn't hurt. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-4-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	7fd1f7bac2	eth: fbnic: use fw uptime to detect fw crashes Currently we only detect FW crashes when it stops responding to heartbeat messages. FW has a watchdog which will reset it in case of crashes. Use FW uptime sent in the ownership and heartbeat messages to detect that the watchdog has fired (uptime went down). Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	e6c8ab0a11	eth: fbnic: make fbnic_fw_log_write() parameter const Make the log message parameter const, it's not modified and this lets us pass in strings which are const for the caller. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 11:37:23 +02:00
Jakub Kicinski	b127e355f1	eth: fbnic: support devmem Tx Support devmem Tx. We already use skb_frag_dma_map(), we just need to make sure we don't try to unmap the frags. Check if frag is unreadable and mark the ring entry. # ./tools/testing/selftests/drivers/net/hw/devmem.py TAP version 13 1..3 ok 1 devmem.check_rx ok 2 devmem.check_tx ok 3 devmem.check_tx_chunks # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Acked-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916145401.1464550-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 10:12:05 +02:00
Jakub Kicinski	0574c27cbe	eth: fbnic: support persistent NAPI config No shenanigans in this driver, AFAIU, pass the vector index to NAPI registration. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250905022254.2635707-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-09 16:03:27 +02:00
Jakub Kicinski	da43127a8e	eth: fbnic: support queue ops / zero-copy Rx Support queue ops. fbnic doesn't shut down the entire device just to restart a single queue. ./tools/testing/selftests/drivers/net/hw/iou-zcrx.py TAP version 13 1..3 ok 1 iou-zcrx.test_zcrx ok 2 iou-zcrx.test_zcrx_oneshot ok 3 iou-zcrx.test_zcrx_rss # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Acked-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-15-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	3812339b6c	eth: fbnic: don't pass NAPI into pp alloc Queue API may ask us to allocate page pools when the device is down, to validate that we ingested a memory provider binding. Don't require NAPI to be passed to fbnic_alloc_qt_page_pools(), to make calling fbnic_alloc_qt_page_pools() without NAPI possible. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-14-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	49c429ec6b	eth: fbnic: defer page pool recycling activation to queue start We need to be more careful about when direct page pool recycling is enabled in preparation for queue ops support. Don't set the NAPI pointer, call page_pool_enable_direct_recycling() from the function that activates the queue (once the config can no longer fail). Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-13-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	8a11010fdd	eth: fbnic: allocate unreadable page pool for the payloads Allow allocating a page pool with unreadable memory for the payload ring (sub1). We need to provide the queue ID so that the memory provider can match the PP. Use the appropriate page pool DMA sync helper. For unreadable mem the direction has to be FROM_DEVICE. The default is BIDIR for XDP, but obviously unreadable mem is not compatible with XDP in the first place, so that's fine. While at it remove the define for page pool flags. The rxq_idx is passed to fbnic_alloc_rx_qt_resources() explicitly to make it easy to allocate page pools without NAPI (see the patch after the next). Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-12-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	709da681f4	eth: fbnic: split fbnic_fill() Factor out handling a single nv from fbnic_fill() to make it reusable for queue ops. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-10-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	8a47d940cf	eth: fbnic: split fbnic_enable() Factor out handling a single nv from fbnic_enable() to make it reusable for queue ops. Use a __ prefix for the factored out code. The real fbnic_nv_enable() which will include fbnic_wrfl() will be added with the qops, to avoid unused function warnings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-9-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	be2be74af8	eth: fbnic: split fbnic_flush() Factor out handling a single nv from fbnic_flush() to make it reusable for queue ops. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-8-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	cbfc047429	eth: fbnic: split fbnic_disable() Factor out handling a single nv from fbnic_disable() to make it reusable for queue ops. Use a __ prefix for the factored out code. The real fbnic_nv_disable() which will include fbnic_wrfl() will be added with the qops, to avoid unused function warnings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-7-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	4ddb17c1a2	eth: fbnic: request ops lock We'll add queue ops soon so. queue ops will opt the driver into extra locking. Request this locking explicitly already to make future patches smaller and easier to review. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-6-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:17 +02:00
Jakub Kicinski	426e13db36	eth: fbnic: use netmem_ref where applicable Use netmem_ref instead of struct page pointer in prep for unreadable memory. fbnic has separate free buffer submission queues for headers and for data. Refactor the helper which returns page pointer for a submission buffer to take the high level queue container, create a separate handler for header and payload rings. This ties the "upcast" from netmem to system page to use of sub0 which we know has system pages. Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-5-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:16 +02:00
Jakub Kicinski	b6396b71d1	eth: fbnic: move page pool alloc to fbnic_alloc_rx_qt_resources() page pools are now at the ring level, move page pool alloc to fbnic_alloc_rx_qt_resources(), and freeing to fbnic_free_qt_resources(). This significantly simplifies fbnic_alloc_napi_vector() error handling, by removing a late failure point. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-4-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:16 +02:00
Jakub Kicinski	894d4a4ea6	eth: fbnic: move xdp_rxq_info_reg() to resource alloc Move rxq_info and mem model registration from fbnic_alloc_napi_vector() and fbnic_alloc_nv_resources() to fbnic_alloc_rx_qt_resources(). The rxq_info is now registered later in the process, but that should not cause any issues. rxq_info lives in the fbnic_q_triad (qt) struct so qt init is a more natural place. Encapsulating the logic in the qt functions will also allow simplifying the cleanup in the NAPI related alloc functions in the next commit. Rx does not have a dedicated fbnic_free_rx_qt_resources(), but we can use xdp_rxq_info_is_reg() to tell whether given rxq_info was in use (effectively - if it's a qt for an Rx queue). Having to pass nv into fbnic_alloc_rx_qt_resources() is not great in terms of layering, but that's temporary, pp will move soon.. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:16 +02:00
Jakub Kicinski	33478dca2b	eth: fbnic: move page pool pointer from NAPI to the ring struct In preparation for memory providers we need a closer association between queues and page pools. We used to have a page pool at the NAPI level to serve all associated queues but with MP the queues under a NAPI may no longer be created equal. The "ring" structure in fbnic is a descriptor ring. We have separate "rings" for payload and header pages ("to device"), as well as a ring for completions ("from device"). Technically we only need the page pool pointers in the "to device" rings, so adding the pointer to the ring struct is a bit wasteful. But it makes passing the structures around much easier. For now both "to device" rings store a pointer to the same page pool. Using more than one queue per NAPI is extremely rare so don't bother trying to share a single page pool between queues. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-04 10:19:16 +02:00
Jakub Kicinski	d23ad54de7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.17-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/intel/idpf/idpf_txrx.c `02614eee26` ("idpf: do not linearize big TSO packets") `6c4e684802` ("idpf: remove obsolete stashing code") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-29 11:48:01 -07:00
Alexander Duyck	cee8d21d80	fbnic: Push local unicast MAC addresses to FW to populate TCAMs The MACDA TCAM can only be accessed by one entity at a time and as such we cannot have simultaneous reads from the firmware to probe for changes from the host. As such we have to send a message indicating what the state of the MACDA is to the firmware when we updated it so that the firmware can sync up the TCAMs it owns to route BMC packets to the host. To support that we are adding a new message that is invoked when we write the MACDA that will notify the firmware of updates from the host and allow it to sync up the TCAM configuration to match the one on the host side. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/175623750782.2246365.9178255870985916357.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:51:07 +02:00
Alexander Duyck	04a230b27d	fbnic: Add logic to repopulate RPC TCAM if BMC enables channel The BMC itself can decide to abandon a link and move onto another link in the event of things such as a link flap. As a result the driver may load with the BMC not present, and then needs to update things to support the BMC being present while the link is up and the NIC is passing traffic. To support this we add support to the watchdog to reinitialize the RPC to support adding the BMC unicast, multicast, and multicast promiscuous filters while the link is up and the NIC owns the link. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/175623750101.2246365.8518307324797058580.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:51:07 +02:00
Alexander Duyck	284a67d59f	fbnic: Pass fbnic_dev instead of netdev to __fbnic_set/clear_rx_mode To make the __fbnic_set_rx_mode and __fbnic_clear_rx_mode calls usable by more points in the code we can make to that they expect a fbnic_dev pointer instead of a netdev pointer. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/175623749436.2246365.6068665520216196789.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:51:07 +02:00
Alexander Duyck	cf79bd4495	fbnic: Move promisc_sync out of netdev code and into RPC path In order for us to support the BMC possibly connecting, disconnecting, and then reconnecting we need to be able to support entities outside of just the NIC setting up promiscuous mode as the BMC can use a multicast promiscuous setup. To support that we should move the promisc_sync code out of the netdev and into the RPC section of the driver so that it is reachable from more paths. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/175623748769.2246365.2130394904175851458.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:51:07 +02:00
Alexander Duyck	6ede14a2c6	fbnic: Move phylink resume out of service_task and into open/close The fbnic driver was presenting with the following locking assert coming out of a PM resume: [ 42.208116][ T164] RTNL: assertion failed at drivers/net/phy/phylink.c (2611) [ 42.208492][ T164] WARNING: CPU: 1 PID: 164 at drivers/net/phy/phylink.c:2611 phylink_resume+0x190/0x1e0 [ 42.208872][ T164] Modules linked in: [ 42.209140][ T164] CPU: 1 UID: 0 PID: 164 Comm: bash Not tainted 6.17.0-rc2-virtme #134 PREEMPT(full) [ 42.209496][ T164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 42.209861][ T164] RIP: 0010:phylink_resume+0x190/0x1e0 [ 42.210057][ T164] Code: 83 e5 01 0f 85 b0 fe ff ff c6 05 1c cd 3e 02 01 90 ba 33 0a 00 00 48 c7 c6 20 3a 1d a5 48 c7 c7 e0 3e 1d a5 e8 21 b8 90 fe 90 <0f> 0b 90 90 e9 86 fe ff ff e8 42 ea 1f ff e9 e2 fe ff ff 48 89 ef [ 42.210708][ T164] RSP: 0018:ffffc90000affbd8 EFLAGS: 00010296 [ 42.210983][ T164] RAX: 0000000000000000 RBX: ffff8880078d8400 RCX: 0000000000000000 [ 42.211235][ T164] RDX: 0000000000000000 RSI: 1ffffffff4f10938 RDI: 0000000000000001 [ 42.211466][ T164] RBP: 0000000000000000 R08: ffffffffa2ae79ea R09: fffffbfff4b3eb84 [ 42.211707][ T164] R10: 0000000000000003 R11: 0000000000000000 R12: ffff888007ad8000 [ 42.211997][ T164] R13: 0000000000000002 R14: ffff888006a18800 R15: ffffffffa34c59e0 [ 42.212234][ T164] FS: 00007f0dc8e39740(0000) GS:ffff88808f51f000(0000) knlGS:0000000000000000 [ 42.212505][ T164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.212704][ T164] CR2: 00007f0dc8e9fe10 CR3: 000000000b56d003 CR4: 0000000000772ef0 [ 42.213227][ T164] PKRU: 55555554 [ 42.213366][ T164] Call Trace: [ 42.213483][ T164] <TASK> [ 42.213565][ T164] __fbnic_pm_attach.isra.0+0x8e/0xa0 [ 42.213725][ T164] pci_reset_function+0x116/0x1d0 [ 42.213895][ T164] reset_store+0xa0/0x100 [ 42.214025][ T164] ? pci_dev_reset_attr_is_visible+0x50/0x50 [ 42.214221][ T164] ? sysfs_file_kobj+0xc1/0x1e0 [ 42.214374][ T164] ? sysfs_kf_write+0x65/0x160 [ 42.214526][ T164] kernfs_fop_write_iter+0x2f8/0x4c0 [ 42.214677][ T164] ? kernfs_vma_page_mkwrite+0x1f0/0x1f0 [ 42.214836][ T164] new_sync_write+0x308/0x6f0 [ 42.214987][ T164] ? __lock_acquire+0x34c/0x740 [ 42.215135][ T164] ? new_sync_read+0x6f0/0x6f0 [ 42.215288][ T164] ? lock_acquire.part.0+0xbc/0x260 [ 42.215440][ T164] ? ksys_write+0xff/0x200 [ 42.215590][ T164] ? perf_trace_sched_switch+0x6d0/0x6d0 [ 42.215742][ T164] vfs_write+0x65e/0xbb0 [ 42.215876][ T164] ksys_write+0xff/0x200 [ 42.215994][ T164] ? __ia32_sys_read+0xc0/0xc0 [ 42.216141][ T164] ? do_user_addr_fault+0x269/0x9f0 [ 42.216292][ T164] ? rcu_is_watching+0x15/0xd0 [ 42.216442][ T164] do_syscall_64+0xbb/0x360 [ 42.216591][ T164] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 42.216784][ T164] RIP: 0033:0x7f0dc8ea9986 A bit of digging showed that we were invoking the phylink_resume as a part of the fbnic_up path when we were enabling the service task while not holding the RTNL lock. We should be enabling this sooner as a part of the ndo_open path and then just letting the service task come online later. This will help to enforce the correct locking and brings the phylink interface online at the same time as the network interface, instead of at a later time. I tested this on QEMU to verify this was working by putting the system to sleep using "echo mem > /sys/power/state" to put the system to sleep in the guest and then using the command "system_wakeup" in the QEMU monitor. Fixes: `69684376ee` ("eth: fbnic: Add link detection") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/175616257316.1963577.12238158800417771119.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:57:08 -07:00
Alexander Duyck	2ddaa562b4	fbnic: Fixup rtnl_lock and devl_lock handling related to mailbox code The exception handling path for the __fbnic_pm_resume function had a bug in that it was taking the devlink lock and then exiting to exception handling instead of waiting until after it released the lock to do so. In order to handle that I am swapping the placement of the unlock and the exception handling jump to label so that we don't trigger a deadlock by holding the lock longer than we need to. In addition this change applies the same ordering to the rtnl_lock/unlock calls in the same function as it should make the code easier to follow if it adheres to a consistent pattern. Fixes: `82534f446d` ("eth: fbnic: Add devlink dev flash support") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/175616256667.1963577.5543500806256052549.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:57:08 -07:00
Mohsin Bashir	e9faf4db5f	eth: fbnic: Add pause stats support Add support to read pause stats for fbnic. Unlike FEC and PCS stats, pause stats won't wrap, do not fetch them under the service task. Since, they are exclusively accessed via the ethtool API, don't include them in fbnic_get_hw_stats(). ]# ethtool -I -a eth0 Pause parameters for eth0: Autonegotiate: on RX: off TX: off Statistics: tx_pause_frames: 0 rx_pause_frames: 0 Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	33c493791b	eth: fbnic: Read PHY stats via the ethtool API Provide support to read PHY stats (FEC and PCS) via the ethtool API. ]# ethtool -I --show-fec eth0 FEC parameters for eth0: Supported/Configured FEC encodings: RS Active FEC encoding: RS Statistics: corrected_blocks: 0 uncorrectable_blocks: 0 ]# ethtool -S eth0 --groups eth-phy Standard stats for eth0: eth-phy-SymbolErrorDuringCarrier: 0 Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	df4c5d9a29	eth: fbnic: Fetch PHY stats from device Add support to fetch PHY stats consisting of PCS and FEC stats from the device. When reading the stats counters, the lo part is read first, which latches the hi part to ensure consistent reading of the stats counter. FEC and PCS stats can wrap depending on the access frequency. To prevent wrapping, fetch these stats periodically under the service task. Also to maintain consistency fetch these stats along with other 32b stats under __fbnic_get_hw_stats32(). Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	bcf54e5d7c	eth: fbnic: Reset MAC stats Reset the MAC stats as part of the hardware stats reset to ensure consistency. Currently, hardware stats are reset during device bring-up and upon experiencing PCI errors; however, MAC stats are being skipped during these resets. When fbnic_reset_hw_stats() is called upon recovering from PCI error, MAC stats are accessed outside the rtnl_lock. The only other access to MAC stats is via the ethtool API, which is protected by rtnl_lock. This can result in concurrent access to MAC stats and a potential race. Protect the fbnic_reset_hw_stats() call in __fbnic_pm_attach() with rtnl_lock to avoid this. Note that fbnic_reset_hw_mac_stats() is called outside the hardware stats lock which protects access to the fbnic_hw_stats. This is intentional because MAC stats are fetched from the device outside this lock and are exclusively read via the ethtool API. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	b1161b1863	eth: fbnic: Reset hw stats upon PCI error Upon experiencing a PCI error, fbnic reset the device to recover from the failure. Reset the hardware stats as part of the device reset to ensure accurate stats reporting. Note that the reset is not really resetting the aggregate value to 0, which may result in a spike for a system collecting deltas in stats. Rather, the reset re-latches the current value as previous, in case HW got reset. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	2ee5c8c0c2	eth: fbnic: Move hw_stats_lock out of fbnic_dev Move hw_stats_lock out of fbnic_dev to a more appropriate struct fbnic_hw_stats since the only use of this lock is to protect access to the hardware stats. While at it, enclose the lock and stats initialization in a single init call. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:18 -07:00
Mohsin Bashir	7fedb8f267	eth: fbnic: Report XDP stats via ethtool Add support to collect XDP stats via ethtool API. We record packets and bytes sent, and packets dropped on the XDP_TX path. ethtool -S eth0 \| grep xdp \| grep -v "0" xdp_tx_queue_13_packets: 2 xdp_tx_queue_13_bytes: 16126 Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250813221319.3367670-10-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-19 10:51:16 +02:00
Mohsin Bashir	5213ff0863	eth: fbnic: Collect packet statistics for XDP Add support for XDP statistics collection and reporting via rtnl_link and netdev_queue API. For XDP programs without frags support, fbnic requires MTU to be less than the HDS threshold. If an over-sized frame is received, the frame is dropped and recorded as rx_length_errors reported via ip stats to highlight that this is an error. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250813221319.3367670-9-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-19 10:51:16 +02:00
Mohsin Bashir	168deb7b31	eth: fbnic: Add support for XDP_TX action Add support for XDP_TX action and cleaning the associated work. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250813221319.3367670-8-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-19 10:51:16 +02:00
Mohsin Bashir	cf4facfb13	eth: fbnic: Add support for XDP queues Add support for allocating XDP_TX queues and configuring ring support. FBNIC has been designed with XDP support in mind. Each Tx queue has 2 submission queues and one completion queue, with the expectation that one of the submission queues will be used by the stack, and the other by XDP. XDP queues are populated by XDP_TX and start from index 128 in the TX queue array. The support for XDP_TX is added in the next patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250813221319.3367670-7-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-19 10:51:16 +02:00
Mohsin Bashir	1b0a3950db	eth: fbnic: Add XDP pass, drop, abort support Add basic support for attaching an XDP program to the device and support for PASS/DROP/ABORT actions. In fbnic, buffers are always mapped as DMA_BIDIRECTIONAL. The BPF program pointer can be read either on a per-packet basis or on a per-NAPI poll basis. Both approaches are functionally equivalent, in the current code. Stick to per-packet as it limits number of arguments we need to pass around. On the XDP hot path, check that packets with fragments are only allowed when multi-buffer support is enabled for the XDP program. Ideally, this check should not be necessary because ndo_bpf verifies that for XDP programs without multi-buff support, MTU is less than the hds_thresh. However, the MTU currently does not enforce the receive size which would require cleaning up the data path and bouncing the link. For practical reasons, prioritize the ability to enter and exit BPF mode with different MTU sizes without requiring a full reconfig. Testing: Hook a simple XDP program that passes all the packets destined for a specific port iperf3 -c 192.168.1.10 -P 5 -p 12345 Connecting to host 192.168.1.10, port 12345 [ 5] local 192.168.1.9 port 46702 connected to 192.168.1.10 port 12345 [ ID] Interval Transfer Bitrate Retr Cwnd - - - - - - - - - - - - - - - - - - - - - - - - - [SUM] 1.00-2.00 sec 3.86 GBytes 33.2 Gbits/sec 0 XDP_DROP: Hook an XDP program that drops packets destined for a specific port iperf3 -c 192.168.1.10 -P 5 -p 12345 ^C- - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec 0 sender [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec receiver iperf3: interrupt - the client has terminated XDP with HDS: - Validate XDP attachment failure when HDS is low ~] ethtool -G eth0 hds-thresh 512 ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp ~] Error: fbnic: MTU too high, or HDS threshold is too low for single buffer XDP. - Validate successful XDP attachment when HDS threshold is appropriate ~] ethtool -G eth0 hds-thresh 1536 ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp - Validate when the XDP program is attached, changing HDS thresh to a lower value fails ~] ethtool -G eth0 hds-thresh 512 ~] netlink error: fbnic: Use higher HDS threshold or multi-buf capable program - Validate HDS thresh does not matter when xdp frags support is available ~] ethtool -G eth0 hds-thresh 512 ~] sudo ip link set eth0 xdpdrv obj xdp_pass_mb_12345.o sec xdp.frags Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250813221319.3367670-6-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-19 10:51:16 +02:00
Mohsin Bashir	9064ab485f	eth: fbnic: Prefetch packet headers on Rx Issue a prefetch for the start of the buffer on Rx to try to avoid cache miss on packet headers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250813221319.3367670-5-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-19 10:51:16 +02:00

1 2 3 4

187 Commits