From 885c5e57924dc040b23d0ad0d8388f0e35772159 Mon Sep 17 00:00:00 2001 From: Grzegorz Nitka Date: Thu, 16 Apr 2026 17:53:25 -0700 Subject: [PATCH 01/11] ice: fix 'adjust' timer programming for E830 devices MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix incorrect 'adjust the timer' programming sequence for E830 devices series. Only shadow registers GLTSYN_SHADJ were programmed in the current implementation. According to the specification [1], write to command GLTSYN_CMD register is also required with CMD field set to "Adjust the Time" value, for the timer adjustment to take the effect. The flow was broken for the adjustment less than S32_MAX/MIN range (around +/- 2 seconds). For bigger adjustment, non-atomic programming flow is used, involving set timer programming. Non-atomic flow is implemented correctly. Testing hints: Run command: phc_ctl /dev/ptpX get adj 2 get Expected result: Returned timestamps differ at least by 2 seconds [1] IntelĀ® Ethernet Controller E830 Datasheet rev 1.3, chapter 9.7.5.4 https://cdrdv2.intel.com/v1/dl/getContent/787353?explicitVersion=true Fixes: f00307522786 ("ice: Implement PTP support for E830 devices") Reviewed-by: Aleksandr Loktionov Signed-off-by: Grzegorz Nitka Reviewed-by: Simon Horman Tested-by: Rinitha S Reviewed-by: Jacob Keller Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-1-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index 61c0a0d93ea8..5a5c511ccbb6 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -5381,8 +5381,8 @@ int ice_ptp_write_incval_locked(struct ice_hw *hw, u64 incval) */ int ice_ptp_adj_clock(struct ice_hw *hw, s32 adj) { + int err = 0; u8 tmr_idx; - int err; tmr_idx = hw->func_caps.ts_func_info.tmr_index_owned; @@ -5399,8 +5399,8 @@ int ice_ptp_adj_clock(struct ice_hw *hw, s32 adj) err = ice_ptp_prep_phy_adj_e810(hw, adj); break; case ICE_MAC_E830: - /* E830 sync PHYs automatically after setting GLTSYN_SHADJ */ - return 0; + /* E830 sync PHYs automatically after setting cmd register */ + break; case ICE_MAC_GENERIC: err = ice_ptp_prep_phy_adj_e82x(hw, adj); break; From 05567e4052732d70c7ff9655217b3d14d25f639a Mon Sep 17 00:00:00 2001 From: Grzegorz Nitka Date: Thu, 16 Apr 2026 17:53:26 -0700 Subject: [PATCH 02/11] ice: update PCS latency settings for E825 10G/25Gb modes Update MAC Rx/Tx offset registers settings (PHY_MAC_[RX|TX]_OFFSET registers) with the data obtained with the latest research. It applies to PCS latency settings for the following speeds/modes: * 10Gb NO-FEC - TX latency changed from 71.25 ns to 73 ns - RX latency changed from -25.6 ns to -28 ns * 25Gb NO-FEC - TX latency changed from 28.17 ns to 33 ns - RX latency changed from -12.45 ns to -12 ns * 25Gb RS-FEC - TX latency changed from 64.5 ns to 69 ns - RX latency changed from -3.6 ns to -3 ns The original data came from simulation and pre-production hardware. The new data measures the actual delays and as such is more accurate. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Co-developed-by: Zoltan Fodor Signed-off-by: Zoltan Fodor Reviewed-by: Aleksandr Loktionov Reviewed-by: Jacob Keller Signed-off-by: Grzegorz Nitka Tested-by: Sunitha Mekala Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-2-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_ptp_consts.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h index 19dddd9b53dd..4d298c27bfb2 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h @@ -78,14 +78,14 @@ struct ice_eth56g_mac_reg_cfg eth56g_mac_cfg[NUM_ICE_ETH56G_LNK_SPD] = { .blktime = 0x666, /* 3.2 */ .tx_offset = { .serdes = 0x234c, /* 17.6484848 */ - .no_fec = 0x8e80, /* 71.25 */ + .no_fec = 0x93d9, /* 73 */ .fc = 0xb4a4, /* 90.32 */ .sfd = 0x4a4, /* 2.32 */ .onestep = 0x4ccd /* 38.4 */ }, .rx_offset = { .serdes = 0xffffeb27, /* -10.42424 */ - .no_fec = 0xffffcccd, /* -25.6 */ + .no_fec = 0xffffc7b6, /* -28 */ .fc = 0xfffc557b, /* -469.26 */ .sfd = 0x4a4, /* 2.32 */ .bs_ds = 0x32 /* 0.0969697 */ @@ -118,17 +118,17 @@ struct ice_eth56g_mac_reg_cfg eth56g_mac_cfg[NUM_ICE_ETH56G_LNK_SPD] = { .mktime = 0x147b, /* 10.24, only if RS-FEC enabled */ .tx_offset = { .serdes = 0xe1e, /* 7.0593939 */ - .no_fec = 0x3857, /* 28.17 */ + .no_fec = 0x4266, /* 33 */ .fc = 0x48c3, /* 36.38 */ - .rs = 0x8100, /* 64.5 */ + .rs = 0x8a00, /* 69 */ .sfd = 0x1dc, /* 0.93 */ .onestep = 0x1eb8 /* 15.36 */ }, .rx_offset = { .serdes = 0xfffff7a9, /* -4.1697 */ - .no_fec = 0xffffe71a, /* -12.45 */ + .no_fec = 0xffffe700, /* -12 */ .fc = 0xfffe894d, /* -187.35 */ - .rs = 0xfffff8cd, /* -3.6 */ + .rs = 0xfffff8cc, /* -3 */ .sfd = 0x1dc, /* 0.93 */ .bs_ds = 0x14 /* 0.0387879, RS-FEC 0 */ } From 9aab1c3d7299285e2569cbc0ed5892d631a241b2 Mon Sep 17 00:00:00 2001 From: Guangshuo Li Date: Thu, 16 Apr 2026 17:53:27 -0700 Subject: [PATCH 03/11] ice: fix double free in ice_sf_eth_activate() error path When auxiliary_device_add() fails, ice_sf_eth_activate() jumps to aux_dev_uninit and calls auxiliary_device_uninit(&sf_dev->adev). The device release callback ice_sf_dev_release() frees sf_dev, but the current error path falls through to sf_dev_free and calls kfree(sf_dev) again, causing a double free. Keep kfree(sf_dev) for the auxiliary_device_init() failure path, but avoid falling through to sf_dev_free after auxiliary_device_uninit(). Fixes: 13acc5c4cdbe ("ice: subfunction activation and base devlink ops") Cc: stable@vger.kernel.org Reviewed-by: Aleksandr Loktionov Signed-off-by: Guangshuo Li Reviewed-by: Simon Horman Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-3-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_sf_eth.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_sf_eth.c b/drivers/net/ethernet/intel/ice/ice_sf_eth.c index 2cf04bc6edce..a730aa368c92 100644 --- a/drivers/net/ethernet/intel/ice/ice_sf_eth.c +++ b/drivers/net/ethernet/intel/ice/ice_sf_eth.c @@ -305,6 +305,8 @@ ice_sf_eth_activate(struct ice_dynamic_port *dyn_port, aux_dev_uninit: auxiliary_device_uninit(&sf_dev->adev); + return err; + sf_dev_free: kfree(sf_dev); xa_erase: From 1a303baa715e6b78d6a406aaf335f87ff35acfcd Mon Sep 17 00:00:00 2001 From: Michal Schmidt Date: Thu, 16 Apr 2026 17:53:28 -0700 Subject: [PATCH 04/11] ice: fix double-free of tx_buf skb If ice_tso() or ice_tx_csum() fail, the error path in ice_xmit_frame_ring() frees the skb, but the 'first' tx_buf still points to it and is marked as valid (ICE_TX_BUF_SKB). 'next_to_use' remains unchanged, so the potential problem will likely fix itself when the next packet is transmitted and the tx_buf gets overwritten. But if there is no next packet and the interface is brought down instead, ice_clean_tx_ring() -> ice_unmap_and_free_tx_buf() will find the tx_buf and free the skb for the second time. The fix is to reset the tx_buf type to ICE_TX_BUF_EMPTY in the error path, so that ice_unmap_and_free_tx_buf(). Move the initialization of 'first' up, to ensure it's already valid in case we hit the linearization error path. The bug was spotted by AI while I had it looking for something else. It also proposed an initial version of the patch. I reproduced the bug and tested the fix by adding code to inject failures, on a build with KASAN. I looked for similar bugs in related Intel drivers and did not find any. Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads") Assisted-by: Claude:claude-4.6-opus-high Cursor Signed-off-by: Michal Schmidt Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-4-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_txrx.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index a2cd4cf37734..7be9c062949b 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -2158,6 +2158,9 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring) ice_trace(xmit_frame_ring, tx_ring, skb); + /* record the location of the first descriptor for this packet */ + first = &tx_ring->tx_buf[tx_ring->next_to_use]; + count = ice_xmit_desc_count(skb); if (ice_chk_linearize(skb, count)) { if (__skb_linearize(skb)) @@ -2183,8 +2186,6 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring) offload.tx_ring = tx_ring; - /* record the location of the first descriptor for this packet */ - first = &tx_ring->tx_buf[tx_ring->next_to_use]; first->skb = skb; first->type = ICE_TX_BUF_SKB; first->bytecount = max_t(unsigned int, skb->len, ETH_ZLEN); @@ -2249,6 +2250,7 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring) out_drop: ice_trace(xmit_frame_ring_drop, tx_ring, skb); dev_kfree_skb_any(skb); + first->type = ICE_TX_BUF_EMPTY; return NETDEV_TX_OK; } From 55e74f9ea7fea3d3da1cb6d5cacdaf8cf0fe3516 Mon Sep 17 00:00:00 2001 From: Paul Greenwalt Date: Thu, 16 Apr 2026 17:53:29 -0700 Subject: [PATCH 05/11] ice: fix PHY config on media change with link-down-on-close Commit 1a3571b5938c ("ice: restore PHY settings on media insertion") introduced separate flows for setting PHY configuration on media present: ice_configure_phy() when link-down-on-close is disabled, and ice_force_phys_link_state() when enabled. The latter incorrectly uses the previous configuration even after module change, causing link issues such as wrong speed or no link. Unify PHY configuration into a single ice_phy_cfg() function with a link_en parameter, ensuring PHY capabilities are always fetched fresh from hardware. Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion") Reviewed-by: Przemek Kitszel Signed-off-by: Paul Greenwalt Reviewed-by: Aleksandr Loktionov Tested-by: Sunitha Mekala Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-5-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_main.c | 121 +++++----------------- 1 file changed, 27 insertions(+), 94 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 3c36e3641b9e..ce3a0afe302d 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -1922,82 +1922,6 @@ static void ice_handle_mdd_event(struct ice_pf *pf) ice_print_vfs_mdd_events(pf); } -/** - * ice_force_phys_link_state - Force the physical link state - * @vsi: VSI to force the physical link state to up/down - * @link_up: true/false indicates to set the physical link to up/down - * - * Force the physical link state by getting the current PHY capabilities from - * hardware and setting the PHY config based on the determined capabilities. If - * link changes a link event will be triggered because both the Enable Automatic - * Link Update and LESM Enable bits are set when setting the PHY capabilities. - * - * Returns 0 on success, negative on failure - */ -static int ice_force_phys_link_state(struct ice_vsi *vsi, bool link_up) -{ - struct ice_aqc_get_phy_caps_data *pcaps; - struct ice_aqc_set_phy_cfg_data *cfg; - struct ice_port_info *pi; - struct device *dev; - int retcode; - - if (!vsi || !vsi->port_info || !vsi->back) - return -EINVAL; - if (vsi->type != ICE_VSI_PF) - return 0; - - dev = ice_pf_to_dev(vsi->back); - - pi = vsi->port_info; - - pcaps = kzalloc_obj(*pcaps); - if (!pcaps) - return -ENOMEM; - - retcode = ice_aq_get_phy_caps(pi, false, ICE_AQC_REPORT_ACTIVE_CFG, pcaps, - NULL); - if (retcode) { - dev_err(dev, "Failed to get phy capabilities, VSI %d error %d\n", - vsi->vsi_num, retcode); - retcode = -EIO; - goto out; - } - - /* No change in link */ - if (link_up == !!(pcaps->caps & ICE_AQC_PHY_EN_LINK) && - link_up == !!(pi->phy.link_info.link_info & ICE_AQ_LINK_UP)) - goto out; - - /* Use the current user PHY configuration. The current user PHY - * configuration is initialized during probe from PHY capabilities - * software mode, and updated on set PHY configuration. - */ - cfg = kmemdup(&pi->phy.curr_user_phy_cfg, sizeof(*cfg), GFP_KERNEL); - if (!cfg) { - retcode = -ENOMEM; - goto out; - } - - cfg->caps |= ICE_AQ_PHY_ENA_AUTO_LINK_UPDT; - if (link_up) - cfg->caps |= ICE_AQ_PHY_ENA_LINK; - else - cfg->caps &= ~ICE_AQ_PHY_ENA_LINK; - - retcode = ice_aq_set_phy_cfg(&vsi->back->hw, pi, cfg, NULL); - if (retcode) { - dev_err(dev, "Failed to set phy config, VSI %d error %d\n", - vsi->vsi_num, retcode); - retcode = -EIO; - } - - kfree(cfg); -out: - kfree(pcaps); - return retcode; -} - /** * ice_init_nvm_phy_type - Initialize the NVM PHY type * @pi: port info structure @@ -2066,7 +1990,7 @@ static void ice_init_link_dflt_override(struct ice_port_info *pi) * first time media is available. The ICE_LINK_DEFAULT_OVERRIDE_PENDING state * is used to indicate that the user PHY cfg default override is initialized * and the PHY has not been configured with the default override settings. The - * state is set here, and cleared in ice_configure_phy the first time the PHY is + * state is set here, and cleared in ice_phy_cfg the first time the PHY is * configured. * * This function should be called only if the FW doesn't support default @@ -2172,14 +2096,18 @@ static int ice_init_phy_user_cfg(struct ice_port_info *pi) } /** - * ice_configure_phy - configure PHY + * ice_phy_cfg - configure PHY * @vsi: VSI of PHY + * @link_en: true/false indicates to set link to enable/disable * * Set the PHY configuration. If the current PHY configuration is the same as - * the curr_user_phy_cfg, then do nothing to avoid link flap. Otherwise - * configure the based get PHY capabilities for topology with media. + * the curr_user_phy_cfg and link_en hasn't changed, then do nothing to avoid + * link flap. Otherwise configure the PHY based get PHY capabilities for + * topology with media and link_en. + * + * Return: 0 on success, negative on failure */ -static int ice_configure_phy(struct ice_vsi *vsi) +static int ice_phy_cfg(struct ice_vsi *vsi, bool link_en) { struct device *dev = ice_pf_to_dev(vsi->back); struct ice_port_info *pi = vsi->port_info; @@ -2199,9 +2127,6 @@ static int ice_configure_phy(struct ice_vsi *vsi) phy->link_info.topo_media_conflict == ICE_AQ_LINK_TOPO_UNSUPP_MEDIA) return -EPERM; - if (test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, pf->flags)) - return ice_force_phys_link_state(vsi, true); - pcaps = kzalloc_obj(*pcaps); if (!pcaps) return -ENOMEM; @@ -2215,10 +2140,8 @@ static int ice_configure_phy(struct ice_vsi *vsi) goto done; } - /* If PHY enable link is configured and configuration has not changed, - * there's nothing to do - */ - if (pcaps->caps & ICE_AQC_PHY_EN_LINK && + /* Configuration has not changed. There's nothing to do. */ + if (link_en == !!(pcaps->caps & ICE_AQC_PHY_EN_LINK) && ice_phy_caps_equals_cfg(pcaps, &phy->curr_user_phy_cfg)) goto done; @@ -2282,8 +2205,12 @@ static int ice_configure_phy(struct ice_vsi *vsi) */ ice_cfg_phy_fc(pi, cfg, phy->curr_user_fc_req); - /* Enable link and link update */ - cfg->caps |= ICE_AQ_PHY_ENA_AUTO_LINK_UPDT | ICE_AQ_PHY_ENA_LINK; + /* Enable/Disable link and link update */ + cfg->caps |= ICE_AQ_PHY_ENA_AUTO_LINK_UPDT; + if (link_en) + cfg->caps |= ICE_AQ_PHY_ENA_LINK; + else + cfg->caps &= ~ICE_AQ_PHY_ENA_LINK; err = ice_aq_set_phy_cfg(&pf->hw, pi, cfg, NULL); if (err) @@ -2336,7 +2263,7 @@ static void ice_check_media_subtask(struct ice_pf *pf) test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, vsi->back->flags)) return; - err = ice_configure_phy(vsi); + err = ice_phy_cfg(vsi, true); if (!err) clear_bit(ICE_FLAG_NO_MEDIA, pf->flags); @@ -4892,9 +4819,15 @@ static int ice_init_link(struct ice_pf *pf) if (!test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, pf->flags)) { struct ice_vsi *vsi = ice_get_main_vsi(pf); + struct ice_link_default_override_tlv *ldo; + bool link_en; + + ldo = &pf->link_dflt_override; + link_en = !(ldo->options & + ICE_LINK_OVERRIDE_AUTO_LINK_DIS); if (vsi) - ice_configure_phy(vsi); + ice_phy_cfg(vsi, link_en); } } else { set_bit(ICE_FLAG_NO_MEDIA, pf->flags); @@ -9707,7 +9640,7 @@ int ice_open_internal(struct net_device *netdev) } } - err = ice_configure_phy(vsi); + err = ice_phy_cfg(vsi, true); if (err) { netdev_err(netdev, "Failed to set physical link up, error %d\n", err); @@ -9748,7 +9681,7 @@ int ice_stop(struct net_device *netdev) } if (test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, vsi->back->flags)) { - int link_err = ice_force_phys_link_state(vsi, false); + int link_err = ice_phy_cfg(vsi, false); if (link_err) { if (link_err == -ENOMEDIUM) From 4a3a940059e98539de293a6e36e464094c2e875b Mon Sep 17 00:00:00 2001 From: Paul Greenwalt Date: Thu, 16 Apr 2026 17:53:30 -0700 Subject: [PATCH 06/11] ice: fix ICE_AQ_LINK_SPEED_M for 200G When setting PHY configuration during driver initialization, 200G link speed is not being advertised even when the PHY is capable. This is because the get PHY capabilities link speed response is being masked by ICE_AQ_LINK_SPEED_M, which does not include the 200G link speed bit. ICE_AQ_LINK_SPEED_200GB is defined as BIT(11), but the mask 0x7FF only covers bits 0-10. Fix ICE_AQ_LINK_SPEED_M to use GENMASK(11, 0) so that it covers all defined link speed bits including 200G. Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use") Signed-off-by: Paul Greenwalt Signed-off-by: Aleksandr Loktionov Reviewed-by: Simon Horman Tested-by: Sunitha Mekala Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-6-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 859e9c66f3e7..3cbb1b0582e3 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1252,7 +1252,7 @@ struct ice_aqc_get_link_status_data { #define ICE_AQ_LINK_PWR_QSFP_CLASS_3 2 #define ICE_AQ_LINK_PWR_QSFP_CLASS_4 3 __le16 link_speed; -#define ICE_AQ_LINK_SPEED_M 0x7FF +#define ICE_AQ_LINK_SPEED_M GENMASK(11, 0) #define ICE_AQ_LINK_SPEED_10MB BIT(0) #define ICE_AQ_LINK_SPEED_100MB BIT(1) #define ICE_AQ_LINK_SPEED_1000MB BIT(2) From 7c72ec18c2a4111204c2e915f8e4f6d849ce9398 Mon Sep 17 00:00:00 2001 From: Keita Morisaki Date: Thu, 16 Apr 2026 17:53:31 -0700 Subject: [PATCH 07/11] ice: fix race condition in TX timestamp ring cleanup Fix a race condition between ice_free_tx_tstamp_ring() and ice_tx_map() that can cause a NULL pointer dereference. ice_free_tx_tstamp_ring currently clears the ICE_TX_FLAGS_TXTIME flag after NULLing the tstamp_ring. This could allow a concurrent ice_tx_map call on another CPU to dereference the tstamp_ring, which could lead to a NULL pointer dereference. CPU A:ice_free_tx_tstamp_ring() | CPU B:ice_tx_map() --------------------------------|--------------------------------- tx_ring->tstamp_ring = NULL | | ice_is_txtime_cfg() -> true | tstamp_ring = tx_ring->tstamp_ring | tstamp_ring->count // NULL deref! flags &= ~ICE_TX_FLAGS_TXTIME | Fix by: 1. Reordering ice_free_tx_tstamp_ring() to clear the flag before NULLing the pointer, with smp_wmb() to ensure proper ordering. 2. Adding smp_rmb() in ice_tx_map() after the flag check to order the flag read before the pointer read, using READ_ONCE() for the pointer, and adding a NULL check as a safety net. 3. Converting tx_ring->flags from u8 to DECLARE_BITMAP() and using atomic bitops (set_bit(), clear_bit(), test_bit()) for all flag operations throughout the driver: - ICE_TX_RING_FLAGS_XDP - ICE_TX_RING_FLAGS_VLAN_L2TAG1 - ICE_TX_RING_FLAGS_VLAN_L2TAG2 - ICE_TX_RING_FLAGS_TXTIME Fixes: ccde82e909467 ("ice: add E830 Earliest TxTime First Offload support") Signed-off-by: Keita Morisaki Reviewed-by: Aleksandr Loktionov Tested-by: Rinitha S Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-7-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice.h | 4 ++-- drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 2 +- drivers/net/ethernet/intel/ice/ice_lib.c | 4 ++-- drivers/net/ethernet/intel/ice/ice_txrx.c | 23 ++++++++++++++------ drivers/net/ethernet/intel/ice/ice_txrx.h | 16 +++++++++----- 5 files changed, 31 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index eb3a48330cc1..725b130dd3a2 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -753,7 +753,7 @@ static inline bool ice_is_xdp_ena_vsi(struct ice_vsi *vsi) static inline void ice_set_ring_xdp(struct ice_tx_ring *ring) { - ring->flags |= ICE_TX_FLAGS_RING_XDP; + set_bit(ICE_TX_RING_FLAGS_XDP, ring->flags); } /** @@ -778,7 +778,7 @@ static inline bool ice_is_txtime_ena(const struct ice_tx_ring *ring) */ static inline bool ice_is_txtime_cfg(const struct ice_tx_ring *ring) { - return !!(ring->flags & ICE_TX_FLAGS_TXTIME); + return test_bit(ICE_TX_RING_FLAGS_TXTIME, ring->flags); } /** diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c index bd77f1c001ee..16aa25535152 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c @@ -943,7 +943,7 @@ ice_tx_prepare_vlan_flags_dcb(struct ice_tx_ring *tx_ring, /* if this is not already set it means a VLAN 0 + priority needs * to be offloaded */ - if (tx_ring->flags & ICE_TX_FLAGS_RING_VLAN_L2TAG2) + if (test_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG2, tx_ring->flags)) first->tx_flags |= ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN; else first->tx_flags |= ICE_TX_FLAGS_HW_VLAN; diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 689c6025ea82..837b71b7b2b7 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -1412,9 +1412,9 @@ static int ice_vsi_alloc_rings(struct ice_vsi *vsi) ring->count = vsi->num_tx_desc; ring->txq_teid = ICE_INVAL_TEID; if (dvm_ena) - ring->flags |= ICE_TX_FLAGS_RING_VLAN_L2TAG2; + set_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG2, ring->flags); else - ring->flags |= ICE_TX_FLAGS_RING_VLAN_L2TAG1; + set_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG1, ring->flags); WRITE_ONCE(vsi->tx_rings[i], ring); } diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 7be9c062949b..4ca1a0602307 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -190,9 +190,10 @@ void ice_free_tstamp_ring(struct ice_tx_ring *tx_ring) void ice_free_tx_tstamp_ring(struct ice_tx_ring *tx_ring) { ice_free_tstamp_ring(tx_ring); + clear_bit(ICE_TX_RING_FLAGS_TXTIME, tx_ring->flags); + smp_wmb(); /* order flag clear before pointer NULL */ kfree_rcu(tx_ring->tstamp_ring, rcu); - tx_ring->tstamp_ring = NULL; - tx_ring->flags &= ~ICE_TX_FLAGS_TXTIME; + WRITE_ONCE(tx_ring->tstamp_ring, NULL); } /** @@ -405,7 +406,7 @@ static int ice_alloc_tstamp_ring(struct ice_tx_ring *tx_ring) tx_ring->tstamp_ring = tstamp_ring; tstamp_ring->desc = NULL; tstamp_ring->count = ice_calc_ts_ring_count(tx_ring); - tx_ring->flags |= ICE_TX_FLAGS_TXTIME; + set_bit(ICE_TX_RING_FLAGS_TXTIME, tx_ring->flags); return 0; } @@ -1521,13 +1522,20 @@ ice_tx_map(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first, return; if (ice_is_txtime_cfg(tx_ring)) { - struct ice_tstamp_ring *tstamp_ring = tx_ring->tstamp_ring; - u32 tstamp_count = tstamp_ring->count; - u32 j = tstamp_ring->next_to_use; + struct ice_tstamp_ring *tstamp_ring; + u32 tstamp_count, j; struct ice_ts_desc *ts_desc; struct timespec64 ts; u32 tstamp; + smp_rmb(); /* order flag read before pointer read */ + tstamp_ring = READ_ONCE(tx_ring->tstamp_ring); + if (unlikely(!tstamp_ring)) + goto ring_kick; + + tstamp_count = tstamp_ring->count; + j = tstamp_ring->next_to_use; + ts = ktime_to_timespec64(first->skb->tstamp); tstamp = ts.tv_nsec >> ICE_TXTIME_CTX_RESOLUTION_128NS; @@ -1555,6 +1563,7 @@ ice_tx_map(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first, tstamp_ring->next_to_use = j; writel_relaxed(j, tstamp_ring->tail); } else { +ring_kick: writel_relaxed(i, tx_ring->tail); } return; @@ -1814,7 +1823,7 @@ ice_tx_prepare_vlan_flags(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first) */ if (skb_vlan_tag_present(skb)) { first->vid = skb_vlan_tag_get(skb); - if (tx_ring->flags & ICE_TX_FLAGS_RING_VLAN_L2TAG2) + if (test_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG2, tx_ring->flags)) first->tx_flags |= ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN; else first->tx_flags |= ICE_TX_FLAGS_HW_VLAN; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index b6547e1b7c42..5e517f219379 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -212,6 +212,14 @@ enum ice_rx_dtype { ICE_RX_DTYPE_SPLIT_ALWAYS = 2, }; +enum ice_tx_ring_flags { + ICE_TX_RING_FLAGS_XDP, + ICE_TX_RING_FLAGS_VLAN_L2TAG1, + ICE_TX_RING_FLAGS_VLAN_L2TAG2, + ICE_TX_RING_FLAGS_TXTIME, + ICE_TX_RING_FLAGS_NBITS, +}; + struct ice_pkt_ctx { u64 cached_phctime; __be16 vlan_proto; @@ -352,11 +360,7 @@ struct ice_tx_ring { u16 count; /* Number of descriptors */ u16 q_index; /* Queue number of ring */ - u8 flags; -#define ICE_TX_FLAGS_RING_XDP BIT(0) -#define ICE_TX_FLAGS_RING_VLAN_L2TAG1 BIT(1) -#define ICE_TX_FLAGS_RING_VLAN_L2TAG2 BIT(2) -#define ICE_TX_FLAGS_TXTIME BIT(3) + DECLARE_BITMAP(flags, ICE_TX_RING_FLAGS_NBITS); struct xsk_buff_pool *xsk_pool; @@ -398,7 +402,7 @@ static inline bool ice_ring_ch_enabled(struct ice_tx_ring *ring) static inline bool ice_ring_is_xdp(struct ice_tx_ring *ring) { - return !!(ring->flags & ICE_TX_FLAGS_RING_XDP); + return test_bit(ICE_TX_RING_FLAGS_XDP, ring->flags); } enum ice_container_type { From fa28351f970fa5138c7c5dedfe5dea480a0ee065 Mon Sep 17 00:00:00 2001 From: Kohei Enju Date: Thu, 16 Apr 2026 17:53:32 -0700 Subject: [PATCH 08/11] ice: fix potential NULL pointer deref in error path of ice_set_ringparam() ice_set_ringparam nullifies tstamp_ring of temporary tx_rings, without clearing ICE_TX_RING_FLAGS_TXTIME bit. When ICE_TX_RING_FLAGS_TXTIME is set and the subsequent ice_setup_tx_ring() call fails, a NULL pointer dereference could happen in the unwinding sequence: ice_clean_tx_ring() -> ice_is_txtime_cfg() == true (ICE_TX_RING_FLAGS_TXTIME is set) -> ice_free_tx_tstamp_ring() -> ice_free_tstamp_ring() -> tstamp_ring->desc (NULL deref) Clear ICE_TX_RING_FLAGS_TXTIME bit to avoid the potential issue. Note that this potential issue is found by manual code review. Compile test only since unfortunately I don't have E830 devices. Fixes: ccde82e90946 ("ice: add E830 Earliest TxTime First Offload support") Signed-off-by: Kohei Enju Reviewed-by: Paul Greenwalt Tested-by: Rinitha S Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-8-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/ice/ice_ethtool.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index e6a20af6f63d..f28416a707d7 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -3290,6 +3290,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring, tx_rings[i].desc = NULL; tx_rings[i].tx_buf = NULL; tx_rings[i].tstamp_ring = NULL; + clear_bit(ICE_TX_RING_FLAGS_TXTIME, tx_rings[i].flags); tx_rings[i].tx_tstamps = &pf->ptp.port.tx; err = ice_setup_tx_ring(&tx_rings[i]); if (err) { From a24162f18825684ad04e3a5d0531f8a50d679347 Mon Sep 17 00:00:00 2001 From: Kohei Enju Date: Thu, 16 Apr 2026 17:53:33 -0700 Subject: [PATCH 09/11] i40e: don't advertise IFF_SUPP_NOFCS i40e advertises IFF_SUPP_NOFCS, allowing users to use the SO_NOFCS socket option. However, this option is silently ignored, as the driver does not check skb->no_fcs, and always enables FCS insertion offload. Fix this by removing the advertisement of IFF_SUPP_NOFCS. This behavior can be reproduced with a simple AF_PACKET socket: import socket s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW) s.setsockopt(socket.SOL_SOCKET, 43, 1) # SO_NOFCS s.bind(("eth0", 0)) s.send(b'\xff' * 64) Previously, send() succeeds but the driver ignores SO_NOFCS. With this change, send() fails with -EPROTONOSUPPORT, as expected. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Kohei Enju Reviewed-by: Aleksandr Loktionov Tested-by: Sunitha Mekala Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-9-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 926d001b2150..028bd500603a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -13783,7 +13783,6 @@ static int i40e_config_netdev(struct i40e_vsi *vsi) netdev->neigh_priv_len = sizeof(u32) * 4; netdev->priv_flags |= IFF_UNICAST_FLT; - netdev->priv_flags |= IFF_SUPP_NOFCS; /* Setup netdev TC information */ i40e_vsi_config_netdev_tc(vsi, vsi->tc_config.enabled_tc); From 496d9f91062fa07956702e0f234c5203f03a974d Mon Sep 17 00:00:00 2001 From: Petr Oros Date: Thu, 16 Apr 2026 17:53:34 -0700 Subject: [PATCH 10/11] iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2 The IAVF_RXD_LEGACY_L2TAG2_M mask was incorrectly defined as GENMASK_ULL(63, 32), extracting 32 bits from qw2 instead of the 16-bit VLAN tag. In the legacy Rx descriptor layout, the 2nd L2TAG2 (VLAN tag) occupies bits 63:48 of qw2, not 63:32. The oversized mask causes FIELD_GET to return a 32-bit value where the actual VLAN tag sits in bits 31:16. When this value is passed to iavf_receive_skb() as a u16 parameter, it gets truncated to the lower 16 bits (which contain the 1st L2TAG2, typically zero). As a result, __vlan_hwaccel_put_tag() is never called and software VLAN interfaces on VFs receive no traffic. This affects VFs behind ice PF (VIRTCHNL VLAN v2) when the PF advertises VLAN stripping into L2TAG2_2 and legacy descriptors are used. The flex descriptor path already uses the correct mask (IAVF_RXD_FLEX_L2TAG2_2_M = GENMASK_ULL(63, 48)). Reproducer: 1. Create 2 VFs on ice PF (echo 2 > sriov_numvfs) 2. Disable spoofchk on both VFs 3. Move each VF into a separate network namespace 4. On each VF: create VLAN interface (e.g. vlan 198), assign IP, bring up 5. Set rx-vlan-offload OFF on both VFs 6. Ping between VLAN interfaces -> expect PASS (VLAN tag stays in packet data, kernel matches in-band) 7. Set rx-vlan-offload ON on both VFs 8. Ping between VLAN interfaces -> expect FAIL if bug present (HW strips VLAN tag into descriptor L2TAG2 field, wrong mask extracts bits 47:32 instead of 63:48, truncated to u16 -> zero, __vlan_hwaccel_put_tag() never called, packet delivered to parent interface, not VLAN interface) The reproducer requires legacy Rx descriptors. On modern ice + iavf with full PTP support, flex descriptors are always negotiated and the buggy legacy path is never reached. Flex descriptors require all of: - CONFIG_PTP_1588_CLOCK enabled - VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC granted by PF - PTP capabilities negotiated (VIRTCHNL_VF_CAP_PTP) - VIRTCHNL_1588_PTP_CAP_RX_TSTAMP supported - VIRTCHNL_RXDID_2_FLEX_SQ_NIC present in DDP profile If any condition is not met, iavf_select_rx_desc_format() falls back to legacy descriptors (RXDID=1) and the wrong L2TAG2 mask is hit. Fixes: 2dc8e7c36d80 ("iavf: refactor iavf_clean_rx_irq to support legacy and flex descriptors") Signed-off-by: Petr Oros Reviewed-by: Aleksandr Loktionov Reviewed-by: Paul Menzel Reviewed-by: Jacob Keller Tested-by: Rafal Romanowski Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-10-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/iavf/iavf_type.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/iavf/iavf_type.h b/drivers/net/ethernet/intel/iavf/iavf_type.h index 1d8cf29cb65a..5bb1de1cfd33 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_type.h +++ b/drivers/net/ethernet/intel/iavf/iavf_type.h @@ -277,7 +277,7 @@ struct iavf_rx_desc { /* L2 Tag 2 Presence */ #define IAVF_RXD_LEGACY_L2TAG2P_M BIT(0) /* Stripped S-TAG VLAN from the receive packet */ -#define IAVF_RXD_LEGACY_L2TAG2_M GENMASK_ULL(63, 32) +#define IAVF_RXD_LEGACY_L2TAG2_M GENMASK_ULL(63, 48) /* Stripped S-TAG VLAN from the receive packet */ #define IAVF_RXD_FLEX_L2TAG2_2_M GENMASK_ULL(63, 48) /* The packet is a UDP tunneled packet */ From aa3f7fe409350857c25d050482a2eef2cfd69b58 Mon Sep 17 00:00:00 2001 From: Matt Vollrath Date: Thu, 16 Apr 2026 17:53:36 -0700 Subject: [PATCH 11/11] e1000e: Unroll PTP in probe error handling If probe fails after registering the PTP clock and its delayed work, these resources must be released. This was not an issue until a 2016 fix moved the e1000e_ptp_init() call before the jump to err_register. Fixes: aa524b66c5ef ("e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl") Signed-off-by: Matt Vollrath Tested-by: Avigail Dahan Signed-off-by: Jacob Keller Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-12-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/e1000e/netdev.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index 9befdacd6730..7ce0cc8ab8f4 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -7706,6 +7706,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent) err_register: if (!(adapter->flags & FLAG_HAS_AMT)) e1000e_release_hw_control(adapter); + e1000e_ptp_remove(adapter); err_eeprom: if (hw->phy.ops.check_reset_block && !hw->phy.ops.check_reset_block(hw)) e1000_phy_hw_reset(&adapter->hw);