BA CAM of 8852C has more entries and more fields of H2C, and needs
initialization before using. Due to differences from 8852A/8852B, we name
it as V1 before. However, real V1 of BA CAM is introduced now, so change
it to V0_EXT to avoid confusing with firmware design.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230421024551.29994-7-pkshih@realtek.com
The following are scan offload related H2C (host to chip) function types.
* H2C_FUNC_ADD_SCANOFLD_CH
* H2C_FUNC_SCANOFLD
Before doing FW scan, we will continuously send multiple H2Cs with above
types which are used to tell FW the scan configuration of this time. But,
if FW doesn't handle one of these H2Cs well, the FW scan process might
not run as expected and driver should notice it early.
So, this commits makes scan offload related H2Cs wait for FW done ACK via
rtw89_wait_for_cond() and rtw89_complete_cond(). And, we check the return
code of these H2Cs from FW.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/a15f4bd594f71d7602ea67698b035805143700c9.camel@realtek.com
We have some MAC H2Cs (host to chip packets), which have no clear
individual C2Hs (chip to host packets) to indicate FW execution
response, but they are going to require to wait for FW completion.
So, we have to deal with this via common MAC C2H receive/done ACKs.
This commit changes the context, where common MAC C2H receive/done
ACK handlers are executed, to interrupt context. And, code comments
are added to prevent future commits from using it incorrectly.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/c4d766885e00b9f9dcf7954a80096c8b9d21149b.camel@realtek.com
The H2Cs (host to chip packets) related to packet offload functions
need to wait for FW responses in case FW state machine gets wrong
and makes driver status no longer able to align FW one. In flow,
driver may continuously send multiple H2Cs of packet offload series.
If somehow FW doesn't deal with the former yet but the latter has
gotten in, it might cause the problem mentioned above.
So, we block these H2Cs by rtw89_wait_for_cond(). And then, when
the corresponding C2Hs (chip to host packets) is received, we call
rtw89_complete_cond(). Besides, RTW89_MAC_C2H_FUNC_PKT_OFLD_RSP's
C2H handler should be executed in interrupt context to make our
wait/complete process work as expected.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/9ae8c1f105901c65e3171276a9fd6c99ae51803f.camel@realtek.com
There are two places where offload packets of 6 GHz probe would be deleted
from FW, i.e. calling rtw89_fw_h2c_del_pkt_offload().
* rtw89_core_cancel_6ghz_probe_tx()
* rtw89_release_pkt_list()
It is possible that we try to delete the same one from FW twice. Although
it might not be a big problem for now, it will depend on the runtime chip
firmware. So, we add a check to avoid it. In case things becomes complex
due to racing problem, we don't choose to do list_del(info->list) and
kfree(info) in both sides.
Besides, rtw89_fw_h2c_del_pkt_offload() will needs to wait for completion
after the follow-up commit. However, rtw89_core_cancel_6ghz_probe_tx()
was called in interrupt context. So, we move the stuffs of calling
rtw89_fw_h2c_del_pkt_offload() from rtw89_core_cancel_6ghz_probe_tx()
into a work. Then, we also need a check there before we call it.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/091966e5709cd7caecf9b81f7fd6388ae2b70a7e.camel@realtek.com
We have a pair of FW functions, rtw89_fw_h2c_add_pkt_offload() and
rtw89_fw_h2c_del_pkt_offload(). The rtw89_fw_h2c_add_pkt_offload()
acquires the bit itself, but the bit needs to be released by the
caller of rtw89_fw_h2c_del_pkt_offload(). This looks asymmetrical
and is not friendly to callers.
Second, if callers always releases the bits, it might make driver
unaligned to bitmap status of FW after some failures of calling
rtw89_fw_h2c_del_pkt_offload(). So, this commit move bit release
into rtw89_fw_h2c_del_pkt_offload().
In general, driver will call rtw89_fw_h2c_add_pkt_offload() and
rtw89_fw_h2c_del_pkt_offload(), and then, SW bitmap can align
with FW one. There is one exception when notify_fw is false.
It happens when driver detects FW problems and is going to
reset FW. Only in this case, driver needs to release bits
outside rtw89_fw_h2c_del_pkt_offload().
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/8cf5d45c5b04e7b680d4eb9dda62056cdce14cec.camel@realtek.com
RSSI statistics are grouped by CCK, OFDM or non-legacy rate. These
statistics will be collected in training state for both (main/aux)
antenna. There is a time period (ANTDIV_DELAY) for rate adaptive
settle down before start collect statistics when switch antenna.
Antenna diversity checks packet count from training state for each
group and use the most one as the final RSSI for comparison, and
then choose the better one as target antenna.
Signed-off-by: Eric Huang <echuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230418012820.5139-7-pkshih@realtek.com
RSSI strength is only from PHY path A, but there are two antenna for the
module which supports antenna diversity. So, set RSSI value to index 1 of
RSSI array if current antenna is on antenna B. Then, debugfs can show
two RSSI values with a asterisk mark on selected antenna.
RSSI: -23 dBm (raw=174, prev=173) [-26, -23*]
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230418012820.5139-4-pkshih@realtek.com
TX antenna diversity is a mechanism to select a proper antenna from two
antenna for single one hardware PHY chip. It chooses antenna with better
EVM or RSSI, and use GPIO to control SPDT to switch selected antenna.
RFE type from efuse is used to define if a module can support TX antenna
diversity when (type % 3) is 2.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230418012820.5139-3-pkshih@realtek.com
If there is a failure during copy_from_user or user-provided data
buffer is invalid, rtw_debugfs_copy_from_user should return negative
error code instead of a positive value count.
Fix this bug by returning correct error code. Moreover, the check
of buffer against null is removed since it will be handled by
copy_from_user.
Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/tencent_D2EB102CC7435C0110154E62ECA6A7D67505@qq.com
The driver can receive several frames in the same USB transfer.
Add the code to handle this in rtl8xxxu_parse_rxdesc24(), even though
currently all the relevant chips send only one frame per USB transfer
(RTL8723BU, RTL8192EU, RTL8188FU, RTL8710BU).
This was tested with RTL8188FU, RTL8192EU, RTL8710BU, and RTL8192FU.
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/16d2d1ff-6438-10c9-347f-6e14dd358ccf@gmail.com
As this driver uses HAS_RATE_CONTROL, rate_flags will not be provided by
mac80211.
Stop using tx_info->control.rates[0].flags and ieee80211_get_rts_cts_rate()
and use rts_threshold and bss_conf.use_cts_prot instead to determine
when to use RTS and CTS.
Send RTS with 24M rate like the vendor drivers. Also set this RTS rate
for ampdu_enable = true, because we also enable RTS for these frames.
Signed-off-by: Martin Kaistra <martin.kaistra@linutronix.de>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230428150833.218605-17-martin.kaistra@linutronix.de
As this driver uses HAS_RATE_CONTROL, tx_rates will not be provided by
mac80211.
For some frames c->control.rates[0].idx is negative, which means
ieee80211_get_tx_rate() will print a warning and return NULL.
Only management frames have USE_DRIVER_RATE set, so for all others the
rate info of txdesc is ignored anyway.
Remove call to ieee80211_get_tx_rate() and send management frames with
1M (rate info = 0).
Signed-off-by: Martin Kaistra <martin.kaistra@linutronix.de>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230428150833.218605-16-martin.kaistra@linutronix.de
When the action of an xdp program was XDP_TX, lan966x was creating
a xdp_frame and use this one to send the frame back. But it is also
possible to send back the frame without needing a xdp_frame, because
it is possible to send it back using the page.
And then once the frame is transmitted is possible to use directly
page_pool_recycle_direct as lan966x is using page pools.
This would save some CPU usage on this path, which results in higher
number of transmitted frames. Bellow are the statistics:
Frame size: Improvement:
64 ~8%
256 ~11%
512 ~8%
1000 ~0%
1500 ~0%
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230422142344.3630602-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2023-04-24
We've added 5 non-merge commits during the last 3 day(s) which contain
a total of 7 files changed, 87 insertions(+), 44 deletions(-).
The main changes are:
1) Workaround for bpf iter selftest due to lack of subprog support
in precision tracking, from Andrii.
2) Disable bpf_refcount_acquire kfunc until races are fixed, from Dave.
3) One more test_verifier test converted from asm macro to asm in C,
from Eduard.
4) Fix build with NETFILTER=y INET=n config, from Florian.
5) Add __rcu_read_{lock,unlock} into deny list, from Yafang.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: avoid mark_all_scalars_precise() trigger in one of iter tests
bpf: Add __rcu_read_{lock,unlock} into btf id deny list
bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed
selftests/bpf: verifier/prevent_map_lookup converted to inline assembly
bpf: fix link failure with NETFILTER=y INET=n
====================
Link: https://lore.kernel.org/r/20230425005648.86714-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gerhard Engleder says:
====================
tsnep: XDP socket zero-copy support
Implement XDP socket zero-copy support for tsnep driver. I tried to
follow existing drivers like igc as far as possible. But one main
difference is that tsnep does not need any reconfiguration for XDP BPF
program setup. So I decided to keep this behavior no matter if a XSK
pool is used or not. As a result, tsnep starts using the XSK pool even
if no XDP BPF program is available.
Another difference is that I tried to prevent potentially failing
allocations during XSK pool setup. E.g. both memory models for page pool
and XSK pool are registered all the time. Thus, XSK pool setup cannot
end up with not working queues.
Some prework is done to reduce the last two XSK commits to actual XSK
changes.
====================
Link: https://lore.kernel.org/r/20230421194656.48063-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Send and complete XSK pool frames within TX NAPI context. NAPI context
is triggered by ndo_xsk_wakeup.
Test results with A53 1.2GHz:
xdpsock txonly copy mode, 64 byte frames:
pps pkts 1.00
tx 284,409 11,398,144
Two CPUs with 100% and 10% utilization.
xdpsock txonly zero-copy mode, 64 byte frames:
pps pkts 1.00
tx 511,929 5,890,368
Two CPUs with 100% and 1% utilization.
xdpsock l2fwd copy mode, 64 byte frames:
pps pkts 1.00
rx 248,985 7,315,885
tx 248,921 7,315,885
Two CPUs with 100% and 10% utilization.
xdpsock l2fwd zero-copy mode, 64 byte frames:
pps pkts 1.00
rx 254,735 3,039,456
tx 254,735 3,039,456
Two CPUs with 100% and 4% utilization.
Packet rate increases and CPU utilization is reduced in both cases.
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for XSK zero-copy to RX path. The setup of the XSK pool can
be done at runtime. If the netdev is running, then the queue must be
disabled and enabled during reconfiguration. This can be done easily
with functions introduced in previous commits.
A more important property is that, if the netdev is running, then the
setup of the XSK pool shall not stop the netdev in case of errors. A
broken netdev after a failed XSK pool setup is bad behavior. Therefore,
the allocation and setup of resources during XSK pool setup is done only
before any queue is disabled. Additionally, freeing and later allocation
of resources is eliminated in some cases. Page pool entries are kept for
later use. Two memory models are registered in parallel. As a result,
the XSK pool setup cannot fail during queue reconfiguration.
In contrast to other drivers, XSK pool setup and XDP BPF program setup
are separate actions. XSK pool setup can be done without any XDP BPF
program. The XDP BPF program can be added, removed or changed without
any reconfiguration of the XSK pool.
Test results with A53 1.2GHz:
xdpsock rxdrop copy mode, 64 byte frames:
pps pkts 1.00
rx 856,054 10,625,775
Two CPUs with both 100% utilization.
xdpsock rxdrop zero-copy mode, 64 byte frames:
pps pkts 1.00
rx 889,388 4,615,284
Two CPUs with 100% and 20% utilization.
Packet rate increases and CPU utilization is reduced.
100% CPU load seems to the base load. This load is consumed by ksoftirqd
just for dropping the generated packets without xdpsock running.
Using batch API reduced CPU utilization slightly, but measurements are
not stable enough to provide meaningful numbers.
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>