The egress timestamp FIFO/circular buffer work different on TJA1120 than
TJA1103.
For TJA1103 the new timestamp should be manually moved from the FIFO to
the hardware buffer before checking if the timestamp is valid.
For TJA1120 the hardware will move automatically the new timestamp
from the FIFO to the buffer and the user should check the valid bit, read
the timestamp and unlock the buffer by writing any of the buffer
registers(which are read only).
Another change for the TJA1120 is the behaviour of the EGR TS IRQ bit.
This bit was a self-clear bit for TJA1103, but now should be cleared
before reading the timestamp.
Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-8-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
PHY_BASIC_T1_FEATURES are not the right features supported by TJA1103
anymore.
For example ethtool reports:
[root@alarm ~]# ethtool end0
Settings for end0:
Supported ports: [ TP ]
Supported link modes: 100baseT1/Full
10baseT1L/Full
10baseT1L/Full is not supported by TJA1103 and supported ports list is
not completed. The PHY also have a MII port.
genphy_c45_pma_read_abilities implementation can detect the PHY features
and they look like this.
[root@alarm ~]# ethtool end0
Settings for end0:
Supported ports: [ TP MII ]
Supported link modes: 100baseT1/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 100baseT1/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 100Mb/s
Duplex: Full
Auto-negotiation: off
master-slave cfg: forced master
master-slave status: master
Port: Twisted Pair
PHYAD: 1
Transceiver: external
MDI-X: Unknown
Supports Wake-on: g
Wake-on: d
Link detected: yes
SQI: 7/7
Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-5-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Between TJA1120 and TJA1103 the hardware was improved, but some register
addresses were changed and some bit fields were moved from one register
to another.
Introduce the nxp_c45_reg_field structure and its associated functions to
abstract the differences between the PHYs.
Remove the defined bits and register addresses that are not common
between TJA1103 and TJA1120 and replace them with reg_fields and
register addresses from phydev->drv->driver_data.
Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-4-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gavin Li says:
====================
virtio_net: add per queue interrupt coalescing support
Currently, coalescing parameters are grouped for all transmit and receive
virtqueues. This patch series add support to set or get the parameters for
a specified virtqueue.
When the traffic between virtqueues is unbalanced, for example, one virtqueue
is busy and another virtqueue is idle, then it will be very useful to
control coalescing parameters at the virtqueue granularity.
Example command:
$ ethtool -Q eth5 queue_mask 0x1 --coalesce tx-packets 10
Would set max_packets=10 to VQ 1.
$ ethtool -Q eth5 queue_mask 0x1 --coalesce rx-packets 10
Would set max_packets=10 to VQ 0.
$ ethtool -Q eth5 queue_mask 0x1 --show-coalesce
Queue: 0
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
rx-usecs: 222
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 256
tx-usecs: 222
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 256
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0
====================
Link: https://lore.kernel.org/r/20230731070656.96411-1-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add interrupt_coalesce config in send_queue and receive_queue to cache user
config.
Send per virtqueue interrupt moderation config to underlying device in
order to have more efficient interrupt moderation and cpu utilization of
guest VM.
Additionally, address all the VQs when updating the global configuration,
as now the individual VQs configuration can diverge from the global
configuration.
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-3-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparation for multi-CPU support, set CPU port LOOKUP MEMBER outside
the port loop and setup the LOOKUP MEMBER mask for user ports only to
the first CPU port.
This is to handle flooding condition where every CPU port is set as
target and prevent packet duplication for unknown frames from user ports.
Secondary CPU port LOOKUP MEMBER mask will be setup later when
port_change_master will be implemented.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20230730074113.21889-3-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Address learning should initially be turned off by the driver for port
operation in standalone mode, then the DSA core handles changes to it
via ds->ops->port_bridge_flags().
Currently this is not the case for qca8k where learning is enabled
unconditionally in qca8k_setup for every user port.
Handle ports configured in standalone mode by making the learning
configurable and not enabling it by default.
Implement .port_pre_bridge_flags and .port_bridge_flags dsa ops to
enable learning for bridge that request it and tweak
.port_stp_state_set to correctly disable learning when port is
configured in standalone mode.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-2-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela says:
====================
net/sched: improve class lifetime handling
Valis says[0]:
============
Three classifiers (cls_fw, cls_u32 and cls_route) always copy
tcf_result struct into the new instance of the filter on update.
This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.
============
Turns out these could have been spotted easily with proper warnings.
Improve the current class lifetime with wrappers that check for
overflow/underflow.
While at it add an extack for when a class in use is deleted.
[0] https://lore.kernel.org/all/20230721174856.3045-1-sec@valis.email/
====================
Link: https://lore.kernel.org/r/20230728153537.1865379-1-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The 'filter_cnt' counter is used to control a Qdisc class lifetime.
Each filter referecing this class by its id will eventually
increment/decrement this counter in their respective
'add/update/delete' routines.
As these operations are always serialized under rtnl lock, we don't
need an atomic type like 'refcount_t'.
It also means that we lose the overflow/underflow checks already
present in refcount_t, which are valuable to hunt down bugs
where the unsigned counter wraps around as it aids automated tools
like syzkaller to scream in such situations.
Wrap the open coded increment/decrement into helper functions and
add overflow checks to the operations.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts says:
====================
mptcp: cleanup and improvements in the selftests
This small series of 4 patches adds some improvements in MPTCP
selftests:
- Patch 1 reworks the detailed report of mptcp_join.sh selftest to
better display what went well or wrong per test.
- Patch 2 adds colours (if supported, forced and/or not disabled) in
mptcp_join.sh selftest output to help spotting issues.
- Patch 3 modifies an MPTCP selftest tool to interact with the
path-manager via Netlink to always look for errors if any. This makes
sure odd behaviours can be seen in the logs and errors can be caught
later if needed.
- Patch 4 removes stdout and stderr redirections to /dev/null when using
pm_nl_ctl if no errors are expected in order to log odd behaviours.
====================
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-0-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If a Netlink command for the MPTCP path-managers is not valid, it is
important to check if there are errors. If yes, they need to be reported
instead of being ignored and exiting without errors.
Now if no replies are expected, an ACK from the kernelspace is asked by
the userspace in order to always expect a reply. We can use the same
buffer that is currently always >1024 bytes. Then we can check if there
is an error (err->error), print it if any and report the error.
After this modification, it is required to mute expected errors in
mptcp_join.sh and pm_netlink.sh selftests:
- when trying to add a bad endpoint, e.g. duplicated
- when trying to set the two limits above the hard limit
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-3-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch modifies how the detailed results are printed, mainly to
improve what is displayed in case of issue:
- Now the test name (title) is printed earlier, when starting the test
if it is not intentionally skipped: by doing that, errors linked to
a test will be printed after having written the test name and then
avoid confusions.
- Due to the previous item, it is required to add a new line after
having printed the test name because in case of error with a command,
it is better not to have the output in the middle of the screen.
- Each check is printed on a dedicated line with aligned status (ok,
skip, fail): it is easier to spot which one has failed, simpler to
manage in the code not having to deal with alignment case by case and
helpers can be used to uniform what is done. These helpers can also be
useful later to do more actions depending on the results or change in
one place what is printed.
- Info messages have been reduced and aligned as well. And info messages
about the creation of the default test files of 1 KB are no longer
printed.
Example:
001 no JOIN
syn [ ok ]
synack [ ok ]
ack [ ok ]
Or with a skip and a failure:
001 no JOIN
syn [ ok ]
synack [fail] got 42 JOIN[s] synack expected 0
Server ns stats
(...)
Client ns stats
(...)
ack [skip]
Or with info:
104 Infinite map
Test file (size 128 KB) for client
Test file (size 128 KB) for server
file received by server has inverted byte at 169
5 corrupted pkts
syn [ ok ]
synack [ ok ]
While at it, verify_listener_events() now also print more info in case
of failure and in pm_nl_check_endpoint(), the test is marked as failed
instead of skipped if no ID has been given (internal selftest issue).
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-1-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
During unregister_netdevice_many_notify(), the ordering of our concerned
function calls is like this:
unregister_netdevice_many_notify
dev_shutdown
qdisc_put
clsact_destroy
tcx_uninstall
The syzbot reproducer triggered a case that the qdisc refcnt is not
zero during dev_shutdown().
tcx_uninstall() will then WARN_ON_ONCE(tcx_entry(entry)->miniq_active)
because the miniq is still active and the entry should not be freed.
The latter assumed that qdisc destruction happens before tcx teardown.
This fix is to avoid tcx_uninstall() doing tcx_entry_free() when the
miniq is still alive and let the clsact_destroy() do the free later, so
that we do not assume any specific ordering for either of them.
If still active, tcx_uninstall() does clear the entry when flushing out
the prog/link. clsact_destroy() will then notice the "!tcx_entry_is_active()"
and then does the tcx_entry_free() eventually.
Fixes: e420bed025 ("bpf: Add fd-based tcx multi-prog infra with link support")
Reported-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com
Reported-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/222255fe07cb58f15ee662e7ee78328af5b438e4.1690549248.git.daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shuah Khan says:
====================
Connector/proc_filter test fixes
The first patch fixes the LKFT reported compile error, second
one adds .gitignore.
====================
Applying the first 2 patches, third one resent separately.
Link: https://lore.kernel.org/r/cover.1690564372.git.skhan@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The test compile fails with following errors. Fix the Makefile
CFLAGS to include KHDR_INCLUDES to pull in uapi defines.
gcc -Wall proc_filter.c -o ../tools/testing/selftests/connector/proc_filter
proc_filter.c: In function ‘send_message’:
proc_filter.c:22:33: error: invalid application of ‘sizeof’ to incomplete type ‘struct proc_input’
22 | sizeof(struct proc_input))
| ^~~~~~
proc_filter.c:42:19: note: in expansion of macro ‘NL_MESSAGE_SIZE’
42 | char buff[NL_MESSAGE_SIZE];
| ^~~~~~~~~~~~~~~
proc_filter.c:22:33: error: invalid application of ‘sizeof’ to incomplete type ‘struct proc_input’
22 | sizeof(struct proc_input))
| ^~~~~~
proc_filter.c:48:34: note: in expansion of macro ‘NL_MESSAGE_SIZE’
48 | hdr->nlmsg_len = NL_MESSAGE_SIZE;
| ^~~~~~~~~~~~~~~
`
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/all/CA+G9fYt=6ysz636XcQ=-KJp7vJcMZ=NjbQBrn77v7vnTcfP2cA@mail.gmail.com/
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/d0055c8cdf18516db8ba9edec99cfc5c08f32a7c.1690564372.git.skhan@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit df8fc4e934 ("kbuild: Enable -fstrict-flex-arrays=3") started
applying strict rules to standard string functions.
It does not work well with conventional socket code around each protocol-
specific sockaddr_XXX struct, which is cast from sockaddr_storage and has
a bigger size than fortified functions expect. See these commits:
commit 06d4c8a808 ("af_unix: Fix fortify_panic() in unix_bind_bsd().")
commit ecb4534b6a ("af_unix: Terminate sun_path when bind()ing pathname socket.")
commit a0ade8404c ("af_packet: Fix warning of fortified memcpy() in packet_getname().")
We must cast the protocol-specific address back to sockaddr_storage
to call such functions.
However, in the case of getsockaddr(SO_PEERNAME), the rationale is a bit
unclear as the buffer is defined by char[128] which is the same size as
sockaddr_storage.
Let's use sockaddr_storage explicitly.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>