Commit Graph

940 Commits

Author SHA1 Message Date
Linus Torvalds
c45be7c420 Merge tag 'for-linus-7.0-1' of https://github.com/cminyard/linux-ipmi
Pull IPMI driver fixes from Corey Minyard:
 "This mostly revolves around getting the driver to behave when the IPMI
  device misbehaves. Past attempts have not worked very well because I
  didn't have hardware I could make do this, and AI was fairly useless
  for help on this.

  So I modified qemu and my test suite so I could reproduce a
  misbehaving IPMI device, and with that I was able to fix the issues"

* tag 'for-linus-7.0-1' of https://github.com/cminyard/linux-ipmi:
  ipmi:si: Fix check for a misbehaving BMC
  ipmi:msghandler: Handle error returns from the SMI sender
  ipmi:si: Don't block module unload if the BMC is messed up
  ipmi:si: Use a long timeout when the BMC is misbehaving
  ipmi:si: Handle waiting messages when BMC failure detected
  ipmi:ls2k: Make ipmi_ls2k_platform_driver static
  ipmi: ipmb: initialise event handler read bytes
  ipmi: Consolidate the run to completion checking for xmit msgs lock
  ipmi: Fix use-after-free and list corruption on sender error
2026-02-26 14:34:21 -08:00
Corey Minyard
cae66f1a1d ipmi:si: Fix check for a misbehaving BMC
There is a race on checking the state in the sender; it needs to be
checked under a lock.  But you also need a check to avoid issues with
a misbehaving BMC for run to completion mode.  So leave the check at
the beginning for run to completion, and add a check under the lock
to avoid the race.

Reported-by: Rafael J. Wysocki <rafael@kernel.org>
Fixes: bc3a9d2177 ("ipmi:si: Gracefully handle if the BMC is non-functional")
Cc: stable@vger.kernel.org # 4.18
Signed-off-by: Corey Minyard <corey@minyard.net>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
2026-02-23 09:00:48 -06:00
Corey Minyard
62cd145453 ipmi:msghandler: Handle error returns from the SMI sender
It used to be, until recently, that the sender operation on the low
level interfaces would not fail.  That's not the case any more with
recent changes.

So check the return value from the sender operation, and propagate it
back up from there and handle the errors in all places.
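
The pattern can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the driver's code: `smi_handlers`, `smi_send`, `ok_sender` and `failing_sender` are hypothetical names, and -EIO is just an example error value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a low-level interface's operations. */
struct smi_handlers {
    /* Returns 0 on success or a negative errno value on failure. */
    int (*sender)(void *send_info, const char *msg);
};

/* Example sender that always succeeds. */
static int ok_sender(void *send_info, const char *msg)
{
    (void)send_info;
    (void)msg;
    return 0;
}

/* Example sender that models a BMC that cannot take the message. */
static int failing_sender(void *send_info, const char *msg)
{
    (void)send_info;
    (void)msg;
    return -EIO;
}

/* Upper layer: check the sender's result and hand it back to the caller. */
static int smi_send(const struct smi_handlers *h, void *send_info,
                    const char *msg)
{
    int rv;

    assert(msg != NULL);
    if (!h || !h->sender)
        return -EINVAL;
    rv = h->sender(send_info, msg);
    if (rv)
        return rv;  /* the caller still owns the message and must clean up */
    return 0;
}
```

Calling `smi_send()` with `failing_sender` installed returns the sender's error unchanged, so every caller up the chain has to decide what to do with it.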

Reported-by: Rafael J. Wysocki <rafael@kernel.org>
Fixes: bc3a9d2177 ("ipmi:si: Gracefully handle if the BMC is non-functional")
Cc: stable@vger.kernel.org # 4.18
Signed-off-by: Corey Minyard <corey@minyard.net>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
2026-02-23 09:00:48 -06:00
Corey Minyard
f895e5df80 ipmi:si: Don't block module unload if the BMC is messed up
If the BMC is in a bad state, don't bother waiting for queued messages
since there can't be any.  Otherwise the unload is blocked until the
BMC is back in a good state.

Reported-by: Rafael J. Wysocki <rafael@kernel.org>
Fixes: bc3a9d2177 ("ipmi:si: Gracefully handle if the BMC is non-functional")
Cc: stable@vger.kernel.org # 4.18
Signed-off-by: Corey Minyard <corey@minyard.net>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
2026-02-23 08:58:31 -06:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".
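
A userspace analogue of the typed-allocation idea, for illustration only. The macros `alloc_obj` and `alloc_objs` below are hypothetical and are NOT the kernel's definitions; they take no GFP argument, rely on the GNU `__typeof__` extension, and use calloc() for the array case, which also zeroes the memory.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace analogues only; these are not the kernel macros.
 * TYPE may be a type name or an expression such as *ptr.
 */
#define alloc_obj(TYPE) \
    ((__typeof__(TYPE) *)malloc(sizeof(TYPE)))

/* calloc() checks the COUNT * size multiplication for overflow. */
#define alloc_objs(TYPE, COUNT) \
    ((__typeof__(TYPE) *)calloc((COUNT), sizeof(TYPE)))

struct channel {
    int medium;
    int protocol;
};

/* Allocate one channel; the result is already "struct channel *". */
static struct channel *channel_new(int medium, int protocol)
{
    struct channel *c = alloc_obj(*c);

    if (!c)
        return NULL;
    c->medium = medium;
    c->protocol = protocol;
    return c;
}

/* Allocate a zeroed array of channels. */
static struct channel *channel_array_new(size_t count)
{
    assert(count > 0);
    return alloc_objs(struct channel, count);
}
```

Because the macro result has type `TYPE *` rather than `void *`, assigning it to a pointer of a different struct type draws a compiler diagnostic instead of passing silently.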

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Corey Minyard
c3bb329563 ipmi:si: Use a long timeout when the BMC is misbehaving
If the driver goes into HOSED state, don't reset the timeout to the
short timeout in the timeout handler.

Reported-by: Igor Raits <igor@gooddata.com>
Closes: https://lore.kernel.org/linux-acpi/CAK8fFZ58fidGUCHi5WFX0uoTPzveUUDzT=k=AAm4yWo3bAuCFg@mail.gmail.com/
Fixes: bc3a9d2177 ("ipmi:si: Gracefully handle if the BMC is non-functional")
Cc: stable@vger.kernel.org # 4.18
Signed-off-by: Corey Minyard <corey@minyard.net>
2026-02-06 11:06:26 -06:00
Corey Minyard
52c9ee202e ipmi:si: Handle waiting messages when BMC failure detected
If a BMC failure is detected, the current message is returned with an
error.  However, if there was a waiting message, it would not be
handled.

Add a check for the waiting message after handling the current message.

Suggested-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Rafael J. Wysocki <rafael@kernel.org>
Closes: https://lore.kernel.org/linux-acpi/CAK8fFZ58fidGUCHi5WFX0uoTPzveUUDzT=k=AAm4yWo3bAuCFg@mail.gmail.com/
Fixes: bc3a9d2177 ("ipmi:si: Gracefully handle if the BMC is non-functional")
Cc: stable@vger.kernel.org # 4.18
Signed-off-by: Corey Minyard <corey@minyard.net>
2026-02-06 10:50:59 -06:00
Corey Minyard
6b157b408d ipmi:ls2k: Make ipmi_ls2k_platform_driver static
No need for it to be global.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601170753.3zDBerGP-lkp@intel.com/
Signed-off-by: Corey Minyard <corey@minyard.net>
2026-02-03 21:06:19 -06:00
Matt Johnston
9f235ccecd ipmi: ipmb: initialise event handler read bytes
IPMB doesn't use i2c reads, but the handler needs to set a value.
Otherwise an i2c read will return an uninitialised value from the bus
driver.

Fixes: 63c4eb3471 ("ipmi:ipmb: Add initial support for IPMI over IPMB")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Message-ID: <20260113-ipmb-read-init-v1-1-a9cbce7b94e3@codeconstruct.com.au>
Signed-off-by: Corey Minyard <corey@minyard.net>
2026-02-03 21:06:18 -06:00
Corey Minyard
1d90e6c1a5 ipmi: Consolidate the run to completion checking for xmit msgs lock
It made things hard to read, so move the check to a function.

Signed-off-by: Corey Minyard <corey@minyard.net>
Reviewed-by: Breno Leitao <leitao@debian.org>
2026-02-03 21:06:18 -06:00
Corey Minyard
594c11d0e1 ipmi: Fix use-after-free and list corruption on sender error
The analysis from Breno:

When the SMI sender returns an error, smi_work() delivers an error
response but then jumps back to restart without cleaning up properly:

1. intf->curr_msg is not cleared, so no new message is pulled
2. newmsg still points to the message, causing sender() to be called
   again with the same message
3. If sender() fails again, deliver_err_response() is called with
   the same recv_msg that was already queued for delivery

This causes list_add corruption ("list_add double add") because the
recv_msg is added to the user_msgs list twice. Subsequently, the
corrupted list leads to use-after-free when the memory is freed and
reused, and eventually a NULL pointer dereference when accessing
recv_msg->done.

The buggy sequence:

  sender() fails
    -> deliver_err_response(recv_msg)  // recv_msg queued for delivery
    -> goto restart                    // curr_msg not cleared!
  sender() fails again (same message!)
    -> deliver_err_response(recv_msg)  // tries to queue same recv_msg
    -> LIST CORRUPTION

Fix this by freeing the message and setting it to NULL on a send error.
Also, always free the newmsg on a send error, otherwise it will leak.
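
The buggy sequence and the effect of clearing the current message can be modeled in userspace C. This is a simplified sketch, not the driver's smi_work(): the structures, the single-slot `waiting` queue, the `sender_failures_left` test hook and the `apply_fix` switch are all hypothetical, and freeing of the message is left out because the messages here are owned by the caller.

```c
#include <assert.h>
#include <stddef.h>

struct recv_msg {
    int times_queued;   /* more than 1 models the "double add" */
};

struct smi_msg {
    struct recv_msg *recv;
};

struct intf {
    struct smi_msg *curr_msg;
    struct smi_msg *waiting;    /* single-slot stand-in for the xmit queue */
    int sender_failures_left;   /* test hook: how often sender() fails */
};

static int sender(struct intf *i, struct smi_msg *m)
{
    (void)m;
    if (i->sender_failures_left > 0) {
        i->sender_failures_left--;
        return -1;
    }
    return 0;
}

static void deliver_err_response(struct recv_msg *r)
{
    assert(r != NULL);
    r->times_queued++;
}

static void smi_work(struct intf *i, int apply_fix)
{
restart:
    if (!i->curr_msg) {
        i->curr_msg = i->waiting;   /* pull the next message, if any */
        i->waiting = NULL;
    }
    if (!i->curr_msg)
        return;
    if (sender(i, i->curr_msg) != 0) {
        deliver_err_response(i->curr_msg->recv);
        if (apply_fix)
            i->curr_msg = NULL;     /* drop it so it is never resent */
        goto restart;
    }
}
```

With `apply_fix` set to 0 and two sender failures, the same `recv_msg` is queued twice; with `apply_fix` set to 1 it is queued once and the loop exits because no message is left.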

Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/lkml/20260127-ipmi-v1-0-ba5cc90f516f@debian.org/
Fixes: 9cf93a8fa9 ("ipmi: Allow an SMI sender to return an error")
Cc: stable@vger.kernel.org # 4.18
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Corey Minyard <corey@minyard.net>
2026-02-03 21:02:54 -06:00
Linus Torvalds
b1ae17cd0f Merge tag 'for-linus-6.19-1' of https://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
 "Minor IPMI fixes:

   - Some device tree cleanups and a maintainer add

   - Fix a race when handling channel updates that could result in
     errors being reported to the user in some cases"

* tag 'for-linus-6.19-1' of https://github.com/cminyard/linux-ipmi:
  MAINTAINERS: Add entry on Loongson-2K IPMI driver
  dt-bindings: ipmi: Convert aspeed,ast2400-ibt-bmc to DT schema
  dt-bindings: ipmi: Convert nuvoton,npcm750-kcs-bmc to DT schema
  ipmi: Skip channel scan if channels are already marked ready
  ipmi: Fix __scan_channels() failing to rescan channels
  ipmi: Fix the race between __scan_channels() and deliver_response()
2025-12-05 20:49:24 -08:00
Linus Torvalds
4d38b88fd1 Merge tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:

 - Allow creating nbcon console drivers with an unsafe write_atomic()
   callback that can only be called by the final nbcon_atomic_flush_unsafe().
   Otherwise, the driver would rely on the kthread.

   It is going to be used as a best-effort approach for an
   experimental nbcon netconsole driver, see

     https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org

   Note that a safe .write_atomic() callback is supposed to work in NMI
   context. But some networking drivers are not safe even in IRQ
   context:

     https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35

   In an ideal world, all networking drivers would be fixed first and
   the atomic flush would be blocked only in NMI context. But that raises
   the question of how reliable networking drivers are when the system is
   in a bad state. They might block flushing more reliable serial
   consoles which are more suitable for serious debugging anyway.

 - Allow using the last 4 bytes of the printk ring buffer.

 - Prevent queuing IRQ work and block printk kthreads when consoles are
   suspended. Otherwise, they create unnecessary churn or even block
   the suspend.

 - Release console_lock() between each record in the kthread used for
   legacy consoles on RT. It might significantly speed up the boot.

 - Release nbcon context between each record in the atomic flush. It
   prevents stalls of the related printk kthread after it has lost the
   ownership in the middle of a record.

 - Add support for NBCON consoles into KDB

 - Add %ptSp modifier for printing struct timespec64 and use it where
   possible

 - Misc code clean up

* tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits)
  printk: Use console_is_usable on console_unblank
  arch: um: kmsg_dump: Use console_is_usable
  drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT
  lib/vsprintf: Unify FORMAT_STATE_NUM handlers
  printk: Avoid irq_work for printk_deferred() on suspend
  printk: Avoid scheduling irq_work on suspend
  printk: Allow printk_trigger_flush() to flush all types
  tracing: Switch to use %ptSp
  scsi: snic: Switch to use %ptSp
  scsi: fnic: Switch to use %ptSp
  s390/dasd: Switch to use %ptSp
  ptp: ocp: Switch to use %ptSp
  pps: Switch to use %ptSp
  PCI: epf-test: Switch to use %ptSp
  net: dsa: sja1105: Switch to use %ptSp
  mmc: mmc_test: Switch to use %ptSp
  media: av7110: Switch to use %ptSp
  ipmi: Switch to use %ptSp
  igb: Switch to use %ptSp
  e1000e: Switch to use %ptSp
  ...
2025-12-03 12:42:36 -08:00
Andy Shevchenko
0cfc283d18 ipmi: Switch to use %ptSp
Use %ptSp instead of open coded variants to print content of
struct timespec64 in human readable format.

Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251113150217.3030010-12-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-19 12:26:06 +01:00
Jinhui Guo
1c35d80275 ipmi: Skip channel scan if channels are already marked ready
Channels remain static unless the BMC firmware changes.
Therefore, rescanning is unnecessary while they are marked
ready and no BMC update has occurred.

Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-4-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-10-14 15:52:58 -05:00
Jinhui Guo
6bd30d8fc5 ipmi: Fix __scan_channels() failing to rescan channels
channel_handler() sets intf->channels_ready to true but never
clears it, so __scan_channels() skips any rescan. When the BMC
firmware changes a rescan is required. Allow it by clearing
the flag before starting a new scan.

Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-3-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-10-14 15:52:58 -05:00
Jinhui Guo
936750fdba ipmi: Fix the race between __scan_channels() and deliver_response()
The race window between __scan_channels() and deliver_response() causes
the parameters of some channels to be set to 0.

1.[CPUA] __scan_channels() issues an IPMI request and waits with
         wait_event() until all channels have been scanned.
         wait_event() internally calls might_sleep(), which might
         yield the CPU. (Moreover, an interrupt can preempt
         wait_event() and force the task to yield the CPU.)
2.[CPUB] deliver_response() is invoked when the CPU receives the
         IPMI response. After processing an IPMI response,
         deliver_response() directly assigns intf->wchannels to
         intf->channel_list and sets intf->channels_ready to true.
         However, not all channels are actually ready for use.
3.[CPUA] Since intf->channels_ready is already true, wait_event()
         never enters __wait_event(). __scan_channels() immediately
         clears intf->null_user_handler and exits.
4.[CPUB] Once intf->null_user_handler is set to NULL, deliver_response()
         ignores further IPMI responses, leaving the remaining
         channels zero-initialized and unusable.

CPUA                             CPUB
-------------------------------  -----------------------------
__scan_channels()
 intf->null_user_handler
       = channel_handler;
 send_channel_info_cmd(intf,
       0);
 wait_event(intf->waitq,
       intf->channels_ready);
  do {
   might_sleep();
                                 deliver_response()
                                  channel_handler()
                                   intf->channel_list =
                                         intf->wchannels + set;
                                   intf->channels_ready = true;
                                   send_channel_info_cmd(intf,
                                         intf->curr_channel);
   if (condition)
    break;
   __wait_event(wq_head,
          condition);
  } while(0)
 intf->null_user_handler
       = NULL;
                                 deliver_response()
                                  if (!msg->user)
                                   if (intf->null_user_handler)
                                    rv = -EINVAL;
                                  return rv;
-------------------------------  -----------------------------

Fix the race between __scan_channels() and deliver_response() by
deferring both the assignment intf->channel_list = intf->wchannels
and the flag intf->channels_ready = true until all channels have
been successfully scanned or until the IPMI request has failed.
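
The deferred publication can be sketched in userspace C. This is an illustrative model only: `MAX_CHANNELS`, the `chan` and `intf` layouts, and the signature of `channel_handler` are assumptions made for the example, and no real waiting or locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHANNELS 4

struct chan {
    int medium;
};

struct intf {
    struct chan wchannels[MAX_CHANNELS]; /* working copy being filled in */
    const struct chan *channel_list;     /* published list, NULL until ready */
    bool channels_ready;
    int curr_channel;
};

/*
 * Handle the response for one channel. Returns true while more channels
 * remain to be queried. The list is published only after the last
 * response or on failure, never after the first response.
 */
static bool channel_handler(struct intf *i, int medium, bool failed)
{
    assert(i->curr_channel < MAX_CHANNELS);
    if (!failed) {
        i->wchannels[i->curr_channel].medium = medium;
        i->curr_channel++;
        if (i->curr_channel < MAX_CHANNELS)
            return true;    /* keep scanning; nothing published yet */
    }
    i->channel_list = i->wchannels;
    i->channels_ready = true;   /* a waiter may proceed only from here on */
    return false;
}
```

A waiter that tests `channels_ready` between responses now sees false until the scan has finished, so it cannot clear the handler while responses are still outstanding.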

Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-2-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-10-14 15:52:58 -05:00
Guenter Roeck
e2c69490dd ipmi: Fix handling of messages with provided receive message pointer
Prior to commit b52da4054e ("ipmi: Rework user message limit handling"),
i_ipmi_request() used to increase the user reference counter if the receive
message is provided by the caller of IPMI API functions. This is no longer
the case. However, ipmi_free_recv_msg() is still called and decreases the
reference counter. This results in the reference counter reaching zero,
the user data pointer is released, and all kinds of interesting crashes are
seen.

Fix the problem by increasing user reference counter if the receive message
has been provided by the caller.
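
The reference-count balance can be illustrated with a small userspace model. The types and the helpers `recv_msg_attach`, `recv_msg_free` and `request` are hypothetical simplifications, not the functions in ipmi_msghandler.c.

```c
#include <assert.h>
#include <stddef.h>

struct user {
    int refcount;
};

struct recv_msg {
    struct user *user;
};

/* Attach a receive message to a user, taking a reference for it. */
static void recv_msg_attach(struct recv_msg *m, struct user *u)
{
    assert(u->refcount > 0);
    u->refcount++;      /* this is the reference the free path drops */
    m->user = u;
}

/* Free path: drops exactly one reference if a user is attached. */
static void recv_msg_free(struct recv_msg *m)
{
    if (m->user) {
        m->user->refcount--;
        m->user = NULL;
    }
}

/* Request path: use the caller's message if given, else a fallback. */
static struct recv_msg *request(struct user *u, struct recv_msg *supplied,
                                struct recv_msg *fallback)
{
    struct recv_msg *m = supplied ? supplied : fallback;

    /* Taken in both cases, so the later free stays balanced. */
    recv_msg_attach(m, u);
    return m;
}
```

If the attach step were skipped for caller-supplied messages, the free path would still decrement, and the user's count would reach zero while the user is still in use.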

Fixes: b52da4054e ("ipmi: Rework user message limit handling")
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20251006201857.3433837-1-linux@roeck-us.net>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-10-07 06:50:08 -05:00
Binbin Zhou
d46651d4e3 ipmi: Add Loongson-2K BMC support
This patch adds Loongson-2K BMC IPMI support.

According to the existing design, we use software simulation to
implement the KCS interface registers: Status/Command/Data_Out/Data_In.

Also, since both the host side and the BMC side read and write the KCS
status, a FIFO flag is used to ensure data consistency.

The single KCS message block is as follows:

+-------------------------------------------------------------------------+
|FIFO flags| KCS register data | CMD data | KCS version | WR REQ | WR ACK |
+-------------------------------------------------------------------------+

Co-developed-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Acked-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Message-ID: <8f9ffb6f0405345af8f04193ce1510aacd075e72.1756987761.git.zhoubinbin@loongson.cn>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-16 10:15:54 -05:00
Corey Minyard
bc3a9d2177 ipmi:si: Gracefully handle if the BMC is non-functional
If the BMC is not functional, the driver goes into an error state and
starts a 1 second timer.  When the timer times out, it will attempt a
simple message.  If the BMC interacts correctly, the driver will start
accepting messages again.  If not, it remains in error state.

If the driver goes into error state, all messages current and pending
will return with an error.

This should more gracefully handle when the BMC becomes non-operational,
as opposed to trying each message individually and failing them.
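
The behaviour described above can be sketched as a small state machine in userspace C. This is a model under assumptions, not the ipmi_si code: the `smi` structure, the function names, the `SI_NORMAL` state name and the -EIO error value are hypothetical, and the timer is represented by a plain function call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum si_state { SI_NORMAL, SI_HOSED };

struct smi {
    enum si_state state;
    int pending;    /* number of queued messages */
    int failed;     /* messages returned with an error */
};

/* BMC failure detected: fail everything queued and enter the error state. */
static void enter_hosed(struct smi *s)
{
    s->failed += s->pending;
    s->pending = 0;
    s->state = SI_HOSED;
}

/* New requests are refused while in the error state. */
static int submit(struct smi *s)
{
    if (s->state == SI_HOSED)
        return -EIO;
    s->pending++;
    return 0;
}

/* Timer expiry (1 second in the commit): probe with a simple message. */
static void hosed_timeout(struct smi *s, bool probe_ok)
{
    assert(s->state == SI_HOSED);
    if (probe_ok)
        s->state = SI_NORMAL;   /* start accepting messages again */
    /* otherwise remain in SI_HOSED until the next timeout */
}
```

The point of the design is that a dead BMC costs one probe per timeout instead of one failed attempt per queued message.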

Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-08 10:21:41 -05:00
Corey Minyard
3bc54ab3b9 ipmi: Rename "user_data" to "recv_msg" in an SMI message
It's only used to hold the corresponding receive message, so fix the
name to make that clear and the type so nothing else can be accidentally
assigned to it.

Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-08 10:21:41 -05:00
Corey Minyard
9cf93a8fa9 ipmi: Allow an SMI sender to return an error
Getting ready for handling when a BMC is non-responsive or broken, allow
the sender operation to fail in an SMI.  If it was a user-generated
message it will return the error.

The powernv code was already doing this internally, but the way it was
written could result in deep stack descent if there were a lot of
messages queued.  Have its send return an error in this case.

Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-08 10:21:41 -05:00
Corey Minyard
abe4918a94 ipmi:si: Move flags get start to its own function
It's about to be used from another place, and this looks better,
anyway.

Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-08 10:21:41 -05:00
Corey Minyard
753bc23d8f ipmi:si: Merge some if statements
Changes resulted in a silly looking piece of logic.  Get rid of a goto
and use if statements properly.

Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-08 10:21:41 -05:00
Corey Minyard
bbfb8353cb ipmi: Set a timer for maintenance mode
Now that maintenance mode rejects all messages, there's nothing to
run the timer.  Make sure the timer is running in maintenance mode.

Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Frederick Lawler <fred@cloudflare.com>
2025-09-08 10:21:41 -05:00
Corey Minyard
627118470f ipmi: Add a maintenance mode sysfs file
So you can see if it's in maintenance mode and see how long is left.

Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Frederick Lawler <fred@cloudflare.com>
2025-09-08 10:21:41 -05:00
Corey Minyard
30f6c9d545 ipmi: Disable sysfs access and requests in maintenance mode
If the driver goes into any maintenance mode, disable sysfs access until
it is done.

If the driver goes into reset maintenance mode, disable all messages
until it is done.

Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Frederick Lawler <fred@cloudflare.com>
2025-09-08 10:21:41 -05:00
Corey Minyard
e5feb030d9 ipmi: Differentiate between reset and firmware update in maintenance
This allows later changes to have different behaviour during a reset
versus a firmware update.

Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Frederick Lawler <fred@cloudflare.com>
2025-09-08 10:21:40 -05:00
Corey Minyard
b52da4054e ipmi: Rework user message limit handling
The limit on the number of user messages had a number of issues,
improper counting in some cases and a use after free.

Restructure how this is all done to handle more in the receive message
allocation routine, so all refcounting and user message limit counts
are done in that routine.  It's a lot cleaner and safer.

Reported-by: Gilles BULOZ <gilles.buloz@kontron.com>
Closes: https://lore.kernel.org/lkml/aLsw6G0GyqfpKs2S@mail.minyard.net/
Fixes: 8e76741c3d ("ipmi: Add a limit on the number of users that may use IPMI")
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Gilles BULOZ <gilles.buloz@kontron.com>
2025-09-08 10:21:28 -05:00
Corey Minyard
5d09ee1bec Revert "ipmi: fix msg stack when IPMI is disconnected"
This reverts commit c608966f3f.

This patch has a subtle bug that can cause the IPMI driver to go into an
infinite loop if the BMC misbehaves in a certain way.  Apparently
certain BMCs do misbehave this way because several reports have come in
recently about this.

Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Eric Hagberg <ehagberg@janestreet.com>
Cc: <stable@vger.kernel.org> # 6.2
2025-09-08 10:08:25 -05:00
Corey Minyard
8fd8ea2869 ipmi:msghandler:Change seq_lock to a mutex
Dan Carpenter got a Smatch warning:

	drivers/char/ipmi/ipmi_msghandler.c:5265 ipmi_free_recv_msg()
	warn: sleeping in atomic context

due to the recent rework of the IPMI driver's locking.  I didn't realize
vfree could block.  But there is an easy solution to this, now that
almost everything in the message handler runs in thread context.

I wanted to spend the time earlier to see if seq_lock could be converted
from a spinlock to a mutex, but I wanted the previous changes to go in
and soak before I did that.  So I went ahead and did the analysis, and
converting should work and solve this problem.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202503240244.LR7pOwyr-lkp@intel.com/
Fixes: 3be997d5a6 ("ipmi:msghandler: Remove srcu from the ipmi user structure")
Cc: <stable@vger.kernel.org> # 6.16
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-09-08 10:08:14 -05:00
Linus Torvalds
d244f9bb59 Merge tag 'for-linus-6.17-1' of https://github.com/cminyard/linux-ipmi
Pull ipmi updates from Corey Minyard:
 "Some small fixes for the IPMI driver

  Nothing huge, some rate limiting on logs, a strncpy fix where the
  source and destination could be the same, and removal of some unused
  cruft"

* tag 'for-linus-6.17-1' of https://github.com/cminyard/linux-ipmi:
  ipmi: Use dev_warn_ratelimited() for incorrect message warnings
  char: ipmi: remove redundant variable 'type' and check
  ipmi: Fix strcpy source and destination the same
2025-08-07 07:38:25 +03:00
Breno Leitao
ec50ec378e ipmi: Use dev_warn_ratelimited() for incorrect message warnings
During BMC firmware upgrades on live systems, the ipmi_msghandler
generates excessive "BMC returned incorrect response" warnings
while the BMC is temporarily offline. This can flood system logs
in large deployments.

Replace dev_warn() with dev_warn_ratelimited() to throttle these
warnings and prevent log spam during BMC maintenance operations.
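
The general idea of throttling a warning can be sketched with a simple window-and-burst limiter in userspace C. This is a generic illustration and makes no claim about how the kernel's dev_warn_ratelimited() is implemented; the `ratelimit` structure and `ratelimit_ok` function are hypothetical, and time is passed in explicitly to keep the example deterministic.

```c
#include <assert.h>
#include <stdbool.h>

struct ratelimit {
    long interval;  /* window length, in caller-defined time units */
    int burst;      /* messages allowed per window */
    long begin;     /* start of the current window */
    int printed;    /* messages emitted in the current window */
    int missed;     /* messages suppressed in the current window */
};

/* Returns true if a message may be emitted at time "now". */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
    assert(rl->interval > 0 && rl->burst > 0);
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;    /* new window: reset the counters */
        rl->printed = 0;
        rl->missed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}
```

A caller would wrap its warning in `if (ratelimit_ok(&rl, now))`, so a BMC that stays offline for minutes produces a bounded number of log lines per window.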

Signed-off-by: Breno Leitao <leitao@debian.org>
Message-ID: <20250710-ipmi_ratelimit-v1-1-6d417015ebe9@debian.org>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-07-10 07:59:43 -05:00
Colin Ian King
f6f9760320 char: ipmi: remove redundant variable 'type' and check
The variable 'type' is assigned the value SI_INVALID, which is zero,
and the later check that 'type' is non-zero is always false. The
variable is not referenced anywhere else, so it is redundant and
so is the check. Remove both.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Message-ID: <20250708151805.1893858-1-colin.i.king@gmail.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-07-08 12:15:44 -05:00
Corey Minyard
8ffcb7560b ipmi: Fix strcpy source and destination the same
The source and destination of some strcpy operations were the same.
Split out the part of the operations that needed to be done for those
particular calls so the unnecessary copy wasn't done.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506140756.EFXXvIP4-lkp@intel.com/
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-06-13 19:06:26 -05:00
Ingo Molnar
41cb08555c treewide, timers: Rename from_timer() to timer_container_of()
Move this API to the canonical timer_*() namespace.

[ tglx: Redone against pre rc1 ]

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-08 09:07:37 +02:00
Dan Carpenter
fa332f5dc6 ipmi:msghandler: Fix potential memory corruption in ipmi_create_user()
The "intf" list iterator is an invalid pointer if the correct
"intf->intf_num" is not found.  Calling atomic_dec(&intf->nr_users) on
an invalid pointer will lead to memory corruption.

We don't really need to call atomic_dec() if we haven't called
atomic_add_return() so update the if (intf->in_shutdown) path as well.
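
The safe lookup pattern can be shown with a userspace sketch. It uses a plain singly linked list instead of the kernel's list helpers, and `find_intf`, `create_user`, `max_users` and the error values are hypothetical choices for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct intf {
    int intf_num;
    int nr_users;
    struct intf *next;
};

/* Return the matching interface, or NULL; never a stale iterator. */
static struct intf *find_intf(struct intf *head, int intf_num)
{
    for (struct intf *i = head; i; i = i->next)
        if (i->intf_num == intf_num)
            return i;
    return NULL;
}

static int create_user(struct intf *head, int intf_num, int max_users)
{
    struct intf *intf = find_intf(head, intf_num);

    assert(max_users > 0);
    if (!intf)
        return -EINVAL;     /* no count was taken, so nothing to undo */
    if (intf->nr_users >= max_users)
        return -EBUSY;      /* checked before incrementing */
    intf->nr_users++;
    return 0;
}
```

The error paths that run before the increment return without touching `nr_users`, which mirrors the commit's point that the decrement is only needed after a successful increment.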

Fixes: 8e76741c3d ("ipmi: Add a limit on the number of users that may use IPMI")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Message-ID: <aBjMZ8RYrOt6NOgi@stanley.mountain>
Signed-off-by: Corey Minyard <corey@minyard.net>
2025-05-07 17:25:48 -05:00
Corey Minyard
971a00454d ipmi:watchdog: Use the new interface for panic messages
It's available, remove all the duplicate code.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
6f7f6605c9 ipmi:msghandler: Export and fix panic messaging capability
Don't have the other users that do things at panic time (the watchdog)
do all this themselves, provide a function to do it.

Also, with the new design where most stuff happens at thread context,
a few things needed to be fixed to avoid doing locking in a panic
context.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
6bd0eb6d75 ipmi:ssif: Fix a shutdown race
It was possible for the SSIF thread to stop and quit before the
kthread_stop() call because ssif->stopping was set before the
stop.  So only exit the SSIF thread if kthread_should_stop()
returns true.

There is no need to wake the thread, as the wait will be interrupted
by kthread_stop().

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
87105e0780 ipmi:msghandler: Don't deliver messages to deleted users
Check to see if they have been destroyed before trying to deliver a
message.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
ada2abadda ipmi:si: Rework startup of IPMI devices
It is possible in some situations that IPMI devices won't get started up
properly.  This change makes it so all non-duplicate devices will get
started up.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
ed59cd28af ipmi:msghandler: Add a error return from unhandle LAN cmds
If we get a command from a LAN channel, return an error instead of just
throwing it away.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
8871e77ec7 ipmi:msghandler: Shut down lower layer first at unregister
This makes sure any outstanding messages are returned to the user before
the interface is cleaned up.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
60afcc429c ipmi:msghandler: Remove proc_fs.h
It's no longer used.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
f2a31163d6 ipmi:msghandler: Don't check for shutdown when returning responses
The lower level interface shouldn't attempt to unregister if it has a
callback in the pending queue.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
ff2d2bc9f2 ipmi:msghandler: Don't acquire a user refcount for queued messages
Messages already have a refcount for the user, so there's no need to
account for a new one.

As part of this, grab a refcount to the interface when processing
received messages.  Freeing a message there can cause the user, and
then the interface, to be freed.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
84fe1ebcc9 ipmi:msghandler: Fix locking around users and interfaces
Now that SRCU is gone from IPMI, it can no longer be sloppy about
locking.  Use the users mutex now when sending a message, not the big
ipmi_interfaces mutex, because it can result in a recursive lock.  The
users mutex will work because the interface destroy code claims it after
setting the interface in shutdown mode.

Also, due to the same changes, rework the refcounting on users and
interfaces.  Remove the refcount to an interface when the user is
freed, not when it is destroyed.  If the interface is destroyed
while the user still exists, the user will still point to the
interface to test that it is valid if the user tries to do anything
but delete the user.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00
Corey Minyard
83d19f03f3 ipmi:msghandler: Remove some user level processing in panic mode
When run to completion is set, don't call things that will claim
mutexes or call user callbacks.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2025-05-07 17:25:48 -05:00