Commit 38b7cc448a ("gpu: nova-core: implement Display for Spec") in
drm-rust-next introduced some usage of the Display trait, but the
Display trait is being modified in the rust tree this cycle. Thus, to
avoid conflicts with the Rust tree, tweak how the formatting machinery
is used in a way where it works both with and without the changes in the
Rust tree.
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20251117-nova-fmt-rust-v1-1-651ca28cd98f@google.com
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Pass in a PCI device to Spec::new(), and provide a Display
implementation for boot42, in order to provide a clear, concise report
of what happened: the driver read NV_PMC_BOOT42, and found that the GPU
is not supported.
For very old GPUs (older than Fermi), the driver still returns ENODEV,
but it does so without a driver-specific dmesg report. That is exactly
appropriate, because if such a GPU is installed, it can only be
supported by Nouveau. And if so, the user is not helped by additional
error messages from Nova.
Here's the full dmesg output for a Blackwell (not yet supported) GPU:
NovaCore 0000:01:00.0: Probe Nova Core GPU driver.
NovaCore 0000:01:00.0: Unsupported chipset: boot42 = 0x1b2a1000 (architecture 0x1b, implementation 0x2)
NovaCore 0000:01:00.0: probe with driver NovaCore failed with error -524
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: fix commit log with ENODEV (not ENOTSUPP) error
code for unsupported GPUs.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251115010923.1192144-5-jhubbard@nvidia.com>
NVIDIA GPUs are moving away from using NV_PMC_BOOT_0 to contain
architecture and revision details, and will instead use NV_PMC_BOOT_42
in the future. NV_PMC_BOOT_0 will contain a specific set of values
that will mean "go read NV_PMC_BOOT_42 instead".
Change the selection logic in Nova so that it will claim Turing and
later GPUs. This will work for the foreseeable future, without any
further code changes here, because all NVIDIA GPUs are considered, from
the oldest supported on Linux (NV04), through the future GPUs.
Add some comment documentation to explain, chronologically, how boot0
and boot42 change with the GPU eras, and how that affects the selection
logic.
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: remove unneeded `From<BOOT_0> for Revision`
implementation.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251115010923.1192144-4-jhubbard@nvidia.com>
After GSP initialization is complete, retrieve the static configuration
information from GSP-RM. This information includes GPU name, capabilities,
memory configuration, and other properties. On some GPU variants, it is
also required to do this for initialization to complete.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: properly abstract the command's bindings, add
relevant methods, make str_from_null_terminated return an Option, fix
size of GPU name array.]
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-14-joelagnelf@nvidia.com>
This adds the GSP init done command to wait for GSP initialization
to complete. Once this command has been received the GSP is fully
operational and will respond properly to normal RPC commands.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[acourbot@nvidia.com: move new definitions to end of commands.rs, rename
to `wait_gsp_init_done` and remove timeout argument.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-13-joelagnelf@nvidia.com>
These opcodes are used for register write, modify, poll and store (save)
sequencer operations.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
[acourbot@nvidia.com: apply Lyude's suggested fixes.]
Message-ID: <20251114195552.739371-9-joelagnelf@nvidia.com>
Implement the GSP sequencer which culminates in INIT_DONE message being
received from the GSP indicating that the GSP has successfully booted.
This is just initial sequencer support, the actual commands will be
added in the next patches.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: move GspSequencerInfo definition before its impl
blocks and rename it to GspSequence, adapt imports in sequencer.rs to
new formatting rules, remove `timeout` argument to harmonize with other
commands.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-8-joelagnelf@nvidia.com>
Move falcon reading/writing to mbox functionality into helper so we can
use it from the sequencer resume flow.
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot@nvidia.com: make write/read mailbox methods unfallible.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251114195552.739371-4-joelagnelf@nvidia.com>
Add definition for RISCV_CPUCTL register and use it in a new falcon API
to check if the RISC-V core of a Falcon is active. It is required by
the sequencer to know if the GSP's RISCV processor is active.
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-13-8ae4058e3c0e@nvidia.com>
Add support for sending the SetRegistry command, which is critical to
GSP initialization.
The RM registry is serialized into a packed format and sent via the
command queue. For now only three parameters which are required to boot
GSP are hardcoded. In the future a kernel module parameter will be added
to enable other parameters to be added.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
[acourbot@nvidia.com: split into its own patch.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-12-8ae4058e3c0e@nvidia.com>
Initialise the GSP resource manager arguments (rmargs) which provides
initialisation parameters to the GSP firmware during boot. The rmargs
structure contains arguments to configure the GSP message/command queue
location.
These are mapped for coherent DMA and added to the libos data structure
for access when booting GSP.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-10-8ae4058e3c0e@nvidia.com>
This commit introduces core infrastructure for handling GSP command and
message queues in the nova-core driver. The command queue system enables
bidirectional communication between the host driver and GSP firmware
through a remote message passing interface.
The interface is based on passing serialised data structures over a ring
buffer with separate transmit and receive queues. Commands are sent by
writing to the CPU transmit queue and waiting for completion via the
receive queue.
To ensure safety mutable or immutable (depending on whether it is a send
or receive operation) references are taken on the command queue when
allocating the message to write/read to. This ensures message memory
remains valid and the command queue can't be mutated whilst an operation
is in progress.
Currently this is only used by the probe() routine and therefore can
only used by a single thread of execution. Locking to enable safe access
from multiple threads will be introduced in a future series when that
becomes necessary.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-9-8ae4058e3c0e@nvidia.com>
Derive the Zeroable trait for existing bindgen generated bindings. This
is safe because all bindgen generated types are simple integer types for
which any bit pattern, including all zeros, is valid.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-7-8ae4058e3c0e@nvidia.com>
A data structure that can be used to write across multiple slices which
may be out of order in memory. This lets SBuffer user correctly and
safely write out of memory order, without error-prone tracking of
pointers/offsets.
let mut buf1 = [0u8; 3];
let mut buf2 = [0u8; 5];
let mut sbuffer = SBuffer::new([&mut buf1[..], &mut buf2[..]]);
let data = b"hello";
let result = sbuffer.write(data);
Reviewed-by: Lyude Paul <lyude@redhat.com>
Co-developed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-6-8ae4058e3c0e@nvidia.com>
The GSP requires some pieces of metadata to boot. These are passed in a
struct which the GSP transfers via DMA. Create this struct and get a
handle to it for future use when booting the GSP.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-5-8ae4058e3c0e@nvidia.com>
The GSP requires several areas of memory to operate. Each of these have
their own simple embedded page tables. Set these up and map them for DMA
to/from GSP using CoherentAllocation's. Return the DMA handle describing
where each of these regions are for future use when booting GSP.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-4-8ae4058e3c0e@nvidia.com>
There are times where we need to store a constant value defined as a
larger type (e.g. through a binding) into a smaller type, knowing
that the value will fit. Rust, unfortunately, only provides us with the
`as` operator for that purpose, the use of which is discouraged as it
silently strips data.
Extend the `num` module with functions allowing to perform the
conversion infallibly, at compile time.
Example:
const FOO_VALUE: u32 = 1;
// `FOO_VALUE` fits into a `u8`, so the conversion is valid.
let foo = num::u32_to_u8::<{ FOO_VALUE }>();
We are going to use this feature extensively in Nova.
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
[acourbot@nvidia.com: fix mistake in example pointed out by Mikko.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-3-8ae4058e3c0e@nvidia.com>
Compute more of the required FB layout information to boot the GSP
firmware.
This information is dependent on the firmware itself, so first we need
to import and abstract the required firmware bindings in the `nvfw`
module.
Then, a new FB HAL method is introduced in `fb::hal` that uses these
bindings and hardware information to compute the correct layout
information.
This information is then used in `fb` and the result made visible in
`FbLayout`.
These 3 things are grouped into the same patch to avoid lots of unused
warnings that would be tedious to work around. As they happen in
different files, they should not be too difficult to track separately.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251110-gsp_boot-v9-1-8ae4058e3c0e@nvidia.com>
There are a few remaining cases where we *do* want to use `as`,
because we specifically want to strip the data that does not fit into
the destination type. Comment these uses to clear confusion about the
intent.
Acked-by: Danilo Krummrich <dakr@kernel.org>
[acourbot@nvidia.com: fix merge conflicts after rebase.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-as-v3-6-6a30c7333ad9@nvidia.com>
Use the newly-introduced `num` module to replace the use of `as`
wherever it is safe to do. This ensures that a given conversion cannot
lose data if its source or destination type ever changes.
Acked-by: Danilo Krummrich <dakr@kernel.org>
[acourbot@nvidia.com: fix merge conflicts after rebase.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-as-v3-5-6a30c7333ad9@nvidia.com>
The core library's `From` implementations do not cover conversions
that are not portable or future-proof. For instance, even though it is
safe today, `From<usize>` is not implemented for `u64` because of the
possibility to support larger-than-64bit architectures in the future.
However, the kernel supports a narrower set of architectures, and in the
case of Nova we only support 64-bit. This makes it helpful and desirable
to provide more infallible conversions, lest we need to rely on the `as`
keyword and carry the risk of silently losing data.
Thus, introduce a new module `num` that provides safe const functions
performing more conversions allowed by the build target, as well as
`FromSafeCast` and `IntoSafeCast` traits that are just extensions of
`From` and `Into` to conversions that are known to be lossless.
Suggested-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/rust-for-linux/DDK4KADWJHMG.1FUPL3SDR26XF@kernel.org/
Acked-by: Danilo Krummrich <dakr@kernel.org>
[acourbot@nvidia.com: fix merge conflicts after rebase.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-as-v3-4-6a30c7333ad9@nvidia.com>
This patch solves one of the existing mentions of COHA, a task
in the Nova task list about improving the `CoherentAllocation` API.
It uses the `write` method from `CoherentAllocation`.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-3-delcastillodelarosadaniel@gmail.com>
Some comments that already existed didn't start with a capital
letter, this patch fixes that.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-2-delcastillodelarosadaniel@gmail.com>
This patch solves one of the existing mentions of COHA, a task
in the Nova task list about improving the `CoherentAllocation` API.
It uses the new `from_bytes` method from the `FromBytes` trait as
well as the `as_slice` and `as_slice_mut` methods from
`CoherentAllocation`.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".]
[acourbot@nvidia.com: fix merge conflict after imports refactor.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-1-delcastillodelarosadaniel@gmail.com>
As per [1], we need one "use" item per line, in order to reduce merge
conflicts. Furthermore, we need a trailing ", //" in order to tell
rustfmt(1) to leave it alone.
This does that for the entire nova-core driver.
[1] https://docs.kernel.org/rust/coding-guidelines.html#imports
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: remove imports already in prelude as pointed out
by Danilo.]
[acourbot@nvidia.com: remove a few unneeded trailing `//`.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251107021006.434109-1-jhubbard@nvidia.com>
There are a few situations in the driver where we convert a `usize` into
a `u32` using `as`. Even though most of these are obviously correct, use
`try_from` and let the compiler optimize wherever it is safe to do so.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-as-v3-3-6a30c7333ad9@nvidia.com>
The `as` operator is best avoided as it silently drops bits if the
destination type is smaller that the source.
For data types where this is clearly not the case, use `from` to
unambiguously signal that these conversions are lossless.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-as-v3-1-6a30c7333ad9@nvidia.com>