Until LNL, intel_dsb_wait_vblanks() used to wait for the undelayed vblank
start. However, from PTL onwards, it waits for the start of the
safe-window defined by the number of lines programmed in the register
TRANS_SET_CONTEXT_LATENCY. This change was introduced to move the SCL
window out of the vblank region, supporting modes with higher refresh
rates and smaller vblanks. This change introduces a "safe window" a
scanline range from (undelayed vblank - SCL) to (delayed vblank - SCL).
As a result, on PTL+ platforms, the DSB wait for vblank completes exactly
SCL lines earlier than the undelayed vblank start (safe window start).
If the flip occurs in the active region and the push happens before the
vmin decision boundary, the DSB wait fires early, and the push is sent
inside this safe window. In such cases, the push bit is cleared at the
delayed vblank, but our wait logic does not account for the early trigger,
leading to DSB poll errors.
To fix this, we add an explicit wait for the end of the safe window i.e.,
the scanline range from (undelayed vblank - SCL) to (delayed vblank - SCL).
Once past this window, we are exactly SCL lines away from the delayed
vblank, and our existing wait logic works as intended.
This additional wait is only effective if the push occurs before the vmin
decision boundary. If the push happens after the boundary, the hardware
already guarantees we're SCL lines away from the delayed vblank, and the
extra wait becomes a no-op.
v2:
- Use helpers for safe window start/end. (Ville)
- Move the extra wait inside the helper to wait for delayed vblank. (Ville)
- Update the commit message.
v3:
- Add more documentation for explanation for the wait. (Ville)
- Rename intel_vrr_vmin_safe_window_start/end as this is vmin safe
window. (Ville)
- Minor refactoring to align with the code. (Ville)
- Update the commit message for more clarity.
v4:
- Retain name for intel_vrr_safe_window_start as it doesn't change with
vmin/vmax etc. (Ville)
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/20250924141542.3122126-7-ankit.k.nautiyal@intel.com
The helper intel_dsb_wait_vblank_delay() is used in DSB to wait for the
delayed vblank after the send push operation. Rename it to
intel_dsb_wait_for_delayed_vblank() to align with the semantics.
v2: Rename to intel_dsb_wait_vblank_delay instead of the proposed SCL
semantics, as this will be ot only about SCL lines with different timing
generator and different refresh rate modes. (Ville)
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/20250924141542.3122126-6-ankit.k.nautiyal@intel.com
For now guardband is equal to the vblank length so ideally it should be
computed as difference between the vmin vtotal and vactive. However
since we are having few lines as SCL, we need to account for this while
computing the guardband.
Since the vblank start is moved by SCL lines from the vactive, the delta
between the vmin vtotal and new vblank start was used to account for this.
Now that SCL is explicitly tracked using the `set_context_latency` member,
use it directly in the guardband calculation.
In the future, when the guardband is shortened or optimized, we may need
to factor in both the change in the vblank start and SCL lines. For now,
explicitly accounting for SCL is sufficient.
v2: Fix typo: replace adjusted_mode->vdisplay with
adjusted_mode->crtc_vdisplay. (Ville)
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/20250924141542.3122126-5-ankit.k.nautiyal@intel.com
'Set context latency' (SCL, Window W2) is defined as the number of lines
before the double buffering point, which are required to complete
programming of the registers, typically when DSB is used to program the
display registers.
Since we are not using this window for programming the registers, this
is mostly set to 0, unless there is a requirement for few cases related
to PSR/PR where the 'set context latency' should be at least 1.
Currently we are using the 'set context latency' (if required) implicitly
by moving the vblank start by the required amount and then measuring the
delay i.e. the difference between undelayed vblank start and delayed vblank
start.
Since our guardband matches the vblank length, this was not a problem as
the difference between the undelayed vblank and delayed vblank was at
the most equal to the 'set context latency' lines.
However, if we want to optimize the guardband, the difference between the
undelayed and the delayed vblank will be large and we cannot use this
difference as the 'set context latency' lines.
To make way for this optimization of guardband, formally introduce the
'set context latency' or SCL and track it as a new member
`set_context_latency` of the structure intel_crtc_state.
Eventually, all references of vblank delay where we mean to use set
context latency will be replaced by this new `set_context_latency`
member.
Note: for TGL the TRANS_SET_CONTEXT_LATENCY doesn't exist to account for
the SCL. However, the VBLANK_START-VACTIVE difference plays an identical
role here ie. it can be used to create the SCL window ahead of the
undelayed vblank.
While readback since there is no specific register to read out the SCL, use
the difference between vblank start and vactive to populate the new member
for TGL.
v2:
- Use u16 for set_context_latency. (Ville)
- s/vblank_delay/set_context_latency. (Ville)
- Meld the changes for TGL with this change. (Ville)
v3:
- Update comment to clarify the TGL case. (Ville)
- Fix typo in commit message.
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/20250924141542.3122126-3-ankit.k.nautiyal@intel.com
Split out display irq handling on ilk. Since the master IRQ enable is in
DEIIR, we'll need to do this in two parts. First, add
ilk_display_irq_master_disable() to disable master and south interrupts,
and second, add (repurposed) ilk_display_irq_handler() to finish display
irq handling.
It's not the prettiest thing you ever saw, but improves separation of
display irq handling. And removes HAS_PCH_NOP() and DISPLAY_VER() checks
from core irq code.
v2:
- Separate ilk_display_irq_master_enable() (Ville)
- Use _fw mmio accessors (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/e8ea7c985c3f3a80870f3333bde2e1bf30d653b0.1758637773.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
On PTL, no combo PHY is connected to PORT B. However, PORT B can
still be used for Type-C and will utilize the C20 PHY for eDP
over Type-C. In such configurations, VBTs also enumerate PORT B.
This leads to issues where PORT B is incorrectly identified as using the
C10 PHY, due to the assumption that returning true for PORT B in
intel_encoder_is_c10phy() would not cause problems.
From PTL's perspective, only PORT A/PHY A uses the C10 PHY.
Update the helper intel_encoder_is_c10phy() to return true only for
PORT A/PHY on PTL.
v2: Change the condition code style for ptl/wcl
Bspec: 72571,73944
Fixes: 9d10de78a3 ("drm/i915/wcl: C10 phy connected to port A and B")
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250922150317.2334680-4-dnyaneshwar.bhadane@intel.com
Bump the latency for all watermark levels in the
16Gb+ DIMM w/a. The spec does ask us to do it only for level
0, but it seems more sane to bump all the levels. If the actual
memory access is slower then the wakeup (WM1+) should also
potentially happen earlier.
This also avoids the theoretical case that WM0 would get bumped
higher than WM1+. Not that it is likely to happen because the WM0
latency is always significantly lower than the WM1 latency.
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250919193000.17665-9-ville.syrjala@linux.intel.com
If WM0 latency is zero we need to bump it (and the WM1+ latencies)
but a fixed amount. But any WM1+ level with zero latency must
not be touched since that indicates that corresponding WM level
isn't supported.
Currently the loop doing that adjustment does work, but only because
the previous loop modified the num_levels used as the loop boundary.
This all seems a bit too fragile. Remove the num_levels adjustment
and instead adjust the read latency loop to abort when it encounters
a zero latency value.
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250919193000.17665-4-ville.syrjala@linux.intel.com
While the spec only asks us to do the WM0 latency bump for 16Gb
DRAM devices I believe we should apply it for larger DRAM chips.
At the time the w/a was added there were no larger chips on
the market, but I think I've seen at least 32Gb DDR4 chips
being available these days.
Whether it's possible to actually find suitable DIMMs for the
affected systems with largers chips I don't know. Also it's
not known whether the 1 usec latency bump would be sufficient
for larger chips. Someone would need to find such DIMMs and
test this. Fortunately we do have a bit of extra latency already
with the 1 usec bump, as the actual requirement was .4 usec for
for 16Gb chips.
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250919193000.17665-2-ville.syrjala@linux.intel.com
Since this switcheroo garbage bypasses all the core pm we
have to manually manage the pci state. To that end add the
missing pci_restore_state() to the switcheroo resume hook.
We already have the pci_save_state() counterpart on the
suspend side.
Arguably none of this code should exist in the driver
in the first place, and instead the entire switcheroo
mechanism should be rewritten and properly integrated into
core pm code...
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250919185015.14561-5-ville.syrjala@linux.intel.com
If the driver doesn't call pci_save_state() drivers/pci will
normally save+power manage the device from the _noirq() pm hooks.
We can't let that happen as some old BIOSes fail to hibernate
when the device is in D3. However, we can get very close to
the standard behaviour by doing our explicit pci_save_state()
and pci_set_power_state() stuff from driver provided _noirq()
hooks.
This results in a change of behaviour where we no longer go
into D3 at the end of freeze_late, so when it comes time
to thaw() we'll already be in D0, and thus we can drop the
explicit pci_set_power_state(D0) call.
Presumably switcheroo suspend will want to go into D3 so
call the _noirq() stuff from the switcheroo suspend hook,
and since we dropped the pci_set_power_state(D0) from
resume_early() we'll need to add one back into the
switcheroo resume hook.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250919185015.14561-4-ville.syrjala@linux.intel.com
Currently we check if the encoder is INVALID or -1 and throw a
WARN_ON but we still end up writing the temp value which will
overflow and corrupt the whole programmed value.
--v2
-Assign a bogus transcoder to master in case we get a INVALID
TRANSCODER [Jani]
Fixes: 6671c367a9 ("drm/i915/tgl: Select master transcoder for MST stream")
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250908042208.1011144-1-suraj.kandpal@intel.com
ICL/TGL VRR hardware won't allow us to program flipline==vmin. If we do
that the actual effect will be the same as if we had programmed
flipline=vmin+1, which would make the minimum vtotal one scanline taller
than expected.
To compensate for this we reduce vmin by one, and then program
flipline=vmin+1. So we end up with a flipline value that matches
the expected minimum vtotal. Currently this adjustment happens
in intel_vrr_compute_config() which means that crtc_state->vrr.vmin
will no longer be directly usable for the remainder of the high
level VRR code. That is annoying at best, fragile at worst.
Hide the adjustment in low level code instead. This will allow most
of the higher level VRR code to remain blissfully ignorant about this
fact. Afterwards crtc_state->vrr.{vmin,flipline} will be equal
and match the minimum vtotal, exactly how things already work
on ADL+.
The only slight downside is that the actual register value will no
longer match crtc_state->vrr.vmin on ICL/TGL, but that may already
be the case on TGL because the register value will also have been
adjusted by the SCL.
Note that we must change the guardband calculation to account
for intel_vrr_extra_vblank_delay() explicitly. Previously that
was accidentally handled by the earlier vmin reduction by
intel_vrr_flipline_offset().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250918232226.25295-2-ville.syrjala@linux.intel.com
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
num_entries comes from package_header, which is read from an external
firmware blob and thus untrusted. In parse_dmc_fw_package() we assign
package_header->num_entries to a local variable, but the range check
still uses the struct field directly.
Switch the check to use the local copy instead. This makes the
sanitization explicit and avoids a redundant dereference.
Reviewed-by: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20250909083042.1292672-1-luciano.coelho@intel.com
There are three groups of platforms using i915->irq_mask independently:
gen 2-4, VLV/CHV, and gen 5-7.
The gen 5-7 usage is primarily limited to display. Move its irq_mask
usage to struct intel_display as ilk_de_imr_mask for gen 5-7.
ilk_de_imr_mask could be put inside a union with with vlv_imr_mask and
de_irq_mask[], but keep them separate to avoid accidental aliasing of
the values.
With this, we can also drop the irq_mask member from struct xe_device.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/adf60e74b890d52dd20ab4673111ae2063d33b49.1758198300.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>