The current logic stores each symbol's source code on a list
that is not linked to the actual KdocItem. While this works fine
when kernel-doc markups are OK, in places where there is a "/**"
without a valid kernel-doc markup, the 1:1 match between source
code and KdocItem is lost, causing problems when generating the
YAML output.
Fix it by storing the source code directly into the KdocItem
structure.
This shouldn't affect performance or memory footprint, except
when the --yaml option is used.
While here, add a __repr__() function for KdocItem, as it
helps with debugging.
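A minimal sketch of the idea, not the actual KdocItem code: the class
name and the attribute names below (name, type, start_line, source)
are assumptions made for illustration only.

```python
class SketchItem:
    """Illustrative stand-in for a kernel-doc item; not the real class."""

    def __init__(self, name, type, start_line, source=None):
        self.name = name
        self.type = type
        self.start_line = start_line
        # The source travels with the item, so there is no parallel
        # list that can get out of step with the items.
        self.source = source

    def __repr__(self):
        return (f"SketchItem(name={self.name!r}, type={self.type!r}, "
                f"start_line={self.start_line})")


item = SketchItem("foo", "function", 10, source="int foo(void);")
print(repr(item))
```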
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <77902dafabb5c3250486aa2dc1568d5fafa95c5b.1774256269.git.mchehab+huawei@kernel.org>
Fix check for simple table delimiters.
ReST simple tables use "=" instead of "-". I ended up testing it with
a table modified from a complex one, using "--- --- ---", instead
of searching for a real Kernel example.
I only noticed it when adding a unit test and looking for an actual
example from kernel-doc markups.
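As a rough sketch of such a check (the regular expression and the
function name are assumptions, not the code used by kernel-doc), a
simple-table border can be detected as runs of "=" separated by
spaces:

```python
import re

# Hypothetical pattern: two or more runs of "=" separated by whitespace.
SIMPLE_TABLE_BORDER = re.compile(r"^\s*=+(\s+=+)+\s*$")


def is_simple_table_border(line):
    """Return True if the line looks like a ReST simple-table border."""
    return bool(SIMPLE_TABLE_BORDER.match(line))
```

With this pattern, "=== === ===" is accepted, while the "--- --- ---"
line used in the earlier test is rejected.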
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <dea95337c05040f95e5a95ae41d69ddef0aaa8d6.1774256269.git.mchehab+huawei@kernel.org>
The test_kdoc_parser.py already supports loading dynamic tests
when running unit tests.
Add support to read from a different file. This is useful for:
- regression tests before/after some changes;
- preparing new unit tests;
- testing a different yaml before adding its contents to
tools/unittests/kdoc-test.yaml.
Note that passing an argument to a unit test is not trivial,
as the unittest core itself loads the runner with a separate
environment. The best (only?) way to do it is by setting an
environment variable. This way, when the class is called by
the unit test loader, it can pick the variable from the
environment without relying on a global variable.
The unittest_helper already has a provision for it, so let's
use its support.
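The mechanism can be sketched as below. The variable name
KDOC_TEST_YAML and the helper function are hypothetical; only the
default path comes from the text above.

```python
import os

# Hypothetical environment variable name, used only in this sketch.
ENV_VAR = "KDOC_TEST_YAML"
DEFAULT_FILE = "tools/unittests/kdoc-test.yaml"


def test_file_from_env(environ=os.environ):
    """Return the YAML file to load, honouring an override from the environment."""
    return environ.get(ENV_VAR, DEFAULT_FILE)
```

A test class can call such a helper when it is instantiated by the
loader, so nothing has to be passed on the command line.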
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <1d1a56de012c43756f9ca87aa9bf6c285674f113.1774256269.git.mchehab+huawei@kernel.org>
Some documentation pages contain long C API signatures that can exceed
the content width and cause page-wide horizontal scroll overflow.
Apply contained horizontal scrolling to C API description blocks and
keep their signature rows on one line. This preserves signature
formatting while preventing them from breaking page layout.
Contained horizontal scrolling is preferred over wrapping here because
code fidelity is the priority. These blocks are intended to remain
representative of the code itself. Wrapping distorts spacing and line
structure, which affects fidelity, creates misleading renderings, and
reduces readability.
Examples:
https://docs.kernel.org/6.15/driver-api/regulator.html
https://docs.kernel.org/6.15/userspace-api/fwctl/fwctl-cxl.html
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Assisted-by: Codex:GPT-5.4
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260323153342.33447-1-rito@ritovision.com>
Some documentation pages contain long inline literals in paragraph
text that can force page-wide horizontal scroll overflow and break
layout on smaller screens.
Override the default `span.pre` white-space behavior for inline
literals and use `overflow-wrap: anywhere` so they can wrap when
needed. For code used as part of a paragraph, wrapping is appropriate
because it is stylistically part of the surrounding text. Code blocks,
by contrast, are meant to preserve formatting fidelity and are better
served by contained horizontal scrolling.
Examples:
https://docs.kernel.org/6.15/userspace-api/futex2.html
https://docs.kernel.org/6.15/security/IMA-templates.html
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Assisted-by: Codex:GPT-5.4
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260323151401.27415-1-rito@ritovision.com>
Use the existing documentation logo as the HTML favicon.
This makes generated documentation pages use a matching browser tab
icon without introducing a separate favicon asset.
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260321125532.9568-1-rito@ritovision.com>
This series comes after:
https://lore.kernel.org/linux-doc/cover.1773770483.git.mchehab+huawei@kernel.org/
It basically contains patches I submitted before in a 40+ patch series
that were less relevant, plus a couple of other minor fixes:
- patch 1 improves one of the CTokenizer unit tests, fixing some
potential issues in it;
- patches 2 and 3 contain some improvement/fixes for Sphinx
Python autodoc extension. They basically document c_lex.py;
- The remaining patches:
- create a new class for kernel-doc config;
- fix some internal representations of KdocItem;
- add unit tests for KernelDoc() parser class;
- add support to output KdocItem in YAML, which is a
machine-readable output for all documented kAPI.
None of the patches should affect man or html output.
Use the content of kdoc-test.yaml to generate unittests to
verify that kernel-doc internal methods are parsing C code
and generating output the expected way.
Depending on what is written at the parser file at
kdoc-test.yaml, up to 5 tests can be generated from a single
test entry inside the YAML file:
1. from source to kdoc_item: test KernelDoc class;
2. from kdoc_item to man: test ManOutput class;
3. from kdoc_item to rst: test RestOutput class;
4. from source to man without checking expected KdocItem;
5. from source to rst without checking expected KdocItem.
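One common way to generate tests from data is to attach one method
per entry to a TestCase class. The sketch below uses an in-memory
list instead of the YAML file, and the keys and the placeholder
parser are assumptions, not the real schema or classes.

```python
import unittest

# Stand-in for entries loaded from the YAML file; keys are hypothetical.
ENTRIES = [
    {"name": "func1", "source": "int a;", "expected": "int a;"},
    {"name": "func2", "source": "int b;", "expected": "int b;"},
]


def fake_parse(source):
    """Placeholder for the real parser; returns its input unchanged."""
    return source


class TestDynamic(unittest.TestCase):
    """Test methods are attached below, one per entry."""


def make_test(entry):
    def test(self):
        self.assertEqual(fake_parse(entry["source"]), entry["expected"])
    return test


for entry in ENTRIES:
    setattr(TestDynamic, f"test_{entry['name']}", make_test(entry))

# Run the generated tests and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDynamic)
result = unittest.TestResult()
suite.run(result)
```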
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <7ec2593c5b19de3e3b1d8de92675f6b751d3fa21.1773823995.git.mchehab+huawei@kernel.org>
Create a simple kdoc-test.yaml to be used to create unit tests for
kernel-doc parser and output classes.
For now, all we want is a simple function mapped on a yaml test
using the defined schema.
To be sure that the schema is followed, add a unittest for
the file, which will also validate that the schema is properly
parsed.
Note that the .TH definition for the man format contains a
timestamp. We'll need to handle that when dealing with
the actual implementation for the ManOutput class unit tests.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <74883976348c964f00161696d525c33ddd8c7641.1773823995.git.mchehab+huawei@kernel.org>
Validating that kernel-doc is parsing data properly is tricky.
Add a unittest skeleton that allows passing source code
and checking whether the corresponding values of export_table
and entries returned by the parser are properly filled.
It works by mocking a file input with the contents of a source
string, and comparing if:
- exports set matches;
- expected KernelItem entries match.
Create a new TestSelfValidate meant to check if the logic
inside KdocParser.run_test() does its job of checking for
differences inside KdocItem.
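Mocking a file input with a source string can be sketched with
unittest.mock. The mini-parser below is hypothetical and only
collects EXPORT_SYMBOL() names; it is not the kernel-doc parser.

```python
from unittest import mock


def read_exports(fname):
    """Hypothetical mini-parser: collect names passed to EXPORT_SYMBOL()."""
    prefix = "EXPORT_SYMBOL("
    exports = set()
    with open(fname, encoding="utf-8") as fp:
        for line in fp:
            line = line.strip()
            if line.startswith(prefix) and line.endswith(");"):
                exports.add(line[len(prefix):-2])
    return exports


SOURCE = "int foo(void);\nEXPORT_SYMBOL(foo);\n"

# open() is replaced, so no real file is needed for the test.
with mock.patch("builtins.open", mock.mock_open(read_data=SOURCE)):
    found = read_exports("fake.c")
```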
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <8d91bfabd69de7aa44a0f5080ccb01aa41957e6d.1773823995.git.mchehab+huawei@kernel.org>
Currently, there are 15 occurrences of section?_start_lines,
10 of them using the plural form.
This is an issue, as, while kdoc_output works with KdocItem,
the term doesn't match its init value.
The variable sections_start_lines stores multiple sections,
so the plural form is the correct one.
So, ensure that, on all parts of kdoc, this will be referred
to as sections_start_lines.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <d1e0f1d3f80df41c11a1bbde6a12fd9468bc3813.1773823995.git.mchehab+huawei@kernel.org>
When reading the contents of a KdocItem using YAML, the data
will be imported into a dict.
Add a method to create a new KdocItem from a dict to allow
converting such input into a real KdocItem.
While here, address an issue that, if the class is initialized
with an internal parameter outside the 4 initial arguments,
it would end up being added inside other_stuff, which breaks
initializing it from a dict.
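A sketch of the pattern, under the assumption that the four initial
arguments are name, fname, type and start_line and that "sections"
is one of the internal attributes; the real class may differ.

```python
class DictItem:
    """Illustrative stand-in; attribute names are assumptions."""

    # Internal attributes that must not end up inside other_stuff.
    INTERNAL = ("sections",)

    def __init__(self, name, fname, type, start_line, **other_stuff):
        self.name = name
        self.fname = fname
        self.type = type
        self.start_line = start_line
        self.sections = {}
        self.other_stuff = {}
        for key, value in other_stuff.items():
            if key in self.INTERNAL:
                setattr(self, key, value)
            else:
                self.other_stuff[key] = value

    @classmethod
    def from_dict(cls, data):
        """Build an item from a dict, e.g. one loaded from YAML."""
        return cls(**data)
```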
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <fafeac23d1577927e1a3c32cddfbec1e0209ac73.1773823995.git.mchehab+huawei@kernel.org>
When writing unittests for kdoc_output, it became clear that
the logic which handles a series of KdocItem symbols from
a single file belongs to kdoc_output, and not to kdoc_files.
Move the code to it.
While here, also ensure that self.config will be placed
together with set.out_style.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <4ebc26e37a0b544c50d50b8077760f147fa6a535.1773823995.git.mchehab+huawei@kernel.org>
The Sphinx output from autodoc doesn't automatically break long
lines, except on spaces.
Change KernRe __repr__() to break the pattern into multiple strings,
each one with a maximum limit of 60 characters.
With that, documentation output for KernRe should now be displayable,
even on long strings.
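The splitting can be sketched as follows. The class below is a
hypothetical wrapper, not KernRe itself; only the 60-character limit
comes from the description above.

```python
import re


class SplitRepr:
    """Illustrative wrapper around a compiled regex; not the real KernRe."""

    MAX_LEN = 60

    def __init__(self, pattern):
        self.regex = re.compile(pattern)

    def __repr__(self):
        text = self.regex.pattern
        chunks = [text[i:i + self.MAX_LEN]
                  for i in range(0, len(text), self.MAX_LEN)] or [""]
        # Adjacent string literals separated by spaces give the
        # renderer places where the line can be broken.
        return "SplitRepr(" + " ".join(repr(c) for c in chunks) + ")"
```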
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <60c264a9d277fed655b1a62df2195562c8596090.1773823995.git.mchehab+huawei@kernel.org>
Mauro says:
This patch series changes how kdoc parser handles macro replacements.
Instead of heavily relying on regular expressions that can sometimes
be very complex, it uses a C lexical tokenizer. This ensures that
BEGIN/END blocks on functions and structs are properly handled,
even when nested.
Comparing the output before and after the patch series, both man
pages and rst only had:
- whitespace differences;
- struct_group macros now are shown as inner anonymous structs,
as they should be.
Also, I didn't notice any relevant change on the documentation build
time. In that regard, right now, every time a CMatch replacement
rule is applied, it does:
for each transform:
- tokenizes the source code;
- handle CMatch;
- convert tokens back to a string.
A possible optimization would be to do, instead:
- tokenizes source code;
- for each transform handle CMatch;
- convert tokens back to a string.
For now, I opted not to do it, because:
- too many changes at once;
- docs build time is taking ~3:30 minutes, which is
about the same time it was taking before the changes;
- there is a very dirty hack inside function_xforms:
(KernRe(r"_noprof"), ""). This is meant to change
function prototypes instead of function arguments.
So, if ok for you, I would prefer to merge this one first. We can later
optimize kdoc_parser to avoid multiple token <-> string conversions.
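The two strategies described above can be sketched as below. The
tokenizer is a very rough stand-in written with a regular expression,
and the transform is hypothetical; neither is the CTokenizer or
CMatch code.

```python
import re


def tokenize(source):
    """Rough stand-in for a C tokenizer: identifiers, whitespace, single chars."""
    return re.findall(r"[A-Za-z_]\w*|\s+|.", source)


def untokenize(tokens):
    return "".join(tokens)


def drop_identifier(name):
    """Build a hypothetical transform that removes one identifier token."""
    def transform(tokens):
        return [tok for tok in tokens if tok != name]
    return transform


def apply_per_transform(source, transforms):
    # Current behaviour: tokenize and rebuild the string for every transform.
    for transform in transforms:
        source = untokenize(transform(tokenize(source)))
    return source


def apply_tokenize_once(source, transforms):
    # Possible optimization: tokenize once, run all transforms, rebuild once.
    tokens = tokenize(source)
    for transform in transforms:
        tokens = transform(tokens)
    return untokenize(tokens)
```

Both functions return the same string for token-level transforms like
this one; the second avoids the repeated token <-> string conversions.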
-
One important aspect of this series is that it introduces unittests
for kernel-doc. I used it a lot during the development of this series,
to ensure that the changes I was doing were producing the expected
results. Tests are on two separate files that can be executed directly.
Alternatively, there is a run.py script that runs all of them (and
any other Python script matching "tools/unittests/test_*.py"):
$ tools/unittests/run.py
test_cmatch:
TestSearch:
test_search_acquires_multiple: OK
test_search_acquires_nested_paren: OK
test_search_acquires_simple: OK
test_search_must_hold: OK
test_search_must_hold_shared: OK
test_search_no_false_positive: OK
test_search_no_function: OK
test_search_no_macro_remains: OK
TestSubMultipleMacros:
test_acquires_multiple: OK
test_acquires_nested_paren: OK
test_acquires_simple: OK
test_mixed_macros: OK
test_must_hold: OK
test_must_hold_shared: OK
test_no_false_positive: OK
test_no_function: OK
test_no_macro_remains: OK
TestSubSimple:
test_rise_early_greedy: OK
test_rise_multiple_greedy: OK
test_strip_multiple_acquires: OK
test_sub_count_parameter: OK
test_sub_mixed_placeholders: OK
test_sub_multiple_placeholders: OK
test_sub_no_placeholder: OK
test_sub_single_placeholder: OK
test_sub_with_capture: OK
test_sub_zero_placeholder: OK
TestSubWithLocalXforms:
test_functions_with_acquires_and_releases: OK
test_raw_struct_group: OK
test_raw_struct_group_tagged: OK
test_struct_group: OK
test_struct_group_attr: OK
test_struct_group_tagged_with_private: OK
test_struct_kcov: OK
test_vars_stackdepot: OK
test_tokenizer:
TestPublicPrivate:
test_balanced_inner_private: OK
test_balanced_non_greddy_private: OK
test_balanced_private: OK
test_no private: OK
test_unbalanced_inner_private: OK
test_unbalanced_private: OK
test_unbalanced_struct_group_tagged_with_private: OK
test_unbalanced_two_struct_group_tagged_first_with_private: OK
test_unbalanced_without_end_of_line: OK
TestTokenizer:
test_basic_tokens: OK
test_depth_counters: OK
test_mismatch_error: OK
Ran 47 tests
Most of the rules inside CTransforms are of the type CMatch.
Don't re-parse the source code every time.
Doing this doesn't change the output, but makes kdoc almost
as fast as before the tokenizer patches:
# Before tokenizer patches
$ time ./scripts/kernel-doc . -man >original 2>&1
real 0m42.933s
user 0m36.523s
sys 0m1.145s
# After tokenizer patches
$ time ./scripts/kernel-doc . -man >before 2>&1
real 1m29.853s
user 1m23.974s
sys 0m1.237s
# After this patch
$ time ./scripts/kernel-doc . -man >after 2>&1
real 0m48.579s
user 0m45.938s
sys 0m0.988s
$ diff -s before after
Files before and after are identical
Manually checked the differences between original and after
with:
$ diff -U0 -prBw original after|grep -v Warning|grep -v "@@"|less
They're due to:
- whitespace fixes;
- struct_group are now better handled;
- several badly-generated man pages from broken inline kernel-doc
markups are now fixed.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <1cc2a4286ebf7d4b2d03fcaf42a1ba9fa09004b9.1773770483.git.mchehab+huawei@kernel.org>
Changeset 2b957decdb6c ("docs: kdoc: don't add broken comments inside prototypes")
revealed a hidden bug at split_struct_proto(): some comments there may break
its capability of properly identifying a struct.
Fixing it is as simple as stripping comments before calling it.
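A minimal sketch of comment stripping, assuming a plain regular
expression is good enough (it ignores comment markers inside string
literals); the names are hypothetical and this is not the code used
in kdoc_parser.

```python
import re

# Matches /* ... */ block comments and // line comments.
C_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def strip_comments(proto):
    """Remove C comments from a prototype before further parsing."""
    return C_COMMENT.sub("", proto)
```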
Fixes: 2b957decdb6c ("docs: kdoc: don't add broken comments inside prototypes")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <dcff37b6da5329aea415de31f543b6a1c2cbbbce.1773770483.git.mchehab+huawei@kernel.org>
The previous approach was to unwind nested structs/unions.
Now that we have a logic that can handle it well, use it to
ensure that struct_group macros will properly reflect the
actual struct.
Note that the replacement logic still simplifies the code
a little bit, as the basic building block for struct group is:
union { \
struct { MEMBERS } ATTRS; \
struct __struct_group_tag(TAG) { MEMBERS } ATTRS NAME; \
} ATTRS
There:
- ATTRS is meant to add extra macro attributes like __packed
which we already discard, as they aren't relevant to
document struct members;
- TAG is used only when built with __cplusplus.
So, instead, convert them into just:
struct { MEMBERS };
Please notice that here, we're using the greedy version of the
backrefs, as MEMBERS is actually MEMBERS... on all such macros.
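The effect of the conversion can be illustrated with a small
stand-alone function. It handles only the first struct_group(NAME,
MEMBERS...) occurrence and counts parentheses by hand; it is a sketch
of the result, not the CMatch-based rule used by kernel-doc.

```python
def replace_struct_group(source):
    """Rewrite the first struct_group(NAME, MEMBERS...) as struct { MEMBERS... };"""
    macro = "struct_group("
    start = source.find(macro)
    if start < 0:
        return source

    # Find the parenthesis that closes the macro call.
    pos = start + len(macro)
    depth = 1
    i = pos
    while i < len(source) and depth:
        if source[i] == "(":
            depth += 1
        elif source[i] == ")":
            depth -= 1
        i += 1
    if depth:
        return source  # unbalanced input: leave it untouched

    # Greedy split: everything after the first comma is MEMBERS.
    _name, _, members = source[pos:i - 1].partition(",")

    end = i
    if source[end:end + 1] == ";":
        end += 1  # swallow the ';' that followed the macro call
    return source[:start] + "struct { " + members.strip() + " };" + source[end:]
```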
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <24bf2c036b08814d9b4aabc27542fd3b2ff54424.1773770483.git.mchehab+huawei@kernel.org>
The CMatch logic is complex enough to justify tests to ensure
that it is doing its job.
Add unittests to check the functionality provided by CMatch
by replicating expected patterns.
The CMatch class handles complex macros. Add a unittest
to check if it's doing the right thing and detect possible regressions
as we improve its code.
The initial version was generated using gpt-oss:latest LLM
on my local GPU, as LLMs aren't bad at transforming patterns
into unittests.
Yet, the current version contains only the skeleton of what
the LLM produced, as I ended up heavily changing its content to be
more representative and to have real case scenarios.
The kdoc_xforms test suite contains 3 test groups. Two of
them test the basic functionality of CMatch to
replace patterns.
The last one (TestRealUsecases) contains real code snippets
from the Kernel with some cleanups to better fit in 80 columns
and uses the same transforms as kernel-doc, thus allowing
to test the logic used inside kdoc_parser to transform
functions, structs and variable patterns.
Its output is like this:
$ tools/unittests/kdoc_xforms.py
Ran 25 tests in 0.003s
OK
test_cmatch:
TestSearch:
test_search_acquires_multiple: OK
test_search_acquires_nested_paren: OK
test_search_acquires_simple: OK
test_search_must_hold: OK
test_search_must_hold_shared: OK
test_search_no_false_positive: OK
test_search_no_function: OK
test_search_no_macro_remains: OK
Ran 8 tests
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <119712b5bc53b4c6dda6a81b4a783dcbfd1d970d.1773770483.git.mchehab+huawei@kernel.org>
The NextMatch code is complex, and will become even more complex
if we add there support for arguments.
Now that we have a tokenizer, we can use a better solution,
easier to understand.
Yet, to improve performance, it is better to make it use
previously tokenized code, changing its ABI.
So, reimplement NextMatch using the CTokenizer class. Once it
is done, we can drop NestedMatch.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <fa818ea164216b17520b588e3f12b81499b76dd7.1773770483.git.mchehab+huawei@kernel.org>
We'll soon have multiple unit tests, so add a runner that will
discover all of them and execute all tests.
Only files whose names start with "test" are discovered,
as this way unittest discover won't try adding libraries or
other stuff that might not contain unittest classes.
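Discovery restricted to such files can be sketched with the standard
unittest loader. The function name and the scratch files below are
made up for the demo; this is not the run.py script itself.

```python
import os
import tempfile
import unittest


def discover_tests(start_dir):
    """Collect tests only from files whose names start with "test"."""
    return unittest.TestLoader().discover(start_dir, pattern="test*.py")


# Demo: one test file and one helper library in a scratch directory.
with tempfile.TemporaryDirectory() as tmpdir:
    with open(os.path.join(tmpdir, "test_sample.py"), "w") as fp:
        fp.write("import unittest\n"
                 "class Sample(unittest.TestCase):\n"
                 "    def test_ok(self):\n"
                 "        pass\n")
    with open(os.path.join(tmpdir, "helper_lib.py"), "w") as fp:
        fp.write("raise RuntimeError('helper must not be imported')\n")
    count = discover_tests(tmpdir).countTestCases()

print(count)
```

The helper file is never imported, because its name does not match
the pattern.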
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <2d9dd14f03d3d6394346fdaceeb3167d54d1dd0c.1773770483.git.mchehab+huawei@kernel.org>
Better handle comments inside structs. After those changes,
all unittests now pass:
test_private:
TestPublicPrivate:
test balanced_inner_private: OK
test balanced_non_greddy_private: OK
test balanced_private: OK
test no private: OK
test unbalanced_inner_private: OK
test unbalanced_private: OK
test unbalanced_struct_group_tagged_with_private: OK
test unbalanced_two_struct_group_tagged_first_with_private: OK
test unbalanced_without_end_of_line: OK
Ran 9 tests
This also solves a bug when handling STRUCT_GROUP() with a private
comment on it:
@@ -397134,7 +397134,7 @@ basic V4L2 device-level support.
unsigned int max_len;
unsigned int offset;
struct page_pool_params_slow slow;
- STRUCT_GROUP( struct net_device *netdev;
+ struct net_device *netdev;
unsigned int queue_idx;
unsigned int flags;
};
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Message-ID: <f83ee9e8c38407eaab6ad10d4ccf155fb36683cc.1773074166.git.mchehab+huawei@kernel.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <054763260f7b5459ad0738ed906d7c358d640692.1773770483.git.mchehab+huawei@kernel.org>