Eric Dumazet 3411506550 x86/csum: Rewrite/optimize csum_partial()
With more NICs supporting CHECKSUM_COMPLETE, and IPv6 being widely
used csum_partial() is heavily used with small amount of bytes, and is
consuming many cycles.

IPv6 header size, for instance, is 40 bytes.

Another thing to consider is that NET_IP_ALIGN is 0 on x86, meaning
that network headers are not word-aligned, unless the driver forces
this.

This means that csum_partial() fetches one u16 to 'align the buffer',
then performs three u64 additions with carry in a loop, then a
remaining u32, then a remaining u16.

With this new version, it performs a loop only for the 64 bytes blocks,
then the remaining is bisected.

Testing on various CPUs, all of them show a big reduction in
csum_partial() cost (by 50 to 80 %)

Before:
	4.16%  [kernel]       [k] csum_partial
After:
	0.83%  [kernel]       [k] csum_partial

If run in a loop 1,000,000 times:

Before:
	26,922,913      cycles                    # 3846130.429 GHz
	80,302,961      instructions              #    2.98  insn per cycle
	21,059,816      branches                  # 3008545142.857 M/sec
	     2,896      branch-misses             #    0.01% of all branches
After:
	17,960,709      cycles                    # 3592141.800 GHz
	41,292,805      instructions              #    2.30  insn per cycle
	11,058,119      branches                  # 2211623800.000 M/sec
	     2,997      branch-misses             #    0.03% of all branches

 [ bp: Massage, merge in subsequent fixes into a single patch:
   - um compilation error due to missing load_unaligned_zeropad():
	- Reported-by: kernel test robot <lkp@intel.com>
	- Link: https://lkml.kernel.org/r/20211118175239.1525650-1-eric.dumazet@gmail.com
   - Fix initial seed for odd buffers
	- Reported-by: Noah Goldstein <goldstein.w.n@gmail.com>
	- Link: https://lkml.kernel.org/r/20211125141817.3541501-1-eric.dumazet@gmail.com
  ]

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20211112161950.528886-1-eric.dumazet@gmail.com
2021-12-08 11:26:09 +01:00
2021-12-05 14:08:22 -08:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Description
No description provided
Readme 3.4 GiB
Languages
C 97%
Assembly 1%
Shell 0.6%
Rust 0.5%
Python 0.4%
Other 0.3%