Commit Graph

115823 Commits

Author SHA1 Message Date
Eric Biggers
fdfad1fffc crypto: shash - introduce crypto_grab_shash()
Currently, shash spawns are initialized by using shash_attr_alg() or
crypto_alg_mod_lookup() to look up the shash algorithm, then calling
crypto_init_shash_spawn().

This is different from how skcipher, aead, and akcipher spawns are
initialized (they use crypto_grab_*()), and for no good reason.  This
difference introduces unnecessary complexity.

The crypto_grab_*() functions used to have some problems, like not
holding a reference to the algorithm and requiring the caller to
initialize spawn->base.inst.  But those problems are fixed now.

So, let's introduce crypto_grab_shash() so that we can convert all
templates to the same way of initializing their spawns.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
de95c95741 crypto: algapi - pass instance to crypto_grab_spawn()
Currently, crypto_spawn::inst is first used temporarily to pass the
instance to crypto_grab_spawn().  Then crypto_init_spawn() overwrites it
with crypto_spawn::next, which shares the same union.  Finally,
crypto_spawn::inst is set again when the instance is registered.

Make this less convoluted by just passing the instance as an argument to
crypto_grab_spawn() instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
73bed26f73 crypto: akcipher - pass instance to crypto_grab_akcipher()
Initializing a crypto_akcipher_spawn currently requires:

1. Set spawn->base.inst to point to the instance.
2. Call crypto_grab_akcipher().

But there's no reason for these steps to be separate, and in fact this
unneeded complication has caused at least one bug, the one fixed by
commit 6db4341017 ("crypto: adiantum - initialize crypto_spawn::inst")

So just make crypto_grab_akcipher() take the instance as an argument.

To keep the function call from getting too unwieldy due to this extra
argument, also introduce a 'mask' variable into pkcs1pad_create().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
cd900f0cac crypto: aead - pass instance to crypto_grab_aead()
Initializing a crypto_aead_spawn currently requires:

1. Set spawn->base.inst to point to the instance.
2. Call crypto_grab_aead().

But there's no reason for these steps to be separate, and in fact this
unneeded complication has caused at least one bug, the one fixed by
commit 6db4341017 ("crypto: adiantum - initialize crypto_spawn::inst")

So just make crypto_grab_aead() take the instance as an argument.

To keep the function calls from getting too unwieldy due to this extra
argument, also introduce a 'mask' variable into the affected places
which weren't already using one.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
b9f76dddb1 crypto: skcipher - pass instance to crypto_grab_skcipher()
Initializing a crypto_skcipher_spawn currently requires:

1. Set spawn->base.inst to point to the instance.
2. Call crypto_grab_skcipher().

But there's no reason for these steps to be separate, and in fact this
unneeded complication has caused at least one bug, the one fixed by
commit 6db4341017 ("crypto: adiantum - initialize crypto_spawn::inst")

So just make crypto_grab_skcipher() take the instance as an argument.

To keep the function calls from getting too unwieldy due to this extra
argument, also introduce a 'mask' variable into the affected places
which weren't already using one.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
77f7e94d72 crypto: ahash - make struct ahash_instance be the full size
Define struct ahash_instance in a way analogous to struct
skcipher_instance, struct aead_instance, and struct akcipher_instance,
where the struct is defined to include both the algorithm structure at
the beginning and the additional crypto_instance fields at the end.

This is needed to allow allocating ahash instances directly using
kzalloc(sizeof(*inst) + sizeof(*ictx), ...) in the same way as skcipher,
aead, and akcipher instances.  In turn, that's needed to make spawns be
initialized in a consistent way everywhere.

Also take advantage of the addition of the base instance to struct
ahash_instance by simplifying the ahash_crypto_instance() and
ahash_instance() functions.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
1b84e7d01d crypto: shash - make struct shash_instance be the full size
Define struct shash_instance in a way analogous to struct
skcipher_instance, struct aead_instance, and struct akcipher_instance,
where the struct is defined to include both the algorithm structure at
the beginning and the additional crypto_instance fields at the end.

This is needed to allow allocating shash instances directly using
kzalloc(sizeof(*inst) + sizeof(*ictx), ...) in the same way as skcipher,
aead, and akcipher instances.  In turn, that's needed to make spawns be
initialized in a consistent way everywhere.

Also take advantage of the addition of the base instance to struct
shash_instance by simplifying the shash_crypto_instance() and
shash_instance() functions.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:54 +08:00
Eric Biggers
af5034e8e4 crypto: remove propagation of CRYPTO_TFM_RES_* flags
The CRYPTO_TFM_RES_* flags were apparently meant as a way to make the
->setkey() functions provide more information about errors.  But these
flags weren't actually being used or tested, and in many cases they
weren't being set correctly anyway.  So they've now been removed.

Also, if someone ever actually needs to start better distinguishing
->setkey() errors (which is somewhat unlikely, as this has been unneeded
for a long time), we'd be much better off just defining different return
values, like -EINVAL if the key is invalid for the algorithm vs.
-EKEYREJECTED if the key was rejected by a policy like "no weak keys".
That would be much simpler, less error-prone, and easier to test.

So just remove CRYPTO_TFM_RES_MASK and all the unneeded logic that
propagates these flags around.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:53 +08:00
Eric Biggers
c4c4db0d59 crypto: remove CRYPTO_TFM_RES_WEAK_KEY
The CRYPTO_TFM_RES_WEAK_KEY flag was apparently meant as a way to make
the ->setkey() functions provide more information about errors.

However, no one actually checks for this flag, which makes it pointless.
There are also no tests that verify that all algorithms actually set (or
don't set) it correctly.

This is also the last remaining CRYPTO_TFM_RES_* flag, which means that
it's the only thing still needing all the boilerplate code which
propagates these flags around from child => parent tfms.

And if someone ever needs to distinguish this error in the future (which
is somewhat unlikely, as it's been unneeded for a long time), it would
be much better to just define a new return value like -EKEYREJECTED.
That would be much simpler, less error-prone, and easier to test.

So just remove this flag.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:53 +08:00
Eric Biggers
674f368a95 crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN
The CRYPTO_TFM_RES_BAD_KEY_LEN flag was apparently meant as a way to
make the ->setkey() functions provide more information about errors.

However, no one actually checks for this flag, which makes it pointless.

Also, many algorithms fail to set this flag when given a bad length key.
Reviewing just the generic implementations, this is the case for
aes-fixed-time, cbcmac, echainiv, nhpoly1305, pcrypt, rfc3686, rfc4309,
rfc7539, rfc7539esp, salsa20, seqiv, and xcbc.  But there are probably
many more in arch/*/crypto/ and drivers/crypto/.

Some algorithms can even set this flag when the key is the correct
length.  For example, authenc and authencesn set it when the key payload
is malformed in any way (not just a bad length), the atmel-sha and ccree
drivers can set it if a memory allocation fails, and the chelsio driver
sets it for bad auth tag lengths, not just bad key lengths.

So even if someone actually wanted to start checking this flag (which
seems unlikely, since it's been unused for a long time), there would be
a lot of work needed to get it working correctly.  But it would probably
be much better to go back to the drawing board and just define different
return values, like -EINVAL if the key is invalid for the algorithm vs.
-EKEYREJECTED if the key was rejected by a policy like "no weak keys".
That would be much simpler, less error-prone, and easier to test.

So just remove this flag.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:53 +08:00
Eric Biggers
5c925e8b10 crypto: remove CRYPTO_TFM_RES_BAD_BLOCK_LEN
The flag CRYPTO_TFM_RES_BAD_BLOCK_LEN is never checked for, and it's
only set by one driver.  And even that single driver's use is wrong
because the driver is setting the flag from ->encrypt() and ->decrypt()
with no locking, which is unsafe because ->encrypt() and ->decrypt() can
be executed by many threads in parallel on the same tfm.

Just remove this flag.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:52 +08:00
Eric Biggers
f9d89b853e crypto: remove unused tfm result flags
The tfm result flags CRYPTO_TFM_RES_BAD_KEY_SCHED and
CRYPTO_TFM_RES_BAD_FLAGS are never used, so remove them.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:52 +08:00
Eric Biggers
70ffa8fd72 crypto: skcipher - remove skcipher_walk_aead()
skcipher_walk_aead() is unused and is identical to
skcipher_walk_aead_encrypt(), so remove it.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:51 +08:00
Rijo Thomas
bade7e1fbd tee: amdtee: check TEE status during driver initialization
The AMD-TEE driver should check if TEE is available before
registering itself with TEE subsystem. This ensures that
there is a TEE which the driver can talk to before proceeding
with tee device node allocation.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Jens Wiklander <jens.wiklander@linaro.org>
Co-developed-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Reviewed-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-04 13:49:51 +08:00
Rijo Thomas
757cc3e9ff tee: add AMD-TEE driver
Adds AMD-TEE driver.
* targets AMD APUs which has AMD Secure Processor with software-based
  Trusted Execution Environment (TEE) support
* registers with TEE subsystem
* defines tee_driver_ops function callbacks
* kernel allocated memory is used as shared memory between normal
  world and secure world.
* acts as REE (Rich Execution Environment) communication agent, which
  uses the services of AMD Secure Processor driver to submit commands
  for processing in TEE environment

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Jens Wiklander <jens.wiklander@linaro.org>
Co-developed-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Reviewed-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-04 13:49:51 +08:00
Herbert Xu
b3c16bfc6a crypto: skcipher - Add skcipher_ialg_simple helper
This patch introduces the skcipher_ialg_simple helper which fetches
the crypto_alg structure from a simple skcipher instance's spawn.

This allows us to remove the third argument from the function
skcipher_alloc_instance_simple.

In doing so the reference count to the algorithm is now maintained
by the Crypto API and the caller no longer needs to drop the alg
refcount.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-27 18:18:04 +08:00
Herbert Xu
5f567fffaa crypto: api - Retain alg refcount in crypto_grab_spawn
This patch changes crypto_grab_spawn to retain the reference count
on the algorithm.  This is because the caller needs to access the
algorithm parameters and without the reference count the algorithm
can be freed at any time.

The reference count will be subsequently dropped by the crypto API
once the instance has been registered.  The helper crypto_drop_spawn
will also conditionally drop the reference count depending on whether
it has been registered.

Note that the code is actually added to crypto_init_spawn.  However,
unless the caller activates this by setting spawn->dropref beforehand
then nothing happens.  The only caller that sets dropref is currently
crypto_grab_spawn.

Once all legacy users of crypto_init_spawn disappear, then we can
kill the dropref flag.

Internally each instance will maintain a list of its spawns prior
to registration.  This memory used by this list is shared with
other fields that are only used after registration.  In order for
this to work a new flag spawn->registered is added to indicate
whether spawn->inst can be used.

Fixes: d6ef2f198d ("crypto: api - Add crypto_grab_spawn primitive")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-27 18:18:03 +08:00
Chen Zhou
c782937e92 crypto: api - remove unneeded semicolon
Fixes coccicheck warning:

./include/linux/crypto.h:573:2-3: Unneeded semicolon

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-27 18:18:02 +08:00
Eric Biggers
c6d633a927 crypto: algapi - make unregistration functions return void
Some of the algorithm unregistration functions return -ENOENT when asked
to unregister a non-registered algorithm, while others always return 0
or always return void.  But no users check the return value, except for
two of the bulk unregistration functions which print a message on error
but still always return 0 to their caller, and crypto_del_alg() which
calls crypto_unregister_instance() which always returns 0.

Since unregistering a non-registered algorithm is always a kernel bug
but there isn't anything callers should do to handle this situation at
runtime, let's simplify things by making all the unregistration
functions return void, and moving the error message into
crypto_unregister_alg() and upgrading it to a WARN().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-20 14:58:35 +08:00
Rijo Thomas
632b0b5301 crypto: ccp - provide in-kernel API to submit TEE commands
Extend the functionality of AMD Secure Processor (SP) driver by
providing an in-kernel API to submit commands to TEE ring buffer for
processing by Trusted OS running on AMD Secure Processor.

Following TEE commands are supported by Trusted OS:

* TEE_CMD_ID_LOAD_TA : Load Trusted Application (TA) binary into
  TEE environment
* TEE_CMD_ID_UNLOAD_TA : Unload TA binary from TEE environment
* TEE_CMD_ID_OPEN_SESSION : Open session with loaded TA
* TEE_CMD_ID_CLOSE_SESSION : Close session with loaded TA
* TEE_CMD_ID_INVOKE_CMD : Invoke a command with loaded TA
* TEE_CMD_ID_MAP_SHARED_MEM : Map shared memory
* TEE_CMD_ID_UNMAP_SHARED_MEM : Unmap shared memory

Linux AMD-TEE driver will use this API to submit command buffers
for processing in Trusted Execution Environment. The AMD-TEE driver
shall be introduced in a separate patch.

Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Co-developed-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-20 14:58:32 +08:00
Herbert Xu
d9e1670b80 crypto: hmac - Use init_tfm/exit_tfm interface
This patch switches hmac over to the new init_tfm/exit_tfm interface
as opposed to cra_init/cra_exit.  This way the shash API can make
sure that descsize does not exceed the maximum.

This patch also adds the API helper shash_alg_instance.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:48:39 +08:00
Herbert Xu
fbce6be5ae crypto: shash - Add init_tfm/exit_tfm and verify descsize
The shash interface supports a dynamic descsize field because of
the presence of fallbacks (it's just padlock-sha actually, perhaps
we can remove it one day).  As it is the API does not verify the
setting of descsize at all.  It is up to the individual algorithms
to ensure that descsize does not exceed the specified maximum value
of HASH_MAX_DESCSIZE (going above would cause stack corruption).

In order to allow the API to impose this limit directly, this patch
adds init_tfm/exit_tfm hooks to the shash_alg structure.  We can
then verify the descsize setting in the API directly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:48:39 +08:00
Herbert Xu
4f87ee118d crypto: api - Do not zap spawn->alg
Currently when a spawn is removed we will zap its alg field.
This is racy because the spawn could belong to an unregistered
instance which may dereference the spawn->alg field.

This patch fixes this by keeping spawn->alg constant and instead
adding a new spawn->dead field to indicate that a spawn is going
away.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:48:39 +08:00
Valdis Klētnieks
579d705cd6 crypto: chacha - fix warning message in header file
Building with W=1 causes a warning:

  CC [M]  arch/x86/crypto/chacha_glue.o
In file included from arch/x86/crypto/chacha_glue.c:10:
./include/crypto/internal/chacha.h:37:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
   37 | static int inline chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
      | ^~~~~~

Straighten out the order to match the rest of the header file.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:48:38 +08:00
Daniel Jordan
bfcdcef8c8 padata: update documentation
Remove references to unused functions, standardize language, update to
reflect new functionality, migrate to rst format, and fix all kernel-doc
warnings.

Fixes: 815613da6a ("kernel/padata.c: removed unused code")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:02 +08:00
Daniel Jordan
3facced7ae padata: remove reorder_objects
reorder_objects is unused since the rework of padata's flushing, so
remove it.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:02 +08:00
Daniel Jordan
91a71d6121 padata: remove cpumask change notifier
Since commit 63d3578892 ("crypto: pcrypt - remove padata cpumask
notifier") this feature is unused, so get rid of it.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:02 +08:00
Daniel Jordan
894c9ef978 padata: validate cpumask without removed CPU during offline
Configuring an instance's parallel mask without any online CPUs...

  echo 2 > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
  echo 0 > /sys/devices/system/cpu/cpu1/online

...makes tcrypt mode=215 crash like this:

  divide error: 0000 [#1] SMP PTI
  CPU: 4 PID: 283 Comm: modprobe Not tainted 5.4.0-rc8-padata-doc-v2+ #2
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014
  RIP: 0010:padata_do_parallel+0x114/0x300
  Call Trace:
   pcrypt_aead_encrypt+0xc0/0xd0 [pcrypt]
   crypto_aead_encrypt+0x1f/0x30
   do_mult_aead_op+0x4e/0xdf [tcrypt]
   test_mb_aead_speed.constprop.0.cold+0x226/0x564 [tcrypt]
   do_test+0x28c2/0x4d49 [tcrypt]
   tcrypt_mod_init+0x55/0x1000 [tcrypt]
   ...

cpumask_weight() in padata_cpu_hash() returns 0 because the mask has no
CPUs.  The problem is __padata_remove_cpu() checks for valid masks too
early and so doesn't mark the instance PADATA_INVALID as expected, which
would have made padata_do_parallel() return error before doing the
division.

Fix by introducing a second padata CPU hotplug state before
CPUHP_BRINGUP_CPU so that __padata_remove_cpu() sees the online mask
without @cpu.  No need for the second argument to padata_replace() since
@cpu is now already missing from the online mask.

Fixes: 33e5445068 ("padata: Handle empty padata cpumasks")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:02 +08:00
Eric Biggers
e8cfed5e4e crypto: cipher - remove crt_u.cipher (struct cipher_tfm)
Of the three fields in crt_u.cipher (struct cipher_tfm), ->cit_setkey()
is pointless because it always points to setkey() in crypto/cipher.c.

->cit_decrypt_one() and ->cit_encrypt_one() are slightly less pointless,
since if the algorithm doesn't have an alignmask, they are set directly
to ->cia_encrypt() and ->cia_decrypt().  However, this "optimization"
isn't worthwhile because:

- The "cipher" algorithm type is the only algorithm still using crt_u,
  so it's bloating every struct crypto_tfm for every algorithm type.

- If the algorithm has an alignmask, this "optimization" actually makes
  things slower, as it causes 2 indirect calls per block rather than 1.

- It adds extra code complexity.

- Some templates already call ->cia_encrypt()/->cia_decrypt() directly
  instead of going through ->cit_encrypt_one()/->cit_decrypt_one().

- The "cipher" algorithm type never gives optimal performance anyway.
  For that, a higher-level type such as skcipher needs to be used.

Therefore, just remove the extra indirection, and make
crypto_cipher_setkey(), crypto_cipher_encrypt_one(), and
crypto_cipher_decrypt_one() be direct calls into crypto/cipher.c.

Also remove the unused function crypto_cipher_cast().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:01 +08:00
Eric Biggers
c441a909c6 crypto: compress - remove crt_u.compress (struct compress_tfm)
crt_u.compress (struct compress_tfm) is pointless because its two
fields, ->cot_compress() and ->cot_decompress(), always point to
crypto_compress() and crypto_decompress().

Remove this pointless indirection, and just make crypto_comp_compress()
and crypto_comp_decompress() be direct calls to what used to be
crypto_compress() and crypto_decompress().

Also remove the unused function crypto_comp_cast().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:01 +08:00
Eric Biggers
7bada03311 crypto: skcipher - add crypto_skcipher_min_keysize()
Add a helper function crypto_skcipher_min_keysize() to mirror
crypto_skcipher_max_keysize().

This will be used by the self-tests.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:00 +08:00
Eric Biggers
095be695e5 crypto: aead - move crypto_aead_maxauthsize() to <crypto/aead.h>
Move crypto_aead_maxauthsize() to <crypto/aead.h> so that it's available
to users of the API, not just AEAD implementations.

This will be used by the self-tests.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:37:00 +08:00
Eric Biggers
c288178954 crypto: shash - allow essiv and hmac to use OPTIONAL_KEY algorithms
The essiv and hmac templates refuse to use any hash algorithm that has a
->setkey() function, which includes not just algorithms that always need
a key, but also algorithms that optionally take a key.

Previously the only optionally-keyed hash algorithms in the crypto API
were non-cryptographic algorithms like crc32, so this didn't really
matter.  But that's changed with BLAKE2 support being added.  BLAKE2
should work with essiv and hmac, just like any other cryptographic hash.

Fix this by allowing the use of both algorithms without a ->setkey()
function and algorithms that have the OPTIONAL_KEY flag set.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:57 +08:00
Eric Biggers
7e1c109918 crypto: skcipher - remove crypto_skcipher::decrypt
Due to the removal of the blkcipher and ablkcipher algorithm types,
crypto_skcipher::decrypt is now redundant since it always equals
crypto_skcipher_alg(tfm)->decrypt.

Remove it and update crypto_skcipher_decrypt() accordingly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:56 +08:00
Eric Biggers
848755e315 crypto: skcipher - remove crypto_skcipher::encrypt
Due to the removal of the blkcipher and ablkcipher algorithm types,
crypto_skcipher::encrypt is now redundant since it always equals
crypto_skcipher_alg(tfm)->encrypt.

Remove it and update crypto_skcipher_encrypt() accordingly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:56 +08:00
Eric Biggers
15252d9427 crypto: skcipher - remove crypto_skcipher::setkey
Due to the removal of the blkcipher and ablkcipher algorithm types,
crypto_skcipher::setkey now always points to skcipher_setkey().

Simplify by removing this function pointer and instead just making
skcipher_setkey() be crypto_skcipher_setkey() directly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:56 +08:00
Eric Biggers
9ac0d13693 crypto: skcipher - remove crypto_skcipher::keysize
Due to the removal of the blkcipher and ablkcipher algorithm types,
crypto_skcipher::keysize is now redundant since it always equals
crypto_skcipher_alg(tfm)->max_keysize.

Remove it and update crypto_skcipher_default_keysize() accordingly.

Also rename crypto_skcipher_default_keysize() to
crypto_skcipher_max_keysize() to clarify that it specifically returns
the maximum key size, not some unspecified "default".

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:56 +08:00
Eric Biggers
140734d371 crypto: skcipher - remove crypto_skcipher::ivsize
Due to the removal of the blkcipher and ablkcipher algorithm types,
crypto_skcipher::ivsize is now redundant since it always equals
crypto_skcipher_alg(tfm)->ivsize.

Remove it and update crypto_skcipher_ivsize() accordingly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:56 +08:00
Kees Cook
9c1e8836ed crypto: x86 - Regularize glue function prototypes
The crypto glue performed function prototype casting via macros to make
indirect calls to assembly routines. Instead of performing casts at the
call sites (which trips Control Flow Integrity prototype checking), switch
each prototype to a common standard set of arguments which allows the
removal of the existing macros. In order to keep pointer math unchanged,
internal casting between u128 pointers and u8 pointers is added.

Co-developed-by: João Moreira <joao.moreira@intel.com>
Signed-off-by: João Moreira <joao.moreira@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:54 +08:00
Herbert Xu
bbefa1dd6a crypto: pcrypt - Avoid deadlock by using per-instance padata queues
If the pcrypt template is used multiple times in an algorithm, then a
deadlock occurs because all pcrypt instances share the same
padata_instance, which completes requests in the order submitted.  That
is, the inner pcrypt request waits for the outer pcrypt request while
the outer request is already waiting for the inner.

This patch fixes this by allocating a set of queues for each pcrypt
instance instead of using two global queues.  In order to maintain
the existing user-space interface, the pinst structure remains global
so any sysfs modifications will apply to every pcrypt instance.

Note that when an update occurs we have to allocate memory for
every pcrypt instance.  Should one of the allocations fail we
will abort the update without rolling back changes already made.

The new per-instance data structure is called padata_shell and is
essentially a wrapper around parallel_data.

Reproducer:

	#include <linux/if_alg.h>
	#include <sys/socket.h>
	#include <unistd.h>

	int main()
	{
		struct sockaddr_alg addr = {
			.salg_type = "aead",
			.salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))"
		};
		int algfd, reqfd;
		char buf[32] = { 0 };

		algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
		bind(algfd, (void *)&addr, sizeof(addr));
		setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20);
		reqfd = accept(algfd, 0, 0);
		write(reqfd, buf, 32);
		read(reqfd, buf, 16);
	}

Reported-by: syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com
Fixes: 5068c7a883 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:45 +08:00
Linus Torvalds
95e6ba5133 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) More jumbo frame fixes in r8169, from Heiner Kallweit.

 2) Fix bpf build in minimal configuration, from Alexei Starovoitov.

 3) Use after free in slcan driver, from Jouni Hogander.

 4) Flower classifier port ranges don't work properly in the HW offload
    case, from Yoshiki Komachi.

 5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin.

 6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk.

 7) Fix flow dissection in dsa TX path, from Alexander Lobakin.

 8) Stale syncookie timestampe fixes from Guillaume Nault.

[ Did an evil merge to silence a warning introduced by this pull - Linus ]

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
  r8169: fix rtl_hw_jumbo_disable for RTL8168evl
  net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()
  r8169: add missing RX enabling for WoL on RTL8125
  vhost/vsock: accept only packets with the right dst_cid
  net: phy: dp83867: fix hfs boot in rgmii mode
  net: ethernet: ti: cpsw: fix extra rx interrupt
  inet: protect against too small mtu values.
  gre: refetch erspan header from skb->data after pskb_may_pull()
  pppoe: remove redundant BUG_ON() check in pppoe_pernet
  tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
  tcp: tighten acceptance of ACKs not matching a child socket
  tcp: fix rejected syncookies due to stale timestamps
  lpc_eth: kernel BUG on remove
  tcp: md5: fix potential overestimation of TCP option space
  net: sched: allow indirect blocks to bind to clsact in TC
  net: core: rename indirect block ingress cb function
  net-sysfs: Call dev_hold always in netdev_queue_add_kobject
  net: dsa: fix flow dissection on Tx path
  net/tls: Fix return values to avoid ENOTSUPP
  net: avoid an indirect call in ____sys_recvmsg()
  ...
2019-12-08 13:28:11 -08:00
Linus Torvalds
737214515d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull more input updates from Dmitry Torokhov:

 - fixups for Synaptics RMI4 driver

 - a quirk for Goodinx touchscreen on Teclast tablet

 - a new keycode definition for activating privacy screen feature found
   on a few "enterprise" laptops

 - updates to snvs_pwrkey driver

 - polling uinput device for writing (which is always allowed) now works

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers
  Input: synaptics-rmi4 - re-enable IRQs in f34v7_do_reflash
  Input: goodix - add upside-down quirk for Teclast X89 tablet
  Input: add privacy screen toggle keycode
  Input: uinput - fix returning EPOLLOUT from uinput_poll
  Input: snvs_pwrkey - remove gratuitous NULL initializers
  Input: snvs_pwrkey - send key events for i.MX6 S, DL and Q
2019-12-07 18:33:01 -08:00
Linus Torvalds
911d137ab0 Merge tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 "This is a relatively quiet cycle for nfsd, mainly various bugfixes.

  Possibly most interesting is Trond's fixes for some callback races
  that were due to my incomplete understanding of rpc client shutdown.
  Unfortunately at the last minute I've started noticing a new
  intermittent failure to send callbacks. As the logic seems basically
  correct, I'm leaving Trond's patches in for now, and hope to find a
  fix in the next week so I don't have to revert those patches"

* tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits)
  nfsd: depend on CRYPTO_MD5 for legacy client tracking
  NFSD fixing possible null pointer derefering in copy offload
  nfsd: check for EBUSY from vfs_rmdir/vfs_unink.
  nfsd: Ensure CLONE persists data and metadata changes to the target file
  SUNRPC: Fix backchannel latency metrics
  nfsd: restore NFSv3 ACL support
  nfsd: v4 support requires CRYPTO_SHA256
  nfsd: Fix cld_net->cn_tfm initialization
  lockd: remove __KERNEL__ ifdefs
  sunrpc: remove __KERNEL__ ifdefs
  race in exportfs_decode_fh()
  nfsd: Drop LIST_HEAD where the variable it declares is never used.
  nfsd: document callback_wq serialization of callback code
  nfsd: mark cb path down on unknown errors
  nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
  nfsd: minor 4.1 callback cleanup
  SUNRPC: Fix svcauth_gss_proxy_init()
  SUNRPC: Trace gssproxy upcall results
  sunrpc: fix crash when cache_head become valid before update
  nfsd: remove private bin2hex implementation
  ...
2019-12-07 16:56:00 -08:00
Linus Torvalds
fb9bf40cf0 Merge tag 'nfs-for-5.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Features:

   - NFSv4.2 now supports cross device offloaded copy (i.e. offloaded
     copy of a file from one source server to a different target
     server).

   - New RDMA tracepoints for debugging congestion control and Local
     Invalidate WRs.

  Bugfixes and cleanups

   - Drop the NFSv4.1 session slot if nfs4_delegreturn_prepare waits for
     layoutreturn

   - Handle bad/dead sessions correctly in nfs41_sequence_process()

   - Various bugfixes to the delegation return operation.

   - Various bugfixes pertaining to delegations that have been revoked.

   - Cleanups to the NFS timespec code to avoid unnecessary conversions
     between timespec and timespec64.

   - Fix unstable RDMA connections after a reconnect

   - Close race between waking an RDMA sender and posting a receive

   - Wake pending RDMA tasks if connection fails

   - Fix MR list corruption, and clean up MR usage

   - Fix another RPCSEC_GSS issue with MIC buffer space"

* tag 'nfs-for-5.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
  SUNRPC: Capture completion of all RPC tasks
  SUNRPC: Fix another issue with MIC buffer space
  NFS4: Trace lock reclaims
  NFS4: Trace state recovery operation
  NFSv4.2 fix memory leak in nfs42_ssc_open
  NFSv4.2 fix kfree in __nfs42_copy_file_range
  NFS: remove duplicated include from nfs4file.c
  NFSv4: Make _nfs42_proc_copy_notify() static
  NFS: Fallocate should use the nfs4_fattr_bitmap
  NFS: Return -ETXTBSY when attempting to write to a swapfile
  fs: nfs: sysfs: Remove NULL check before kfree
  NFS: remove unneeded semicolon
  NFSv4: add declaration of current_stateid
  NFSv4.x: Drop the slot if nfs4_delegreturn_prepare waits for layoutreturn
  NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process()
  nfsv4: Move NFSPROC4_CLNT_COPY_NOTIFY to end of list
  SUNRPC: Avoid RPC delays when exiting suspend
  NFS: Add a tracepoint in nfs_fh_to_dentry()
  NFSv4: Don't retry the GETATTR on old stateid in nfs4_delegreturn_done()
  NFSv4: Handle NFS4ERR_OLD_STATEID in delegreturn
  ...
2019-12-07 16:50:55 -08:00
Linus Torvalds
a28c8b9db8 pipe: remove 'waiting_writers' merging logic
This code is ancient, and goes back to when we only had a single page
for the pipe buffers.  The exact history is hidden in the mists of time
(ie "before git", and in fact predates the BK repository too).

At that long-ago point in time, it actually helped to try to merge big
back-and-forth pipe reads and writes, and not limit pipe reads to the
single pipe buffer in length just because that was all we had at a time.

However, since then we've expanded the pipe buffers to multiple pages,
and this logic really doesn't seem to make sense.  And a lot of it is
somewhat questionable (ie "hmm, the user asked for a non-blocking read,
but we see that there's a writer pending, so let's wait anyway to get
the extra data that the writer will have").

But more importantly, it makes the "go to sleep" logic much less
obvious, and considering the wakeup issues we've had, I want to make for
less of those kinds of things.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-07 13:21:01 -08:00
Eric Dumazet
501a90c945 inet: protect against too small mtu values.
syzbot was once again able to crash a host by setting a very small mtu
on loopback device.

Let's make inetdev_valid_mtu() available in include/net/ip.h,
and use it in ip_setup_cork(), so that we protect both ip_append_page()
and __ip_append_data()

Also add a READ_ONCE() when the device mtu is read.

Pairs this lockless read with one WRITE_ONCE() in __dev_set_mtu(),
even if other code paths might write over this field.

Add a big comment in include/linux/netdevice.h about dev->mtu
needing READ_ONCE()/WRITE_ONCE() annotations.

Hopefully we will add the missing ones in followup patches.

[1]

refcount_t: saturated; leaking memory.
WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 panic+0x2e3/0x75c kernel/panic.c:221
 __warn.cold+0x2f/0x3e kernel/panic.c:582
 report_bug+0x289/0x300 lib/bug.c:195
 fixup_bug arch/x86/kernel/traps.c:174 [inline]
 fixup_bug arch/x86/kernel/traps.c:169 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89
RSP: 0018:ffff88809689f550 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c
RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1
R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001
R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40
 refcount_add include/linux/refcount.h:193 [inline]
 skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999
 sock_wmalloc+0xf1/0x120 net/core/sock.c:2096
 ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383
 udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276
 inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821
 kernel_sendpage+0x92/0xf0 net/socket.c:3794
 sock_sendpage+0x8b/0xc0 net/socket.c:936
 pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458
 splice_from_pipe_feed fs/splice.c:512 [inline]
 __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636
 splice_from_pipe+0x108/0x170 fs/splice.c:671
 generic_splice_sendpage+0x3c/0x50 fs/splice.c:842
 do_splice_from fs/splice.c:861 [inline]
 direct_splice_actor+0x123/0x190 fs/splice.c:1035
 splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990
 do_splice_direct+0x1da/0x2a0 fs/splice.c:1078
 do_sendfile+0x597/0xd00 fs/read_write.c:1464
 __do_sys_sendfile64 fs/read_write.c:1525 [inline]
 __se_sys_sendfile64 fs/read_write.c:1511 [inline]
 __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441409
Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409
RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005
RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010
R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180
R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..

Fixes: 1470ddf7f8 ("inet: Remove explicit write references to sk/inet in ip_append_data")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-07 11:55:11 -08:00
Guillaume Nault
721c8dafad tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
Syncookies borrow the ->rx_opt.ts_recent_stamp field to store the
timestamp of the last synflood. Protect them with READ_ONCE() and
WRITE_ONCE() since reads and writes aren't serialised.

Use of .rx_opt.ts_recent_stamp for storing the synflood timestamp was
introduced by a0f82f64e2 ("syncookies: remove last_synq_overflow from
struct tcp_sock"). But unprotected accesses were already there when
timestamp was stored in .last_synq_overflow.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-06 21:05:14 -08:00
Guillaume Nault
cb44a08f86 tcp: tighten acceptance of ACKs not matching a child socket
When no synflood occurs, the synflood timestamp isn't updated.
Therefore it can be so old that time_after32() can consider it to be
in the future.

That's a problem for tcp_synq_no_recent_overflow() as it may report
that a recent overflow occurred while, in fact, it's just that jiffies
has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31.

Spurious detection of recent overflows lead to extra syncookie
verification in cookie_v[46]_check(). At that point, the verification
should fail and the packet dropped. But we should have dropped the
packet earlier as we didn't even send a syncookie.

Let's refine tcp_synq_no_recent_overflow() to report a recent overflow
only if jiffies is within the
[last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This
way, no spurious recent overflow is reported when jiffies wraps and
'last_overflow' becomes in the future from the point of view of
time_after32().

However, if jiffies wraps and enters the
[last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with
'last_overflow' being a stale synflood timestamp), then
tcp_synq_no_recent_overflow() still erroneously reports an
overflow. In such cases, we have to rely on syncookie verification
to drop the packet. We unfortunately have no way to differentiate
between a fresh and a stale syncookie timestamp.

In practice, using last_overflow as lower bound is problematic.
If the synflood timestamp is concurrently updated between the time
we read jiffies and the moment we store the timestamp in
'last_overflow', then 'now' becomes smaller than 'last_overflow' and
tcp_synq_no_recent_overflow() returns true, potentially dropping a
valid syncookie.

Reading jiffies after loading the timestamp could fix the problem,
but that'd require a memory barrier. Let's just accommodate for
potential timestamp growth instead and extend the interval using
'last_overflow - HZ' as lower bound.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-06 21:05:14 -08:00
Guillaume Nault
04d26e7b15 tcp: fix rejected syncookies due to stale timestamps
If no synflood happens for a long enough period of time, then the
synflood timestamp isn't refreshed and jiffies can advance so much
that time_after32() can't accurately compare them any more.

Therefore, we can end up in a situation where time_after32(now,
last_overflow + HZ) returns false, just because these two values are
too far apart. In that case, the synflood timestamp isn't updated as
it should be, which can trick tcp_synq_no_recent_overflow() into
rejecting valid syncookies.

For example, let's consider the following scenario on a system
with HZ=1000:

  * The synflood timestamp is 0, either because that's the timestamp
    of the last synflood or, more commonly, because we're working with
    a freshly created socket.

  * We receive a new SYN, which triggers synflood protection. Let's say
    that this happens when jiffies == 2147484649 (that is,
    'synflood timestamp' + HZ + 2^31 + 1).

  * Then tcp_synq_overflow() doesn't update the synflood timestamp,
    because time_after32(2147484649, 1000) returns false.
    With:
      - 2147484649: the value of jiffies, aka. 'now'.
      - 1000: the value of 'last_overflow' + HZ.

  * A bit later, we receive the ACK completing the 3WHS. But
    cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
    says that we're not under synflood. That's because
    time_after32(2147484649, 120000) returns false.
    With:
      - 2147484649: the value of jiffies, aka. 'now'.
      - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.

    Of course, in reality jiffies would have increased a bit, but this
    condition will last for the next 119 seconds, which is far enough
    to accommodate for jiffie's growth.

Fix this by updating the overflow timestamp whenever jiffies isn't
within the [last_overflow, last_overflow + HZ] range. That shouldn't
have any performance impact since the update still happens at most once
per second.

Now we're guaranteed to have fresh timestamps while under synflood, so
tcp_synq_no_recent_overflow() can safely use it with time_after32() in
such situations.

Stale timestamps can still make tcp_synq_no_recent_overflow() return
the wrong verdict when not under synflood. This will be handled in the
next patch.

For 64 bits architectures, the problem was introduced with the
conversion of ->tw_ts_recent_stamp to 32 bits integer by commit
cca9bab1b7 ("tcp: use monotonic timestamps for PAWS").
The problem has always been there on 32 bits architectures.

Fixes: cca9bab1b7 ("tcp: use monotonic timestamps for PAWS")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-06 21:05:14 -08:00
John Hurley
dbad340889 net: core: rename indirect block ingress cb function
With indirect blocks, a driver can register for callbacks from a device
that is does not 'own', for example, a tunnel device. When registering to
or unregistering from a new device, a callback is triggered to generate
a bind/unbind event. This, in turn, allows the driver to receive any
existing rules or to properly clean up installed rules.

When first added, it was assumed that all indirect block registrations
would be for ingress offloads. However, the NFP driver can, in some
instances, support clsact qdisc binds for egress offload.

Change the name of the indirect block callback command in flow_offload to
remove the 'ingress' identifier from it. While this does not change
functionality, a follow up patch will implement a more more generic
callback than just those currently just supporting ingress offload.

Fixes: 4d12ba4278 ("nfp: flower: allow offloading of matches on 'internal' ports")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-06 20:45:09 -08:00