linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-09 03:10:30 -04:00

Coly Li 7ba0d830dc bcache: set error_limit correctly

Struct cache uses io_errors for two purposes,
- Error decay: when cache set error_decay is set, io_errors is used to
  generate a small delay when an I/O error happens.
- I/O errors counter: in order to generate a big enough value for error
  decay, the I/O errors counter value is stored left shifted by 20 bits
  (a.k.a. IO_ERROR_SHIFT).
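
As a minimal userspace sketch of the shifted counter described above: the
shift value 20 comes from this message, while struct fake_cache and
record_io_error() are hypothetical names, and a plain uint32_t stands in for
the atomic field used by the kernel.

```c
#include <stdint.h>

#define IO_ERROR_SHIFT 20	/* shift width stated in the commit message */

/* Hypothetical stand-in for struct cache; the kernel field is atomic. */
struct fake_cache {
	uint32_t io_errors;	/* error count stored left shifted */
};

/*
 * Record one I/O error in the amplified representation and return the
 * plain error count recovered by shifting back.
 */
static uint32_t record_io_error(struct fake_cache *ca)
{
	ca->io_errors += UINT32_C(1) << IO_ERROR_SHIFT;
	return ca->io_errors >> IO_ERROR_SHIFT;
}
```

Each recorded error adds 1 << 20 to the stored value, so the stored value and
the plain count differ by a factor of 2^20.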

In function bch_count_io_errors(), if the I/O errors counter reaches the cache
set error limit, bch_cache_set_error() will be called to retire the whole cache
set. But the current code is problematic when checking the error limit, see the
following code piece from bch_count_io_errors(),

 90     if (error) {
 91             char buf[BDEVNAME_SIZE];
 92             unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT,
 93                                                 &ca->io_errors);
 94             errors >>= IO_ERROR_SHIFT;
 95
 96             if (errors < ca->set->error_limit)
 97                     pr_err("%s: IO error on %s, recovering",
 98                            bdevname(ca->bdev, buf), m);
 99             else
100                     bch_cache_set_error(ca->set,
101                                         "%s: too many IO errors %s",
102                                         bdevname(ca->bdev, buf), m);
103     }

At line 94, errors is right shifted by IO_ERROR_SHIFT bits, so it is now the
real errors counter that is compared at line 96. But ca->set->error_limit is
initialized with an amplified value in bch_cache_set_alloc(),
1545         c->error_limit  = 8 << IO_ERROR_SHIFT;

It means by default, in bch_count_io_errors(), before 8<<20 errors happen
bch_cache_set_error() won't be called to retire the problematic cache
device. If the average request size is 64KB, it means bcache won't handle
a failed device until 512GB of data is requested. This is too large to be an
I/O threshold. So I believe the correct error limit should be much less.
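
The 512GB figure can be checked with a small sketch. It assumes every failing
request counts as exactly one error and every request is 64KB;
bytes_before_retire() is a hypothetical helper, not a bcache function.

```c
#include <stdint.h>

#define IO_ERROR_SHIFT 20	/* shift width stated in the commit message */

/*
 * Amount of failed I/O (in bytes) requested before the plain error count
 * reaches error_limit, assuming one error per request of request_bytes.
 */
static uint64_t bytes_before_retire(uint64_t error_limit,
				    uint64_t request_bytes)
{
	return error_limit * request_bytes;
}
```

With the old default, bytes_before_retire(8ULL << IO_ERROR_SHIFT, 64 * 1024)
is 2^23 * 2^16 = 2^39 bytes, i.e. 512GB. With a limit of 8 the same
calculation gives 512KB.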

This patch sets the default cache set error limit to 8, then in
bch_count_io_errors() when the errors counter reaches 8 (if it is the default
value), function bch_cache_set_error() will be called to retire the whole
cache set. This patch also removes the bit shifting when storing or showing
the io_error_limit value via the sysfs interface.
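
The effect of the new default can be sketched in userspace by replaying the
steps of lines 92-96 of the quoted code. first_retiring_error() is a
hypothetical helper and a plain uint32_t stands in for the atomic counter, so
this is an illustration under those assumptions, not the kernel code.

```c
#include <stdint.h>

#define IO_ERROR_SHIFT 20	/* shift width stated in the commit message */

/*
 * Feed max_errors consecutive I/O errors through the same add, shift and
 * compare steps as the quoted code. Return the 1-based number of the error
 * at which bch_cache_set_error() would be called, or 0 if that never
 * happens. Keep max_errors below 4096 so the 32-bit stand-in counter
 * does not wrap.
 */
static unsigned first_retiring_error(uint32_t error_limit,
				     unsigned max_errors)
{
	uint32_t io_errors = 0;	/* stand-in for ca->io_errors */
	unsigned n;

	for (n = 1; n <= max_errors; n++) {
		uint32_t errors;

		io_errors += UINT32_C(1) << IO_ERROR_SHIFT;
		errors = io_errors >> IO_ERROR_SHIFT;
		if (!(errors < error_limit))
			return n;	/* "too many IO errors" branch */
	}
	return 0;
}
```

With error_limit set to 8, the eighth error retires the cache set. With the
old amplified value 8 << IO_ERROR_SHIFT, a run of 100 errors never reaches the
limit.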

Nowadays most SSDs handle internal flash failures automatically by
remapping LBA addresses. If an I/O error can be observed by upper layer
code, it is a notable error because the SSD can not remap the problematic
LBA address to an available flash block. This situation indicates the whole
SSD will fail very soon. Therefore setting 8 as the default io error limit
value makes sense; it is enough for most cache devices.

Changelog:
v2: add reviewed-by from Hannes.
v1: initial version for review.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2018-02-07 12:50:01 -07:00

arch

Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

2018-02-06 11:27:48 -08:00

block

block: Add should_fail_bio() for bpf error injection

2018-02-06 15:09:51 -07:00

certs

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

2017-11-02 11:10:55 +01:00

crypto

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

2018-01-31 14:22:45 -08:00

Documentation

Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

2018-02-06 11:27:48 -08:00

drivers

bcache: set error_limit correctly

2018-02-07 12:50:01 -07:00

firmware

kbuild: remove all dummy assignments to obj-

2017-11-18 11:46:06 +09:00

Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

2018-02-06 11:27:48 -08:00

include

Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

2018-02-06 11:27:48 -08:00

init

Merge tag 'init_task-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

2018-01-29 09:08:34 -08:00

ipc

Merge branch 'work.mqueue' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

2018-01-30 18:32:21 -08:00

kernel

Merge tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

2018-02-06 10:41:33 -08:00

lib

Merge tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

2018-02-06 09:59:40 -08:00

LICENSES

LICENSES: Add MPL-1.1 license

2018-01-06 10:59:44 -07:00

Merge tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

2018-02-06 10:41:33 -08:00

net

Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2018-02-04 11:45:55 -08:00

samples

Merge tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

2018-02-01 10:00:28 -08:00

scripts

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk

2018-02-01 13:36:15 -08:00

security

Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

2018-02-03 16:25:42 -08:00

sound

Merge tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

2018-02-01 10:00:28 -08:00

tools

Merge tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

2018-02-06 10:41:33 -08:00

usr

initramfs: fix initramfs rebuilds w/ compression after disabling

2017-11-03 07:39:19 -07:00

virt

Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

2018-02-03 16:25:42 -08:00

.cocciconfig

scripts: add Linux .cocciconfig for coccinelle

2016-07-22 12:13:39 +02:00

.get_maintainer.ignore

Add hch to .get_maintainer.ignore

2015-08-21 14:30:10 -07:00

.gitattributes

.gitattributes: set git diff driver for C source code files

2016-10-07 18:46:30 -07:00

.gitignore

scripts/package: snap-pkg target

2017-12-13 00:00:18 +09:00

.mailmap

mailmap: update Mark Yao's email address

2018-01-04 16:45:09 -08:00

COPYING

…

CREDITS

MAINTAINERS: update TPM driver infrastructure changes

2017-11-09 17:58:40 -08:00

Kbuild

Merge tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

2017-11-17 17:45:29 -08:00

Kconfig

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

2017-11-02 11:10:55 +01:00

MAINTAINERS

Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

2018-02-06 11:27:48 -08:00

Makefile

Merge tag 'kconfig-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

2018-02-01 11:45:49 -08:00

README

README: add a new README file, pointing to the Documentation/

2016-10-24 08:12:35 -02:00

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.

Languages

C 97%

Assembly 1%

Shell 0.6%

Rust 0.5%

Python 0.4%

Other 0.3%