linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-03-17 03:59:34 -04:00

Author	SHA1	Message	Date
Paul Burton	1fad12cd5e	irqchip: mips-gic: Remove linux/irqchip/mips-gic.h The linux/irqchip/mips-gic.h header is no longer used. Remove it. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17049/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	16ae123e89	MIPS: VDSO: Avoid use of linux/irqchip/mips-gic.h Our VDSO code makes use of macros from linux/irqchip/mips-gic.h to provide offsets to register values, but these are trivial offsets to the two 32 bit halves of a 64 bit value. Replace use of the macros with zero (ie. omit adding an offset) and the size of the low 32 bit of the value. This removes our need for linux/irqchip/mips-gic.h & prepares us for it to be removed. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17047/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	dd0163508c	irqchip: mips-gic: Move gic_get_c0_*_int() to asm/mips-gic.h The linux/irqchip/mips-gic.h header is now almost empty. Move the declarations of gic_get_c0_compare_int(), gic_get_c0_perfcount_int() & gic_get_c0_fdc_int() to asm/mips-gic.h in order to close in on being able to delete the former header. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17046/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	56d7b61dc6	irqchip: mips-gic: Remove gic_present Nothing uses the global gic_present variable anymore; mips_gic_present() should be used instead. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17045/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	85eec73ce4	irqchip: mips-gic: Remove gic_init() All in-tree platforms now probe the GIC driver using device tree, and as such nothing calls gic_init() any longer. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17043/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	84103814a2	irqchip: mips-gic: Remove gic_get_usm_range() The MIPS VDSO code is no longer reliant upon the irqchip driver to provide the address of the GIC's user-visible section via gic_get_usm_range(). Remove the now-dead code. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17041/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	b11d4c1f5a	irqchip: mips-gic: Move various definitions to the driver Move the definitions of macros used to convert between hardware IRQ numbers & shared or local interrupt numbers into the irqchip driver, which is all that should ever need to care about them. Remove GIC_CPU_TO_VEC_OFFSET() in the process since it's never used. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17039/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	3ee50dcbef	irqchip: mips-gic: Remove GIC_CPU_INT* macros The GIC_CPU_INT* macros are never used. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17038/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	ba9cc4352e	MIPS: GIC: Move GIC_LOCAL_INT_* to asm/mips-gic.h Move the definition of VP-local interrupts provided by the MIPS Global Interrupt Controller to the new asm/mips-gic.h header to be alongside the new accessor functions. Whilst at it, convert to an enum which lends itself more easily to expansion & documentation. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17037/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	0d0cf58cd6	irqchip: mips-gic: Convert remaining local reg access to new accessors Convert the remaining accesses to registers in the GIC VP-local & VP-other register blocks to use the new accessor functions provided by asm/mips-gic.h, resulting in code which is often shorter & easier to read. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17036/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	9da3c64589	irqchip: mips-gic: Convert local int mask access to new accessors Use the new accessor functions provided by asm/mips-gic.h to access masks controlling local interrupts, resulting in code which is often shorter & easier to read. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17035/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	3680746abd	irqchip: mips-gic: Convert remaining shared reg access to new accessors Convert the remaining accesses to registers in the GIC shared register block to use the new accessor functions provided by asm/mips-gic.h, resulting in code which is often shorter & easier to read. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17034/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	0efe3cbf15	irqchip: mips-gic: Remove gic_map_to_vpe() Remove the gic_map_to_vpe() function in favour of using the new write_gic_map_vp() accessor function which isn't any more complex to use & allows us to drop a level of abstraction. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17033/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	d3e8cf4479	irqchip: mips-gic: Remove gic_map_to_pin() Remove the gic_map_to_pin() function in favour of using the new write_gic_map_pin() accessor function which isn't any more complex to use & allows us to drop a level of abstraction. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17032/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	c26ba670cd	irqchip: mips-gic: Remove gic_set_dual_edge() Remove the gic_set_dual_edge() function in favour of using the new change_gic_dual() accessor function which provides equivalent functionality. This also allows us to remove the gic_update_bits() function which gic_set_dual_edge() was the last user of, along with the GIC_INTR_OFS() & GIC_INTR_BIT() macros. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17031/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	471aa962a6	irqchip: mips-gic: Remove gic_set_trigger() Remove the gic_set_trigger() function in favour of using the new change_gic_trig() accessor function which provides equivalent functionality. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17030/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	80e5f9c9e2	irqchip: mips-gic: Remove gic_set_polarity() Remove the gic_set_polarity() function in favour of using the new change_gic_pol() accessor function which provides equivalent functionality. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17029/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	87554b0ef3	irqchip: mips-gic: Drop gic_(re)set_mask() functions The gic_set_mask() & gic_reset_mask() functions are now no more convenient to call than the write_gic_smask() or write_gic_rmask() accessor functions. Remove the layer of abstraction. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17028/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	a0dc5cb5e3	irqchip: mips-gic: Simplify gic_local_irq_domain_map() Simplify gic_local_irq_domain_map() by: - Moving the check for invalid IRQs outside of the loop. - Moving the decision about whether to use gic_cpu_pin or timer_cpu_pin outside of the loop. - Using the new write_gic_vo_map() accessor function to avoid the need to handle each map register separately. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17027/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	e98fcb2a8c	irqchip: mips-gic: Simplify shared interrupt pending/mask reads Simplify the reads of the bitmaps indicating pending & masked interrupts in gic_handle_shared_int() using the __ioread32_copy() & __ioread64_copy() helper functions. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17026/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	9762d2e6d3	irqchip: mips-gic: Remove gic_read_local_vp_id() Nothing needs gic_read_local_vp_id() any longer, so remove the dead code. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17024/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Paul Burton	095a7e388b	irqchip: mips-gic: Remove counter access functions The MIPS GIC clocksource driver is no longer using the accessor functions provided by the irqchip driver, so remove them. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17022/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-09-04 13:53:14 +02:00
Linus Torvalds	2c25833c42	Merge tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "Another fix, this time in common IOMMU sysfs code. In the conversion from the old iommu sysfs-code to the iommu_device_register interface, I missed to update the release path for the struct device associated with an IOMMU. It freed the 'struct device', which was a pointer before, but is now embedded in another struct. Freeing from the middle of allocated memory had all kinds of nasty side effects when an IOMMU was unplugged. Unfortunatly nobody unplugged and IOMMU until now, so this was not discovered earlier. The fix is to make the 'struct device' a pointer again" * tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix wrong freeing of iommu_device->dev	2017-08-27 17:10:34 -07:00
Linus Torvalds	c3c162635f	Merge tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/iio fixes from Greg KH: "Here are few small staging driver fixes, and some more IIO driver fixes for 4.13-rc7. Nothing major, just resolutions for some reported problems. All of these have been in linux-next with no reported problems" * tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: magnetometer: st_magn: remove ihl property for LSM303AGR iio: magnetometer: st_magn: fix status register address for LSM303AGR iio: hid-sensor-trigger: Fix the race with user space powering up sensors iio: trigger: stm32-timer: fix get trigger mode iio: imu: adis16480: Fix acceleration scale factor for adis16480 PATCH] iio: Fix some documentation warnings staging: rtl8188eu: add RNX-N150NUB support Revert "staging: fsl-mc: be consistent when checking strcmp() return" iio: adc: stm32: fix common clock rate iio: adc: ina219: Avoid underflow for sleeping time iio: trigger: stm32-timer: add enable attribute iio: trigger: stm32-timer: fix get/set down count direction iio: trigger: stm32-timer: fix write_raw return value iio: trigger: stm32-timer: fix quadrature mode get routine iio: bmp280: properly initialize device for humidity reading	2017-08-27 17:03:33 -07:00
Linus Torvalds	0cc3b0ec23	Clarify (and fix) MAX_LFS_FILESIZE macros We have a MAX_LFS_FILESIZE macro that is meant to be filled in by filesystems (and other IO targets) that know they are 64-bit clean and don't have any 32-bit limits in their IO path. It turns out that our 32-bit value for that limit was bogus. On 32-bit, the VM layer is limited by the page cache to only 32-bit index values, but our logic for that was confusing and actually wrong. We used to define that value to (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1) which is actually odd in several ways: it limits the index to 31 bits, and then it limits files so that they can't have data in that last byte of a page that has the highest 31-bit index (ie page index 0x7fffffff). Neither of those limitations make sense. The index is actually the full 32 bit unsigned value, and we can use that whole full page. So the maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG". However, we do wan tto avoid the maximum index, because we have code that iterates over the page indexes, and we don't want that code to overflow. So the maximum size of a file on a 32-bit host should actually be one page less than the full 32-bit index. So the actual limit is ULONG_MAX << PAGE_SHIFT. That means that we will not actually be using the page of that last index (ULONG_MAX), but we can grow a file up to that limit. The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5 volume. It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one byte less), but the actual true VM limit is one page less than 16TiB. This was invisible until commit `c2a9737f45` ("vfs,mm: fix a dead loop in truncate_inode_pages_range()"), which started applying that MAX_LFS_FILESIZE limit to block devices too. NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is actually just the offset type itself (loff_t), which is signed. But for clarity, on 64-bit, just use the maximum signed value, and don't make people have to count the number of 'f' characters in the hex constant. So just use LLONG_MAX for the 64-bit case. That was what the value had been before too, just written out as a hex constant. Fixes: `c2a9737f45` ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") Reported-and-tested-by: Doug Nazar <nazard@nazar.ca> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-27 12:12:25 -07:00
Linus Torvalds	0b31c3ec1b	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A small batch of fixes that should be included for the 4.13 release. This contains: - Revert of the 4k loop blocksize support. Even with a recent batch of 4 fixes, we're still not really happy with it. Rather than be stuck with an API issue, let's revert it and get it right for 4.14. - Trivial patch from Bart, adding a few flags to the blk-mq debugfs exports that were added in this release, but not to the debugfs parts. - Regression fix for bsg, fixing a potential kernel panic. From Benjamin. - Tweak for the blk throttling, improving how we account discards. From Shaohua" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq-debugfs: Add names for recently added flags bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer Revert "loop: support 4k physical blocksize" blk-throttle: cap discard request size	2017-08-25 17:02:59 -07:00
Linus Torvalds	90a6cd5039	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull more rdma fixes from Doug Ledford: "Well, I thought we were going to be done for this -rc cycle. I should have known better than to say so though. We have four additional items that trickled in. One was a simple mistake on my part. I took a patch into my for-next thinking that the issue was less severe than it was. I was then notified that it needed to be in my -rc area instead. The other three were just found late in testing. Summary: - One core fix accidentally applied first to for-next and then cherry picked back because it needed to be in the -rc cycles instead - Another core fix - Two mlx5 fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/mlx5: Always return success for RoCE modify port IB/mlx5: Fix Raw Packet QP event handler assignment IB/core: Avoid accessing non-allocated memory when inferring port type RDMA/uverbs: Initialize cq_context appropriately	2017-08-24 15:48:38 -07:00
Linus Torvalds	f7bbf0754b	Merge tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix linker script regression caused by dead code elimination support - fix typos and outdated comments - specify kselftest-clean as a PHONY target - fix "make dtbs_install" when $(srctree) includes shell special characters like '~' - Move -fshort-wchar to the global option list because defining it partially emits warnings * tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: update comments of Makefile.asm-generic kbuild: Do not use hyphen in exported variable name Makefile: add kselftest-clean to PHONY target list Kbuild: use -fshort-wchar globally fixdep: trivial: typo fix and correction kbuild: trivial cleanups on the comments kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured	2017-08-24 14:22:27 -07:00
Eric W. Biederman	311fc65c9f	pty: Repair TIOCGPTPEER The implementation of TIOCGPTPEER has two issues. When /dev/ptmx (as opposed to /dev/pts/ptmx) is opened the wrong vfsmount is passed to dentry_open. Which results in the kernel displaying the wrong pathname for the peer. The second is simply by caching the vfsmount and dentry of the peer it leaves them open, in a way they were not previously Which because of the inreased reference counts can cause unnecessary behaviour differences resulting in regressions. To fix these move the ioctl into tty_io.c at a generic level allowing the ioctl to have access to the struct file on which the ioctl is being called. This allows the path of the slave to be derived when opening the slave through TIOCGPTPEER instead of requiring the path to the slave be cached. Thus removing the need for caching the path. A new function devpts_ptmx_path is factored out of devpts_acquire and used to implement a function devpts_mntget. The new function devpts_mntget takes a filp to perform the lookup on and fsi so that it can confirm that the superblock that is found by devpts_ptmx_path is the proper superblock. v2: Lots of fixes to make the code actually work v3: Suggestions by Linus - Removed the unnecessary initialization of filp in ptm_open_peer - Simplified devpts_ptmx_path as gotos are no longer required [ This is the fix for the issue that was reverted in commit `143c97cc65`, but this time without breaking 'pbuilder' due to increased reference counts - Linus ] Fixes: `54ebbfb160` ("tty: add TIOCGPTPEER ioctl") Reported-by: Christian Brauner <christian.brauner@canonical.com> Reported-and-tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-24 13:23:03 -07:00
Noa Osherovich	498ca3c82a	IB/core: Avoid accessing non-allocated memory when inferring port type Commit `44c58487d5` ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") introduced the concept of type in ah_attr: * During ib_register_device, each port is checked for its type which is stored in ib_device's port_immutable array. * During uverbs' modify_qp, the type is inferred using the port number in ib_uverbs_qp_dest struct (address vector) by accessing the relevant port_immutable array and the type is passed on to providers. IB spec (version 1.3) enforces a valid port value only in Reset to Init. During Init to RTR, the address vector must be valid but port number is not mentioned as a field in the address vector, so its value is not validated, which leads to accesses to a non-allocated memory when inferring the port type. Save the real port number in ib_qp during modify to Init (when the comp_mask indicates that the port number is valid) and use this value to infer the port type. Avoid copying the address vector fields if the matching bit is not set in the attr_mask. Address vector can't be modified before the port, so no valid flow is affected. Fixes: `44c58487d5` ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 15:33:33 -04:00
Benjamin Block	50b4d48552	bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer Since we split the scsi_request out of struct request bsg fails to provide a reply-buffer for the drivers. This was done via the pointer for sense-data, that is not preallocated anymore. Failing to allocate/assign it results in illegal dereferences because LLDs use this pointer unquestioned. An example panic on s390x, using the zFCP driver, looks like this (I had debugging on, otherwise NULL-pointer dereferences wouldn't even panic on s390x): Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6403 Fault in home space mode while using kernel ASCE. AS:0000000001590007 R3:0000000000000024 Oops: 0038 ilc:2 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: <Long List> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-bsg-regression+ #3 Hardware name: IBM 2964 N96 702 (z/VM 6.4.0) task: 0000000065cb0100 task.stack: 0000000065cb4000 Krnl PSW : 0704e00180000000 000003ff801e4156 (zfcp_fc_ct_els_job_handler+0x16/0x58 [zfcp]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000001 000000005fa9d0d0 000000005fa9d078 0000000000e16866 000003ff00000290 6b6b6b6b6b6b6b6b 0000000059f78f00 000000000000000f 00000000593a0958 00000000593a0958 0000000060d88800 000000005ddd4c38 0000000058b50100 07000000659cba08 000003ff801e8556 00000000659cb9a8 Krnl Code: 000003ff801e4146: e31020500004 lg %r1,80(%r2) 000003ff801e414c: 58402040 l %r4,64(%r2) #000003ff801e4150: e35020200004 lg %r5,32(%r2) >000003ff801e4156: 50405004 st %r4,4(%r5) 000003ff801e415a: e54c50080000 mvhi 8(%r5),0 000003ff801e4160: e33010280012 lt %r3,40(%r1) 000003ff801e4166: a718fffb lhi %r1,-5 000003ff801e416a: 1803 lr %r0,%r3 Call Trace: ([<000003ff801e8556>] zfcp_fsf_req_complete+0x726/0x768 [zfcp]) [<000003ff801ea82a>] zfcp_fsf_reqid_check+0x102/0x180 [zfcp] [<000003ff801eb980>] zfcp_qdio_int_resp+0x230/0x278 [zfcp] [<00000000009b91b6>] qdio_kick_handler+0x2ae/0x2c8 [<00000000009b9e3e>] __tiqdio_inbound_processing+0x406/0xc10 [<00000000001684c2>] tasklet_action+0x15a/0x1d8 [<0000000000bd28ec>] __do_softirq+0x3ec/0x848 [<00000000001675a4>] irq_exit+0x74/0xf8 [<000000000010dd6a>] do_IRQ+0xba/0xf0 [<0000000000bd19e8>] io_int_handler+0x104/0x2d4 [<00000000001033b6>] enabled_wait+0xb6/0x188 ([<000000000010339e>] enabled_wait+0x9e/0x188) [<000000000010396a>] arch_cpu_idle+0x32/0x50 [<0000000000bd0112>] default_idle_call+0x52/0x68 [<00000000001cd0fa>] do_idle+0x102/0x188 [<00000000001cd41e>] cpu_startup_entry+0x3e/0x48 [<0000000000118c64>] smp_start_secondary+0x11c/0x130 [<0000000000bd2016>] restart_int_handler+0x62/0x78 [<0000000000000000>] (null) INFO: lockdep is turned off. Last Breaking-Event-Address: [<000003ff801e41d6>] zfcp_fc_ct_job_handler+0x3e/0x48 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt This patch moves bsg-lib to allocate and setup struct bsg_job ahead of time, including the allocation of a buffer for the reply-data. This means, struct bsg_job is not allocated separately anymore, but as part of struct request allocation - similar to struct scsi_cmd. Reflect this in the function names that used to handle creation/destruction of struct bsg_job. Reported-by: Steffen Maier <maier@linux.vnet.ibm.com> Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Fixes: `82ed4db499` ("block: split scsi_request out of struct request") Cc: <stable@vger.kernel.org> #4.11+ Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-24 08:22:10 -06:00
Linus Torvalds	143c97cc65	Revert "pty: fix the cached path of the pty slave file descriptor in the master" This reverts commit `c8c03f1858`. It turns out that while fixing the ptmx file descriptor to have the correct 'struct path' to the associated slave pty is a really good thing, it breaks some user space tools for a very annoying reason. The problem is that /dev/ptmx and its associated slave pty (/dev/pts/X) are on different mounts. That was what caused us to have the wrong path in the first place (we would mix up the vfsmount of the 'ptmx' node, with the dentry of the pty slave node), but it also means that now while we use the right vfsmount, having the pty master open also keeps the pts mount busy. And it turn sout that that makes 'pbuilder' very unhappy, as noted by Stefan Lippers-Hollmann: "This patch introduces a regression for me when using pbuilder 0.228.7[2] (a helper to build Debian packages in a chroot and to create and update its chroots) when trying to umount /dev/ptmx (inside the chroot) on Debian/ unstable (full log and pbuilder configuration file[3] attached). [...] Setting up build-essential (12.3) ... Processing triggers for libc-bin (2.24-15) ... I: unmounting dev/ptmx filesystem W: Could not unmount dev/ptmx: umount: /var/cache/pbuilder/build/1340/dev/ptmx: target is busy (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1).)" apparently pbuilder tries to unmount the /dev/pts filesystem while still holding at least one master node open, which is arguably not very nice, but we don't break user space even when fixing other bugs. So this commit has to be reverted. I'll try to figure out a way to avoid caching the path to the slave pty in the master pty. The only thing that actually wants that slave pty path is the "TIOCGPTPEER" ioctl, and I think we could just recreate the path at that time. Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Cc: Eric W Biederman <ebiederm@xmission.com> Cc: Christian Brauner <christian.brauner@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-23 18:16:11 -07:00
Omar Sandoval	1e6ec9ea89	Revert "loop: support 4k physical blocksize" There's some stuff still up in the air, let's not get stuck with a subpar ABI. I'll follow up with something better for 4.14. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-23 15:57:55 -06:00
Linus Torvalds	55652400fd	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six minor and error leg fixes, plus one major change: the reversion of scsi-mq as the default. We're doing the latter temporarily (with a backport to stable) to give us time to fix all the issues that turned up with this default before trying again" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: cxgb4i: call neigh_event_send() to update MAC address Revert "scsi: default to scsi-mq" scsi: sd_zbc: Write unlock zone from sd_uninit_cmnd() scsi: aacraid: Fix out of bounds in aac_get_name_resp scsi: csiostor: fail probe if fw does not support FCoE scsi: megaraid_sas: fix error handle in megasas_probe_one	2017-08-23 11:34:40 -07:00
Linus Torvalds	e3181f2c0c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix IGMP handling wrt VRF, from David Ahern. 2) Fix timer access to freed object in dccp, from Eric Dumazet. 3) Use kmalloc_array() in ptr_ring to avoid overflow cases which are triggerable by userspace. Also from Eric Dumazet. 4) Fix infinite loop in unmapping cleanup of nfp driver, from Colin Ian King. 5) Correct datagram peek handling of empty SKBs, from Matthew Dawson. 6) Fix use after free in TIPC, from Eric Dumazet. 7) When replacing a route in ipv6 we need to reset the round robin pointer, from Wei Wang. 8) Fix bug in pci_find_pcie_root_port() which was unearthed by the relaxed ordering changes, from Thierry Redding. I made sure to get an explicit ACK from Bjorn this time around :-) * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits) ipv6: repair fib6 tree in failure case net_sched: fix order of queue length updates in qdisc_replace() tools lib bpf: improve warning switchdev: documentation: minor typo fixes bpf, doc: also add s390x as arch to sysctl description net: sched: fix NULL pointer dereference when action calls some targets rxrpc: Fix oops when discarding a preallocated service call irda: do not leak initialized list.dev to userspace net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled PCI: Allow PCI express root ports to find themselves tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set ipv6: reset fn->rr_ptr when replacing route sctp: fully initialize the IPv6 address in sctp_v6_to_addr() tipc: fix use-after-free tun: handle register_netdevice() failures properly datagram: When peeking datagrams with offset < 0 don't skip empty skbs bpf, doc: improve sysctl knob description netxen: fix incorrect loop counter decrement nfp: fix infinite loop on umapping cleanup ...	2017-08-21 13:16:27 -07:00
Oleg Nesterov	dd1c1f2f20	pids: make task_tgid_nr_ns() safe This was reported many times, and this was even mentioned in commit `52ee2dfdd4` ("pids: refactor vnr/nr_ns helpers to make them safe") but somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is not safe because task->group_leader points to nowhere after the exiting task passes exit_notify(), rcu_read_lock() can not help. We really need to change __unhash_process() to nullify group_leader, parent, and real_parent, but this needs some cleanups. Until then we can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and fix the problem. Reported-by: Troy Kensinger <tkensinger@google.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-21 12:47:31 -07:00
Konstantin Khlebnikov	68a66d149a	net_sched: fix order of queue length updates in qdisc_replace() This important to call qdisc_tree_reduce_backlog() after changing queue length. Parent qdisc should deactivate class in ->qlen_notify() called from qdisc_tree_reduce_backlog() but this happens only if qdisc->q.qlen in zero. Missed class deactivations leads to crashes/warnings at picking packets from empty qdisc and corrupting state at reactivating this class in future. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: `86a7996cc8` ("net_sched: introduce qdisc_replace() helper") Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-20 20:02:00 -07:00
Greg Kroah-Hartman	2c68888f1d	Merge tag 'fixes-for-4.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus Jonathan writes: Second set of IIO fixes for the 4.13 cycle. Given the late stage of this series, some more involved fixes have been held back for the upcoming merge window. The hid-sensor issue has been causing problems for a long time so it is great to have that one finally fixed! No more bug reports for the userspace guys (well about that anyway). * documentation - some warning fixes due to missing colons in kernel-doc. * adis16480 - fix accel scale factor. * bmp280 - properly initialize the device for humidity readings - without this the humidity readings may be skipped and a magic value of 0x8000 returned. * hid-sensor-strigger - fix a race with user space when powering up the sensor. * ina291 - Avoid an underflow for the sleeping time as a result of supporting the fastest rates. * st-magnetometer - Fix the status register address for hte LSM303AGR, - Remove the ihl property for LSM303AGR as the sensor doesn't support active low for the dataready line. * stm32-adc - Fix use of a common clock rate. * stm32-timer - fix the quadrature mode get routine to account for the magic 0 value. set on boot. - fix the return value of write_raw, - fix the get/set down count direction as the enum value was not being converted to the relevant bit field, - add an enable attribute to actually turn it on when in encoder mode, - missing mask when reading the trigger mode.	2017-08-20 10:45:54 -07:00
Linus Torvalds	e46db8d2ef	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Two fixes for the perf subsystem: - Fix an inconsistency of RDPMC mm struct tagging across exec() which causes RDPMC to fault. - Correct the timestamp mechanics across IOC_DISABLE/ENABLE which causes incorrect timestamps and total time calculations" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix time on IOC_ENABLE perf/x86: Fix RDPMC vs. mm_struct tracking	2017-08-20 09:20:57 -07:00
Linus Torvalds	e18a5ebc2d	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull watchdog fix from Thomas Gleixner: "A fix for the hardlockup watchdog to prevent false positives with extreme Turbo-Modes which make the perf/NMI watchdog fire faster than the hrtimer which is used to verify. Slightly larger than the minimal fix, which just would increase the hrtimer frequency, but comes with extra overhead of more watchdog timer interrupts and thread wakeups for all users. With this change we restrict the overhead to the extreme Turbo-Mode systems" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kernel/watchdog: Prevent false positives with turbo modes	2017-08-20 08:54:30 -07:00
Jonathan Corbet	f00fd7ae4f	PATCH] iio: Fix some documentation warnings The kerneldoc description for the trig_readonly field of struct iio_dev lacked a colon, leading to this doc build warning: ./include/linux/iio/iio.h:603: warning: No description found for parameter 'trig_readonly' A similar issue for iio_trigger_set_immutable() in trigger.h yielded: ./include/linux/iio/trigger.h:151: warning: No description found for parameter 'indio_dev' ./include/linux/iio/trigger.h:151: warning: No description found for parameter 'trig' Fix the formatting and silence the warnings. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2017-08-20 15:21:46 +01:00
Michal Hocko	6b31d5955c	mm, oom: fix potential data corruption when oom_reaper races with writer Wenwei Tao has noticed that our current assumption that the oom victim is dying and never doing any visible changes after it dies, and so the oom_reaper can tear it down, is not entirely true. __task_will_free_mem consider a task dying when SIGNAL_GROUP_EXIT is set but do_group_exit sends SIGKILL to all threads _after_ the flag is set. So there is a race window when some threads won't have fatal_signal_pending while the oom_reaper could start unmapping the address space. Moreover some paths might not check for fatal signals before each PF/g-u-p/copy_from_user. We already have a protection for oom_reaper vs. PF races by checking MMF_UNSTABLE. This has been, however, checked only for kernel threads (use_mm users) which can outlive the oom victim. A simple fix would be to extend the current check in handle_mm_fault for all tasks but that wouldn't be sufficient because the current check assumes that a kernel thread would bail out after EFAULT from get_user*/copy_from_user and never re-read the same address which would succeed because the PF path has established page tables already. This seems to be the case for the only existing use_mm user currently (virtio driver) but it is rather fragile in general. This is even more fragile in general for more complex paths such as generic_perform_write which can re-read the same address more times (e.g. iov_iter_copy_from_user_atomic to fail and then iov_iter_fault_in_readable on retry). Therefore we have to implement MMF_UNSTABLE protection in a robust way and never make a potentially corrupted content visible. That requires to hook deeper into the PF path and check for the flag _every time_ before a pte for anonymous memory is established (that means all !VM_SHARED mappings). The corruption can be triggered artificially (http://lkml.kernel.org/r/201708040646.v746kkhC024636@www262.sakura.ne.jp) but there doesn't seem to be any real life bug report. The race window should be quite tight to trigger most of the time. Link: http://lkml.kernel.org/r/20170807113839.16695-3-mhocko@kernel.org Fixes: `aac4536355` ("mm, oom: introduce oom reaper") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Wenwei Tao <wenwei.tww@alibaba-inc.com> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andrea Argangeli <andrea@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-18 15:32:01 -07:00
Pavel Tatashin	3010f87650	mm: discard memblock data later There is existing use after free bug when deferred struct pages are enabled: The memblock_add() allocates memory for the memory array if more than 128 entries are needed. See comment in e820__memblock_setup(): * The bootstrap memblock region count maximum is 128 entries * (INIT_MEMBLOCK_REGIONS), but EFI might pass us more E820 entries * than that - so allow memblock resizing. This memblock memory is freed here: free_low_memory_core_early() We access the freed memblock.memory later in boot when deferred pages are initialized in this path: deferred_init_memmap() for_each_mem_pfn_range() __next_mem_pfn_range() type = &memblock.memory; One possible explanation for why this use-after-free hasn't been hit before is that the limit of INIT_MEMBLOCK_REGIONS has never been exceeded at least on systems where deferred struct pages were enabled. Tested by reducing INIT_MEMBLOCK_REGIONS down to 4 from the current 128, and verifying in qemu that this code is getting excuted and that the freed pages are sane. Link: http://lkml.kernel.org/r/1502485554-318703-2-git-send-email-pasha.tatashin@oracle.com Fixes: `7e18adb4f8` ("mm: meminit: initialise remaining struct pages in parallel with kswapd") Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-18 15:32:01 -07:00
Luis R. Rodriguez	8ada92799e	wait: add wait_event_killable_timeout() These are the few pending fixes I have queued up for v4.13-final. One is a a generic regression fix for recursive loops on kmod and the other one is a trivial print out correction. During the v4.13 development we assumed that recursive kmod loops were no longer possible. Clearly that is not true. The regression fix makes use of a new killable wait. We use a killable wait to be paranoid in how signals might be sent to modprobe and only accept a proper SIGKILL. The signal will only be available to userspace to issue iff a thread has already entered a wait state, and that happens only if we've already throttled after 50 kmod threads have been hit. Note that although it may seem excessive to trigger a failure afer 5 seconds if all kmod thread remain busy, prior to the series of changes that went into v4.13 we would actually always fatally fail any request which came in if the limit was already reached. The new waiting implemented in v4.13 actually gives us more breathing room -- the wait for 5 seconds is a wait for any kmod thread to finish. We give up and fail iff no kmod thread has finished and they're all running straight for 5 consecutive seconds. If 50 kmod threads are running consecutively for 5 seconds something else must be really bad. Recursive loops with kmod are bad but they're also hard to implement properly as a selftest without currently fooling current userspace tools like kmod [1]. For instance kmod will complain when you run depmod if it finds a recursive loop with symbol dependency between modules as such this type of recursive loop cannot go upstream as the modules_install target will fail after running depmod. These tests already exist on userspace kmod upstream though (refer to the testsuite/module-playground/mod-loop-*.c files). The same is not true if request_module() is used though, or worst if aliases are used. Likewise the issue with 64-bit kernels booting 32-bit userspace without a binfmt handler built-in is also currently not detected and proactively avoided by userspace kmod tools, or kconfig for all architectures. Although we could complain in the kernel when some of these individual recursive issues creep up, proactively avoiding these situations in userspace at build time is what we should keep striving for. Lastly, since recursive loops could happen with kmod it may mean recursive loops may also be possible with other kernel usermode helpers, this should be investigated and long term if we can come up with a more sensible generic solution even better! [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20170809-kmod-for-v4.13-final [1] https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git This patch (of 3): This wait is similar to wait_event_interruptible_timeout() but only accepts SIGKILL interrupt signal. Other signals are ignored. Link: http://lkml.kernel.org/r/20170809234635.13443-2-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michal Marek <mmarek@suse.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Daniel Mentz <danielmentz@google.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Matt Redfearn <matt.redfearn@imgetc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-18 15:32:01 -07:00
Johannes Weiner	739f79fc9d	mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback() Jaegeuk and Brad report a NULL pointer crash when writeback ending tries to update the memcg stats: BUG: unable to handle kernel NULL pointer dereference at 00000000000003b0 IP: test_clear_page_writeback+0x12e/0x2c0 [...] RIP: 0010:test_clear_page_writeback+0x12e/0x2c0 Call Trace: <IRQ> end_page_writeback+0x47/0x70 f2fs_write_end_io+0x76/0x180 [f2fs] bio_endio+0x9f/0x120 blk_update_request+0xa8/0x2f0 scsi_end_request+0x39/0x1d0 scsi_io_completion+0x211/0x690 scsi_finish_command+0xd9/0x120 scsi_softirq_done+0x127/0x150 __blk_mq_complete_request_remote+0x13/0x20 flush_smp_call_function_queue+0x56/0x110 generic_smp_call_function_single_interrupt+0x13/0x30 smp_call_function_single_interrupt+0x27/0x40 call_function_single_interrupt+0x89/0x90 RIP: 0010:native_safe_halt+0x6/0x10 (gdb) l (test_clear_page_writeback+0x12e) 0xffffffff811bae3e is in test_clear_page_writeback (./include/linux/memcontrol.h:619). 614 mod_node_page_state(page_pgdat(page), idx, val); 615 if (mem_cgroup_disabled() \|\| !page->mem_cgroup) 616 return; 617 mod_memcg_state(page->mem_cgroup, idx, val); 618 pn = page->mem_cgroup->nodeinfo[page_to_nid(page)]; 619 this_cpu_add(pn->lruvec_stat->count[idx], val); 620 } 621 622 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t pgdat, int order, 623 gfp_t gfp_mask, The issue is that writeback doesn't hold a page reference and the page might get freed after PG_writeback is cleared (and the mapping is unlocked) in test_clear_page_writeback(). The stat functions looking up the page's node or zone are safe, as those attributes are static across allocation and free cycles. But page->mem_cgroup is not, and it will get cleared if we race with truncation or migration. It appears this race window has been around for a while, but less likely to trigger when the memcg stats were updated first thing after PG_writeback is cleared. Recent changes reshuffled this code to update the global node stats before the memcg ones, though, stretching the race window out to an extent where people can reproduce the problem. Update test_clear_page_writeback() to look up and pin page->mem_cgroup before clearing PG_writeback, then not use that pointer afterward. It is a partial revert of `62cccb8c8e` ("mm: simplify lock_page_memcg()") but leaves the pageref-holding callsites that aren't affected alone. Link: http://lkml.kernel.org/r/20170809183825.GA26387@cmpxchg.org Fixes: `62cccb8c8e` ("mm: simplify lock_page_memcg()") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Jaegeuk Kim <jaegeuk@kernel.org> Tested-by: Jaegeuk Kim <jaegeuk@kernel.org> Reported-by: Bradley Bolen <bradleybolen@gmail.com> Tested-by: Brad Bolen <bradleybolen@gmail.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-18 15:32:01 -07:00
Matthew Dawson	a0917e0bc6	datagram: When peeking datagrams with offset < 0 don't skip empty skbs Due to commit `e6afc8ace6` ("udp: remove headers from UDP packets before queueing"), when udp packets are being peeked the requested extra offset is always 0 as there is no need to skip the udp header. However, when the offset is 0 and the next skb is of length 0, it is only returned once. The behaviour can be seen with the following python script: from socket import *; f=socket(AF_INET6, SOCK_DGRAM \| SOCK_NONBLOCK, 0); g=socket(AF_INET6, SOCK_DGRAM \| SOCK_NONBLOCK, 0); f.bind(('::', 0)); addr=('::1', f.getsockname()[1]); g.sendto(b'', addr) g.sendto(b'b', addr) print(f.recvfrom(10, MSG_PEEK)); print(f.recvfrom(10, MSG_PEEK)); Where the expected output should be the empty string twice. Instead, make sk_peek_offset return negative values, and pass those values to __skb_try_recv_datagram/__skb_try_recv_from_queue. If the passed offset to __skb_try_recv_from_queue is negative, the checked skb is never skipped. __skb_try_recv_from_queue will then ensure the offset is reset back to 0 if a peek is requested without an offset, unless no packets are found. Also simplify the if condition in __skb_try_recv_from_queue. If _off is greater then 0, and off is greater then or equal to skb->len, then (_off \|\| skb->len) must always be true assuming skb->len >= 0 is always true. Also remove a redundant check around a call to sk_peek_offset in af_unix.c, as it double checked if MSG_PEEK was set in the flags. V2: - Moved the negative fixup into __skb_try_recv_from_queue, and remove now redundant checks - Fix peeking in udp{,v6}_recvmsg to report the right value when the offset is 0 V3: - Marked new branch in __skb_try_recv_from_queue as unlikely. Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 15:12:54 -07:00
Thomas Gleixner	7edaeb6841	kernel/watchdog: Prevent false positives with turbo modes The hardlockup detector on x86 uses a performance counter based on unhalted CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the performance counter period, so the hrtimer should fire 2-3 times before the performance counter NMI fires. The NMI code checks whether the hrtimer fired since the last invocation. If not, it assumess a hard lockup. The calculation of those periods is based on the nominal CPU frequency. Turbo modes increase the CPU clock frequency and therefore shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x nominal frequency) the perf/NMI period is shorter than the hrtimer period which leads to false positives. A simple fix would be to shorten the hrtimer period, but that comes with the side effect of more frequent hrtimer and softlockup thread wakeups, which is not desired. Implement a low pass filter, which checks the perf/NMI period against kernel time. If the perf/NMI fires before 4/5 of the watchdog period has elapsed then the event is ignored and postponed to the next perf/NMI. That solves the problem and avoids the overhead of shorter hrtimer periods and more frequent softlockup thread wakeups. Fixes: `58687acba5` ("lockup_detector: Combine nmi_watchdog and softlockup detector") Reported-and-tested-by: Kan Liang <Kan.liang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: dzickus@redhat.com Cc: prarit@redhat.com Cc: ak@linux.intel.com Cc: babu.moger@oracle.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: acme@redhat.com Cc: stable@vger.kernel.org Cc: atomlin@redhat.com Cc: akpm@linux-foundation.org Cc: torvalds@linux-foundation.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708150931310.1886@nanos	2017-08-18 12:35:02 +02:00
Linus Torvalds	c8c03f1858	pty: fix the cached path of the pty slave file descriptor in the master Christian Brauner reported that if you use the TIOCGPTPEER ioctl() to get a slave pty file descriptor, the resulting file descriptor doesn't look right in /proc/<pid>/fd/<fd>. In particular, he wanted to use readlink() on /proc/self/fd/<fd> to get the pathname of the slave pty (basically implementing "ptsname{_r}()"). The reason for that was that we had generated the wrong 'struct path' when we create the pty in ptmx_open(). In particular, the dentry was correct, but the vfsmount pointed to the mount of the ptmx node. That _can_ be correct - in case you use "/dev/pts/ptmx" to open the master - but usually is not. The normal case is to use /dev/ptmx, which then looks up the pts/ directory, and then the vfsmount of the ptmx node is obviously the /dev directory, not the /dev/pts/ directory. We actually did have the right vfsmount available, but in the wrong place (it gets looked up in 'devpts_acquire()' when we get a reference to the pts filesystem), and so ptmx_open() used the wrong mnt pointer. The end result of this confusion was that the pty worked fine, but when if you did TIOCGPTPEER to get the slave side of the pty, end end result would also work, but have that dodgy 'struct path'. And then when doing "d_path()" on to get the pathname, the vfsmount would not match the root of the pts directory, and d_path() would return an empty pathname thinking that the entry had escaped a bind mount into another mount. This fixes the problem by making devpts_acquire() return the vfsmount for the pts filesystem, allowing ptmx_open() to trivially just use the right mount for the pts dentry, and create the proper 'struct path'. Reported-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-17 09:10:48 -07:00
Damien Le Moal	70e42fd02c	scsi: sd_zbc: Write unlock zone from sd_uninit_cmnd() Releasing a zone write lock only when the write commnand that acquired the lock completes can cause deadlocks due to potential command reordering if the lock owning request is requeued and not executed. This problem exists only with the scsi-mq path as, unlike the legacy path, requests are moved out of the dispatch queue before being prepared and so before locking a zone for a write command. Since sd_uninit_cmnd() is now always called when a request is requeued, call sd_zbc_write_unlock_zone() from that function for write requests that acquired a zone lock instead of from sd_done(). Acquisition of a zone lock by a write command is indicated using the new command flag SCMD_ZONE_WRITE_LOCK. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-16 20:01:31 -04:00
Eric Dumazet	c780a049f9	ipv4: better IP_MAX_MTU enforcement While working on yet another syzkaller report, I found that our IP_MAX_MTU enforcements were not properly done. gcc seems to reload dev->mtu for min(dev->mtu, IP_MAX_MTU), and final result can be bigger than IP_MAX_MTU :/ This is a problem because device mtu can be changed on other cpus or threads. While this patch does not fix the issue I am working on, it is probably worth addressing it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 16:28:47 -07:00

1 2 3 4 5 ...

94276 Commits