Commit graph

68382 commits

Author SHA1 Message Date
Matthew Wilcox (Oracle)
0fb2d7209d iomap: Convert write_count to write_bytes_pending
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-21 08:59:26 -07:00
Matthew Wilcox (Oracle)
7d636676d2 iomap: Convert read_count to read_bytes_pending
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-09-21 08:59:26 -07:00
Matthew Wilcox (Oracle)
0a195b91e8 iomap: Support arbitrarily many blocks per page
Size the uptodate array dynamically to support larger pages in the
page cache.  With a 64kB page, we're only saving 8 bytes per page today,
but with a 2MB maximum page size, we'd have to allocate more than 4kB
per page.  Add a few debugging assertions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-09-21 08:59:26 -07:00
Matthew Wilcox (Oracle)
b21866f514 iomap: Use bitmap ops to set uptodate bits
Now that the bitmap is protected by a spinlock, we can use the
more efficient bitmap ops instead of individual test/set bit ops.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-21 08:59:26 -07:00
Matthew Wilcox (Oracle)
a6901d4d14 iomap: Use kzalloc to allocate iomap_page
We can skip most of the initialisation, although spinlocks still
need explicit initialisation as architectures may use a non-zero
value to indicate unlocked.  The comment is no longer useful as
attach_page_private() handles the refcount now.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-21 08:59:26 -07:00
Matthew Wilcox (Oracle)
24addd848a fs: Introduce i_blocks_per_page
This helper is useful for both THPs and for supporting block size larger
than page size.  Convert all users that I could find (we have a few
different ways of writing this idiom, and I may have missed some).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2020-09-21 08:59:26 -07:00
Matthew Wilcox (Oracle)
7ed3cd1a69 iomap: Fix misplaced page flushing
If iomap_unshare_actor() unshares to an inline iomap, the page was
not being flushed.  block_write_end() and __iomap_write_end() already
contain flushes, so adding it to iomap_write_end_inline() seems like
the best place.  That means we can remove it from iomap_write_actor().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-21 08:59:26 -07:00
Nikolay Borisov
6cc19c5fad iomap: Use round_down/round_up macros in __iomap_write_begin
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-21 08:59:25 -07:00
Trond Myklebust
c0a1d129d3 pNFS/flexfiles: Ensure we initialise the mirror bsizes correctly on read
While it is true that reading from an unmirrored source always uses
index 0, that is no longer true for mirrored sources when we fail over.

Fixes: 563c53e73b ("NFS: Fix flexfiles read failover")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 11:57:26 -04:00
Frank van der Linden
68274f97ae NFSv4.2: xattr cache: remove unused cache struct field
The hash_lock field of the cache structure was a leftover
of a previous iteration of the code. It is now unused,
so remove it.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:10 -04:00
Miaohe Lin
cf65e49f89 nfs: Convert to use the preferred fallthrough macro
Convert the uses of fallthrough comments to fallthrough macro. Please see
commit 294f69e662 ("compiler_attributes.h: Add 'fallthrough' pseudo
keyword for switch/case use") for detail.

Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:10 -04:00
Dave Wysochanski
d8a6ad913c NFS4: Fix oops when copy_file_range is attempted with NFS4.0 source
The following oops is seen during xfstest/565 when the 'test'
(source of the copy) is NFS4.0 and 'scratch' (destination) is NFS4.2
[   59.692458] run fstests generic/565 at 2020-08-01 05:50:35
[   60.613588] BUG: kernel NULL pointer dereference, address: 0000000000000008
[   60.624970] #PF: supervisor read access in kernel mode
[   60.627671] #PF: error_code(0x0000) - not-present page
[   60.630347] PGD 0 P4D 0
[   60.631853] Oops: 0000 [#1] SMP PTI
[   60.634086] CPU: 6 PID: 2828 Comm: xfs_io Kdump: loaded Not tainted 5.8.0-rc3 #1
[   60.637676] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[   60.639901] RIP: 0010:nfs4_check_serverowner_major_id+0x5/0x30 [nfsv4]
[   60.642719] Code: 89 ff e8 3e b3 b8 e1 e9 71 fe ff ff 41 bc da d8 ff ff e9 c3 fe ff ff e8 e9 9d 08 e2 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 <8b> 57 08 31 c0 3b 56 08 75 12 48 83 c6 0c 48 83 c7 0c e8 c4 97 bb
[   60.652629] RSP: 0018:ffffc265417f7e10 EFLAGS: 00010287
[   60.655379] RAX: ffffa0664b066400 RBX: 0000000000000000 RCX: 0000000000000001
[   60.658754] RDX: ffffa066725fb000 RSI: ffffa066725fd000 RDI: 0000000000000000
[   60.662292] RBP: 0000000000020000 R08: 0000000000020000 R09: 0000000000000000
[   60.666189] R10: 0000000000000003 R11: 0000000000000000 R12: ffffa06648258d00
[   60.669914] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa06648258100
[   60.673645] FS:  00007faa9fb35800(0000) GS:ffffa06677d80000(0000) knlGS:0000000000000000
[   60.677698] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   60.680773] CR2: 0000000000000008 CR3: 0000000203f14000 CR4: 00000000000406e0
[   60.684476] Call Trace:
[   60.685809]  nfs4_copy_file_range+0xfc/0x230 [nfsv4]
[   60.688704]  vfs_copy_file_range+0x2ee/0x310
[   60.691104]  __x64_sys_copy_file_range+0xd6/0x210
[   60.693527]  do_syscall_64+0x4d/0x90
[   60.695512]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   60.698006] RIP: 0033:0x7faa9febc1bd

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:10 -04:00
Alexander A. Klimov
0bdd4cea12 Replace HTTP links with HTTPS ones: NFS, SUNRPC, and LOCKD clients
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If both the HTTP and HTTPS versions
          return 200 OK and serve the same content:
            Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:10 -04:00
Chengguang Xu
82c596ebaa nfs4: strengthen error check to avoid unexpected result
The variable error is ssize_t, which is signed and will
cast to unsigned when comapre with variable size, so add
a check to avoid unexpected result in case of negative
value of error.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Colin Ian King
48bb6ec17c NFS: remove redundant pointer clnt
The pointer clnt is being initialized with a value that is never
read and so this is assignment redundant and can be removed. The
pointer can removed because it is being used as a temporary
variable and it is clearer to make the direct assignment and remove
it completely.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Jens Axboe
4eb8dded6b io_uring: fix openat/openat2 unified prep handling
A previous commit unified how we handle prep for these two functions,
but this means that we check the allowed context (SQPOLL, specifically)
later than we should. Move the ring type checking into the two parent
functions, instead of doing it after we've done some setup work.

Fixes: ec65fea5a8 ("io_uring: deduplicate io_openat{,2}_prep()")
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-21 07:51:03 -06:00
Jens Axboe
6ca56f8459 io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
These will naturally fail when attempted through SQPOLL, but either
with -EFAULT or -EBADF. Make it explicit that these are not workable
through SQPOLL and return -EINVAL, just like other ops that need to
use ->files.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-21 07:51:00 -06:00
Jens Axboe
f5cac8b156 io_uring: don't use retry based buffered reads for non-async bdev
Some block devices, like dm, bubble back -EAGAIN through the completion
handler. We check for this in io_read(), but don't honor it for when
we have copied the iov. Return -EAGAIN for this case before retrying,
to force punt to io-wq.

Fixes: bcf5a06304 ("io_uring: support true async buffered reads, if file provides it")
Reported-by: Zorro Lang <zlang@redhat.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-21 07:50:56 -06:00
Jens Axboe
8f3d749685 io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
If we already have mapped the necessary data for retry, then don't set
it up again. It's a pointless operation, and we leak the iovec if it's
a large (non-stack) vec.

Fixes: b63534c41e ("io_uring: re-issue block requests that failed because of resources")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-21 07:50:54 -06:00
Johannes Thumshirn
35be8851d1 btrfs: fix overflow when copying corrupt csums for a message
Syzkaller reported a buffer overflow in btree_readpage_end_io_hook()
when loop mounting a crafted image:

  detected buffer overflow in memcpy
  ------------[ cut here ]------------
  kernel BUG at lib/string.c:1129!
  invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  CPU: 1 PID: 26 Comm: kworker/u4:2 Not tainted 5.9.0-rc4-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: btrfs-endio-meta btrfs_work_helper
  RIP: 0010:fortify_panic+0xf/0x20 lib/string.c:1129
  RSP: 0018:ffffc90000e27980 EFLAGS: 00010286
  RAX: 0000000000000022 RBX: ffff8880a80dca64 RCX: 0000000000000000
  RDX: ffff8880a90860c0 RSI: ffffffff815dba07 RDI: fffff520001c4f22
  RBP: ffff8880a80dca00 R08: 0000000000000022 R09: ffff8880ae7318e7
  R10: 0000000000000000 R11: 0000000000077578 R12: 00000000ffffff6e
  R13: 0000000000000008 R14: ffffc90000e27a40 R15: 1ffff920001c4f3c
  FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000557335f440d0 CR3: 000000009647d000 CR4: 00000000001506e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   memcpy include/linux/string.h:405 [inline]
   btree_readpage_end_io_hook.cold+0x206/0x221 fs/btrfs/disk-io.c:642
   end_bio_extent_readpage+0x4de/0x10c0 fs/btrfs/extent_io.c:2854
   bio_endio+0x3cf/0x7f0 block/bio.c:1449
   end_workqueue_fn+0x114/0x170 fs/btrfs/disk-io.c:1695
   btrfs_work_helper+0x221/0xe20 fs/btrfs/async-thread.c:318
   process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
   worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
   kthread+0x3b5/0x4a0 kernel/kthread.c:292
   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
  Modules linked in:
  ---[ end trace b68924293169feef ]---
  RIP: 0010:fortify_panic+0xf/0x20 lib/string.c:1129
  RSP: 0018:ffffc90000e27980 EFLAGS: 00010286
  RAX: 0000000000000022 RBX: ffff8880a80dca64 RCX: 0000000000000000
  RDX: ffff8880a90860c0 RSI: ffffffff815dba07 RDI: fffff520001c4f22
  RBP: ffff8880a80dca00 R08: 0000000000000022 R09: ffff8880ae7318e7
  R10: 0000000000000000 R11: 0000000000077578 R12: 00000000ffffff6e
  R13: 0000000000000008 R14: ffffc90000e27a40 R15: 1ffff920001c4f3c
  FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f95b7c4d008 CR3: 000000009647d000 CR4: 00000000001506e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

The overflow happens, because in btree_readpage_end_io_hook() we assume
that we have found a 4 byte checksum instead of the real possible 32
bytes we have for the checksums.

With the fix applied:

[   35.726623] BTRFS: device fsid 815caf9a-dc43-4d2a-ac54-764b8333d765 devid 1 transid 5 /dev/loop0 scanned by syz-repro (215)
[   35.738994] BTRFS info (device loop0): disk space caching is enabled
[   35.738998] BTRFS info (device loop0): has skinny extents
[   35.743337] BTRFS warning (device loop0): loop0 checksum verify failed on 1052672 wanted 0xf9c035fc8d239a54 found 0x67a25c14b7eabcf9 level 0
[   35.743420] BTRFS error (device loop0): failed to read chunk root
[   35.745899] BTRFS error (device loop0): open_ctree failed

Reported-by: syzbot+e864a35d361e1d4e29a5@syzkaller.appspotmail.com
Fixes: d5178578bc ("btrfs: directly call into crypto framework for checksumming")
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-21 12:39:21 +02:00
Tobias Klauser
9ca48e20ec fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
definition of dirtytime_interval_handler to match its prototype in
linux/writeback.h which fixes the following sparse error/warning:

fs/fs-writeback.c:2189:50: warning: incorrect type in argument 3 (different address spaces)
fs/fs-writeback.c:2189:50:    expected void *
fs/fs-writeback.c:2189:50:    got void [noderef] __user *buffer
fs/fs-writeback.c:2184:5: error: symbol 'dirtytime_interval_handler' redeclared with different type (incompatible argument 3 (different address spaces)):
fs/fs-writeback.c:2184:5:    int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )
fs/fs-writeback.c: note: in included file:
./include/linux/writeback.h:374:5: note: previously declared as:
./include/linux/writeback.h:374:5:    int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20200907093140.13434-1-tklauser@distanz.ch
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-19 13:13:39 -07:00
Gao Xiang
6ea5aad32d erofs: add REQ_RAHEAD flag to readahead requests
Let's add REQ_RAHEAD flag so it'd be easier to identify
readahead I/O requests in blktrace.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20200919072730.24989-3-hsiangkao@redhat.com
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
2020-09-19 15:38:14 +08:00
Gao Xiang
bf9a123b9c erofs: fold in should_decompress_synchronously()
should_decompress_synchronously() has one single condition
for now, so fold it instead.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20200919072730.24989-2-hsiangkao@redhat.com
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
2020-09-19 15:35:57 +08:00
Gao Xiang
6c3e485ea3 erofs: avoid unnecessary variable `err'
variable `err' in z_erofs_submit_queue() isn't useful
here, remove it instead.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20200919072730.24989-1-hsiangkao@redhat.com
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
2020-09-19 15:35:17 +08:00
Al Viro
6d1349c769 [PATCH] reduce boilerplate in fsid handling
Get rid of boilerplate in most of ->statfs()
instances...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-18 16:45:50 -04:00
Chao Yu
e3f78d5e7e erofs: remove unneeded parameter
After commit 0615090c50 ("erofs: convert compressed files from
readpages to readahead"), add_to_page_cache_lru() was moved to mm
code, so that in below call path, no page will be cached into
@pagepool list or grabbed from @pagepool list:
- z_erofs_readpage
 - z_erofs_do_read_page
  - preload_compressed_pages
  - erofs_allocpage

Let's get rid of this unneeded @pagepool parameter.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20200917011821.22767-1-yuchao0@huawei.com
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
2020-09-18 22:17:44 +08:00
Gao Xiang
d578b46db6 erofs: avoid duplicated permission check for "trusted." xattrs
Don't recheck it since xattr_permission() already
checks CAP_SYS_ADMIN capability.

Just follow 5d3ce4f701 ("f2fs: avoid duplicated permission check for "trusted." xattrs")

Reported-by: Hongyu Jin <hongyu.jin@unisoc.com>
[ Gao Xiang: since it could cause some complex Android overlay
  permission issue as well on android-5.4+, it'd be better to
  backport to 5.4+ rather than pure cleanup on mainline. ]
Cc: <stable@vger.kernel.org> # 5.4+
Link: https://lore.kernel.org/r/20200811070020.6339-1-hsiangkao@redhat.com
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
2020-09-18 22:11:13 +08:00
Trond Myklebust
b9df46d08a pNFS/flexfiles: Be consistent about mirror index types
A mirror index is always of type u32.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-09-18 09:25:33 -04:00
Trond Myklebust
ee15c7b53e pNFS/flexfiles: Ensure we initialise the mirror bsizes correctly on read
While it is true that reading from an unmirrored source always uses
index 0, that is no longer true for mirrored sources when we fail over.

Fixes: 563c53e73b ("NFS: Fix flexfiles read failover")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-09-18 09:21:10 -04:00
Max Reitz
1866d779d5 fuse: Allow fuse_fill_super_common() for submounts
Submounts have their own superblock, which needs to be initialized.
However, they do not have a fuse_fs_context associated with them, and
the root node's attributes should be taken from the mountpoint's node.

Extend fuse_fill_super_common() to work for submounts by making the @ctx
parameter optional, and by adding a @submount_finode parameter.

(There is a plain "unsigned" in an existing code block that is being
indented by this commit.  Extend it to "unsigned int" so checkpatch does
not complain.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18 15:17:41 +02:00
Max Reitz
fcee216beb fuse: split fuse_mount off of fuse_conn
We want to allow submounts for the same fuse_conn, but with different
superblocks so that each of the submounts has its own device ID.  To do
so, we need to split all mount-specific information off of fuse_conn
into a new fuse_mount structure, so that multiple mounts can share a
single fuse_conn.

We need to take care only to perform connection-level actions once (i.e.
when the fuse_conn and thus the first fuse_mount are established, or
when the last fuse_mount and thus the fuse_conn are destroyed).  For
example, fuse_sb_destroy() must invoke fuse_send_destroy() until the
last superblock is released.

To do so, we keep track of which fuse_mount is the root mount and
perform all fuse_conn-level actions only when this fuse_mount is
involved.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18 15:17:41 +02:00
Max Reitz
8f622e9497 fuse: drop fuse_conn parameter where possible
With the last commit, all functions that handle some existing fuse_req
no longer need to be given the associated fuse_conn, because they can
get it from the fuse_req object.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18 15:17:41 +02:00
Max Reitz
24754db272 fuse: store fuse_conn in fuse_req
Every fuse_req belongs to a fuse_conn.  Right now, we always know which
fuse_conn that is based on the respective device, but we want to allow
multiple (sub)mounts per single connection, and then the corresponding
filesystem is not going to be so trivial to obtain.

Storing a pointer to the associated fuse_conn in every fuse_req will
allow us to trivially find any request's superblock (and thus
filesystem) even then.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18 15:17:40 +02:00
Miklos Szeredi
d78092e493 fuse: fix page dereference after free
After unlock_request() pages from the ap->pages[] array may be put (e.g. by
aborting the connection) and the pages can be freed.

Prevent use after free by grabbing a reference to the page before calling
unlock_request().

The original patch was created by Pradeep P V K.

Reported-by: Pradeep P V K <ppvk@codeaurora.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18 10:36:50 +02:00
Al Viro
933a3752ba fuse: fix the ->direct_IO() treatment of iov_iter
the callers rely upon having any iov_iter_truncate() done inside
->direct_IO() countered by iov_iter_reexpand().

Reported-by: Qian Cai <cai@redhat.com>
Tested-by: Qian Cai <cai@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-17 17:26:56 -04:00
Wang Hai
b30e2238b7 ubifs: Fix some kernel-doc warnings in tnc.c
Fixes the following W=1 kernel build warning(s):

fs/ubifs/tnc.c:3479: warning: Excess function parameter 'inum' description in 'dbg_check_inode_size'
fs/ubifs/tnc.c:366: warning: Excess function parameter 'node' description in 'lnc_free'

@inum in 'dbg_check_inode_size' should be @inode, fix it.
@node in 'lnc_free' is not in use, Remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 23:06:53 +02:00
Wang Hai
f279e5a491 ubifs: Fix some kernel-doc warnings in replay.c
Fixes the following W=1 kernel build warning(s):

fs/ubifs/replay.c:942: warning: Excess function parameter 'ref_lnum' description in 'validate_ref'
fs/ubifs/replay.c:942: warning: Excess function parameter 'ref_offs' description in 'validate_ref'

They're not in use. Remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 23:06:53 +02:00
Wang Hai
7889042b65 ubifs: Fix some kernel-doc warnings in gc.c
Fixes the following W=1 kernel build warning(s):

fs/ubifs/gc.c:70: warning: Excess function parameter 'buf' description in 'switch_gc_head'
fs/ubifs/gc.c:70: warning: Excess function parameter 'len' description in 'switch_gc_head'
fs/ubifs/gc.c:70: warning: Excess function parameter 'lnum' description in 'switch_gc_head'
fs/ubifs/gc.c:70: warning: Excess function parameter 'offs' description in 'switch_gc_head'

They're not in use. Remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 23:06:53 +02:00
Wang Hai
f39d9f4cb9 ubifs: Fix 'hash' kernel-doc warning in auth.c
Fixes the following W=1 kernel build warning(s):

fs/ubifs/auth.c:66: warning: Excess function parameter 'hash' description in 'ubifs_prepare_auth_node'

Rename hash to inhash.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 23:06:53 +02:00
Zhihao Cheng
121b8fcbf9 ubifs: setflags: Don't show error message when vfs_ioc_setflags_prepare() fails
Following process will trigger ubifs_err:
  1. useradd -m freg                                        (Under root)
  2. cd /home/freg && mkdir mp                              (Under freg)
  3. mount -t ubifs /dev/ubi0_0 /home/freg/mp               (Under root)
  4. cd /home/freg && echo 123 > mp/a			    (Under root)
  5. cd mp && chown freg a && chgrp freg a && chmod 777 a   (Under root)
  6. chattr +i a                                            (Under freg)

UBIFS error (ubi0:0 pid 1723): ubifs_ioctl [ubifs]: can't modify inode
65 attributes
chattr: Operation not permitted while setting flags on a

This is not an UBIFS problem, it was caused by task priviliage checking
on file operations. Remove error message printing from kernel just like
other filesystems (eg. ext4), since we already have enough information
from userspace tools.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 22:56:19 +02:00
Zhihao Cheng
dd7db149bc ubifs: ubifs_jnl_change_xattr: Remove assertion 'nlink > 0' for host inode
Changing xattr of a temp file will trigger following assertion failed
and make ubifs turn into readonly filesystem:
  ubifs_assert_failed [ubifs]: UBIFS assert failed: host->i_nlink > 0,
  in fs/ubifs/journal.c:1801

Reproducer:
  1. fd = open(__O_TMPFILE)
  2. fsetxattr(fd, key, value2, XATTR_CREATE)
  3. fsetxattr(fd, key, value2, XATTR_REPLACE)

Fix this by removing assertion 'nlink > 0' for host inode.

Reported-by: Chengsong Ke <kechengsong@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 22:56:09 +02:00
Zhihao Cheng
58f6e78a65 ubifs: dent: Fix some potential memory leaks while iterating entries
Fix some potential memory leaks in error handling branches while
iterating dent entries. For example, function dbg_check_dir()
forgets to free pdent if it exists.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: <stable@vger.kernel.org>
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 22:55:51 +02:00
Zhihao Cheng
f2aae745b8 ubifs: xattr: Fix some potential memory leaks while iterating entries
Fix some potential memory leaks in error handling branches while
iterating xattr entries. For example, function ubifs_tnc_remove_ino()
forgets to free pxent if it exists. Similar problems also exist in
ubifs_purge_xattrs(), ubifs_add_orphan() and ubifs_jnl_write_inode().

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: <stable@vger.kernel.org>
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
2020-09-17 22:55:41 +02:00
Christoph Hellwig
80bdad3d7e quota: simplify the quotactl compat handling
Fold the misaligned u64 workarounds into the main quotactl flow instead
of implementing a separate compat syscall handler.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-17 13:00:46 -04:00
Olga Kornievskaia
16abd2a0c1 NFSv4.2: fix client's attribute cache management for copy_file_range
After client is done with the COPY operation, it needs to invalidate
its pagecache (as it did no reading or writing of the data locally)
and it needs to invalidate it's attributes just like it would have
for a read on the source file and write on the destination file.

Once the linux server started giving out read delegations to
read+write opens, the destination file of the copy_file range
started having delegations and not doing syncup on close of the
file leading to xfstest failures for generic/430,431,432,433,565.

v2: changing cache_validity needs to be protected by the i_lock.

Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Fixes: 2e72448b07 ("NFS: Add COPY nfs operation")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-09-16 12:25:14 -04:00
Jeffrey Mitchell
d33030e2ee nfs: Fix security label length not being reset
nfs_readdir_page_filler() iterates over entries in a directory, reusing
the same security label buffer, but does not reset the buffer's length.
This causes decode_attr_security_label() to return -ERANGE if an entry's
security label is longer than the previous one's. This error, in
nfs4_decode_dirent(), only gets passed up as -EAGAIN, which causes another
failed attempt to copy into the buffer. The second error is ignored and
the remaining entries do not show up in ls, specifically the getdents64()
syscall.

Reproduce by creating multiple files in NFS and giving one of the later
files a longer security label. ls will not see that file nor any that are
added afterwards, though they will exist on the backend.

In nfs_readdir_page_filler(), reset security label buffer length before
every reuse

Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io>
Fixes: b4487b9354 ("nfs: Fix getxattr kernel panic and memory overflow")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-09-16 12:25:14 -04:00
Eric Biggers
8859bf2b12 reiserfs: only call unlock_new_inode() if I_NEW
unlock_new_inode() is only meant to be called after a new inode has
already been inserted into the hash table.  But reiserfs_new_inode() can
call it even before it has inserted the inode, triggering the WARNING in
unlock_new_inode().  Fix this by only calling unlock_new_inode() if the
inode has the I_NEW flag set, indicating that it's in the table.

This addresses the syzbot report "WARNING in unlock_new_inode"
(https://syzkaller.appspot.com/bug?extid=187510916eb6a14598f7).

Link: https://lore.kernel.org/r/20200628070057.820213-1-ebiggers@kernel.org
Reported-by: syzbot+187510916eb6a14598f7@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-09-16 12:51:24 +02:00
Darrick J. Wong
b96cb835e3 xfs: deprecate the V4 format
The V4 filesystem format contains known weaknesses in the on-disk format
that make metadata verification diffiult.  In addition, the format does
not support dates past 2038 and will not be upgraded to do so.  We
should start the process of retiring the old format to close off attack
surfaces and to encourage users to migrate onto V5.

Therefore, make XFS V4 support a configurable option.  For the first
period it will be default Y in case some distributors want to withdraw
support early; for the second period it will be default N so that anyone
who wishes to continue support can do so; and after that, support will
be removed from the kernel.  Dates for these events have been added to
the upstream kernel.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2020-09-15 20:52:43 -07:00
Darrick J. Wong
d4f2c14cc9 xfs: don't propagate RTINHERIT -> REALTIME when there is no rtdev
While running generic/042 with -drtinherit=1 set in MKFS_OPTIONS, I
observed that the kernel will gladly set the realtime flag on any file
created on the loopback filesystem even though that filesystem doesn't
actually have a realtime device attached.  This leads to verifier
failures and doesn't make any sense, so be smarter about this.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-09-15 20:52:43 -07:00
Darrick J. Wong
fe341eb151 xfs: ensure that fpunch, fcollapse, and finsert operations are aligned to rt extent size
Make sure that any fallocate operation that requires the range to be
block-aligned also checks that the range is aligned to the realtime
extent size.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-09-15 20:52:42 -07:00