bug-hurd

ext2fs getting stuck


From: Samuel Thibault
Subject: ext2fs getting stuck
Date: Mon, 14 Mar 2016 23:50:02 +0100
User-agent: Mutt/1.5.21+34 (58baf7c9f32f) (2010-12-30)

Hello,

I've caught an ext2fs that is blocked in fsync and similar calls. This is
apparently not a deadlock, but a livelock.

- Threads 10, 5, 4, 3, 1 seem to be in a normal waiting state.
- Threads 9 and 7 are both waiting for the 0x89a6b6c mutex, held by thread 8.
- Threads 6 and 8 are stuck in disk_cache_block_ref: they keep seeing
  disk_cache_info_free_pop() return NULL, and thus call
  disk_cache_return_unused(), which apparently doesn't manage to free
  anything: I see its 'i' loop running over and over.
- Thread 2 is apparently waiting for a page-in for 'entry' (I don't see
  it progress).
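
The retry pattern described for threads 6 and 8 can be modelled roughly as
follows. This is only a self-contained sketch, not the ext2fs code: the
struct layout, the value of DC_INCORE, the tiny NBLOCKS, and the helpers
fill_cache(), free_pop(), return_unused() and block_ref_bounded() are all
assumptions made up for illustration, and the retry bound exists only so
the model terminates.

```c
#include <assert.h>

#define NBLOCKS   8     /* tiny stand-in for DISK_CACHE_BLOCKS */
#define DC_INCORE 0x01  /* flag name from the report; the value is assumed */

/* Hypothetical, simplified cache entry; not the real ext2fs layout. */
struct entry { int flags; int ref_count; int on_free_list; };

static struct entry cache[NBLOCKS];
static int free_stack[NBLOCKS];
static int free_top;

/* Reset the model: every block cached, unreferenced, free list empty. */
static void fill_cache(void)
{
  for (int i = 0; i < NBLOCKS; i++)
    cache[i] = (struct entry){ DC_INCORE, 0, 0 };
  free_top = 0;
}

/* Pop a free entry index, or -1 when the free list is empty. */
static int free_pop(void)
{
  return free_top > 0 ? free_stack[--free_top] : -1;
}

/* Model of "return unused blocks": walks every entry.  When
   evictions_arrive is 0 the walk frees nothing, as in the stuck state.  */
static int return_unused(int evictions_arrive)
{
  int freed = 0;
  for (int i = 0; i < NBLOCKS; i++)
    {
      struct entry *e = &cache[i];
      if (!evictions_arrive || !(e->flags & DC_INCORE)
          || e->ref_count != 0 || e->on_free_list)
        continue;
      assert(free_top < NBLOCKS);
      e->flags &= ~DC_INCORE;
      e->on_free_list = 1;
      free_stack[free_top++] = i;
      freed++;
    }
  return freed;
}

/* Bounded version of the retry loop: returns an entry index, or -1 once
   max_retries attempts have failed.  The loop in the report has no bound. */
static int block_ref_bounded(int evictions_arrive, int max_retries, int *retries)
{
  *retries = 0;
  for (;;)
    {
      int idx = free_pop();
      if (idx >= 0)
        {
          cache[idx].on_free_list = 0;
          cache[idx].flags |= DC_INCORE;
          cache[idx].ref_count++;
          return idx;
        }
      if (*retries == max_retries)
        return -1;
      return_unused(evictions_arrive);
      (*retries)++;
    }
}
```

With evictions_arrive set to 0 the model uses up its whole retry budget
without ever obtaining an entry, which is the bounded equivalent of the
livelock; with it set to 1 a single pass is enough.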

BTW, I see that DISK_CACHE_BLOCKS is hardcoded to 65536 blocks, which
seems like an arbitrary value.

vmstat shows only 16000 cached objects.

My rough guess would be that the gnumach kernel is perhaps caching too
many objects compared to the block cache of ext2fs. But when I look at
disk_cache_info, I see that blocks 4-50910 actually have
flags=DC_INCORE and ref_count==0, so they should be releasable! I don't
see why they aren't getting released. Blocks 50911-65536 even have
flags=0 and ref_count=0, so they should be in the cache; that one,
however, is a bug I see in disk_pager_notify_evict, which doesn't add
blocks to the cache when DC_INCORE is reset and there is no reference.
I'll fix that one. The first one is still concerning; I'm not sure where
to look.
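
The disk_pager_notify_evict problem can be illustrated with a small model.
This is a sketch under assumptions, not the actual fix: it assumes that
"adding a block to the cache" means pushing it onto the free list that
disk_cache_info_free_pop() takes from, and the struct, the DC_INCORE value
and the helpers free_push(), notify_evict() and run_evictions() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS   4
#define DC_INCORE 0x01  /* flag name from the report; the value is assumed */

/* Hypothetical, simplified per-block record; not the real ext2fs layout. */
struct block_info { int flags; int ref_count; bool on_free_list; };

static struct block_info cache_info[NBLOCKS];
static int free_count;

/* Put an entry on the (modelled) free list; it must not already be there. */
static void free_push(int i)
{
  assert(!cache_info[i].on_free_list);
  cache_info[i].on_free_list = true;
  free_count++;
}

/* Model of an eviction notification for entry i.  Without the fix the
   entry ends up with flags == 0 and ref_count == 0 but is never pushed,
   so nobody can reuse it.  */
static void notify_evict(int i, bool with_fix)
{
  cache_info[i].flags &= ~DC_INCORE;
  if (with_fix && cache_info[i].ref_count == 0 && !cache_info[i].on_free_list)
    free_push(i);
}

/* Evict every entry and report how many became reusable.  Entry 0 is
   kept referenced, so it must never reach the free list.  */
static int run_evictions(bool with_fix)
{
  free_count = 0;
  for (int i = 0; i < NBLOCKS; i++)
    cache_info[i] = (struct block_info){ DC_INCORE, 0, false };
  cache_info[0].ref_count = 1;
  for (int i = 0; i < NBLOCKS; i++)
    notify_evict(i, with_fix);
  return free_count;
}
```

In this model run_evictions(false) leaves every entry unreachable, while
run_evictions(true) makes the three unreferenced entries reusable.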

Samuel

(gdb) thread apply all bt

Thread 10 (Thread 12674.10):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9c6c in mach_msg ()
#2  0x080ba139 in mach_msg_server_timeout ()
#3  0x08078872 in thread_function (arg=0x0)
    at ../../libports/manage-multithread.c:254
#4  0x0807ca13 in entry_point (self=0x200480, start_routine=0x3005f80, arg=0x0)
    at ./pthread/pt-create.c:64
#5  0x00000000 in ?? ()

Thread 9 (Thread 12674.9):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9cb6 in mach_msg ()
#2  0x0807f301 in __pthread_block (thread=0x9247180)
    at ../libpthread/sysdeps/mach/pt-block.c:35
#3  0x0807d9ce in __pthread_mutex_timedlock_internal (mutex=0x89a6b6c, 
    abstime=0x0)
    at ../sysdeps/../libpthread/sysdeps/generic/pt-mutex-timedlock.c:136
#4  0x0807d41e in __pthread_mutex_lock (mutex=0x89a6b6c)
    at ../sysdeps/../libpthread/sysdeps/generic/pt-mutex-lock.c:33
#5  0x08055a26 in diskfs_node_iterate (fun=0x5bfbbdf0)
    at ../../libdiskfs/node-cache.c:220
#6  0x0806a9c4 in diskfs_S_fsys_syncfs (pi=0x81ee080, reply=56281, 
    replytype=18, children=1) at ../../libdiskfs/fsys-syncfs.c:57
#7  0x0805fa76 in _Xfsys_syncfs (InHeadP=0x5bfbdf10, OutHeadP=0x5bfbbf00)
    at fsysServer.c:672
#8  0x08059285 in diskfs_demuxer (inp=0x5bfbdf10, outp=0x5bfbbf00)
    at ../../libdiskfs/demuxer.c:46
#9  0x08078704 in internal_demuxer (outheadp=0x5bfbbf00, inp=0x5bfbdf10)
    at ../../libports/manage-multithread.c:207
#10 synchronized_demuxer (inp=0x5bfbdf10, outheadp=0x5bfbbf00)
    at ../../libports/manage-multithread.c:234
#11 0x080ba14a in mach_msg_server_timeout ()
#12 0x08078872 in thread_function (arg=0x0)
    at ../../libports/manage-multithread.c:254
#13 0x0807ca13 in entry_point (self=0x9247180, start_routine=0x3005f80, 
    arg=0x0) at ./pthread/pt-create.c:64
#14 0x00000000 in ?? ()

Thread 8 (Thread 12674.8):
#0  disk_cache_return_unused () at ../../ext2fs/pager.c:949
#1  disk_cache_block_ref (block=5799938) at ../../ext2fs/pager.c:1061
#2  0x0804e69f in dino_ref (inum=<optimized out>) at ../../ext2fs/ext2fs.h:408
#3  write_node (np=np@entry=0x89a6ae0) at ../../ext2fs/inode.c:359
#4  0x0804f420 in diskfs_write_disknode (np=0x89a6ae0, wait=0)
    at ../../ext2fs/inode.c:498
#5  0x080568da in diskfs_node_update (np=0x89a6ae0, wait=0)
    at ../../libdiskfs/node-update.c:31
#6  0x0805176e in diskfs_file_update (node=0x89a6ae0, wait=0)
    at ../../ext2fs/pager.c:761
#7  0x0804b101 in diskfs_dirremove_hard (dp=0x89a6ae0, ds=0x5bededb0)
    at ../../ext2fs/dir.c:734
#8  0x0805971e in diskfs_dirremove (dp=0x89a6ae0, np=0x293ffdf8, 
    name=0x5bedef1c "tmp.ci", ds=0x5bededb0) at ../../libdiskfs/dirremove.c:40
#9  0x080675fe in diskfs_S_dir_rmdir (dircred=0x115c18, 
    name=0x5bedef1c "tmp.ci") at ../../libdiskfs/dir-rmdir.c:79
#10 0x0805b735 in _Xdir_rmdir (InHeadP=0x5bedef00, OutHeadP=0x5bee0f10)
    at fsServer.c:2177
#11 0x08059285 in diskfs_demuxer (inp=0x5bedef00, outp=0x5bee0f10)
    at ../../libdiskfs/demuxer.c:46
#12 0x08078704 in internal_demuxer (outheadp=0x5bee0f10, inp=0x5bedef00)
    at ../../libports/manage-multithread.c:207
#13 synchronized_demuxer (inp=0x5bedef00, outheadp=0x5bee0f10)
    at ../../libports/manage-multithread.c:234
#14 0x080ba14a in mach_msg_server_timeout ()
#15 0x08078872 in thread_function (arg=0x0)
    at ../../libports/manage-multithread.c:254
#16 0x0807ca13 in entry_point (self=0x18c9f8, start_routine=0x3005f80, arg=0x0)
    at ./pthread/pt-create.c:64
#17 0x00000000 in ?? ()

Thread 7 (Thread 12674.7):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9cb6 in mach_msg ()
#2  0x0807f301 in __pthread_block (thread=0x81ee258)
    at ../libpthread/sysdeps/mach/pt-block.c:35
#3  0x0807d9ce in __pthread_mutex_timedlock_internal (mutex=0x89a6b6c, 
    abstime=0x0)
    at ../sysdeps/../libpthread/sysdeps/generic/pt-mutex-timedlock.c:136
#4  0x0807d41e in __pthread_mutex_lock (mutex=0x89a6b6c)
    at ../sysdeps/../libpthread/sysdeps/generic/pt-mutex-lock.c:33
#5  0x08055a26 in diskfs_node_iterate (fun=0x804e920 <write_one_disknode>)
    at ../../libdiskfs/node-cache.c:220
#6  0x0804f3fd in write_all_disknodes () at ../../ext2fs/inode.c:489
#7  0x080525b1 in diskfs_sync_everything () at ../../ext2fs/pager.c:1422
#8  0x080578ca in periodic_sync (arg=0x1e)
    at ../../libdiskfs/sync-interval.c:123
#9  0x0807ca13 in entry_point (self=0x81ee258, 
    start_routine=0x80577f0 <periodic_sync>, arg=0x1e)
    at ./pthread/pt-create.c:64
#10 0x00000000 in ?? ()

Thread 6 (Thread 12674.6):
#0  disk_cache_return_unused () at ../../ext2fs/pager.c:949
#1  disk_cache_block_ref (block=9011200) at ../../ext2fs/pager.c:1061
#2  0x080489bd in ext2_free_blocks (block=9013573, count=1)
    at ../../ext2fs/balloc.c:95
#3  0x0805348a in _free_block_run_flush (fbr=0x5009e84, fbr=0x5009e84, 
    count=<optimized out>) at ../../ext2fs/truncate.c:52
#4  free_block_run_finish (fbr=0x5009e84) at ../../ext2fs/truncate.c:94
#5  diskfs_truncate (node=0x9327640, length=0) at ../../ext2fs/truncate.c:352
#6  0x080648bb in diskfs_drop_node (np=0x9327640)
    at ../../libdiskfs/node-drop.c:63
#7  0x0805661f in diskfs_nrele_light (np=0x9327640)
    at ../../libdiskfs/node-nrelel.c:34
#8  0x080518c6 in pager_clear_user_data (upi=0x87822a8)
    at ../../ext2fs/pager.c:818
#9  0x0807901f in _ports_complete_deallocate (pi=0x91e7aa0)
    at ../../libports/complete-deallocate.c:63
#10 0x0806cedc in worker_func (arg=0x81ec4f0) at ../../libpager/demuxer.c:223
#11 0x0807ca13 in entry_point (self=0x81ed130, 
    start_routine=0x806ce00 <worker_func>, arg=0x81ec4f0)
    at ./pthread/pt-create.c:64
#12 0x00000000 in ?? ()

Thread 5 (Thread 12674.5):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9c6c in mach_msg ()
#2  0x080ba139 in mach_msg_server_timeout ()
#3  0x080784c5 in ports_manage_port_operations_one_thread (demuxer=0x4808fb0, 
    timeout=0) at ../../libports/manage-one-thread.c:120
#4  0x0806d069 in service_paging_requests (arg=0x81ec498)
    at ../../libpager/demuxer.c:300
#5  0x0807ca13 in entry_point (self=0x81ec508, 
    start_routine=0x806d020 <service_paging_requests>, arg=0x81ec498)
    at ./pthread/pt-create.c:64
#6  0x00000000 in ?? ()

Thread 4 (Thread 12674.4):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9c6c in mach_msg ()
#2  0x0807f301 in __pthread_block (thread=0x81eb448)
    at ../libpthread/sysdeps/mach/pt-block.c:35
#3  0x0807e9bb in __pthread_cond_timedwait_internal (cond=0x81ea7c0, 
    mutex=0x81ea7e8, abstime=0x0)
    at ../sysdeps/../libpthread/sysdeps/generic/pt-cond-timedwait.c:130
#4  0x0807e682 in __pthread_cond_wait (cond=0x81ea7c0, mutex=0x81ea7e8)
    at ../sysdeps/../libpthread/sysdeps/generic/pt-cond-wait.c:36
#5  0x0806cfd3 in worker_func (arg=0x81ea808) at ../../libpager/demuxer.c:200
#6  0x0807ca13 in entry_point (self=0x81eb448, 
    start_routine=0x806ce00 <worker_func>, arg=0x81ea808)
    at ./pthread/pt-create.c:64
#7  0x00000000 in ?? ()

Thread 3 (Thread 12674.3):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9c6c in mach_msg ()
#2  0x080ba139 in mach_msg_server_timeout ()
#3  0x080784c5 in ports_manage_port_operations_one_thread (demuxer=0x3806fb0, 
    timeout=0) at ../../libports/manage-one-thread.c:120
#4  0x0806d069 in service_paging_requests (arg=0x81ea7b0)
    at ../../libpager/demuxer.c:300
#5  0x0807ca13 in entry_point (self=0x81ea820, 
    start_routine=0x806d020 <service_paging_requests>, arg=0x81ea7b0)
    at ./pthread/pt-create.c:64
#6  0x00000000 in ?? ()

Thread 2 (Thread 12674.2):
#0  dirscanblock (inum=<synthetic pointer>, ds=<optimized out>, 
    type=<optimized out>, namelen=<optimized out>, name=<optimized out>, 
    idx=<optimized out>, dp=<optimized out>, blockaddr=<optimized out>)
    at ../../ext2fs/dir.c:409
#1  diskfs_lookup_hard (dp=0x8211ef0, name=0x3001e84 "cache", 
    type=<optimized out>, npp=0x3001d24, ds=0x0, cred=0x3581ccb8)
    at ../../ext2fs/dir.c:217
#2  0x08055036 in diskfs_lookup (dp=0x8211ef0, name=<optimized out>, 
    type=LOOKUP, np=0x3001d24, ds=0x0, cred=0x3581ccb8)
    at ../../libdiskfs/lookup.c:166
#3  0x0806553e in diskfs_S_dir_lookup (dircred=0x3581ccb8, 
    path=0x3001e84 "cache", flags=0, mode=0, retry=0x3003e94, 
    retryname=0x3003e9c "", returned_port=0x30042a0, 
    returned_port_poly=0x3001d98) at ../../libdiskfs/dir-lookup.c:138
#4  0x0805be33 in _Xdir_lookup (InHeadP=0x3001e60, OutHeadP=0x3003e70)
    at fsServer.c:1837
#5  0x08059285 in diskfs_demuxer (inp=0x3001e60, outp=0x3003e70)
    at ../../libdiskfs/demuxer.c:46
#6  0x08078704 in internal_demuxer (outheadp=0x3003e70, inp=0x3001e60)
    at ../../libports/manage-multithread.c:207
#7  synchronized_demuxer (inp=0x3001e60, outheadp=0x3003e70)
    at ../../libports/manage-multithread.c:234
#8  0x080ba14a in mach_msg_server_timeout ()
#9  0x08078872 in thread_function (arg=0x1)
    at ../../libports/manage-multithread.c:254
#10 0x08078a89 in ports_manage_port_operations_multithread (bucket=0x81e99d0, 
    demuxer=0x80591f0 <diskfs_demuxer>, thread_timeout=120000, hook=0x0)
    at ../../libports/manage-multithread.c:285
#11 0x0806136b in master_thread_function (demuxer=0x80591f0 <diskfs_demuxer>)
    at ../../libdiskfs/init-first.c:37
#12 0x0807ca13 in entry_point (self=0x81e9ab0, 
    start_routine=0x8061340 <master_thread_function>, 
    arg=0x80591f0 <diskfs_demuxer>) at ./pthread/pt-create.c:64
#13 0x00000000 in ?? ()

Thread 1 (Thread 12674.1):
#0  0x080f970c in mach_msg_trap ()
#1  0x080b9c6c in mach_msg ()
#2  0x080ba19e in mach_msg_server_timeout ()
#3  0x080ba258 in mach_msg_server ()
#4  0x080fa7c1 in _hurd_msgport_receive ()
#5  0x0807ca13 in entry_point (self=0x81e8470, 
    start_routine=0x80fa780 <_hurd_msgport_receive>, arg=0x0)
    at ./pthread/pt-create.c:64
#6  0x00000000 in ?? ()




