kernel ext4 lockup causing nginx slowdown #177

Open

staticfloat opened this issue Aug 28, 2023 · 11 comments

Comments

@staticfloat
Member

We had a discussion about the us-east loadbalancer getting slow. Initial inspection showed that the network interface was rarely achieving more than 180 Mbps out. Diving deeper, we found that some nginx processes had become stuck in "uninterruptible sleep" (D in ps output). Looking in dmesg after an echo w > /proc/sysrq-trigger showed that they were stuck in the kernel during an fsync.
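
For reference, roughly the inspection commands involved (standard procps/sysrq usage, nothing specific to this host):

# List processes currently stuck in uninterruptible sleep ("D" state),
# along with the kernel function they are waiting in
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /^D/'

# Ask the kernel to dump stack traces of all blocked tasks, then read
# them back out of the kernel log (needs root and sysrq enabled)
echo w > /proc/sysrq-trigger
dmesg | tail -n 200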

This is a fairly pathological failure, but there are a few things we could do to ameliorate it:

  • Oversubscribe the number of nginx worker processes. Right now we have only 2 workers on the loadbalancer; we could probably double that to 2x per core (so 4 processes total) without any harmful effects, which would at least delay the problem in the future. This needs a templating step on the "optimized" nginx config to insert $(($(nproc) * 2)) into the worker_processes directive (see the sketch after this list).
  • Better monitoring on the instances. Our Grafana server has bitrotted; we should resurrect it so that we can notice the lowered network throughput and increased CPU time shown in the attached Lightsail graph.
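
A minimal sketch of the templating step from the first item, assuming a template file with a @WORKER_PROCESSES@ placeholder (the placeholder name and file paths are hypothetical):

# Render worker_processes as 2x the core count into the nginx config;
# nginx.conf.template and the output path are placeholders for illustration
sed "s/@WORKER_PROCESSES@/$(($(nproc) * 2))/" nginx.conf.template > /etc/nginx/nginx.conf

# On a 2-core loadbalancer the rendered directive would read:
#   worker_processes 4;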

I am loath to do something drastic like auto-reboot the loadbalancer, because it is supposed to be the piece that you don't have to reboot. If this happens again, I'll consider it.

@StefanKarpinski
Collaborator

Looks like if the average CPU utilization stays above 20% for many consecutive intervals, then the machine needs a reboot. FWIW, in my experience the only way to build fully reliable systems is to detect bad states and restart.
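
A rough sketch of that check (the 20% threshold comes from the comment above; the sampling interval, window size, and use of sysstat's mpstat are assumptions):

# Count consecutive 60-second intervals with average CPU utilization above 20%
busy=0
for i in $(seq 1 10); do
    idle=$(mpstat 60 1 | awk '/Average/ {print $NF}')   # %idle over the interval
    if [ "$(echo "100 - $idle > 20" | bc)" -eq 1 ]; then
        busy=$((busy + 1))
    else
        busy=0
    fi
done
[ "$busy" -ge 5 ] && echo "sustained CPU load detected; flag this instance for a reboot"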

@StefanKarpinski
Collaborator

Also, in this case nginx is not something we really need to be in the business of ensuring the reliability of...

@DilumAluthge
Member

Should we switch from Nginx to something else?

@staticfloat
Member Author

I don't think this is nginx's fault; I'm pretty sure this is a kernel bug.

@Keno

Keno commented Aug 29, 2023

Looking in dmesg after an echo w > /proc/sysrq-trigger showed that they were stuck in the kernel during an fsync.

Do you have a full backtrace, and also what's the filesystem?

@staticfloat
Member Author

staticfloat commented Aug 29, 2023

Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189515] task:nginx           state:D stack:    0 pid:1248331 ppid:1248210 flags:0x00000000
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189518] Call Trace:
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189519]  <TASK>
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189520]  __schedule+0x2cd/0x8a0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189523]  ? crypto_shash_update+0x1e/0x30
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189527]  schedule+0x4e/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189530]  schedule_timeout+0x18f/0x2d0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189533]  ? _raw_spin_unlock_irqrestore+0xe/0x30
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189536]  ? del_timer_sync+0x40/0x40
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189543]  io_schedule_timeout+0x1e/0x50
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189545]  balance_dirty_pages+0x35a/0xe60
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189549]  balance_dirty_pages_ratelimited+0x311/0x3e0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189552]  generic_perform_write+0x138/0x1f0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189556]  ext4_buffered_write_iter+0x98/0x160
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189561]  ext4_file_write_iter+0x53/0x230
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189565]  ? common_file_perm+0x72/0x180
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189567]  do_iter_readv_writev+0x152/0x1b0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189570]  do_iter_write+0x8c/0x1d0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189573]  vfs_iter_write+0x19/0x30
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189577]  ovl_write_iter+0x2d3/0x490 [overlay]
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189586]  do_iter_readv_writev+0x152/0x1b0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189588]  do_iter_write+0x8c/0x1d0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189590]  vfs_writev+0x83/0x150
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189593]  do_pwritev+0x97/0xe0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189595]  __x64_sys_pwritev+0x21/0x30
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189597]  do_syscall_64+0x5c/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189601]  ? __x64_sys_pwritev+0x21/0x30
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189602]  ? do_syscall_64+0x69/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189604]  ? do_syscall_64+0x69/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189607]  ? do_syscall_64+0x69/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189609]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189613] RIP: 0033:0x7fa00ea0c829
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189616] RSP: 002b:00007ffde9500e18 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189618] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa00ea0c829
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189620] RDX: 0000000000000002 RSI: 00007ffde9500e40 RDI: 00000000000001da
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189622] RBP: 000055e6541b7058 R08: 0000000000000000 R09: 0000000000002000
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189623] R10: 0000000015338000 R11: 0000000000000246 R12: 0000000000000000
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189625] R13: 0000000015338000 R14: 0000000000000000 R15: 00007ffde9500e20
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189627]  </TASK>

The root filesystem is ext4 (which is reinforced by the stack trace above), but since nginx is running inside of Docker, overlayfs is likely involved as well. I unfortunately do not have multiple stack traces to verify at which point in the stack above it was stuck.
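
For completeness, confirming that filesystem stack is straightforward with standard tooling (the overlay2 driver is the usual Docker default, not something verified here):

# Filesystem type backing the root mount
findmnt -no FSTYPE /

# Overlay mounts created by Docker for running containers
mount -t overlay

# Storage driver Docker is configured with (typically overlay2)
docker info --format '{{.Driver}}'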

@Keno

Keno commented Aug 29, 2023

I thought you said it was in fsync?

@staticfloat
Member Author

Indeed I did, but I must have crossed some wires, because it was actually a dpkg task that was stuck in fsync:

Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189635] task:dpkg            state:D stack:    0 pid:1251567 ppid:1250276 flags:0x00004000
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189637] Call Trace:
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189638]  <TASK>
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189639]  __schedule+0x2cd/0x8a0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189642]  schedule+0x4e/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189644]  io_schedule+0x16/0x40
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189647]  wait_on_page_bit_common+0x16e/0x3b0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189651]  ? filemap_invalidate_unlock_two+0x50/0x50
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189654]  wait_on_page_bit+0x3f/0x50
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189657]  wait_on_page_writeback+0x26/0x80
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189659]  __filemap_fdatawait_range+0x9f/0x120
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189663]  ? do_writepages+0xc5/0x1c0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189665]  ? __ext4_mark_inode_dirty+0x200/0x200
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189668]  ? do_writepages+0xc5/0x1c0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189671]  file_write_and_wait_range+0xd3/0xf0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189675]  ext4_sync_file+0xf6/0x320
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189677]  vfs_fsync_range+0x49/0x90
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189679]  do_fsync+0x3d/0x80
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189681]  ? __x64_sys_ftruncate+0x1b/0x30
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189686]  __x64_sys_fsync+0x14/0x20
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189688]  do_syscall_64+0x5c/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189690]  ? do_syscall_64+0x69/0xc0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189692]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189695] RIP: 0033:0x7f123a0781a7
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189697] RSP: 002b:00007ffd029d10b8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189699] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f123a0781a7
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189700] RDX: 0000000000000000 RSI: 00000000000001ed RDI: 0000000000000004
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189701] RBP: 000056408be47fa0 R08: 0000000000000000 R09: 0000000000002800
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189702] R10: 000056408b9677b6 R11: 0000000000000246 R12: 000056408bde52a0
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189704] R13: 00000000000001ed R14: 000056408bdf6580 R15: 0000000000000000
Aug 28 18:32:38 loadbalancer-us-east kernel: [24799645.189706]  </TASK>

@Keno

Keno commented Aug 29, 2023

OK, there are a couple of kernel bugs with backtraces that look like this, but nothing particularly recent. I would recommend a kernel upgrade; if it still happens, let me know next time and I can try to diagnose it on a live system.

@staticfloat
Member Author

We upgraded from 5.15.0-1022-aws (the version where the bad stuff happened) to 5.15.0-1041-aws (the latest as of yesterday). If it happens again, I'll let you know.
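
For the record, on Ubuntu's AWS images the upgrade amounts to roughly the following (the linux-aws metapackage name is the stock one; treat the exact commands as a sketch):

# Check the currently running kernel
uname -r                  # was 5.15.0-1022-aws before the upgrade

# Pull the newer AWS kernel and reboot into it
sudo apt-get update
sudo apt-get install --only-upgrade linux-aws
sudo reboot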

@staticfloat staticfloat changed the title nginx processes can get "stuck" kernel ext4 lockup causing nginx slowdown Aug 29, 2023
@StefanKarpinski
Collaborator

In my ideal world we would have monitoring that detects this, records whatever dmesg info it needs to diagnose it, notifies us (on #infrastructure?), and automatically reboots the machine. Or would it be better to keep this machine around for debugging and replace it with a new one?
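
A minimal sketch of what that watchdog could look like, assuming a Slack-style incoming webhook for the #infrastructure notification (the webhook URL, log path, and exact trigger condition are all placeholders):

#!/bin/bash
# Hypothetical watchdog: if any task is stuck in D state, capture the
# blocked-task traces for later diagnosis, notify, and reboot
if ps -eo stat= | grep -q '^D'; then
    echo w > /proc/sysrq-trigger
    dmesg > "/var/log/stuck-tasks-$(date +%s).log"        # evidence to debug after the reboot
    curl -s -X POST -H 'Content-type: application/json' \
        -d '{"text":"loadbalancer-us-east: D-state tasks detected, rebooting"}' \
        "$SLACK_WEBHOOK_URL"                              # placeholder webhook for #infrastructure
    systemctl reboot
fi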
