The opscode-erchef (beam.smp) process is consuming too much CPU on our production Chef 11.2.3 server.
Below is some data from the issue. Rebooting the server didn't resolve it.
packages:
private-chef-11.2.3-1.el6.x86_64
opscode-manage-1.17.0-1.el6.x86_64
opscode-reporting-1.5.5-1.el6.x86_64
opscode-push-jobs-server-1.1.3-1.el6.x86_64
System info:
Total Node Counts: 3600
MemTotal: 24GB
CPU(s): 4
# free -m
             total       used       free     shared    buffers     cached
Mem:         24026      23606        419       6132        369      17754
-/+ buffers/cache:       5482      18544
Swap:        12287         81      12206
beam.smp process
The beam.smp process is consuming a full CPU core:
Chef11Server # pidstat -p 10816 1
Linux 2.6.32-754.12.1.el6.x86_64 06/03/2019 _x86_64_ (4 CPU)
02:10:33 PM PID %usr %system %guest %CPU CPU Command
02:10:34 PM 10816 100.00 11.00 0.00 100.00 2 beam.smp
02:10:35 PM 10816 100.00 11.00 0.00 100.00 2 beam.smp
02:10:36 PM 10816 100.00 11.00 0.00 100.00 2 beam.smp
02:10:37 PM 10816 100.00 8.00 0.00 100.00 2 beam.smp
02:10:38 PM 10816 100.00 9.00 0.00 100.00 2 beam.smp
Network connection per second
Roughly 100 requests per second are arriving, based on the request log below:
ChefServer # cat /var/log/opscode/opscode-erchef/requests.log.4 | cut -f1 -d' ' | sort |uniq -c
111 2019-06-02T09:28:41Z
128 2019-06-02T09:28:42Z
135 2019-06-02T09:28:43Z
133 2019-06-02T09:28:44Z
125 2019-06-02T09:28:45Z
107 2019-06-02T09:28:46Z
117 2019-06-02T09:28:47Z
109 2019-06-02T09:28:48Z
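The per-second counts above can be averaged with a small awk pass over the same pipeline. A minimal sketch, using a made-up four-line sample in place of the real log (on the server, feed in requests.log.4 as in the command above):

```shell
# Average requests/sec from erchef request-log timestamps (field 1).
# The sample lines below are hypothetical; on the live server use:
#   cut -f1 -d' ' /var/log/opscode/opscode-erchef/requests.log.4 | sort | uniq -c | ...
cut -f1 -d' ' <<'EOF' | sort | uniq -c | awk '{sum += $1; n++} END {printf "avg req/s: %.1f\n", sum / n}'
2019-06-02T09:28:41Z GET /nodes
2019-06-02T09:28:41Z GET /nodes
2019-06-02T09:28:42Z PUT /nodes/web1
2019-06-02T09:28:42Z GET /environments
EOF
# prints: avg req/s: 2.0
```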
System uptime
14:18:14 up 6 days, 1:30, 10 users, load average: 5.19, 5.58, 6.27
TOP Command
The top command output is shown below:
ChefServer # top
top - 14:18:49 up 6 days, 1:31, 10 users, load average: 5.40, 5.59, 6.25
Tasks: 425 total, 2 running, 423 sleeping, 0 stopped, 0 zombie
Cpu(s): 73.7%us, 7.9%sy, 0.0%ni, 17.2%id, 0.2%wa, 0.0%hi, 1.1%si, 0.0%st
Mem: 24603160k total, 24267960k used, 335200k free, 379692k buffers
Swap: 12582908k total, 83792k used, 12499116k free, 18266808k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10816 opscode 20 0 1025m 308m 3508 S 115.0 1.3 11355:58 beam.smp
3728 opscode 20 0 1325m 320m 3140 S 73.2 1.3 1964:26 beam.smp
11415 opscode 20 0 249m 113m 2140 S 38.4 0.5 1002:35 ruby
10717 opscode 20 0 98268 9988 2712 R 20.2 0.0 983:06.91 nginx
10719 opscode 20 0 98120 9884 2720 S 10.3 0.0 984:38.63 nginx
11407 opscode 20 0 243m 107m 2140 S 9.9 0.4 970:02.54 ruby
10920 opscode 20 0 3318m 1.1g 5440 S 8.9 4.7 588:25.92 java
3772 opscode- 20 0 6208m 51m 47m S 7.0 0.2 45:31.69 postgres
10917 opscode 20 0 1114m 86m 3164 S 6.0 0.4 420:12.62 beam.smp
10965 opscode 20 0 2264m 181m 2364 S 6.0 0.8 900:24.29 beam.smp
Network Connections
There are too many open UDP sockets on both the opscode-erchef and oc_bifrost processes:
# netstat -tulpn |grep 10816 | wc -l
352
# netstat -tulpn |grep 10816 |head
tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN 10816/beam.smp
tcp 0 0 0.0.0.0:36392 0.0.0.0:* LISTEN 10816/beam.smp
udp 0 0 0.0.0.0:51639 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:49975 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:49847 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:41911 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:41783 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:46391 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:37176 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:38840 0.0.0.0:* 10816/beam.smp
The oc_bifrost process has also opened many UDP sockets (352).
The opscode-erchef and oc_bifrost processes show the same number of UDP sockets:
# netstat -tulpn |grep 3728|wc -l
352
# private-chef-ctl status | grep 3728
run: oc_bifrost: (pid 3728) 128119s; run: log: (pid 1298) 524402s
# private-chef-ctl status | grep 10816
run: opscode-erchef: (pid 10816) 515140s; run: log: (pid 1290) 524530s
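To see how a process's 352 sockets split by protocol, an awk tally over the netstat output works. The sketch below runs against a three-line sample copied from the listing above; on the server, pipe `netstat -tulpn` in instead:

```shell
# Tally sockets per protocol for PID 10816 from netstat -tulpn output.
# Sample lines are taken from the listing above; live usage would be:
#   netstat -tulpn | awk '$NF == "10816/beam.smp" {count[$1]++} END {...}'
awk '$NF == "10816/beam.smp" {count[$1]++} END {for (p in count) print p, count[p]}' <<'EOF'
tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN 10816/beam.smp
udp 0 0 0.0.0.0:51639 0.0.0.0:* 10816/beam.smp
udp 0 0 0.0.0.0:49975 0.0.0.0:* 10816/beam.smp
EOF
```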
SYSCALL statistic
getsockopt and setsockopt are being called far too often:
# ./syscount -c -p 10816
SYSCALL COUNT
write 2
newstat 3
access 6
connect 357
munmap 1227
getpeername 2662
bind 3008
socket 3008
read 3850
accept 4720
sendto 5303
close 5610
getsockname 5670
epoll_ctl 9925
fcntl 11340
futex 13962
epoll_wait 22926
setsockopt 29957
getsockopt 38310
recvfrom 46708
writev 62396
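To put a number on "too much", the syscount table can be totaled so the socket-option share is explicit. This sketch simply re-feeds the table above through awk:

```shell
# Fraction of traced syscalls spent in getsockopt/setsockopt,
# computed from the syscount output above.
awk '{total += $2; if ($1 == "getsockopt" || $1 == "setsockopt") sock += $2}
     END {printf "sockopt share: %.0f%% of %d calls\n", 100 * sock / total, total}' <<'EOF'
write 2
newstat 3
access 6
connect 357
munmap 1227
getpeername 2662
bind 3008
socket 3008
read 3850
accept 4720
sendto 5303
close 5610
getsockname 5670
epoll_ctl 9925
fcntl 11340
futex 13962
epoll_wait 22926
setsockopt 29957
getsockopt 38310
recvfrom 46708
writev 62396
EOF
# prints: sockopt share: 25% of 270950 calls
```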
I would like to reduce the CPU utilization.
Any help or advice would be appreciated.