Upgrade program receives SIGKILL

Environment:

  • a.rpm: contains apiserver
  • apiserver: serves the API (built with gin), including the endpoint for uploading the upgrade package
  • upgrader: the upgrade program

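Goal: uninstall a.rpm, then install the new version of a.rpm

Steps:

  1. Upload the upgrade package (tar.gz) through apiserver, which extracts it and runs nohup ./upgrader >/dev/null 2>&1 & to start the upgrade
  2. When upgrader runs rpm -e a / systemctl stop a / kill apiserver, upgrader receives SIGKILL (sent to it by systemd), and the upgrade aborts abnormally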

ps shows that upgrader's parent process is systemd, so at this point upgrader should no longer have any relationship with apiserver. Why does upgrader receive SIGTERM when apiserver is killed?

Cause: the cgroup still ties upgrader to apiserver.
Solution:
systemd-run --unit=my_system_upgrade --scope --slice=my_system_upgrade_slice -E setsid nohup upgrader >/dev/null 2>&1 &
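
One way to confirm this diagnosis, and to check whether the systemd-run workaround really moved upgrader out of the service's cgroup, is to look at the parent PID and the cgroup membership side by side. The following is only a sketch: the helper name show_cgroup is made up for illustration, and it assumes a Linux host with /proc mounted.

```shell
#!/bin/sh
# show_cgroup PID (hypothetical helper, not part of the original post)
# Prints the parent PID and the cgroup membership of the given process.
show_cgroup() {
    pid="$1"
    # Parent PID as recorded by the kernel; this is the value ps reports as PPID.
    ppid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
    echo "ppid: $ppid"
    # cgroup membership; this, not the parent PID, is what KillMode=control-group acts on.
    echo "cgroup:"
    cat "/proc/$pid/cgroup"
}

# Example: inspect the current shell.
show_cgroup "$$"
```

Running show_cgroup against the PID of upgrader before and after switching to systemd-run should make the difference visible: a parent PID of 1 in both cases, but a cgroup path under the service unit in the first case and under the new scope in the second.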

1 answer

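This is most likely related to how systemd uses cgroups for hierarchical management. When systemd stops a service, the default KillMode identifies the processes to kill by cgroup. In other words, for a service managed by systemd, the child processes it forks stay in the original service's cgroup by default, even if you push them into the background and they are detached from their original parent process.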
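
The inheritance described here can be observed without systemd being involved at all. The sketch below, which assumes a Linux host with /proc mounted and the setsid and nohup utilities installed, starts a child in a new session with SIGHUP ignored and compares its cgroup with that of the launching shell.

```shell
#!/bin/sh
# cgroup membership of this shell.
parent_cg=$(cat "/proc/$$/cgroup")

# setsid puts the child in a new session and nohup makes it ignore SIGHUP,
# but neither of them changes which cgroup the child belongs to.
# /proc/self here refers to the cat process itself, i.e. the detached child.
child_cg=$(setsid nohup cat /proc/self/cgroup 2>/dev/null)

if [ "$parent_cg" = "$child_cg" ]; then
    echo "child is in the same cgroup as its launcher"
else
    echo "child is in a different cgroup"
fi
```

On a typical host the two values should match, which is why detaching upgrader with nohup and & is not enough to keep it out of reach when the service is stopped.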

For details, see the description in man systemd.kill, then adjust the systemd configuration of apiserver: change KillMode from its default to process and try again.

OPTIONS
       KillMode=
           Specifies how processes of this unit shall be killed. One of control-group, process, mixed, none.

           If set to control-group, all remaining processes in the control group of this unit will be killed on unit
           stop (for services: after the stop command is executed, as configured with ExecStop=). If set to process,
           only the main process itself is killed. If set to mixed, the SIGTERM signal (see below) is sent to the
           main process while the subsequent SIGKILL signal (see below) is sent to all remaining processes of the
           unit's control group. If set to none, no process is killed. In this case, only the stop command will be
           executed on unit stop, but no process be killed otherwise. Processes remaining alive after stop are left
           in their control group and the control group continues to exist after stop unless it is empty.

           Processes will first be terminated via SIGTERM (unless the signal to send is changed via KillSignal=).
           Optionally, this is immediately followed by a SIGHUP (if enabled with SendSIGHUP=). If then, after a delay
           (configured via the TimeoutStopSec= option), processes still remain, the termination request is repeated
           with the SIGKILL signal or the signal specified via FinalKillSignal= (unless this is disabled via the
           SendSIGKILL= option). See kill(2) for more information.

           Defaults to control-group.
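
A minimal sketch of the suggested KillMode change, written as a drop-in rather than an edit to the packaged unit file. Several things here are assumptions: the unit is taken to be a.service (inferred from systemctl stop a in the question), and the function name write_killmode_dropin and the file name killmode.conf are made up for illustration.

```shell
#!/bin/sh
# write_killmode_dropin DIR (hypothetical helper)
# Writes a drop-in file that overrides KillMode for the unit owning DIR.
# DIR is the unit's drop-in directory, e.g. /etc/systemd/system/a.service.d
# (a.service is an assumed unit name).
write_killmode_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/killmode.conf" <<'EOF'
[Service]
KillMode=process
EOF
}

# Example, as root; reload afterwards so systemd picks up the drop-in:
#   write_killmode_dropin /etc/systemd/system/a.service.d
#   systemctl daemon-reload
```

With KillMode=process only the main process is signalled on stop, so any other process the service has spawned is left running in the unit's control group. That is what lets upgrader survive, but it applies to every other leftover child as well, so the systemd-run --scope approach from the question may be the more targeted fix.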