[ale] PR_SET_PDEATHSIG wrapper

Chris Fowler cfowler at outpostsentinel.com
Sat May 29 19:29:18 EDT 2021


I've been messing around with executing some of my chroot systems inside namespaces.  I've now got a few of them using a network namespace with veth pairs.

Because of my use of one master process for each one, and now one master process for all, I keep running into problems where a child of the master, one that is still in the foreground of that master, does not die when its parent terminates.

Ubuntu 18.04 on my desktop does not support the '--kill-child' option of unshare.  I'm using daemon to be the master of all and daemon for the master of each.

Looks like this:

/usr/bin/daemon -r --name=devel-master --chdir=/opt/devel -- /usr/bin/unshare -fpm --mount-proc --  /opt/devel/start-devel.sh -f

daemon executes unshare, which runs '/opt/devel/start-devel.sh -f' in its own namespace as pid 1.  When that script exits, all of its children are terminated.
-f is 'foreground'.  The script loops through each target and starts daemon and jchroot on each.  At the end is a trap 'exit 0' SIGTERM and a while true; do sleep 1; done loop.  That's the only thing that makes it foreground.

To make all this work so that I could simply do 'daemon --name=devel-master --stop' I had to write a PR_SET_PDEATHSIG wrapper in C.

Something like this:


#define _GNU_SOURCE   /* execvpe() is a GNU extension; must precede all includes */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/prctl.h>
#include <unistd.h>

extern char **environ;


int main(int argc, char **argv) {
  /* Fallback when no command is given: an interactive login shell. */
  char *args[] = { "/bin/sh", "-i", "-l", NULL };
  char **cmd;

  argv++;
  argc--;

  /* Skip an optional "--" separator. */
  if(argc > 0 && strcmp(argv[0], "--") == 0) {
    argv++;
    argc--;
  }

  cmd = (argc < 1) ? args : argv;

  /* Ask the kernel to send us SIGTERM when our parent dies.  The
     setting is normally preserved across execve(), so the program
     we exec keeps it. */
  if(prctl(PR_SET_PDEATHSIG, (unsigned long)SIGTERM) == -1)
    perror("prctl");

  execvpe(cmd[0], cmd, environ);
  perror(cmd[0]);   /* only reached if the exec failed */
  exit(127);
}

// vi: set ts=2 sw=2:


All this does is arrange for the program started by execvpe to receive SIGTERM when its parent dies.  When daemon sends SIGTERM to unshare, unshare just dies.  It does not kill its child.  This fixes that.  What is the alternative?  Use a newer version of util-linux, or write scripts to hunt down all those children in /proc and kill the process that will force them all to disappear?  That process is the start-devel.sh script run by unshare.

/usr/bin/daemon -r --name=devel-master --chdir=/opt/devel -- /usr/bin/unshare -fpm --mount-proc -- /usr/local/bin/pdeathsig --  /opt/devel/start-devel.sh -f

Setting up the network namespaces with virtual addresses was a bit of an oddity.  I guess there is no way to do EVERYTHING from the parent's namespace that will allow the child namespace to ping 4.2.2.1 via vpeer1 of 10.200.1.2.  To automate that, I had to create a sub-shell child process that waits for jchroot to create a pid file.  The sub-shell creates the veth pair in the parent namespace.  It needs jchroot's pid so that it can use nsenter to run ip commands to assign an address to the vpeer1 interface.
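The waiting step is a plain polling loop.  My version is a sub-shell, but a minimal sketch of the same idea in C might look like the following.  The function name, the 100 ms polling interval and the timeout parameter are all made up for illustration; nothing here is taken from jchroot itself.

```c
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Poll `path` until it contains a positive pid or `timeout_ms` has
   elapsed.  Returns the pid, or -1 on timeout.
   Hypothetical helper: name, interval and timeout are assumptions. */
pid_t wait_for_pidfile(const char *path, int timeout_ms) {
  const struct timespec step = { 0, 100 * 1000 * 1000 };  /* 100 ms */
  int waited = 0;

  for (;;) {
    FILE *fp = fopen(path, "r");
    if (fp != NULL) {
      long pid = 0;
      int n = fscanf(fp, "%ld", &pid);
      fclose(fp);
      if (n == 1 && pid > 0)
        return (pid_t)pid;      /* file exists and parses as a pid */
    }
    if (waited >= timeout_ms)
      return -1;                /* gave up waiting */
    nanosleep(&step, NULL);
    waited += 100;
  }
}
```

The returned pid is what would then be handed to something along the lines of 'nsenter -t <pid> -n ip addr add 10.200.1.2/24 dev vpeer1', where the /24 prefix length is an assumption on my part.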

jchroot is a chroot program that creates a new namespace, reads an fstab file to do bind mounts, and executes a program in that namespace.  In my chroots, it always executes /usr/bin/sshd -p N -D in the foreground since I use ssh to access these environments.  I had to modify jchroot so that it would send SIGTERM down to sshd when it was terminated by daemon.

I can have 100 processes spread across 4 namespaces under 1 master namespace.  When I stop the master daemon process, all of those are gone.  No inheritance by init of the host OS.  No grepping through /proc trying to chase down things to kill.  No lingering processes that you may have missed.  That's near perfection and provides a clean slate if you restart that master.  Is there another way to tell processes that they are not allowed to be reparented to init when their parent dies?  If the parent dies, the child dies too.

A program that greps for things across 30 inputs using GNU tail will leave 30 GNU tails running if it dies abruptly.  Start it back up and you now have 60.  The first 30 are now worthless, consuming memory and other resources.  For some reason, SIGPIPE did not get their attention.



