Add basic stale pid detection heuristic

While a normal crash should already remove the pid file if
one was created, mumo can still be killed in ways that leave
it unable to clean up after itself. This patch adds a basic
heuristic that checks whether the pid recorded in the pid
file belongs to a running process. If so, we assume that
process is actually mumo and terminate. If not, we assume mumo
isn't running and break the lock. This approach is not 100%
reliable, and it breaks the acquire timeout properties we had
before, but it should do the trick.
Stefan Hacker
2015-06-24 05:48:21 +02:00
parent b0b2c1bae1
commit 5680d806d2

@@ -542,6 +542,15 @@ if __name__ == '__main__':
         ret = do_main_program()
     else:
         pidfile = daemon.pidlockfile.TimeoutPIDLockFile(cfg.system.pidfile, 5)
+        if pidfile.is_locked():
+            try:
+                os.kill(pidfile.read_pid(), 0)
+                print >> sys.stderr, 'Mumo already running as %s' % pidfile.read_pid()
+                sys.exit(1)
+            except OSError:
+                print >> sys.stderr, 'Found stale mumo pid file but no process, breaking lock'
+                pidfile.break_lock()
         context = daemon.DaemonContext(working_directory=sys.path[0],
                                        stderr=logfile,
                                        pidfile=pidfile)
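
The heuristic described in the commit message can be sketched on its own, outside of mumo. This is only an illustration under stated assumptions, not the code in this commit: it targets a POSIX system, it is written so that it also runs on Python 3, and the helper names `pid_is_running` and `check_pidfile` are hypothetical. Unlike the patch, it treats `EPERM` as "process exists" and guards against an empty or unparsable pid file.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid appears to exist (POSIX only)."""
    # Hypothetical helper, not part of mumo or python-daemon.
    if pid is None or pid <= 0:
        # Never pass 0 or negative values to kill(): they address process groups.
        return False
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks the target
    except OSError as err:
        # EPERM: the process exists but belongs to another user.
        # ESRCH (and anything else): treat as "no such process".
        return err.errno == errno.EPERM
    return True


def check_pidfile(path):
    """Classify a pid file as 'absent', 'running' or 'stale'."""
    # Hypothetical helper illustrating the decision made in the patch.
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError):
        return 'absent'   # no readable pid file, so nothing holds the lock
    except ValueError:
        return 'stale'    # file exists but does not contain a pid
    return 'running' if pid_is_running(pid) else 'stale'
```

A caller would exit on `'running'` and remove the pid file on `'stale'`. As the commit message notes, the check cannot tell whether the live process is really mumo: a recycled pid makes an unrelated process look like a running instance.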