Discussion:
clustered qmail
x***@gmail.com
2007-12-10 12:00:17 UTC
Hi all,

I have a question about qmail and how it should be stopped properly in
a clustered environment.

The cluster is designed as 1:1 (active/passive): qmail always runs on
only one node of the cluster and fails over to the other node in case
of a problem.
We have this qmailctl script, which is always called when the cluster
tries to stop qmail:

stop)
echo "Stopping qmail: svscan"
kill `cat /var/qmail_lh2-1.03/svscan.pid`
rm /var/qmail_lh2-1.03/svscan.pid
echo " qmail"
svc -dx /var/qmail_lh2-1.03/supervise/*
echo " logging"
svc -dx /var/qmail_lh2-1.03/supervise/*/log
;;

I would also like to note that both nodes are connected to shared
storage via a SCSI bus, and the qmail queue is located on the shared
partition. The cluster software is RHCS (clumanager-1.2.31-1).

We frequently see in /var/log/messages that qmail was not stopped
correctly or was not able to stop correctly, leaving the service in
the disabled state.

I found this in Life with qmail:
G.16. qmail-send doesn't always exit immediately when killed.
Sending qmail-send a TERM signal doesn't cause it to exit immediately
if there are deliveries in progress. qmail-send will wait for all
qmail-local and qmail-remote processes to finish before it exits so it
can record the results of these deliveries.

So my question is how to make sure that qmail really stops within a
defined time (3 seconds). The only way I can think of is to modify the
qmailctl script to call svc -dk instead of svc -dx.

It is worth mentioning why I am asking this here. We have (don't ask
me why) used the ext3 file system for the shared partitions. This in
itself is not a bad idea, provided that both nodes never mount the
ext3 partition at the same moment. However, we have observed this
situation (double mounts) many times, and it has led to serious data
corruption.



I appreciate your help


Regards,


Jorge Sanchez
Dave Sill
2007-12-13 20:36:10 UTC
Post by x***@gmail.com
> stop)
> echo "Stopping qmail: svscan"
> kill `cat /var/qmail_lh2-1.03/svscan.pid`
> rm /var/qmail_lh2-1.03/svscan.pid
> echo " qmail"
> svc -dx /var/qmail_lh2-1.03/supervise/*
> echo " logging"
> svc -dx /var/qmail_lh2-1.03/supervise/*/log
> ;;
> ...
> We frequently see in /var/log/messages that qmail was not stopped
> correctly or was not able to stop correctly, leaving the service in
> the disabled state.
> G.16. qmail-send doesn't always exit immediately when killed.
> Sending qmail-send a TERM signal doesn't cause it to exit immediately
> if there are deliveries in progress. qmail-send will wait for all
> qmail-local and qmail-remote processes to finish before it exits so it
> can record the results of these deliveries.
> So my question is how to make sure that qmail really stops within a
> defined time (3 seconds). The only way I can think of is to modify the
> qmailctl script to call svc -dk instead of svc -dx.

That won't help. I think you'll have to kill off any persistent
qmail-local or qmail-remote processes after the "svc -dx" on qmail,
but before the "svc -dx" on logging, e.g. using "killall
qmail-remote". Note that this will cause duplicate deliveries if a
qmail-remote successfully delivers a message but isn't able to update
the queue before it's killed.
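
The ordering described above could be sketched roughly as follows. This
is only an illustration, not a tested replacement for the qmailctl stop
branch: the function names, the QMAIL_HOME and STOP_TIMEOUT variables,
and the use of pgrep to poll for leftover processes are assumptions; the
paths and the 3-second limit are taken from the original post.

```shell
#!/bin/sh
# Sketch of a time-bounded stop sequence (helper names are hypothetical).
QMAIL_HOME=${QMAIL_HOME:-/var/qmail_lh2-1.03}
STOP_TIMEOUT=${STOP_TIMEOUT:-3}   # seconds to wait for in-flight deliveries

# Return 0 if a process with any of the given exact names is running.
any_running() {
    for name in "$@"; do
        if pgrep -x "$name" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# Wait up to $1 seconds for the named processes to exit; return 1 on timeout.
wait_for_exit() {
    remaining=$1
    shift
    while any_running "$@"; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

stop_qmail() {
    echo "Stopping qmail: svscan"
    if [ -f "$QMAIL_HOME/svscan.pid" ]; then
        kill "$(cat "$QMAIL_HOME/svscan.pid")"
        rm -f "$QMAIL_HOME/svscan.pid"
    fi
    echo " qmail"
    svc -dx "$QMAIL_HOME"/supervise/*
    # Give running deliveries a bounded chance to finish, then kill leftovers.
    if ! wait_for_exit "$STOP_TIMEOUT" qmail-send qmail-remote qmail-local; then
        echo " killing leftover delivery processes"
        killall qmail-remote qmail-local 2>/dev/null
    fi
    echo " logging"
    svc -dx "$QMAIL_HOME"/supervise/*/log
}
```

Calling stop_qmail from the stop) branch keeps the logging service up
until the delivery processes are gone. The duplicate-delivery risk
mentioned above still applies whenever the killall step actually fires.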
Post by x***@gmail.com
> It is worth mentioning why I am asking this here. We have (don't ask
> me why) used the ext3 file system for the shared partitions. This in
> itself is not a bad idea, provided that both nodes never mount the
> ext3 partition at the same moment. However, we have observed this
> situation (double mounts) many times, and it has led to serious data
> corruption.

Sounds like you need to set up some kind of locking mechanism through
the shared filesystem so that qmail never starts on a system if it's
still running on the other system.
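
One possible shape for such a lock, as a minimal sketch only: the
LOCK_DIR path and the function names are hypothetical, and it assumes
the lock location is on storage that both nodes can reliably see. It
relies on mkdir being atomic, so only one caller can create the
directory.

```shell
#!/bin/sh
# Sketch of a start-time lock (path and function names are hypothetical).
LOCK_DIR=${LOCK_DIR:-/var/qmail_lh2-1.03/qmail-owner.lock}

# Try to take the lock; refuse if another node (or a stale lock) holds it.
acquire_qmail_lock() {
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        uname -n > "$LOCK_DIR/owner"   # record which node owns qmail
        return 0
    fi
    echo "qmail lock held by: $(cat "$LOCK_DIR/owner" 2>/dev/null || echo unknown)" >&2
    return 1
}

# Release the lock after qmail has fully stopped.
release_qmail_lock() {
    rm -f "$LOCK_DIR/owner"
    rmdir "$LOCK_DIR" 2>/dev/null
}
```

The start branch would call acquire_qmail_lock before launching svscan,
and the stop branch would call release_qmail_lock only once every qmail
process has exited. A node that crashes leaves the lock behind, so the
cluster would also need some policy for clearing a stale lock.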
--
Dave Sill Oak Ridge National Lab, Workstation Support
Author, The qmail Handbook <http://web.infoave.net/~dsill>
<http://lifewithqmail.org/>: Almost everything you always wanted to know.