





            


                 EETTMM ---- AA PPrrooggrraamm EExxcceeppttiioonn aanndd TTeerrmmiinnaattiioonn MMaannaaggeerr


                                    _P_a_u_l _D_u_B_o_i_s
                              _d_u_b_o_i_s_@_p_r_i_m_a_t_e_._w_i_s_c_._e_d_u
                     Wisconsin Regional Primate Research Center
                           Revision date:  10 April 1997


            11..  IInnttrroodduuccttiioonn

            This document describes Exception  and  Termination  Manager
            (ETM),  a simple(-minded) library to manage exceptional con-
            ditions that arise during program execution, and to  provide
            for orderly program shutdown.

            There  are at least a couple of approaches one may adopt for
            handling error conditions within an application:

            o    Have functions always return a value and have all
                 callers test the return value and respond accordingly.

            o    Force the program to give up and exit early.

            Each approach has strengths and  weaknesses.   A  difficulty
            with  the  first is that actions composed of many subsidiary
            actions, each of which may themselves succeed or  fail,  can
            easily  become very unwieldy when an attempt is made to han-
            dle all possible outcomes.  However,  such  a  program  will
            also continue in the face of extreme adversity.

            An advantage of the second approach is that it is, conceptu-
            ally at least, simpler to let a program die when  a  serious
            error  occurs.   The difficulty lies in making sure the pro-
            gram cleans up and shuts  down  properly  before  it  exits.
            This  can be a problem especially when a program uses a num-
            ber of independent modules which can each  encounter  excep-
            tional  conditions  and  need to be shut down, and which may
            know nothing of each other.  ETM is  designed  to  alleviate
            the difficulties of this second approach.

            The general architecture assumed for this discussion is that
            of an application which uses zero or more  subsystems  which
            may be more or less independent of each other, and which may
            each require initialization and/or termination.  Also, other
            application-specific   initialization   and/or   termination
            actions may need to be  performed  which  are  unrelated  to
            those  of  the  subsystems, e.g., temporary files created at
            the beginning of the application need to be  removed  before
            final termination, network connections need to be shut down,
            terminal state needs to be restored.

            Ideally, when an application executes normally, it will ini-
            tialize,  perform  the main processing, then shut down in an
            orderly fashion.  This does not always  occur.   Exceptional
            conditions may be detected which necessitate a ``panic'' (an
            immediate program exit) because processing  cannot  continue
            further,  or  because  it is judged too burdensome to try to
            continue.

            An individual subsystem may be easily written  such  that  a
            panic  within  itself  causes  its  own  shutdown code to be
            invoked.  It is more difficult to arrange for other  subsys-
            tems  to be notified of the panic so that they can shut down
            as well, since the subsystem in which the panic  occurs  may
            not even know about them.

            An  additional  difficulty is that some exceptions may occur
            for reasons not related to algorithmically detectable condi-
            tions.  For instance, the user of an application may cause a
            signal to be delivered to it at any time.  This has  nothing
            to do with normal execution and cannot be predicted.

            The goals of ETM are thus twofold:

            (1)  Panics  triggered anywhere within an application or any
                 of its subsystems should cause orderly shutdown of  all
                 subsystems and the application itself.

            (2)  Signals  that  normally  terminate  a program should be
                 caught and trigger a panic to  allow  shutdown  as  per
                 (1).

            22..  PPrroocceessssiinngg MMooddeell

            The  model  used  by ETM is that the application initializes
            subsystems in the order required by any  dependencies  among
            them,  and  then  terminates them in the reverse order.  The
            presumption here is that if subsystem ss2 is dependent
            upon subsystem ss1, then ss1 should be initialized first and
            terminated last; the dependency is unlikely to make it  wise
            to shut down ss1 before ss2.

            ETM  must  itself  be initialized before any other subsystem
            which uses it.  The initialization call, ETMInit(), takes as
            an argument a pointer to a routine which performs any appli-
            cation-specific cleanup not related to  its  subsystems,  or
            NULL if there is no such routine.

            Each  of  the subsystems should then be initialized.  A sub-
            system's initialization routine should call  ETMAddShutdown-
            Proc()  to  register  its  own shutdown routine with ETM, if
            there is one.  (Some subsystems may require no explicit ini-
            tialization or termination.  However, if there is a shutdown
            routine, you should at least  call  ETMAddShutdownProc()  to
            register it.)
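
            The registration model above can be pictured with a small
            stand-in.  This is only a sketch under stated assumptions,
            not ETM's actual source: add_shutdown_proc() and
            run_shutdown_procs() are hypothetical names playing the
            roles of ETMAddShutdownProc() and the shutdown pass, and
            the fixed-size table is an assumption made for brevity.  It
            shows routines being run in the reverse of the order in
            which they were registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shutdown registry; not ETM's source. */
typedef void (*ShutdownProc)(void);

#define MAX_PROCS 32

static ShutdownProc procs[MAX_PROCS];
static size_t nprocs = 0;

/* Remember a shutdown routine.  Returns 0 on success, -1 if the table is full. */
int add_shutdown_proc(ShutdownProc p)
{
    assert(p != NULL);
    if (nprocs == MAX_PROCS)
        return -1;
    procs[nprocs++] = p;
    return 0;
}

/* Run every registered routine, most recently registered first. */
void run_shutdown_procs(void)
{
    while (nprocs > 0)
        procs[--nprocs]();
}

/* Two toy subsystems that record the order in which they shut down. */
static char order[8];
static size_t norder = 0;

static void ss1_end(void) { order[norder++] = '1'; }
static void ss2_end(void) { order[norder++] = '2'; }

void ss1_init(void) { add_shutdown_proc(ss1_end); }
void ss2_init(void) { add_shutdown_proc(ss2_end); }  /* ss2 depends on ss1 */

/* Initialize ss1 then ss2, shut down, and return the order as a string. */
const char *demo_shutdown_order(void)
{
    norder = 0;
    ss1_init();
    ss2_init();
    run_shutdown_procs();
    order[norder] = '\0';
    return order;
}
```

            Calling demo_shutdown_order() yields "21": ss2, which was
            initialized last, is shut down first.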

            When  the program detects an exceptional condition, it calls
            ETMPanic() to describe the problem and exit.  ETMPanic()  is
            also  called  automatically when a signal is caught.  A mes-
            sage is printed, and all the  shutdown  routines  that  have
            been  registered  are  automatically executed, including the
            application-specific one.

            ETM is designed to handle shutting down under  unusual  cir-
            cumstances, but it also works well for terminating normally.
            Instead of calling ETMPanic(), the application calls
            ETMEnd(), which takes care of calling all the shutdown
            routines that have been registered.  This is much like
            calling ETMPanic(), except that no error message is printed
            and ETMEnd() returns to the caller.

            It is evident that the  functionality  provided  by  ETM  is
            somewhat  like that of the atexit() routine provided on some
            systems.  Some differences between the two are:

            o    atexit() is either built in or not available.  ETM can
                 be put on any system to which it can be ported (extent
                 unknown, but includes at least SunOS, Ultrix, Mips
                 RISC/os and THINK C).

            o    ETM is more suited for handling exceptional conditions.

            o    ETM shutdown routines can be installed and removed
                 later.  atexit() provides only for installation
                 (although you could simulate removal by setting a flag
                 which shutdown routines examine to see whether to
                 execute or not).
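
            The flag technique mentioned in the last item might look
            like the following sketch, which uses only the standard C
            atexit() call.  The names (scratch_cleanup and friends) are
            hypothetical, and the body of the handler is left as a
            placeholder.

```c
#include <assert.h>
#include <stdlib.h>

/* Flag consulted by the handler; clearing it "removes" the handler. */
static int scratch_cleanup_enabled = 0;
static int scratch_cleanup_runs = 0;    /* counts real executions */

/* Handler registered with atexit(); does nothing once disabled. */
void scratch_cleanup(void)
{
    if (!scratch_cleanup_enabled)
        return;
    scratch_cleanup_runs++;
    /* the real cleanup work (e.g., removing a scratch file) goes here */
}

/* Register the handler once and arm it.  Returns 0 on success. */
int install_scratch_cleanup(void)
{
    static int registered = 0;

    if (!registered) {
        if (atexit(scratch_cleanup) != 0)
            return -1;
        registered = 1;
    }
    scratch_cleanup_enabled = 1;
    return 0;
}

/* atexit() has no inverse, so "removal" just disarms the handler. */
void remove_scratch_cleanup(void)
{
    scratch_cleanup_enabled = 0;
}

/* Number of times the handler has actually done its work. */
int scratch_cleanup_run_count(void)
{
    return scratch_cleanup_runs;
}
```

            The handler stays registered for the life of the program;
            only the flag changes, which is the bookkeeping that ETM's
            removal routine spares the application.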

            Here is a short example of how to set up and shut down using
            ETM.


            main ()
            {
               . . .
               ETMInit (Cleanup);  /* register application-specific cleanup */
               SS1Init ();         /* registers SS1End() for shutdown */
               SS2Init ();         /* registers SS2End() for shutdown */
               SS3Init ();         /* registers SS3End() for shutdown */
               ... main processing here ...
               ETMEnd ();          /* calls SS3End (), SS2End (), SS1End () and Cleanup () */
               exit (0);
            }


            Subsystems that are themselves built on other subsystems may
            follow this model, except that they would not call ETMInit()
            or ETMEnd().

            If  there is no special initialization or shutdown activity,
            and you don't care about catching signals, it is not  neces-
            sary  to  call  ETMInit() and ETMEnd().  The application may
            still call ETMPanic() to print error messages and terminate.
            (Even if the application does use ETMInit() and ETMEnd(), it
            is safe to call ETMPanic()  before  any  initialization  has
            been  done,  because  nothing  needs to be shut down at that
            point yet.)

            If ETM itself encounters an exceptional condition (e.g.,  it
            cannot  allocate  memory  when  it  needs  to),  it will--of
            course--trigger a panic.  This should be  rare,  but  if  it
            occurs,  ETM  will  generate  a  message indicating what the
            problem was.

            33..  CCaavveeaattss

            Shutdown routines  shouldn't  call  ETMPanic(),  since  ETM-
            Panic()  causes  shutdown  routines  to  be  executed.   ETM
            detects loops of this sort, but their occurrence indicates a
            flaw  in  program  logic.  Similarly, if you install a print
            routine  to  redirect  ETM's  output  somewhere  other  than
            stderr,  the  routine  shouldn't  call ETM to print any mes-
            sages.
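
            This document does not say how ETM detects such loops.  One
            plausible approach is a re-entrancy flag, sketched below
            with hypothetical names (guarded_panic, bad_shutdown); it
            illustrates the idea only and is not taken from ETM's
            source.

```c
#include <assert.h>
#include <stddef.h>

static int shutting_down = 0;   /* nonzero while the shutdown pass runs */
static int loops_detected = 0;  /* nested panic attempts seen so far */

/* Hypothetical panic entry point: runs proc as the only shutdown routine.
   A nested call made from inside proc is detected and refused. */
void guarded_panic(void (*proc)(void))
{
    assert(proc != NULL);
    if (shutting_down) {
        loops_detected++;       /* a shutdown routine called us again */
        return;
    }
    shutting_down = 1;
    proc();
    shutting_down = 0;
}

/* A flawed shutdown routine that tries to panic while shutting down. */
void bad_shutdown(void)
{
    guarded_panic(bad_shutdown);
}

/* Number of loops detected so far. */
int loop_count(void)
{
    return loops_detected;
}
```

            After guarded_panic(bad_shutdown), loop_count() is 1: the
            nested call was noticed instead of recursing without end.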

            kill -9 is uncatchable and there's nothing you can do  about
            it.

            44..  PPrrooggrraammmmiinngg IInntteerrffaaccee

            The  ETM library should be installed in /usr/lib/libetm.a or
            local equivalent, and applications should link  in  the  ETM
            library with the -letm flag.  Source files that use ETM rou-
            tines should include etm.h.  If you use ETM functions  in  a
            source  file without including etm.h, you will get undefined
            symbol errors at link time.

            The abstract types ETMProcRetType and ETMProcPtr may be used
            for  declaring  and  passing  pointers to functions that are
            passed to ETM routines.  By default these will be void and
            void(*)(), but on deficient systems whose C compilers lack
            the void type they will be int and int(*)(), the usual C
            defaults for functions.

            These  types  make it easier to declare properly typed func-
            tions and NULL pointers.  For instance, if  you  don't  pass
            any shutdown routine to ETMInit(), use


                 ETMInit ((ETMProcPtr) NULL);


            If you do, use


                 ETMProcRetType ShutdownProc () { . . . }
                 . . .
                 main ()
                 {
                      . . .
                      ETMInit (ShutdownProc);
                      . . .
                 }


            Descriptions of the ETM routines follow.

            ETMProcRetType ETMInit (p)
            ETMProcPtr     p;

                 Registers  the  application's  cleanup routine p (which
                 should be NULL if there is none) and registers  default
                 handlers  for  the following signals (all of which nor-
                 mally cause program  exit):  SIGHUP,  SIGINT,  SIGQUIT,
                 SIGILL,  SIGSYS, SIGTERM, SIGBUS, SIGSEGV, SIGFPE, SIG-
                 PIPE.  If p is not NULL, it should point to  a  routine
                 that takes no arguments and returns no value.

            ETMProcRetType ETMEnd ()

                 Causes all registered shutdown routines to be executed.
                 The application may then exit normally with exit(0).

            ETMProcRetType ETMPanic (fmt, ...)
            char *fmt;

                 ETMPanic() is called when a panic condition occurs, and
                 the  program  cannot  continue.   The  arguments are as
                 those for printf() and are  used  to  print  a  message
                 after  shutting  down  all subsystems and executing the
                 application's  cleanup  routine,  and  before   calling
                 exit().   ETMPanic()  adds  a newline to the end of the
                 message.

                 ETMPanic() may be called at any time,  including  prior
                 to  calling ETMInit(), but only those shutdown routines
                 which have been registered are invoked.

                 A  common  problem  with  applications  that  encounter
                 exceptional  conditions  such as segmentation faults is
                 that you often don't see all the output  your  applica-
                 tion  has  produced.   This  is because stdout is often
                 buffered.  To alleviate this problem, stdout is flushed
                 before  any  message  is  printed,  so that any pending
                 application output is flushed and  appears  before  the
                 error message.

                 By  default,  ETMPanic()  prints the message on stderr.
                 This behavior may be modified with ETMSetPrintProc().

                 The default exit() value is 1.  This  may  be  modified
                 with ETMSetExitStatus().
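
                 The message handling described above (printf-style
                 arguments, a newline added at the end, stdout flushed
                 before the message is written to stderr) can be
                 sketched as follows.  This is not ETMPanic() itself:
                 format_line() and report() are hypothetical names, the
                 512-byte buffer is an assumption, and the sketch
                 neither runs shutdown routines nor exits.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Format into buf and guarantee the result ends in a newline, truncating
   the text if necessary.  Returns the length of the resulting string. */
static size_t vformat_line(char *buf, size_t size, const char *fmt, va_list ap)
{
    int n;
    size_t len;

    assert(size >= 2);                      /* room for '\n' and '\0' */
    n = vsnprintf(buf, size - 1, fmt, ap);  /* reserve one byte for '\n' */
    if (n < 0)
        n = 0;                              /* formatting error: empty line */
    len = (size_t)n < size - 1 ? (size_t)n : size - 2;
    buf[len] = '\n';
    buf[len + 1] = '\0';
    return len + 1;
}

/* printf-style front end to vformat_line(). */
size_t format_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t len;

    va_start(ap, fmt);
    len = vformat_line(buf, size, fmt, ap);
    va_end(ap);
    return len;
}

/* Flush stdout, then write the newline-terminated message to stderr. */
void report(const char *fmt, ...)
{
    char line[512];
    va_list ap;

    va_start(ap, fmt);
    vformat_line(line, sizeof line, fmt, ap);
    va_end(ap);
    fflush(stdout);     /* pending application output appears first */
    fputs(line, stderr);
}
```

                 Formatting the whole message into one buffer first
                 also matches the interface of replaceable print
                 routines, which receive a single, fully formatted
                 string.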

            ETMProcRetType ETMMsg (fmt, ...)
            char *fmt;

                 ETMMsg()  is like ETMPanic() except that it just prints
                 the message and returns.  It is useful in that if panic
                 message output has been redirected somewhere other than
                 stderr (e.g., to the system log), ETMMsg()  will  write
                 its  output  there, too.  The application does not need
                 to know whether such redirection has taken place.

                 ETMMsg() may be called at any time, including prior  to
                 calling ETMInit().

            ETMProcRetType ETMAddShutdownProc (p)
            ETMProcPtr     p;

                 Register a shutdown routine with ETM.  This is normally
                 called within a subsystem's initialization routine.   p
                 should  point  to a routine that takes no arguments and
                 returns no value.

            ETMProcRetType ETMRemoveShutdownProc (p)
            ETMProcPtr     p;

                 Deregister  a  previously-registered  shutdown  routine
                 with  ETM.   This is useful for routines that only need
                 to be registered temporarily, e.g., during execution of
                 some  piece  of code that temporarily creates some file
                 that needs to be removed if the  program  crashes,  but
                 which removes it itself if execution proceeds normally.
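
                 The temporary-registration pattern can be sketched
                 with a small stand-in registry.  The names add_proc(),
                 remove_proc() and work_with_scratch() are hypothetical
                 and stand in for ETMAddShutdownProc(),
                 ETMRemoveShutdownProc() and application code; the
                 scratch file is simulated with a flag so that the
                 sketch has no side effects.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Proc)(void);

#define MAX_PROCS 16

static Proc procs[MAX_PROCS];
static size_t nprocs = 0;

/* Register p.  Returns 0 on success, -1 if the table is full. */
int add_proc(Proc p)
{
    assert(p != NULL);
    if (nprocs == MAX_PROCS)
        return -1;
    procs[nprocs++] = p;
    return 0;
}

/* Remove the most recent registration of p.  Returns 0 if one was found. */
int remove_proc(Proc p)
{
    size_t i = nprocs;

    while (i-- > 0) {
        if (procs[i] == p) {
            for (; i + 1 < nprocs; i++)
                procs[i] = procs[i + 1];    /* close the gap */
            nprocs--;
            return 0;
        }
    }
    return -1;
}

static int scratch_exists = 0;  /* stands in for a real scratch file */

/* Cleanup used both by the normal path and by a panic. */
void remove_scratch(void)
{
    scratch_exists = 0;
}

/* Returns the number of routines still registered afterwards. */
size_t work_with_scratch(void)
{
    add_proc(remove_scratch);   /* cover the window in which the file exists */
    scratch_exists = 1;         /* "create" the file */
    /* ... work that might panic goes here ... */
    remove_scratch();           /* normal path: clean up ourselves */
    remove_proc(remove_scratch);
    return nprocs;
}
```

                 The routine is registered only while the file exists,
                 so a panic during the work removes the file and a
                 normal return leaves nothing behind in the registry.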

            ETMProcRetType ETMSetSignalProc (signo, p)
            int  signo;
            ETMProcPtr     p;

                 Register a signal-catching routine  to  override  ETM's
                 default.  The routine will be called with one argument,
                 the signal number.  It should return no value, regardless
                 of the usual return type of signal handler routines on
                 your system.  (When ETM is configured on your
                 system,  it  knows the proper return value for signal()
                 but hides differences among systems from your  applica-
                 tion so you don't have to think about it.)

                 To  return a signal to its default action or to cause a
                 signal to be ignored, pass the following values  for  p
                 (these are defined in etm.h):


                      ETMSigIgnore        signal is ignored
                      ETMSigDefault       signal default action is restored
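
                 The shape of a signal-catching routine (one argument,
                 the signal number, and no return value) is sketched
                 below using the standard C signal() and raise() calls
                 rather than ETM itself.  note_signal() and
                 catch_one_sigint() are hypothetical names.  With ETM,
                 installation would presumably be written
                 ETMSetSignalProc (SIGINT, note_signal), following the
                 description above; that call is not exercised here.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t last_signal = 0;

/* Signal-catching routine: takes the signal number, returns no value. */
void note_signal(int signo)
{
    last_signal = signo;
}

/* Install note_signal for SIGINT, deliver the signal to this process,
   then restore the previous disposition.  Returns the signal number
   that was caught, or -1 if the handler could not be installed. */
int catch_one_sigint(void)
{
    void (*old)(int) = signal(SIGINT, note_signal);

    if (old == SIG_ERR)
        return -1;
    last_signal = 0;
    raise(SIGINT);          /* handler runs before raise() returns */
    signal(SIGINT, old);
    return (int) last_signal;
}
```

                 The handler only records the signal number, which
                 keeps it within what is safe to do inside a signal
                 handler.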


            ETMProcPtr ETMGetSignalProc (signo)
            int  signo;

                 Returns the function currently used to catch signal
                 signo, or NULL  if  the  signal  is  handled  with  the
                 default  action  or being ignored (it's not possible to
                 distinguish between the last two cases).

            ETMProcRetType ETMSetPrintProc (p)
            ETMProcPtr     p;

                 This routine is used to register a procedure  that  ETM
                 can use to print messages.  The default is to send mes-
                 sages to stderr, which is  appropriate  for  most  pro-
                 grams.   Applications may prefer to send messages else-
                 where.  For  instance,  non-interactive  programs  like
                 network  servers  might  send them to syslog() instead.
                 Or a program may wish to send messages to multiple des-
                 tinations.

                 To  override the default, pass the address of an alter-
                 nate print routine to ETMSetPrintProc().   The  routine
                 should  take  one  argument,  a  pointer to a character
                 string, and return no value.  The argument will be  the
                 fully  formatted panic message, complete with a newline
                 on the end.  To restore the default, pass NULL.

                 The  printing  routine  shouldn't  call  ETMPanic()  or
                 ETMMsg() or a loop will be detected and ETM will conve-
                 niently panic as a service to let you know you  have  a
                 logic error in your program.
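
                 A print routine that sends messages to the system log
                 might look like the sketch below.  It assumes a POSIX
                 system providing syslog(); strip_newline() and
                 syslog_print_proc() are hypothetical names, and the
                 LOG_ERR priority and the const qualifier on the
                 argument are choices made for this illustration.
                 Registration would presumably be
                 ETMSetPrintProc (syslog_print_proc), following the
                 description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <syslog.h>

/* Copy msg into dst without its trailing newline, truncating if needed.
   Returns the length of the copy. */
size_t strip_newline(char *dst, size_t size, const char *msg)
{
    size_t len = strlen(msg);

    assert(size > 0);
    if (len > 0 && msg[len - 1] == '\n')
        len--;
    if (len >= size)
        len = size - 1;
    memcpy(dst, msg, len);
    dst[len] = '\0';
    return len;
}

/* Print routine: one string argument, no return value.  It must not call
   back into the message routines, or a loop results. */
void syslog_print_proc(const char *msg)
{
    char line[1024];

    strip_newline(line, sizeof line, msg);
    syslog(LOG_ERR, "%s", line);    /* "%s" keeps msg from being used as a format */
}
```

                 The message arrives fully formatted with a newline on
                 the end, so the routine strips the newline before
                 handing the text to syslog().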

            ETMProcPtr ETMGetPrintProc ()

                 Returns a pointer to the current printing function, or
                 NULL if the default is being used.

            ETMProcRetType ETMSetExitStatus (status)
            int  status;

                 This routine is used to register the status value  that
                 is  passed  to exit() when a panic occurs.  The default
                 is 1.  For some applications it is desirable to  return
                 a  different  value.   For instance, a mail server that
                 processes messages may send back a message to the  per-
                 son  who  sent  mail  when a request is erroneous, then
                 panic (perhaps by writing a message to the system log).
                 On  some  systems,  if a program invoked to handle mail
                 returns non-zero, the mailer will send another  message
                 to  that  person  stating  that  there  was  a  problem
                 handling the mail.  This extra message is  unnecessary,
                 and  can be suppressed by registering an exit status of
                 0.

                 If ETMSetAbort() has been called to force an abort() on
                 a panic, the exit status is not returned.

            int ETMGetExitStatus ()

                 Returns  the current exit status which will be returned
                 if a panic occurs.

            ETMProcRetType ETMSetAbort (val)
            int  val;

                 Calling this function with  a  non-zero  value  of  val
                 causes  ETM  to  try to generate a core image when ETM-
                 Panic() is called (after the panic message is printed).
                 This  can sometimes be useful for debugging.  If val is
                 zero, image generation is suppressed.  The  default  is
                 no image.

                 ETMSetAbort() is meaningless on systems with no concept
                 of a core image.  Also, if you install a signal catcher
                 for SIGABRT, you may end up in a panic loop.

            int ETMGetAbort ()

                 Returns the current core image generation value.