  1. JS - JobScheduler
  2. JS-1820

Agent kill_task.sh script should be customizable to terminate tasks before killing tasks

Details

    Description

      Current Situation

      • The Agent's script kill_task.sh for Unix environments kills all running tasks if an Agent loses the connection to the Master or is instructed by the Master to terminate.
      • The script kill_task.sh kills all running tasks, see JS-1815.

      Desired Behavior

      • The setup provides a script jobscheduler_agent_kill_task_within3s.sh in the ./bin directory that
        • first tries to terminate a running task by sending a SIGTERM signal (kill -15).
          • Users can add a "trap" command to their job's shell script to catch the SIGTERM signal and perform cleanup tasks, e.g. disconnect from a database, before the script terminates.
          • If a job's shell script implements no "trap", then the script is terminated immediately.
        • after a timeout of 3s sends a SIGKILL signal (kill -9) to the running task and to all of its child processes.
      • This behavior applies when an Agent loses the connection to a Master. It does not apply to operations performed e.g. by the JOC Cockpit to end/terminate/kill a running task.
      • If this script is to be used, then it must be configured in the instance script like this:
        # Set the location of a script which is called by the
        # JobScheduler Agent to kill a process and its children.
        #
        SCHEDULER_KILL_SCRIPT=bin/jobscheduler_agent_kill_task_within3s.sh
        
      • The environment variable KILL_WITHIN at the top of the script can be used to change the timeout in seconds between sending the SIGTERM signal and sending the SIGKILL signal.
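
The terminate-then-kill sequence and the "trap" handling described above can be sketched as follows. This is not the script shipped with the Agent: the function names terminate_then_kill and job_with_trap, the marker file and the one-second polling loop are assumptions made for illustration. Only KILL_WITHIN and its default of 3 seconds come from this issue. The sketch signals a single process and does not cover child processes.

```shell
#!/bin/sh
# Sketch only, not the shipped jobscheduler_agent_kill_task_within3s.sh.
# KILL_WITHIN is the variable named in this issue; all other names are hypothetical.
KILL_WITHIN="${KILL_WITHIN:-3}"

# Send SIGTERM to a process, then SIGKILL if it is still there after KILL_WITHIN seconds.
terminate_then_kill() {
    target_pid="$1"
    # Ask the process to terminate (SIGTERM, same as kill -15).
    kill -TERM "$target_pid" 2>/dev/null || return 0
    # Poll once per second until the timeout expires.
    waited=0
    while [ "$waited" -lt "$KILL_WITHIN" ]; do
        kill -0 "$target_pid" 2>/dev/null || return 0   # process is gone
        sleep 1
        waited=$((waited + 1))
    done
    # Still present: force termination (SIGKILL, same as kill -9).
    kill -KILL "$target_pid" 2>/dev/null || true
}

# Hypothetical job body that catches SIGTERM and cleans up before exiting.
job_with_trap() {
    marker="$1"                           # file that records that cleanup ran
    cleanup() {
        kill "$child" 2>/dev/null         # stop the work still running in the background
        echo "cleanup done" > "$marker"   # stands in for e.g. a database disconnect
        exit 0
    }
    trap cleanup TERM
    # Run the work in the background and wait for it: the shell handles a
    # trapped signal only after a foreground command returns, whereas the
    # "wait" builtin is interrupted by the signal right away.
    sleep 30 &
    child=$!
    wait "$child"
}
```

For example, starting the job with `job_with_trap /tmp/cleanup.marker &` and then calling `terminate_then_kill "$!"` should leave the marker file behind, provided the job had enough time to install its trap before the signal arrived. A job that ignores SIGTERM would instead receive SIGKILL once KILL_WITHIN seconds have passed.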

      Delimitation

      • When an Agent is instructed by a Master to terminate or to kill a task, the behavior is controlled by the Master:
        • the Agent will terminate a task (via SIGTERM), optionally consider a timeout and kill a task (via SIGKILL).
        • the operations to terminate or to kill a task and to consider a timeout are specified by the Master.

            People

              Oliver Haufe
              Andreas Püschel
              Mahendra Patidar