JITL - JobScheduler Integrated Template Library / JITL-534

JobScheduler Monitoring Interface - Send notifications for repeatedly failed job chain steps in JobScheduler releases starting from 1.12

Details

    Description

      Current Situation

      • When an error recurs in a job node for which a notification has already been sent, the order state is considered already notified and no new notification is sent.

      Desired Behaviour

      • In some cases, e.g. for long-running orders, it is desirable that a recurring error results in repeated notifications.
      • The following new functionality should be provided:
      1. Send notifications for all errors that occur; do not suppress notifications for repeatedly failed executions.
        • Introduce a new configuration item for the JobChain element.
          ...
          <JobChain name=...>
              <NotifyRepeatedError />
          </JobChain>
          ...

      2. Send notifications for errors that occur due to repeatedly failed executions if the restart was caused by manual intervention.
        • This functionality uses the audit log, introduced with release 1.11, to identify manually caused restarts of tasks.
        • Introduce a new configuration item for the JobChain element.
          ...
          <JobChain name=...>
              <NotifyRepeatedError>
                   <NotifyByIntervention />
              </NotifyRepeatedError>
          </JobChain>
          ...

      3. Send notifications for errors that occur due to repeatedly failed executions if a configurable period of time is exceeded. The period is measured from the time of the last failed execution for which a notification was sent to the time of the current failed execution.
        • Introduce a new configuration item for the JobChain element.
          ...
          <JobChain name=...>
              <NotifyRepeatedError>
                   <NotifyByPeriod period="2h 30m" />
              </NotifyRepeatedError>
          </JobChain>
          ...
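
The three options above can be read as one decision rule. The following is a minimal sketch of that rule, not the JobScheduler implementation: `RepeatedErrorConfig`, `parse_period` and `should_notify` are hypothetical names, the unit letters other than `h` and `m` are assumed, and treating NotifyByIntervention and NotifyByPeriod as alternatives (either one triggers a notification) is an assumption that this issue does not confirm.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed unit letters; the issue itself only shows "2h 30m".
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_period(text: str) -> timedelta:
    """Parse a period such as "2h 30m" into a timedelta (hypothetical helper)."""
    parts = re.findall(r"(\d+)\s*([dhms])", text)
    if not parts:
        raise ValueError(f"unrecognized period: {text!r}")
    total = timedelta()
    for value, unit in parts:
        total += timedelta(**{_UNITS[unit]: int(value)})
    return total


@dataclass
class RepeatedErrorConfig:
    """Hypothetical in-memory form of the JobChain settings."""
    notify_repeated_error: bool = False  # <NotifyRepeatedError> is present
    by_intervention: bool = False        # <NotifyByIntervention /> child is present
    period: Optional[timedelta] = None   # <NotifyByPeriod period="..." /> child is present


def should_notify(
    cfg: RepeatedErrorConfig,
    already_notified: bool,
    manual_restart: bool,
    last_notified_error: Optional[datetime],
    current_error: datetime,
) -> bool:
    """Decide whether a failed execution in a job node triggers a notification."""
    if not already_notified:
        return True   # first error in this job node is always reported
    if not cfg.notify_repeated_error:
        return False  # current situation: repeated errors are suppressed
    if not cfg.by_intervention and cfg.period is None:
        return True   # option 1: no child elements, report every error
    if cfg.by_intervention and manual_restart:
        return True   # option 2: restart was caused by manual intervention
    if cfg.period is not None and current_error - last_notified_error > cfg.period:
        return True   # option 3: configured period is exceeded
    return False
```

With `period="2h 30m"`, a repeated error three hours after the last notified error would be reported under this sketch, while one that occurs an hour later would be suppressed.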
          

      Test Instruction

      • Test Configuration
        • use TEST-JITL-534.zip file
        • use the config/notification/SystemMonitorNotification_v1.0.xsd schema file provided with the JobScheduler installation
        • config/notification/SystemMonitorNotification_MonitorSystem.xml
          • adjust the NotificationCommand
        • config/live/JITL-534-notification-repeated-errors/setback|suspend
          • job_100.job.xml and job_200.job.xml
            • create the files, e.g. D:/my_file_100.txt, and rename them, e.g. to D:/my_file_100.txt~
              • these jobs throw an error if the file is not found
      • Test Execution
        • Testing of different scenarios:
          • config/notification/SystemMonitorNotification_MonitorSystem.xml JobChain element:
            • without NotifyRepeatedError
            • NotifyRepeatedError
              • without child nodes
            • NotifyRepeatedError -> NotifyByIntervention
            • NotifyRepeatedError -> NotifyByPeriod
            • NotifyRepeatedError -> NotifyByIntervention and NotifyByPeriod
          • test the setback case
          • test the suspend case
        • Run the order and simulate an error or success case (by renaming the files -> see D:/my_file_100.txt)
        • Use the JOC operations Resume and Reset to resume an order
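
The scenarios listed above form a small matrix of configuration variants and test cases. As a hedged illustration, the sketch below enumerates that matrix and toggles a job between the error and success case by renaming its trigger file. The function names are hypothetical, and the assumption that each job checks for exactly one file is taken from the example path in this issue.

```python
from itertools import product
from pathlib import Path

# Configuration variants and cases as listed under Test Execution.
CONFIG_VARIANTS = [
    "without NotifyRepeatedError",
    "NotifyRepeatedError without child nodes",
    "NotifyRepeatedError -> NotifyByIntervention",
    "NotifyRepeatedError -> NotifyByPeriod",
    "NotifyRepeatedError -> NotifyByIntervention and NotifyByPeriod",
]
CASES = ["setback", "suspend"]


def scenarios():
    """Return every (configuration variant, case) pair to be tested."""
    return list(product(CONFIG_VARIANTS, CASES))


def set_job_outcome(trigger: Path, succeed: bool) -> None:
    """Rename the trigger file so the job finds it (success) or not (error)."""
    hidden = trigger.with_name(trigger.name + "~")
    if succeed and hidden.exists():
        hidden.rename(trigger)
    elif not succeed and trigger.exists():
        trigger.rename(hidden)
```

For example, `set_job_outcome(Path("D:/my_file_100.txt"), succeed=False)` would rename the file to `D:/my_file_100.txt~`, so the next run of the job fails.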

            People

              Robert Ehrlich
              Kanika Agrawal

                Time Tracking

                  Original Estimate - 1 week
                  Remaining Estimate - 1 week
                  Time Spent - Not Specified