Details
- Feature
- Status: Deferred
- Minor
- Resolution: Unresolved
Description
Current Situation
- If a JobScheduler instance or the underlying server crashes, an order is re-executed in the same job that was processing it at the time of the crash.
- This behavior is fine for some use cases. Other use cases require finer-grained control in such a situation, e.g. not re-executing an order.
Desired Behavior
- Capability to configure what should happen to an order that was being processed by a job during a server crash:
  - repeat (default): the order is executed again in the same job. This is the current behavior in JobScheduler. However, it might not be acceptable for some jobs to run twice with the same order.
  - suspend: after recovery, the order is suspended at the job where it crashed.
  - skip: after recovery, the order is moved to the next node.
  - error_state: after recovery, the order is moved to the error_state.
- Similar capabilities for standalone jobs, with the possible values:
  - none (default)
  - repeat: start the job again immediately.
  - stop: set the job to the stopped state.
- On every recovery, mark the task in the JobScheduler history as failed.
- Allow the default behavior for job chains and jobs to be configured through global parameters.
Proposed Configuration
- Job Chains
  - Attribute <job_chain_node on_recovery="..."/> with the possible values repeat, suspend, skip, error_state.
- Standalone Jobs
  - Attribute <job on_recovery="..."/> with the possible values none, repeat, stop.
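The proposed attributes could look roughly as follows in a job chain configuration. This is only a sketch of the proposal: on_recovery is not an existing attribute, and the job names and state names (import_files, book_payments, send_notification, and so on) are hypothetical and chosen for illustration.

```xml
<!-- Sketch only: on_recovery is the attribute proposed in this issue. -->
<!-- All job and state names below are hypothetical. -->
<job_chain>
    <!-- Safe to run twice: keep the default and repeat the order in the same job. -->
    <job_chain_node state="import" job="import_files" next_state="book" error_state="error" on_recovery="repeat"/>
    <!-- Must not run twice with the same order: suspend the order at this node after recovery. -->
    <job_chain_node state="book" job="book_payments" next_state="notify" error_state="error" on_recovery="suspend"/>
    <!-- Optional step: move the order to the next node after recovery. -->
    <job_chain_node state="notify" job="send_notification" next_state="success" error_state="error" on_recovery="skip"/>
    <job_chain_node state="success"/>
    <job_chain_node state="error"/>
</job_chain>
```

For a standalone job, which would be configured in its own file, the proposed attribute might be set like this, again assuming a hypothetical shell job:

```xml
<!-- Sketch only: after recovery the job is set to the stopped state instead of being restarted. -->
<job on_recovery="stop">
    <script language="shell"><![CDATA[
echo "standalone job running"
    ]]></script>
</job>
```

A node with on_recovery="error_state" would be written the same way as the nodes above and would send the order to the state named in its error_state attribute.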
Maintainer Notes
- This feature will probably not become available for standalone jobs, since in future releases such jobs might be moved to job chains. A standalone job would then be mapped to a job chain with a single job node.
- Feel free to vote for this issue and to let us know your feedback.