- The Universal Agent continues execution of a task if the connection to its Master gets lost.
- This behavior differs from that of the Classic Agent, which kills tasks in a similar situation.
- Due to the change from a TCP to an HTTP connection between Master and Agent, an interruption of the connection is not detected immediately.
- The Universal Agent will kill tasks if the connection to its Master gets lost. A connection loss is detected by missing heartbeats that are sent from Master to Agent and vice versa.
- This behavior is intended to prevent simultaneous duplicate execution of tasks. If the connection loss is caused by the failure of a JobScheduler Master and the Master later comes back up, it would request the Agent to start the respective task once more, because it has no knowledge of the previous execution result.
- The Master and Agent send heartbeats to each other.
- The Agent receives HTTP POST requests from the Master and responds within 5 s, regardless of whether the command requested by the Master has completed.
- The Master keeps sending further HTTP POST requests and accepting acknowledgements until the Agent sends the final response, i.e. after completion of the task.
- If the Agent does not receive a heartbeat from the Master within twice the heartbeat period (10 s), the Agent assumes that the connection is lost and kills the task.
- If the Master does not receive a heartbeat from the Agent, the Master considers the task lost and assigns the task an error state.
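
The timeout rule on the Agent side can be illustrated with a small sketch. This is not JobScheduler code: the class `HeartbeatWatchdog`, its methods and the `on_timeout` callback are hypothetical names chosen for illustration, and the only values taken from the description above are the 5 s period and the doubled period of 10 s. The clock is injectable so that the logic can be exercised without waiting.

```python
import time
from typing import Callable

HEARTBEAT_PERIOD_S = 5.0  # heartbeat/response period from the description
TIMEOUT_FACTOR = 2        # "double period", i.e. 10 s


class HeartbeatWatchdog:
    """Hypothetical helper: calls on_timeout once when heartbeats stop arriving."""

    def __init__(
        self,
        on_timeout: Callable[[], None],
        period_s: float = HEARTBEAT_PERIOD_S,
        factor: int = TIMEOUT_FACTOR,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._on_timeout = on_timeout
        self._timeout_s = period_s * factor
        self._clock = clock
        self._last_seen = clock()
        self._expired = False

    def heartbeat(self) -> None:
        """Record that a heartbeat (e.g. an HTTP POST from the Master) arrived."""
        self._last_seen = self._clock()

    def check(self) -> bool:
        """Return True once the connection is considered lost.

        The callback (e.g. killing the task) is invoked only on the first
        detection; later calls keep returning True without calling it again.
        """
        elapsed = self._clock() - self._last_seen
        if not self._expired and elapsed > self._timeout_s:
            self._expired = True
            self._on_timeout()
        return self._expired
```

In such a sketch the Agent would call `heartbeat()` for every request received from the Master and call `check()` periodically; the Master could apply the same pattern with a callback that sets the task to an error state.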