Details
- Feature
- Status: Released
- Minor
- Resolution: Fixed
- 1.10
Description
Feature
- The Universal Agent assumes a short-term loss of its connection to the Master if two heartbeats are missing (see Heartbeat Implementation).
- If an HTTP POST request from the Master is not received and acknowledged by the Agent, the Master will re-send the request up to five times.
- The Agent will handle possible duplicate requests from the Master and will acknowledge within 5s.
- If the Master's attempts to establish the connection and to re-send the requests, up to a maximum of five times,
  - are successful, then this is considered a recoverable connection loss.
  - are unsuccessful, then this is identified as an unrecoverable error.
- In case of a recoverable connection loss
  - the tasks are continued and completed with the Agent.
  - the Agent stores log output of tasks in local files (see JS-1521).
  - the Agent reports the log information of running and completed tasks back to the Master.
  - the Agent reports the execution history of running and completed tasks back to the Master.
  - the Master adds the information received from the re-connected Agent to its history.
- The Master will report running tasks of an Agent after re-connect.
- In case of an unrecoverable connection error, the Agent will kill the task (JS-1523).
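
The retry and duplicate-handling rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Master or Agent code: the names `send_with_retries`, `Agent.handle`, `lossy_transport` and the `request_id` field are hypothetical, the transport is abstracted to a plain callable so the sketch runs without a network, and "re-send up to five times" is read as one initial send plus at most five re-sends.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_RESENDS = 5  # from the text: the Master re-sends a request up to five times


class UnrecoverableConnectionLoss(Exception):
    """Hypothetical error: no acknowledgement after all re-sends."""


def send_with_retries(send: Callable[[dict], dict], request: dict,
                      max_resends: int = MAX_RESENDS) -> dict:
    """Send once, then re-send up to `max_resends` times until acknowledged."""
    last_error = None
    for _ in range(1 + max_resends):
        try:
            return send(request)  # acknowledged: at worst a recoverable loss
        except ConnectionError as exc:
            last_error = exc
    raise UnrecoverableConnectionLoss(
        f"no acknowledgement after {max_resends} re-sends") from last_error


@dataclass
class Agent:
    """Agent side: starts a task once per request ID and tolerates duplicates."""
    acks: Dict[str, dict] = field(default_factory=dict)
    started: List[str] = field(default_factory=list)

    def handle(self, request: dict) -> dict:
        request_id = request["request_id"]  # assumed unique per Master request
        if request_id not in self.acks:
            self.started.append(request["task"])
            self.acks[request_id] = {"request_id": request_id,
                                     "status": "acknowledged"}
        # A duplicate request gets the stored acknowledgement, not a second start.
        return self.acks[request_id]


def lossy_transport(agent: Agent, lost_acks: int) -> Callable[[dict], dict]:
    """Test transport: delivers every request but drops the first `lost_acks` acks."""
    remaining = lost_acks

    def send(request: dict) -> dict:
        nonlocal remaining
        ack = agent.handle(request)  # the Agent does receive the request
        if remaining > 0:
            remaining -= 1
            raise ConnectionError("acknowledgement lost")
        return ack

    return send
```

With two lost acknowledgements the third send succeeds and the task is started only once; with six lost acknowledgements all sends fail and the sketch raises `UnrecoverableConnectionLoss`, which corresponds to the unrecoverable case.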
Heartbeat Implementation
- The Master and Agent send heartbeats to each other.
- The Agent receives HTTP POST requests from the Master and will respond within 5s, independently of the completion of the command requested by the Master.
- The Master will keep sending HTTP POST requests and accepting acknowledgements until the Agent sends the final response, i.e. after completion of a task.
- If the Agent does not receive a heartbeat from the Master within twice the heartbeat period (10s), then the Agent will assume the connection to be lost and will kill the task.
- If the Master does not receive a heartbeat from the Agent, then the Master will consider the task lost and will assign it an error state.
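
The Agent-side timeout rule can be sketched as a small watchdog. This is an illustration under assumptions rather than the real implementation: the class `HeartbeatWatchdog` and its method names are hypothetical, the 5s heartbeat period is inferred from the 10s "double period" in the text, and the clock is injected so the logic can be exercised without waiting.

```python
import time
from typing import Callable

HEARTBEAT_PERIOD_S = 5.0                      # assumed: half of the 10s double period
HEARTBEAT_TIMEOUT_S = 2 * HEARTBEAT_PERIOD_S  # 10s, as stated in the text


class HeartbeatWatchdog:
    """Tracks the last heartbeat and fires a callback once when it is overdue."""

    def __init__(self, on_lost: Callable[[], None],
                 timeout_s: float = HEARTBEAT_TIMEOUT_S,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._on_lost = on_lost      # e.g. kill the running task
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen = clock()
        self.lost = False

    def heartbeat_received(self) -> None:
        """Record that a heartbeat (a Master request) has just arrived."""
        self._last_seen = self._clock()

    def check(self) -> bool:
        """Return True once the connection counts as lost; fire on_lost only once."""
        overdue = self._clock() - self._last_seen >= self._timeout_s
        if overdue and not self.lost:
            self.lost = True
            self._on_lost()
        return self.lost
```

In this sketch an Agent would call `heartbeat_received()` whenever a request from the Master arrives and `check()` on a timer. The Master side could apply the same pattern with a callback that assigns the task an error state instead of killing it.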
Delimitation
- This feature covers the situation of a recoverable network connection loss, not of an ongoing network outage.
- This feature does not cover the situation of an unrecoverable connection loss that is due to failure or restart of a Master (server).