JS - JobScheduler / JS-2018

Controller Cluster should support repeated fail-over in case of network isolation

Details

    Description

      Current Situation

      • If a Controller Cluster is operated in a network environment that frequently isolates individual Controller instances from the network for short periods, then the cluster will stop working after repeated fail-over at short intervals.
        • Network isolation means that a Controller instance can neither connect to other components nor be reached by them, while the remaining components (Secondary Controller, Cluster Watch Agent) continue to communicate with each other.
        • Repeated fail-over means that fail-over occurs at intervals shorter than the time the Controller Cluster requires to synchronize after a previous fail-over.
      • As a result, both the Primary and the Secondary Controller instance can become active at the same point in time.
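
      The definition of repeated fail-over above can be written as a simple check. This is an illustrative sketch only: the function name `failovers_before_sync`, the timestamps and the synchronization duration are hypothetical and are not taken from the Controller's code or configuration.

```python
from typing import List


def failovers_before_sync(failover_times: List[float], sync_duration: float) -> List[bool]:
    """For each fail-over after the first, report whether it occurred before
    the cluster could have finished synchronizing from the previous one.

    failover_times: ascending timestamps (seconds) of observed fail-overs.
    sync_duration:  assumed time (seconds) the cluster needs to synchronize.
    """
    return [
        (later - earlier) < sync_duration
        for earlier, later in zip(failover_times, failover_times[1:])
    ]


# Hypothetical numbers: the second fail-over follows the first by 5 s,
# which is shorter than the assumed 10 s synchronization time.
flags = failovers_before_sync([0.0, 5.0, 60.0], sync_duration=10.0)
```

      With these assumed values, only the second fail-over counts as "repeated" in the sense used in this issue.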

      Desired Behavior

      • The Controller Cluster should cope with situations in which individual Controller instances are repeatedly isolated from the network for short periods and in which fail-over repeatedly occurs at short intervals.
      • No two Controller instances in a cluster should be active at the same point in time.
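
      The requirement that no two instances are active at the same time can be illustrated with a toy model. The sketch below is not the Controller's actual fail-over mechanism: it assumes a hypothetical lease granted by an arbiter (the classes `WatchAgent` and `Controller` and the function `simulate` are invented for illustration), and it assumes that all participants share one logical clock.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class WatchAgent:
    """Hypothetical arbiter that grants the active role to one node at a time."""
    lease_length: int
    holder: Optional[str] = None
    expires_at: int = 0

    def request_active(self, node: str, now: int) -> bool:
        # Grant or renew only if the lease is free, already held by this
        # node, or has expired.
        if self.holder in (None, node) or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.lease_length
            return True
        return False


@dataclass
class Controller:
    name: str
    lease_until: int = 0  # local copy of the lease expiry

    def tick(self, agent: WatchAgent, now: int, isolated: bool) -> None:
        # An isolated node cannot reach the agent, so it cannot renew.
        if not isolated and agent.request_active(self.name, now):
            self.lease_until = agent.expires_at

    def is_active(self, now: int) -> bool:
        # A node considers itself active only while its lease is valid.
        return now < self.lease_until


def simulate(ticks: int, lease_length: int,
             isolated_at: Dict[str, Set[int]]) -> List[Set[str]]:
    """Return, for every tick, the set of node names that are active."""
    agent = WatchAgent(lease_length)
    nodes = [Controller("Primary"), Controller("Secondary")]
    history = []
    for now in range(ticks):
        for node in nodes:
            node.tick(agent, now, now in isolated_at.get(node.name, set()))
        history.append({n.name for n in nodes if n.is_active(now)})
    return history


# Hypothetical schedule: the nodes are isolated alternately for short periods.
history = simulate(
    ticks=20,
    lease_length=3,
    isolated_at={"Primary": {2, 3, 4, 5, 14, 15, 16, 17},
                 "Secondary": {8, 9, 10, 11}},
)
assert all(len(active) <= 1 for active in history)
```

      In this model an isolated instance gives up the active role on its own once its lease runs out, and the arbiter hands the role to the other instance only after that point, so repeated fail-over never yields two active instances. A real cluster would additionally have to account for clock drift and message delays, which this sketch ignores.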

          People

            Joacim Zschimmer
            Andreas Püschel