Uploaded image for project: 'JOC - JobScheduler Operations Center'
  1. JOC - JobScheduler Operations Center
  2. JOC-1680

History Service stops if Cleanup Service removes data of the Controller and Agent History

    XMLWordPrintable

Details

    Description

      Current Situation

      • The Cleanup Service deletes data from the HISTORY_CONTROLLERS and HISTORY_AGENTS tables that are older than the retention period specified for the Cleanup Service.
      • When data from the tables are removed and Agents have not been restarted within the retention period of the Cleanup Service then an error occurs with the History Service that will make the service stop:
        o.h.e.j.s.SqlExceptionHelper       - ERROR: null value in column "TIMEZONE" violates not-null constraint
        
      • The problem is due to the fact information about Controllers/Agents is stored to the related tables in case of restart only. If the data have been removed by the Cleanup Service without restart of Controller/Agents then the time zone information is missing.

      Desired Behavior

      • The History Service should fall back to use of the Etc/UTC time zone in case that no information from Controller/Agent restarts is available.
      • The time zone is relevant for users only who did not specify to translate log timestamps to the time zone specified with their profile.

      Workaround

      • As an immediate workaround the following SQL statement should be executed with the database:
        • MySQL:
          insert into HISTORY_AGENTS(READY_EVENT_ID,CONTROLLER_ID,AGENT_ID,URI,TIMEZONE,READY_TIME,CREATED)
          select UNIX_TIMESTAMP(MODIFIED)*1000*1000, CONTROLLER_ID,AGENT_ID,URI,'Etc/UTC',MODIFIED,MODIFIED from INV_AGENT_INSTANCES
          
        • Oracle
          insert into HISTORY_AGENTS ("READY_EVENT_ID","CONTROLLER_ID","AGENT_ID","URI","TIMEZONE","READY_TIME","CREATED")
          select ("MODIFIED" - DATE'1970-01-01')*86400*1000*1000, "CONTROLLER_ID","AGENT_ID","URI",'Etc/UTC',"MODIFIED","MODIFIED" from INV_AGENT_INSTANCES
          
        • PostgreSQL
          insert into HISTORY_AGENTS ("READY_EVENT_ID","CONTROLLER_ID","AGENT_ID","URI","TIMEZONE","READY_TIME","CREATED")
          select EXTRACT(EPOCH FROM "MODIFIED")*1000*1000, "CONTROLLER_ID","AGENT_ID","URI",'Etc/UTC',"MODIFIED","MODIFIED" from INV_AGENT_INSTANCES
          
        • SQL Server
          insert into HISTORY_AGENTS ("READY_EVENT_ID","CONTROLLER_ID","AGENT_ID","URI","TIMEZONE","READY_TIME","CREATED")
          select DATEDIFF(s, '1970-01-01', "MODIFIED")*1000*1000, "CONTROLLER_ID","AGENT_ID","URI",'Etc/UTC',"MODIFIED","MODIFIED" from INV_AGENT_INSTANCES;
          
      • After applying the workaround the History Service has to be restarted, see screenshot restart_history_service.png.
      • The immediate workaround will make the History Service work, but it does not prevent the problem to occur at a later point in time when the Cleanup Service will delete related data.

       Patch

      Attachments

        Issue Links

          Activity

            People

              re Robert Ehrlich
              ap Andreas Püschel
              Kanika Agrawal Kanika Agrawal
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: