  1. JS - JobScheduler
  2. JS-1743

JobScheduler Master using PostgreSQL should not freeze after a longer pause


Details

    Description

      Current Situation

      • When a JobScheduler Master is continued after a pause, it will
        • continue to serve XML commands, e.g. from a "classic" JOC and
        • run tasks for any orders that had a start time during the paused period.
      • This behavior works for shorter pauses only. If a pause lasts longer, i.e. more than 1 hour, the above behavior does not apply and the Master freezes.
      • The problem is caused by idle connection timeout settings when a PostgreSQL database is used.
      • The output of scheduler.log shows:
        2017-10-24 15:34:00.553+0200 [info] SCHEDULER-902 state=paused
        2017-10-24 17:11:03.382+0200 [info] SCHEDULER-902 state=running
        2017-10-24 17:26:54.821+0200 [info] (Database) SCHEDULER-957 Closing database
        2017-10-24 17:26:54.821+0200 [ERROR] SCHEDULER-303 Problem with database: SCHEDULER-360 Error when accessing database table SCHEDULER_ORDER_HISTORY [Z-JAVA-105 Java exception org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend. - caused by - java.net.SocketException: Connection timed out, method=executeQuery []] [sos::scheduler::order::Job_chain::dom_element]
        2017-10-24 17:26:54.821+0200 [info] (Database) SCHEDULER-907 Opening database: jdbc -id=spooler -class=org.postgresql.Driver jdbc:postgresql://vm-dbsked.csi.it:5432/jobscheduler_cluster -user=master_cluster
        2017-10-24 17:26:54.830+0200 [info] (Database) SCHEDULER-807 Using database product PostgreSQL
        2017-10-24 17:26:55.085+0200 [WARN] SCHEDULER-721 Scheduler is not responding quickly, a microstep took 00:15:26.919s
        

      Desired Behavior

      • The Master should work for longer pauses in the same way as for shorter pauses.

      Maintainer Notes

      • This problem can be observed with releases 1.10 using PostgreSQL 9.2. It is not observed with releases 1.11 (which use a newer JDBC driver) and PostgreSQL 9.4 or later.
      • The problem is caused by connection settings for the database server that terminate idle connections, typically after 30 minutes.
      • It is recommended to add the tcpKeepAlive=true parameter to the JDBC connection URL.
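
The recommendation above can be illustrated with a minimal sketch that appends the parameter to a JDBC URL. The class and method names (`KeepAliveUrl`, `withTcpKeepAlive`) are hypothetical and not part of JobScheduler; the host and database name are taken from the log excerpt. Where the URL is actually configured depends on the installation.

```java
public class KeepAliveUrl {

    // Hypothetical helper: appends tcpKeepAlive=true to a PostgreSQL JDBC URL
    // unless the URL already sets the parameter.
    static String withTcpKeepAlive(String jdbcUrl) {
        if (jdbcUrl.contains("tcpKeepAlive=")) {
            return jdbcUrl; // leave an existing setting untouched
        }
        // Use '?' for the first URL parameter, '&' for any further one.
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "tcpKeepAlive=true";
    }

    public static void main(String[] args) {
        String url = "jdbc:postgresql://vm-dbsked.csi.it:5432/jobscheduler_cluster";
        System.out.println(withTcpKeepAlive(url));
        // prints jdbc:postgresql://vm-dbsked.csi.it:5432/jobscheduler_cluster?tcpKeepAlive=true
    }
}
```

Note that tcpKeepAlive only enables TCP keepalive on the client socket. The interval between keepalive probes is governed by operating system settings, so these may need to be shorter than the idle timeout for the parameter to take effect.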

          People

            Andreas Püschel