YADE - Yet Another Data Exchange Tool
  YADE-522

YADE running in parallel tasks should not be slower than running with a single task

    Details

      Description

      Current Situation

      • Consider the following situation:
        • a job chain with a file order source.
        • a YADE job in the job chain configured for a single task (<job tasks="1"/>).
        • 10 (small sized) files are copied to the incoming folder to start the job chain.
      • The following observations result from this scenario. Absolute transfer durations are not relevant here, as they depend on network latency, hardware capabilities and software configuration; the relative values, however, are surprising:
        • the first transfer takes 7s.
        • the next transfers take 1-2s each.
        • the complete transfer for 10 sequential task executions takes <20s.
      • When increasing the number of tasks to 10 (<job tasks="10"/>) the tasks run in parallel. Consider the following observations:
        • each task takes >60s.
        • the complete transfer takes >60s.
      • This behavior is explained as follows:
        • each task that starts the YADE job has to load a Java Virtual Machine.
        • a task for the YADE job by default remains active for 5s after completion of an order. If additional orders arrive within this idle timeout then the same task is re-used to process the next order, i.e. no new task has to be started and the same Java Virtual Machine can be used. This causes performance improvements when running a single task sequentially for a number of orders.
        • however, when running 10 tasks in parallel then each task
          • has to load its individual Java Virtual Machine,
          • will send requests to the JobScheduler Master to retrieve job and order parameters.
        • the JobScheduler Master will handle such requests sequentially
          • which results in the fact that requests from 10 simultaneously started tasks are processed sequentially, i.e. they are queued.
          • as a result in a worst scenario the negative impact includes that 10 parallel tasks (each processing a single incoming file order) take more time than running a single task for 10 sequential orders.
          • in addition CPU consumption of the Master raises due to a high frequency of parallel requests.
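The queueing effect described above can be made concrete with a back-of-the-envelope model. Assuming, purely for illustration (none of these numbers are measured), that each task issues 6 API requests to the Master, that the Master serves requests strictly one at a time, and that a request takes 1s on a busy Master, the last of 10 parallel tasks cannot finish before 60s of request handling have elapsed:

```python
def parallel_lower_bound(tasks: int, requests_per_task: int,
                         secs_per_request: float) -> float:
    """Earliest possible completion of the last task's final request
    when the Master serves all API requests strictly sequentially."""
    return tasks * requests_per_task * secs_per_request

# Illustrative numbers only: 10 tasks, 6 requests per task,
# 1 s per request on a contended Master.
bound = parallel_lower_bound(tasks=10, requests_per_task=6,
                             secs_per_request=1.0)
print(f"lower bound on total request handling: {bound:.0f}s")  # 60s
```

With these assumed values the bound alone matches the observed ">60s" per task, before any file transfer time is counted; a single reused task avoids both the repeated JVM startups and the request contention.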

      Desired Behavior

      • Basically, 10 parallel tasks for 10 individual files should not take more time than running a single task for the same number of files.

      Maintainer Notes

      • Explanation
        • For performance optimization the scenario in question has to be clearly identified:
          • transferring small files (whatever "small" means for your bandwidth) should be done with a small number of tasks.
          • transferring large files (some MB or GB) suggests using a larger number of parallel tasks.
        • Existing performance optimizations for file transfer jobs include
          • to extend the idle timeout for tasks (<job idle_timeout="..."/>) so that the same task is reused for subsequent orders. There is no harm in increasing the idle timeout to e.g. 300s.
          • to preload tasks (<job min_tasks="..."/>) in order to guarantee that tasks are already started when an order arrives. Preloaded tasks do not consume CPU, but they do require memory. If you can spare some 64MB per preloaded task, the performance improvement is considerable.
        • The worst case scenario from the original use case is not realistic for all environments. If files arrive with a minimum delay (2-4s) then the observed behavior will not occur.
      • The improvements from this issue include:
        • reducing the number of API requests a file transfer task sends to the Master. This lowers the Master's CPU consumption and makes contention between parallel API requests from such jobs less probable.
        • optimizing the Master's performance when such jobs are monitored with the JOC Cockpit GUI. The worst case scenario above includes frequent JOC Cockpit requests for order status updates; with the change from this issue these requests are handled in parallel and therefore no longer delay API requests from jobs.
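The tuning options mentioned above can be combined in the job configuration. A sketch (attribute names as they appear in this issue; the values are illustrative, not recommendations):

```xml
<!-- Keep the task (and its JVM) alive for 300 s after an order completes,
     and keep 2 tasks preloaded so a JVM is already running when an
     order arrives. -->
<job tasks="2" min_tasks="2" idle_timeout="300">
    ...
</job>
```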

        Activity

        Uwe Risse added a comment -

        For small files the parallel transfer is still slower, but performance is much better now. For big files the parallel transfer becomes faster. It is recommended not to run the job in too many parallel tasks, but to work with the idle timeout to avoid unnecessary initializations, e.g. connecting to the database.


          People

          • Assignee:
            Santiago Aucejo Petzoldt
            Reporter:
            Uwe Risse
            Approver:
            Uwe Risse
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved: