How Does CopyStorm/JobRunner Recover from a System Reboot?

When a system crashes or is rebooted, or when the CopyStorm/JobRunner application is manually terminated, the CopyStorm/Director database is left with records indicating that jobs are still running even though they were terminated unexpectedly. This article explains how CopyStorm/JobRunner recovers from this situation.

The answer is that all of these cases are handled the same way:

  • Each time CopyStorm/JobRunner starts a task, the system process id of the task is stored in the CopyStorm/Director database.
  • Each time CopyStorm/JobRunner starts, it looks for tasks that are marked as running in the CopyStorm/Director database but whose processes no longer exist on the system.
    • If any are found, the jobs are marked as completed and given a log file status of “Process died without notifying Copystorm/Director.”
  • Any job detected as dying unexpectedly is rescheduled using the job’s scheduling parameters.
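
The startup check described above can be sketched in Python. This is only an illustration of the logic, not CopyStorm/JobRunner's actual implementation: the `JobRecord` fields, the `recover_orphaned_jobs` function, and the `needs_reschedule` flag are hypothetical names, and an in-memory list stands in for the CopyStorm/Director database. The process check shown assumes a POSIX system.

```python
import os
from dataclasses import dataclass

# Status text quoted from the article; everything else here is hypothetical.
DIED_STATUS = "Process died without notifying Copystorm/Director."


@dataclass
class JobRecord:
    """Hypothetical stand-in for a job row in the Director database."""
    name: str
    pid: int                      # process id stored when the task started
    state: str                    # "running" or "completed"
    log_status: str = ""
    needs_reschedule: bool = False


def pid_is_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 tests for existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False              # no such process
    except PermissionError:
        return True               # process exists but belongs to another user
    return True


def recover_orphaned_jobs(records, is_alive=pid_is_alive):
    """Mark 'running' jobs whose process has disappeared and flag them for rescheduling."""
    recovered = []
    for rec in records:
        if rec.state == "running" and not is_alive(rec.pid):
            rec.state = "completed"
            rec.log_status = DIED_STATUS
            rec.needs_reschedule = True   # picked up later by the job's normal schedule
            recovered.append(rec)
    return recovered
```

Passing the liveness check in as a parameter keeps the sketch easy to exercise without real processes, for example `recover_orphaned_jobs(records, is_alive=lambda pid: False)` treats every running job as dead.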

When job completion is detected this way, the CopyStorm/Director database may show “Error” as the last status for an unusually high number of jobs. This is most likely when jobs were terminated during a reboot, and it is corrected automatically the next time each affected job is scheduled.