Verint jobs keep leaving "Running since ..." ghost entries after successful execution

This issue is related to private support ticket #1160782.

Introduction

- We have implemented multiple custom Verint jobs for an extension called Video Manager. The extension manages and processes videos on a Verint instance.

- The extension has two primary jobs that handle video processing on the server: VideoManagerContentCheckJob and VideoManagerJob.

- VideoManagerContentCheckJob runs once a day and once at server restart. It re-checks all content on the server for unprocessed videos and schedules them for processing.

- VideoManagerJob runs every minute to check whether any video is scheduled for processing. If there is none, it exits immediately.

Server info:

- Verint version 11.1.8.16788

Issue

1. On server launch, VideoManagerContentCheckJob runs as scheduled and checks all videos on the server.

2. We have very detailed event logging, which shows that the job's main method, Execute(...), exits successfully.

3. After the job has finished, we very frequently still see VideoManagerContentCheckJob in the list of active jobs, where it stays marked as active forever.

4. Restarting the server does not get rid of these entries, and new fake active entries keep appearing.

4.1. We can be sure that these active jobs are "ghosts": our detailed log shows that the job's main method exits, and every job run takes a named mutex lock that prevents multiple instances of the job from running in parallel.

Screenshot from the Job administration panel: none of these jobs is actually active; all of them finished successfully a long time ago.

  • A few notes from taking a look at your VideoManagerContentCheckJob code:

    1. Your Initialize() method attempts to Schedule the job manually - you do not need to do this. As long as the job is enabled and has a valid schedule configured, it will automatically follow that schedule. Scheduling manually is causing it to run as a dynamic job, resulting in the multiple instances you are seeing. Test this first and see if it resolves the issue before trying the other recommendations.
    2. We recommend that asynchronous patterns not be used. The job service is already managing background process usage according to the server resources available, and even with Cancellation and Task.WaitAll there is too much possibility of deadlock when blocking in a synchronous context. This is especially true when accessing an external resource like a SQL database, as you do. Two alternatives:
      1. Revert to synchronous/sequential method calls
      2. Separate this job into individual jobs for each "section" of content (MediaValuesChecker, ForumsChecker, etc) and let the job server manage scheduling and running them all individually. This could also assist in pinpointing any specific issues with particular content types. You could use a base class to avoid code repetition and only re-implement key components like JobTypeId, Name, and Checker logic.
    3. Similarly, we recommend that Mutex locking should not be used as it may have adverse effects on the job service's management of job execution and scheduling.
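
To illustrate recommendation 2.2 above, here is a minimal sketch of the base-class idea, written under stated assumptions rather than as the actual Video Manager or Verint code. ContentCheckJobBase, ForumsCheckJob, FindUnprocessedVideoIds, ScheduleForProcessing, and the GUID are all hypothetical stand-ins; in a real plugin the class would also implement the Verint job interface, and that interface's Execute(...) would delegate to the method shown here. The body is synchronous and sequential, with no Task.WaitAll, no named mutex, and no manual scheduling in Initialize().

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base class; none of these types come from the Verint SDK.
public abstract class ContentCheckJobBase
{
    // Each concrete job supplies its own identity.
    public abstract Guid JobTypeId { get; }
    public abstract string Name { get; }

    // Section-specific checker logic: IDs of videos not yet processed.
    protected abstract IEnumerable<int> FindUnprocessedVideoIds();

    // Hypothetical hook that queues a single video for processing.
    protected abstract void ScheduleForProcessing(int videoId);

    // Shared, synchronous, sequential job body.
    // Returns the number of videos that were scheduled.
    public int Execute()
    {
        var scheduled = 0;
        foreach (var videoId in FindUnprocessedVideoIds())
        {
            ScheduleForProcessing(videoId);
            scheduled++;
        }
        return scheduled;
    }
}

// One small job per content section; only the key pieces are re-implemented.
public sealed class ForumsCheckJob : ContentCheckJobBase
{
    private readonly IEnumerable<int> _unprocessed;

    // Records what was scheduled so the sketch can be inspected.
    public List<int> Scheduled { get; } = new List<int>();

    public ForumsCheckJob(IEnumerable<int> unprocessed)
    {
        _unprocessed = unprocessed;
    }

    // Placeholder GUID for illustration only.
    public override Guid JobTypeId { get; } =
        new Guid("6f1c2a3e-0b4d-4c5e-9a7f-1234567890ab");

    public override string Name => "Video Manager: Forums Content Check";

    protected override IEnumerable<int> FindUnprocessedVideoIds() => _unprocessed;

    protected override void ScheduleForProcessing(int videoId) => Scheduled.Add(videoId);
}
```

A MediaValuesChecker job would follow the same shape with its own JobTypeId, Name, and checker logic, so the job server can schedule each section separately and a stuck entry points at one content type.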