Continua doesn't fully recover after transient SQL Server outage

We’re running Continua 1.8.0.201 with a SQL Server 2014 database. Recently, we’ve been running into an issue where our SQL Server is sometimes slow to respond (which is being investigated separately by our IT). This state lasts for about 40 seconds and then normal operations resume.

When this happens, Continua does not seem to fully resume its operations. Some repositories are no longer checked for changes until either the service is restarted or an admin user manually resets them. Usually, the only way we notice that something went wrong is when people complain that their builds have been stuck in the queue for hours waiting for the repository to be checked.

I’m attaching the application event log with the SQL errors reported by Continua while SQL was down.

Attachment: AppLog.zip (29.536 KB)

Hi Miruna,

Thank you for this report and the event log entries. We’re not sure what is preventing Continua CI from recovering from database faults at the moment, but we’ll do some further testing on service recovery at the start of next week.