CRON Scheduled WebJob abruptly quit being called

I have a WebJob scheduled to run every 10 minutes via settings.job ('0 0/10 * * *'). It had been working fine, but last night the job just stopped being called. Looking in eventlog.xml around the time of the last call, I see the following:

<EventData>
    <Data>7192</Data>
    <Data>LogCleanup</Data>
    <Data>Role environment . FAILED TO INITIALIZE. hr: -2147024891</Data>
</EventData>

There were no more calls after this. I manually ran the job from the portal this morning; it worked fine and is being called every 10 minutes again as expected. My NLog internal log file recorded the following for the last run that was called:

2016-12-15 21:40:02.6449 Error Error has been raised. Exception: Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (409) Conflict. ---> System.Net.WebException: The remote server returned an error: (409) Conflict.
   at System.Net.HttpWebRequest.GetResponse()
   at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)
   --- End of inner exception stack trace ---
   at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)
   at Microsoft.WindowsAzure.Storage.Table.TableOperation.Execute(CloudTableClient client, CloudTable table, TableRequestOptions requestOptions, OperationContext operationContext)
   at Microsoft.WindowsAzure.Storage.Table.CloudTable.Execute(TableOperation operation, TableRequestOptions requestOptions, OperationContext operationContext)
   at NLog.AzureTableStorage.AzureTableStorageTarget.Write(LogEventInfo logEvent)
   at NLog.Targets.Target.Write(AsyncLogEventInfo logEvent)
Request Information
RequestID:1153c4ed-0002-000e-611b-57d353000000
RequestDate:Thu, 15 Dec 2016 21:40:02 GMT
StatusMessage:Conflict
ErrorCode:EntityAlreadyExists

The errors don't make any sense to me, but the bigger question is why the job just quit being called. One run failing with some unexplained error, and the scheduler then no longer calling the job, does not seem right.

How reliable are WebJobs? What kind of checks do I need in place to validate that they are being called?

Best answer

Please make sure you have AlwaysOn enabled for your Web App. CRON Scheduled jobs require this - see documentation here. The runtime will actually emit a Warning to the logs if we detect you don't have AlwaysOn enabled:

'Always On' doesn't appear to be enabled for this Web App. To ensure your continuous job doesn't stop running when the SCM host is idle for too long, consider enabling 'Always On' in the configuration settings for your Web App. Note: 'Always On' is available only in Basic, Standard and Premium modes.

Please check your WebJob logs for this warning - you should see it. The idea of that log entry was to help users self-diagnose this, but perhaps you didn't see it? We also show a warning in the portal for the WebJob if we detect you have continuous WebJobs but AlwaysOn is not enabled. That warning appears next to the AlwaysOn setting on the settings page for your Web App.
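
On the question of how to validate that the job is actually being called: one option is an external check that looks at the job's last run time and alerts when it is older than the schedule allows. The Python sketch below is only an illustration under assumptions, not part of the WebJobs SDK. It assumes the Kudu endpoint GET /api/triggeredwebjobs/{job name} on the site's SCM host returns JSON with a latest_run.start_time field in ISO 8601 UTC form; the names parse_utc, is_job_stale and max_gap_minutes are hypothetical. Only the staleness logic is shown; fetching the JSON with your deployment credentials is left out.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as '2016-12-15T21:40:02.6449Z' as UTC."""
    # Assumption: the API reports UTC with a trailing 'Z' and may emit
    # up to 7 fractional digits, while datetime only keeps microseconds.
    text = timestamp.rstrip("Z")
    if "." in text:
        head, frac = text.split(".", 1)
        text = f"{head}.{frac[:6].ljust(6, '0')}"
        fmt = "%Y-%m-%dT%H:%M:%S.%f"
    else:
        fmt = "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


def is_job_stale(job_info: dict, now: datetime, max_gap_minutes: int = 25) -> bool:
    """Return True when the last run started longer ago than the allowed gap.

    job_info is the parsed JSON for one triggered WebJob (assumed shape:
    {"latest_run": {"start_time": "..."}}). A job with no recorded run
    is treated as stale.
    """
    latest = job_info.get("latest_run")
    if not latest or not latest.get("start_time"):
        return True
    last_start = parse_utc(latest["start_time"])
    return now - last_start > timedelta(minutes=max_gap_minutes)
```

For a job scheduled every 10 minutes, a gap of 25 minutes tolerates one missed or slow run before raising an alert. Run the check from somewhere other than the Web App itself, so that it still fires when the SCM host has gone idle.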