Stackify NLog intermittent logging

622 Views Asked by At

We have an application where we have several Azure WebJobs set up, and we have NLog set up to report on the entry and exit of those webjobs. We have NLog hooked into Stackify so that we can use the Stackify logger to watch things move from WebJob to WebJob. The logging works (so we know the configuration is OK). The problem we are hitting is that the logging is intermittent - we have independent confirmation of messages travelling along several WebJobs, but we will only get partial logging. For example, the first log will record that a message was read off of it, but won't show the exit message. The next log will show the entry and exit. The problem is that they all leverage the exact same code, so the same log message should appear every single time. Has anyone else encountered this issue before? Before you answer, we have taken these diagnostic steps:

(1) We have verified the proper configuration settings in all webjob config files.

(2) We have verified that all of the webjob config files are properly posted in Azure.

(3) We have independently verified (via the webjob consoles in Azure) that the messages are getting to the different webjobs, processing, and then moving on to the next webjob, meaning that they would be invoking the logging code.

(4) We have verified that our Stackify log data is not being throttled (this might have made sense if the logs had simply stopped - but the fact that we just get partial logging in some cases and full logging in others during a single message transmission makes this unlikely).

(5) We have pored over the Stackify documentation, which seems to ensure that configuration is set up properly so that you get logging at all. It covers the all-or-nothing scenario, but not the some scenario.

I will be happy to provide more clarification as needed.

3

There are 3 best solutions below

0
On BEST ANSWER

We discovered what is going on, and it is rather strange. The issue ended up being with our use of NLog.Targets.Stackify. We were using version 1.25.4. When I ran a test using the NLog debugger suggested by Julian, this is what showed up:

2017-01-09 10:14:43.5079 Info Loading assembly name: NLog.Targets.Stackify
2017-01-09 10:14:43.5449 Debug ScanAssembly('NLog, Version=4.0.0.0, Culture=neutral, PublicKeyToken=5120e14c03d0593c')
2017-01-09 10:14:43.5929 Debug Start auto loading, location: C:\Dev\AffinityMain\platform\Integrity.WebJob.Rating\bin\Debug
2017-01-09 10:14:43.5929 Info Auto loading assembly file: C:\Dev\AffinityMain\platform\Integrity.WebJob.Rating\bin\Debug\NLog.Targets.Stackify.dll
2017-01-09 10:14:43.6039 Info NLog.Targets.Stackify, Version=1.18.6200.39247, Culture=neutral, PublicKeyToken=null. File version: 1.18.*. Product version: 1.25.4.
2017-01-09 10:14:43.6039 Debug ScanAssembly('NLog.Targets.Stackify, Version=1.18.6200.39247, Culture=neutral, PublicKeyToken=null')
2017-01-09 10:14:43.6249 Warn Type load exception. Exception: System.IO.FileLoadException: Could not load file or assembly 'NLog, Version=5.0.0.0, Culture=neutral, PublicKeyToken=5120e14c03d0593c' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)
File name: 'NLog, Version=5.0.0.0, Culture=neutral, PublicKeyToken=5120e14c03d0593c' 

This was followed shortly by this error when it was parsing the config:

System.ArgumentException: Target cannot be found: 'StackifyTarget'

The documentation indicates that version 4.4 of NLog is sufficient to run this library. However, the NLog debugger indicates that it is looking for version 5.0, and not finding it (which it wouldn't because we aren't using it). Now, what makes this strange is that it did work at one time, with this version, so somehow a reference to NLog 5.0 is stuck somewhere in the system, but we can't find it because everything in our solution is running NLog 4.4. We've checked the csproj, the packages config, the app config, the actual installed nuget packages - no reference to version 5.0.

The answer to the problem was to downgrade to 1.25.3. As soon as I did that, it worked. I then tried to bump it back to 1.25.4, and had the same problem again. Matt - to your point about Shutdown - we are going to add that to our webjobs, and then I will monitor to see if we are seeing complete logs. Thank you everyone for your suggestions!

1
On

Could be nice with an example of the NLog-config. If using the async-wrapper, then by default the overflow action is to discard "random" messages.

1
On

You should be able to fix this problem by adding one line of code at the end of your application to StackifyLib to flush.

StackifyLib.Logger.Shutdown();

https://github.com/stackify/stackify-api-dotnet/blob/master/README.md

I hope this fixes it for you, if not, please contact Stackify support.