Earlier today I was chasing down some strange behavior in a ServiceBus-triggered WebJob in Azure. The logs showed that the WebJob was triggered when a new message arrived, but then it seemed to sit idle for quite some time before it started processing the message. Because of that delay the WebJob lost its lock on the message, and the message went back into the queue to be processed a second time.
I initially thought it was some kind of intermittent issue with the ServiceBus: perhaps it was serving the contents of the message slowly, which made the WebJob look idle until the lock expired.
Since I couldn’t track the issue down all the way, I set an AutoRenewTimeout of an arbitrary number of minutes and started following the messages more closely in the logs. What I found was really interesting: the WebJob triggered on a couple of messages that came in at the same time, but then processed them one by one. That means the last couple of messages lost their locks, because the locks were claimed long before the actual processing began.
Going back to the code, I noticed that the WebJob function had the Singleton attribute, which guarantees that only one instance of it runs at a time. For some still unknown reason, it honors this and only runs one instance, but at the same time it also honors the MaxConcurrentCalls value. So what was happening was that the WebJob first claimed locks on 16 messages and then began processing them one by one. No wonder the locks had expired on the last couple of messages…
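To make the timing concrete, here is a minimal sketch of the effect in Python. It is a toy model, not WebJobs SDK code: the function name, the 60 second lock duration and the 5 seconds of work per message are all assumptions picked for illustration. It only shows why claiming 16 locks up front and then working through them serially leaves the last messages with expired locks.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    message_id: int
    start_seconds: float   # how long the message waited before processing began
    lock_expired: bool     # True if processing finished after the lock ran out


def simulate(claimed=16, workers=1, processing_seconds=5.0, lock_seconds=60.0):
    """Toy model: all `claimed` messages are locked at t=0, then handled by
    `workers` parallel workers. All numbers are illustrative assumptions."""
    worker_free_at = [0.0] * workers  # time at which each worker is next idle
    outcomes = []
    for message_id in range(claimed):
        # Hand the message to whichever worker becomes free first.
        idx = min(range(workers), key=worker_free_at.__getitem__)
        start = worker_free_at[idx]
        finish = start + processing_seconds
        worker_free_at[idx] = finish
        # The lock was taken at t=0, so it is gone once we pass lock_seconds.
        outcomes.append(Outcome(message_id, start, finish > lock_seconds))
    return outcomes


# 16 locks claimed at once, but only one message processed at a time.
serial = simulate(claimed=16, workers=1)
print([o.message_id for o in serial if o.lock_expired])  # [12, 13, 14, 15]

# 16 locks claimed and 16 messages processed in parallel: nothing expires.
parallel = simulate(claimed=16, workers=16)
print(any(o.lock_expired for o in parallel))  # False
```

With these made-up numbers the last four messages lose their locks in the serial case, which matches the pattern in the logs. Calling simulate(claimed=1, workers=1) models claiming only one message at a time, and in that case the lock never expires.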
Whoever introduced the Singleton attribute in the code was probably looking for the MaxConcurrentCalls setting instead.
So, a word of caution: the Singleton attribute, used by itself without adjusting the rest of the configuration, might give you some weird results that aren’t actually what you’re after.