Saturday, August 25, 2012

Entry Processors part 2 - Deadlocking


This is part two of my entry processor post. It deals with the advantages and disadvantages of cross cache entry processors.

Cross cache Entry processors
As mentioned in the previous post, entry processors allow us to update values in the most efficient manner. However, you will often want to update values in multiple caches, or update a value that is derived from related entries. To do this, all you would need to do is call CacheFactory.getCache from inside the entry processor, for instance:

Looking at the diagram, we can see the problem straight away: our code has to perform an extra network hop to get one of the blue objects, because nothing guarantees that the object lives on the same node. So how can we solve this?

Coherence offers something called key association, which ensures that two or more related entries sharing the same business key are co-located. This requires both caches to share a service. I have previously shown some benefits of key association in the versioning post, and the documentation is here, so I'll assume everyone is familiar with it. If not, take a look at the mechanism; I'll wait and check my emails.
Welcome back. Now, after we implement key association, our caches would look like this:
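
To make the co-location rule concrete, here is a minimal plain-Java sketch of the idea. It does not use the Coherence API: the AssociatedKey interface only stands in for the role of Coherence's KeyAssociation, and the class names, the partition count and the hashing scheme are all assumptions made for illustration.

```java
import java.util.Objects;

// Plain-Java illustration of the idea behind key association.
// Nothing here is Coherence code; every name is hypothetical.
class KeyAssociationSketch {

    // Stand-in for the role of Coherence's KeyAssociation interface:
    // a key exposes the key that decides where it should live.
    interface AssociatedKey {
        Object getAssociatedKey();
    }

    // Key of an entry in a hypothetical "first" cache.
    // equals/hashCode are omitted for brevity; real cache keys need them.
    static final class ParentKey implements AssociatedKey {
        final String businessKey;

        ParentKey(String businessKey) {
            this.businessKey = businessKey;
        }

        @Override
        public Object getAssociatedKey() {
            return businessKey;
        }
    }

    // Key of a related entry in a hypothetical "second" cache.
    static final class ChildKey implements AssociatedKey {
        final String businessKey;
        final int childId;

        ChildKey(String businessKey, int childId) {
            this.businessKey = businessKey;
            this.childId = childId;
        }

        @Override
        public Object getAssociatedKey() {
            return businessKey;
        }
    }

    // Simplified placement rule: hash the associated key, not the whole key,
    // so every key sharing a business key maps to the same partition.
    static int partitionFor(AssociatedKey key, int partitionCount) {
        return Math.floorMod(Objects.hashCode(key.getAssociatedKey()), partitionCount);
    }

    public static void main(String[] args) {
        int partitions = 257; // assumed partition count for the example
        System.out.println(partitionFor(new ParentKey("order-42"), partitions));
        System.out.println(partitionFor(new ChildKey("order-42", 1), partitions));
        System.out.println(partitionFor(new ChildKey("order-42", 2), partitions));
    }
}
```

All three lines print the same partition number, because only the shared business key takes part in the placement decision.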

Which is exactly what we wanted, but when we run the code we get this exception:
2012-08-07 11:49:41,510 ERROR [Logger@9252516 3.7.1.0] Coherence(3) - 2012-08-07 11:49:41.507/14.742 Oracle Coherence GE 3.7.1.0 <Error> (thread=DistributedCache:first-service, member=1): Assertion failed: poll() is a blocking call and cannot be called on the Service thread

This is very dangerous, because if the service has multiple threads, instead of an exception you will only get this warning:
2012-08-07 11:40:50,040 WARN  [Logger@9243192 3.7.1.0] Coherence(3) - 2012-08-07 11:40:50.040/5.833 Oracle Coherence GE 3.7.1.0 <Warning> (thread=first-serviceWorker:1, member=1): Application code running on "first-service" service thread(s) should not call ensureCache as this may result in deadlock. The most common case is a CacheFactory call from a custom CacheStore implementation.

This happens because you are accessing the service from a thread that belongs to the same service. Here is where the behavior starts getting tricky: if you have two or more free threads it will work as expected, apart from the logged warning. However, if you only have one free thread, or only one extra thread on your service, the execution of the entry processor will deadlock until it is killed by the service guardian. You should never do this, so how should we go cross cache properly?
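
The thread starvation described above is not specific to Coherence, so it can be reproduced with a plain JDK thread pool. The sketch below is only an analogy under stated assumptions: the pool plays the role of the cache service, the outer task plays the entry processor, the nested submit plays the CacheFactory.getCache call, and the timeout plays the service guardian. All names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Analogy only: a task that blocks on work submitted to its own pool.
class ReentrantServiceCallSketch {

    // Runs an outer task on a pool with the given number of threads.
    // The outer task submits an inner task to the SAME pool and waits for it.
    static String invokeNested(int serviceThreads, long timeoutMillis) {
        ExecutorService service = Executors.newFixedThreadPool(serviceThreads);
        try {
            Future<String> outer = service.submit(() -> {
                Future<String> inner = service.submit(() -> "value from second cache");
                try {
                    return inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // No free thread was left to run the inner task.
                    inner.cancel(true);
                    return "deadlocked";
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            service.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 thread:  " + invokeNested(1, 500));
        System.out.println("2 threads: " + invokeNested(2, 500));
    }
}
```

With a single thread the inner task can never start while the outer one is waiting for it, so the call times out; with a second free thread the same code happens to work, which is exactly why the problem is easy to miss in testing.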

Direct backing map cross cache access 
As the title (which is a mouthful) explains, we should access the entries directly in the backing map. Because we are using key association, we are sure the entries will be co-located, as explained in figure 2.



Entry Processors in Coherence 3.7
Entry processors have gotten better in 3.7: their transactionality and locking have been extended to include cross cache access, so all of the wonderful transactional guarantees explained in my previous post now apply across caches as well. This, however, creates quite a big problem, as you can potentially deadlock your application. It might not be clear at first glance, but the diagram below should make it clearer:

Because every access takes a lock, the example in the diagram will deadlock: the entry processor that was invoked last will fail due to the deadlock, while the one that arrived first will continue its execution. This is a deceptively simple problem to avoid with good design: create and document the expected flow through the caches, i.e. all cross cache entry processors should go from the first to the second cache. This way they will queue and provide the expected behavior from entry processors.
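
The ordering rule can be tried out with ordinary JDK locks. The following is a simplified simulation, not Coherence code: two ReentrantLock instances stand in for the locked entries in the first and second cache, and a barrier forces the unlucky interleaving when the two processors walk the caches in opposite directions. In this sketch both processors give up after a timeout, whereas Coherence, as described above, fails only the later one. All names are hypothetical.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Simulation only: locks stand in for entries locked by entry processors.
class LockOrderingSketch {

    // Locks "from", then tries to lock "to". Counts the run as completed
    // only if both locks were held at the same time.
    static void process(ReentrantLock from, ReentrantLock to,
                        CyclicBarrier forceInterleaving, AtomicInteger completed) {
        try {
            from.lock();
            try {
                if (forceInterleaving != null) {
                    forceInterleaving.await(); // both threads now hold their first lock
                }
                if (to.tryLock(200, TimeUnit.MILLISECONDS)) {
                    try {
                        completed.incrementAndGet();
                    } finally {
                        to.unlock();
                    }
                }
                if (forceInterleaving != null) {
                    forceInterleaving.await(); // release only after both have tried
                }
            } finally {
                from.unlock();
            }
        } catch (InterruptedException | BrokenBarrierException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs two simulated processors and returns how many of them completed.
    static int runPair(boolean consistentOrder) {
        ReentrantLock first = new ReentrantLock();
        ReentrantLock second = new ReentrantLock();
        AtomicInteger completed = new AtomicInteger();
        CyclicBarrier barrier = consistentOrder ? null : new CyclicBarrier(2);

        Thread a = new Thread(() -> process(first, second, barrier, completed));
        Thread b = consistentOrder
                ? new Thread(() -> process(first, second, barrier, completed))
                : new Thread(() -> process(second, first, barrier, completed));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("first -> second for both: " + runPair(true) + " completed");
        System.out.println("opposite directions:      " + runPair(false) + " completed");
    }
}
```

When both processors go from the first lock to the second, the later one simply queues behind the earlier one and both complete. When they go in opposite directions, each ends up holding the lock the other needs.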

Cheers,
