Saturday, June 23, 2012

Filtering Filters

Hello! (Who needs a greeting anyway?) Distributed queries are a very powerful part of Coherence, but they can introduce show-stopping performance degradation, so it is vital to get your filters right.

The filters provided by Coherence take an extractor as a parameter. That extractor is executed across all of the nodes, and the filter's evaluate method runs on the value it returns.
This means that if you use a plain reflection extractor (which is what you get by default when you construct a filter with just a method name, as in new EqualsFilter("getName", "some_name")), the query is slow and inefficient, because every object is deserialized in order to evaluate the filter. There are two ways to avoid this.

The simplest way to avoid this is to use an index to store those values. Indexed values are kept in deserialized form and are matched in the same way as regular database indexes. The filter classes provided by Coherence implement IndexAwareFilter and know how to look up those indexes, therefore avoiding the deserialization.
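To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model and does not use the Coherence API: IndexSketch, Person, scan, buildIndex and lookup are all hypothetical names invented for the illustration. It only shows why a lookup in a prebuilt index avoids running the extractor against every entry at query time.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Toy model, NOT the Coherence API: contrasts a full scan with an index lookup.
class IndexSketch {
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Full scan: the extractor runs against every entry in the cache.
    static Set<Integer> scan(Map<Integer, Person> cache,
                             Function<Person, String> extractor, String wanted) {
        Set<Integer> keys = new TreeSet<>();
        for (Map.Entry<Integer, Person> e : cache.entrySet()) {
            if (wanted.equals(extractor.apply(e.getValue()))) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Index build: done once, maps each extracted value to the keys that hold it.
    static Map<String, Set<Integer>> buildIndex(Map<Integer, Person> cache,
                                                Function<Person, String> extractor) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (Map.Entry<Integer, Person> e : cache.entrySet()) {
            index.computeIfAbsent(extractor.apply(e.getValue()), k -> new TreeSet<>())
                 .add(e.getKey());
        }
        return index;
    }

    // Index lookup: a single map access, with no per-entry evaluation.
    static Set<Integer> lookup(Map<String, Set<Integer>> index, String wanted) {
        return index.getOrDefault(wanted, Collections.emptySet());
    }
}
```

Both paths return the same keys; the difference is that the scan pays the extraction cost on every query, while the index pays it once when it is built. In Coherence itself the index is created on the cache (through the addIndex method) rather than by hand like this.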

The other way is to pass a POF extractor as the parameter. POF extractors do not deserialize the whole object in order to extract the desired value. This approach is slower than having indexes, because each entry is still evaluated rather than simply looked up in the index table. It is more flexible, though, as you can decide at run time which fields will be used.
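The idea of pulling one value out of the serialized form can be sketched in plain Java as follows. This is only an illustration under loud assumptions: the byte layout below is made up for the example and is not the real POF format, and BinaryExtractSketch with its methods is a hypothetical name, not a Coherence class.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy model, NOT the real POF format. Assumed layout of one record:
// [int age][int nameLength][name bytes in UTF-8]
class BinaryExtractSketch {
    static byte[] serialize(String name, int age) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + nameBytes.length)
                .putInt(age)
                .putInt(nameBytes.length)
                .put(nameBytes)
                .array();
    }

    // Reads only the age field; the name bytes are never decoded
    // and no domain object is built.
    static int extractAge(byte[] binary) {
        return ByteBuffer.wrap(binary).getInt(0);
    }

    // Reads only the name field by skipping over the age.
    static String extractName(byte[] binary) {
        int length = ByteBuffer.wrap(binary).getInt(4);
        return new String(binary, 8, length, StandardCharsets.UTF_8);
    }
}
```

Note that each extraction still touches every record it is asked about, which is why this approach does not replace an index: it makes the per-entry cost cheaper, but does not remove it.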

Custom Filters:
Sometimes you need more complex logic than the built-in filters provide, so you may decide to code your own. The simplest way to do that is to implement the Filter interface. However, a filter created in this manner will not use indexes even if they are relevant, and it will deserialize every object in that particular cache, resulting in catastrophic performance.

There are a few ways to create a custom filter efficiently:

Implementing EntryFilter instead of Filter allows you to implement the evaluateEntry method, which receives the cache entry as a parameter; on the storage nodes this can be treated as a BinaryEntry. This lets you extract the values yourself in a clever manner and avoid deserializing the whole entry. However, the extraction still occurs for every entry, which can be time consuming.

Extending ExtractorFilter allows you to take advantage of the created indexes, by passing your extractor to the superclass constructor, and to write your complex custom logic by implementing the evaluateExtracted method, which evaluates the extracted value rather than the whole object.
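The shape of that last approach can be sketched in plain Java. Again this is a toy model rather than Coherence code: ExtractingFilterSketch and PrefixFilter are hypothetical classes written for this example, and only the method name evaluateExtracted mirrors the real one. The point is the division of labour: the base class owns the extraction, the subclass owns the logic.

```java
import java.util.function.Function;

// Toy model of the template-method shape; these are hypothetical classes,
// NOT Coherence's ExtractorFilter.
abstract class ExtractingFilterSketch<T, E> {
    private final Function<T, E> extractor;

    // The subclass hands its extractor to the base class constructor.
    protected ExtractingFilterSketch(Function<T, E> extractor) {
        this.extractor = extractor;
    }

    // The base class decides how the value is obtained...
    public final boolean evaluate(T target) {
        return evaluateExtracted(extractor.apply(target));
    }

    // ...and the subclass only sees the already extracted value.
    protected abstract boolean evaluateExtracted(E extracted);
}

// Custom logic: matches when the extracted string starts with a prefix.
class PrefixFilter<T> extends ExtractingFilterSketch<T, String> {
    private final String prefix;

    PrefixFilter(Function<T, String> extractor, String prefix) {
        super(extractor);
        this.prefix = prefix;
    }

    @Override
    protected boolean evaluateExtracted(String extracted) {
        return extracted != null && extracted.startsWith(prefix);
    }
}
```

Because the custom code never performs the extraction itself, the base class is free to obtain the value in the cheapest way available to it, which is the property the real ExtractorFilter approach relies on.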


Cheers,