Data requests such as database queries, extensive computations, file reads or report composition come at the price of high latency. Caching APIs can lower that price, yielding lower latency, considerable increases in performance, improved concurrency and scalability, plus savings in bandwidth and server costs. This article explores the core concepts of improving Java applications through caching.
A cache is an area of local memory that holds a copy of frequently accessed data that is otherwise expensive to get or compute. A typical interface for a Java cache provides access to the data using a unique key:
public interface Cache<K, V> {
    V get(K key);
    V put(K key, V value);
}
In our cache example, we suppose that the cache doesn't permit null values. So get(...) will return the value mapped to the key, or null if this cache doesn't contain that key. put(...) puts both its parameters, key and value, into the cache and returns the value that was previously mapped to that key, provided that the key was already present in the cache. If the cache didn't contain that key, put(...) returns null.
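To make the contract described above concrete, here is a minimal sketch of a map-backed implementation. The class name `SimpleMapCache` is hypothetical, the interface is repeated (without `public`) only so the example compiles on its own, and the sketch is unbounded and not thread-safe, so it illustrates the `get`/`put` semantics rather than a usable cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// The interface from the article, repeated so the sketch is self-contained.
interface Cache<K, V> {
    V get(K key);
    V put(K key, V value);
}

// Hypothetical illustration only: unbounded and not thread-safe.
class SimpleMapCache<K, V> implements Cache<K, V> {
    private final Map<K, V> entries = new HashMap<>();

    @Override
    public V get(K key) {
        // Returns the mapped value, or null if the key is not cached.
        return entries.get(key);
    }

    @Override
    public V put(K key, V value) {
        // Null values are rejected so that a null result from get()
        // always means "this key is not in the cache".
        Objects.requireNonNull(value, "null values are not permitted");
        // Returns the previously mapped value, or null if the key was absent.
        return entries.put(key, value);
    }
}
```

With this sketch, the first `put("a", "1")` returns null, a second `put("a", "2")` returns "1", and `get("a")` then returns "2".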
Just as your web browser stores web pages in a local cache to avoid requesting them from the web server each time you revisit them, a caching API can improve the performance of many Java applications. It relies on the same principle: data is stored in a cache, implemented as a block of local memory, so that future requests for that data can be served faster.
Caching may provide a significant performance improvement for a Java application, often by orders of magnitude. The improvement comes from serving hard-to-get data from the local memory of the application.
For example, consider an application that displays a 50-line system status report on a web page after the user logs into the system. For each line it sends a complex SQL query to a database. Executing each query and fetching the results over the network takes 100 milliseconds on average, so the total average time to collect all data for the page is about 5 seconds. Getting the same results from a cache takes about 5 microseconds on a modern 2GHz CPU. For this particular use scenario, that is an improvement by a factor of 1,000,000!
While caching offers a lot of benefits, it has some disadvantages. A cache takes up memory, and it takes CPU time to move entries to and from the cache. Caching doesn't work well for data that is updated often and read rarely, because the expense of maintaining the cache is higher than the time saved on the infrequent reads.
Common cache use scenarios include an application cache, a level-2 (L2) cache and a data grid.
An application cache is a cache that an application accesses directly. An application benefits from using a cache by keeping the most frequently accessed data in memory.
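As a rough sketch of direct cache access, the application can check the cache first and fall back to the slow data source only on a miss. The class `ReportLineService` and the `slowSource` function are hypothetical names chosen for illustration, and a plain `ConcurrentHashMap` stands in for a real cache here, so there is no size limit or eviction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical service that caches report lines keyed by line number.
class ReportLineService {
    // Stand-in for a real cache: unbounded, no eviction.
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    // Assumed slow data source, e.g. a database query per report line.
    private final Function<Integer, String> slowSource;

    ReportLineService(Function<Integer, String> slowSource) {
        this.slowSource = slowSource;
    }

    String getLine(int lineNumber) {
        // 1. The application asks the cache first.
        String cached = cache.get(lineNumber);
        if (cached != null) {
            return cached;
        }
        // 2. On a miss, the application itself fetches the data...
        String loaded = slowSource.apply(lineNumber);
        // 3. ...and populates the cache for future requests.
        if (loaded != null) {
            cache.put(lineNumber, loaded);
        }
        return loaded;
    }
}
```

Two threads that miss on the same key at the same time may both call the slow source in this sketch; that is harmless here but is one of the details a production-grade cache handles for you.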
The following communication diagram illustrates an application cache usage:
One of the major use scenarios for a cache is a level-2 (L2) cache. An L2 cache provides caching services to an object-relational mapping (ORM) framework such as Hibernate or to a data mapping (DM) framework such as iBatis. An L2 cache hides the complexity of the caching logic from an application.
An L2 cache improves the performance of an ORM or DM framework by reducing unnecessary trips to the database.
The following communication diagram illustrates using an L2 cache:
The application does not access the cache directly in this use scenario. Instead, the application uses a high-level interface provided by an ORM or a DM framework. The framework uses the cache to store its internal data structures, such as mapped objects and database query results. If the cached data is not available, the framework retrieves it from the database and puts it into the cache.
A data grid is a reliable distributed cache that uses an external data source to retrieve data that is not present in the cache and an external data store to write the updates. An application using a data grid benefits from simplified programming of cache access.
This use scenario differs from the application cache and the second-level cache, where the application or the data access framework is responsible for populating the cache on cache misses.
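The difference can be sketched in a few lines: the cache itself loads missing entries from the data source and forwards updates to the data store, so the caller only ever talks to the cache. The class `ReadWriteThroughCache` and its `dataSource` and `dataStore` parameters are hypothetical, and this single-process sketch deliberately leaves out the distribution and reliability that a real data grid provides.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical single-process illustration of read-through and write-through.
class ReadWriteThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> dataSource;   // assumed external data source
    private final BiConsumer<K, V> dataStore;  // assumed external data store

    ReadWriteThroughCache(Function<K, V> dataSource, BiConsumer<K, V> dataStore) {
        this.dataSource = dataSource;
        this.dataStore = dataStore;
    }

    V get(K key) {
        // Read-through: the loader runs only when the key is absent.
        // If the loader returns null, nothing is cached and null is returned.
        return entries.computeIfAbsent(key, dataSource);
    }

    void put(K key, V value) {
        // Write-through: the update goes to the external store first,
        // then the cached copy is refreshed.
        dataStore.accept(key, value);
        entries.put(key, value);
    }
}
```

The calling code contains no miss handling at all, which is the simplification of cache access that the data grid scenario describes.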
The following communication diagram illustrates a data grid usage:
A cache uses a part of the application's memory. That is why the size of the cache has to be small. In order to benefit from caching, the access to data should display properties of temporal and spatial locality: data that was accessed recently is likely to be accessed again soon, and data located near recently accessed data is likely to be accessed soon as well.
The data from the example at the beginning of the article satisfies the requirement of temporal and spatial locality. Users log into the system around the same time, and the number of report items that are accessed in rapid succession is small.
Data that does not satisfy the requirement of temporal and spatial locality of access leads to faster eviction of cache entries and, as a result, to a lower number of cache hits and an increased cost of maintaining the cache.
The main performance characteristic of a cache is its hit/miss ratio. Suppose an application requests data by submitting a particular key. If this key (and the corresponding data) can be found in the cache, we have a "hit". If the cache doesn't contain this key, we have a "miss", and the data must be fetched from our data source, returned to the application and copied into the cache, allowing faster access in case this data is requested again.
The hit/miss ratio is calculated as the number of cache hits divided by the number of cache misses accumulated over a period of time. A high hit/miss ratio means that a cache is performing well. A low hit/miss ratio means that the cache is applied to data that should not be cached. A low hit/miss ratio may also mean that the cache is too small to capture the temporal locality of data access. In our experience, a hit/miss ratio of less than 60-70% is often considered low.
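As an illustration of how the ratio can be tracked, the following sketch counts hits and misses on every `get` and divides them as defined above. The class `CountingCache` and its method names are hypothetical, and the sketch is single-threaded and unbounded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache wrapper that records hits and misses.
class CountingCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private long hits;
    private long misses;

    V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;     // the key was found in the cache
        } else {
            misses++;   // the caller has to go to the data source
        }
        return value;
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    // Hits divided by misses, following the definition in the text.
    // With no misses recorded yet, floating-point division yields
    // Infinity (some hits) or NaN (no requests at all).
    double hitMissRatio() {
        return (double) hits / misses;
    }
}
```

For example, three hits and two misses give a hit/miss ratio of 1.5. A real cache would typically expose such counters through its statistics API and reset them periodically so the ratio reflects a recent period of time.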
A cache eviction policy is an algorithm according to which an existing element is removed from a cache when a new element is added. The eviction policy is applied to ensure that the size of the cache does not exceed a maximum size. Least Recently Used (LRU) is one of the most popular among a number of eviction policies. LRU earned its popularity for being the best in capturing temporal and spatial locality of data access.
A minor disadvantage of LRU is its sensitivity to a full scan. The sensitivity manifests itself in evicting accumulated frequently accessed cache elements when accessing data that does not satisfy the requirement of temporal locality. This disadvantage is minor because LRU recovers from full scans quickly.
A typical implementation of a cache with the LRU eviction policy consists of a map and a linked list. The map stores cached elements. The linked list keeps track of the least recently used cache elements. When a cache element is read or updated, it is removed from the list and added to the top of the list. New elements are added to the top of the list as well. If the cache grows bigger than its maximum size, an element is removed from the bottom of the list and from the map. This way the least recently used elements are evicted first.
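In Java, the map-plus-linked-list structure described above is already available in `java.util.LinkedHashMap`, which can keep its entries in access order. The following is a minimal sketch built on it; the class name `LruCache` and the `maxSize` parameter are hypothetical, and the sketch is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache built on LinkedHashMap's access-order mode.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    LruCache(int maxSize) {
        // accessOrder = true: every get() or put() moves the entry to the
        // most-recently-used end of the internal linked list.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the
        // least recently used entry.
        return size() > maxSize;
    }
}
```

With a maximum size of 2, putting "a" and "b", reading "a" and then putting "c" evicts "b", because "b" is the least recently used entry at that point.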
While a simple cache in Java can be implemented in a few lines of code, it would be missing important features that a cache needs in order to be usable in a real application, such as concurrency and cache size management. Picking a production-grade Open Source cache such as Cacheonix may be a better choice.
Putting and getting an object from the cache is simple. The following example uses our Open Source Java cache Cacheonix:
final Cacheonix cacheManager = Cacheonix.getInstance();
final Cache<String, String> cache = cacheManager.getCache("invoce.cache");

// Put object to the cache
final String key = "key";
final String value = "value";
cache.put(key, value);

// Get object from the cache
final String myObject = cache.get(key);
Caching provides such a great improvement in performance that it is often used without limit. The Cache Them All anti-pattern is characterized by caching all data, without regard to temporal or spatial locality of data access.
Cache Them All degrades application performance instead of improving it. The degradation of performance is caused by the overhead of maintaining a cache without benefiting from reduced cost of accessing frequently used data.
To avoid the Cache Them All pitfall, cache only data that is hard to get and shows temporal and spatial locality of access.
If you are looking for a quality cache API, consider Cacheonix. Cacheonix is an Open Source Java cache that provides a fast local cache and a strictly-consistent distributed cache. Cacheonix is being developed actively. The latest Cacheonix version, 2.3.1, released on May 26, 2017, offers caching of web requests for Java web applications.
To add Cacheonix to your Maven project, add the following to the <dependencies> section of your pom.xml:
<dependency>
    <groupId>org.cacheonix</groupId>
    <artifactId>cacheonix-core</artifactId>
    <version>2.3.1</version>
</dependency>

Please visit the Cacheonix wiki to learn about configuring and programming with Cacheonix.
Adding caching to your Java application can yield a drop in latency, a considerable increase in performance, improved concurrency and scalability, plus savings in bandwidth and a reduction in server costs. Avoiding common pitfalls such as caching frequently updated, rarely read data will help to maintain a healthy cache.