Distributed caching offers significant opportunities for improving the scalability and concurrency of Java web applications.
Scalability is the ability of an application to process more requests by adding computational resources. There are two types of scalability: vertical and horizontal.
Vertical scalability is achieved by adding more computational resources, such as RAM and CPUs, to a single computer system. While easy to achieve, vertical scalability is limited. As of this writing, a single reasonably priced computer system can be scaled up only to about 64GB of RAM, 8 CPU cores, and 10Gbit/s of network I/O.
As demand grows, a Java web application needs more computing power and network bandwidth than a single box can provide, so the application has to scale horizontally. Horizontal scalability is the ability to process more requests by adding computer systems to the set of systems running the application and processing requests. This set of computer systems is also known as a cluster.
Application architects are familiar with the phenomenon where adding more nodes to a cluster causes the application's capacity to process requests to drop instead of grow. An application whose throughput does not increase as more servers are added to the cluster is said not to scale. An application that does not scale can only process requests as fast as its slowest resource allows.
The main source of scalability problems is an external bottleneck. A bottleneck is a resource to which access must be serialized. When a bottleneck is present, members of the cluster wait to obtain the right to operate on the resource, because only a single cluster node can access it at a time. Bottlenecks that often affect web applications are:
Distributed caching solves the problem of horizontal scalability by allowing a web application to avoid accessing a slow, serialized data source and instead get the data from fast local memory. Because access to the external bottleneck is removed, the application can scale linearly, simply by adding more nodes to the cluster.
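This read pattern can be sketched in a few lines. The `Cache` stand-in and `loadFromDatabase` method below are illustrative assumptions, not the API of any particular caching product; the point is only that the slow data source is consulted solely on a cache miss.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of reading through a cache to shield a slow data source.
public class CacheAsideExample {

    // Stands in for a distributed cache; a real one spreads entries across the cluster.
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Stands in for the slow, serialized data source (e.g. a transactional database).
    static String loadFromDatabase(String key) {
        return "value-for-" + key;
    }

    // Hit the database only on a cache miss; subsequent reads are local-memory fast.
    static String get(String key) {
        return cache.computeIfAbsent(key, CacheAsideExample::loadFromDatabase);
    }

    public static void main(String[] args) {
        System.out.println(get("user:42")); // miss: loads from the database
        System.out.println(get("user:42")); // hit: served from local memory
    }
}
```

Because every node resolves repeat reads from memory, the database stops being the resource on which all cluster members serialize.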
A Java web application using a distributed cache stores frequently accessed data, such as the results of requests to a transactional database, in a large coherent cache. The distributed cache spreads its data evenly across the cluster; this even distribution of cached data balances the load on each member.
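One common way to achieve this even spread is to hash each key to a partition owner. The sketch below uses plain modulo hashing for brevity; production caching systems typically use consistent hashing so that adding or removing a node relocates only a fraction of the entries. The `ownerOf` method and node names are hypothetical.

```java
import java.util.List;

// Sketch of mapping cache keys to cluster members so data and load spread evenly.
public class PartitionExample {

    static int ownerOf(String key, int clusterSize) {
        // Mask the sign bit: Math.abs(Integer.MIN_VALUE) would stay negative.
        return (key.hashCode() & 0x7fffffff) % clusterSize;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node-0", "node-1", "node-2");
        for (String key : List.of("user:1", "user:2", "user:3", "user:4")) {
            System.out.println(key + " -> " + nodes.get(ownerOf(key, nodes.size())));
        }
    }
}
```

Every member computes the same owner for a given key, so any node can route a request for that key directly to the member holding it.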
A distributed caching system maintains a fast local cache, also known as a front cache, on each cluster member to speed up access to cached data that resides on a remote node and to reduce network bandwidth usage. Cache coherence provided by the distributed caching system ensures that all members of the cluster have a consistent view of the data despite local caching.
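The two-tier lookup and the coherence mechanism can be sketched as follows. The `distributedTier` map and the `invalidate` callback are stand-ins, assumed for illustration; real products deliver invalidation or update events to each member's front cache over the network.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a front (near) cache layered over the distributed tier.
public class FrontCacheExample {

    // Fast local memory on this cluster member.
    static final Map<String, String> frontCache = new ConcurrentHashMap<>();

    // Stands in for data held on remote partition owners.
    static final Map<String, String> distributedTier = new ConcurrentHashMap<>();

    static String get(String key) {
        // Fast path: local memory, no network hop.
        String value = frontCache.get(key);
        if (value == null) {
            // Slow path: fetch from the remote owner, then cache locally.
            value = distributedTier.get(key);
            if (value != null) {
                frontCache.put(key, value);
            }
        }
        return value;
    }

    // Coherence event: another member updated the key, so drop the local copy.
    static void invalidate(String key) {
        frontCache.remove(key);
    }

    public static void main(String[] args) {
        distributedTier.put("k", "v1");
        System.out.println(get("k")); // remote fetch, then cached locally
        distributedTier.put("k", "v2");
        invalidate("k");              // invalidation keeps the front cache coherent
        System.out.println(get("k")); // sees v2, not the stale v1
    }
}
```

Without the invalidation step, the front cache would keep serving the stale value; this is exactly the consistency problem that a coherent distributed caching system handles for the application.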
Java web applications often use distributed caching to cache the following data:
Once the opportunities for vertical scalability are exhausted, distributed caching allows Java web applications to scale horizontally.