Tuesday, June 23, 2009

Cache Management in Java Applications

Introduction
The increasing volume of business-critical data brings new challenges when developing an application in Java. The application's data is stored in data sources such as a database or a legacy file system.
Frequent access to the same object adds overhead and hurts an application’s performance. An application may slow down or even crash if it receives too many simultaneous requests. A caching mechanism can address these challenges.
This article explains ways to improve the performance, concurrency and scalability of a Java application through caching.

Caching Mechanism
A cache is an area of local memory that holds a copy of application data that is expensive to retrieve from a data source such as a database or legacy system. Cached data can include the result of a database query, a disk file or a report.
Cached data is identified by a unique key. The caching mechanism works according to the following simple algorithm:
  1. An application requests data from the cache using a unique key.
    If that key is present in the cache, the cache returns the value for that key.

  2. Otherwise, the application retrieves the data from the data source and puts it into the cache as a key-value pair.
  3. The next request for the same key is served from the cache instead of the data source.

  4. The cached data is released from memory when it’s no longer required.
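
The steps above can be sketched in Java as follows. This is a minimal illustration rather than a production cache: the class name `SimpleCache`, the `loadFromDataSource` method standing in for a database or legacy-system call, and the explicit `remove` method for step 4 are all assumptions made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; all names are illustrative only.
class SimpleCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger dataSourceHits = new AtomicInteger(); // counts slow-path loads

    // Stand-in for an expensive database or legacy-system call (assumption).
    private String loadFromDataSource(String key) {
        dataSourceHits.incrementAndGet();
        return "value-for-" + key;
    }

    String get(String key) {
        // Step 1: look the key up in the cache.
        String value = cache.get(key);
        if (value == null) {
            // Step 2: on a miss, read from the data source and store the key-value pair.
            value = loadFromDataSource(key);
            cache.put(key, value);
        }
        // Step 3: later requests for the same key are answered without a reload.
        return value;
    }

    // Step 4: release an entry that is no longer required.
    void remove(String key) {
        cache.remove(key);
    }

    int getDataSourceHits() {
        return dataSourceHits.get();
    }

    public static void main(String[] args) {
        SimpleCache c = new SimpleCache();
        c.get("customer:42");
        c.get("customer:42"); // served from the cache
        System.out.println("Data source hits: " + c.getDataSourceHits());
    }
}
```

Note that the check-then-put in `get` is not atomic, so two threads that miss at the same moment may both load the value. Where that matters, `ConcurrentHashMap.computeIfAbsent` performs the lookup and load as a single atomic step per key.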


In Java, a Hashtable, JNDI or EJB provides a way to store an object in memory and look it up by a unique key when it is required.
But these implementations provide no algorithm for removing an object from memory when it’s no longer required, or for automatically re-creating an object after it expires.
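
To illustrate what a plain map lacks, here is a minimal sketch of time-based expiration layered on top of one. The class name `ExpiringCache`, the time-to-live parameter and the injectable clock are assumptions made for this example; the caching frameworks listed later in this article handle expiry in their own, more sophisticated ways.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical example: a map-backed cache whose entries expire after a fixed time-to-live.
class ExpiringCache<K, V> {
    // Pairs a value with the time at which it stops being valid.
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested without sleeping

    ExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    ExpiringCache(long ttlMillis) {
        this(ttlMillis, System::currentTimeMillis);
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns the cached value, or null if the key is absent or its entry has expired.
    V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (clock.getAsLong() >= e.expiresAtMillis) {
            entries.remove(key, e); // drop the stale entry only if it is still the one we read
            return null;
        }
        return e.value;
    }

    int size() {
        return entries.size();
    }
}
```

In this sketch expiry is lazy: a stale entry is dropped only when it is next requested, and the caller is expected to reload the value when `get` returns null. A background sweep would be needed to reclaim entries that are never requested again.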
The following diagram illustrates the general structure of a cache management system.



Figure 1. General structure of Cache Management System

Benefits
  • It reduces frequent accesses to the database or other data sources, such as XML databases or ERP legacy systems

  • It avoids the cost of repeatedly re-creating objects

  • It shares objects between threads in a process and between processes

  • It frees up valuable system hardware and software resources by distributing data across an enterprise rather than storing it in one centralized place such as the data tier

  • Locally stored data reduces latency, lowers operating costs and helps eliminate performance bottlenecks


Risks in caching
  • Caching may consume more memory in the application server

  • It may lead to data inaccuracies

  • An inappropriate cache management algorithm degrades application performance


Data that shouldn’t be cached
  • Secure information that other users can access on a website

  • Personal information, such as account details

  • Business information that changes frequently and causes problems if not up-to-date and accurate

  • Session-specific data that may not be intended for access by other users


Use case scenario
  1. Application cache
    In an application cache, the application accesses data directly from the cache if it is present; otherwise the application retrieves the data from a data source and stores it in local memory for future use.
    The drawback is that the application itself must implement the algorithm for removing objects.
    Figure 1 illustrates the general structure of an application cache.


  2. Level 2 cache
    This provides caching services to an object-mapping framework such as Hibernate or a data-mapping framework such as iBatis.
    This type of cache hides the complexity of the caching logic from the application.

    The following diagram illustrates the general structure of a level 2 cache management system.

    Figure 2. General structure of Level 2 Cache Management System


  3. Hybrid Cache
    A hybrid cache is a cache that uses an external data source to retrieve data that is not present in the cache. An application using a hybrid cache benefits from simplified programming of cache access.
    For example, Oracle has a caching mechanism.

    The following diagram illustrates the general structure of a hybrid cache management system.

    Figure 3. General structure of Hybrid Cache Management System
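
The hybrid cache idea described above, in which the cache itself fetches missing data from the external data source, can be sketched as a read-through cache. This is an illustration under stated assumptions, not a description of how Oracle or any other product implements it: the class name `ReadThroughCache` and the `loader` function standing in for the data source are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical example: the cache owns the loader, so callers never touch the data source directly.
class ReadThroughCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stand-in for the external data source (assumption)

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, loading it through the loader on a miss.
    // computeIfAbsent applies the loader atomically, at most once per absent key.
    V get(K key) {
        return store.computeIfAbsent(key, loader);
    }

    // Drops an entry so that the next get() reloads it from the data source.
    void invalidate(K key) {
        store.remove(key);
    }
}
```

The simplification for the application is that it only ever calls `get`; whether the value came from memory or from the data source is hidden behind the cache.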


Caching Algorithms
A cache uses heap memory to hold application data. If the data is not used for a long time, keeping it in the cache is inefficient. Because cache capacity is limited, objects are removed from the cache based on criteria such as:
  • Least frequently used (LFU)

  • Least recently used (LRU)

  • Most recently used (MRU)

  • First in first out (FIFO)

  • Last access time

  • Object size
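
As one example of these criteria, least recently used (LRU) eviction can be sketched with the standard `java.util.LinkedHashMap`, which supports access ordering and an eviction hook. The class name `LruCache` and the capacity parameter are assumptions made for this example; the frameworks listed below provide their own eviction implementations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: a fixed-capacity LRU cache built on LinkedHashMap.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true orders entries from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds capacity, so the least recently used entry is evicted
        System.out.println(cache.keySet());
    }
}
```

Passing `false` as the third constructor argument keeps insertion order instead, which turns the same class into a FIFO cache. `LinkedHashMap` is not thread-safe, so a shared instance would need external synchronization, for example via `Collections.synchronizedMap`.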


Caching Frameworks
Here are some of the open source and commercial frameworks available in the market.
Open Source:
  • Java Caching System (JCS)

  • OSCache

  • Java Object Cache (JOCache)

  • Java Caching Service, an open source implementation of the JCache API

  • SwarmCache

  • JBossCache

  • EHCache

  • ShiftOne

  • cache4j

  • EclipseLink



Commercial:
  • SpiritCache (from SpiritSoft)

  • Coherence (Oracle)

  • ObjectCache (ObjectStore)

  • Object Caching Service for Java (Oracle)


Conclusion
Effectively designed caching improves an application's performance, scalability and availability. A poorly designed cache, on the other hand, degrades performance: the application pays the overhead of maintaining the cache without benefiting from cheaper access to frequently used data.
To avoid the pitfall of caching everything, cache only data that is hard to get from a data source.