Advanced Spring Boot Caching: Build a Two-Layer Redis and Caffeine Strategy

Learn advanced Spring Boot caching with Caffeine and Redis, using L1/L2 layers, TTL tuning, and cache eviction to cut latency fast.

This morning, I watched one of our main APIs start to slow down. The database was groaning under a simple, repeated query—the kind of thing that should be fast. It hit me again: the quickest data to fetch is the data you already have. But what does “have” really mean in a system that runs in ten places at once? I want to talk about moving beyond the basics of caching. Let’s build something smarter.

Most of us start with a simple cache. We add an annotation, point it to Redis, and call it a day. It works, until your application needs to run on more than one server. Suddenly, every cache call, even for the same piece of data fetched a millisecond ago on another instance, requires a trip over the network. This adds up. You might ask, isn’t the remote cache supposed to solve latency? It does, but we can do better.

What if we could have the best of both worlds? Imagine a system where the first, incredibly fast check happens right inside your application’s memory. If the data isn’t there, then it goes to the shared, remote cache. This is a two-layer approach. The first layer, often called L1, is local. The second layer, L2, is distributed. The goal is to serve as many requests as possible from the blindingly fast local store, only falling back to the network when absolutely necessary.

Setting this up in Spring Boot means thinking about how the two TTLs coordinate. A short local TTL is cheap to get wrong: if my local cache holds data for 2 minutes and Redis holds it for 10, a local miss at minute 3 simply falls through to Redis and repopulates. The dangerous direction is the reverse. If the local copy outlives the remote one, an instance keeps serving an entry long after Redis has expired and refreshed it. So the local time-to-live should be a fraction of the remote one. That keeps the local cache a true, hot copy of the remote one, not a separate entity.

Let’s look at how you might configure the local cache, using Caffeine. It’s powerful and integrates well with Spring.

@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager caffeineCacheManager = new CaffeineCacheManager();
        caffeineCacheManager.setCaffeine(Caffeine.newBuilder()
            .initialCapacity(100)
            .maximumSize(500)
            .expireAfterWrite(Duration.ofMinutes(2)) // L1 TTL
            .recordStats());
        return caffeineCacheManager;
    }
}
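The Redis side, L2, gets its own manager with the longer TTL. Here is a sketch using the article's numbers (a 10-minute remote TTL against the 2-minute local one); the class and bean names are ours, and in a real application you would compose the two managers into one layered manager rather than register two competing CacheManager beans:

```java
import java.time.Duration;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.cache.RedisCacheConfiguration;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.RedisSerializationContext;

@Configuration
public class RedisCacheConfig {

    @Bean
    public RedisCacheManager redisCacheManager(RedisConnectionFactory connectionFactory) {
        RedisCacheConfiguration defaults = RedisCacheConfiguration.defaultCacheConfig()
            // L2 TTL: the 2-minute L1 TTL is a fraction of this 10-minute window
            .entryTtl(Duration.ofMinutes(10))
            // store values as JSON rather than JDK-serialized bytes
            .serializeValuesWith(RedisSerializationContext.SerializationPair
                .fromSerializer(new GenericJackson2JsonRedisSerializer()));
        return RedisCacheManager.builder(connectionFactory)
            .cacheDefaults(defaults)
            .build();
    }
}
```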

Now, we need to combine this with a Redis cache manager. We don’t want to replace it; we want to layer it. We create a custom manager that checks the local store first. Only on a local miss does it call upon Redis. This requires a bit more code, but the logic is straightforward: check L1, then L2, then the database.
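That check-L1-then-L2 logic can be sketched as a composite implementation of Spring's Cache interface. The class name TwoLevelCache and the composition below are our own illustration, not a built-in Spring type:

```java
import java.util.concurrent.Callable;

import org.springframework.cache.Cache;

// Reads probe the in-memory L1 first; only a miss goes over the network to L2.
public class TwoLevelCache implements Cache {

    private final Cache l1; // Caffeine-backed, short TTL
    private final Cache l2; // Redis-backed, longer TTL

    public TwoLevelCache(Cache l1, Cache l2) {
        this.l1 = l1;
        this.l2 = l2;
    }

    @Override
    public String getName() {
        return l1.getName();
    }

    @Override
    public Object getNativeCache() {
        return this;
    }

    @Override
    public ValueWrapper get(Object key) {
        ValueWrapper hit = l1.get(key);      // 1. local memory
        if (hit != null) {
            return hit;
        }
        hit = l2.get(key);                   // 2. shared Redis
        if (hit != null) {
            l1.put(key, hit.get());         // promote so the next read is local
        }
        return hit;                          // 3. null -> caller loads from the database
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Object key, Class<T> type) {
        ValueWrapper hit = get(key);
        return hit == null ? null : (T) hit.get();
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Object key, Callable<T> valueLoader) {
        ValueWrapper hit = get(key);
        if (hit != null) {
            return (T) hit.get();
        }
        try {
            T loaded = valueLoader.call();   // cache miss on both layers: load it
            put(key, loaded);
            return loaded;
        } catch (Exception ex) {
            throw new Cache.ValueRetrievalException(key, valueLoader, ex);
        }
    }

    @Override
    public void put(Object key, Object value) {
        l2.put(key, value); // write-through: remote first,
        l1.put(key, value); // then local
    }

    @Override
    public ValueWrapper putIfAbsent(Object key, Object value) {
        ValueWrapper existing = get(key);
        if (existing == null) {
            put(key, value);
        }
        return existing;
    }

    @Override
    public void evict(Object key) {
        l2.evict(key);
        l1.evict(key);
    }

    @Override
    public void clear() {
        l2.clear();
        l1.clear();
    }
}
```

A custom CacheManager would then wrap each Caffeine cache with its Redis counterpart in one of these, so `@Cacheable` methods get the L1-then-L2 behavior transparently.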

But here’s a tricky question: what happens when data changes? If I update a user’s profile on one application instance, that instance clears its local cache and updates Redis. However, a dozen other instances still have the old profile sitting in their local memory. They won’t know to clear it. How do we keep everyone in sync?

This is where distributed messaging comes in. When an instance updates a cached piece of data, it can publish an event. All other instances listen for this event and evict the stale data from their local caches. Redis has a built-in system for this called Pub/Sub.

@Service
public class CacheEvictionService implements MessageListener {

    private final CacheManager localCacheManager;
    private final StringRedisTemplate redisTemplate;

    public CacheEvictionService(CacheManager localCacheManager,
                                StringRedisTemplate redisTemplate) {
        this.localCacheManager = localCacheManager;
        this.redisTemplate = redisTemplate;
    }

    public void evictUser(Long userId) {
        // 1. Clear from the local cache on *this* instance
        Cache users = localCacheManager.getCache("users");
        if (users != null) {
            users.evict(userId);
        }

        // 2. Publish an event to tell other instances.
        // The key travels as a string, so cache keys should be strings
        // too (or be converted back on the receiving side).
        redisTemplate.convertAndSend("cache-eviction", "users:" + userId);
    }

    // 3. Other instances receive the message over Redis Pub/Sub and clear
    // their local copy. Note that Spring's @EventListener only handles
    // in-process application events; this method must be registered with
    // a RedisMessageListenerContainer to receive Redis messages.
    @Override
    public void onMessage(Message message, byte[] pattern) {
        String[] parts = new String(message.getBody()).split(":", 2); // e.g. "users:42"
        Cache cache = parts.length == 2 ? localCacheManager.getCache(parts[0]) : null;
        if (cache != null) {
            cache.evict(parts[1]);
        }
    }
}
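The subscription side needs explicit wiring, because nothing listens on the "cache-eviction" channel by default. A minimal sketch, with our own bean and class names, using an inline listener that follows the same cacheName:key payload convention (registering the CacheEvictionService itself works equally well once it implements Spring Data Redis's MessageListener):

```java
import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.listener.ChannelTopic;
import org.springframework.data.redis.listener.RedisMessageListenerContainer;

@Configuration
public class CacheEvictionListenerConfig {

    @Bean
    public RedisMessageListenerContainer evictionListenerContainer(
            RedisConnectionFactory connectionFactory,
            CacheManager localCacheManager) {
        RedisMessageListenerContainer container = new RedisMessageListenerContainer();
        container.setConnectionFactory(connectionFactory);
        // Every instance subscribes, so one publish evicts the key everywhere.
        container.addMessageListener((message, pattern) -> {
            String[] parts = new String(message.getBody()).split(":", 2); // "users:42"
            Cache cache = parts.length == 2 ? localCacheManager.getCache(parts[0]) : null;
            if (cache != null) {
                cache.evict(parts[1]);
            }
        }, new ChannelTopic("cache-eviction"));
        return container;
    }
}
```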

It adds complexity, but for data that changes often, eviction broadcasting is usually necessary. For data that is mostly read, like a list of countries, you might skip it and rely on the short TTLs alone.

Another challenge is serialization. Java objects need to be turned into bytes for Redis. The default Java serialization is slow and creates large payloads. A better choice is JSON. We can configure Spring Data Redis to use Jackson.

@Configuration
public class RedisConfig {

    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory connectionFactory) {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(connectionFactory);
        template.setKeySerializer(new StringRedisSerializer()); // human-readable keys in redis-cli
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer()); // JSON instead of JDK serialization
        return template;
    }
}

For very large objects, you can even add compression. Before sending the JSON bytes to Redis, compress them. When reading them back, decompress. This saves network bandwidth and memory in Redis, at the cost of a small amount of CPU time. It’s a classic trade-off.
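One way to sketch that compression step is a plain helper around java.util.zip; GzipUtil is our own name, not a Spring or Redis class, and a custom RedisSerializer could call these two methods around the JSON serialization:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class GzipUtil {

    private GzipUtil() {
    }

    // Compress the serialized JSON bytes before they are written to Redis.
    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decompress bytes read back from Redis before JSON deserialization.
    public static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

GZIP pays off on large, repetitive payloads; for small values the gzip header overhead can make the stored bytes larger, so it is worth measuring before enabling it everywhere.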

Finally, how do we know this is all working? We need to measure. Both Caffeine and Micrometer (Spring Boot’s metrics library) can provide statistics. How many hits are we getting on the local cache versus the remote one? What’s the eviction rate? Monitoring these numbers tells you if your cache sizes and TTLs are set correctly.
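For the Caffeine side, Micrometer ships a binder that turns the counters enabled by recordStats() into meters. A minimal standalone sketch, assuming micrometer-core is on the classpath (Spring Boot's actuator binds caches managed by Spring automatically, so a manual call like this is mainly for caches you build yourself):

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.binder.cache.CaffeineCacheMetrics;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class CacheMetricsExample {

    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        Cache<String, String> users = Caffeine.newBuilder()
            .maximumSize(500)
            .recordStats() // without this, all cache metrics read as zero
            .build();

        // Binds meters such as cache.gets (tagged hit/miss), cache.puts,
        // cache.evictions, and cache.size under the name "users".
        CaffeineCacheMetrics.monitor(registry, users, "users");

        users.put("42", "profile");
        users.getIfPresent("42"); // counted as a hit
        users.getIfPresent("7");  // counted as a miss
    }
}
```

Comparing the local hit rate against the Redis hit rate is what tells you whether the L1 TTL and size are pulling their weight.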

Caching is a tool, not a magic wand. A poorly designed cache can hide performance problems or, worse, introduce stale data bugs. But a thoughtful, multi-layered strategy is a cornerstone of high-performance systems. It respects the simple truth: time spent waiting for data is time wasted.

I hope this walkthrough gives you a practical starting point. Have you encountered a caching problem that seemed unsolvable? What strategies did you use? Share your thoughts in the comments below—let’s learn from each other’s battles. If you found this guide helpful, please like and share it with a colleague who might be facing a similar scaling challenge.
