Caching And Load Balancing

Caching

What is it?

Caching is the process of storing copies of data in temporary storage (called a cache) so that future requests for that data can be served faster.

Why use it?

To reduce the time it takes to retrieve frequently accessed data and decrease the load on your backend systems (like databases or APIs).

Real-world analogy:

Think of it like saving a shortcut to your favorite app on your phone’s home screen—you don’t have to dig through menus every time.

Example:

  • A web browser caches images and files from websites you visit, so the next time you visit, it loads faster.
  • A server might cache database results so it doesn’t have to run the same expensive query again.

Cache Hit

Occurs when the requested data is found in the cache.

It improves performance since the system doesn't need to fetch the data from the original (usually slower) source.

Example: A webpage loads faster the second time because assets are already cached in the browser.

Cache Miss

Happens when the requested data is not in the cache.

The system then fetches the data from the main source (e.g., database, server) and usually stores it in the cache for future use.

Impact: Slower response time compared to a cache hit.
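
As a rough sketch of this flow (often called the cache-aside pattern), the Python snippet below uses a plain dict as the cache and a hypothetical fetch_from_database function standing in for the slow source: a hit is served from memory, while a miss falls through to the source and populates the cache for next time.

```python
cache = {}

def fetch_from_database(key):
    # Hypothetical stand-in for an expensive query against the real data source.
    return f"value-for-{key}"

def get(key):
    if key in cache:                      # cache hit: serve directly from memory
        return cache[key]
    value = fetch_from_database(key)      # cache miss: go to the slower source
    cache[key] = value                    # store it so the next request is a hit
    return value
```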

Cache Eviction

When the cache is full, some data needs to be removed to make space for new data.

This removal process is called eviction, and it's usually based on eviction policies.

Common Eviction Policies:

  1. LRU (Least Recently Used): Evicts the data that hasn’t been accessed for the longest time (see the sketch after this list).

  2. LFU (Least Frequently Used): Removes the data that’s accessed the least.

  3. FIFO (First In, First Out): Evicts the oldest data added to the cache.
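
To make LRU concrete, here is a minimal sketch built on Python's collections.OrderedDict; the capacity of 2 is arbitrary, chosen only so an eviction is visible in the usage lines at the end.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: evicts the least recently used key when capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # cache miss
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" is now the most recently used
cache.put("c", 3)     # capacity exceeded, so "b" is evicted instead of "a"
```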

When to Use Cache

Caching is ideal when:

Data is Static or Rarely Changes

Why? Because re-fetching the same data repeatedly from the backend or database is wasteful if the data doesn't change often.

Examples:

  • Product listings that update once a day.

  • Static content like blog posts, FAQs, category lists.

  • Configuration files or global settings.

Benefits:

  • Improved Performance: Faster response times since data is served directly from memory or edge cache.

  • Reduced Server Load: Fewer database or API calls, leading to lower operational costs.

  • Better User Experience: Especially important for high-traffic applications.

Avoid Caching When:
  • Data changes frequently or is user-specific (e.g., banking balance, live scores).
  • Real-time accuracy is critical.

Write-Through vs Write-Back Cache

| Feature | Write-Through Cache | Write-Back Cache |
| --- | --- | --- |
| Definition | Data is written to the cache and to the database at the same time. | Data is written only to the cache first, and written to the database later (lazily). |
| Data Consistency | High consistency between cache and DB | Possible inconsistency if data isn’t flushed in time |
| Performance | Slightly slower (due to the double write) | Faster writes (cache only, initially) |
| Risk | Low risk of data loss | Higher risk if the cache fails before the write-back |
| Use Case | Good for read-heavy systems with critical writes | Good for write-heavy systems that can tolerate eventual consistency |
| Example | User profile update in a banking app | Logging system or analytics buffer |

Write-Through Cache

  • Every time you write, both the cache and database are updated.
  • Ensures that the cache always has the latest data (see the sketch below).
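
A minimal write-through sketch, with plain dicts standing in for the cache and the database; the function names are made up for illustration.

```python
cache = {}
database = {}

def write_through(key, value):
    cache[key] = value      # update the cache...
    database[key] = value   # ...and the database in the same operation

def read(key):
    if key in cache:
        return cache[key]   # always fresh, because writes go through the cache
    return database.get(key)
```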

Write-Back Cache (a.k.a. Write-Behind)

  • Data is first written to the cache only.
  • Actual write to the database is deferred or batched.
  • Requires a background process to flush changes (see the sketch below).
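
A matching write-back sketch under the same dict-based assumptions: writes land only in the cache and are marked dirty, and a separate flush step (standing in for the background process) pushes them to the database later.

```python
cache = {}
database = {}
dirty_keys = set()   # keys written to the cache but not yet persisted

def write_back(key, value):
    cache[key] = value
    dirty_keys.add(key)              # defer the database write

def flush():
    """Background job: push all pending writes to the database in one batch."""
    for key in list(dirty_keys):
        database[key] = cache[key]
    dirty_keys.clear()
```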

What is Write-Around Cache?

Write-Around Cache is a caching strategy where writes are made directly to the database, bypassing the cache entirely. The cache is only updated when a read occurs after the write.

| Feature | Write-Around Cache |
| --- | --- |
| Write Behavior | Writes go directly to the DB, not to the cache |
| Cache Update | Happens on the next read if the data isn’t in the cache |
| Read Misses | Higher chance of cache misses immediately after a write |
| Data Freshness | Cache might be stale until a read refreshes it |
| Use Case | Useful when data is infrequently read after being written |

When to Use Write-Around Cache

  • When most written data is not immediately read.

  • To reduce cache churn for write-heavy but read-light scenarios.

  • If you’re okay with slightly stale reads on first access.
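
A write-around sketch under the same dict-based assumptions: writes go straight to the database (this variant also drops any stale cached copy), and the cache is populated only when the data is read afterwards.

```python
cache = {}
database = {}

def write_around(key, value):
    database[key] = value    # write goes directly to the database
    cache.pop(key, None)     # drop any stale cached copy, if present

def read(key):
    if key in cache:
        return cache[key]             # hit
    value = database.get(key)         # miss, likely right after a write
    if value is not None:
        cache[key] = value            # populate the cache on read
    return value
```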

Comparison:

| Strategy | Write to Cache | Write to DB | Read Flow |
| --- | --- | --- | --- |
| Write-Through | ✅ Yes | ✅ Yes | Always hits fresh cache |
| Write-Back | ✅ Yes | 🔄 Later | Fast write, delayed DB sync |
| Write-Around | ❌ No | ✅ Yes | Cache updated only on next read |

What is Cache Coherency?

Cache Coherency (or Cache Consistency) is the concept of keeping multiple copies of data in sync across different caches or between cache and the main memory (or database). It ensures that every reader sees the most recent write, regardless of where it's reading from.

Why It Matters

In distributed systems or multi-core CPUs, multiple caches may store the same data. If one cache updates the data, other caches must reflect that change to avoid stale reads. Cache coherency prevents:

  • Reading outdated data
  • Data conflicts
  • Unexpected behavior in concurrent systems

How to Maintain Cache Coherency
  • Write-through cache: Ensures DB and cache are always in sync.
  • Eviction and invalidation policies: Remove or update stale entries on change.
  • Pub/Sub or event systems: Notify all caches when a change occurs (see the sketch after this list).
  • Distributed cache frameworks: For example, Redis offers replication and invalidation features that help keep cached copies in sync.
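
As one way to picture event-based invalidation, the sketch below uses Redis Pub/Sub via the redis-py client: a writer publishes the changed key on an "invalidations" channel (a name made up for this example), and each application node listening on that channel drops the key from its local in-process cache.

```python
import redis

r = redis.Redis()    # assumes a Redis server on localhost:6379
local_cache = {}     # this node's in-process cache

def update_record(key, new_value):
    # Writer: persist the change to the primary store (omitted here),
    # then tell every other cache that this key is now stale.
    r.publish("invalidations", key)

def listen_for_invalidations():
    # Each application node runs this loop to keep its local cache coherent.
    pubsub = r.pubsub()
    pubsub.subscribe("invalidations")
    for message in pubsub.listen():
        if message["type"] == "message":
            stale_key = message["data"].decode()
            local_cache.pop(stale_key, None)   # drop the stale local copy
```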

Summary

| Term | Meaning |
| --- | --- |
| Cache Coherency | Keeping all cached copies of a data item consistent across systems |
| When Needed | Multi-cache setups, distributed apps, multi-threaded environments |
| Common Fixes | Invalidation, TTLs, write-through, event-based updates, strong consistency models |

Redis vs Memcached – Comparison Table

| Feature / Criteria | Redis | Memcached |
| --- | --- | --- |
| Data Types | Strings, Lists, Sets, Hashes, Sorted Sets, Streams, Bitmaps, HyperLogLogs | Strings only |
| Persistence | ✅ Supports persistence (RDB & AOF) | ❌ No persistence |
| Replication | ✅ Master-slave replication | ❌ No built-in replication |
| Cluster Support | ✅ Redis Cluster for sharding | ✅ Limited support via client-side sharding |
| Pub/Sub | ✅ Built-in Pub/Sub support | ❌ Not supported |
| Performance | ⚡ High (slightly slower than Memcached for pure string operations) | ⚡ Very high (especially for simple key-value workloads) |
| Memory Efficiency | Slightly more memory-heavy (due to rich data types) | Very efficient memory usage |
| Eviction Policy | ✅ Multiple eviction strategies (LRU, LFU, TTL, etc.) | ✅ LRU supported |
| Use Case Fit | Advanced caching, real-time analytics, counters, queues, sessions | Simple key-value caching |
| Transactions | ✅ Supports transactions via MULTI/EXEC | ❌ Not supported |
| Atomic Operations | ✅ Rich atomic operations on data types | ✅ Basic atomic operations |
| Data Size per Key | Up to 512 MB | 1 MB (can be configured higher) |
| License | BSD Open Source | BSD Open Source |

When to Use Redis
  • You need data persistence along with caching.
  • You want to store complex data structures (e.g., lists, sets).
  • You need Pub/Sub messaging.
  • You need real-time counters, leaderboards, or task queues.
  • You require clustering and replication.

When to Use Memcached
  • You need extremely fast and lightweight caching for simple strings.
  • You have limited memory and want the most efficient cache.
  • You don’t need persistence or complex data types.
  • Your cache is short-lived and only used to relieve DB read pressure.
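
As a small illustration of Redis used as a cache with a TTL (via the redis-py client), the snippet below assumes a Redis server is running locally on the default port; the key name and the 60-second expiry are arbitrary choices for this example.

```python
import redis

r = redis.Redis()   # assumes a local Redis server on the default port 6379

# Cache a computed value and let Redis expire it after 60 seconds.
r.set("product:42:details", '{"name": "Widget", "price": 9.99}', ex=60)

value = r.get("product:42:details")   # returns bytes, or None once expired
if value is None:
    # Cache miss: recompute or re-query the source, then repopulate the cache.
    pass
```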

Load Balancing

This section covers three related concepts:

  1. Forward proxy
  2. Reverse Proxy
  3. Load balancing

Forward Proxy (Client-Side Proxy)
  • Used by clients to access external networks.
  • Hides the client’s identity from the server.
  • Helps enforce access policies, filter traffic, and cache responses.

Example: A school computer lab uses a forward proxy to filter websites and restrict access to social media.

Reverse Proxy (Server-Side Proxy)
  • Used by servers to receive and manage incoming client requests.

  • Hides the actual backend servers from clients.

  • Can perform:

    • Load balancing
    • SSL termination
    • Caching & compression
    • Application firewalling

Example: A company’s website routes all user traffic through Nginx as a reverse proxy, which forwards the requests to one of several backend servers.

Load Balancing: What is it?

Load balancing is the process of distributing incoming traffic across multiple servers to ensure no single server gets overwhelmed.

Why use it?

To improve system availability, scalability, and fault tolerance.

Real-world analogy:

Imagine a popular restaurant with multiple waiters. If all customers go to one waiter, service slows down. A host at the door (the load balancer) directs each new customer to the waiter with the least work.

Example:

  • A website that gets millions of visitors daily might use a load balancer to split traffic between five web servers, so none of them crashes or slows down.

How They Work Together

  • Caching helps speed up access to frequently used data.
  • Load balancing makes sure user traffic is evenly spread across multiple systems.

Used together, they help websites and apps run faster, smoother, and more reliably.

Load Balancer (a Specialized Reverse Proxy)
  • Type of reverse proxy focused specifically on distributing traffic.
  • Ensures:
    • High availability
    • Redundancy
    • Scalability
  • Supports strategies like:
    • Round Robin
    • Least Connections
    • IP Hash

Example: An e-commerce site uses a load balancer to distribute traffic between multiple backend application servers during peak sales.
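
To make the strategies concrete, here is a minimal sketch of Round Robin and IP Hash selection in Python; the backend addresses are made up for the example, and a real load balancer would also track server health and active connections.

```python
import hashlib
import itertools

# Hypothetical pool of backend servers behind the load balancer.
backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
round_robin = itertools.cycle(backends)

def pick_backend_round_robin():
    """Round Robin: hand each new request to the next server in turn."""
    return next(round_robin)

def pick_backend_ip_hash(client_ip):
    """IP Hash: requests from the same client IP always reach the same backend."""
    index = int(hashlib.md5(client_ip.encode()).hexdigest(), 16) % len(backends)
    return backends[index]

for request_id in range(4):
    print(request_id, "->", pick_backend_round_robin())

print("203.0.113.7 ->", pick_backend_ip_hash("203.0.113.7"))
```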

Forward Proxy vs Reverse Proxy vs Load Balancer

| Feature | Forward Proxy | Reverse Proxy | Load Balancer |
| --- | --- | --- | --- |
| Position in Network | Between the client and external servers | Between external clients and internal servers | Sits between clients and multiple backend servers |
| Purpose | Acts on behalf of the client to access outside resources | Acts on behalf of the server to handle client requests | Distributes incoming traffic across multiple backend servers |
| Client Awareness | Clients know they are using a proxy | Clients do not know they are using a reverse proxy | Clients are typically unaware of the load-balancing mechanics |
| Use Cases | Access control, caching, anonymity, monitoring client traffic | SSL termination, caching, compression, application firewall | Scaling web servers, high availability, fault tolerance |
| Common Tools | Squid, Privoxy | Nginx, Apache, HAProxy | HAProxy, Nginx, AWS ELB, Google Load Balancer |