Caching And Load Balancing

Caching

What is it?

Caching is the process of storing copies of data in temporary storage (called a cache) so that future requests for that data can be served faster.

Why use it?

To reduce the time it takes to retrieve frequently accessed data and decrease the load on your backend systems (like databases or APIs).

Real-world analogy:

Think of it like saving a shortcut to your favorite app on your phone’s home screen—you don’t have to dig through menus every time.

Example:

  • A web browser caches images and files from websites you visit, so the next time you visit, it loads faster.
  • A server might cache database results so it doesn’t have to run the same expensive query again.

Cache Hit

Occurs when the requested data is found in the cache.

It improves performance since the system doesn't need to fetch the data from the original (usually slower) source.

Example: A webpage loads faster the second time because assets are already cached in the browser.

Cache Miss

Happens when the requested data is not in the cache.

The system then fetches the data from the main source (e.g., database, server) and usually stores it in the cache for future use.

Impact: Slower response time compared to a cache hit.
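
As a rough sketch of this flow (often called the cache-aside pattern), the Python snippet below uses a plain dict as the cache and a hypothetical fetch_from_database function standing in for the slow source: a hit is served from memory, while a miss falls through to the source and populates the cache for next time.

```python
cache = {}

def fetch_from_database(key):
    # Hypothetical stand-in for an expensive query against the real data source.
    return f"value-for-{key}"

def get(key):
    if key in cache:                      # cache hit: serve directly from memory
        return cache[key]
    value = fetch_from_database(key)      # cache miss: go to the slower source
    cache[key] = value                    # store it so the next request is a hit
    return value
```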

Cache Eviction

When the cache is full, some data needs to be removed to make space for new data.

This removal process is called eviction, and it's usually based on eviction policies.

Common Eviction Policies:

  1. LRU (Least Recently Used): Evicts the data that hasn’t been accessed for the longest time (see the sketch after this list).

  2. LFU (Least Frequently Used): Removes the data that’s accessed the least.

  3. FIFO (First In, First Out): Evicts the oldest data added to the cache.
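
To make LRU concrete, here is a minimal sketch built on Python's collections.OrderedDict; the capacity of 2 is arbitrary, chosen only so an eviction is visible in the usage lines at the end.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: evicts the least recently used key when capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # cache miss
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" is now the most recently used
cache.put("c", 3)     # capacity exceeded, so "b" is evicted instead of "a"
```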

When to Use Cache

Caching is ideal when:

Data is Static or Rarely Changes

Why? Because re-fetching the same data repeatedly from the backend or database is wasteful if the data doesn't change often.

Examples:

  • Product listings that update once a day.

  • Static content like blog posts, FAQs, category lists.

  • Configuration files or global settings.

Benefits:

  • Improved Performance: Faster response times since data is served directly from memory or edge cache.

  • Reduced Server Load: Fewer database or API calls, leading to lower operational costs.

  • Better User Experience: Especially important for high-traffic applications.

Avoid Caching When:
  • Data changes frequently or is user-specific (e.g., banking balance, live scores).
  • Real-time accuracy is critical.

Write-Through vs Write-Back Cache

| Feature | Write-Through Cache | Write-Back Cache |
| --- | --- | --- |
| Definition | Data is written to the cache and to the database at the same time. | Data is written only to the cache first, and written to the database later (lazily). |
| Data Consistency | High consistency between cache and DB | Possible inconsistency if data isn’t flushed in time |
| Performance | Slightly slower (due to the double write) | Faster writes (cache only, initially) |
| Risk | Low risk of data loss | Higher risk if the cache fails before the write-back |
| Use Case | Good for read-heavy systems with critical writes | Good for write-heavy systems that can tolerate eventual consistency |
| Example | User profile update in a banking app | Logging system or analytics buffer |

Write-Through Cache

  • Every time you write, both the cache and database are updated.
  • Ensures that the cache always has the latest data (see the sketch below).
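
A minimal write-through sketch, with plain dicts standing in for the cache and the database; the function names are made up for illustration.

```python
cache = {}
database = {}

def write_through(key, value):
    cache[key] = value      # update the cache...
    database[key] = value   # ...and the database in the same operation

def read(key):
    if key in cache:
        return cache[key]   # always fresh, because writes go through the cache
    return database.get(key)
```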

Write-Back Cache (a.k.a. Write-Behind)

  • Data is first written to the cache only.
  • Actual write to the database is deferred or batched.
  • Requires a background process to flush changes (see the sketch below).
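
A matching write-back sketch under the same dict-based assumptions: writes land only in the cache and are marked dirty, and a separate flush step (standing in for the background process) pushes them to the database later.

```python
cache = {}
database = {}
dirty_keys = set()   # keys written to the cache but not yet persisted

def write_back(key, value):
    cache[key] = value
    dirty_keys.add(key)              # defer the database write

def flush():
    """Background job: push all pending writes to the database in one batch."""
    for key in list(dirty_keys):
        database[key] = cache[key]
    dirty_keys.clear()
```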

What is Write-Around Cache?

Write-Around Cache is a caching strategy where writes are made directly to the database, bypassing the cache entirely. The cache is only updated when a read occurs after the write.

| Feature | Write-Around Cache |
| --- | --- |
| Write Behavior | Writes go directly to the DB, not to the cache |
| Cache Update | Happens on the next read if the data isn’t in the cache |
| Read Misses | Higher chance of cache misses immediately after a write |
| Data Freshness | Cache might be stale until a read refreshes it |
| Use Case | Useful when data is infrequently read after being written |

When to Use Write-Around Cache

  • When most written data is not immediately read.

  • To reduce cache churn for write-heavy but read-light scenarios.

  • If you’re okay with slightly stale reads on first access.
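
A write-around sketch under the same dict-based assumptions: writes go straight to the database (this variant also drops any stale cached copy), and the cache is populated only when the data is read afterwards.

```python
cache = {}
database = {}

def write_around(key, value):
    database[key] = value    # write goes directly to the database
    cache.pop(key, None)     # drop any stale cached copy, if present

def read(key):
    if key in cache:
        return cache[key]             # hit
    value = database.get(key)         # miss, likely right after a write
    if value is not None:
        cache[key] = value            # populate the cache on read
    return value
```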

Comparison:

| Strategy | Write to Cache | Write to DB | Read Flow |
| --- | --- | --- | --- |
| Write-Through | ✅ Yes | ✅ Yes | Always hits fresh cache |
| Write-Back | ✅ Yes | 🔄 Later | Fast write, delayed DB sync |
| Write-Around | ❌ No | ✅ Yes | Cache updated only on next read |

What is Cache Coherency?

Cache Coherency (or Cache Consistency) is the concept of keeping multiple copies of data in sync across different caches or between cache and the main memory (or database). It ensures that every reader sees the most recent write, regardless of where it's reading from.

Why It Matters

In distributed systems or multi-core CPUs, multiple caches may store the same data. If one cache updates the data, other caches must reflect that change to avoid stale reads. Cache coherency prevents:

  • Reading outdated data
  • Data conflicts
  • Unexpected behavior in concurrent systems

How to Maintain Cache Coherency
  • Write-through cache: Ensures DB and cache are always in sync.
  • Eviction and invalidation policies: Remove or update stale entries on change.
  • Pub/Sub or event systems: Notify all caches when a change occurs (see the sketch after this list).
  • Distributed cache frameworks: For example, Redis offers replication and invalidation features that help keep cached copies in sync.
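
As one way to picture event-based invalidation, the sketch below uses Redis Pub/Sub via the redis-py client: a writer publishes the changed key on an "invalidations" channel (a name made up for this example), and each application node listening on that channel drops the key from its local in-process cache.

```python
import redis

r = redis.Redis()    # assumes a Redis server on localhost:6379
local_cache = {}     # this node's in-process cache

def update_record(key, new_value):
    # Writer: persist the change to the primary store (omitted here),
    # then tell every other cache that this key is now stale.
    r.publish("invalidations", key)

def listen_for_invalidations():
    # Each application node runs this loop to keep its local cache coherent.
    pubsub = r.pubsub()
    pubsub.subscribe("invalidations")
    for message in pubsub.listen():
        if message["type"] == "message":
            stale_key = message["data"].decode()
            local_cache.pop(stale_key, None)   # drop the stale local copy
```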

Summary

| Term | Meaning |
| --- | --- |
| Cache Coherency | Keeping all cached copies of a data item consistent across systems |
| When Needed | Multi-cache setups, distributed apps, multi-threaded environments |
| Common Fixes | Invalidation, TTLs, write-through, event-based updates, strong consistency models |

Redis vs Memcached – Comparison Table

| Feature / Criteria | Redis | Memcached |
| --- | --- | --- |
| Data Types | Strings, Lists, Sets, Hashes, Sorted Sets, Streams, Bitmaps, HyperLogLogs | Strings only |
| Persistence | ✅ Supports persistence (RDB & AOF) | ❌ No persistence |
| Replication | ✅ Master-slave replication | ❌ No built-in replication |
| Cluster Support | ✅ Redis Cluster for sharding | ✅ Limited support via client-side sharding |
| Pub/Sub | ✅ Built-in Pub/Sub support | ❌ Not supported |
| Performance | ⚡ High (slightly slower than Memcached for pure string operations) | ⚡ Very high (especially for simple key-value workloads) |
| Memory Efficiency | Slightly more memory-heavy (due to rich data types) | Very efficient memory usage |
| Eviction Policy | ✅ Multiple eviction strategies (LRU, LFU, TTL, etc.) | ✅ LRU supported |
| Use Case Fit | Advanced caching, real-time analytics, counters, queues, sessions | Simple key-value caching |
| Transactions | ✅ Supports transactions via MULTI/EXEC | ❌ Not supported |
| Atomic Operations | ✅ Rich atomic operations on data types | ✅ Basic atomic operations |
| Data Size per Key | Up to 512 MB | 1 MB (can be configured higher) |
| License | BSD Open Source | BSD Open Source |

When to Use Redis
  • You need data persistence along with caching.
  • You want to store complex data structures (e.g., lists, sets).
  • You need Pub/Sub messaging.
  • You need real-time counters, leaderboards, or task queues.
  • You require clustering and replication.

When to Use Memcached
  • You need extremely fast and lightweight caching for simple strings.
  • You have limited memory and want the most efficient cache.
  • You don’t need persistence or complex data types.
  • Your cache is short-lived and only used to relieve DB read pressure.
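
As a small illustration of Redis used as a cache with a TTL (via the redis-py client), the snippet below assumes a Redis server is running locally on the default port; the key name and the 60-second expiry are arbitrary choices for this example.

```python
import redis

r = redis.Redis()   # assumes a local Redis server on the default port 6379

# Cache a computed value and let Redis expire it after 60 seconds.
r.set("product:42:details", '{"name": "Widget", "price": 9.99}', ex=60)

value = r.get("product:42:details")   # returns bytes, or None once expired
if value is None:
    # Cache miss: recompute or re-query the source, then repopulate the cache.
    pass
```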

Load Balancing

This section covers three related concepts:

  1. Forward proxy
  2. Reverse Proxy
  3. Load balancing

Forward Proxy (Client-Side Proxy)
  • Used by clients to access external networks.
  • Hides the client’s identity from the server.
  • Helps enforce access policies, filter traffic, and cache responses.

Example: A school computer lab uses a forward proxy to filter websites and restrict access to social media.

Reverse Proxy (Server-Side Proxy)
  • Used by servers to receive and manage incoming client requests.

  • Hides the actual backend servers from clients.

  • Can perform:

    • Load balancing
    • SSL termination
    • Caching & compression
    • Application firewalling

Example: A company’s website routes all user traffic through Nginx as a reverse proxy, which forwards the requests to one of several backend servers.

Load Balancing: What is it?

Load balancing is the process of distributing incoming traffic across multiple servers to ensure no single server gets overwhelmed.

Why use it?

To improve system availability, scalability, and fault tolerance.

Real-world analogy:

Imagine a popular restaurant with multiple waiters. If all customers go to one waiter, service slows down. A host at the door (the load balancer) directs each new customer to the waiter with the least work.

Example:

  • A website that gets millions of visitors daily might use a load balancer to split traffic between five web servers, so none of them crashes or slows down.

How They Work Together

  • Caching helps speed up access to frequently used data.
  • Load balancing makes sure user traffic is evenly spread across multiple systems.

Used together, they help websites and apps run faster, smoother, and more reliably.

Load Balancer (a Specialized Reverse Proxy)
  • Type of reverse proxy focused specifically on distributing traffic.
  • Ensures:
    • High availability
    • Redundancy
    • Scalability
  • Supports strategies like:
    • Round Robin
    • Least Connections
    • IP Hash

Example: An e-commerce site uses a load balancer to distribute traffic between multiple backend application servers during peak sales.
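
To make the strategies concrete, here is a minimal sketch of Round Robin and IP Hash selection in Python; the backend addresses are made up for the example, and a real load balancer would also track server health and active connections.

```python
import hashlib
import itertools

# Hypothetical pool of backend servers behind the load balancer.
backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
round_robin = itertools.cycle(backends)

def pick_backend_round_robin():
    """Round Robin: hand each new request to the next server in turn."""
    return next(round_robin)

def pick_backend_ip_hash(client_ip):
    """IP Hash: requests from the same client IP always reach the same backend."""
    index = int(hashlib.md5(client_ip.encode()).hexdigest(), 16) % len(backends)
    return backends[index]

for request_id in range(4):
    print(request_id, "->", pick_backend_round_robin())

print("203.0.113.7 ->", pick_backend_ip_hash("203.0.113.7"))
```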

Forward Proxy vs Reverse Proxy vs Load Balancer

| Feature | Forward Proxy | Reverse Proxy | Load Balancer |
| --- | --- | --- | --- |
| Position in Network | Between the client and external servers | Between external clients and internal servers | Sits between clients and multiple backend servers |
| Purpose | Acts on behalf of the client to access outside resources | Acts on behalf of the server to handle client requests | Distributes incoming traffic across multiple backend servers |
| Client Awareness | Clients know they are using a proxy | Clients do not know they are using a reverse proxy | Clients are typically unaware of the load-balancing mechanics |
| Use Cases | Access control, caching, anonymity, monitoring client traffic | SSL termination, caching, compression, application firewall | Scaling web servers, high availability, fault tolerance |
| Common Tools | Squid, Privoxy | Nginx, Apache, HAProxy | HAProxy, Nginx, AWS ELB, Google Load Balancer |