In a distributed SaaS platform processing millions of events every hour, each event is expected to be processed within a few milliseconds or less. To achieve that level of throughput under high volume, relying only on traditional databases (relational or non-relational) is difficult. More often than not, database operations involve persisting data, and the I/O increases response time (even if only by a few milliseconds). That is where caches come to your rescue. Let's start with the basics.
What is a cache?
For the uninitiated, a cache is temporary storage often used to hold frequently needed information, as opposed to fetching it from the original storage where it is kept permanently or at least for a relatively long time. A simple example of a cache in your day-to-day life: when you have a party at your house, you buy 10 bottles of cold drinks and keep them in your refrigerator, as opposed to going to a store or a factory outlet every time you run out of cold drinks during the party.
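In code, this idea is often called the cache-aside pattern: check the fast local store first, and only go to the slow original storage on a miss. A minimal sketch, where `fetch_from_store` is a hypothetical stand-in for a database or remote API call:

```python
import time

def fetch_from_store(key):
    # Hypothetical slow backing store; the sleep simulates I/O latency.
    time.sleep(0.01)
    return f"value-for-{key}"

cache = {}

def get(key):
    # Serve from the cache when we can; fall back to the slow store
    # on a miss and remember the result for next time.
    if key not in cache:
        cache[key] = fetch_from_store(key)
    return cache[key]

get("user:42")  # first call: slow path, hits the backing store
get("user:42")  # second call: served from the in-memory dict
```

The first lookup pays the I/O cost once; every subsequent lookup for the same key is a dictionary access.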
Why are caches used?
In the above example, all you are trying to do is keep the items (cold drinks) closer to where you need them (at the party) and not invest time and resources in fetching them from long-term storage (a grocery store). Another application of a cache is storing a computed result to avoid repeating the computation. If the operations that yield a result are costly and the result doesn't change frequently, you can cache it so the same computations don't run multiple times. For example, if you are putting up a Student Dashboard indicating each student's CGPA, you don't have to compute the CGPA of every student by going through their grades across all subjects every time your application loads. You can do the calculation once and save the result in a cache for your Student Dashboard. Now that we have established that caches give us faster access and help us avoid repetitive computation, let's look at distributed caching.
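The CGPA example above can be sketched as simple memoization. The grade data here is made up for illustration; in a real dashboard it would come from your database:

```python
# Hypothetical grades per student across subjects.
grades = {
    "alice": [9.0, 8.5, 9.5],
    "bob":   [7.0, 8.0, 7.5],
}

cgpa_cache = {}

def cgpa(student):
    # Compute the average only on the first request for this student;
    # every later request reuses the cached result.
    if student not in cgpa_cache:
        marks = grades[student]
        cgpa_cache[student] = sum(marks) / len(marks)
    return cgpa_cache[student]
```

If grades change, you would invalidate or recompute the affected entry rather than the whole cache.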
What is Distributed caching?
In a distributed system, your application servers are spread across multiple Virtual Machines (VMs) and sometimes across geographies. For these applications to get data faster, the cache also needs to be distributed: multiple servers cache the data close to the application servers so that it can be fetched quickly. Of course, scenarios like servers going down or becoming unreachable need to be handled by your distributed caching system.
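With several cache servers, every client needs a deterministic way to decide which server owns a given key. A minimal sketch using hash-modulo routing, with made-up node addresses:

```python
import hashlib

# Hypothetical cache node addresses; in practice these would be the
# hosts running your distributed cache (e.g. Redis or Memcached nodes).
nodes = ["cache-a:6379", "cache-b:6379", "cache-c:6379"]

def node_for(key):
    # Hash the key and map it onto one node, so every application
    # server independently agrees on which cache server owns the key.
    digest = hashlib.md5(key.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]
```

Note that plain modulo hashing remaps most keys when a node is added or removed; production systems typically use consistent hashing to limit that reshuffling.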
How does Distributed Caching work?
Distributed caches work by building a large pool of VMs with ample RAM (in-memory compute). The nodes share knowledge of which server manages the cached data for which region, and they actively communicate within the cluster so that if one server goes down, another takes its place and continues to serve requests. These are typically active-passive setups: the active nodes take the data volume and load, communicate asynchronously with the passive nodes, and replicate data based on configuration. While the specifics of node distribution, caching, and data partitioning vary with the distributed cache implementation you pick, the general underlying concept remains the same.
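The active-passive idea can be sketched in a few lines. This is a toy model under simplifying assumptions: replication happens inline on every write (real systems usually replicate asynchronously), and "down" is just a flag:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}   # the node's in-memory store
        self.up = True   # health flag; real clusters use heartbeats

class ActivePassiveCache:
    def __init__(self, active, passive):
        self.active, self.passive = active, passive

    def put(self, key, value):
        # Writes land on the active node and are copied to the passive
        # one; here synchronously, for simplicity.
        self.active.data[key] = value
        self.passive.data[key] = value

    def get(self, key):
        # Reads fail over to the passive replica if the active is down.
        node = self.active if self.active.up else self.passive
        return node.data.get(key)

cache = ActivePassiveCache(Node("a"), Node("b"))
cache.put("session:1", "alice")
cache.active.up = False          # simulate the active node failing
cache.get("session:1")           # still served, from the passive copy
```

A real implementation would also promote the passive node to active and spin up a new replica, but the read path above shows why the cluster keeps serving through a single node failure.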