I have been working with scalable systems for a few years, but all the knowledge I have came from experience and from reading articles, not books. So my goal when reading this book was to review the basics and see if I had missed anything fundamental.

This post of Learning is from a technical book, so it may be a bit different than the other posts of the collection Learnings from books, as when I read technical books I focus more on the parts that I looking to learn and for my current situation.

Learnings from the book Foundations of Scalable Systems

As the name implies, the book covers the foundations of designing scalable services, discussing service design, caching, asynchronous messaging, serverless processing, microservices, scalable data systems, and scalable streaming systems.

If you are new to the world of scalable systems, it is a good read; if you already have experience in the area, some parts will just be a review for you.

The book has a lot of learnings, but to keep this post from getting too long, I will share my 5 most important learnings from it.

Focus on hyper-scale systems

When designing scalable systems, we should focus on systems that are characterized by the ability to expand exponentially, with costs increasing linearly. The book brings an excellent definition: “Hyper scalable systems exhibit exponential growth in computational and storage capabilities while exhibiting linear growth rates in the costs of resources required to build, operate, support, and evolve the required software and hardware resources.”

So, when thinking about new functionalities, balance achieving the performance and scalability you require, while also keeping your costs as low as possible.

Cache is king

An effectively designed caching strategy proves invaluable when it comes to scaling a system. Caching is particularly effective for data that experiences infrequent changes but sees frequent access, such as inventory catalogs, event details, and contact information.

In the realm of scalability, distributed caching is a fundamental component. This caching approach allows the reuse of results from costly queries and computations in subsequent requests at a minimal cost. By avoiding the need to reconstruct cached results for every request, the system’s capacity is enhanced, enabling it to scale effectively to handle increased workloads.
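As an illustration, here is a minimal sketch of the cache-aside pattern in Python. The `SimpleCache` class is a hypothetical in-process stand-in for a distributed cache such as Redis, and `expensive_db_query` is a placeholder for a costly database lookup; neither comes from the book.

```python
import time

# In-process stand-in for a distributed cache (e.g. Redis).
# Entries expire after a TTL so slowly changing data eventually refreshes.
class SimpleCache:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() > expires_at:
            del self.store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

cache = SimpleCache(ttl_seconds=60)

def expensive_db_query(event_id):
    # Placeholder for a slow database lookup or computation.
    return {"event_id": event_id, "name": "Concert"}

def fetch_event_details(event_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    cached = cache.get(event_id)
    if cached is not None:
        return cached  # cache hit: no database work needed
    value = expensive_db_query(event_id)
    cache.set(event_id, value)  # populate the cache for later requests
    return value
```

Only the first request for a given key pays the cost of the query; every subsequent request within the TTL is served from the cache, which is exactly how capacity is freed up for more load.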

Idempotence is important

When handling errors, we need a way to retry without generating duplicate results. One of the best ways to achieve that is to make the API idempotent.

The standard methodology for constructing idempotent operations involves several key steps:

Clients incorporate a distinct idempotency key into all requests that induce state changes. This key serves to uniquely identify a particular operation originating from a specific client or event source. Typically, it consists of a combination of a user identifier, such as a session key, and a unique value such as a local timestamp, UUID, or sequence number.

Upon receiving a request, the server verifies whether it has encountered the idempotency key value before by querying a specialized database designed for implementing idempotency. If the key is not present in the database, it signifies a new request. Consequently, the server executes the requisite business logic to update the application state. Additionally, it records the idempotency key in the database to signify the successful application of the operation.

If the idempotency key is found in the database, it indicates that the current request is a retry from the client and should not be processed anew. In such cases, the server issues a valid response for the operation, with the expectation that the client will refrain from further retry attempts.
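The steps above can be sketched as follows. This is a toy illustration: an in-memory dictionary stands in for the idempotency database, and `handle_payment` is a hypothetical state-changing operation, not an API from the book.

```python
import uuid

# Stand-in for the specialized idempotency database described above.
processed = {}  # idempotency_key -> stored response

def handle_payment(idempotency_key, amount):
    """Apply the operation once; retries with the same key get the saved response."""
    if idempotency_key in processed:
        # Retry detected: return the original response without re-applying.
        return processed[idempotency_key]
    # New request: run the business logic and record the key as applied.
    response = {"status": "applied", "amount": amount}
    processed[idempotency_key] = response
    return response

# Client side: a distinct key per logical operation,
# e.g. a session identifier combined with a UUID.
key = "session-42:" + str(uuid.uuid4())
first = handle_payment(key, 100)
retry = handle_payment(key, 100)  # safe: the payment is not applied twice
```

In a real system, the key lookup and the state change would need to happen atomically (for example, in a database transaction or via a conditional write); otherwise two concurrent retries could both pass the check and apply the operation twice.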

Fail fast

Cascading failures pose a subtle threat, stemming from their initiation through the gradual deterioration of response times in dependent services. In contrast to the immediate error notifications received when a downstream service crashes or becomes temporarily unavailable due to network issues, the insidious aspect arises when services experience a gradual slowdown. Requests still yield results, albeit with prolonged response times. If the besieged component remains inundated with requests, it lacks the necessary time to recover, perpetuating the growth of response times.

The fundamental issue with sluggish services lies in their prolonged utilization of system resources for requests. As a requesting thread awaits a response, it remains stalled for extended durations. Extended response times not only pose technical challenges but also negatively impact client engagement.

That is why it is important to fail fast, so that slow calls don't clog up the system's resources.
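A minimal way to fail fast is to put a hard deadline on calls to downstream services. The sketch below uses Python's `concurrent.futures` to enforce a timeout; `slow_downstream` is a hypothetical dependency that has started to degrade, and the wrapper is my own illustration, not code from the book.

```python
import concurrent.futures
import time

def call_with_timeout(fn, timeout_seconds, *args):
    """Fail fast: give up after a deadline instead of blocking the caller."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Surface a fast, explicit failure so the calling thread is freed
        # and the overloaded service gets a chance to recover.
        raise RuntimeError("downstream call timed out; failing fast")
    finally:
        pool.shutdown(wait=False)

def slow_downstream():
    time.sleep(0.5)  # simulates a gradually degrading dependency
    return "ok"

try:
    call_with_timeout(slow_downstream, timeout_seconds=0.05)
except RuntimeError as err:
    print(err)  # the caller is released after ~50 ms instead of 500 ms
```

In practice, resilience libraries layer a circuit breaker on top of timeouts so that, after repeated failures, callers stop sending requests entirely for a cooldown period, which is what gives the struggling service time to recover.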

Building a distributed system is hard

The inherent complexity of distributed systems stems from their multifaceted nature, encompassing various failure modes that demand thorough consideration and strategic design to effectively navigate all possible scenarios. This complexity is further compounded when applications face heightened demand with substantial request volumes and rapidly expanding data resources.


These are my 5 most important learnings from the book Foundations of Scalable Systems: Designing Distributed Architectures, written by Ian Gorton.

Happy coding!