2020 will have its place in modern history and, thankfully, it is coming to an end now. Looking back on 2020, I learnt a lot in this special year. Here I just want to share some books that I found most helpful!

Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing (By Tyler Akidau, Slava Chernyak, Reuven Lax)

This is a very good book from Google. It summarizes the experience Google has gained over all these years. The title says streaming systems, but it also covers big data processing from both the batch and the streaming point of view. The book discusses some typical models, such as Google Dataflow and the Apache Beam model.
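To make the batch/streaming view a bit more concrete, here is a minimal illustrative sketch in Python. It is not code from the book and does not use the Beam or Dataflow APIs. It only shows the “where in event time” idea: events are grouped into fixed windows by their own timestamps, and the same function accepts a bounded list (batch) or an iterator (stream). The event shape `(event_time, value)` and the name `fixed_window_sums` are assumptions made for this example, and watermarks, triggers and late data are left out entirely.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event shape for this sketch: (event_time_in_seconds, value)
Event = Tuple[int, int]


def fixed_window_sums(events: Iterable[Event], window_size: int) -> Dict[int, int]:
    """Sum values per fixed event-time window, keyed by the window start time."""
    sums: Dict[int, int] = defaultdict(int)
    for event_time, value in events:
        # Assign the event to a window by its own timestamp, not by arrival order.
        window_start = (event_time // window_size) * window_size
        sums[window_start] += value
    return dict(sums)


# Events arrive out of order; the grouping depends only on event time.
batch = [(1, 10), (12, 5), (3, 7), (14, 1)]
print(fixed_window_sums(batch, window_size=10))        # bounded input (batch)
print(fixed_window_sums(iter(batch), window_size=10))  # iterator input (stream-like)
```

Both calls give the same per-window sums, which is the sense in which batch can be seen as a bounded special case of streaming. A real streaming system would also have to decide when to emit each window's result, which this sketch ignores.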

Currently our business systems still use an SOA architecture, but I have done some projects that migrated existing database applications to a micro-service architecture with containers. This is a current trend in the industry. However, looking forward, what will the next generation be in 10-20 years?

So if we look at the trend:

  • (gen1) SOA: Requirements are encapsulated into different services, with no hardware abstraction;

  • (gen2) micro-service + container: Requirements are encapsulated into different micro-services, plus an abstraction of hardware resources (“the capability of running on generalized hardware”). This continues the gen1 concept (service) and adds a new abstraction;

  • (gen3) service mesh + serverless: Requirements are encapsulated into services within a service mesh, plus an abstraction of the server (serverless). This continues the gen2 concepts (service + hardware abstraction) and adds a new abstraction.

The pattern is that each generation moves toward a better fit for distributed systems or the cloud, and adds new abstractions that let services fit a wider range of operating systems and hardware.

I would guess that “streaming” will be part of the next generation of the “service” concept (whatever it is called in the future), plus a new abstraction over different CPU architectures. Streaming would resolve the consistency issue between different storage layers; it continues the development of solutions to the same issue that service mesh addresses.

On the other hand, gen3 serverless can run on any x86 CPU, but it cannot run on other “types” of processing units, say GPUs or AI chips. So we need a new abstraction to deal with these new issues. We can already see business applications that use AI/deep learning to enhance their services, so an abstraction that best serves AI computing will be critical.

Actually, we have a perfect example from Apple recently. It released the M1 chip (ARM) and started selling several laptop models built on it. I would call the M1 an epoch-making invention, because:

  • It brings a new CPU architecture, ARM, to the Apple “PC”;

  • The M1 design integrates not only the CPU and GPU but also other kinds of ICs into a single chip.

So I believe PCs and servers built on heterogeneous systems will become available and mainstream in the next 10 years, and the pattern I describe above will therefore be gen4 within 10 years.

Guide to Reliable Distributed Systems (By Kenneth P Birman)

This book tackles practical problems and gives practical solutions and advice. The author clearly has a lot of valuable experience in building various distributed systems. More importantly, he also discusses many of the trade-offs he made for those systems. These are very helpful in resolving our own problems.

Distributed systems are definitely a very hot topic at leading companies in the industry. My own experience goes back to graduate school, where we built a Beowulf PC cluster for intensive computer simulations and numerical calculations, which is a toy model of a distributed system. It was indeed a helpful project to do at school, with limited funding and limited resources. I learnt a lot from building it and writing parallel programs on it. You cannot understand the concerns of leading companies if you don’t even have practical experience with a toy model.

Well, if that is your case and you are missing the essential experience of building a distributed system yourself, you can still read this book and get ready for your career in distributed systems!