5.2.3 The Tradeoffs between System Architecture and Programming Model
Course subject(s)
5: System Architecture and Programming Models
When we talk about architecture, we mean the structural principle by which you organize your system. For example, the client-server model distinguishes between two different roles of machines, with the servers providing services for a large number of client devices. Load balancers can be used to make a flat topology of servers more scalable by distributing requests more evenly among them. In a hierarchical topology, scalability can be improved by making each level of servers responsible for a distinct subset of the problem, as is done in DNS. Peer-to-peer systems do not distinguish between different roles, as every device is technically both a client making requests and a server serving requests from other peers.
On the other hand, we have plenty of Programming Models and will examine some of them later in the module. What is important to note is that any choice of Programming Model and System Architecture is the result of tradeoffs.
Tradeoffs between Programming Models and System Architecture
We have learned about functional and non-functional requirements, and both have a strong influence on the Programming Model (PM) and the System Architecture (SA). For non-functional requirements, this is more apparent: if you have specific latency requirements, for example, you cannot arbitrarily organize your machines. The latency requirement dictates what the critical path will look like, and you can then structure your system around it.
In the picture above, the worst-case latency of a client request is determined by the longest path in the network. In the centralized system, there is only one hop between the client in orange and the server in green and it has a latency of 4. This is also true for any other client in the network. In the peer-to-peer topology, the experienced latency can vary much more. In the best case, the latency is one because a request is routed to the neighboring peer. In the worst case, however, the latency can be seven, as illustrated between the orange and the green peer.
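The comparison can be reproduced with a small sketch. The two topologies below are hypothetical stand-ins for the figure, chosen only so that the numbers match the text: a star in which every client reaches the server over one link with latency 4, and a chain of eight peers whose links each have latency 1. The helper names (`shortest_latencies`, `latency_range`, `undirected`) are made up for this example.

```python
import heapq

def shortest_latencies(graph, source):
    """Lowest total latency from source to every reachable node (Dijkstra)."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbor, latency in graph[node].items():
            candidate = d + latency
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

def latency_range(graph, sources, targets):
    """Best and worst latency over all (source, target) pairs, source != target."""
    values = [shortest_latencies(graph, s)[t]
              for s in sources for t in targets if s != t]
    return min(values), max(values)

def undirected(edges):
    """Build an adjacency map from (node_a, node_b, latency) triples."""
    graph = {}
    for a, b, latency in edges:
        graph.setdefault(a, {})[b] = latency
        graph.setdefault(b, {})[a] = latency
    return graph

# Assumed centralized topology: five clients, one hop of latency 4 to the server.
clients = [f"client{i}" for i in range(5)]
centralized = undirected([(c, "server", 4) for c in clients])

# Assumed peer-to-peer topology: a chain of eight peers, latency 1 per link.
peers = [f"peer{i}" for i in range(8)]
p2p = undirected([(peers[i], peers[i + 1], 1) for i in range(7)])

central_best, central_worst = latency_range(centralized, clients, ["server"])
p2p_best, p2p_worst = latency_range(p2p, peers, peers)

print("centralized:", central_best, central_worst)   # 4 and 4
print("peer-to-peer:", p2p_best, p2p_worst)          # 1 and 7
```

In this toy setup the centralized system has no spread at all between best and worst case, while the peer-to-peer chain ranges from 1 to 7. A real overlay would not be a simple chain, so the numbers only illustrate the shape of the tradeoff.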
There are still good reasons for using peer-to-peer topologies in situations where predictable latency is less of a concern. For instance, in a peer-to-peer system content can be cached or replicated on many different nodes, thereby increasing availability as long as the churn of nodes is not too high. Furthermore, a peer-to-peer system might be a superior solution to a centralized architecture if the information served by the system is privacy-sensitive in nature. A lack of centralized control allows every peer to make its own decisions about who is entitled to retrieve information from it.
Another typical tradeoff is between replication and consistency. More replicas increase availability and reliability but, unless the service is entirely stateless, they also aggravate the issue of consistency. When consistency matters, low and predictable latency between replicas can be important to keep the overhead of the consistency protocol manageable, which in turn restricts the topologies that can be used.
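A rough back-of-the-envelope sketch of this tradeoff, under two simplifying assumptions that are not part of the course text: replicas fail independently with the same per-replica availability `p`, and a write is only complete once every replica has acknowledged it (a "write-all" scheme). The function names are hypothetical.

```python
def read_availability(p, n):
    """Probability that at least one of n independent replicas is up,
    assuming any single replica can serve the request."""
    return 1 - (1 - p) ** n

def write_all_latency(round_trips):
    """Latency of a write sent to all replicas in parallel that waits
    for every acknowledgement: the slowest replica decides."""
    return max(round_trips)

for n in (1, 2, 3):
    print(n, "replica(s):", round(read_availability(0.9, n), 6))

# Hypothetical round-trip latencies to three replicas; one is far away.
print("write latency:", write_all_latency([2, 3, 40]))
```

Under these assumptions each extra replica improves availability, but a single slow or distant replica dominates the write latency, which is why the latency between replicas constrains the choice of topology.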
Impedance Mismatches
In theory, every programming model can be combined with every system architecture, but not all combinations are good fits. It is important to consider all parameters of a problem and understand what the solution is expected to deliver. If a system is intended to operate between different departments of the same organization, then a hierarchical topology with a programming model that matches this structure well might be better suited than a peer-to-peer network. Using a programming model that is based on node-to-node communication (e.g., RPC) with a peer-to-peer network is a mismatch because of the high degree of interaction in such a topology and because every peer would have to keep track of every other peer in order to communicate. This model works much better in, for example, a hierarchical topology where nodes only communicate along the branches of the tree and the setup is less dynamic.
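To give a feeling for the bookkeeping involved, the following sketch counts how many communication partners a node has to track. It takes the worst case described above at face value (every peer knows every other peer) and compares it with a tree of a given fan-out. The function names and the example numbers are assumptions made for illustration.

```python
def full_mesh_state(n):
    """Every node tracks every other node."""
    per_node = n - 1
    total_links = n * (n - 1) // 2
    return per_node, total_links

def tree_state(n, fanout):
    """Nodes only talk along tree branches: a parent plus up to `fanout` children."""
    per_node_max = min(fanout + 1, n - 1)
    total_links = n - 1  # a tree with n nodes always has n - 1 edges
    return per_node_max, total_links

print("full mesh, 1000 nodes:", full_mesh_state(1000))
print("tree, 1000 nodes, fan-out 4:", tree_state(1000, 4))
```

With 1000 nodes, the full mesh requires each node to track 999 partners across 499500 possible links, whereas the tree needs at most 5 partners per node and 999 links in total. In a dynamic setting, every join or leave also has to be propagated to all nodes holding that entry.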
Cost of Convenience
You can program and design at the equivalent of an assembly-level language. That gives you maximum control, but at a very low degree of abstraction and portability. Just as you weigh abstraction against control when picking a Programming Model, you do the same at the higher level where you design a system. Some Programming Models give you a lot of abstraction: you can instantiate them quickly, and the burden and skill level required are lower, but you might pay in terms of overhead and loss of control.
Modern Distributed Systems by TU Delft OpenCourseWare is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://online-learning.tudelft.nl/courses/modern-distributed-systems/