austin ewens


Designing A Distributed Network - Part 1

For a while now I have been growing increasingly disillusioned with the current status quo of the majority of networking being done using the conventional centralized client/server model, and have been waiting for distributed networks to become more prevalent to guide us into a better future. Years have come and gone, and while much of the world has changed, the lack of movement in the distributed networking space has remained much the same. In an effort to "be the change I want to see", I want to try to create a distributed network on the tildeverse, both to figure out how all of it works and to contribute to a space I feel needs more attention. While I have some experience in the networking space, I am by no means an "expert", so what follows is how I have been working through the various problems that crop up when designing a distributed network.

What I want to do in this article is walk through the logic of how routing in a distributed network could be set up, with the end goal of designing a system that optimizes the experience for those using the network. The traditional model for networks is a centralized one (i.e. many clients connecting to a single server), but with the objective being a distributed network this will not work. Instead, each peer (a word I am using here to mean "a member of a network") will have its own server to receive connections and a client to connect with other servers on the network. This requires a bit more setup for peers to join the network compared to centralized models, but if the network is designed properly this extra effort will result in a more resilient network that can remain persistent even if a few servers fall off the network. While there are many problems to tackle in getting a stable distributed networking design, this article will focus purely on how routing will work in the network.

Joining The Network

The first task to handle is how joining the network would work. In a centralized network, when a client joins there is not much fuss, since it is merely connecting to a server and has no obligations to the network after it disconnects. A server, on the other hand, needs to communicate to a designated server that it has joined the network (e.g. a DNS server on the Internet). With our setup of each peer joining the network as both a server and a client, there is no designated server to fulfill the role of letting the rest of the network know about new peers joining.

One approach would be using a mesh design, since it would solve the routing problem by having each peer directly connected to every other peer, but that will cause problems later as more peers join, since the number of connections to keep track of grows quadratically with the size of the network. Another approach would be to make notifying the network that a peer has joined the responsibility of every peer on the network. To explain, let's assume the network is set up as follows:

  • There are eight peers (designated as P1 through P8) already on the network.
  • P1 is connected to P2, P3, and P4.
  • P4 is connected to P1, P5, and P6.
  • P6 is connected to P4, P7, and P8.
  • A new peer, P9, wants to join the network and does so by connecting to P8.

In this scenario, once P9 has connected to P8, P9 will send a message over the network to P8 letting it know it has joined the network. P8, upon getting this alert, will note that P9 is now a peer of the network and will tell its other peer, P6, about the new peer. P6 will then do the same and alert its peers P4 and P7 (but not P8 since that is who it got the message from). Since P7 has no other peers it has no one to alert, but P4 does so it will alert its peers P1 and P5 (but not P6 since that is who it got the message from). Since P5 has no other peers it has no one to alert, but P1 does and will alert the remaining peers on the network P2 and P3 (but not P4 since that is who it got the message from).

Using the simple rule of "alert your peers, but not the sender", we have managed to traverse the entirety of the network without any peer being alerted more than once. However, this network does not have an ideal structure, since it has several single points of failure that could partition the network (i.e. split the network as a whole into smaller isolated networks) if P1, P4, or P8 went down. To make this network more resilient, let's add the rule that peers must be connected to at least N other peers on the network. As an added bonus of this new rule, as long as a peer always maintains connections to at least N other peers (e.g. if a peer drops off leaving only N-1 connections, the peer would seek out another peer to connect to), the network will remain resilient as long as at least N peers remain on the network. For the sake of this example N will equal 3, so the new setup is:

  • There are eight peers (designated as P1 through P8) already on the network.
  • P1 is connected to P4, P5, and P6.
  • P2 is connected to P5, P6, and P7.
  • P3 is connected to P6, P7, and P8.
  • P4 is connected to P7, P8, and P1.
  • P5 is connected to P8, P1, and P2.
  • P6 is connected to P1, P2, and P3.
  • P7 is connected to P2, P3, and P4.
  • P8 is connected to P3, P4, and P5.
  • A new peer, P9, wants to join the network and does so by connecting to P1, P2, and P3.

Following this scenario, once P9 has connected to its three peers (P1, P2, and P3), it will send them a message letting them know it has joined the network. Starting with P1, it will alert its peers P4, P5, and P6 about P9 joining the network. At the same time, P2 will alert its peers P5, P6, and P7 about P9 joining the network. Already we have an issue: P5 and P6 will both get alerted to P9 joining the network twice. To handle this case, another rule needs to be added. To recap, the rules so far are:

  • Each peer must maintain a connection to at least N other peers.
  • New peers advertise that they have joined by sending an alert to their peers.
  • When alerted that a new peer has joined the network, inform immediate peers except for the sender of the alert.

To fix this, we need to amend the last rule so that a peer checks whether it has already seen an alert, and if so ignores it. This is needed to prevent the alert from endlessly broadcasting across the network with nothing to stop it. With this new rule in mind, and continuing through the scenario, P3 would also simultaneously alert its peers P6, P7, and P8. P4 will have been alerted by P1 and will alert its other peers P7 and P8 (but both will ignore this alert). P5 will have been alerted by P1 and will alert its other peers P8 and P2 (but both will ignore this alert). P6 will have been alerted by P1 and will alert its other peers P2 and P3 (but both will ignore this alert). P7 will have been alerted by P2 and will alert its other peers P3 and P4 (but both will ignore this alert). P8 will have been alerted by P3 and will alert its other peers P4 and P5 (but both will ignore this alert).
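The three rules can be sketched in a few lines of code. This is a minimal in-memory model, not a real networking implementation; the Peer class and its method names are my own hypothetical stand-ins:

```python
# A minimal sketch of the three rules above, modeling peers as plain
# in-memory objects rather than real networked servers.

class Peer:
    def __init__(self, name):
        self.name = name
        self.neighbours = set()  # immediate peers (rule 1's N connections)
        self.known = set()       # alerts already seen (the dedup check)

    def connect(self, other):
        self.neighbours.add(other)
        other.neighbours.add(self)

    def receive_alert(self, new_peer, sender):
        # Ignore alerts we have already seen (or alerts about ourselves),
        # so the flood terminates instead of bouncing around forever.
        if new_peer in self.known or new_peer == self.name:
            return
        self.known.add(new_peer)
        # Relay to immediate peers, except whoever sent us the alert.
        for peer in self.neighbours:
            if peer is not sender:
                peer.receive_alert(new_peer, self)

# Rebuild the N=3 example network.
peers = {f"P{i}": Peer(f"P{i}") for i in range(1, 9)}
links = [("P1", "P4"), ("P1", "P5"), ("P1", "P6"),
         ("P2", "P5"), ("P2", "P6"), ("P2", "P7"),
         ("P3", "P6"), ("P3", "P7"), ("P3", "P8"),
         ("P4", "P7"), ("P4", "P8"), ("P5", "P8")]
for a, b in links:
    peers[a].connect(peers[b])

# P9 joins via P1, P2, and P3 and advertises itself.
p9 = Peer("P9")
for name in ("P1", "P2", "P3"):
    p9.connect(peers[name])
    peers[name].receive_alert("P9", p9)

# Every peer on the network now knows about P9.
```

The exact relay order differs from the walkthrough (it depends on set iteration), but the outcome is the same: all eight peers learn about P9 and no peer processes the alert twice.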

With the size of this network and the value of N used, the entire network was notified of P9's arrival as soon as its immediate peers P1, P2, and P3 alerted the rest of the network. This will not always be the case and may take a few more hops through the network, but it will still result in the entire network learning about the new peer without any repeats. However, there are still problems with this setup. Since there is nothing dictating which peers a new peer should connect with, partitions can still happen under the rules as they currently stand; e.g. P1, P3, P5, and P7 connected to each other, alongside P2, P4, P6, and P8 connected to each other, would satisfy the current rules but results in a partitioned network. As well, this disorganized structure could lead to bottlenecks in the network if any of the linked peers have high latency between each other; e.g. the previous example could bridge the partition by linking one peer from each group together, but if there is high latency between those two peers it would bottleneck all communication between the two groups. Of these two issues, partitioning and latency, the latter is the easier to solve, so we will start with it. The following goes over a few different strategies for how the latency issue can be solved.

Advertise Latency On Join

A new peer wishing to join the network will still need to temporarily connect with N arbitrary peers. However, instead of just noting that the new peer is on the network, once the other peers on the network get the alert they can ping the new peer and report back the latency of their connection to it. If the reported latency is lower than that of any of the new peer's current connections, it will establish a connection to that peer in place of the slowest connection it has. While this would accomplish the goal of peers having an optimized connection to the network, there are a few logistical issues with this approach.
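The swap rule could look something like the following sketch, where a peer tracks its current connections as a map of peer name to measured latency (the function name and example values are hypothetical):

```python
# Keep at most n connections; adopt a candidate only by evicting a slower one.
def maybe_adopt(connections, candidate, latency_ms, n):
    """connections: dict of peer name -> measured latency in ms."""
    if len(connections) >= n:
        slowest = max(connections, key=connections.get)
        # Only swap if the candidate actually beats our slowest connection.
        if latency_ms >= connections[slowest]:
            return connections
        del connections[slowest]
    connections[candidate] = latency_ms
    return connections

conns = {"P1": 40.0, "P2": 10.0, "P3": 25.0}
maybe_adopt(conns, "P4", 5.0, n=3)   # P4 (5ms) replaces P1 (40ms)
maybe_adopt(conns, "P5", 90.0, n=3)  # P5 is slower than everything; ignored
```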

Firstly, since this approach aggressively aims at having peers only maintain low-latency connections to N peers, if there are N peers that all have low-latency connections with each other then this approach would immediately partition those peers from the rest of the network, since they would only connect to each other. Unless another peer joins between one of those peers and the rest of the network, they would remain cut off (which is worse than what we had before, since this now directly creates partitions rather than only coincidentally). Secondly, as the size of the network scales, having every other peer open a connection to the new peer could soon become overwhelming, acting as an unintended distributed denial of service (DDoS) attack against new peers rather than helping them optimize their connection to the network. So while this approach optimizes for speed, additional steps or compromises need to be made to avoid directly creating partitions and overwhelming new peers as the network scales.

Searching & Logging

To address the scaling concern of the previous approach, flipping the idea from every peer pinging the new peer to the new peer pinging every other peer on the network would resolve that issue, since it spreads the load of the latency pings across the network rather than directing the entire network at a single peer. However, this approach has a few drawbacks compared to the previous method. Before, a new peer did not need to know how to contact every other peer on the network; that burden was distributed across the network, since the new peer only needed to advertise itself to its immediate peers and the other peers in the network would then contact it first. With this "search" method, a new peer needs a way to easily discover every other peer on the network directly.

One approach to searching for the other peers would be to manually walk the graph of the network: first asking its immediate peers what peers they are connected to, followed by asking those peers who they are connected to, and so on. However, if the peer is not keeping track of who it has spoken to, it can easily get stuck in a loop during this process. To fix this, as it contacts each peer it can add that peer to a list, and when it receives another list of peers to contact it can check them against its own list before reaching out.
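The walk-with-a-list idea is effectively a breadth-first traversal. In this sketch, `neighbours_of` is a hypothetical stand-in for the network call "who are you connected to?", modeled here as a plain dict:

```python
# Walk the network graph, keeping a list of peers already contacted so that
# loops in the graph do not trap us.
def discover_peers(start_neighbours, neighbours_of):
    """Return every peer reachable from start_neighbours, each visited once."""
    contacted = []                      # the "list" described in the text
    to_contact = list(start_neighbours)
    while to_contact:
        peer = to_contact.pop(0)
        if peer in contacted:
            continue                    # already spoke to this peer; skip it
        contacted.append(peer)
        # Ask this peer who it is connected to, and queue anyone new.
        for neighbour in neighbours_of[peer]:
            if neighbour not in contacted:
                to_contact.append(neighbour)
    return contacted

# A small example network with a loop between P1, P2, and P3.
network = {"P1": ["P2", "P3"], "P2": ["P1", "P3"],
           "P3": ["P1", "P2", "P4"], "P4": ["P3"]}
found = discover_peers(["P1"], network)  # each peer appears exactly once
```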

An interesting consequence of this is that if peers on the network are already doing this searching and logging process, a new peer could simply ask one of its peers for its list of peers (since that peer would have already done the work) and then get started on reaching out to all of them to calculate their latency and find its ideal peers... Although, if all of the peers acted this way to begin with, there would no longer be a list to ask for, since none of them would have done the initial work of searching and logging. Instead, there will need to be two processes for maintaining the list of known peers on the network: bootstrapping and appending. The distributed network needs to be started somehow, so there will need to be N peers that bootstrap the network by joining to each other (i.e. creating a mesh network) and noting the peers they connected to in their own lists. Now that the network has been bootstrapped, new peers joining the network can ask for another peer's list. However, that list would only contain the bootstrap peers if no other peers were being added to the lists. To resolve that issue, whenever a new peer joins the network and advertises that it joined, all other peers will append the new peer to their lists. This means that the new peer will already be on the list it requests, so one more rule needs to be added: a peer ignores its own entry when traversing the list.
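One way the bootstrap/append bookkeeping could look is sketched below; the class and method names are illustrative assumptions on my part, not a defined protocol:

```python
class PeerList:
    """Each peer's running list of every known peer on the network."""

    def __init__(self, own_name):
        self.own_name = own_name
        self.known = set()

    def bootstrap(self, mesh_peers):
        # Bootstrapping: the first N peers mesh together and record each other.
        self.known.update(mesh_peers)

    def on_join_alert(self, new_peer):
        # Appending: every existing peer adds newcomers as they advertise.
        self.known.add(new_peer)

    def traverse(self, list_from_peer):
        # A requested list will already contain the asker itself, so skip
        # our own entry when working through someone else's list.
        return [p for p in list_from_peer if p != self.own_name]

# Three bootstrap peers mesh together; this is P1's view.
p1 = PeerList("P1")
p1.bootstrap({"P2", "P3"})
# Later, P4 joins, advertises itself, and asks P1 for its list.
p1.on_join_alert("P4")
p4 = PeerList("P4")
to_contact = p4.traverse(p1.known)  # covers P2 and P3, but not P4 itself
```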

Optimized Logging

The "searching and logging" method improves the advertise latency method, but it still has a scaling problem since every peer will need to contact every other peer on the network before it can join. While the network is small this would be negligible, but as the network grows along with further geographic gaps between the peers the discovery process for a new peer could take quite awhile to complete (e.g. if 1000 peers have a latency of 1 second to the new peer and the search is done serially, the discovery process would already take over 15 minutes). This process could be improved if when each peer is building their list of peers they also included the latency to each peer alongside its entry. Now whenever a new peer joins the network via its initial N peers it can do a latency test against them and ask the one with the lowest latency for its list of peers while ignoring the others. Since the new peer should already be in the listing with its latency listed, it can look for other peers in the list and contact all peers with a similar latency as prime candidates for performing a latency test. Some of those tests will come out a lot higher since this heuristic is using the latency relative to the other peer and not itself, so only a portion of the peers in the same latency range as the new peer will actually have a low latency connection. A good way to rationalize this is to look at the following scenario:

  • There are three peers (designated as P1 through P3) already on the network, connected to each other in a mesh configuration.
  • The latency from P1 to P2 and P3 are 1ms and 2ms (respectively).
  • The latency from P2 to P1 and P3 are both 1ms.
  • The latency from P3 to P1 and P2 are 2ms and 1ms (respectively).
  • A new peer, P4, wants to join the network and does so by connecting to P2.

This scenario is using an (ill-advised) configuration of N=1, but is purely used here for demonstration purposes. Suppose the peers are physically laid out like the following, where each adjacent hop represents 1ms of latency:

  P1 - P2 - P4
       |
       P3

From the list P4 gets back from P2, it would see that its latency to P2 is 1ms, and so are the latencies from P1 and P3 to P2. However, when it does its own latency checks, P4 finds its latency to P1 and P3 are 2ms and 1.4ms, respectively (do note that in reality latency is not just a geographic distance problem but depends on the networking infrastructure between any two peers; I only use distance here to keep the problem simple). In this scenario P4 would remain connected to P2 since it has the lowest latency, but this helps illustrate how the latency results from another peer cannot solely be relied on for finding good candidates to use as immediate peers.
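The relative-latency heuristic amounts to filtering another peer's annotated list for entries near our own latency, then verifying each candidate with our own ping. The function name and tolerance value below are my own illustrative choices:

```python
def candidates_near(listing, own_name, own_latency_ms, tolerance_ms=1.0):
    """Pick peers whose latency *to the list owner* is close to our own.

    listing: dict of peer name -> that peer's latency to the list owner.
    The results are only candidates; each still needs our own ping to
    confirm, since these latencies are relative to the other peer, not us.
    """
    return sorted(p for p, lat in listing.items()
                  if p != own_name and abs(lat - own_latency_ms) <= tolerance_ms)

# P4's view of P2's list from the scenario above (latencies to P2, in ms):
listing_from_p2 = {"P1": 1.0, "P3": 1.0, "P4": 1.0}
candidates = candidates_near(listing_from_p2, "P4", own_latency_ms=1.0)
# P1 and P3 come back as candidates; P4's own pings then reveal their
# true latencies of 2ms and 1.4ms.
```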

For larger networks and/or when N is a higher value, the candidates found from the initial peers may not be the most optimal peers on the network, so the search can continue by performing the same process again on the new immediate peers. If continued, this process would traverse the entire network and find the "global minimum": the N peers with the lowest latency. Since we want to avoid traversing the whole network, we can instead impose a latency "threshold" such that the peer keeps iterating through the searching process until all N immediate peers have a latency at or below the threshold, or it has traversed the whole network. In the worst case the full network still gets searched, but this method now provides a way to stop sooner, contacts fewer peers, and relies less on a blind search.
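The threshold search could be sketched as follows, under the assumption that `ping` is a real latency probe and `neighbours_of` a lookup of a peer's immediate peers (both are hypothetical stand-ins here, modeled as a function and a dict):

```python
# Iteratively measure latency outward from the initial peers, keeping the
# best n found so far, and stop early once all n are within the threshold.
def find_peers(initial, ping, neighbours_of, n, threshold_ms):
    measured = {}                      # peer -> latency we measured ourselves
    frontier = list(initial)
    while frontier:
        peer = frontier.pop(0)
        if peer in measured:
            continue
        measured[peer] = ping(peer)
        best = sorted(measured, key=measured.get)[:n]
        if len(best) == n and all(measured[p] <= threshold_ms for p in best):
            return best                # good enough; stop searching early
        # Otherwise widen the search through this peer's neighbours.
        frontier.extend(x for x in neighbours_of[peer] if x not in measured)
    return sorted(measured, key=measured.get)[:n]  # searched everything

# Toy network: latency improves the further we walk from P1.
latency = {"P1": 50, "P2": 30, "P3": 10, "P4": 5}
topology = {"P1": ["P2"], "P2": ["P1", "P3"],
            "P3": ["P2", "P4"], "P4": ["P3"]}
best = find_peers(["P1"], latency.__getitem__, topology, n=2, threshold_ms=20)
# The search stops as soon as two peers at or under 20ms are found.
```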

Solving Partitioning

With the last method above, we have a good way of solving the latency issue such that peers in the network will all have low latency to their immediate peers, so as a consequence any routing done by hopping through the network would be devoid of artificial bottlenecks. However, this still does not address the partitioning issue, since the current rules still allow immediate peers to be selected without any regard for the connection to the rest of the network as a whole. I will provide solutions for this in my next article on this topic.

Author's Note

I am aware that I am not the first one to try and solve these problems. Part of why I am doing this is for the fun of trying to work through these problems and figure out why some approaches are better or worse than others, rather than going with the solution someone else went with. A truth I have accepted in this world is "there are no perfect solutions, only trade-offs", so this is my own personal journey into the trade-offs I feel should exist in a distributed networking system. Your opinions on this ~may~ will likely differ from mine, but that is all part of the process. This article exists more as a resource for others who are interested in this topic than as a self-proclaimed paragon of the best solution for distributed networking.