The size of a network is important not so much because of the sheer quantity of elements we are dealing with, but more because it sets the context for how close or far on average one node in the network is from another. And this is important because it will tell us how quickly something will spread through the network and also how integrated different components in the network are likely to be. Making a connection within a network, or traveling from one node to another is rarely free. It typically costs some resource, whether this is the cost of fuel to travel in a transportation network, the laying of cables to transport information, or the risk of rejection when you ask someone out on a date. The further we have to travel along a network to get from A to B the more it will cost and the less likely it will occur, with the result being a lower level of integration to the system.
As an example of this, we might think of the Russian empire in the 1900s, a vast land mass spanning from Europe to East Asia. But without any coherent transportation network connecting it, it was continuously under threat of falling to pieces. With the building of the Trans-Siberian rail network, information, goods and resources could eventually diffuse to the different parts of the system, and it is this interaction between components to the system that gives it cohesion and integration. We may be dealing with a very large system, but if there is no diffusion or interaction across the network it ceases to function as an entirety.
Two important metrics for capturing this overall distance between nodes are; the network’s diameter and its average path length. The diameter of a network is simply the longest of all the geodesics in the network, where geodesic means the shortest path between two nodes.1 When we are asking for the diameter of a network we are looking at all the shortest paths and then choosing the longest one, and this will give us an idea of how far something might have to travel to get all the way across the network. The average path length is calculated by finding the shortest path between all pairs of nodes, adding them up, and then dividing by the total number of pairs. This will show us the number of steps on average it takes to get from one member of the network to another.
Small World Phenomenon
We can take a real world network and ask what is its average path length. For example, a number of years ago researchers studied the social network of Facebook when it had approximately 721 million users with 69 billion friendship connections between them. The average path length turned out to be just 4.74 intermediary connections.2 This appears to be an extraordinary low distance between any two members of such a large network. Quoting one of Facebook’s spokesperson on this finding, “In these two works, we show how the Facebook social network is at once both global and local. It connects people who are far apart, but also has the dense local structure we see in small communities.” This is what is called the small-world phenomena.3 We can see from this that the question of how close things are in a network is not just a product of its size, but it is also a product of the structure to the network as we would expect.
Looking at the two network topologies on the right will help to demonstrate this. If we look at the diameter of the ring network on top, we will see it is quite high. In fact, it is half the number of nodes in the network, if we look at the tree network on the bottom that has the same number of nodes we see the diameter is much shorter, due to this branching structure to the network; which is a much more efficient way of connecting things. The thing to note is that with the ring topology, the number of nodes one can reach is only growing in a linear fashion relative to the number of links one have to traverse. So if I travel 1 link away, I can reach 2 nodes. If I travel 2 links away I can reach 4 nodes. If I go 3 links I can reach 6, and so on. This is a linear progression, in contrast to our other network structure if we place ourselves at the top of the tree. Now due to this branching structure, we can get an exponential growth in the number of nodes we can reach relative to the distance we have to travel. At one step we reach just 2 nodes, but at 2 steps I can reach 6 nodes, at 3 steps the whole network of 14 nodes. And it is this same exponential growth that is one of the factors behind the small-world phenomena allowing us to have a surprisingly short average path length even within very large networks like Facebook.
Thus, even when we have a very larger network, if it has the right structure then we will be able to reach more nodes in a shorter path length than if the network was smaller but had a structure that gave it only a linear relation between distance traveled and nodes connected to. We often think about and measure size and scale in terms of some static quantity, the number of people in a city or the creatures in an ecosystem. But as networks are all about connectivity, what really matters with respect to scale here is how far you are away from other nodes which can be dramatically altered by simply restructuring the network, and thus scale becomes much more subjective and relative to the topology of the network.