Issue 14
What are the Fallacies of Distributed Computing?
Reading time: 5 minutes
Why should I care?
These are a set of fallacies - things we wrongly assume - about networks.
Almost everything we build relies on a network of some sort, whether that's the connection between the user's browser and your web application or a complex piece of cloud infrastructure.
This list of 8 fallacies was created at Sun Microsystems in the 90s, but what are they, and are they still relevant today?
(spoiler: whilst some of them are less relevant, a lot of them are even more important today!)
The Fallacies of Distributed Computing
1. The network is reliable
We often write our code with the assumption that all the other computers on the network are available and listening.
When we're developing locally, everything 'just works'; there are no network outages or patchy connections.
But in the real world, this is not always the case. Sometimes a third-party API isn't listening, or the user loses their network connection.
We can handle this in various ways, including caching important things locally, load-balancing, and retrying failed requests.
The important thing is that we think about possible network failures while we're building software, and it isn't an afterthought.
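One of those strategies, retrying failed requests, can be sketched in a few lines. This is a minimal illustration, not a production library; `flaky_service` is a made-up stand-in for a real network call, and exponential backoff with jitter is one common retry policy among several:

```python
import random
import time

def call_with_retries(operation, max_attempts=3, base_delay=0.5):
    """Retry a flaky network operation with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Back off exponentially, with a little jitter so many clients
            # don't all retry at exactly the same moment.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))

# Simulate a service that fails twice, then recovers.
calls = {"count": 0}

def flaky_service():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("service unavailable")
    return "ok"

result = call_with_retries(flaky_service, base_delay=0.01)
```

The key design point is that the retry logic lives in one reusable wrapper, so the rest of the code can assume the network is unreliable without every call site handling it by hand.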
2. Latency is zero
How often do you test your software on slow connections? If you're like me, the answer is 'not often enough'!
We usually assume that networks are fast, and that the calls we make are almost instant - but what about when they aren't?
This doesn't just apply to connections between the client and server. Your database connection might be slow for a period, or that third-party API you're calling from the server might not respond immediately.
Try to reduce the impact of a slow connection; cache things where it's sensible to do so, and try not to block the user interface while you're waiting for something to happen.
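Caching is the simplest of those mitigations to show. Here's a rough sketch of a time-to-live cache, assuming a hypothetical `slow_fetch` function standing in for an expensive network call:

```python
import time

class TTLCache:
    """Cache slow network responses for a short time-to-live."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry time)

    def get_or_fetch(self, key, fetch):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: skip the slow network call
        value = fetch()  # slow path: actually hit the network
        self.store[key] = (value, now + self.ttl)
        return value

fetches = {"count": 0}

def slow_fetch():
    fetches["count"] += 1
    return "payload"

cache = TTLCache(ttl_seconds=60)
first = cache.get_or_fetch("profile", slow_fetch)
second = cache.get_or_fetch("profile", slow_fetch)  # served from the cache
```

Within the TTL window, repeated requests never touch the network at all, which turns a latency problem into a memory-lookup problem for the common case.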
3. Bandwidth is infinite
Networks can usually handle a reasonable amount of traffic.
You and your users are likely to have a fast internet connection at home, so bandwidth between them and the server is probably reasonable. It's also rare (although not impossible) to encounter bandwidth issues on the server side.
One common case where we do have to care about bandwidth is when your users are on a mobile device.
Consider whether you can serve a lighter version of your application to mobile devices, and try to ensure that it's still usable while they wait for everything to arrive from the server.
4. The network is secure
This is as important today as it ever was, perhaps more so.
You can never guarantee that the caller of your APIs isn't trying to do something malicious, or that a browser isn't infected with malware.
It's not out of the question that communication between your own servers is compromised either. We're always finding new exploits that allow malicious actors to run their code on somebody else's servers.
Always build with security in mind, and never assume data sent or received over the network is safe.
5. Topology doesn't change
'Topology' refers to the layout of your network - what servers are running and where they are.
This could change for a number of reasons; perhaps a server crashes, or maybe your cloud provider moves your application to another data centre.
If the server you're calling is behind a load-balancer, two consecutive calls might end up going to completely different servers.
As far as possible, we should write our code to be independent of the network topology.
That could be as simple as not hard-coding IP addresses, or as involved as using another layer to pass messages, avoiding direct server-to-server communication altogether.
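The 'simple' end of that spectrum looks something like this: resolve the address from configuration with a named fallback, rather than baking an IP address into the code. The environment variable name here is purely illustrative:

```python
import os

# Look the service address up from configuration (an environment variable
# with a sensible default), instead of hard-coding an IP address.
# "ORDERS_SERVICE_URL" is a hypothetical variable name for illustration.
def service_url(env_var, default):
    return os.environ.get(env_var, default)

url = service_url("ORDERS_SERVICE_URL", "http://orders.internal:8080")
```

If the orders service moves to a different host, only the configuration (or the DNS record behind the hostname) changes; the calling code doesn't.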
6. There is one administrator
Any reasonably complicated network will be managed by multiple people.
Perhaps the infrastructure for each 'service' in your application is owned by a different team, for example.
We can make shared administration easier by 'decoupling' the system, so different areas of the network can be managed separately.
Often we'll use a message passing layer (e.g. a service bus) to communicate between different parts of the system. That means we're free to change one area of the network without affecting another.
This fallacy is strongly related to the previous one ('topology doesn't change'). If different areas of the network are administered separately, the topology might change at any time, so we want to try and avoid being 'coupled' to a particular topology.
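The decoupling idea can be sketched with an in-process queue standing in for a real service bus. This is a toy model, not a distributed system: the point it illustrates is that the producer only knows about the queue, never about which consumer (or how many) will handle the message:

```python
import queue

# A minimal in-process stand-in for a service bus.
bus = queue.Queue()

def publish(event):
    # The producer just drops a message on the bus; it has no idea
    # who consumes it, or where that consumer is running.
    bus.put(event)

def consume():
    # The consumer pulls from the bus; it has no idea who published.
    return bus.get_nowait()

publish({"type": "order_placed", "order_id": 42})
event = consume()
```

Swap the in-memory queue for RabbitMQ, Kafka, or a cloud queue and the shape stays the same: either side can be moved, scaled, or re-administered without the other noticing.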
7. Transport cost is zero
The second fallacy ('latency is zero') talked about how it isn't 'free' to send something over the network - there's a time cost.
Sometimes it's small, sometimes it isn't, but it's always there.
This fallacy says the same thing, but about resources.
In the modern world of cloud computing, this is usually less of an issue. It's rare to have to pay for data sent over a network, and if you're paying for networking infrastructure in the cloud, that cost doesn't usually change based on the amount of data we send.
We should still be mindful of what we're sending over the network, however.
For example, traffic from a user's mobile device might not be free, so try to be as efficient as possible when sending data over the network.
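One cheap way to be efficient is to compress responses before they cross the wire. Repetitive JSON compresses very well, as this small sketch (using Python's standard `gzip` and `json` modules) shows:

```python
import gzip
import json

# A repetitive payload, like a typical API list response.
records = [{"id": i, "status": "shipped"} for i in range(1000)]

raw = json.dumps(records).encode("utf-8")
compressed = gzip.compress(raw)

# The compressed form is a fraction of the size, and decompresses
# back to exactly the same data on the other side.
round_tripped = json.loads(gzip.decompress(compressed))
```

In practice you'd let your web server or framework negotiate this via the `Content-Encoding: gzip` header rather than doing it by hand, but the bandwidth saving is the same.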
8. The network is homogeneous
'Homogeneous' means that the network is made up of computers that are configured in the same way, and communicate using the same protocols.
That's rarely the case, especially in the world of cloud computing. Sometimes - when using containers, for example - you might not even know what operating system your server is running.
That means we have to do our best to make communication 'standard' across the network.
This is why we prefer standardised protocols - like HTTP - or standardised data formats, like JSON or XML.
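The value of a standard format is that serialising and parsing are symmetric, whatever machines sit at either end. A minimal round trip with Python's standard `json` module:

```python
import json

# JSON gives two differently-configured machines a common wire format:
# the sender serialises a native structure to text...
message = {"user": "ada", "roles": ["admin"], "active": True}
wire = json.dumps(message)

# ...and the receiver, which may run a different OS, language, or
# architecture, parses the same text back into its own native types.
received = json.loads(wire)
```

Because both sides agree on the format rather than on each other's internals, a Python service, a Java service, and a browser can all exchange the same message.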
Want to know more?
Check out these links: