What is a Distributed System?

A distributed system is a collection of independent components running on different machines that communicate with each other in order to operate as a single unit.

In this complete introduction, learn how distributed systems work and explore real-world examples, basic architectures, the benefits and disadvantages, and common solutions for real-time distributed streaming.

How Distributed Systems Work

Distributed System - Definition

Closely tied to distributed computing and distributed databases, a distributed system is a collection of independent components located on different machines that exchange messages with each other in order to achieve common goals.

As such, the distributed system appears to the end user as a single interface or computer. The goal is that, together, the machines make the best use of resources and information while tolerating failures: if one machine fails, the service as a whole remains available.

Today, data is more distributed than ever, and modern applications no longer run in isolation. The vast majority of products and applications rely on distributed systems.

Elements of a Distributed System

The most important characteristics of distributed computing are:

Resource sharing - hardware, software, or data can be shared across machines
Openness - how openly the software is designed so that it can be developed, extended, and shared
Concurrency - multiple machines can process the same function at the same time
Scalability - how computing and processing capacity grows as the system is extended to more machines
Fault tolerance - how easily and quickly failures in parts of the system can be detected and recovered from
Transparency - how well the system hides its distribution, so that a node can locate and communicate with other nodes without needing to know where they are

Modern distributed systems have evolved to include autonomous processes that might run on the same physical machine, but interact by exchanging messages with each other.
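As a rough illustration of autonomous processes on one machine that interact only through messages, the sketch below starts a second operating system process and exchanges JSON messages with it over pipes. The worker code, the message fields ("id", "numbers", "result"), and the function name ask_worker are all hypothetical choices made for this example.

```python
import json
import subprocess
import sys

# Source for a hypothetical worker process: it reads one JSON message per
# line from stdin and writes one JSON reply per line to stdout.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": sum(request["numbers"])}
    print(json.dumps(reply), flush=True)
"""


def ask_worker(requests):
    """Start a separate OS process and exchange one message per request."""
    replies = []
    with subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as worker:
        for request in requests:
            # The two processes share no memory; they only pass messages.
            worker.stdin.write(json.dumps(request) + "\n")
            worker.stdin.flush()
            replies.append(json.loads(worker.stdout.readline()))
    # Leaving the "with" block closes the pipes, which ends the worker loop.
    return replies
```

Calling ask_worker([{"id": 1, "numbers": [1, 2, 3]}]) returns the worker's reply for that request. Replacing the pipes with a network connection would let the same two programs run on different machines.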

Distributed System Examples

Networks

One of the earliest examples of a distributed system dates to the 1970s, when Ethernet was invented and local area networks (LANs) were created. For the first time, computers could send messages to other machines on the same local network. Peer-to-peer networks evolved, followed by e-mail and then the Internet as we know it, which remains the largest and still-growing example of a distributed system. As the Internet has grown, including the transition from IPv4 to IPv6, distributed systems have evolved from LAN-based to Internet-based.

Telecommunication networks

Telephone and cellular networks are also examples of distributed networks. Telephone networks have been around for over a century and started as an early example of a peer-to-peer network. Cellular networks are distributed networks with base stations physically distributed in areas called cells. As telephone networks have evolved to VoIP (voice over IP), they continue to grow in complexity as distributed networks.

Distributed Real-time Systems

Many industries use real-time systems that are distributed locally and globally. Airlines use flight control systems, Uber and Lyft use dispatch systems, manufacturing plants use automation control systems, and logistics and e-commerce companies use real-time tracking systems.

Parallel Processing

There used to be a distinction between parallel computing and distributed systems. Parallel computing focused on how to run software on multiple threads or processors that accessed the same data and memory. Distributed systems meant separate machines with their own processors and memory. With the rise of modern operating systems, processors, and cloud services, distributed computing now also encompasses parallel processing.

Distributed Artificial intelligence

Distributed artificial intelligence is a way to use large-scale computing power and parallel processing to learn from and process very large data sets using multiple agents.

Distributed Database Systems

A distributed database is a database that is located over multiple servers and/or physical locations. The data can either be replicated or duplicated across systems.
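To make the idea of replicated data concrete, here is a minimal sketch in which every write is copied to all reachable replicas. It assumes in-memory dictionaries standing in for database servers, and the class name ReplicatedStore and the replica names are hypothetical. It does not show how a replica that was offline catches up, which a real database has to handle.

```python
class ReplicatedStore:
    """Toy key-value store that copies every write to all reachable replicas."""

    def __init__(self, replica_names):
        # Each replica is an in-memory dict standing in for a database server.
        self.replicas = {name: {} for name in replica_names}
        self.offline = set()  # names of replicas that are currently unreachable

    def put(self, key, value):
        # Replicate the write to every replica that is currently reachable.
        for name, data in self.replicas.items():
            if name not in self.offline:
                data[key] = value

    def get(self, key):
        # Read from the first reachable replica that holds the key.
        for name, data in self.replicas.items():
            if name not in self.offline and key in data:
                return data[key]
        raise KeyError(key)
```

For example, after store = ReplicatedStore(["us-east", "eu-west"]) and store.put("user:1", "Ada"), marking "us-east" as offline still lets store.get("user:1") answer from "eu-west".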

Most popular applications use a distributed database and need to be aware of the homogeneous or heterogeneous nature of the distributed database system.

A homogeneous distributed database means that each system has the same database management system and data model. Such databases are easier to manage, and their performance can be scaled by adding new nodes and locations.

Heterogeneous distributed databases allow for multiple data models and different database management systems. Gateways are used to translate the data between nodes. These setups usually arise as a result of merging applications and systems.
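A gateway between two data models can be pictured as a pair of translation functions. The sketch below assumes one node stores flat relational rows and another stores nested documents. The field names and both function names are invented for illustration and do not come from any particular product.

```python
def row_to_document(row):
    """Translate a hypothetical relational row into a document-style record."""
    return {
        "_id": row["customer_id"],
        "name": {"first": row["first_name"], "last": row["last_name"]},
        "email": row["email"],
    }


def document_to_row(doc):
    """Translate the document-style record back into the relational shape."""
    return {
        "customer_id": doc["_id"],
        "first_name": doc["name"]["first"],
        "last_name": doc["name"]["last"],
        "email": doc["email"],
    }
```

Translating a row to a document and back should return the original row. That round trip is a useful sanity check for any gateway mapping.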

Distributed System Architecture

Distributed systems must have a network that connects all components (machines, hardware, or software) together so they can transfer messages to communicate with each other.

That network could be an IP network, a set of cables, or even the connections on a circuit board.
The messages passed between machines contain the data that the systems want to share, such as databases, objects, and files.
How reliably messages are communicated (whether each one is sent, received, and acknowledged, and how a node retries on failure) is an important feature of a distributed system.
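The send, acknowledge, and retry behavior described in the last point can be sketched as follows. This is a minimal illustration under stated assumptions, not a production protocol: send is any callable that returns True when the receiver acknowledged the message, and FlakyChannel is a hypothetical stand-in for an unreliable network that drops the first few messages.

```python
import time


class DeliveryError(Exception):
    """Raised when no acknowledgement arrives after all attempts."""


def send_with_retry(send, message, max_attempts=3, base_delay=0.1):
    """Call send(message) until it returns True (an acknowledgement).

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if send(message):
                return attempt  # acknowledged
        except ConnectionError:
            pass  # treat a transport error like a lost message
        if attempt < max_attempts:
            # Exponential backoff: wait longer after each failed attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise DeliveryError(f"no acknowledgement after {max_attempts} attempts")


class FlakyChannel:
    """Hypothetical channel that drops the first `failures` messages."""

    def __init__(self, failures):
        self.failures = failures
        self.delivered = []

    def send(self, message):
        if self.failures > 0:
            self.failures -= 1
            return False  # message lost, no acknowledgement
        self.delivered.append(message)
        return True
```

With channel = FlakyChannel(2), the call send_with_retry(channel.send, "hello") succeeds on the third attempt. Retrying means a message can arrive more than once if only the acknowledgement was lost, so receivers are usually designed to tolerate duplicates.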
Distributed systems were created out of necessity as services and applications needed to scale and new machines needed to be added and managed. In the design of distributed systems, the major trade-off to consider is complexity vs performance.

To understand this, let’s look at types of distributed architectures, pros, and cons.

Types of Distributed System Architectures

Distributed applications and processes typically use one of the four architecture types below:

Client-server:

In the early days, distributed systems architecture consisted of a server acting as a shared resource, such as a printer, a database, or a web server. The server had multiple clients (for example, users at their computers) that decided when to use the shared resource, how to use and display it, how to change data, and when to send it back to the server. A code repository such as Git is a good example, where the intelligence sits with the developers committing changes to the code.
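A very small client-server exchange can be sketched with Python's standard socket and socketserver modules. The "service" here, which uppercases one line of text, and the names UppercaseHandler, request, and demo are made up for the example. Everything runs on the local machine, on a port chosen by the operating system.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Server side: the shared resource is a trivial 'uppercase' service."""

    def handle(self):
        line = self.rfile.readline()    # one request per connection
        self.wfile.write(line.upper())  # send the response back


def request(address, text):
    """Client side: connect, send one line, and return the server's reply."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(text.encode() + b"\n")
        with conn.makefile("rb") as reader:
            return reader.readline().decode().strip()


def demo():
    # Port 0 asks the operating system for any free local port.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), UppercaseHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            # Several clients decide independently when to use the server.
            return [request(server.server_address, t) for t in ("hello", "world")]
        finally:
            server.shutdown()
```

Calling demo() starts the server, sends two client requests, and returns their replies. The server only responds; each client decides when to connect and what to do with the answer.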

Today, distributed systems architecture has evolved with web applications into:

Three-tier: In this architecture, the clients no longer need to be intelligent and can rely on a middle tier to do the processing and decision making. Most of the first web applications fall under this category. The middle tier, which can be stateless, acts as an agent that receives requests from clients, processes the data, and then forwards it on to the servers.
Multi-tier: Enterprise web services first created n-tier or multi-tier systems architectures. This popularized the application servers that contain the business logic and interact with both the data tiers and the presentation tiers.
Peer-to-peer: There is no centralized or special machine that does the heavy lifting and intelligent work in this architecture. All the decision making and responsibilities are split up amongst the machines involved, and each could take on client or server roles. Blockchain is a good example of this.
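The division of labor in the three-tier style can be sketched with three small pieces of code. In a real deployment each tier would typically run on its own machine and talk over the network; here they are plain Python objects, and the class names, the order fields, and the pricing rule are all hypothetical.

```python
class DataTier:
    """Data tier: only stores and returns records."""

    def __init__(self):
        self._orders = {}

    def save(self, order_id, order):
        self._orders[order_id] = order

    def load(self, order_id):
        return self._orders[order_id]


class MiddleTier:
    """Stateless middle tier: validates and processes requests, then
    forwards the result to the data tier."""

    def __init__(self, data_tier):
        self.data_tier = data_tier

    def place_order(self, order_id, quantity, unit_price):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        order = {"quantity": quantity, "total": quantity * unit_price}
        self.data_tier.save(order_id, order)
        return order


def thin_client(middle_tier):
    """Presentation tier: sends a request and displays the result,
    with no business logic of its own."""
    order = middle_tier.place_order("A-1", quantity=3, unit_price=5)
    return f"Order A-1 total: {order['total']}"
```

Because the middle tier keeps no state between requests, more copies of it could be added behind a load balancer without changing the client or the data tier.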

Pros and Cons of Distributed Systems

Advantages of Distributed Systems:

The ultimate goal of a distributed system is to enable the scalability, performance and high availability of applications.

Major benefits include:

Unlimited Horizontal Scaling - machines can be added whenever required, so capacity is not capped by the size of a single machine.
Low Latency - placing machines geographically closer to users reduces the time it takes to serve them.
Fault Tolerance - if one server or data centre goes down, others could still serve the users of the service.
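The low latency and fault tolerance benefits often show up together in request routing: send each user to the closest machine that is still healthy. The sketch below assumes latency measurements and health status are already available. The function name pick_server, the server names, and the numbers are invented for illustration.

```python
def pick_server(latency_ms, healthy):
    """Return the healthy server with the lowest measured latency.

    latency_ms maps server name to round-trip time in milliseconds;
    healthy is the set of server names currently passing health checks.
    """
    candidates = {name: ms for name, ms in latency_ms.items() if name in healthy}
    if not candidates:
        raise RuntimeError("no healthy servers available")
    return min(candidates, key=candidates.get)


# Hypothetical measurements as seen by a user in Europe.
latency = {"frankfurt": 18, "virginia": 95, "singapore": 210}
closest = pick_server(latency, {"frankfurt", "virginia", "singapore"})
fallback = pick_server(latency, {"virginia", "singapore"})  # frankfurt is down
```

With these numbers the user is normally served from "frankfurt". If that server goes down, the request falls back to "virginia" at a higher latency rather than failing.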

Disadvantages of Distributed Systems:

Every engineering decision has trade-offs. Complexity is the biggest disadvantage of distributed systems. There are more machines, more messages, and more data being passed between more parties, which leads to issues with:

Data Integration & Consistency - synchronizing the order of changes to data and application state in a distributed system is challenging, especially when nodes are starting, stopping, or failing.
Network and Communication Failure - messages may not be delivered to the right nodes, or may arrive in the wrong order, which leads to a breakdown in communication and functionality.
Management Overhead - more intelligence, monitoring, logging, and load balancing functions need to be added for visibility into the operation and failures of the distributed system.
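One common way to cope with messages that arrive out of order is to number them at the sender and reorder them at the receiver. The sketch below is a minimal illustration that assumes the sender attaches a sequence number starting at 0. The class name InOrderReceiver is hypothetical.

```python
class InOrderReceiver:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self):
        self.next_seq = 0  # sequence number expected next
        self.pending = {}  # messages that arrived early, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one message and return the payloads that are now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate (for example, a retried send): ignore it
        self.pending[seq] = payload
        ready = []
        # Release every message that is next in line.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

If message 1 arrives before message 0, the receiver holds it back and releases both, in order, once message 0 arrives. This handles ordering on a single connection; agreeing on an order across many nodes is a much harder problem.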

How Distributed Streaming Platforms Can Help

Confluent is the only data streaming platform for any cloud, on-prem, or hybrid cloud environment. Connect 120+ data sources with enterprise grade scalability, security, and integrations for real-time visibility across all your distributed systems.