Five Required Characteristics of a Reliable Distributed System: An Analysis of Resource Sharing, Openness, Scalability, Concurrency and Fault Tolerance

Published: 2022/01/10
Number of words: 1223

Introduction

A distributed system is a set of independent components located on different machines, which interact to share resources and information in pursuit of common goals. While a distributed system may comprise several disparate components, the end-user should be able to access all of them through a single interface. Distributed systems make efficient use of resources, support scalability, redundancy and fault tolerance, and have enabled large-scale, high-performance computing operations. This assignment discusses five key characteristics required to build a reliable distributed system: resource sharing, openness, scalability, concurrency and fault tolerance.

Overview of distributed systems

Distributed systems are built on sets of independent, interconnected components that operate within a network to give the user access to the full breadth of the system's resources. They are used in telecommunications networks, real-time systems, parallel computing and database systems to provide efficient, scalable and timely access to resources, and may adopt a number of different architectures, such as client-server, three-tier, n-tier and peer-to-peer. For example, parallel computing allows software to run on several processors which each access the same memory and data, while distributed database systems allow data to be accessed across numerous servers. To achieve such functions, distributed systems rely on a common network that allows information to flow between individual machines. Distributed systems allow for greater scalability, performance and availability of software applications, since additional machines can be added, or another machine used if the one currently in use encounters a fault (Sari & Akkaya, 2015). However, distributed systems may suffer from communication or network-related failures if messages are not delivered to the right nodes, and may be unable to integrate and synchronize data consistently, especially when multiple machines or nodes fail across the network (Sari & Akkaya, 2015).

Resource sharing

The first key characteristic of distributed systems is resource sharing, which refers to the ability of a user to access resources across multiple components and machines in the system. This should extend to hardware, software and data, and should also serve as the backbone of the consistent and synchronised exchange of information across the distributed network. For example, the StackPath distributed system shares resources across its network nodes and, for its content delivery service, stores the most commonly requested content in the nodes closest to the respective consumer, allowing the network to serve consumers effectively wherever they are based.
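The idea of serving content from the node closest to the consumer can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how StackPath or any real content delivery network is implemented: the `EdgeNode` class, the `ORIGIN` store, the one-dimensional `location` values and the `fetch` function are all hypothetical names invented for the example.

```python
class EdgeNode:
    """Hypothetical edge node holding a local cache of shared content."""

    def __init__(self, name, location):
        self.name = name
        self.location = location  # position on a 1-D line, for illustration only
        self.cache = {}


# Assumed origin store that every node can fall back to.
ORIGIN = {"/logo.png": b"logo-bytes"}


def nearest(nodes, consumer_location):
    """Pick the node with the smallest distance to the consumer."""
    return min(nodes, key=lambda node: abs(node.location - consumer_location))


def fetch(nodes, consumer_location, path):
    """Serve `path` from the nearest node, filling its cache on a miss."""
    node = nearest(nodes, consumer_location)
    hit = path in node.cache
    if not hit:
        node.cache[path] = ORIGIN[path]  # copy the shared resource to the edge
    return node.name, hit, node.cache[path]


nodes = [EdgeNode("london", 0), EdgeNode("tokyo", 100)]
first = fetch(nodes, 90, "/logo.png")   # miss: content is pulled from the origin
second = fetch(nodes, 90, "/logo.png")  # hit: content is served from the edge cache
```

The first request from a consumer near the second node populates that node's cache, and later requests from the same region are served locally.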

Openness

Openness refers to the ability of a distributed system to be extended and improved in its hardware and software components. The system should be able to accept and remove components at will, based on a detailed and standardised component interface. New components should integrate with existing ones without causing compatibility issues, and the user should perceive the result as an integrated whole rather than a poorly integrated collection of individual components and machines. Generally, an open distributed system, such as those offered by companies like IBM, should make its resource sharing services and the ability to add new machines and components freely available to all users. For example, two local area networks could be connected through an open distributed system architecture so that new users can be added at will.
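A standardised interface of this kind might look like the following sketch. It assumes a single hypothetical `Component` interface with one `handle` method; the `Printer`, `Storage` and `System` classes are invented for illustration and do not correspond to any particular vendor's product.

```python
from typing import Protocol


class Component(Protocol):
    """Hypothetical standardised interface every component must satisfy."""

    name: str

    def handle(self, request: str) -> str: ...


class Printer:
    name = "printer"

    def handle(self, request: str) -> str:
        return f"printed:{request}"


class Storage:
    name = "storage"

    def handle(self, request: str) -> str:
        return f"stored:{request}"


class System:
    """Presents registered components to the user as one integrated whole."""

    def __init__(self) -> None:
        self._components: dict[str, Component] = {}

    def add(self, component: Component) -> None:
        self._components[component.name] = component

    def remove(self, name: str) -> None:
        self._components.pop(name, None)

    def request(self, name: str, payload: str) -> str:
        if name not in self._components:
            raise LookupError(f"no component named {name!r}")
        return self._components[name].handle(payload)


system = System()
system.add(Printer())
system.add(Storage())
result = system.request("storage", "report.txt")
```

Because `System` depends only on the interface, a component can be added or removed without changing the code that users call.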

Scalability

Scalability refers to the ability of the distributed system to increase in bandwidth and processing power while retaining its functionality. It serves as a test of how well the distributed system manages growth, whether in machines or users, and a scalable distributed system should not require major modifications or redesigns each time it grows. Furthermore, latency and availability should remain at a reasonable level as the system scales, although the team may have to make trade-offs between them. For example, an Internet of Things distributed network may need to add edge computing devices such as sensors, and the distributed system should support the additional hardware and computing bandwidth without a major design modification (Vladyko et al., 2019). As another example, when an e-commerce platform such as Amazon or Shopee continues to grow, it may need to add machines to cope with the traffic of new users accessing the network. A distributed system could help to spread resource use across different usage periods to prevent the exhaustion of computing bandwidth during high-usage periods, and continue to support the addition of new users and machines as the company's operations expand geographically.
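One common way to add machines without redesigning the system is consistent hashing. The essay does not name this technique, so the sketch below is offered only as one possible illustration; the `HashRing` class, the machine names and the `replicas` parameter are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with virtual points per machine."""

    def __init__(self, replicas: int = 100) -> None:
        self.replicas = replicas
        self._points: list[int] = []      # sorted ring positions
        self._owner: dict[int, str] = {}  # ring position -> machine name

    def add_machine(self, name: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{name}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = name

    def machine_for(self, key: str) -> str:
        # The first ring position after the key's hash owns the key (wrapping around).
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing()
for machine in ("m1", "m2", "m3"):
    ring.add_machine(machine)

keys = [f"user-{i}" for i in range(1000)]
before = {key: ring.machine_for(key) for key in keys}
ring.add_machine("m4")  # scale out by one machine
after = {key: ring.machine_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Only the keys that now belong to the new machine change owner; every other key stays where it was, which is why the existing machines need no reconfiguration when the system grows.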

Concurrency

Concurrency refers to the ability of several machines across the distributed system network to process the same function simultaneously. It ensures that processes on distinct machines are managed by a common system that allows them to execute simultaneously. For example, running multiple applications on a distributed computer system requires concurrency as a criterion. This in turn requires synchronisation and coordination mechanisms to control access to shared resources across numerous activities, so that processes can run concurrently without interfering with each other.
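The role of synchronisation can be shown with a small single-machine analogy, in which threads stand in for the separate machines of a distributed system and a lock stands in for the coordination mechanism. The `SharedCounter` class and `run_workers` function are hypothetical names; a real distributed system would need a distributed lock or a similar protocol rather than `threading.Lock`.

```python
import threading


class SharedCounter:
    """A shared resource whose updates are guarded by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one worker at a time may perform the read-modify-write.
        with self._lock:
            self.value += 1


def run_workers(workers: int = 8, per_worker: int = 10_000) -> int:
    """Run several workers concurrently against one shared counter."""
    counter = SharedCounter()

    def work() -> None:
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


total = run_workers(workers=4, per_worker=1000)
```

With the lock in place, every increment is applied exactly once, so the final value equals the number of workers multiplied by the increments each performs.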

Fault tolerance

Fault tolerance refers to the ability of the distributed system to quickly identify and rectify faults across the system while maintaining service accessibility. It measures how effectively a distributed system can continue to provide a desired service despite failures caused by events such as a power outage, denial-of-service attack or natural disaster. For example, backup hardware and data sources can be installed as redundancies across the distributed system, so that the user can still access a backup version of a machine or data source even if the primary source is destroyed or compromised.
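The redundancy described above can be sketched as a read that falls back to a backup when the primary is unavailable. This is a simplified illustration under stated assumptions: the `Replica` class, the `ReplicaDown` exception and the `read_with_failover` function are hypothetical, and the sketch assumes the replicas already hold identical data.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot serve requests."""


class Replica:
    """Hypothetical data source that may be healthy or failed."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = data
        self.healthy = healthy

    def read(self, key):
        if not self.healthy:
            raise ReplicaDown(self.name)
        return self.data[key]


def read_with_failover(replicas, key):
    """Try each replica in order and return the first successful read."""
    failures = []
    for replica in replicas:
        try:
            return replica.name, replica.read(key)
        except ReplicaDown as exc:
            failures.append(str(exc))  # record the fault and try the next copy
    raise RuntimeError(f"all replicas failed: {failures}")


data = {"order-42": "shipped"}
primary = Replica("primary", data, healthy=False)  # simulate a failed primary
backup = Replica("backup", dict(data))
served_by, value = read_with_failover([primary, backup], "order-42")
```

The caller receives the same value whether or not the primary is available, which is the property the user experiences as uninterrupted service.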

Conclusion

Distributed systems enable efficient and effective high-performance computing, primarily through the five key characteristics of resource sharing, openness, scalability, concurrency and fault tolerance. These characteristics ensure that distributed systems can scale and integrate new components while maintaining low latency, high availability and uninterrupted access to resources across the network for their users.

References

Jiang, Y. (2015). A survey of task allocation and load balancing in distributed systems. IEEE Transactions on Parallel and Distributed Systems, 27(2), 585-599. https://doi.org/10.1109/TPDS.2015.2407900

Jogalekar, P., & Woodside, M. (2000). Evaluating the scalability of distributed systems. IEEE Transactions on Parallel and Distributed Systems, 11(6), 589-603. https://doi.org/10.1109/71.862209

Sari, A., & Akkaya, M. (2015). Fault tolerance mechanisms in distributed systems. International Journal of Communications, Network and System Sciences, 8(12), 471. https://doi.org/10.4236/ijcns.2015.812042

Tanenbaum, A. S., & Van Steen, M. (2007). Distributed systems: Principles and paradigms. Prentice-Hall.

Vladyko, A., Khakimov, A., Muthanna, A., Ateya, A. A., & Koucheryavy, A. (2019). Distributed edge computing to assist ultra-low-latency VANET applications. Future Internet, 11(6), 128. https://doi.org/10.3390/fi11060128
