Distributed Operating System - DOS - FaQ 4 - IndianTechnoEra

Section - A (Each question of 1 mark)

i. Which types of networks can be used by distributed systems?

Distributed systems can function across various network types, including:

Local Area Networks (LANs): Small, high-speed networks connecting devices within a single location (e.g., office building).
Wide Area Networks (WANs): Geographically dispersed networks connecting devices across large distances (e.g., national or international).
Metropolitan Area Networks (MANs): Networks spanning a larger area than a LAN but smaller than a WAN (e.g., city-wide network).
Wireless Networks: Networks using wireless connections (e.g., Wi-Fi, Bluetooth) for device communication.
Private Networks: Networks owned and operated by a single organization or entity.
Public Networks: Networks open to the general public (e.g., the internet).

The choice of network depends on factors like system scale, geographical distribution, performance requirements, and security needs.

ii. What are the different types of networks (based on topology)?

Network topology refers to the physical or logical arrangement of connections between devices. Common topologies include:

Star: Devices connect to a central hub or switch, which offers centralized control but makes the hub a single point of failure.
Bus: Devices share a single communication channel, which is simple and inexpensive for small networks but prone to congestion and collisions as traffic grows.
Ring: Devices are connected in a closed loop and data typically flows in one direction; access is orderly, but a single broken link or node can disrupt the whole ring, and adding or removing devices is disruptive.
Mesh: Devices are interconnected with multiple possible paths between them, which is resilient but more costly and complex to manage.
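
As a rough illustration of how the topologies differ in wiring cost, the sketch below builds the set of point-to-point links for a star, a ring, and a full mesh of n devices. The function names and the choice of node 0 as the hub are assumptions made for this example; a bus is left out because it is one shared medium rather than a set of point-to-point links.

```python
def star_links(n):
    """Links in a star of n nodes, where node 0 plays the role of the hub."""
    return {(0, i) for i in range(1, n)}


def ring_links(n):
    """Links in a ring of n nodes: each node connects to its successor."""
    return {(i, (i + 1) % n) for i in range(n)}


def full_mesh_links(n):
    """Links in a full mesh of n nodes: every pair is directly connected."""
    return {(i, j) for i in range(n) for j in range(i + 1, n)}


n = 5
print(len(star_links(n)))       # 4
print(len(ring_links(n)))       # 5
print(len(full_mesh_links(n)))  # 10
```

The full mesh needs n(n-1)/2 links, which is why it is the most resilient option but also the hardest to build and manage.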

iii. Define latency.

Latency refers to the time it takes for data to travel from one point to another in a network. It is usually measured in milliseconds (ms) and directly affects responsiveness and performance. Factors affecting latency include:

Distance: Longer distances introduce greater delays.
Network congestion: Busy networks have slower data flow.
Bandwidth: Higher bandwidth shortens the time needed to put a message onto the link (transmission delay), although it does not reduce propagation delay.
Processing delays: Devices may introduce delays for routing, switching, or security checks.
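
The factors above can be combined into a back-of-the-envelope estimate. The sketch below is a simplification that ignores queuing caused by congestion; the signal speed, the function name, and all sample numbers are illustrative assumptions, not measurements.

```python
# Assumed signal speed in cable or fiber: roughly two thirds of the speed of light.
SIGNAL_SPEED_M_PER_S = 2e8


def one_way_latency_ms(distance_m, payload_bits, bandwidth_bps, processing_ms=0.0):
    """Estimate one-way latency in milliseconds (queuing delay is ignored)."""
    propagation_s = distance_m / SIGNAL_SPEED_M_PER_S  # grows with distance
    transmission_s = payload_bits / bandwidth_bps      # shrinks as bandwidth grows
    return (propagation_s + transmission_s) * 1000 + processing_ms


# 1,000 km link, 12,000-bit (1,500-byte) packet, 100 Mbit/s, 0.5 ms of processing.
estimate = one_way_latency_ms(1_000_000, 12_000, 100e6, processing_ms=0.5)
print(f"{estimate:.2f} ms")  # 5.62 ms
```

In this example the propagation delay (5 ms) dominates, so adding bandwidth would barely change the result, whereas shortening the distance would.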

iv. What is the difference between networking and internetworking?

Networking refers to connecting devices so that they can communicate within a single network, while internetworking refers to connecting multiple, possibly different, networks so that devices on one network can communicate with devices on another.

v. What is meant by networking?

Networking refers to the connection and communication between devices within a single network. It involves defining communication protocols, managing devices, and ensuring smooth data flow within the network boundaries.

vi. What is meant by internetworking?

Internetworking deals with connecting multiple networks and enabling communication between them. It focuses on linking different network types, routing data across them, and overcoming differences in protocols and technologies.
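
One practical way to see the distinction: two hosts on the same IP network can talk directly, while hosts on different networks need a router between them. The sketch below uses Python's standard ipaddress module; the helper name same_network and the sample addresses are assumptions chosen for illustration.

```python
import ipaddress


def same_network(host_a, host_b, prefix_len):
    """Return True if both hosts fall in the same IP network for the given prefix length."""
    net_a = ipaddress.ip_interface(f"{host_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{host_b}/{prefix_len}").network
    return net_a == net_b


# Same /24 network: plain networking, no router needed.
print(same_network("192.168.1.10", "192.168.1.20", 24))  # True

# Different networks: traffic must be routed, which is internetworking.
print(same_network("192.168.1.10", "10.0.0.5", 24))      # False
```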

Section-B: (Each question of 2 marks)

i. Explain in detail the various system models.

Distributed systems utilize diverse models to define their structure, interactions, and functionalities. Here are some key models:

Architectural Models:

Layered Model: Divides the system into distinct layers (e.g., application, communication, network) with well-defined interfaces. Promotes modularity, reusability, and easier development.

Client-Server Model: Processes act as clients requesting services from other processes acting as servers. Offers clear roles and efficient resource utilization.

Peer-to-Peer Model: Processes participate equally in providing and consuming services. Decentralized and scalable, but complex to manage consistency and reliability.

Hybrid Model: Combines elements of client-server and peer-to-peer models, offering flexibility and adaptability.

Communication Models:

Remote Procedure Call (RPC): Processes invoke procedures on remote machines as if they were local, simplifying programming.

Message Passing: Processes exchange messages containing data and instructions, offering more control and flexibility.

Publish-Subscribe: Processes subscribe to channels (topics) and receive relevant messages published by others, enabling efficient broadcast communication.

Mobile Agent: Autonomous programs travel through the network, executing tasks on different nodes, suitable for dynamic and distributed tasks.
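
To make the publish-subscribe idea concrete, here is a minimal in-process sketch. The Broker class and its methods are hypothetical names invented for this example; a real system would run the broker as a separate networked service and deliver messages asynchronously.

```python
from collections import defaultdict


class Broker:
    """Toy message broker: routes each published message to the topic's subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message and return how many subscribers received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
received = []
broker.subscribe("alerts", received.append)

delivered = broker.publish("alerts", "disk full")  # one subscriber receives it
ignored = broker.publish("metrics", "cpu=42")      # nobody subscribed to this topic
print(received, delivered, ignored)                # ['disk full'] 1 0
```

The publisher never names its receivers; it only names a topic. That decoupling is what distinguishes this model from direct message passing.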

Other Models:

Replication and Consistency Management: Ensuring consistency of data across multiple copies in a distributed system.

Distributed File System: Providing transparent access to files across geographically dispersed locations.

Security Models: Mechanisms for authentication, authorization, and data protection in distributed systems.

Choosing the Right Model:

The selection of appropriate models depends on several factors, including:

Application needs: Functionality, performance, scalability, and security requirements.
System complexity: Size, distribution, and heterogeneity of the system.

Development and management considerations: Ease of programming, debugging, and monitoring.

ii. a. Describe the architectural model in detail.

Architectural models focus on organizing a distributed system's functionalities and responsibilities across different layers or components. Key characteristics include:

Layering:

Divides the system into horizontal layers, each with specific tasks and interfaces.

Layers interact through well-defined protocols, hiding lower-level complexities from higher layers.

Common layers include application, presentation, communication, network, and hardware.

Distribution of Functionality:

Determines where different functionalities (e.g., resource management, security, naming) are implemented within the system.

Centralized architecture places most functions in a single location (e.g., server).

Distributed architecture spreads functions across different nodes based on needs and efficiency.

Transparency:

The degree to which the distributed nature of the system is hidden from users and applications.

Highly transparent architectures mask distribution, presenting a unified access point to users.

Less transparent architectures expose distribution, requiring user awareness of different nodes and communication.

Examples:

The OSI model is a seven-layer architectural model for network communication.

Three-tier and n-tier architectures distribute application logic across presentation, business logic, and data access layers.

Microkernel architectures separate core functionalities from user-space services for flexibility and modularity.

ii. b. Describe the functional model in detail.

In distributed systems, a functional model describes the system's functions and their relationships without delving into specific implementation details. It lays out the system's activities, processes, and operations in a structured way, providing a high-level understanding of its overall functionality.

Here are some key aspects of a functional model:

Component Identification:

The model identifies the main components within the system, which can be processes, resources, communication links, or services.

Each component represents a specific functionality or entity involved in the system's operations.

Activity Decomposition:

The model decomposes the overall system functionality into smaller, well-defined activities or tasks.

These activities may represent individual operations, data processing steps, or interactions between components.

Relationships and Flow:

The model defines the relationships between different activities and components, showing how they interact and collaborate.

This typically involves data flow, control flow, or dependency relationships between activities.

Common Representation Techniques:

Functional models can be depicted using various techniques like:

Flowcharts: Visually represent activities as connected boxes with arrows showing the flow of data and control.
Data flow diagrams (DFDs): Show processes as transformations of data flows, highlighting information transfers.
Activity diagrams: Focus on the actions and interactions within a system, including sequencing and parallel execution.

Benefits of Functional Modeling:

Provides a clear understanding of the system's functionality and behavior.
Facilitates communication and collaboration among stakeholders with different technical backgrounds.
Helps identify potential challenges and areas for improvement.
Serves as a foundation for detailed design and implementation.

Additional Considerations:

The level of detail in a functional model depends on the specific needs and objectives.

Different models can be used to represent different aspects of the system's functionality.

Functional models should be kept up-to-date as the system evolves.

Section-C (Each question of 4 marks)

i. What is the use of cryptography in distributed systems?

Cryptography plays a crucial role in securing distributed systems by protecting data confidentiality, integrity, and authenticity. Here are some specific uses:

Confidentiality:

Encryption: Protects data at rest (stored) and in transit (network communication) from unauthorized access.

Key management: Secure storage, distribution, and access control of cryptographic keys are essential for encryption's effectiveness.

Integrity:

Digital signatures: Verify the authenticity and unaltered nature of data, preventing tampering and forgery.

Hash functions: Generate fixed-size "fingerprints" of data, so that any modification is practically certain to change the fingerprint and be detected.
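
A small sketch of integrity checking with Python's standard hashlib and hmac modules is given below. It uses a keyed hash (HMAC), which is a message authentication code rather than a true digital signature, because the standard library has no public-key signing. The key, the messages, and the helper names are assumptions for illustration only.

```python
import hashlib
import hmac


def fingerprint(data):
    """SHA-256 digest of the data, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_tag(key, data):
    """Keyed hash (HMAC-SHA256): only holders of the key can produce a valid tag."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_tag(key, data, tag):
    """Constant-time comparison of the expected tag with the received tag."""
    return hmac.compare_digest(make_tag(key, data), tag)


key = b"shared-secret-for-demo-only"  # hypothetical key, never hard-code real keys
message = b"transfer 100 to account 42"
tag = make_tag(key, message)

print(fingerprint(message) == fingerprint(message))            # True
print(fingerprint(message) == fingerprint(message + b"0"))     # False
print(verify_tag(key, message, tag))                           # True
print(verify_tag(key, b"transfer 900 to account 42", tag))     # False
```

A plain hash detects accidental changes, but an attacker who can alter the data can also recompute the hash; the keyed tag closes that gap for parties that share a secret.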

Authentication:

Password hashing: Stores salted hashes of passwords instead of the passwords themselves, so that a stolen credential database does not directly reveal users' passwords.

Public key infrastructure (PKI): Provides digital certificates and digital signatures for mutual authentication of processes and users.
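
As an illustration of password hashing, the sketch below derives a salted hash with PBKDF2 from Python's standard hashlib module. The iteration count, salt length, and function names are assumptions for this example; production systems should follow current guidance for their chosen algorithm.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, chosen only for this example


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# The server stores only the salt and the digest, never the password itself.
salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```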

Additional benefits:

Non-repudiation: Ensures accountability by proving who created or sent data.

Secure communication: Establishes encrypted channels for secure communication between processes or users.

Access control: Encrypts sensitive data and grants access only to authorized users based on credentials and permissions.

Challenges in distributed systems:

Managing and distributing cryptographic keys securely across multiple nodes.
Balancing security with performance overhead introduced by encryption and decryption.
Choosing appropriate cryptographic algorithms and protocols based on security needs and resource constraints.

Examples of cryptographic use in distributed systems:

Securely storing medical records in a distributed healthcare system.
Authenticating users and protecting financial transactions in online banking.
Encrypting communication between components in a cloud computing platform.

ii. What is meant by a distributed file system (DFS)?

A distributed file system (DFS) provides seamless access to files stored across multiple physical servers in a network, appearing to users as a single, unified logical file system.

Benefits include:

Transparency: Users interact with the DFS as if it were a local file system, unaware of the physical distribution of files.
Scalability: Easily scales storage capacity by adding more servers to the DFS.
Availability: High availability is achieved by replicating files across multiple servers, ensuring access even if some servers fail.
Fault tolerance: If a server fails, other servers can provide access to the stored files, minimizing downtime.
Performance: Data access can be optimized by storing files closer to users or replicating frequently accessed files on multiple servers.

Challenges in DFS:

Maintaining consistency: Ensuring all replicas of a file are identical despite concurrent updates.
Managing naming and location: Efficiently locating files stored across multiple servers.
Security: Access control and data protection become more complex due to distributed storage.
Performance overhead: Network communication and management of replicas can introduce additional overhead.
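
One common way to reason about replica consistency is with read and write quorums: if a write must reach w of n replicas and a read must consult r replicas, then r + w > n guarantees that every read overlaps the latest write. The sketch below illustrates that rule; the version-number scheme and function names are assumptions, and no particular DFS is claimed to work exactly this way.

```python
def quorums_overlap(n, w, r):
    """Every read set of size r intersects every write set of size w iff r + w > n."""
    return r + w > n


def read_latest(replicas, read_set):
    """Return the (version, value) pair with the highest version among the replicas read."""
    return max((replicas[i] for i in read_set), key=lambda pair: pair[0])


# Three replicas; the last write (version 2) reached replicas 0 and 1 only.
replicas = [(2, "new"), (2, "new"), (1, "old")]

print(quorums_overlap(3, 2, 2))       # True
print(read_latest(replicas, [1, 2]))  # (2, 'new')
print(quorums_overlap(3, 1, 1))       # False
print(read_latest(replicas, [2]))     # (1, 'old')
```

With w = 1 and r = 1 a read can land on a replica the write never reached, which is exactly the stale-read problem that consistency mechanisms must prevent.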

Examples of DFS:

Google File System (GFS)
Hadoop Distributed File System (HDFS)
Network File System (NFS)
Amazon S3 (strictly an object storage service rather than a file system, but often used in a similar role)

Section D: (Each question of 6 marks)

i. a. Threads in Distributed Systems:

Threads are lightweight units of execution within a single process that can run concurrently and share the process's memory. While primarily used in single-system scenarios, they also play a role in distributed systems:

Benefits:

Improved responsiveness: Distributed applications can utilize multiple threads to perform tasks concurrently, even on single-processor machines, creating a seemingly more responsive experience.

Efficient resource utilization: Threads share the resources of their parent process, leading to lower memory overhead compared to creating separate processes for each task.

Fine-grained synchronization: Threads within a process can more efficiently collaborate and synchronize access to shared resources compared to inter-process communication.

Challenges:

Distributed scheduling: Scheduling threads across multiple nodes in a distributed system introduces additional complexity.

Deadlock potential: When dealing with shared resources across the network, deadlock risks increase as more threads compete for access.

Distributed debugging: Debugging issues involving concurrent threads and network interactions can be more complex.

Common uses of threads in distributed systems:

Handling multiple client requests on a server concurrently.

Performing asynchronous communication tasks within a distributed application.

Managing background tasks like data replication or fault tolerance mechanisms.
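
The first use above, serving several client requests concurrently, can be sketched with a thread pool from Python's standard library. The handler, the sleep that stands in for disk or network I/O, and the worker count are all assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    """Stand-in for real work: the sleep imitates waiting on disk or network I/O."""
    time.sleep(0.05)
    return f"response-{request_id}"


def serve(request_ids, workers=4):
    """Handle all requests on a pool of worker threads; results keep request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, request_ids))


print(serve([1, 2, 3, 4]))
# ['response-1', 'response-2', 'response-3', 'response-4']
```

Because the threads spend most of their time waiting, the four requests overlap instead of running one after another, which is the responsiveness benefit described earlier.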

i. b. Distributed File System (DFS):

A distributed file system (DFS) is a fundamental component of a distributed operating system (DOS). It is a key technology that enables users to access and manage files stored across multiple interconnected computers as if they were all part of a single file system. The primary goal of a distributed file system is to provide transparent access to files and data distributed across a network of computers while offering reliability, scalability, and performance.

Here are some key aspects and features of distributed file systems:

Transparency: The file system hides the fact that files are spread across several machines, so users and applications access remote files the same way as local ones.

Location Independence: Users interact with files using logical file names or directory paths without needing to know the physical location of the files or the specific machines on which they are stored.

Scalability: Distributed file systems are designed to scale horizontally, allowing them to accommodate growing amounts of data and increasing numbers of users and client nodes.

Fault Tolerance: Distributed file systems incorporate mechanisms to ensure data availability and reliability in the face of failures.

Concurrency Control: In multi-user environments, distributed file systems must support concurrent access to files and ensure data consistency and integrity.

Caching and Performance Optimization: Distributed file systems employ caching mechanisms to improve performance by reducing the latency of file accesses and minimizing network traffic.

Security: Authentication, access control, and data protection are critical in distributed file systems, particularly in open or shared environments.

Examples of distributed file systems include:

NFS (Network File System): Developed by Sun Microsystems, NFS is a widely used distributed file system protocol that allows remote clients to access files and directories over a network.

AFS (Andrew File System): A distributed file system developed at Carnegie Mellon University, AFS provides location-independent access to files and supports large-scale distributed computing environments.

HDFS (Hadoop Distributed File System): HDFS is a distributed file system designed for storing and processing large volumes of data across clusters of commodity hardware. It is a core component of the Apache Hadoop ecosystem for big data analytics and processing.

CephFS: Ceph is a distributed storage system that provides a distributed file system interface on top of its object storage infrastructure.

Overall, distributed file systems play a crucial role in enabling distributed computing environments by providing transparent, reliable, and scalable access to shared files and data across networks of interconnected computers.

ii. File Server Architecture:

File server architecture is a classic approach to providing file access in a distributed system.

It involves:

Centralized server: A dedicated server stores all files and acts as the single point of access.
Clients: Users and applications access files by sending requests to the file server.
Protocols: Protocols like NFS or CIFS enable communication between clients and the server.

Advantages:

Simple to manage: Centralized administration and control of files and access permissions.

Efficient for small networks: Offers good performance for a limited number of clients and files.

Vertical scaling: Capacity can be increased to some extent by adding more resources (CPU, memory, disks) to the server.

Disadvantages:

Single point of failure: If the server fails, all file access becomes unavailable.

Performance bottleneck: Server becomes congested as the number of clients and file access requests increases.

Limited scalability: A single server can only be upgraded so far, and adding more servers adds complexity and may not provide significant performance gains.

Modern distributed systems often use more advanced architectures like:

Distributed Hash Tables (DHTs): Files are distributed across multiple nodes based on a hash function, improving scalability and fault tolerance.

Peer-to-peer (P2P) file sharing: Nodes directly share files with each other, eliminating the need for a central server.
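
The hash-based placement behind a DHT can be sketched with a small consistent-hashing ring. This is a simplified illustration, not the design of any specific DHT: the HashRing class, the node names, and the use of SHA-256 are assumptions, and real systems add virtual nodes and replication.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Each file belongs to the first node found clockwise from the file's hash."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(node), node) for node in nodes)
        self._positions = [position for position, _ in self._ring]

    def node_for(self, filename):
        index = bisect.bisect_right(self._positions, _hash(filename))
        return self._ring[index % len(self._ring)][1]  # wrap around past the end


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("report.pdf")
print(owner in {"node-a", "node-b", "node-c"})  # True
print(ring.node_for("report.pdf") == owner)     # True
```

A useful property of this layout is that removing one node only moves the files that node owned; files on the remaining nodes stay where they are, which helps with both scalability and fault tolerance.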

The choice of architecture depends on various factors like scalability requirements, fault tolerance needs, and performance demands.
