Distributed Operating System - DOS - FAQ 3 - IndianTechnoEra


 

Section-A (Each question of 1 mark)

i. Define System model.

A system model is a conceptual representation of the structure and behavior of a distributed system. It encompasses the following key elements:

  • Processes: Units of computation that execute concurrently.
  • Resources: Memory, I/O devices, processors, files, and other elements required by processes.
  • Communication: Mechanisms for processes to exchange information and coordinate their activities (e.g., message passing, remote procedure calls).
  • Concurrency: Concurrent execution of multiple processes, potentially on different nodes.
  • Distribution: Processes and resources may be physically separated across a network.
  • Failure: Nodes, communication links, or resources may fail, necessitating fault tolerance mechanisms.


ii. What is the architectural model?

An architectural model defines the high-level organization of a distributed system, including:

  • Number of layers: Layered architectures provide modularity and separation of concerns. Common layers include the application layer, network layer, communication layer, kernel layer, and hardware layer.
  • Distribution of functionality: Where different functionalities (e.g., resource management, naming, synchronization) are implemented within the system.
  • Transparency: The degree to which the distributed nature of the system is hidden from users and applications.


iii. What is the fundamental model?

Several fundamental models underpin distributed systems:

  • Client-server: Processes act as clients requesting services from other processes acting as servers.
  • Peer-to-peer: Processes participate equally in providing and consuming services.
  • Hybrid: Combines aspects of client-server and peer-to-peer models.
  • Mobile agent: Agents travel through the network, autonomously executing tasks on different nodes.


iv. What are the difficulties in building a distributed system?

  • Heterogeneity: Systems may have different hardware, software, network protocols, and administrative domains.
  • Concurrency: Coordinating and ensuring correctness of concurrent processes involves challenges like synchronization, mutual exclusion, and deadlock prevention.
  • Distribution: Communication costs add overhead, and failures can be more complex to handle.
  • Security: Maintaining data and system integrity across a network raises security concerns.
  • Scalability: Performance should degrade gracefully as the system grows larger.


v. Define Middleware. 

Middleware is software that provides an abstraction layer between applications and the underlying distributed system, simplifying programming and development. 

Key functions include:

  • Communication support: Message passing, remote procedure calls, group communication.
  • Naming and directory services: Identifying and locating resources in the system.
  • Replication and consistency management: Maintaining data consistency across multiple copies.
  • Transaction management: Ensuring atomicity, consistency, isolation, and durability (ACID) properties for actions.
  • Security services: Authentication, authorization, encryption, and other security mechanisms.


vi. Different Types of Models:

Several models address specific concerns in distributed systems:

  • Distributed computing models: MapReduce, Hadoop, distributed database models.
  • Distributed coordination models: Distributed transactions, consensus algorithms, leader election.
  • Distributed file system models: NFS, CIFS, Hadoop Distributed File System (HDFS).
  • Security models: Role-based access control (RBAC), Kerberos, public-key infrastructure (PKI).


Section-B (Each question of 2 marks)

i. Describe client-server communication in detail.

Client-server communication is a fundamental interaction pattern in distributed systems. It involves two distinct entities:

  • Client: Initiates requests for services or resources from another entity.
  • Server: Provides services or resources upon receiving requests from clients.


Communication mechanisms:

Remote Procedure Calls (RPC):

Client invokes a procedure on the server as if it were a local procedure.

Server performs the task and returns results to the client.

Advantages: Transparency, hides communication details, easier development.

Disadvantages: Overhead for marshalling/unmarshalling data, potential performance bottleneck at server.
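As an illustrative sketch (not tied to any particular distributed OS), Python's built-in xmlrpc modules show the RPC pattern: the client calls `add` as if it were a local procedure, while marshalling, transport, and unmarshalling happen underneath.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: register a procedure and serve it over XML-RPC.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy makes the remote call look like a local one.
client = ServerProxy(f"http://localhost:{port}")
result = client.add(2, 3)   # marshalled, sent, executed remotely, unmarshalled
server.shutdown()
```

The transparency is visible in the last call: `client.add(2, 3)` reads like local code, but the marshalling round trip it hides is exactly the overhead listed as a disadvantage above.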


Message Passing:

Client sends messages containing data and instructions to the server.

Server processes messages and sends responses (if needed).

More flexible than RPC, but requires manual message handling and synchronization.


Publish-Subscribe:

Clients subscribe to channels (topics) of interest.

Server publishes messages to channels, reaching subscribed clients.

Efficient for broadcast-like communication, decoupling sender and receiver.
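The decoupling of sender and receiver can be shown with a toy in-memory broker (class and topic names are illustrative, not a real messaging API):

```python
from collections import defaultdict

class Broker:
    """Toy in-memory broker: publishers never see who is subscribed."""
    def __init__(self):
        self._topics = defaultdict(list)

    def subscribe(self, topic, callback):
        self._topics[topic].append(callback)

    def publish(self, topic, message):
        # every subscriber of this topic receives the message
        for deliver in self._topics[topic]:
            deliver(message)

broker = Broker()
received = []
broker.subscribe("stocks", received.append)
broker.publish("stocks", {"AAPL": 190.0})   # reaches the subscriber
broker.publish("weather", "sunny")          # no subscriber; silently dropped
```

The publisher only names a topic; adding or removing subscribers requires no change on the publishing side, which is the decoupling property noted above.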


Important aspects:

Naming and discovery: Clients need to find server addresses (e.g., DNS, service registries).

Transport protocols: TCP for reliable, ordered delivery; UDP for speed-critical, less reliable scenarios.

Serialization/marshalling: Convert data structures to a format suitable for transmission.

Security: Authentication, authorization, encryption to protect communication.


ii. Explain group communication in detail.

Group communication involves sending messages to a set of processes simultaneously. It plays a crucial role in applications like:

  • Data dissemination: Sharing updates with all members of a group (e.g., stock prices, chat messages).
  • Coordination: Processes agree on a common state or make decisions collectively (e.g., distributed database updates).


Types of group communication:

  • Unordered Multicast: Messages may be delivered to members in any order.
  • Ordered Multicast: Messages are delivered to members in the same order they were sent.
  • Reliable Multicast: Guaranteed delivery of messages to all members, even if failures occur.
  • Byzantine Fault Tolerance: Tolerates arbitrary process failures, including malicious behavior.
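One common way to implement ordered multicast is a sequencer process that stamps every message with a global sequence number; a minimal single-machine sketch (all names are illustrative):

```python
class Member:
    def __init__(self):
        self.delivered = []

    def deliver(self, seq, msg):
        self.delivered.append((seq, msg))

class Sequencer:
    """Stamps each multicast with a global sequence number, so every
    member delivers messages in exactly the same order."""
    def __init__(self, members):
        self.members = members
        self.next_seq = 0

    def multicast(self, msg):
        self.next_seq += 1
        for m in self.members:
            m.deliver(self.next_seq, msg)

group = [Member(), Member(), Member()]
seq = Sequencer(group)
seq.multicast("update-A")
seq.multicast("update-B")
# every member holds the identical log [(1, "update-A"), (2, "update-B")]
```

Note that the sequencer is itself a single point of failure and a potential bottleneck, which is why real systems combine this idea with the failure-handling and scalability considerations below.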


Additional considerations:

  • Group membership: How processes join and leave the group.
  • Failure handling: Techniques to maintain communication despite failures.
  • Scalability: Ensuring efficient communication as the group size grows.


Real-world examples:

  • Email distribution lists
  • Instant messaging groups
  • Online multiplayer games
  • Distributed resource management


Section-C (Each question of 4 marks)

i. What is meant by invocation performance?

Invocation performance refers to the efficiency and speed of remote procedure calls (RPCs) or message passing operations in a distributed system. It encompasses the time taken for a client to invoke a method on a remote server and receive the results. 

Several factors contribute to invocation performance:

  • Network latency: Distance, bandwidth, and network congestion can add significant delays.
  • Protocol overhead: Encoding/decoding data, message routing, and security measures contribute overhead.
  • Operating system overhead: System calls, context switching, and memory management incur costs.
  • Server processing: Time taken by the server to execute the requested operation.
  • Middleware processing: If present, middleware software might add processing overhead.


Factors affecting invocation performance:

  • Message size: Larger messages take longer to transmit and process.
  • Complexity of operation: Complex operations require more processing on the server.
  • Frequency of invocations: Frequent invocations can overload the network or server.
  • Concurrency: Simultaneous invocations require careful synchronization to avoid contention.


Optimizing invocation performance:

  • Reduce message size: Minimize data sent between client and server.
  • Simplify operations: Design simpler operations to reduce server processing time.
  • Cache frequently accessed data: Avoid redundant network transfers.
  • Use asynchronous invocations: Allow clients to continue processing while waiting for response.
  • Optimize middleware: Choose efficient middleware with low overhead.
  • Tune network parameters: Adjust network settings to optimize performance.
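Two of the optimizations above, caching and asynchronous invocation, can be sketched in a few lines of Python; `slow_remote_lookup` is a stand-in for a real remote call, not an actual network operation:

```python
import functools
from concurrent.futures import ThreadPoolExecutor

calls = 0

@functools.lru_cache(maxsize=128)           # cache: repeat lookups skip the "network"
def slow_remote_lookup(key):
    global calls
    calls += 1                              # count simulated round trips
    return key.upper()

slow_remote_lookup("user-1")
slow_remote_lookup("user-1")                # served from cache, no second round trip

# Asynchronous invocation: the client keeps working while the call is in flight.
with ThreadPoolExecutor() as pool:
    future = pool.submit(slow_remote_lookup, "user-2")
    # ... client does other useful work here ...
    value = future.result()                 # block only when the result is needed
```

Here the cached repeat lookup costs nothing, and the asynchronous call overlaps communication latency with client-side work, directly attacking the latency factors listed earlier.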


Importance of invocation performance:

It directly impacts responsiveness and perceived speed of distributed applications.

Affects resource utilization and scalability of the system.

Can be a key differentiator between competing distributed systems.


ii. Difference between monolithic and microkernels.

Monolithic kernel:

All operating system components (file system, device drivers, networking) are bundled into a single large kernel.

Offers tight integration and performance, often used in resource-constrained environments.

Less modular and flexible, making it difficult to add new features or fix bugs.

Sensitive to changes in one component affecting the entire kernel.


Microkernel:

Only core functionalities (memory management, process management) reside in the microkernel.

Other services (file system, networking) run as user-space processes outside the kernel.

More modular and flexible, allowing easier addition and update of system components.

Generally slower than monolithic kernels due to context switching between kernel and user-space processes.


Choosing between monolithic and microkernels:

  • Monolithic kernel: Suitable for performance-critical systems with limited need for flexibility.
  • Microkernel: Preferred for systems requiring modularity, flexibility, and easy customization.

Modern operating systems often use hybrid approaches, incorporating elements of both architectures to achieve a balance between performance, modularity, and security.


Section-D (6 marks)

i. Explain Global States and distributed debugging.

Global State:

In a distributed system, processes operate independently on multiple machines.

A global state captures the combined state of all processes and communication channels at a specific point in time.

It's important for various tasks like debugging, deadlock detection, and system monitoring.

Challenges:

Processes are geographically dispersed, so information needs to be gathered from each machine.

Events happen concurrently, making it difficult to capture a truly "instantaneous" global state.

Communication introduces delays, leading to inconsistent snapshots.

Algorithms:

Snapshot algorithms: Processes exchange information about their local states and message channels at a specific point in time.

Chandy-Lamport: Efficient algorithm based on message markers.

Mattern: Uses vector clocks to handle asynchronous communication and messages still in flight.
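A heavily simplified, single-machine sketch of the Chandy-Lamport idea, with two processes exchanging money over FIFO channels (all names are illustrative): the recorded snapshot stays consistent, conserving the total of 150, even though a transfer of 10 is in flight when the snapshot is taken.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, name, state):
        self.name = name
        self.state = state          # local state, e.g. an account balance
        self.snapshot = None        # recorded local state
        self.channel_log = []       # messages recorded as "in flight"
        self.recording = False

    def begin_snapshot(self, out_channel):
        # Initiator rule: record own state, then send a marker.
        self.snapshot = self.state
        self.recording = True
        out_channel.append(MARKER)

    def receive(self, msg, out_channel):
        if msg == MARKER:
            if self.snapshot is None:
                # First marker seen: record state and forward the marker.
                self.snapshot = self.state
                out_channel.append(MARKER)
            self.recording = False  # the marker closes the incoming channel
        else:
            if self.recording:
                self.channel_log.append(msg)  # was in flight at snapshot time
            self.state += msg

p, q = Process("P", 100), Process("Q", 50)
p_to_q, q_to_p = deque(), deque()   # FIFO channels

p.begin_snapshot(p_to_q)            # P records 100, marker heads to Q
q_to_p.append(10); q.state -= 10    # Q sends 10 before it sees the marker
q.receive(p_to_q.popleft(), q_to_p) # Q gets marker: records 40, echoes marker
p.receive(q_to_p.popleft(), p_to_q) # P gets 10: logs it as in flight
p.receive(q_to_p.popleft(), p_to_q) # P gets marker: channel closed

total = p.snapshot + q.snapshot + sum(p.channel_log)   # 100 + 40 + 10
```

The in-flight 10 is captured in P's channel log rather than in either process's recorded state, so the global snapshot conserves the total without ever freezing the system.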


Distributed Debugging:

Involves identifying and fixing bugs in distributed systems.

Traditional debugging tools designed for single systems are inadequate.

Distributed debugging tools utilize global state information to:

Examine process states, variables, and call stacks.

Analyze message flow and interactions between processes.

Identify events leading to errors or unexpected behavior.

Challenges:

Accessing distributed system state across various machines.

Visualizing complex interaction patterns and message flows.

Dealing with temporal aspects and potential non-determinism.

Tools and Techniques:

Debuggers with distributed awareness (e.g., Allinea DDT, TotalView).

Monitoring tools for real-time system observation.

Event logging and analysis frameworks.


ii. Explain the algorithms for mutual exclusion.

Mutual exclusion: Ensures exclusive access to a shared resource by only one process at a time. Critical for maintaining data integrity and preventing conflicts.

Centralized Algorithms:

Single coordinator: A designated process grants access to the resource upon request.

Simple but creates a single point of failure.


Distributed Algorithms:

Token-based: A unique token circulates among processes; the holder has exclusive access.

Efficient but token loss/replication can cause issues.
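The token-ring variant can be sketched as a small simulation (single-machine, illustrative names): the token circulates around a logical ring, and only its current holder may enter the critical section.

```python
class TokenRing:
    """Token-based mutual exclusion on a logical ring: there is exactly
    one token, and only its holder may enter the critical section."""
    def __init__(self, n_processes):
        self.n = n_processes
        self.token_at = 0            # process 0 holds the token initially

    def pass_token(self):
        self.token_at = (self.token_at + 1) % self.n

    def enter_critical_section(self, pid):
        hops = 0
        while self.token_at != pid:  # wait until the token arrives
            self.pass_token()
            hops += 1
        return hops                  # token passes needed before entry

ring = TokenRing(4)
hops = ring.enter_critical_section(2)   # token travels 0 -> 1 -> 2
```

Mutual exclusion holds because the token is unique; the issues noted above (a lost or duplicated token) break exactly that uniqueness and require a recovery protocol.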

Logical clocks: Processes use timestamps to determine access order.

No central authority but complex and prone to message delays.
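The core of timestamp-based schemes (such as Ricart-Agrawala) is that all processes agree on one total order of requests by sorting on the pair (Lamport timestamp, process id), with the id breaking ties between concurrent requests. A tiny sketch with made-up request data:

```python
# Each tuple is (lamport_timestamp, process_id) attached to a request.
requests = [(5, "C"), (3, "B"), (5, "A"), (3, "D")]

# Sorting by (timestamp, pid) yields one total order that every process
# computes identically; the pid breaks ties between equal timestamps.
grant_order = [pid for ts, pid in sorted(requests)]
```

Because every process sorts the same pairs the same way, they all grant access in the same order without any central authority, at the cost of the message exchanges needed to learn each other's timestamps.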

Election algorithms: Processes elect a leader who grants access; others wait.

Dynamic and fault-tolerant but election overhead can impact performance.


Choosing the right algorithm:

Depends on system characteristics, performance requirements, and fault tolerance needs.


Additional considerations:

Scalability: How well the algorithm performs as the number of processes increases.

Deadlock avoidance: Algorithms should prevent processes from waiting indefinitely.

Liveness: Processes eventually gain access to the resource when not in use by others.


