IN4150 Distributed Algorithms - Summary
November 20, 2021 · 4 pages · academic year 2020/2021

This PDF is a cheatsheet (but cannot be used during the exam) containing all relevant information very densely presented.
CHAPTER 1 – INTRODUCTION
A distributed system (DS) consists of multiple autonomous processors that do not share memory, but cooperate by sending messages over a communications network.
DS characteristics: • Autonomy (distribution of authority) • Cooperation (distribution of functionality) • Communication (distribution of information)
DS properties: • No regular structure (need protocols) • No directly accessible common state (maintain a logical common state by message passing)
• No common clock (synchronize through message passing) • Non-determinism (individual progress) • Independent failure modes
DS motivations: • Organizational • Resource sharing • Extensibility • Availability • Reliability • Security • Performance
DS requirements: • Transparency (a DS should present itself as a single entity) • Scalability (capacity should proportionally increase) • Consistency (in terms of performance, UI, global state)
Parallel vs. distributed: • Small vs. large task granularity • Frequent (ms scale) vs. infrequent (min scale) communication • Homogeneous vs. heterogeneous tasks • Simultaneous vs. synchronized
execution • Homogeneous vs. heterogeneous hardware
In general, DSs are relevant to the transport layer of the seven-layer OSI model, which is concerned with the communication between processes. When we discuss DAs for specific interconnection
structures (e.g. unidirectional rings, complete networks), we base ourselves on the network layer. When we want distributed algorithms to be resilient to link failures, we are working at the link layer.
Important techniques: • Replication • Caching • Locality • Time stamps • Timeouts • Randomization
CHAPTER 2 – MODELING
A DS is modeled as a set – a connected network – of processors/processes connected by unidirectional communication channels/links that do local computations, send messages, and receive messages.
The set of values of all its variables constitutes the state of a process. Initial states and terminal states are subsets of states. The messages sent along a channel that have not yet been received constitute the
state of that channel (initially empty). A configuration of a DS is a set of simultaneous states of all its processes and channels.
An event takes a DS from one configuration to the next. An execution of a DA in an asynchronous system is an alternating sequence of configurations and events involving all processes and channels.
Asynchronous communication: non-blocking send; the receipt of a message is later than its send; receive may block or not. Simulating a synchronous system requires explicit acknowledgements to force the sender to continue only when the message has been received. In an asynchronous system: • message delays are finite but arbitrary • the relative processor speeds may be unbounded • there is no common clock. Event-driven: an asynchronous DA consists of pieces of code, one for the receipt of each message type, and possibly for internal events.
Synchronous communication: send and receive a message simultaneously (both blocking), never messages in transit. Simulating an asynchronous system: introduce for every link a process with a buffer that may introduce artificial delays for messages. In a synchronous system: • message delays are bounded • the relative processor speeds are bounded • there is a common clock. Time-driven: a synchronous DA consists of rounds.
Simulations make a system/model behave differently through a software layer: suppose you have a DA A on model M, while model M′ is more complicated (e.g., may exhibit errors); design a simulation
(protocol) P for M′ to let M′ behave like M, and run A on top of P.
In a local simulation, M′ with P looks like M to every process in the system, but to an outside observer there may be a difference (e.g. processes can be in different rounds).
In a global simulation there is no difference, also to an outside observer.
A symmetrical DA has identical processes. A uniform network/DA has its nodes not know the number of nodes. An anonymous network/DA doesn't have node IDs.
CHAPTER 3 – SYNCHRONIZATION
Three types of events: • internal events • message-send events • message-receive events
Set of events in Pi: Ei (total set E = ∪i Ei).
The happened-before relation (HB-relation) (or causality relation, →) on E is the smallest relation satisfying:
– local order: if a, b in Ei , and a occurs before b, then a → b
– message exchange: if a in Ei is the event of sending a message m and b in Pj is the event of receiving m, then a → b
– transitivity: if a → b and b → c, then a → c
If neither a → b nor b → a holds, then a and b are concurrent (notation: a||b).
Causal past (history) of an event a: P (a) = {b ∈ E|b → a}. Causal future of an event a: F (a) = {b ∈ E|a → b}. Events concurrent with a: C(a) = {b ∈ E|a||b}. E = P (a) ∪ C(a) ∪ F (a).
A logical clock is a function C : E → S (where S is some partially ordered set) which
– is consistent with the HB relation if a → b implies C(a) < C(b) (the converse may not hold)
– characterizes the HB relation if C(a) < C(b) iff a → b
Sufficient conditions for a local logical clock Ci :
– if a and b in Ei and a → b, then Ci (a) < Ci (b)
– if a is the event of sending message m in Pi and b is the event of receiving m in Pj , then Ci (a) < Cj (b)
Scalar clocks:
– If a ∈ Ei is not a message-receive event, then Pi first increments Ci (e.g. by 1) and then sets C(a) to the new value of Ci.
– If a is sending m in Pi and b is receiving m in Pj, then Pi sends C(a) with m, Pj assigns Cj = max(Cj + 1, C(a) + 1) and C(b) = Cj.
– A scalar clock cannot characterize the HB relation.
Vector clocks:
– Comparing two vector clocks v, w: v = w iff v[i] = w[i] for all i; v ≤ w iff v[i] ≤ w[i] for all i (similar for ≥); v < w iff v[i] ≤ w[i] for all i and v ≠ w (similar for >).
– Component-wise maximum: max(v, w)[i] = max(v[i], w[i]). Unit vector in dimension i: ei = (0, 0, ..., 0, 1, 0, ..., 0).
– If a ∈ Ei is not a message-receive event, then Pi first increments Vi[i] by 1 and then sets V(a) to the new value of Vi.
– If a is sending m in Pi and b is receiving m in Pj, then Pi sends V(a) with m, Pj assigns Vj = max(Vj + ej, V(a)) and V(b) = Vj.
– a||b iff V(a)[i] > V(b)[i] and V(a)[j] < V(b)[j] for some i, j.
– Vector clocks do characterize the HB relation. Theorem: for a k-dimensional vector clock to characterize the HB relation in a system with n processes, we need k ≥ n.
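The scalar- and vector-clock update rules above can be sketched as follows. This is a minimal illustration; all class and function names are mine, not from the notes.

```python
# Sketch of the scalar- and vector-clock update rules.
class ScalarClock:
    def __init__(self):
        self.c = 0

    def local_event(self):            # internal or send event: increment first
        self.c += 1
        return self.c                 # timestamp C(a)

    def receive(self, c_msg):         # Cj = max(Cj + 1, C(a) + 1)
        self.c = max(self.c + 1, c_msg + 1)
        return self.c                 # timestamp C(b)

class VectorClock:
    def __init__(self, i, n):
        self.i, self.v = i, [0] * n

    def local_event(self):            # increment own component Vi[i]
        self.v[self.i] += 1
        return list(self.v)           # timestamp V(a)

    def receive(self, v_msg):         # Vj = max(Vj + ej, V(a)), component-wise
        self.v[self.i] += 1
        self.v = [max(a, b) for a, b in zip(self.v, v_msg)]
        return list(self.v)           # timestamp V(b)

def concurrent(v, w):                 # a || b: neither vector dominates the other
    return any(x > y for x, y in zip(v, w)) and any(x < y for x, y in zip(v, w))
```

For example, with two processes, if P0's send event is timestamped [1, 0], then P1's receive of that message yields [1, 1]; the timestamps [1, 0] and [0, 1] are concurrent.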
Synchronizers
Synchronizers are local simulations that allow running non-fault-tolerant synchronous algorithms (rounds of sending messages, performing local computations, and receiving messages) on asynchronous
systems, by letting the processes proceed in (simulated) rounds, checking whether all messages of a round have been received, and then letting a synchronizer generate a ”clock pulse”. But when to
clock? After all, a process has no (direct) way of knowing when it has received all messages, as delays are unbounded.
Method 1: Using acknowledgements. A node is safe with respect to a certain pulse if each of its messages has been received, determined by an ACK from each of its recipients. A safe node sends a SAFE message to its neighbors. Then, a node knows that it has received all its messages in a pulse when all its neighbors are safe.
Method 2: Always send one message to each neighbour. Pack multiple messages into one; send an empty message in case nothing needs to be sent. A node knows that it has received all its messages in a pulse when it has received one message from each of its neighbors.
Awerbuch’s three types of synchronizers:
• α-synchronizers: communication-inefficient, time-efficient
Basically method 1. A node generates a new local pulse after receiving a SAFE message from all its neighbors.
• β -synchronizers: communication-efficient, time-inefficient
Init: Elect a leader s and create a spanning tree rooted at s. When a node finds it is safe and all its descendants are safe, it sends a SAFE message to its parent. When the root gets a SAFE
message from all its descendants and is safe, it sends a new PULSE message down the tree.
• γ -synchronizers: efficiency depends on the original graph and the cluster partitioning
Init: partition nodes into clusters, elect leaders per cluster, create spanning tree per clusters, create a preferred link between each pair of clusters.
A β-synchronizer is executed in every tree/cluster. When a root knows its whole tree is safe, it broadcasts a CLUSTER SAFE message down its tree and across the preferred links. Nodes
convergecast READY messages upwards when they have received a READY message from each of their descendants and a CLUSTER SAFE message along all of their preferred links.
Message Ordering
In non-FIFO asynchronous systems, applications sometimes need message ordering (can be seen as a simulation). For every process P, assume an additional P′ that checks whether the arriving
messages can be delivered to P according to the required order. If not, P′ stores the message in a buffer, and re-checks when messages that arrive later have been delivered. This means reception (at
P′) and delivery (at P) are distinct.
Set of destinations of message m is denoted by Dest(m). The event of multicasting m is denoted by m(m). The event of delivering m to Pi is denoted by di (m).
Message order is causal when for every two messages m1 and m2 : if m(m1 ) → m(m2 ), then di (m1 ) → di (m2 ) for all i in both Dest(m1 ) and Dest(m2 ).
Message order is total when all processes receive all messages in the same order. They don’t imply each other.
• Birman-Schiper-Stephenson algorithm for causal message ordering of broadcast messages [1]
A message m is sent with vector clock Vm. The condition for delivery of a message m in Pi from Pj is Vi + ej ≥ Vm (i.e. Pi expects the next message from Pj and Pi is at least as up to
date as Pj wrt other processes).
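The delivery condition above reduces to a component-wise comparison. A minimal sketch (the function name is mine):

```python
# BSS delivery test: message m from Pj carries vector timestamp v_m;
# receiver Pi with vector clock v_i may deliver m iff v_i + e_j >= v_m.
def can_deliver(v_i, v_m, j):
    bumped = list(v_i)
    bumped[j] += 1                        # v_i + e_j: the next message Pi expects from Pj
    return all(a >= b for a, b in zip(bumped, v_m))
```

A receiver keeps undeliverable messages in a buffer and re-runs this test after every delivery.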
• Schiper-Eggli-Sandoz algorithm for causal ordering of point-to-point messages [2]
Every process maintains a local buffer Si = {(Pj, Vj); ...} of ordered ID–vector clock pairs ("the most recent knowledge about what Pj should know, used to tell Pj what it should know").
Messages are sent with the complete Si.
The condition for delivering m with buffer Sm in Pi is that (i, V) ∉ Sm, or (i, V) ∈ Sm and V ≤ Vi. After delivery, Sm and Si are merged (component-wise max).
• Algorithm for total message ordering of broadcast messages
– Simple solution: processes send their messages to a special sequencer process P0 , which numbers the messages and broadcasts them.
– Sophisticated solution with scalar clocks (with process IDs as tie breakers) and FIFO channels: every process maintains an ordered message queue, and messages (i.e. broadcasts with
timestamps) can be delivered in a process when it is at the head of the local message queue (the oldest message it knows about), and the process has received an ACK for that message from
every process (so no older message will arrive), i.e. all processes acknowledge all messages to all processes. When a process receives a message, it broadcasts an ACK for it.
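The simple sequencer solution can be sketched as follows; the class names and the heap-based reordering buffer are my own illustration of the idea, not the notes' pseudocode.

```python
# Total ordering via a sequencer: all broadcasts pass through one process (P0),
# which assigns global sequence numbers; receivers deliver in number order.
import itertools
import heapq

class Sequencer:                          # plays the role of P0
    def __init__(self):
        self.counter = itertools.count(1)

    def broadcast(self, msg):             # number the message, then fan it out
        return (next(self.counter), msg)

class Receiver:
    def __init__(self):
        self.next_seq, self.pending, self.delivered = 1, [], []

    def on_message(self, numbered):       # messages may arrive out of order
        heapq.heappush(self.pending, numbered)
        while self.pending and self.pending[0][0] == self.next_seq:
            self.delivered.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
```

Every receiver delivers in the sequencer's numbering, so all processes see the same total order; the sequencer is of course a bottleneck and single point of failure, which motivates the decentralized solution.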
Detecting Global States
Non-trivial because of a lack of a common clock and arbitrary delays of messages. Used for debugging and detecting stable properties (once it holds, it holds forever, e.g. termination, deadlock).
A cut is a set consisting of one internal event for every process: {c1 , c2 , ..., cn } with ci an internal event in Pi .
The vector time of a cut C is the component-wise max of its events: V(C) = max{V(c1), V(c2), ..., V(cn)}.
A cut is consistent iff (equivalently):
– there are no events ei ∈ Pi and ej ∈ Pj with ei → ej AND ej → cj AND ei ↛ ci (else the message from ei to ej would be an orphan: sent after but received before the cut).
– ci || cj for all i, j.
– the local components are the maxima: V(C) = (V(c1)[1], V(c2)[2], ..., V(cn)[n]).
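The third condition gives a direct consistency test on the cut events' vector timestamps; a minimal sketch (function names are mine):

```python
# Consistency test for a cut, given the vector timestamps V(c1)..V(cn):
# the cut is consistent iff each local component of V(C) comes from that
# process's own cut event, i.e. V(C)[i] == V(ci)[i] for all i.
def cut_vector_time(cut):                 # component-wise max over the cut events
    return [max(components) for components in zip(*cut)]

def is_consistent(cut):                   # cut = [V(c1), ..., V(cn)]
    vc = cut_vector_time(cut)
    return all(vc[i] == cut[i][i] for i in range(len(cut)))
```

For example, with two processes, the cut with timestamps [1, 2] and [0, 1] is inconsistent: P1's cut event has seen two events of P2, but P2's own cut event is only its first.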
Global state detection = finding a set of concurrent state-recording events.
The state of a channel should be the sequence of messages sent before the sender records its state minus the messages that have been received before the recipient does so. It turns out that it is too
ambitious to determine a state an asynchronous system has actually been in. We assume fault-free, FIFO, infinite-capacity channels.
• Chandy’s and Lamport’s algorithm for determining global states
Any process may spontaneously start. It first records its own local state, and then sends a marker along every outgoing channel. Upon the first receipt of a marker along any channel, a process
records the state of that channel as the empty state, records its own local state, sends a marker along every outgoing channel, and creates an empty FIFO message buffer for each of its incoming
channels (except for the one along which the first marker is received), used for recording the channel state as defined above. Every subsequent message received along a channel is entered into
the corresponding buffer until a marker is received along it. A process has finished its part of the algorithm when it has received a marker along every incoming channel.
The resulting recorded state might not actually have occurred at the same moment in time. This is not actually a problem, because there is a sequence S′ of events equivalent to the actual
sequence S such that the recorded state does occur in S′, i.e. the sequences of messages on the channels in S′ after all pre-recording and before all post-recording events are identical to the
states recorded for these channels by the algorithm. The state can still be used for detecting stable properties.
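The marker rules for a single process can be sketched as follows. This is a simplified illustration: channel I/O is abstracted behind a callback, and the surrounding scheduler and FIFO channels are assumed; all names are mine.

```python
# Sketch of one process's part of the Chandy-Lamport snapshot algorithm.
class SnapshotProcess:
    def __init__(self, state, in_channels, send_markers):
        self.state = state
        self.recorded_state = None
        self.channel_state = {}           # incoming channel -> recorded messages
        self.recording = {}               # incoming channel -> still buffering?
        self.in_channels = in_channels
        self.send_markers = send_markers  # sends a marker on every outgoing link

    def start_snapshot(self):             # spontaneous initiation
        self.recorded_state = self.state
        for ch in self.in_channels:
            self.channel_state[ch], self.recording[ch] = [], True
        self.send_markers()

    def on_marker(self, ch):
        if self.recorded_state is None:   # first marker: record state, send markers
            self.start_snapshot()
        self.recording[ch] = False        # marker closes recording on this channel

    def on_message(self, ch, msg):
        self.state = msg                  # stand-in for the local computation
        if self.recording.get(ch):        # message was in transit: record it
            self.channel_state[ch].append(msg)

    def done(self):                       # marker received on every incoming channel
        return self.recorded_state is not None and not any(self.recording.values())
```

Note how the channel along which the first marker arrives is recorded as empty, while messages arriving on other channels before their markers are buffered as that channel's state.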
Termination Detection
About determining that a distributed computation has finished. A process can be active or passive (a passive process cannot send computation-related messages). An active process can always become passive, but a
passive process can only become active when receiving a computational message.
• Termination detection algorithm for a unidirectional ring with FIFO links
Processes are numbered decreasingly, i.e. Pi can only send to Pi−1. P0 starts by sending a token (to Pn−1) along the ring. Each process and the token can be white (good) or black (spoiled).
Termination is detected when the white token returns to P0, indicating that all processes are passive. (If the token returns black, try again.) If a process sends a message to a higher-numbered
process (where the token may have already passed, i.e. its passivity is spoiled), the sending process turns black. When the token arrives at a black process, the token turns black, and the process
turns white again to prepare for the next attempt.
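One round trip of the token can be simulated centrally as a sketch (function name and representation are mine; a real implementation would let the token wait at active processes rather than abort):

```python
# One attempt of the ring algorithm: P0 sends a white token to Pn-1; it travels
# down to P1 and back to P0. Detection succeeds iff the token returns white.
def token_round(colors, passive):
    token = "white"
    for i in range(len(colors) - 1, 0, -1):   # visit Pn-1, Pn-2, ..., P1
        if not passive[i]:
            return False                      # an active process: no termination yet
        if colors[i] == "black":              # this process's passivity was spoiled
            token = "black"
            colors[i] = "white"               # whiten for the next attempt
    return token == "white"
```

A black process (or a still-active one) forces another attempt; only an all-white, all-passive round trip reports termination.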
• General termination detection algorithm
There is one special process P that does not participate in the application itself. This process starts at weight 0; all other processes start with weight 1/n. When a message is sent, the sending
process transfers half of its weight to the recipient [3]. When a process becomes passive, it transfers its weight to P. Termination is detected when P has weight 1.
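The weight-throwing bookkeeping can be sketched with exact fractions so the weights always sum to exactly 1 (the class is my illustration; a full version would also re-activate a passive recipient with the message's weight):

```python
# Sketch of weight-throwing termination detection; P is the collector.
from fractions import Fraction

class WeightTermination:
    def __init__(self, n):
        self.collected = Fraction(0)          # weight held by the special process P
        self.weight = [Fraction(1, n)] * n    # each of the n processes starts at 1/n

    def send(self, i, j):                     # Pi sends a message to Pj
        half = self.weight[i] / 2             # half the weight travels with the message
        self.weight[i] -= half
        self.weight[j] += half

    def become_passive(self, i):              # hand all remaining weight back to P
        self.collected += self.weight[i]
        self.weight[i] = Fraction(0)

    def terminated(self):
        return self.collected == 1            # all weight is back at P
```

Using Fraction avoids the floating-point rounding that repeated halving would otherwise accumulate (in practice, implementations often count halvings symbolically for the same reason).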

[1] Vector clocks are only used for message ordering.
[2] Vector clocks are used for the entire operation, not just message ordering.
[3] Transferring means that the sender loses weight and the recipient gains weight.
