Distributed Systems
Shikha Gautam
Assistant Professor
KIET, Ghaziabad
UNIT-1
• Characterization of Distributed Systems:
Introduction, Examples of Distributed Systems,
Resource Sharing and the Web, Challenges,
Architectural Models, Fundamental Models.
• Theoretical Foundation for Distributed Systems:
Limitations of Distributed Systems, Absence of
Global Clock, Shared Memory, Logical Clocks,
Lamport’s & Vector Logical Clocks.
• Concepts in Message Passing System.
What is a distributed system?
• A network of autonomous computers that
communicate by message passing to achieve a
common goal.
• Consequences:
– Concurrency
– No global clock
– Independent failures
• Motivation
– Resource sharing
Architecture of D.S.
Why do we use Distributed Systems?
What are the advantages?
Other Advantages
Examples Of Distributed Systems
Resource Sharing and the Web
Design Challenges of D.S.
• Heterogeneity
• Openness
• Security
• Scalability
• Failure handling
• Concurrency
• Transparency
Challenges: Openness
Challenges: Security
Challenges: Scalability
Challenges: Failure handling
Challenges: Concurrency
Challenges: Transparency
• Systems that are intended for use in real-world
environments should be designed to function
correctly in the widest possible range of
circumstances and in the face of many possible
difficulties and threats.
• Different system models are:
System Model
1. Architectural models: Based on the
architectural style, e.g., classifying the
processes as server, client, and peer.
2. Fundamental models: Based on some
fundamental properties, such as
characteristics, failures, and security.
Architectural models
• Architecture: structure in terms of separately
specified components and how
these components are placed.
• Overall goal: structure will meet present and
likely future demands
• Major concerns: make system
– Reliable
– Manageable
– Adaptable
– Cost-effective
• Architectural model
– Simplifies & abstracts functions of components
– Placement of components
– Interrelationships between components
• Overview
– Software layers
– System architectures
– Design requirements
Architectural models
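Architectural models:
Software layers
[Figure: software and hardware service layers, top to bottom]
Applications, services
Middleware
Operating system
Computer and network hardware
(Platform = operating system + computer and network hardware)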
• Platform
– Various implementations
– Provides communication & cooperation between
processes
Architectural models:
Software layers
• Middleware
– Purpose
• Mask heterogeneity
• Provide convenient programming model
– Raises level of communication activities
• Remote method invocation: RMI, CORBA, DCOM
• Group communication
• Notification of events
• Partitioning, replication of shared data
– Provides infrastructural services
• Naming, transactions, persistent storage
Architectural models:
Software layers
• Middleware limitations :
end-to-end argument
– Some aspects require support at application level
Architectural models:
System architectures
Various Architectural models
1. Client-server model
2. Peer-to-peer
Variations of the above two:
• Proxy server
• Mobile code
• Mobile agents
• Network computers
• Thin clients
• Mobile devices
Architectural models:
1. Client-Server Model
• Defines roles for two interacting entities
1. Client:
 needs a particular service
 sends request to server and gets reply
2. Server:
 awaits requests from clients
 performs requested function
• The client-server model is usually based on a simple
request/reply protocol, implemented with send/receive
primitives or using remote procedure calls (RPC) or
remote method invocation (RMI).
• Server can be client of another server.
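The request/reply pattern described above can be sketched with two in-process queues standing in for the network. This is only an illustration under stated assumptions, not a real RPC or RMI system: the names `server`, `call`, and the `upper` operation are hypothetical.

```python
import queue
import threading

requests = queue.Queue()   # stands in for the client -> server channel
replies = queue.Queue()    # stands in for the server -> client channel


def server():
    """Await requests, perform the requested function, send the reply."""
    while True:
        op, arg = requests.get()          # receive a request
        if op == "stop":
            break
        if op == "upper":
            replies.put(arg.upper())      # send the reply
        else:
            replies.put(f"error: unknown operation {op!r}")


def call(op, arg):
    """Client side: send a request and block until the reply arrives."""
    requests.put((op, arg))
    return replies.get(timeout=1.0)


worker = threading.Thread(target=server, daemon=True)
worker.start()
result = call("upper", "hello")
requests.put(("stop", None))              # ask the server loop to exit
worker.join()
print(result)
```

The client blocks in `call` until the reply arrives, which is the behaviour that RPC and RMI hide behind an ordinary procedure or method call.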
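Client-Server Model
[Figure: clients invoke individual servers. Each client process sends an invocation to a server process and receives a result; a server may in turn invoke another server. Key: process, computer.]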
Architectural models:
2. Peer-to-Peer Model
• All processes (objects) play a similar role.
• Processes interact without particular distinction between
clients and servers.
• The pattern of communication depends on the particular
application.
• A large number of data objects are shared; any individual
computer holds only a small part of the application database.
• Processing and communication loads for access to objects
are distributed across many computers and access links.
• This is the most general and flexible model.
Peer-to-Peer Model
Proxy servers and caches
[Figure: clients access web servers through a proxy server]
+ Reduce load on network & web servers
- Consistency!
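The trade-off above (less load, but possible inconsistency) can be illustrated with a toy caching proxy. This is a sketch under stated assumptions: `OriginServer`, `CachingProxy`, and the page contents are hypothetical, and a real proxy would use expiry times or validation rather than manual invalidation.

```python
class OriginServer:
    """Hypothetical web server holding the authoritative copy of each page."""

    def __init__(self):
        self.pages = {"/index": "v1"}
        self.hits = 0

    def get(self, path):
        self.hits += 1
        return self.pages[path]


class CachingProxy:
    """Serves repeated requests from a local cache to offload the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:                    # cache miss
            self.cache[path] = self.origin.get(path)
        return self.cache[path]

    def invalidate(self, path):
        self.cache.pop(path, None)


origin = OriginServer()
proxy = CachingProxy(origin)
first = proxy.get("/index")       # fetched from the origin
second = proxy.get("/index")      # served from the cache
origin.pages["/index"] = "v2"     # the origin changes behind the proxy
stale = proxy.get("/index")       # the cached copy is now inconsistent
proxy.invalidate("/index")
fresh = proxy.get("/index")       # refetched after invalidation
```

Four client requests reach the origin only twice, but the third request returns an out-of-date page until the cache entry is invalidated.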
Mobile code
– Good interactive response
– Potential security threat
[Figure: web applets. a) A client request results in the downloading of applet code from the web server. b) The client then interacts with the applet.]
Architectural models:
Design requirements
• Minimal requirement:
– maintain functionality of a non-distributed system
• added value:
– extended resource access
– extended application interface for explicit sharing, fault tolerance, etc.
– advanced end user applications: CSCW (computer supported
cooperative work)
• Quality of Service:
– Reliability
– Security
– Performance
– Adaptability
User Requirements
Quality of service
• Reliability and availability
– reliability = measure of the likelihood that the
system does not deviate from the designed behaviour
– increased by enabling failure detection and
recovery
– highly reliable services → often worse response times
– fault tolerant system: detects failures and either
• fails gracefully (predictably)
• masks the fault
User Requirements
Quality of service
• Security: new problems
– privacy and integrity of users' data in network
packets can be compromised
• by tampering with the network cable
• by connecting a machine to read and/or inject data
packets
– openness to interface with system software
• not all machines are physically secure
User Requirements
Quality of service
• Performance
– Responsiveness
– Throughput
• Processing speed at clients & servers + data transfer
rate
– Balancing computational load
Fundamental Models
• System model gives answers to
– What are the main entities in the system?
– How do they interact?
– What are characteristics that affect individual &
collective behavior?
• Purpose of model:
– Make explicit all relevant assumptions
– Make generalizations concerning what is possible or
impossible
Fundamental models
Three fundamental models:
1. Interaction Model
2. Failure Model
3. Security Model
Fundamental models:
Interaction model
Performance characteristics of
communication channels
1. Latency: the delay between the start of a message's
transmission from one process and the beginning of
its receipt by another process.
2. Bandwidth: Total amount of information that can be
transmitted over a communication channel per
second.
3. Jitter: variation in the time taken to deliver a series
of messages.
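As a rough worked example of the three characteristics, the sketch below estimates delivery time as latency plus size divided by bandwidth, and measures jitter as the spread of observed delivery times. The function names and the sample numbers are assumptions chosen for illustration, and standard deviation is only one of several ways to quantify jitter.

```python
import statistics


def transfer_time(size_bits, latency_s, bandwidth_bps):
    """Approximate delivery time: latency + size / bandwidth."""
    return latency_s + size_bits / bandwidth_bps


def jitter(delivery_times_s):
    """One simple jitter measure: population std dev of delivery times."""
    return statistics.pstdev(delivery_times_s)


# 1 MB message over a 10 Mbit/s channel with 50 ms latency (made-up values)
t = transfer_time(size_bits=8_000_000, latency_s=0.05, bandwidth_bps=10_000_000)

# Delivery times of four consecutive messages, in seconds (made-up values)
j = jitter([0.10, 0.12, 0.10, 0.12])
```

For small messages the latency term dominates; for large ones the bandwidth term does.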
Interaction Model [Synchronous vs
Asynchronous]
Fundamental models:
Interaction model
• Time is important
– E.g. multimedia application requires timeliness
– E.g. Event ordering problem in email Inbox
Item From Subject
1 Z Re:Meeting
2 X Meeting
3 Y Re:Meeting
Fundamental models:
Interaction model
• No global notion of time
• Perfect synchronization of time impossible due to:
– Performance variations:
• Latency
• Bandwidth
• Processing time for messages
– Computers have different clock drift rates
Fundamental models:
Interaction model
• Synchronous distributed systems
– Upper & lower bounds for
• Time to execute processing step
• Message transmission
• Clock drift rate
– Allow
• Use of timeouts to detect process failure
• Guarantee of timeliness (multimedia)
• Partial clock synchronisation
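The use of timeouts to detect process failure can be sketched as follows. The sketch assumes a hypothetical heartbeat scheme in which a process sends a message at least every `max_period` seconds; the parameter names are illustrative. The conclusion is only sound in a synchronous system, where the bounds actually hold.

```python
def suspected(last_heartbeat, now, max_period, max_delay, max_drift=0.0):
    """Return True if the monitored process should be declared failed.

    A heartbeat sent at most every `max_period` seconds must arrive within
    `max_delay` seconds, so silence beyond that bound (stretched by the
    bound on clock drift) implies failure in a synchronous system.
    """
    timeout = (max_period + max_delay) * (1.0 + max_drift)
    return (now - last_heartbeat) > timeout


# Last heartbeat seen at t = 10.0 s; bounds: 1.0 s period, 0.5 s delay
suspect_early = suspected(10.0, now=11.0, max_period=1.0, max_delay=0.5)
suspect_late = suspected(10.0, now=12.0, max_period=1.0, max_delay=0.5)
```

In an asynchronous system the same check can only raise a suspicion, because a slow message cannot be told apart from a crashed sender.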
Fundamental models:
Interaction model
• Asynchronous distributed systems
– No time bounds
– Many systems are asynchronous
• E.g. Internet
• Due to sharing of processors & communication
channels
• Often offer the best performance (because no
resources are wasted)
– Consequences:
• Clock synchronization impossible
• No guarantee of timeliness possible
Fundamental models:
Interaction model
• Solution to ordering problem
– With (perfect) clock synchronization: no problem
– In asynchronous model
• Facts:
– Ordering possible within a single process
– Send m before receive m
→ Event ordering possible
• Implementation: logical clocks
Fundamental models
• Aspects captured in models:
– Interaction
– Failure
– Security
Failure Model
Fundamental models:
Failure model
Class of failure | Affects | Description
Fail-stop | Process | Process halts and remains halted. Other processes may detect this state.
Crash | Process | Process halts and remains halted. Other processes may not be able to detect this state.
Omission | Channel | A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer.
Send-omission | Process | A process completes a send, but the message is not put in its outgoing message buffer.
Receive-omission | Process | A message is put in a process’s incoming message buffer, but that process does not receive it.
Arbitrary (Byzantine) | Process or channel | Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step.
Fundamental models:
Failure model
• Timing failures
– Applicable in synchronous systems
Class of failure | Affects | Description
Clock | Process | Process’s local clock exceeds the bounds on its rate of drift from real time.
Performance | Process | Process exceeds the bounds on the interval between two steps.
Performance | Channel | A message’s transmission takes longer than the stated bound.
Fundamental models:
Failure model
• How can distributed systems fail?
– Partial failure of
• processes and communication channels
• Kind of failure:
• Omission
• Arbitrary
• Timing
Fundamental models:
Failure model
• Omission failure
= Failure to perform an action
a) Processes:
• Subclasses:
– Crash: no further execution
– Fail-stop: crash + detection possible
• Consequences for asynchronous systems
– Failure not detectable
– Reaching agreement impossible
Fundamental models:
Failure model
• Omission failure
– Communication:
• Send-omission
• Receive-omission
• Channel-omission
[Figure: process p sends message m through its outgoing message buffer and the communication channel to the incoming message buffer of process q, which receives m]
Fundamental models:
Failure model
• Arbitrary or Byzantine failures:
= Worst possible failure semantics
• Any behavior possible
– Processes:
• Omit processing steps
• Perform unintended steps
– Communication
• Message contents corrupted
• Non-existing message delivered
• Messages delivered twice
• Rare in practice: checksums and sequence numbers detect such faults
Fundamental models:
Failure model
Examples:
• Checksums: corrupted message → omission failure
• Retransmission of message: hide omission failure
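Both masking techniques can be sketched together: a checksum lets the receiver discard a corrupted message (turning an arbitrary failure into an omission), and retransmission then hides the omission. The helper names and the flaky channel below are hypothetical; CRC-32 from the standard `zlib` module stands in for whatever checksum a real protocol would use.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(data: bytes):
    """Return the payload, or None if the checksum does not match.

    Dropping a corrupted message converts the fault into an omission.
    """
    payload, checksum = data[:-4], data[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None
    return payload


def send_reliably(payload, channel, max_tries=3):
    """Retransmit until a frame survives the channel, masking omissions."""
    for attempt in range(1, max_tries + 1):
        received = unframe(channel(frame(payload)))
        if received is not None:
            return received, attempt
    return None, max_tries


def make_flaky_channel():
    """Hypothetical channel: corrupts the first frame, passes the rest."""
    state = {"calls": 0}

    def channel(data):
        state["calls"] += 1
        if state["calls"] == 1:
            return bytes([data[0] ^ 0xFF]) + data[1:]   # flip bits in byte 0
        return data

    return channel


received, attempts = send_reliably(b"hello", make_flaky_channel())
```

The caller sees a correct message after two attempts and never observes the corrupted frame.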
Fundamental models:
Security model
• Avoid unauthorized use of resources.
• Secure processes and interactions.
• Threat to processes & communication
channels.
[Figure: process p sends message m to process q over a communication channel; the enemy can obtain a copy of m and inject a message m’]
Fundamental models:
Security model
Protecting objects
• Protecting objects/resources by
– giving access rights to users
– associating with each invocation an authority (a user
with access rights) on whose behalf the invocation is
made,
e.g. when a user asks a remote process to print something on a
printer, the authority is that user.
• Server checks identity of authority and checks its
access rights
• Works only if communication is secure.
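The server-side check described above can be sketched as a lookup in an access-rights table. The table contents, the principal names, and the `invoke` function are hypothetical, and the sketch assumes the authority's identity has already been verified over a secure channel.

```python
# object -> {authority: set of permitted operations}; made-up example data
ACCESS_RIGHTS = {
    "printer-1": {"alice": {"print"}, "admin": {"print", "configure"}},
}


def invoke(authority, obj, operation):
    """Perform the operation only if the authority holds the access right."""
    allowed = ACCESS_RIGHTS.get(obj, {}).get(authority, set())
    if operation not in allowed:
        raise PermissionError(f"{authority} may not {operation} {obj}")
    return f"{operation} on {obj} performed for {authority}"


print(invoke("alice", "printer-1", "print"))
```

An unknown object or authority falls through to an empty set of rights, so the default is to refuse.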
Fundamental models:
Security model
• Securing processes and interactions
– Threats to processes
• False identification of sender of message
– Threats to communication channels
• Copy, alter, inject messages
– Denial of service
• Overload resource (channel, processor)
Fundamental models:
Security model
• Defeating security threats
– Cryptography
– Shared secrets
– Authentication
– Secure channels
[Figure: process p, acting for principal A, and process q, acting for principal B, communicate over a secure channel]
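One way to combine a shared secret with cryptography is a message authentication code. The sketch below uses HMAC-SHA256 from Python's standard library; the key value and message are made up, and the key is assumed to have been agreed between the two processes out of band. It protects integrity and authenticates the sender, but it does not hide the message contents.

```python
import hashlib
import hmac

SHARED_SECRET = b"example-secret"   # assumption: known only to p and q


def sign(message: bytes, key: bytes = SHARED_SECRET) -> bytes:
    """Compute an authentication tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes = SHARED_SECRET) -> bool:
    """Check the tag in constant time; False means altered or forged."""
    return hmac.compare_digest(sign(message, key), tag)


m = b"transfer 50 to B"
tag = sign(m)
ok = verify(m, tag)
tampered = verify(b"transfer 500 to B", tag)
forged = verify(m, sign(m, b"wrong key"))
```

An enemy who alters the message or injects one without knowing the key cannot produce a tag that verifies.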
Limitation of Distributed systems
– A DS is a collection of computers that are spatially
separated and do not share a common memory
– Processes communicate by exchanging messages over
communication channel
– Messages are delivered after an arbitrary transmission
delay
– DS suffers some inherent limitations because of lack
of common memory and a system wide common clock
– How can these limitations be overcome?
1. Absence of global clock
• There is no system-wide common clock in a DS.
• Possible solutions:
– Either having a global clock common to all the computers, or
– Having synchronized clocks, one at each computer
• Both of the above solutions are impractical for the following reasons:
– If one global clock is provided in the DS:
• Two processes will observe a global clock value at different instants due
to unpredictable delays
• Two processes will falsely perceive two different instants in physical
time to be a single instant in physical time
– If we try to synchronize the clocks of the different computers:
• These clocks can drift from the physical time and the drift rate may vary
from clock to clock due to technological limitations
• This may also end up with the same result
• We cannot have a system of perfectly synchronized clocks
Impact of the absence of global time
Temporal ordering of events is integral to the design
and development of DS.
– an OS is responsible for scheduling processes
– A basic criterion used in scheduling is the temporal
order in which requests to execute processes arrive
– Hence, algorithms for DS are more difficult to
design and debug
– Also, the up-to-date state of the system is harder to
collect
2. Absence of Shared Memory
• Due to the lack of shared memory, an up-to-date state
of the entire system is not available to any individual
process
• An up-to-date state is necessary for reasoning about the
system’s behavior, debugging and recovery
• A process in a DS can obtain a coherent but partial view
or a complete but incoherent view of the system
• A view is said to be coherent if all the observations of
different processes are made at the same physical time
• A complete view includes the local views (local states)
at all the computers and any messages that are in transit
in the DS
• A complete view is also referred to as a global state.
Need of Synchronized Clocks
Every computer is equipped with a CMOS clock circuit.
This is an electronic device that counts the oscillations
of a crystal.
• Also called a timer; the crystal is usually quartz, oscillating at a
well-defined frequency.
• The timer is associated with two registers: a counter and
a holding register. The counter is decremented by one at each
oscillation.
• When the counter reaches zero, an interrupt is generated and the
counter is reloaded from the holding register; this interrupt is the clock tick.
• Clock ticks typically occur 60-100 times per second.
Physical Clocks
It is impossible to guarantee that crystals in
different computers all run at exactly the same
frequency, so their clocks gradually get out of step.
The resulting difference between the time values of two clocks is called clock skew.
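A small calculation shows how quickly skew builds up. The sketch assumes two clocks that start in agreement and drift at constant rates; the drift rate of 10^-6 seconds per second is an illustrative figure, not a measured one, and the function names are hypothetical.

```python
def clock_reading(real_seconds, drift_rate):
    """Reading of a clock that started at 0 and gains `drift_rate`
    seconds per real second (negative means it runs slow)."""
    return real_seconds * (1.0 + drift_rate)


def skew(real_seconds, drift_a, drift_b):
    """Absolute difference between the readings of two drifting clocks."""
    return abs(clock_reading(real_seconds, drift_a)
               - clock_reading(real_seconds, drift_b))


one_day = 24 * 60 * 60
# One clock runs fast and the other slow by the same assumed rate
skew_after_day = skew(one_day, drift_a=1e-6, drift_b=-1e-6)
print(f"skew after one day: {skew_after_day:.4f} s")
```

Under these assumptions the two clocks differ by about 0.17 seconds after a day, which is why clocks must be resynchronized periodically.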
Drifting of Clocks
Why do we need a global clock?
Lack of Global Time in DS
When each machine has its own clock, an event that occurred
after another event may nevertheless be assigned an earlier time.
Lack of Global Time in DS (Example)
a) Each process has its own clock, and the clocks run at different rates.
b) Lamport's algorithm corrects the clocks.
c) Can add machine ID to break ties
For many purposes it is sufficient that all machines agree on the
same time.
Logical clocks
Often processes need to agree on the order in which events occur.
Logical Clocks
• For a certain class of algorithms, it is the
internal consistency of the clocks that matters.
The convention in these algorithms is to speak
of logical clocks.
Lamport showed clock synchronization need not be
absolute. What is important is that all processes agree on
the order in which events occur.
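Lamport's logical clock rules can be sketched as a small class: advance the counter before each local event or send, and on receipt jump past the timestamp carried by the message. The class and variable names are illustrative, not taken from the slides.

```python
class LamportClock:
    """A per-process logical clock following Lamport's two rules."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or send: advance the clock, return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """Receive: move past the sender's timestamp if it is ahead."""
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
a = p.tick()              # event a on process p
send = p.tick()           # p sends message m, stamped with this value
q_local = q.tick()        # unrelated event on process q
recv = q.receive(send)    # q receives m

# Ties between processes can be broken by appending a process ID
order = sorted([(a, "p"), (q_local, "q")])
```

The receive event always gets a larger timestamp than the matching send, so timestamps respect the happened-before order. The converse does not hold: a smaller timestamp does not prove that one event happened before another.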
Lamport Timestamps
If a happens before b (a → b), then a change made
by a can affect b.
• In general, an event changes the system state,
which in turn influences the occurrence and
outcome of future events
• Past events influence future events, and this
influence among causally related events (those
events that can be ordered by “→”) is referred to
as causal effect.
Causally Related vs Concurrent Events
Lamport’s algorithm corrects the clocks.
Vector clock: In short
Obeying Causality
Concurrent Events
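Vector timestamps make the difference between causally related and concurrent events testable. The sketch below compares timestamps componentwise and applies the usual receive rule; the function names and the three-process example are assumptions for illustration.

```python
def happened_before(u, v):
    """True if vector timestamp u causally precedes v."""
    return all(x <= y for x, y in zip(u, v)) and u != v


def concurrent(u, v):
    """True if neither timestamp causally precedes the other."""
    return u != v and not happened_before(u, v) and not happened_before(v, u)


def merge(local, received, i):
    """Receive rule at process i: componentwise max, then count the receive."""
    merged = [max(x, y) for x, y in zip(local, received)]
    merged[i] += 1
    return merged


send_p0 = [1, 0, 0]     # process 0 sends a message
event_p1 = [0, 1, 0]    # process 1 performs an unrelated local event
recv_p1 = merge(event_p1, send_p0, i=1)   # process 1 receives the message
```

Here `send_p0` and `event_p1` are concurrent, while `send_p0` happened before `recv_p1`, which a single Lamport counter could not distinguish.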
• The purpose of causal ordering of messages is to ensure that
the causal relationship among the "message send" events
is preserved by the corresponding "message receive" events.
- i.e. all the messages are processed in the order in which they were
sent.
Two protocols make use of vector clocks for
the causal ordering of messages in distributed
systems:
1. Birman-Schiper-Stephenson Protocol: processes
are assumed to communicate using broadcast
messages.
2. Schiper-Eggli-Sandoz Protocol: does not require
processes to communicate only through broadcast
messages.
BSS Algorithm
• BSS: Birman-Schiper-Stephenson Protocol
• Broadcast based: a message sent is received by all other
processes.
• Deliver a message to a process only if the messages that causally
precede it have already been delivered to the process.
• Otherwise, buffer the message.
• Accomplished by using a vector accompanying the message.
Processes are assumed to communicate using
broadcast messages.
BSS Algorithm ...
1. Process Pi increments its vector time VTpi[i], timestamps the message m
with it, and broadcasts m. VTpi[i] - 1 denotes the number
of messages from Pi preceding m.
2. Pj != Pi receives m. m is delivered when:
a. VTpj[i] == VTm[i] – 1 [Pj has received all messages from Pi before m]
b. VTpj[k] >= VTm[k] for all k in {1,2,..n} - {i}, where n is the
total number of processes [Pj has received all those messages received by Pi before m]
c. Messages that cannot yet be delivered are queued, sorted by vector
timestamp; concurrent messages are ordered by time of receipt.
3. When m is delivered at Pj, VTpj is updated according to Rule 2 of
vector clocks.
2(a) : Pj has received all Pi’s messages preceding m.
2(b): Pj has received all other messages received by Pi
before sending m.
• All messages are time stamped by the sending process.
[Note: This time is separate from the global time talked about in
the previous sections. Instead each element of the vector
corresponds to the number of messages sent (including this one)
to other processes.]
• A message cannot be delivered until:
– All the messages before this one have been delivered locally.
– All the other messages that were sent out from the
original process have been accounted for as delivered at the
receiving process.
• When a message is delivered, the clock is updated.
• This protocol requires that the processes communicate through
broadcast messages since this would ensure that only one
message could be received at any one time (thus concurrently
time stamped messages can be ordered).
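The BSS delivery test can be sketched as follows. Processes are numbered from 0, a message is represented as a hypothetical `(sender, vector_timestamp, payload)` tuple, and the update on delivery is taken to be the componentwise maximum; these representation choices are assumptions made for the example.

```python
def can_deliver(vt_local, vt_msg, sender):
    """BSS delivery test at a receiver for a message broadcast by `sender`."""
    for k in range(len(vt_local)):
        if k == sender:
            if vt_local[k] != vt_msg[k] - 1:    # condition 2(a)
                return False
        elif vt_local[k] < vt_msg[k]:           # condition 2(b)
            return False
    return True


def receive(vt_local, buffer, delivered, msg):
    """Buffer msg, then deliver every buffered message that is deliverable."""
    buffer.append(msg)
    progress = True
    while progress:
        progress = False
        for m in list(buffer):
            sender, vt_msg, payload = m
            if can_deliver(vt_local, vt_msg, sender):
                buffer.remove(m)
                delivered.append(payload)
                # step 3: componentwise max of local and message vectors
                vt_local[:] = [max(a, b) for a, b in zip(vt_local, vt_msg)]
                progress = True


# P0 broadcasts m1; P1 delivers m1 and then broadcasts m2.
# P2 happens to receive m2 first, so m2 must wait for m1.
m1 = (0, [1, 0, 0], "m1")
m2 = (1, [1, 1, 0], "m2")
vt_p2, buffer_p2, delivered_p2 = [0, 0, 0], [], []
receive(vt_p2, buffer_p2, delivered_p2, m2)
after_first = list(delivered_p2)
receive(vt_p2, buffer_p2, delivered_p2, m1)
```

After the second call, P2 has delivered m1 and then m2, which matches the causal order of the two broadcasts.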
SES Algorithm
• No need for broadcast messages.
• Each process maintains a vector V_P of size N - 1,
N the number of processes in the system.
• V_P is a vector of tuples (P’,t): P’ is the destination
process id and t is a vector timestamp.
• Tm: logical time of sending message m
• Tpi: present logical time at pi
• Initially, V_P is empty.
SES Algorithm: Example
[Figure: three global states (a), (b), (c) of two sites, labelled S1:A and S2:A, connected by a communication channel. Amounts shown: (a) $500 and $200; (b) $450 and $200; (c) $500 and $250.]
Global State
The ability to extract and reason about the global state
of a distributed application has several other important
applications:
• distributed deadlock detection
• distributed termination detection
• distributed debugging
Q: Is it possible to assemble a global state from local
states in the absence of a global clock?
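The difficulty behind this question can be seen in a toy bank transfer. The sketch replays a hypothetical $50 transfer between two accounts and records each balance at an independently chosen step, ignoring the channel; the amounts and the function name are assumptions for illustration.

```python
def recorded_total(record_a_at, record_b_at):
    """Sum of two balances, each recorded at its own step of a $50 transfer.

    Steps: 0 = initial state, 1 = A has sent $50 (money is in transit),
    2 = B has received the $50. Channel contents are not recorded.
    """
    history = [
        {"A": 500, "B": 200, "in_transit": 0},
        {"A": 450, "B": 200, "in_transit": 50},
        {"A": 450, "B": 250, "in_transit": 0},
    ]
    return history[record_a_at]["A"] + history[record_b_at]["B"]


true_total = 700
same_instant = recorded_total(0, 0)     # both recorded before the transfer
too_much = recorded_total(0, 2)         # A before sending, B after receiving
too_little = recorded_total(1, 1)       # the $50 in transit is missed
```

Without a global clock the two sites cannot record "at the same instant", so a usable global state must also account for messages in transit and avoid recording a receive without its send.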
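[Figure: weight-based termination detection. A controlling agent (CA) and processes P, shown as IDLE or ACTIVE, each hold a weight W (W=1 and W=0 at the start). A weight W=w1+w2 (w1>0, w2>0) is split when a message B(w2) is sent, leaving W=w1 at the sender and W=0+w2 at the receiver; a message C(w2) returns weight w2.]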
Distributed Systems Introduction and Importance

Distributed Systems Introduction and Importance

  • 1.
  • 2.
    UNIT-1 • Characterization ofDistributed Systems: Introduction, Examples of distributed Systems, Resource sharing and the Web Challenges. Architectural models, Fundamental Models. • Theoretical Foundation for Distributed System: Limitation of Distributed system, absence of global clock, shared memory, Logical clocks ,Lamport’s & vectors logical clocks. • Concepts in Message Passing System.
  • 4.
    What is adistributed system? • A network of autonomous computers that communicate by message passing to achieve a common goal. • Consequences: – Concurrency – No global clock – Independent failures • Motivation – Resource sharing
  • 5.
  • 10.
    Why we useDistributed Systems? What are the advantages?
  • 11.
  • 12.
  • 19.
  • 26.
    Design Challenges ofD.S. • Heterogeneity • Openness • Security • Scalability • Failure handling • Concurrency • Transparency
  • 28.
  • 29.
  • 32.
  • 37.
  • 38.
  • 39.
  • 41.
    • Systems thatare intended for use in real-world environments should be designed to function correctly in the widest possible range of circumstances and in the face of many possible difficulties and threats. • Different system models are: System Model
  • 42.
    1. Architectural models:Based on the architectural style, e.g., classifying the processes as server, client, and peer. 2. Fundamental model: Based on the some fundamental properties, such as characteristics, failures, and security.
  • 43.
  • 44.
    Architectural models • Architecture:structure in terms of separately specified components and how these components are placed. • Overall goal: structure will meet present and likely future demands • Major concerns: make system – Reliable – Manageable – Adaptable – Cost-effective
  • 45.
    • Architectural model –Simplifies & abstracts functions of components – Placement of components – Interrelationships between components • Overview – Software layers – System architectures – Design requirements Architectural models
  • 46.
    Architectural models: Software layers Applications,serv ices Computer and network hardware Platf orm Operating system Middleware
  • 47.
    • Platform – Variousimplementations – Provides communication & cooperation between processes Architectural models: Software layers
  • 48.
    Architectural models: Software layers •Middleware – Purpose • Mask heterogeneity • Provide convenient programming model – Raises level of communication activities • Remote method invocation: RMI, CORBA, DCOM • Group communication • Notification of events • Partitioning, replication of shared data – Provides infrastructural services • Naming, transactions, persistent storage
  • 49.
    Architectural models: Software layers •Middleware limitations : end-to-end argument – Some aspects require support at application level
  • 50.
    Architectural models • Architecturalmodel – Simplifies & abstracts functions of components – Placement of components – Interrelationships between components • Overview – Software layers – System architectures – Design requirements
  • 51.
  • 52.
    Various Architectural models 1.Client-server model 2. Peer-to-peer Variations of the above two: • Proxy server • Mobile code • Mobile agents • Network computers • Thin clients • Mobile devices
  • 53.
    Architectural models: 1. Client-ServerModel • Defines roles for two interacting entities 1. Client:  needs a particular service  sends request to server and gets reply 2. Server:  awaits requests from clients  performs requested function • The client-server model is usually based on a simple request/reply protocol, implemented with send/receive primitives or using remote procedure calls (RPC) or remote method invocation (RMI). • Server can be client of another server.
  • 54.
    Client-Server Model Serv er Client Client invocation result Serv erinv ocation result Process: Key: Computer:
  • 58.
    Architectural models: 2. Peer-to-PeerModel All processes (objects) play similar role. • Processes interact without particular distinction between clients and servers. • The pattern of communication depends on the particular application. • A large number of data objects are shared; any individual computer holds only a small part of the application database. • Processing and communication loads for access to objects are distributed across many computers and access links. • This is the most general and flexible model.
  • 59.
  • 61.
    Proxy servers andcaches Client Proxy Web server Web server server Client + Reduce load on network & web servers - Consistency!
  • 63.
    Mobile code – Goodinteractive response – Potential security threat a) client request results in the downloading of applet code Web server Client Web serverApplet Applet code Client b) client interacts with the applet
  • 68.
    Architectural models • Architecturalmodel – Simplifies & abstracts functions of components – Placement of components – Interrelationships between components • Overview – Software layers – System architectures – Design requirements
  • 69.
  • 70.
    Architectural models: Design requirements •Minimal requirement: – maintain functionality of a non-distributed system • added value: – extended resource access – extended application interface for explicit sharing, fault tolerance, etc. – advanced end user applications: CSCW (computer supported cooperative work) • Quality of Service: – Reliability – Security – Performance – Adaptability
  • 71.
    User Requirements Quality ofservice • Reliability and availability – reliability = measure of the likelihood of the system to deviate from the designed behaviour – increased by enabling failure detection and recovery – highly reliable services  often worse response – fault tolerant system: detects failures and either • fails gracefully (predictably) • masks the fault
  • 72.
    User Requirements Quality ofservice • Security: new problems – privacy and integrity of users data in network packets • by tampering the network cable • by connecting a machine to read and/or inject data packets – openness to interface with system software • not all machines are physically secure
  • 73.
    User Requirements Quality ofservice • Performance – Responsiveness – Throughput • Processing speed at clients & servers + data transfer rate – Balancing computational load
  • 74.
  • 75.
    Fundamental Models • Systemmodel gives answers to – What are the main entities in the system? – How do they interact? – What are characteristics that affect individual & collective behavior? • Purpose of model: – Make explicit all relevant assumptions – Make generalizations concerning what is possible or impossible
  • 76.
    Fundamental models Three fundamentalmodels: 1. Interaction Model 2. Failure Model 3. Security Model
  • 77.
  • 78.
    Performance characteristics of communicationchannels 1. Latency: The delay time. The start of message transmission from one process and the beginning of its receipt by another process. 2. Bandwidth: Total amount of information that can be transmitted over a communication channel per second. 3. Jitter: variation in the time taken to deliver a series of messages.
  • 79.
  • 81.
    Fundamental models: Interaction model •Time is important – E.g. multimedia application requires timeliness – E.g. Event ordering problem in email Inbox Item From Subject 1 Z Re:Meeting 2 X Meeting 3 Y Re:Meeting
  • 82.
    Fundamental models: Interaction model •No global notion of time • Synchronization of time impossible due to: – Performance variations: • Latency • Bandwidth • Processing time for messages – Computers have different clock drift rates
  • 83.
    Fundamental models: Interaction model •Synchronous distributed systems – Upper & lower bounds for • Time to execute processing step • Message transmission • Clock drift rate – Allow • Use of timeouts to detect process failure • Guarantee of timeliness (multimedia) • Partial clock synchronisation
  • 84.
    Fundamental models: Interaction model •Asynchronous distributed systems – No time bounds – Many systems are asynchronous • E.g. Internet • Due to sharing of processors & communication channels • Often offer the best performance (because no resources are wasted) – Consequences: • Clock synchronization impossible • No guarantee of timeliness possible
  • 85.
    Fundamental models: Interaction model •Solution to ordering problem –With (perfect) clock synchronization no problem –In asynchronous model • Facts: – Ordering possible within a single process – Send m before receive m Event ordering possible • Implementation: logical clocks
  • 87.
    Fundamental models • Aspectscaptured in models: – Interaction – Failure – Security
  • 88.
  • 90.
    Fundamental models: Failure model Classof failure Affects Description Fail-stop Process Process halts and remains halted. Other processes may detect this state. Crash Process Process halts and remains halted. Other processes may not be able to detect this state. Omission Channel A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer. Send-omission Process A process completes a send,but the message is not put in its outgoing message buffer. Receive-omissionProcess A message is put in a process’s incoming message buffer, but that process does not receive it. Arbitrary (Byzantine) Process or channel Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step.
  • 91.
    Fundamental models: Failure model •Timing failures – Applicable in synchronous systems Class of Failure Affects Description Clock Process Process’s local clock exceeds the bounds on its rate of drift from real time. Performance Process Process exceeds the bounds on the interval between two steps. Performance Channel A message’s transmission takes longer than the stated bound.
  • 92.
    Fundamental models: Failure model •How can distributed systems fail? – Partial failure of • processes and communication channels • Kind of failure: • Omission • Arbitrary • Timing
  • 93.
    Fundamental models: Failure model •Omission failure = Failure to perform an action a) Processes: • Subclasses: – Crash no further execution – Fail-stop crash + detection possible • Consequences for asynchronous systems – Failure not detectable – Reaching agreement impossible
  • 94.
    Fundamental models: Failure model •Omission failure – Communication: • Send-omission • Receive-omission • Channel-omission processp process q Communication channel send Outgoing m essage buff er Incoming message buff er receivem
  • 95.
    Fundamental models: Failure model •Arbitrary or Byzantine failures: = Worst possible failure semantics • Any behavior possible – Processes: • Omit processing steps • Perform unintended steps – Communication • Message contents corrupted • Non-existing message delivered • Messages delivered twice • Rare: checksums, sequence numbers
  • 96.
    Fundamental models: Failure model Examples: •Checksums: corrupted message  omission failure • Retransmission of message: hide omission failure
  • 97.
    Fundamental models • Aspectscaptured in models: – Interaction – Failure – Security
  • 98.
    Fundamental models: Security model •Avoid unauthorized use of resources. • Secure processes and interactions. • Threat to processes & communication channels. Communication channel Copy of m Process p Process qm The enemy m’
  • 99.
  • 100.
    Protecting objects • Protectingobjects/resources by – giving access rights to users – associating with each invocation an authority (a user with access rights) who allows for the use of the object or asked for it e.g. user asks a remote process to print something on his printer,the authority here is the user. • Server checks identity of authority and checks its access rights • Works only if communication is secure.
  • 101.
    Fundamental models: Security model
    • Securing processes and interactions
    – Threats to processes
      • False identification of the sender of a message
    – Threats to communication channels
      • Messages can be copied, altered, or injected
    – Denial of service
      • Overloading a resource (channel, processor)
  • 102.
    Fundamental models: Security model
    • Defeating security threats
    – Cryptography
    – Shared secrets
    – Authentication
    – Secure channels
    [Figure: principal A (process p) and principal B (process q) connected by a secure channel.]
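As one possible illustration of authentication with a shared secret, the sketch below tags each message with an HMAC so that process q can check that the message came from a holder of the key and was not altered. The key value and the function names `sign` and `verify` are assumptions made for this example; the sketch gives integrity and sender authentication only, not confidentiality or replay protection.

```python
import hashlib
import hmac

# Hypothetical key, assumed to have been agreed between p and q out of band.
SHARED_SECRET = b"example-key-known-to-p-and-q"
TAG_LENGTH = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(message: bytes, key: bytes = SHARED_SECRET) -> bytes:
    """Append an HMAC-SHA256 tag computed with the shared secret."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def verify(packet: bytes, key: bytes = SHARED_SECRET):
    """Return the message if its tag is valid, otherwise None."""
    message, tag = packet[:-TAG_LENGTH], packet[-TAG_LENGTH:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    return message if hmac.compare_digest(tag, expected) else None
```

An enemy who alters the message or injects a new one without knowing the key cannot produce a matching tag, so `verify` returns None for such packets.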
  • 103.
  • 104.
    Limitation of Distributed systems
    – A DS is a collection of computers that are spatially separated and do not share a common memory
    – Processes communicate by exchanging messages over communication channels
    – Messages are delivered after an arbitrary transmission delay
    – A DS suffers some inherent limitations because it lacks a common memory and a system-wide common clock
    – How can these limitations be overcome?
  • 105.
    1. Absence of global clock
    • There is no system-wide common clock in a DS.
    • Possible solutions:
    – Either have one global clock common to all the computers, or
    – Have synchronized clocks, one at each computer
    • Both solutions are impractical:
    – If one global clock is provided in the DS:
      • Two processes will observe a global clock value at different instants because of unpredictable message delays
      • Two processes may therefore falsely perceive two different instants in physical time as a single instant
    – If the clocks of the different computers are synchronized:
      • Each clock drifts from physical time, and the drift rate varies from clock to clock because of technological limitations
      • This leads to the same problem
    • We cannot have a system of perfectly synchronized clocks
  • 106.
    Impact of the absence of global time
    • Temporal ordering of events is integral to the design and development of a DS.
    – An OS is responsible for scheduling processes
    – A basic criterion used in scheduling is the temporal order in which requests to execute processes arrive
    – Without global time this order cannot always be determined, so algorithms for a DS are more difficult to design and debug
    – Also, the up-to-date state of the system is harder to collect
  • 107.
    2. Absence of Shared Memory
    • Due to the lack of shared memory, an up-to-date state of the entire system is not available to any individual process
    • Such a state is needed for reasoning about the system’s behavior, debugging, and recovery
    • A process in a DS can obtain either a coherent but partial view or a complete but incoherent view of the system
    • A view is coherent if all the observations of the different processes are made at the same physical time
    • A complete view includes the local views (local states) at all the computers and any messages that are in transit in the DS
    • A complete view is also referred to as a global state.
  • 108.
  • 109.
    Physical Clocks
    • Every computer is equipped with a CMOS clock circuit, an electronic device that counts the oscillations of a crystal.
    • The circuit is also called a timer; the crystal is usually quartz and oscillates at a well-defined frequency.
    • The timer is associated with two registers, a counter and a holding register. The counter is decremented by one at each oscillation.
    • When the counter reaches zero, an interrupt is generated and the counter is reloaded from the holding register; this interrupt is the clock tick.
    • Clock ticks typically occur at a frequency of 60-100 ticks per second.
  • 110.
    It is impossible to guarantee that the crystals in different computers all run at exactly the same frequency, so the clocks gradually drift apart. The resulting difference in time values is called clock skew.
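A small worked example may help show how quickly skew builds up. The sketch assumes a simple linear drift model and hypothetical drift rates of plus and minus 10 microseconds per second; neither comes from the slides.

```python
def clock_reading(real_seconds: float, drift_rate: float) -> float:
    """Reading of a clock that gains drift_rate seconds for every real second."""
    return real_seconds * (1.0 + drift_rate)


def skew(real_seconds: float, drift_a: float, drift_b: float) -> float:
    """Absolute difference between the readings of two drifting clocks."""
    return abs(clock_reading(real_seconds, drift_a)
               - clock_reading(real_seconds, drift_b))


ONE_DAY = 24 * 60 * 60  # seconds
# One clock runs fast and the other slow by the same assumed rate.
skew_after_one_day = skew(ONE_DAY, drift_a=1e-5, drift_b=-1e-5)
```

Under these assumptions the two clocks differ by about 1.7 seconds after one day, which is already enough to misorder events that are milliseconds apart.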
  • 111.
  • 112.
    Why do we need a global clock?
  • 113.
    Lack of Global Time in DS
  • 114.
    Lack of Global Time in DS (Example)
    When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time.
  • 117.
    a) Each process has its own clock, and the clocks run at different rates. b) Lamport's algorithm corrects the clocks. c) The machine ID can be appended to break ties.
  • 118.
    Logical clocks
    For many purposes it is sufficient that all machines agree on the same time, even if that time differs from real time. Often processes need to agree only on the order in which events occur.
  • 119.
    Logical Clocks
    • For a certain class of algorithms, it is the internal consistency of the clocks that matters, not whether they are close to real time. The convention in these algorithms is to speak of logical clocks.
    • Lamport showed that clock synchronization need not be absolute. What is important is that all processes agree on the order in which events occur.
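A logical clock of the kind Lamport described can be sketched as a counter per process. The class and method names below are assumptions for illustration; the rules they implement are the usual ones: increment before each local event or send, and on receipt move the clock past the timestamp carried by the message.

```python
class LamportClock:
    """Minimal Lamport logical clock for one process (illustrative sketch)."""

    def __init__(self, process_id: int):
        self.process_id = process_id
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Correct the clock so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time

    def stamp(self):
        """(time, process id) pair; the id breaks ties between equal times."""
        return (self.time, self.process_id)
```

Comparing `stamp()` tuples gives a total order on events: the logical time is compared first, and the process ID decides between events with equal times.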
  • 121.
  • 126.
    • If a happens before b (a → b), then a may causally affect b: a change in a can influence b.
    • In general, an event changes the system state, which in turn influences the occurrence and outcome of future events.
    • Past events thus influence future events; this influence among causally related events (those that can be ordered by “→”) is referred to as a causal effect.
  • 128.
    Causally Related vs. Concurrent Events
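With vector clocks, the distinction in this slide title can be tested directly on timestamps. The sketch below assumes that each event carries a vector timestamp of equal length (one entry per process); the function names are chosen for the example.

```python
def happened_before(u, v) -> bool:
    """True if the event stamped u causally precedes the event stamped v.

    u precedes v when every entry of u is <= the matching entry of v
    and at least one entry is strictly smaller.
    """
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def concurrent(u, v) -> bool:
    """True if neither of two distinct events causally precedes the other."""
    return (tuple(u) != tuple(v)
            and not happened_before(u, v)
            and not happened_before(v, u))
```

For example, timestamps (1, 0, 0) and (2, 1, 0) are causally related, while (2, 0, 0) and (0, 0, 1) are concurrent because each is ahead of the other in some entry.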
  • 133.
  • 141.
  • 142.
  • 144.
  • 149.
    • The purpose of causal ordering of messages is to ensure that the causal relationship between "message send" events is preserved by the corresponding "message receive" events.
    – i.e. if the sending of m1 happens before the sending of m2, every process that receives both delivers m1 before m2.
  • 151.
    Two protocols make use of vector clocks for the causal ordering of messages in distributed systems:
    1. Birman-Schiper-Stephenson Protocol: processes are assumed to communicate using broadcast messages.
    2. Schiper-Eggli-Sandoz Protocol: does not require processes to communicate only through broadcast messages.
  • 153.
    BSS Algorithm
    • BSS: Birman-Schiper-Stephenson Protocol
    • Broadcast based: a message sent is received by all other processes.
    • Deliver a message to a process only if the message immediately preceding it has been delivered to that process.
    • Otherwise, buffer the message.
    • Accomplished by attaching a vector timestamp to each message.
  • 154.
    BSS Algorithm ...
    1. Process Pi increments its vector time VTpi[i], timestamps the message m with it, and broadcasts m. VTpi[i] - 1 denotes the number of messages from Pi preceding m.
    2. Pj != Pi receives m. m is delivered when:
      a. VTpj[i] == VTm[i] - 1 [Pj has received all of Pi’s messages preceding m]
      b. VTpj[k] >= VTm[k] for all k in {1,2,..n} - {i}, where n is the total number of processes [Pj has received all other messages received by Pi before sending m]
      c. Delayed messages are queued, sorted by vector timestamp; concurrent messages are ordered by time of receipt.
    3. When m is delivered at Pj, VTpj is updated according to Rule 2 of vector clocks.
  • 155.
    • All messages are timestamped by the sending process. [Note: This time is separate from the global time discussed in the previous sections. Instead, element i of the vector counts the messages broadcast by process Pi (including this one).]
    • A message cannot be delivered until:
    – All earlier messages from the same sender have been delivered locally.
    – All other messages that the sender had received before sending it have also been delivered at the receiving process.
    • When a message is delivered, the clock is updated.
    • This protocol requires that the processes communicate through broadcast messages, since every process then sees every message and concurrently timestamped messages can be ordered by time of receipt.
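The delivery rule of the BSS protocol can be sketched as follows. This is a simplified illustration under stated assumptions: processes are indexed from 0 rather than 1, a message is modeled as a `(sender, timestamp, payload)` tuple, the network is left to the caller, and the buffer is scanned in order of receipt instead of being kept sorted. The class name `BSSProcess` and its methods are made up for the example.

```python
class BSSProcess:
    """One process in a causal broadcast group (illustrative sketch)."""

    def __init__(self, index: int, n: int):
        self.index = index      # position of this process in the vector
        self.vt = [0] * n       # vector time VTp
        self.buffer = []        # messages that cannot be delivered yet
        self.delivered = []     # payloads in delivery order

    def broadcast(self, payload):
        """Step 1: increment own entry and timestamp the message."""
        self.vt[self.index] += 1
        return (self.index, list(self.vt), payload)

    def _deliverable(self, sender: int, vtm) -> bool:
        # Condition 2(a): this is the next message from the sender.
        if self.vt[sender] != vtm[sender] - 1:
            return False
        # Condition 2(b): everything the sender had seen has been delivered here.
        return all(self.vt[k] >= vtm[k]
                   for k in range(len(vtm)) if k != sender)

    def receive(self, message):
        """Step 2 and 3: buffer the message, then deliver whatever is ready."""
        self.buffer.append(message)
        progress = True
        while progress:
            progress = False
            for msg in list(self.buffer):
                sender, vtm, payload = msg
                if self._deliverable(sender, vtm):
                    self.buffer.remove(msg)
                    # Merge the timestamp entry by entry (vector clock update).
                    self.vt = [max(a, b) for a, b in zip(self.vt, vtm)]
                    self.delivered.append(payload)
                    progress = True
```

If P0 broadcasts m1, P1 delivers it and then broadcasts m2, a third process that happens to receive m2 first keeps it buffered until m1 arrives, and then delivers m1 followed by m2.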
  • 158.
    SES Algorithm
    • No need for broadcast messages.
    • Each process P maintains a vector V_P of size N - 1, where N is the number of processes in the system.
    • Each element of V_P is a tuple (P’, t): P’ is a destination process id and t is a vector timestamp.
    • Tm: logical time at which message m is sent
    • Tpi: present logical time at Pi
    • Initially, V_P is empty.
  • 160.
  • 161.
    Global State
    [Figure: three snapshots of two sites joined by a communication channel: (a) S1:A = $500, S2:A = $200; (b) S1:A = $450, S2:A = $200; (c) S1:A = $500, S2:A = $250.]
  • 163.
    The ability to extract and reason about the global state of a distributed application has several other important applications:
    • distributed deadlock detection
    • distributed termination detection
    • distributed debugging
    Q: Is it possible to assemble a global state from local states in the absence of a global clock?
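The difficulty behind this question can be illustrated with a two-site bank transfer. The sketch below is an assumption-laden toy: the balances echo the dollar amounts in the global state figure, while the $50 transfer, the function names, and the idea of an observer who ignores the channel are introduced only for this example.

```python
def transfer_states(s1=500, s2=200, amount=50):
    """Return the true (s1, in_transit, s2) state before, during and after a transfer."""
    before = (s1, 0, s2)
    during = (s1 - amount, amount, s2)   # money has left S1 but not yet reached S2
    after = (s1 - amount, 0, s2 + amount)
    return before, during, after


def assembled_total(s1_view, s2_view):
    """Total seen by an observer who combines two local states and ignores the channel."""
    return s1_view[0] + s2_view[2]


before, during, after = transfer_states()

# Every coherent view (all parts taken at the same instant) conserves the money.
true_totals = [sum(state) for state in (before, during, after)]

# S1 recorded before the transfer, S2 recorded after it: the $50 is counted twice.
double_counted = assembled_total(before, after)

# S1 recorded during the transfer, S2 before delivery: the $50 in transit is missed.
missing = assembled_total(during, before)
```

The true total is $700 at every instant, but the two incoherent combinations report $750 and $650. Local states recorded at different times, or without the messages in transit, therefore do not add up to a meaningful global state.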
  • 173.