Ajay D. Kshemkalyani, Mukesh Singhal978-0-511-39341-9, 978-0-521-87634-6
Table of contents :
Cover……Page 1
Half-title……Page 3
Title……Page 5
Copyright……Page 6
Dedication……Page 7
Contents……Page 9
Background……Page 17
Readership……Page 18
Access to resources……Page 19
1.1 Definition……Page 21
1.2 Relation to computer system components……Page 22
1.3 Motivation……Page 23
1.4.1 Characteristics of parallel systems……Page 25
1.4.2 Flynn’s taxonomy……Page 30
Coupling……Page 31
Granularity of a program……Page 32
1.5 Message-passing systems versus shared memory systems……Page 33
1.6.1 Blocking/non-blocking, synchronous/asynchronous primitives……Page 34
1.6.2 Processor synchrony……Page 38
1.7 Synchronous versus asynchronous executions……Page 39
1.7.3 Emulations……Page 41
1.8.1 Distributed systems challenges from a system perspective……Page 42
Time and global state in a distributed system……Page 44
Synchronization/coordination mechanisms……Page 45
Data replication, consistency models, and caching……Page 46
Distributed shared memory abstraction……Page 47
Reliable and fault-tolerant distributed systems……Page 48
Load balancing……Page 49
Mobile systems……Page 50
Ubiquitous or pervasive computing……Page 51
Distributed agents……Page 52
1.9 Selection and coverage of topics……Page 53
1.10 Chapter summary……Page 54
1.11 Exercises……Page 55
1.12 Notes on references……Page 56
References……Page 57
2.1 A distributed program……Page 59
2.2 A model of distributed executions……Page 60
Causal precedence relation……Page 61
2.3 Models of communication networks……Page 62
2.4 Global state of a distributed system……Page 63
2.4.1 Global state……Page 64
2.5 Cuts of a distributed computation……Page 65
2.6 Past and future cones of an event……Page 66
2.7 Models of process communications……Page 67
2.10 Notes on references……Page 68
References……Page 69
3.1 Introduction……Page 70
3.2.2 Implementing logical clocks……Page 72
3.3.1 Definition……Page 73
No strong consistency……Page 74
3.4.1 definition……Page 75
Isomorphism……Page 76
3.4.3 On the size of vector clocks……Page 77
3.5 Efficient implementations of vector clocks……Page 79
3.5.1 Singhal–Kshemkalyani’s differential technique……Page 80
3.5.2 Fowler–Zwaenepoel’s direct-dependency technique……Page 82
3.6 Jard–Jourdan’s adaptive technique……Page 85
3.7.1 Definition……Page 88
3.8 Virtual time……Page 89
3.8.1 Virtual time definition……Page 90
3.8.2 Comparison with Lamport’s logical clocks……Page 91
3.8.3 Time warp mechanism……Page 92
Antimessages and the rollback mechanism……Page 93
Global virtual time……Page 95
Memory management and flow control……Page 96
Snapshots and crash recovery……Page 97
3.9.1 Motivation……Page 98
3.9.2 Definitions and terminology……Page 99
Clock offset and delay estimation……Page 100
3.10 Chapter summary……Page 101
References……Page 104
4.1 Introduction……Page 107
4.2.1 System model……Page 110
4.2.3 Interpretation in terms of cuts……Page 111
4.2.4 Issues in recording a global state……Page 112
4.3.1 Chandy–Lamport algorithm……Page 113
The algorithm……Page 114
4.3.2 Properties of the recorded global state……Page 115
4.4.1 Spezialetti–Kearns algorithm……Page 117
Efficient dissemination of the recorded snapshot……Page 118
4.4.2 Venkatesan’s incremental snapshot algorithm……Page 119
4.4.3 Helary’s wave synchronization method……Page 120
4.5 Snapshot algorithms for non-FIFO channels……Page 121
4.5.1 Lai–Yang algorithm……Page 122
4.5.2 Li et al.’s algorithm……Page 123
4.5.3 Mattern’s algorithm……Page 125
4.6 Snapshots in a causal delivery system……Page 126
4.6.2 Channel state recording in Acharya–Badrinath algorithm……Page 127
4.6.3 Channel state recording in Alagar–Venkatesan algorithm……Page 128
4.7 Monitoring global state……Page 129
4.8 Necessary and sufficient conditions for consistent global snapshots……Page 130
Difference between a zigzag path and a causal path……Page 132
Consistent global snapshots……Page 133
4.9 Finding consistent global snapshots in a distributed computation……Page 134
First observation……Page 135
Second observation……Page 136
Third observation……Page 137
4.9.2 Manivannan–Netzer–Singhal algorithm for enumerating consistent snapshots……Page 138
Construction of an R-graph……Page 139
4.10 Chapter summary……Page 141
4.12 Notes on references……Page 142
References……Page 143
5.1 Topology abstraction and overlays……Page 146
5.2.1 Application executions and control algorithm executions……Page 148
5.2.3 Symmetric and asymmetric algorithms……Page 149
5.2.6 Adaptive algorithms……Page 150
5.2.8 Execution inhibition……Page 151
5.2.9 Synchronous and asynchronous systems……Page 152
Process failure models [26]……Page 153
5.2.12 Wait-free algorithms……Page 154
5.3 Complexity measures and metrics……Page 155
5.4 Program structure……Page 157
5.5.1 Synchronous single-initiator spanning tree algorithm using flooding……Page 158
5.5.2 Asynchronous single-initiator spanning tree algorithm using flooding……Page 160
Design 1……Page 163
Design 2……Page 165
5.5.4 Asynchronous concurrent-initiator depth first search spanning tree algorithm……Page 166
5.5.5 Broadcast and convergecast on a tree……Page 168
5.5.6 Single source shortest path algorithm: synchronous Bellman–Ford……Page 169
5.5.7 Distance vector routing……Page 170
5.5.9 All sources shortest paths: asynchronous distributed Floyd–Warshall……Page 171
Asynchronous algorithm (Algorithm 5.9)……Page 175
Synchronous algorithm (Algorithm 5.10)……Page 176
5.5.11 Minimum-weight spanning tree (MST) algorithm in a synchronous system……Page 177
5.5.12 Minimum-weight spanning tree (MST) in an asynchronous system……Page 182
General observations on synchronous and asynchronous algorithms……Page 183
A simple synchronizer……Page 184
The synchronizer……Page 185
The synchronizer……Page 186
5.7 Maximal independent set (MIS)……Page 189
5.8 Connected dominating set……Page 191
5.9 Compact routing tables……Page 192
5.10 Leader election……Page 194
5.11 Challenges in designing distributed graph algorithms……Page 195
5.12.1 Problem definition……Page 196
Read……Page 197
5.12.4 Converging to an replication scheme……Page 198
5.13 Chapter summary……Page 202
5.14 Exercises……Page 203
5.15 Notes on references……Page 205
References……Page 206
Notation……Page 209
6.1.1 Asynchronous executions……Page 210
6.1.3 Causally ordered (CO) executions……Page 211
6.1.4 Synchronous execution (SYNC)……Page 214
6.2 Asynchronous execution with synchronous communication……Page 215
6.2.1 Executions realizable with synchronous communication (RSC)……Page 216
Asynchronous programs on synchronous systems……Page 219
6.3 Synchronous program order on an asynchronous system……Page 220
6.3.1 Rendezvous……Page 221
6.3.2 Algorithm for binary rendezvous……Page 222
6.4 Group communication……Page 225
6.5 Causal order (CO)……Page 226
6.5.1 The Raynal–Schiper–Toueg algorithm [22]……Page 227
6.5 Causal order (CO)……Page 228
Multicast M43……Page 233
Processing at P6……Page 234
6.6 Total order……Page 235
Sender……Page 236
Complexity……Page 238
6.7 A nomenclature for multicast……Page 240
6.8 Propagation trees for multicast……Page 241
6.9 Classification of application-level multicast algorithms……Page 245
Privilege-based algorithms……Page 246
Destination agreement algorithms……Page 247
6.10 Semantics of fault-tolerant group communication……Page 248
6.11.1 Reverse path forwarding (RPF) for constrained flooding……Page 250
Steiner tree problem……Page 251
6.11.3 Multicast cost functions……Page 252
Delay-bounded minimal Steiner tree problem……Page 253
6.11.5 Core-based trees……Page 255
6.13 Exercises……Page 256
6.14 Notes on references……Page 258
References……Page 259
7.1 Introduction……Page 261
7.2 System model of a distributed computation……Page 262
7.3.2 Formal description……Page 263
7.3.3 Discussion……Page 264
Basic idea……Page 265
7.4.2 Correctness of the algorithm……Page 266
7.5 A spanning-tree-based termination detection algorithm……Page 267
A problem with the algorithm……Page 268
The basic idea……Page 269
The algorithm description……Page 270
7.5.4 An example……Page 271
7.6 Message-optimal termination detection……Page 273
7.6.1 The main idea……Page 274
7.6.2 Formal description of the algorithm……Page 275
7.7 Termination detection in a very general distributed computing model……Page 277
7.7.2 Notation……Page 278
Informal description……Page 279
Formal description……Page 280
Informal description……Page 281
Formal description……Page 282
Assumptions……Page 283
7.8.2 A naive counting method……Page 284
7.8.3 The four counter method……Page 285
7.8.4 The sceptic algorithm……Page 286
Formal description……Page 287
7.8.6 Vector counters method……Page 288
Formal description……Page 290
7.9 Termination detection in a faulty distributed system……Page 292
The concept of flow invariant……Page 293
7.9.2 Taking snapshots……Page 294
Data structures……Page 295
7.9.2 Taking snapshots……Page 296
7.9.4 Performance analysis……Page 298
7.11 Exercises……Page 299
References……Page 300
8.1 The muddy children puzzle……Page 302
8.2.1 Knowledge operators……Page 303
8.2.2 The muddy children puzzle again……Page 304
8.2.3 Kripke structures……Page 305
Scenario A……Page 307
8.2.5 Properties of knowledge……Page 308
8.3 Knowledge in synchronous systems……Page 309
8.4.1 Logic and definitions……Page 310
8.4.2 Agreement in asynchronous systems……Page 311
Eventual common knowledge……Page 312
8.4.4 Concurrent common knowledge……Page 313
Three-phase send-inhibitory algorithm……Page 315
Complexity……Page 316
8.5 Knowledge transfer……Page 318
8.6 Knowledge and clocks……Page 320
8.7 Chapter summary……Page 321
8.8 Exercises……Page 322
References……Page 323
9.1 Introduction……Page 325
9.2.1 System model……Page 326
9.2.3 Performance metrics……Page 327
Best and worst case performance……Page 328
9.3 Lamport’s algorithm……Page 329
Correctness……Page 330
9.4 Ricart–Agrawala algorithm……Page 332
Correctness……Page 333
9.5 Singhal’s dynamic information-structure algorithm……Page 335
Data structures……Page 336
9.5.1 Description of the algorithm……Page 337
Achieving mutual exclusion……Page 339
Low load condition……Page 340
9.6 Lodha and Kshemkalyani’s fair mutual exclusion algorithm……Page 341
9.6.2 Description of the algorithm……Page 342
9.6.4 Message complexity……Page 345
9.7 Quorum-based mutual exclusion algorithms……Page 347
9.8 Maekawa’s algorithm……Page 348
Correctness……Page 349
Handling deadlocks……Page 350
9.9.1 Constructing a tree-structured quorum……Page 351
9.9.4 Examples of tree-structured quorums……Page 353
9.9.5 The algorithm for distributed mutual exclusion……Page 355
9.11 Suzuki–Kasami’s broadcast algorithm……Page 356
Performance……Page 358
9.12 Raymond’s tree-based algorithm……Page 359
9.12.1 The HOLDER variables……Page 360
Data structures……Page 361
MAKE_REQUEST……Page 362
Message overtaking……Page 363
Deadlock is impossible……Page 364
Starvation is impossible……Page 365
9.12.5 Cost and performance analysis……Page 366
9.12.7 Node failures and recovery……Page 367
9.14 Exercises……Page 368
9.15 Notes on references……Page 369
References……Page 370
10.2 System model……Page 372
10.3.1 Deadlock handling strategies……Page 373
Detection of deadlocks……Page 374
10.4 Models of deadlocks……Page 375
10.4.3 The OR model……Page 376
10.4.5 The p model……Page 377
10.5.1 Path-pushing algorithms……Page 378
10.5.4 Global state detection-based algorithms……Page 379
10.6 Mitchell and Merritt’s algorithm for the single-resource model……Page 380
10.7 Chandy–Misra–Haas algorithm for the AND model……Page 382
The algorithm……Page 383
Basic idea……Page 384
10.9 Kshemkalyani–Singhal algorithm for the P-out-of- model……Page 385
10.9.1 Informal description of the algorithm……Page 387
The problem of termination detection……Page 388
10.9.2 The algorithm……Page 389
10.10 Chapter summary……Page 394
10.12 Notes on references……Page 395
References……Page 396
11.1 Stable and unstable predicates……Page 399
Termination [20]……Page 400
11.2 Modalities on predicates……Page 402
11.3 Centralized algorithm for relational predicates……Page 404
11.4 Conjunctive predicates……Page 408
11.4.1 Interval-based centralized algorithm for conjunctive predicates……Page 409
11.4.2 Global state-based centralized algorithm for , where is conjunctive……Page 412
11.5.1 Distributed state-based token algorithm for, Possibly (Phi) where Phi is conjunctive……Page 415
11.5.2 Distributed interval-based token algorithm for Definitely (Phi), where is conjunctive……Page 417
11.5.3 Distributed interval-based piggybacking algorithm for Possibly (Phi), where Phi is conjuctive……Page 421
11.6 Further classification of predicates……Page 424
11.7 Chapter summary……Page 425
11.8 Exercises……Page 426
11.9 Notes on references……Page 427
References……Page 428
12.1 Abstraction and advantages……Page 430
12.2 Memory consistency models……Page 433
12.2.1 Strict consistency/atomic consistency/linearizability……Page 434
Implementations……Page 435
12.2.2 Sequential consistency……Page 437
Implementations……Page 438
Local-write algorithm……Page 439
12.2.3 Causal consistency……Page 440
12.2.4 PRAM (pipelined RAM) or processor consistency……Page 442
12.2.5 Slow memory……Page 443
12.2.7 Other models based on synchronization instructions……Page 444
Release consistency [12]……Page 445
Entry consistency [9]……Page 446
12.3.1 Lamport’s bakery algorithm……Page 447
12.3.2 Lamport’s WRWR mechanism and fast mutual exclusion……Page 449
12.3.3 Hardware support for mutual exclusion……Page 452
12.5 Register hierarchy and wait-free simulations……Page 454
12.5.1 Construction 1: SRSW safe to MRSW safe……Page 457
12.5.3 Construction 3: boolean MRSW safe to integer-valued MRSW safe……Page 458
12.5.4 Construction 4: boolean MRSW safe to boolean MRSW regular……Page 459
12.5.5 Construction 5: boolean MRSW regular to integer-valued MRSW regular……Page 460
12.5.6 Construction 6: boolean MRSW regular to integer-valued MRSW atomic……Page 462
12.5.7 Construction 7: integer MRSW atomic to integer MRMW atomic……Page 464
12.5.8 Construction 8: integer SRSW atomic to integer MRSW atomic……Page 465
Achieving linearizability……Page 466
12.6 Wait-free atomic snapshots of shared objects……Page 467
12.7 Chapter summary……Page 471
12.8 Exercises……Page 472
12.9 Notes on references……Page 473
References……Page 474
13.1 Introduction……Page 476
13.2.1 System model……Page 477
13.2.3 Consistent system states……Page 478
13.2.4 Interactions with the outside world……Page 479
13.2.5 Different types of messages……Page 480
Duplicate messages……Page 481
13.3 Issues in failure recovery……Page 482
13.4.1 Uncoordinated checkpointing……Page 484
13.4.2 Coordinated checkpointing……Page 485
Non-blocking checkpoint coordination……Page 486
13.4.3 Impossibility of min-process non-blocking checkpointing……Page 487
13.4.4 Communication-induced checkpointing……Page 488
Model-based checkpointing……Page 489
13.5.1 Deterministic and non-deterministic events……Page 490
The no-orphans consistency condition……Page 491
13.5.2 Pessimistic logging……Page 492
13.5.3 Optimistic logging……Page 493
13.5.4 Causal logging……Page 494
Second phase……Page 496
13.6.2 The rollback recovery algorithm……Page 497
13.7 Juang–Venkatesan algorithm for asynchronous checkpointing and recovery……Page 498
13.7.1 System model and assumptions……Page 499
Basic idea……Page 500
Description of the algorithm……Page 501
13.8 Manivannan–Singhal quasi-synchronous checkpointing algorithm……Page 503
Properties……Page 504
An explanation……Page 506
Handling the replay of messages……Page 509
Handling of received messages……Page 510
Features……Page 511
Notation……Page 512
13.9.2 Informal description of the algorithm……Page 513
Handling in-transit orphan messages……Page 514
The rollback protocol……Page 515
13.9.4 Correctness proof……Page 517
13.10 Helary–Mostefaoui–Netzer–Raynal communication-induced protocol……Page 519
To checkpoint or not to checkpoint?……Page 520
Reducing the number of forced checkpoints……Page 521
13.10.2 The checkpointing protocol……Page 523
13.11 Chapter summary……Page 525
13.13 Notes on references……Page 526
References……Page 527
14.1 Problem definition……Page 530
The Byzantine agreement problem……Page 532
The interactive consistency problem……Page 533
14.2 Overview of results……Page 534
14.3 Agreement in a failure-free system (synchronous or asynchronous)……Page 535
14.4.1 Consensus algorithm for crash failures (synchronous system)……Page 536
14.4.3 Upper bound on Byzantine processes……Page 537
Byzantine agreement tree algorithm: exponential (synchronous system)……Page 539
Phase-king algorithm for consensus: polynomial (synchronous system)……Page 546
14.5.1 Impossibility result for the consensus problem……Page 549
14.5.2 Terminating reliable broadcast……Page 551
14.5.4 k-set consensus……Page 552
Algorithm outline……Page 553
Notation……Page 555
Convergence rate of approximation……Page 556
Correctness……Page 557
Problem definition……Page 558
Algorithm……Page 559
Correctness……Page 562
14.6.1 Impossibility result……Page 564
14.6.2 Consensus numbers and consensus hierarchy [14]……Page 567
FIFO queue……Page 569
Compare&Swap……Page 570
Read–modify–write abstraction……Page 571
14.6.3 Universality of consensus objects [14]……Page 572
A non-blocking universal algorithm……Page 573
14.6.4 Shared memory k-set consensus……Page 576
14.6.5 Shared memory renaming……Page 577
14.6.6 Shared memory renaming using splitters……Page 580
14.7 Chapter summary……Page 582
14.8 Exercises……Page 583
14.9 Notes on references……Page 584
References……Page 585
15.1 Introduction……Page 587
15.2.1 The system model……Page 588
15.2.2 Failure detectors……Page 589
Completeness……Page 590
Eventual accuracy……Page 591
15.2.5 Reducibility of failure detectors……Page 592
15.2.6 Reducing weak failure detector W to a strong failure detector S……Page 593
A correctness argument……Page 594
15.2.7 Reducing an eventually weak failure detector . to an eventually strong failure detector………Page 595
An explanation of the algorithm……Page 596
15.3 The consensus problem……Page 597
15.3.2 A solution using strong failure detector S……Page 598
An explanation of the algorithm……Page 599
15.3.3 A solution using eventually strong failure detector………Page 600
An explanation of the algorithm……Page 602
15.4 Atomic broadcast……Page 603
An explanation of the algorithm……Page 604
15.6 The weakest failure detectors to solve fundamental agreement problems……Page 605
15.6.1 Realistic failure detectors……Page 606
15.6.2 The weakest failure detector for consensus……Page 608
15.7 An implementation of a failure detector……Page 609
15.8 An adaptive failure detection protocol……Page 611
Assumptions……Page 612
The protocol FDL……Page 613
Properties of FDL……Page 615
References……Page 616
16.1 Introduction……Page 618
16.2.1 Basis of authentication……Page 619
16.2.4 Notation……Page 620
16.2.5 Design principles for cryptographic protocols……Page 621
16.3 Protocols based on symmetric cryptosystems……Page 622
Weaknesses……Page 623
Weaknesses……Page 624
16.3.4 A protocol based on an authentication server……Page 625
16.3.5 One-time password scheme……Page 626
Protocol description……Page 627
16.3.6 Otway–Rees protocol……Page 629
Weaknesses……Page 630
The authentication protocol……Page 631
Weaknesses……Page 634
16.4.1 The basic protocol……Page 635
16.4.2 A modified protocol with a certification authority……Page 636
16.4.3 Needham and Schroeder protocol……Page 637
An impersonation attack on the protocol……Page 638
16.4.4 SSL protocol……Page 639
SSL handshake protocol……Page 640
How SSL provides authentication……Page 641
16.5 Password-based authentication……Page 642
16.5.1 Encrypted key exchange (EKE) protocol……Page 643
16.5.2 Secure remote password (SRP) protocol……Page 644
16.6 Authentication protocol failures……Page 645
16.7 Chapter summary……Page 646
16.9 Notes on references……Page 647
References……Page 648
17.1 Introduction……Page 651
17.2 System model……Page 652
17.3 Definition of self-stabilization……Page 654
17.3.1 Randomized and probabilistic self-stabilization……Page 655
Dijkstra’s self-stabilizing token ring system……Page 656
First solution……Page 657
Second solution……Page 658
Ghosh’s solution……Page 661
17.4.2 Uniform vs. non-uniform networks……Page 662
17.4.3 Central and distributed demons……Page 663
17.4.4 Reducing the number of states in a token ring……Page 664
17.4.6 Mutual exclusion……Page 665
17.4.7 Costs of self-stabilization……Page 666
17.5.1 Layering and modularization……Page 667
Topology-based primitives……Page 668
17.6 Communication protocols……Page 669
17.7 Self-stabilizing distributed spanning trees……Page 670
17.8.1 Dolev, Israeli, and Moran algorithm……Page 672
17.8.3 Arora and Gouda algorithm for spanning-tree construction……Page 675
17.8.5 Afek and Bremler algorithm for spanning-tree construction……Page 676
17.9 An anonymous self-stabilizing algorithm for 1-maximal independent set in trees……Page 677
Description of algorithm……Page 678
17.10 A probabilistic self-stabilizing leader election algorithm……Page 680
17.11.1 Compilers for sequential programs……Page 682
17.11.2 Compilers for asynchronous message passing systems……Page 683
17.11.3 Compilers for asynchronous shared memory systems……Page 684
Fault tolerance……Page 685
Symmetry……Page 687
17.14 Limitations of self-stabilization……Page 688
Pseudo-stabilization……Page 689
17.16 Exercises……Page 690
References……Page 691
18.1 Introduction……Page 697
18.1.2 Application layer overlays……Page 698
18.2 Data indexing and overlays……Page 699
Structured overlays……Page 700
18.3.1 Unstructured overlays: properties……Page 701
18.3.3 Search in Gnutella and unstructured overlays……Page 702
Search strategies……Page 703
18.3.4 Replication strategies……Page 704
18.3.5 Implementing replication strategies……Page 707
18.4.1 Overview……Page 708
18.4.2 Simple lookup……Page 709
18.4.3 Scalable lookup……Page 710
Node joins……Page 711
Node failures and departures……Page 714
18.5.1 Overview……Page 715
18.5.2 CAN initialization……Page 716
18.5.3 CAN routing……Page 717
18.5.4 CAN maintainence……Page 718
18.5.5 CAN optimizations……Page 720
18.6.1 Overview……Page 721
Prefix routing……Page 722
Router Table……Page 723
18.6.3 Object publication and object search……Page 725
18.6.4 Node insertion……Page 726
18.6.5 Node deletion……Page 727
18.7.1 Fairness: a game theory application……Page 728
18.7.2 Trust or reputation management……Page 729
Routing rule……Page 730
18.8.2 Bounds on DHT storage and routing distance……Page 731
18.9 Graph structures of complex networks……Page 732
18.10.1 Basic laws and their definitions……Page 734
18.10.2 Properties of the Internet……Page 735
Classification of scale-free networks……Page 737
Impact on network diameter……Page 738
Impact on network partitioning……Page 739
18.12 Small-world networks……Page 740
18.13 Scale-free networks……Page 741
18.13.1 Master-equation approach……Page 742
18.14 Evolving networks……Page 743
Continuum theory analysis……Page 745
18.16 Exercises……Page 747
18.17 Notes on references……Page 748
References……Page 749
Index……Page 751
Reviews
There are no reviews yet.