Detailed information

Fault tolerance in distributed systems (borrowed 11 times)

Material type
Monograph
Personal author
Jalote, P.
Title / Statement of responsibility
Fault tolerance in distributed systems / Pankaj Jalote.
Publication
Englewood Cliffs, N.J.; Upper Saddle River, N.J. : PTR Prentice Hall, c1994 (1998 printing).
Physical description
xvi, 432 p. : ill. ; 24 cm.
ISBN
0133013677
Bibliography note
Includes bibliographical references (p. 401-420) and index.
Subject headings
Fault-tolerant computing. Electronic data processing -- Distributed processing.
Uncontrolled keywords
Computers, Networks
000 00978camuu2200277 a 4500
001 000000086715
005 20130307135026
008 931103s1994 njua b 001 0 eng
010 ▼a 93042024
015 ▼a GB94-50189
020 ▼a 0133013677
040 ▼a DLC ▼c DLC ▼d CDS ▼d PIT ▼d UKM ▼d 211009
049 1 ▼l 121002668 ▼f Science
050 0 0 ▼a QA76.9.F38 ▼b J35 1994
082 0 0 ▼a 004/.36 ▼2 23
084 ▼a 004.36 ▼2 DDCK
090 ▼a 004.36 ▼b J26f
100 1 ▼a Jalote, P.
245 1 0 ▼a Fault tolerance in distributed systems / ▼c Pankaj Jalote.
260 ▼a Englewood Cliffs, N.J.; ▼a Upper Saddle River, N.J. : ▼b PTR Prentice Hall, ▼c c1994 ▼g (1998 printing).
300 ▼a xvi, 432 p. : ▼b ill. ; ▼c 24 cm.
504 ▼a Includes bibliographical references (p. 401-420) and index.
650 0 ▼a Fault-tolerant computing.
650 0 ▼a Electronic data processing ▼x Distributed processing.
653 0 ▼a Computers ▼a Networks

No.  Location                                      Call number  Accession no.  Status
1    Central Library / Stacks, 6th floor/          004.36 J26f  111689848      Available for loan
2    Science Library / Sci-Info (2nd-floor stacks)/ 004.36 J26f  121002668      Available for loan

Content information

About the author

Pankaj Jalote (author)

Author of <구현사례를 통한 CMM 이해> (Korean-language title; literally "Understanding CMM Through Implementation Cases")

Information provided by: Aladin

Table of contents


CONTENTS
Preface = xiii
1 Introduction = 1
 1.1 Basic Concepts and Definitions = 3
  1.1.1 System Model = 4
  1.1.2 Failure, Error and Fault = 6
  1.1.3 Fault Tolerance = 7
 1.2 Phases in Fault Tolerance = 8
  1.2.1 Error Detection = 9
  1.2.2 Damage Confinement and Assessment = 14
  1.2.3 Error Recovery = 15
  1.2.4 Fault Treatment and Continued Service = 16
 1.3 Overview of Hardware Fault Tolerance = 17
  1.3.1 Process of Hardware Development = 17
  1.3.2 Fault and Error Models = 18
  1.3.3 Triple Modular Redundancy (TMR) = 20
  1.3.4 Dynamic Redundancy = 22
  1.3.5 Coding = 22
  1.3.6 Self-Checking Circuits = 26
  1.3.7 Fault Tolerance in Multiprocessors = 27
 1.4 Reliability and Availability = 30
  1.4.1 Preliminaries = 30
  1.4.2 The Exponential Distribution = 32
  1.4.3 Reliability = 34
  1.4.4 Availability = 37
 1.5 Summary = 38
 Problems = 41
 References = 42
2 Distributed Systems = 45
 2.1 System Model = 45
  2.1.1 Physical Network = 46
  2.1.2 Logical Model = 48
  2.1.3 Failures and Fault Classification = 51
 2.2 Interprocess Communication = 53
  2.2.1 Asynchronous Message Passing = 54
  2.2.2 Synchronous Message Passing and CSP = 57
  2.2.3 Remote Procedure Call = 59
  2.2.4 Object-Action Model = 62
 2.3 Ordering of Events and Logical Clocks = 63
  2.3.1 Partial Ordering of Events = 64
  2.3.2 Logical Clocks = 65
  2.3.3 Total Ordering of Events = 66
 2.4 Execution Model and System State = 67
 2.5 Summary = 70
 Problems = 73
 References = 75
3 Basic Building Blocks = 77
 3.1 Byzantine Agreement = 78
  3.1.1 Problem Definition and Impossibility Results = 79
  3.1.2 Protocol with Ordinary Messages = 81
  3.1.3 Protocol with Signed Message = 85
  3.1.4 Discussion = 87
 3.2 Synchronized Clocks = 89
  3.2.1 Problem Definition and Background = 90
  3.2.2 Deterministic Clock Synchronization = 91
  3.2.3 Probabilistic Clock Synchronization = 97
 3.3 Stable Storage = 99
  3.3.1 Problem Definition = 100
  3.3.2 Implementation = 102
 3.4 Fail Stop Processors = 107
  3.4.1 Problem Definition = 107
  3.4.2 Implementation = 109
 3.5 Failure Detection and Fault Diagnosis = 115
  3.5.1 System-Level Fault Diagnosis = 116
  3.5.2 Fault Diagnosis in Distributed Systems = 120
 3.6 Reliable Message Delivery = 125
  3.6.1 Problem Definition = 126
  3.6.2 Implementation = 127
 3.7 Summary = 131
 Problems = 134
 References = 136
4 Reliable, Atomic, and Causal Broadcast = 141
 4.1 Reliable Broadcast = 142
  4.1.1 Using Message Forwarding = 142
  4.1.2 An Approach by Piggybacking Acknowledgments = 146
 4.2 Atomic Broadcast = 150
  4.2.1 Using Piggybacked Acknowledgments = 151
  4.2.2 A Centralized Method = 157
  4.2.3 The Three-Phase Protocol = 161
  4.2.4 Using Synchronized Clocks = 163
  4.2.5 A Protocol for CSMA/CD Networks = 165
 4.3 Causal Broadcast = 170
  4.3.1 Causal Broadcast Without Total Ordering = 171
  4.3.2 Causal Broadcast with Total Ordering = 172
 4.4 Summary = 177
 Problems = 180
 References = 182
5 Recovering a Consistent State = 185
 5.1 Asynchronous Checkpointing and Rollback = 186
  5.1.1 Rollback and Domino Effect = 187
  5.1.2 Occurrence Graph Modeling = 190
  5.1.3 Protocols for State Restoration = 194
 5.2 Distributed Checkpointing = 199
  5.2.1 Distributed Snapshots = 199
  5.2.2 A Distributed Checkpointing and Rollback Method = 203
  5.2.3 Checkpointing using Synchronized Clocks = 208
 5.3 Summary = 211
 Problems = 212
 References = 213
6 Atomic Actions = 217
 6.1 Atomic Actions and Serializability = 218
  6.1.1 Transactions = 218
  6.1.2 Atomicity and Serializability = 220
 6.2 Atomic Actions in a Centralized System = 222
  6.2.1 Concurrency Control = 223
  6.2.2 Failure Recovery = 225
  6.2.3 Optimum Checkpoint Interval = 228
 6.3 Commit Protocols = 232
  6.3.1 Two-Phase Commit Protocol = 233
  6.3.2 Nonblocking Protocols and the Three-Phase Protocol = 236
 6.4 Atomic Actions on Decentralized Data = 240
  6.4.1 Concurrency Control = 241
  6.4.2 Failure Recovery Using Logs = 243
  6.4.3 Implementing Atomic Actions Using Object Histories = 244
  6.4.4 Nested Atomic Actions = 247
 6.5 Summary = 249
 Problems = 251
 References = 253
7 Data Replication and Resiliency = 257
 7.1 Optimistic Approaches = 259
 7.2 Primary Site Approach = 262
  7.2.1 Basic Approach = 262
  7.2.2 Resilient Objects Using the Primary Site Approach = 264
 7.3 Resiliency with Active Replicas = 266
  7.3.1 State Machine Approach = 267
  7.3.2 Resilient Objects using Atomic Broadcasts = 268
 7.4 Voting = 272
  7.4.1 Static Voting Methods = 272
  7.4.2 Dynamically Adaptive Methods = 278
  7.4.3 Vote Assignment = 284
 7.5 Degree of Replication = 291
  7.5.1 Primary Site Approach = 292
  7.5.2 Majority Voting = 295
 7.6 Summary = 298
 Problems = 301
 References = 303
8 Process Resiliency = 307
 8.1 Resilient Remote Procedure Call = 308
  8.1.1 Using the Primary Site Approach = 309
  8.1.2 Replicated Call = 310
  8.1.3 A Combined Approach = 313
 8.2 Resiliency with Asynchronous Message Passing = 315
  8.2.1 Conditions for Message Recovery = 316
  8.2.2 An Approach Based on Atomic Broadcast = 319
  8.2.3 A Centralized Approach = 321
  8.2.4 Sender-Based Message Logging = 323
  8.2.5 Optimistic Recovery = 327
  8.2.6 A Distributed Scheme = 332
 8.3 Resiliency with Synchronous Message Passing = 336
  8.3.1 Recovery of Failed Process = 337
  8.3.2 Reconfiguring the Processes = 342
 8.4 Total Failure and Last Process to Fail = 343
  8.4.1 Preliminaries = 344
  8.4.2 Determining LAST Using Complete Information = 345
  8.4.3 Determining LAST Using Incomplete Information = 347
 8.5 Summary = 348
 Problems = 351
 References = 352
9 Software Design Faults = 355
 9.1 Approaches for Uniprocess Software = 356
  9.1.1 Exception Handling Framework = 356
  9.1.2 Recovery Block Approach = 359
  9.1.3 N-Version Programming = 364
  9.1.4 Other Approaches = 368
 9.2 Backward Recovery in Concurrent Systems = 374
  9.2.1 Domino Effect, Conversations, and FT-Actions = 375
  9.2.2 Conversations Using Monitors = 379
  9.2.3 Using Distributed FT-Action = 383
 9.3 Forward Recovery in Concurrent Systems = 387
  9.3.1 Exception Resolution = 389
  9.3.2 Exception Handling with FT-Action = 390
 9.4 Summary = 393
 Problems = 395
 References = 396
Bibliography = 401
Index = 421

