High reliability and availability are requirements for most technical systems. This talk addresses methods for assuring reliability and availability that are based on probabilistic models. Non-state-space solution methods are often used to solve models based on reliability block diagrams, fault trees, and reliability graphs. Relatively efficient algorithms that can handle systems with hundreds of components are known and have been implemented in many software packages. Nevertheless, many practical problems cannot be handled by such algorithms; in those cases bounding algorithms are used instead, as was done for a major subsystem of the Boeing 787. Non-state-space methods derive their efficiency from the assumption of independence among components, an assumption that is often violated in practice. State-space methods based on Markov chains, stochastic Petri nets, and semi-Markov and Markov regenerative processes can model various kinds of dependencies among system components. However, the resulting state-space explosion severely restricts the size of the problems that can be solved. Hierarchical and fixed-point iterative methods provide a scalable alternative that combines the strengths of state-space and non-state-space methods, and they have been used extensively to solve real-life problems. We will take a journey through these model types via interesting real-world examples chosen from IBM, Cisco, Sun Microsystems, and Boeing. These methods and applications are fully described in a recently completed book: Reliability and Availability Engineering: Modeling, Analysis and Applications, Cambridge University Press, 2017.
Kishor Trivedi holds the Fitzgerald Hudson Chair in the Department of Electrical and Computer Engineering at Duke University, Durham, NC. He received his B.Tech. (EE) from IIT Mumbai in 1968 and his MS (1972) and PhD (1974) in CS from the University of Illinois at Urbana-Champaign. He has been on the Duke faculty since 1975. He is the author of a well-known text, Probability and Statistics with Reliability, Queuing and Computer Science Applications, originally published by Prentice-Hall; a thoroughly revised second edition has been published by John Wiley, and the book was recently translated into Chinese. He has also published two other books: Performance and Reliability Analysis of Computer Systems (Kluwer Academic Publishers) and Queueing Networks and Markov Chains (John Wiley). His latest book, Reliability and Availability Engineering, was published by Cambridge University Press in 2017. He is a Life Fellow of the Institute of Electrical and Electronics Engineers and a Golden Core Member of the IEEE Computer Society. He has published over 600 articles and has supervised 46 Ph.D. dissertations. He is the recipient of the IEEE Computer Society's Technical Achievement Award for his research on software aging and rejuvenation. His research interests are in the reliability, availability, performance, and survivability of computer and communication systems, and in software dependability. His h-index is 98. He works closely with industry in carrying out reliability and availability analyses, in providing short courses on reliability and availability, and in the development and dissemination of software packages such as HARP, SHARPE, SREPT, and SPNP.