SYSTEMS AND SOFTWARE DEPENDABILITY

Contents

Overview of the seminar
Dates and topics
Teaching Staff
On your presentations
Literature proposals

Overview of the seminar

This seminar discusses dependable systems. As an introduction, we quote a passage from the book:

    Dependability: basic concepts and terminology : in English, French, German, Italian and Japanese / J. C. Laprie (ed.). – Wien : Springer, 1992. – (Dependable computing and fault-tolerant systems ; 5)

Dependability is defined as the trustworthiness of a computer system such that reliance can justifiably be placed on the service it delivers.

“Dependable” has several aspects:


with respect to the readiness for usage, dependable means available;
with respect to the continuity of service, dependable means reliable;
with respect to the avoidance of catastrophic consequences on the environment, dependable means safe;
with respect to the prevention of unauthorized access and/or handling of information, dependable means secure.

The development of a dependable computing system calls for the combined utilization of a set of methods which can be classed into:


fault prevention: how to prevent fault occurrence or introduction;
fault tolerance: how to provide a service complying with the specification in spite of faults;
fault removal: how to reduce the presence (number, seriousness) of faults;
fault forecasting: how to estimate the present number, the future incidence, and the consequences of faults.

Dates and topics

May 5, 3.15 p.m.: Preparatory meeting and assignment of subjects (building 45, room 528)
July 16, 12.00 noon: Hand in your written exposition

Date | Time | Main topic | Warm-up | Speaker | Teacher
May 25 | 16.15 | Case study: Railway | Patriot missed Iraqi Scuds | Michael Feld | David N. Jansen
May 25 | 17.45 | Case study: Nuclear power plant | | Martin Thielen |
June 1 | 16.30 | Case study: Avionics | Warsaw airport plane crash | Taoufik Romdhane |
June 1 | 18.00 | Case study: Space | Ariane 5 maiden flight | Frank Werner |
June 8 | 16.30 | Fault trees | USS Vincennes shoots down an Iranian Airbus in 1988 | Osama Khan | Holger Hermanns
June 8 | 18.00 | Dynamic fault trees | Mars Climate Orbiter crash 1999 | Stephan Schlicker |
June 15 | 16.30 | Failure modelling | | Andreas Wagner |
June 15 | 18.00 | Architectures | Therac-25 | Verena Schuler |
June 29 | 16.30 | Distributed systems: Byzantine agreement, reliable broadcast | US/Canada Power Blackout 2003 | Mohammad Al-Rifai | David N. Jansen
June 29 | 18.00 | Distributed systems: Clock synchronisation | | Mansoor Jafry |
July 13 | 16.30 | Formal models and verification 1 | | Rotislav Rusev | Holger Hermanns
July 20 | 16.30 | Security 1: Trustworthiness of the internet | | Stilian Stanev | David N. Jansen
July 20 | 18.00 | Security 2: Needham–Schroeder protocol | AT&T long distance service fails, 1990 | Sven Bünte | Holger Hermanns

All sessions take place in building 27.2, room H05 („Seminarraum 1“).


Teaching Staff

Holger Hermanns
David N. Jansen, room 534

On your presentations

Before you present your actual topic, please give a warm-up of about 10 minutes: a “horror story” of a real failure in a software system that should have been dependable. You may take inspiration from the lists of such failures available on the Internet, but of course your own ideas are very welcome!

Please come and see us at least one week before your presentation to discuss the outline of your slides or your written exposition. By then, you should have the structure and a first idea of the contents of the presentation. (As a rule of thumb, a one-hour presentation needs about 30 slides.)

Please hand in the final version of your written exposition by July 16, 12.00 noon at the latest.

Conditions for grading: you must give a presentation, hand in a written exposition, and meet the deadlines stated above. The grade will be based on the quality of your presentation and exposition.


Literature proposals

Here, you will find some proposed literature for the seminar.

You must use at least one source in addition to those we propose. If you think a proposal is unsuitable, please consult your teacher.


Case study: Railway
Case study: Nuclear power plant
Case study: Avionics
Case study: Space
Fault trees
Dynamic fault trees
Failure modelling
Architectures
Distributed systems: clock synchronisation
Distributed systems: Byzantine agreement, reliable broadcast
Formal models and verification 1
Security 1: Trustworthiness of the internet
Security 2: Needham-Schroeder protocol


Last change on June 22, 2004, by David N. Jansen.