TM8101       PÅLITELIG ANALYS IKT
Pålitelighetsanalyse av informasjons- og kommunikasjonssystem

Dependability Analysis of Information and Communication Systems

 

Professor Bjarne E. Helvik

Phone: 92667

E-mail: bjarne.e.helvik@item.ntnu.no

 

Tentative syllabus and schedule autumn 2008

Last revision: 28. January 2009

 

The papers, book chapters, etc. included in the curriculum are divided into ordinary reading [regular font in the list], where in-depth knowledge of the presented material is required, and cursory reading [italic font in the list]. No in-depth knowledge of the cursory reading is required, although students are required to be aware of its content.

 

Schedule:

 

Time        Date            Topic(s)                                     Literature
1015-1200   23. Sept        Kick-off
0915-1200   9. Oct          Preliminaries, System times                  (1, 2 | x4), 8, x5
0915-1200   23. Oct         Modelling; concepts                          3 - 7a, x3, x6
0915-1200   6. Nov          State space truncation                       7b,c, 10
0915-1200   20. Nov         Uniformization, Interval availability        7d, 11 - 13
0915-1200   4. Dec          Rare Events techniques                       14 - 16, x7, x8
0915-1200   12. Jan 2009    Stochastic Petri nets; principles & tools    17 - 19 (nb! new), revisit [x3]
0915-1200   29. Jan 2009    Stochastic Petri net modelling cases         20 - 24
0915-1200   12. Feb 2009    Summing up


Notation used in the Literature column:

n - m : all in the range n to m.

n, m : n and m.

n | m : n or m.

 

Place: NTNU Electro building, Room E-262

Exam:

Form:   Oral or written, depending on the number of students

Time:   TBD

Place:  TBD

 

Preliminaries

 

Basic dependability modelling and analysis with continuous-time discrete-state Markov models are covered in:

 

[1]     Peder Emstad, Poul E. Heegaard, Bjarne E. Helvik and Laurent Paquereau: Dependability and Performance in Information and Communication Systems; Fundamentals, (278 p.), Tapir academic publisher, Aug. 2008. <<Chapters 1, 2, 3, 5 and 7>>

 

[2]    Bjarne E. Helvik: Dependable Computing Systems and Communication Networks; Design and Evaluation, Draft Lecture Notes (246 p.), Department of Telematics, NTNU, Jan. 200y

 

Alternatively you may consult the more extensive textbook below.

 

[x4]  K. S. Trivedi: Probability and Statistics with Reliability, Queuing, and Computer Science Applications, John Wiley and Sons, New York, 2001. ISBN 0-471-33341-7

 

[8]     John A. Buzacott, “Markov Approach to Finding Failure Times of Repairable Systems”, IEEE Transactions on Reliability, Vol. R-19, No. 4, pp.128 - 133, November 1970.

 

[x5]   S. Welke, B. Johnson, and J. Aylor. Reliability modeling of hardware/software systems. IEEE Transactions on Reliability, 44(3):413–418, Sep 1995. <<Section 3.3 is for cursory reading>>

 

Exercise_0:  Use the Mathematica package StateDiagrams.m to analyse one or more of the models in [x5]. a) Assume a constant software failure intensity and find R(t) and MTFF. b) Introduce repair and restoration and find A, MTBF, MUT, MTFF (compare these) and MDT.
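
As a possible starting point for part b), the measures can also be sketched outside Mathematica. The Python sketch below does not use StateDiagrams.m and is not taken from [x5]: the three-state duplex model, the function names and the numerical rates are assumptions chosen for illustration only. It computes A, MUT, MDT, MTBF (taken here as MUT + MDT) and MTFF from the transition rate matrix of a small continuous-time Markov model.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dependability_measures(q, up):
    """Return A, MUT, MDT, MTBF and MTFF for the rate matrix q.

    q[i][j] is the transition rate from state i to state j (rows sum to zero).
    up lists the working states; up[0] is assumed to be the initial state.
    """
    n = len(q)
    down = [i for i in range(n) if i not in up]
    # Steady state: pi Q = 0, with one balance equation replaced by sum(pi) = 1.
    a = [[q[j][i] for j in range(n)] for i in range(n)]
    a[-1] = [1.0] * n
    pi = solve(a, [0.0] * (n - 1) + [1.0])
    availability = sum(pi[i] for i in up)
    # Frequency of system failures: probability flow from up states to down states.
    failure_freq = sum(pi[i] * q[i][j] for i in up for j in down)
    mut = availability / failure_freq
    mdt = (1.0 - availability) / failure_freq
    # MTFF: treat the down states as absorbing and solve (-Q_UU) tau = 1.
    q_uu = [[-q[i][j] for j in up] for i in up]
    tau = solve(q_uu, [1.0] * len(up))
    return {"A": availability, "MUT": mut, "MDT": mdt,
            "MTBF": mut + mdt, "MTFF": tau[0]}


# Hypothetical duplex system: failure rate lam per unit, one repairman with rate mu.
# State 0: both units up, state 1: one unit up, state 2: system down.
lam, mu = 1.0, 10.0
q = [[-2 * lam, 2 * lam, 0.0],
     [mu, -(lam + mu), lam],
     [0.0, mu, -mu]]
measures = dependability_measures(q, up=[0, 1])
print(measures)
```

In this assumed model MTFF exceeds MUT, because the first failure-free period starts with both units working, whereas later up periods start with only one unit working.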

 

Modelling of systems with distribution, HW and SW components

 

The objective of this part is to introduce dependability concepts and terminology and to understand how dependability models based on discrete-space continuous-time Markov chains are established. These issues are introduced in:

 

[3]     A. Avizienis, J. C. Laprie, B. Randell, and C. Landwehr. Basic concepts and taxonomy of dependable and secure computing. Dependable and Secure Computing, IEEE Transactions on, 1:11–33, 2004.

 

Exercise_1:  Compare this framework with the ITU-T standard E.800 and identify the major differences.

 

[5]     Yinong Chen, Zhongshi He, Dependability modelling of homogeneous and heterogeneous distributed systems. In Proceedings of the 5th International Symposium on Autonomous Decentralized Systems, pages 176–183, 2001.

 

[x3]   K. S. Trivedi, S. Hunter, S. Garg, and R. Fricks. Reliability analysis techniques explored through a communication network example. Technical Report TR-96/32, Duke University, Dep. of Electrical and Computer Eng., USA, 1996.

 

[x6]  B. E. Helvik, K. Sallhammar, and S. J. Knapskog. Integrated Dependability and Security Evaluation Using Game Theory and Markov Models. Chapter 8 in Information Assurance; Dependability and Security in Networked Systems, pages 209 – 245. ISBN: 978-0-12-373566-9; ISBN10: 0-12-373566-1. Elsevier, Morgan Kaufmann, December 28 2007.

 

[6]    Kaaniche, M.; Kanoun, K.; Martinello, M., A user-perceived availability evaluation of a web based travel agency, Proceedings of the International Conference on Dependable Systems and Networks, 22-25 June 2003, pp. 709-718

 

[7a]   Edmundo de Souza e Silva, H. Richard Gail, Performability analysis of computer systems: from model specification to solution, Performance Evaluation, Volume 14, Issues 3-4, Pages 135-275 (February 1992)  http://www.sciencedirect.com/ <<Sections 1, 2 and 3>>

 

         See also http://www.doc.ic.ac.uk/~nd/surprise_95/journal/vol4/eaj2/report.html

 

Modelling will be revisited in the Petri net part of the course.

 

Methods for analysing dependability models based on discrete state continuous time Markov chains.

Part 1, Truncation of the state space

 

[7b]  <<Sections 4.1 and 4.2>>

 

Exercise_2:  Use the Mathematica package StateDiagrams.m to analyse, for instance, the Yinong Chen and Zhongshi He model [5], or another model of your choice.

 

[10]   Muntz, R.R.; De Souza e Silva, E.; Goyal, A., “Bounding availability of repairable systems”, IEEE Transactions on Computers, Vol: 38  No: 12 pp. 1714-23, Dec. 1989

 

[7c]  <<Section 4.4>>

 

Exercise_3:  Realise the functionality described in Muntz, De Souza e Silva and Goyal [10] in Mathematica as an extension to StateDiagrams.m

 

Part 2, Uniformization and interval availability

 

[11] Sheldon M. Ross, "Introduction to Probability Models; Section 6.7 Uniformization" (7th ed.), Academic Press, 2000.

 

[7d]   <<Section 4.3, up to but not including 4.3.1>>

 

The interested student may also look up the computationally improved methods for randomization, for instance by Carrasco (... Regenerative Randomization, IEEE Trans on Rel. Vol. 52, No. 3, Sept 2003, pp. 319-329) and/or Moorsel and Sanders (... Combining Adaptive & Standard Uniformization, IEEE Trans on Rel. Vol. 46, No. 3, Sept 1997, pp. 430-440).
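
As a reminder of the basic technique these readings build on, standard uniformization writes the transient state probabilities as p(t) = sum over k of Poisson(k; Lt) p(0) P^k, with P = I + Q/L and L at least as large as the largest exit rate. The Python sketch below is a minimal illustration under that formula only; the function name, the truncation rule and the two-state example are assumptions made here, and none of the improved methods mentioned above is implemented.

```python
import math


def transient_probabilities(q, p0, t, eps=1e-12, max_terms=100000):
    """Transient state probabilities p(t) of a CTMC by standard uniformization.

    q is the transition rate matrix (rows sum to zero, at least one non-zero
    rate), p0 the initial distribution. The Poisson sum is truncated when the
    neglected probability mass is below eps.
    """
    n = len(q)
    rate = max(-q[i][i] for i in range(n))  # uniformization rate L
    # Transition matrix of the uniformized discrete-time chain: P = I + Q / L.
    p = [[(1.0 if i == j else 0.0) + q[i][j] / rate for j in range(n)]
         for i in range(n)]
    # Note: exp(-L t) underflows for very large L t; this sketch ignores that case.
    weight = math.exp(-rate * t)  # Poisson probability of k = 0 jumps
    v = list(p0)                  # p(0) P^k, starting with k = 0
    result = [weight * x for x in v]
    accumulated = weight
    k = 0
    while 1.0 - accumulated > eps and k < max_terms:
        k += 1
        v = [sum(v[i] * p[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k    # Poisson probability of k jumps
        accumulated += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result


# Hypothetical two-state example: state 0 up, state 1 down.
lam, mu = 1.0, 10.0
q = [[-lam, lam],
     [mu, -mu]]
p_t = transient_probabilities(q, [1.0, 0.0], 0.5)
print(p_t)
```

For the two-state example the result can be checked against the closed form p_up(t) = mu/(lam+mu) + lam/(lam+mu) exp(-(lam+mu)t).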

 

[12]   Goyal A. and Tantawi A.N. "A measure of Guaranteed availability and its Numerical Evaluation", IEEE Trans. on Comp., Vol 37, No. 1, pp. 25 - 32, Jan 1988.

 

The papers in [13x] address the same problem as [12], but have a mathematically much better founded solution technique based on uniformization.

 

[13a]   Gerard Rubino and Bruno Sericola, "Interval Availability Distribution Computation", Digest of Papers, FTCS-23, The Twenty-Third International Symposium on Fault-Tolerant Computing, 22-24 June 1993, pp. 48 - 55; <<Sections 1 & 2.>>

 

[13b]   <<Sections 3-5>>

         <OR>

[13c]   Rubino, G.; Sericola, B., Interval availability analysis using denumerable Markov processes: application to multiprocessor subject to breakdowns and repair, IEEE Transactions on Computers, Vol. 44, No. 2, Feb. 1995, pp. 286-291

 

Exercise_4:  Realise the functionality described in Goyal and Tantawi [12], or alternatively one of the algorithms in one of the Rubino & Sericola papers, in Mathematica as an extension to StateDiagrams.m

 

Methods for dealing with failure as a rare event

 

[14]   A. E. Conway and A. Goyal. "Monte Carlo Simulation of Computer System Availability/Reliability Models." In Digest of Papers, FTCS-17 - The seventeenth international symposium on fault-tolerant computing, pages 230–235, July 6 - 8 1987

 

[15]   José Villén-Altamirano: RESTART Method for the Case Where Rare Events Can Occur in Retrials from any Threshold. In International Journal of Electronics and Communications (AEÜ), Vol. 52, No. 3, pp. 183-189, 1998. 

 

[x7]   A. Mykkeltveit and B. E. Helvik. Application of the restart/splitting technique to network resilience studies in ns2. In Proceedings of The 19th IASTED International Conference on Modelling and Simulation [MS 2008], Quebec City, Quebec, Canada, May 26 – 28 2008.

 

[x8]    J. Villén-Altamirano. Importance functions for RESTART simulation of highly-dependable systems. SIMULATION, 83(12):821–828, 2007.

 

[16]  Nicola, V.F.; Shahabuddin, P.; Nakayama, M.K.; Techniques for fast simulation of models of highly dependable systems, IEEE Transactions on Reliability, Vol. 50, No. 3, Sept. 2001, pp. 246 - 264

         <<Nb! This paper covers a huge body of research in a very condensed and formal manner. The student should focus on which problems are addressed and which methods and solutions exist, and not on the methods and solutions themselves. Do, however, obtain a clear understanding of the biasing schemes outlined, i.e. Section III.A>>

 

Exercise_5:  Reproduce the example in the above papers by Conway & al. and/or Villén-Altamirano, apply the technique to a (simple) model of your choice, or compare the Conway & al. approach with the CE-obtained rates by Ridder.
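
To make the idea of a biasing scheme concrete, the Python sketch below applies a simple form of importance sampling with failure biasing to a small model. It is an illustration under stated assumptions, not the exact scheme of [14] or [16]: the model (n identical units with failure rate lam each and a single repairman with rate mu), the fixed bias probability and all function names are chosen here for the example. It estimates the probability that all units fail before the system is fully repaired again, starting with one failed unit, and compares the estimate with the exact value for this birth-death chain.

```python
import random


def estimate_failure_probability(n_units, lam, mu, bias, samples, rng):
    """Importance sampling estimate of P(all units fail before full repair).

    The embedded Markov chain is simulated with the failure probability in
    every state replaced by `bias`; each path is weighted by its likelihood
    ratio so that the estimator remains unbiased.
    """
    total = 0.0
    for _ in range(samples):
        failed = 1
        likelihood = 1.0
        while 0 < failed < n_units:
            fail_rate = (n_units - failed) * lam
            p_fail = fail_rate / (fail_rate + mu)  # original failure probability
            if rng.random() < bias:                # biased choice: failure
                likelihood *= p_fail / bias
                failed += 1
            else:                                  # biased choice: repair
                likelihood *= (1.0 - p_fail) / (1.0 - bias)
                failed -= 1
        if failed == n_units:
            total += likelihood
    return total / samples


def exact_failure_probability(n_units, lam, mu):
    """Exact value for the same birth-death chain (gambler's ruin formula)."""
    terms, product = 1.0, 1.0
    for failed in range(1, n_units):
        product *= mu / ((n_units - failed) * lam)
        terms += product
    return 1.0 / terms


# Hypothetical parameters: 3 units, failures 100 times slower than repairs.
rng = random.Random(2008)
estimate = estimate_failure_probability(3, 0.01, 1.0, 0.5, 20000, rng)
exact = exact_failure_probability(3, 0.01, 1.0)
print(estimate, exact)
```

Without biasing, only about one path in five thousand would reach the failure state for these parameters, so a direct simulation of the same length would give a much less accurate estimate.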

Stochastic Petri net modelling

[17]   T. Murata, Petri Nets: Properties, Analysis and Applications, Proceedings of the IEEE, Vol. 77, No 4, April, 1989, pp. 541-580.  Sections: I - IV, IX.B,

         <The paper: murata-petrinet.pdf>

 

[18]   UltraSAN User manual - Chapter 1: "Modeling with Stochastic Activity Networks", Center for Reliable and High-Performance Computing, Univ. of Illinois at Urbana-Champaign.

 

[19]  W. H. Sanders and J. F. Meyer. Stochastic activity networks: Formal definitions and concepts. In Lectures on Formal Methods and Performance Analysis,  volume 2090, 2001

 

Exercise_6:  Use UltraSAN to establish and evaluate a simple dependability model of your choice. [see http://www.item.ntnu.no/fag/tm8101/ for info about the tool]

[20]   Porcarelli, S.; Di Giandomenico, F.; Bondavalli, A.; Barbera, M.; Mura, I., Service-level availability estimation of GPRS, IEEE Transactions on Mobile Computing, Vol. 2, No. 3, July-Sept. 2003, pp. 233-247

 

[21]   Noé Lopez-Benitez, Dependability Modelling and Analysis of Distributed Programs, IEEE Trans. on Software Eng., Vol. 20, No. 5, pp. 345-352, May 1994

 

[22]  K. Kanoun & al. Availability of CAUTRA, a subset of the French air traffic control system, IEEE Trans on Comp. vol 48, no. 5, pp. 528-535, May 1999

         <see also [z1]>

 

[23]  Manish Malhotra, Kishor S. Trivedi, "Dependability Modelling using Petri-Nets", IEEE Transactions on Reliability, Vol. 44, No. 3, Sept. 1995, pp. 248 - 273

[24]    Mourad Rabah and Karama Kanoun, "Performability Evaluation of  Multipurpose Multiprocessor Systems: The “Separation of Concerns” Approach", IEEE Trans on Comp. vol. 52, No. 2, February 2003

 

==============

A book which may be useful:

Robin A. Sahner, Kishor S. Trivedi, Antonio Puliafito, "Performance and Reliability analysis of Computer systems; An example based ...  SHARPE..," Kluwer, 1996.

 

 

================================ END  ==============================