Course details
Fault-Tolerant Systems
SOP Acad. year 2005/2006 Summer semester 6 credits
Principles of fault tolerance, structures and techniques. Codes for error detection and correction. Cyclic codes, Fire codes, BCH and Reed-Solomon (RS) codes. Convolutional codes. Modelling, estimation and control of reliability. Fail-safe systems. Architecture of FT systems. Fault tolerance at VLSI level. Fault tolerance in computer units, computer systems and communication networks. Distributed fault-tolerant systems, fault-tolerant software.
Guarantor
Language of instruction
Completion
Time span
- 39 hrs lectures
- 12 hrs PC labs
- 14 hrs projects
Department
Subject specific learning outcomes and competences
Skills and approaches to building fault tolerance using hardware and codes.
Understanding a new trend in computer design.
Learning objectives
To introduce students to the different types of redundancy and their application in designing computer systems that function correctly even in the presence of faults and data errors.
Prerequisite knowledge and skills
Principles of computer organization.
Syllabus of lectures
- Introduction, FT design methodology. Hardware redundancy, TMR.
- Information redundancy, parity codes, arithmetic codes, residue codes, Hamming codes.
- Cyclic codes, Fire codes.
- Galois fields, BCH and Reed-Solomon codes, byte error detection.
- Convolutional codes.
- Reliability modeling, combinatorial models, MIL-HDBK-217. Markov reliability models.
- Fail-safe systems.
- Time redundancy, alternating logic, RESO, RESWO, REDWC.
- Architecture of fault tolerant systems.
- VLSI reconfiguration techniques.
- Fault tolerant units, systems and networks.
- Distributed FT systems.
- Software for FT systems.
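As an illustration of the hardware redundancy topic above, triple modular redundancy (TMR) can be sketched as a bitwise majority vote over three module outputs. This is a minimal sketch, not course material: the function name `majority` and the sample values are assumptions chosen for the example.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote over three module outputs."""
    return (a & b) | (a & c) | (b & c)


# Hypothetical example: three replicas compute the same 8-bit result,
# and one replica suffers a single bit flip.
correct = 0b10110010
faulty = correct ^ 0b00001000

# The voter masks the fault as long as at most one module is wrong per bit.
assert majority(correct, faulty, correct) == correct
```

The voter itself remains a single point of failure in this sketch; replicating the voter is one common way to address that.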
Syllabus of computer exercises
- Model of error correction using Fire code.
- Model of linear-feedback shift register.
- Parallel generation of CRC.
- Model of Galois field GF(2^n).
- Model of reconfiguration in 2D processor array.
- Model of error correction using convolutional code.
- Model of error correction using BCH code.
- Reliability models.
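The linear-feedback shift register and CRC exercises listed above can be illustrated with a bit-serial model. The following is a minimal sketch, assuming an 8-bit register and the generator polynomial x^8 + x^2 + x + 1 (0x07); the function name `crc8_lfsr` and the choice of polynomial are assumptions for the example and are not taken from the course.

```python
def crc8_lfsr(data: bytes, poly: int = 0x07) -> int:
    """Bit-serial CRC-8 modelled as an LFSR (MSB first, initial state 0)."""
    reg = 0
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1        # next message bit, MSB first
            msb = (reg >> 7) & 1         # bit about to be shifted out
            reg = (reg << 1) & 0xFF      # shift the register by one place
            if msb ^ bit:                # feedback is active
                reg ^= poly              # apply the feedback taps
    return reg


message = b"fault tolerance"
check = crc8_lfsr(message)

# Appending the check byte yields a codeword with zero remainder.
assert crc8_lfsr(message + bytes([check])) == 0

# A single flipped bit leaves a non-zero remainder, so it is detected.
corrupted = bytearray(message + bytes([check]))
corrupted[3] ^= 0x10
assert crc8_lfsr(bytes(corrupted)) != 0
```

A parallel CRC generator computes the same register update for a whole byte or word per step, typically by unrolling the eight shift iterations into a lookup table or an XOR network.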
Progress assessment
Passing the mid-term exam, completing the PC labs, and submitting a project solution.
Controlled instruction