Syllabus for Roster(s):
- 17Sp CS 4434-001 (ENGR)
- 17Sp CS 4434-001 (ENGR) Waitlist
- 17Sp CS 6434-001 (ENGR)
- 17Sp ECE 4434-001 (ENGR)
- 17Sp ECE 6434-001 (ENGR)
- 17Sp SYS 4582-009 (ENGR)
- 17Sp SYS 6582-006 (ENGR)
Course Description
Computing systems are used in various critical domains including aerospace, energy, transportation, healthcare, and commerce. Failures of these systems may lead to catastrophic consequences such as injury, loss of life, damage to equipment, or financial loss. This course focuses on techniques for designing and analyzing dependable computing systems that can continue to operate correctly in the presence of software and hardware problems. We will learn what can go wrong, how we can predict, prevent, and detect faults/errors, and how we can design systems that can tolerate faults and recover from failures.
Topics:
- Introduction to dependable computing
- Basic terminology, attributes, and evaluation techniques
- Combinatorial and state-space modeling
- Hardware fault tolerance
- Information redundancy
- Software fault tolerance
- Checkpointing and recovery
- Reliable networked systems
- Error detection techniques
- Dependability evaluation techniques
- Safety and security
Time: Mon/Wed/Fri 9:00AM - 9:50AM
Location: Thornton Hall E304
Office Hours: Wed 10:00AM - 11:00AM, Thornton Hall E314
Schedule and Activities
This timeline is tentative and subject to change.
| Week | Dates | Topics | Lectures | In-class Activities | Assignments | Reading |
|---|---|---|---|---|---|---|
| 1 | Jan 18 | Background and Motivation | | | | |
| | Jan 20 | | | | | |
| 2 | Jan 23 | Basic Dependability Concepts | | | | |
| | Jan 25 | | | | | |
| | Jan 27 | | | | | |
| 3 | Jan 30 | Combinatorial/State-space Modeling | | | | |
| | Feb 1 | | | | | |
| | Feb 3 | | | | | |
| 4 | Feb 6 | Hardware Fault Tolerance | | | | |
| | Feb 8 | | | | | |
| | Feb 10 | | | | | |
| 5 | Feb 13 | Hardware Fault Tolerance | | | | |
| | Feb 15 | | | | | |
| | Feb 17 | Information Redundancy (Guest Lec.) | -- | | | |
| 6 | Feb 20 | Information Redundancy | | | | |
| | Feb 22 | | | | | |
| | Feb 24 | Information Redundancy (Cont.) | | | | |
| 7 | Feb 27 | Information Redundancy (Cont.); Midterm Review | | | | |
| | Mar 1 | | | | | |
| | Mar 3 | Midterm Exam | | --- | --- | |
| Spring Recess | Mar 4-12 | | | | | |
| 8 | Mar 13 | Error Detection Techniques | | | | |
| | Mar 15 | -- | | | | |
| | Mar 17 | Final Project Overview | | | | |
| 9 | Mar 20 | Software Fault Tolerance; Experimental Evaluation (Validation) | | | | |
| | Mar 22 | | | | | |
| | Mar 24 | | | | | |
| 10 | Mar 27 | Checkpointing & Recovery | | | | |
| | Mar 29 | | | | | |
| | Mar 31 | Final Project Topics | | | | |
| 11 | Apr 3 | Processor-level Detection and Recovery | | | | |
| | Apr 5 | Final Project | | | | |
| | Apr 7 | | | | | |
| 12 | Apr 10 | Processor-level Detection and Recovery | | | | |
| | Apr 12 | | | | | |
| | Apr 14 | | | | | |
| 13 | Apr 17 | Distributed Systems/Network Specific Issues | | | | |
| | Apr 19 | | | | | |
| | Apr 21 | | | | | |
| 14 | Apr 24 | | | | | |
| | Apr 26 | | | | | |
| | Apr 28 | | | | | |
| Final | May 1 | No Class/Only Office Hours | | | | |
| | May 2 | Final Exam Release | | | Project Report Due | |
References
The lectures and assignments are based on the following references:
- I. Koren and C. Mani Krishna, Fault-Tolerant Systems, 1st edition, 2007, Morgan Kaufmann. (Read online through UVA Library)
- J. Knight, Fundamentals of Dependable Computing for Software Engineers, 2012, CRC Press. (Read online through UVA Library)
- K. Trivedi, Probability and Statistics with Reliability, Queuing and Computer Science Applications, 2nd edition, 2001, John Wiley & Sons.
- D. K. Pradhan, Fault-Tolerant Computer System Design, 1st edition, 1996, Prentice-Hall.
- An unpublished textbook by R. K. Iyer, Z. Kalbarczyk, and N. Nakka from the University of Illinois at Urbana-Champaign, who have agreed to let us use a pre-publication copy. The book consists of multiple chapters, each in a separate PDF file that will be shared with you internally. Please do not redistribute.
Grading
| Component | Undergraduate Students | Graduate Students |
|---|---|---|
| Class Participation/Activity | 5% | 5% |
| Short presentations | 5% | 2% |
| Paper presentations * | -- | 8% |
| Homework and Mini Project ** | 25% | 20% |
| Final Project *** | 30% | 30% |
| Midterm Exam | 15% | 15% |
| Final Exam (Take home) | 20% | 20% |
* Each graduate student will select a paper on a special topic (of mutual interest) and will:
- Present the paper to the class (20 minutes).
- Prepare a short homework assignment based on the material in the lecture and the paper.
- Grade the homework assignment and provide a solution to the class.
- Make the paper available to the class at least one week before the presentation, and grade the homework assignment within one week of its submission.
** Late assignments incur a 10% penalty per school day.
*** Final projects will be carried out by teams that include both graduate and undergraduate students. Each team will propose a project on a related topic of interest, define measurable outcomes and deliverables, and present the results as a short paper and a lecture to the class at the end of the semester. All students are required to participate actively in different aspects of the project.
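As a rough illustration of how the weights above combine, the following Python sketch computes a weighted course total on a 0-100 scale. The component names, the function names `late_adjusted` and `course_total`, and the example scores are all hypothetical. The late-penalty rule shown (10% of the earned score deducted per school day, never going below zero) is only one possible reading of the policy; the instructor's actual rule may differ.

```python
# Weights in percent, copied from the grading table.
UNDERGRAD_WEIGHTS = {
    "participation": 5,
    "short_presentations": 5,
    "paper_presentation": 0,  # "--" in the table: not required for undergraduates
    "homework_mini_project": 25,
    "final_project": 30,
    "midterm_exam": 15,
    "final_exam": 20,
}

GRAD_WEIGHTS = {
    "participation": 5,
    "short_presentations": 2,
    "paper_presentation": 8,
    "homework_mini_project": 20,
    "final_project": 30,
    "midterm_exam": 15,
    "final_exam": 20,
}


def late_adjusted(score, school_days_late, penalty_per_day=10):
    """Apply an ASSUMED late-penalty rule to a single assignment score.

    Assumption (not stated in the syllabus): each school day late removes
    `penalty_per_day` percent of the earned score, floored at zero.
    """
    remaining_percent = max(0, 100 - penalty_per_day * school_days_late)
    return score * remaining_percent / 100


def course_total(scores, weights):
    """Return the weighted total (0-100) given per-component scores (0-100)."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    return sum(weights[name] * scores[name] for name in weights) / 100


if __name__ == "__main__":
    # Hypothetical undergraduate: 80 on everything, except a homework
    # score of 90 that was submitted one school day late.
    scores = {name: 80.0 for name in UNDERGRAD_WEIGHTS}
    scores["homework_mini_project"] = late_adjusted(90.0, school_days_late=1)
    print(course_total(scores, UNDERGRAD_WEIGHTS))
```

Keeping the weights as integer percentages makes it easy to check that each column of the table sums to 100 before any scores are combined.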