Safety-1.pdf, Lecture notes of Systems Engineering

Typology: Lecture notes

2021/2022

Uploaded on 09/12/2022

mrbean3
CSE 466 Critical Systems Engineering Slide 1
Critical Systems Engineering
• Engineering systems to avoid disasters
Adapted from Ian Sommerville

CSE 466 Critical Systems Engineering Slide 2
Objectives
• To introduce the notion of critical systems
• To describe critical system attributes (reliability, availability, maintainability, safety and security)
• To introduce techniques used for developing reliable and safe systems
• To discuss the importance of people in critical systems engineering

CSE 466 Critical Systems Engineering Slide 3
Critical systems
• A critical system is any system whose ‘failure’ could threaten human life, the system’s environment or the existence of the organisation which operates the system.
• ‘Failure’ in this context does NOT mean failure to conform to a specification but means any potentially threatening system behaviour.

CSE 466 Critical Systems Engineering Slide 4
Examples of critical systems
• Communication systems such as telephone switching systems, aircraft radio systems, etc.
• Embedded control systems for process plants, medical devices, etc.
• Command and control systems such as air-traffic control systems, disaster management systems, etc.
• Financial systems such as foreign exchange transaction systems, account management systems, etc.

CSE 466 Critical Systems Engineering Slide 5
Critical systems usage
• Most critical systems are now computer-based systems
• Critical systems are becoming more widespread as society becomes more complex and more complex activities are automated
• People and operational processes are very important elements of critical systems - they cannot simply be considered in terms of hardware and software

CSE 466 Critical Systems Engineering
Critical systems failure
• The cost of failure in a critical system is likely to exceed the cost of the system itself
• As well as direct failure costs, there are indirect costs from a critical systems failure. These may be significantly greater than the direct costs
• Society’s views of critical systems are not static - they are modified by each high-profile system failure

CSE 466 Critical Systems Engineering Slide 7
Criticality attributes
• Reliability
  - Concerned with failure to perform to specification
• Availability
  - Concerned with failure to deliver required services
• Maintainability
  - Concerned with the ability of the system to evolve
• Safety
  - Concerned with behaviour which directly or indirectly threatens human life
• Security
  - Concerned with the ability of the system to protect itself

CSE 466 Critical Systems Engineering
Reliability
• Attribute concerned with the number of times a system fails to deliver specified services. Difficult to define in an intuitive way
• Can’t be defined without defining the context of use of the system
• Metrics used
  - MTTF - Mean Time to Failure. Time between observed system failures
  - ROCOF - Rate of occurrence of failures. Number of failures in a given time period
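
The two metrics above can be computed directly from a log of failure times. The following is a minimal sketch, not a definitive procedure: the function names, the sample log and the choice of hours as the time unit are illustrative assumptions, and MTTF is taken here, as on the slide, to be the mean time between observed failures.

```python
from typing import Sequence


def mttf(failure_times: Sequence[float]) -> float:
    """Mean of the intervals between consecutive observed failures."""
    if len(failure_times) < 2:
        raise ValueError("need at least two failures to measure an interval")
    times = sorted(failure_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return sum(gaps) / len(gaps)


def rocof(failure_times: Sequence[float], period: float) -> float:
    """Number of failures divided by the length of the observation period."""
    if period <= 0:
        raise ValueError("period must be positive")
    return len(failure_times) / period


# Hypothetical log: operating hours at which failures were observed.
observed = [100.0, 250.0, 450.0, 700.0]

print(mttf(observed))           # 200.0 hours between failures on average
print(rocof(observed, 1000.0))  # 0.004 failures per hour over 1000 hours
```

Both figures only mean something relative to the context of use: the same system exercised with a different workload would produce a different failure log.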

CSE 466 Critical Systems Engineering Slide 13
Critical systems development
• Critical systems attributes are NOT independent - the systems development process must be organised so that all of them are satisfied at least to some minimum level
• More rigorous (and expensive) development techniques have to be used for critical systems development because of the potential cost of failure

CSE 466 Critical Systems Engineering
Developing reliable systems
• Reliable systems should be ‘fault-free’ systems where ‘fault-free’ means that the system’s behaviour always conforms to its specification
• Systems which are ‘fault-free’ may still fail because of specification or operational errors
• The cost of producing reliable systems grows exponentially as reliability requirements are increased. In reality, we can never be sure that we have produced a ‘fault-free’ system

CSE 466 Critical Systems Engineering Slide 15
System faults and failures
• Faults and failures are not the same thing although the terms are often used fairly loosely
• A fault is a static characteristic of a system such as a loose nut on a wheel, an incorrect statement in a program, or an incorrect instruction in an operational procedure
• A failure is some unexpected system behaviour resulting from a fault such as a wheel falling off or the wrong amount of a chemical being used in a reactor
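
The distinction can be illustrated in software. The sketch below uses a hypothetical function, `largest`, containing one incorrect statement (the fault). The fault is present in every run, but a failure is only observed for inputs that exercise it.

```python
def largest(values):
    """Intended to return the largest value in a non-empty list."""
    best = 0  # FAULT: an incorrect statement; it should be values[0]
    for v in values:
        if v > best:
            best = v
    return best


# The fault is present, but this input does not expose it: no failure.
print(largest([3, 7, 2]))   # 7, which is correct

# This input exercises the fault, so a failure is observed.
print(largest([-5, -2]))    # 0, but the largest value is -2
```

A system can therefore run for a long time with faults in it and show no failures, which is one reason testing alone cannot demonstrate that a system is ‘fault-free’.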

CSE 466 Critical Systems Engineering
Reliability achievement
• Achieving systems reliability is generally based on the notion that system failures may be reduced by reducing the number of system faults
• Fault reduction techniques
  - Fault avoidance
  - Fault detection
• Alternatively, reliability can be achieved by ensuring faults do not result in failures
  - Fault tolerance

CSE 466 Critical Systems Engineering Slide 17
Fault avoidance
• The use of development techniques which reduce the probability that faults will be introduced into the system
  - Certified development process
    - Use a process which is known to work
  - Formal specification of the system
    - Discovers anomalies before design
  - Use of ‘safe’ software development techniques
    - Avoidance of error-prone language constructs
    - Use of a programming language (such as Ada) which can detect many programming errors at compile-time
  - Certified sub-contractors

CSE 466 Critical Systems Engineering
Fault detection
• The use of techniques in the development process which are likely to detect faults before a system is delivered
  - Mathematical correctness arguments
  - Measurement of test coverage
  - Design/program inspections and formal reviews
  - Independent verification and validation
  - Run-time monitoring of the system
  - Back-to-back testing
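
As one illustration of the techniques listed, back-to-back testing runs two versions of the same function on identical inputs and reports any disagreement. The sketch below is a minimal example under stated assumptions: the harness `back_to_back` and the two leap-year functions are hypothetical, and the second version carries a deliberately seeded fault.

```python
from typing import Callable, Iterable, List, Tuple


def back_to_back(
    version_a: Callable[[int], bool],
    version_b: Callable[[int], bool],
    inputs: Iterable[int],
) -> List[Tuple[int, bool, bool]]:
    """Return (input, output_a, output_b) for every input where outputs differ."""
    disagreements = []
    for x in inputs:
        out_a, out_b = version_a(x), version_b(x)
        if out_a != out_b:
            disagreements.append((x, out_a, out_b))
    return disagreements


def is_leap_v1(year: int) -> bool:
    """Gregorian rule including the century exception."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_v2(year: int) -> bool:
    """Second version with a seeded fault: the 400-year rule is missing."""
    return year % 4 == 0 and year % 100 != 0


print(back_to_back(is_leap_v1, is_leap_v2, range(1896, 2005)))
# [(2000, True, False)]
```

A disagreement shows that at least one version is faulty but not which one, and agreement on every input does not prove correctness, since both versions may share the same fault.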

CSE 466 Critical Systems Engineering Slide 19
Fault tolerance
• In critical situations, systems must be fault tolerant.
• Fault tolerance means that the system can continue in operation in the presence of system faults
• Even if the system has been demonstrated to be fault-free, it must also be fault tolerant as there may be specification errors or the validation may be incorrect

CSE 466 Critical Systems Engineering
Triple-modular redundancy
• There are three replicated identical components which receive the same input and whose outputs are compared
• If one output is different, it is ignored and component failure is assumed
• Based on most faults resulting from component failures rather than design faults and a low probability of simultaneous component failure
• Applied to both hardware and (in a different form) software systems
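
The voting step can be sketched as a majority vote over the three outputs. This is a minimal illustration, not a description of any particular implementation: the names `tmr_vote` and `tmr_failure_probability` are hypothetical, outputs are assumed to be hashable values that can be compared for equality, and the probability calculation assumes that the three components fail independently with the same probability `p`.

```python
from collections import Counter
from typing import Hashable, Sequence


def tmr_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Return the value produced by at least two of the three components."""
    if len(outputs) != 3:
        raise ValueError("triple-modular redundancy needs exactly three outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: all three outputs differ")
    return value


def tmr_failure_probability(p: float) -> float:
    """Probability that two or more of three independent components fail.

    Exactly two fail: 3 * p^2 * (1 - p). All three fail: p^3.
    """
    return 3 * p**2 * (1 - p) + p**3


print(tmr_vote([42, 42, 41]))         # 42: the odd output is ignored
print(tmr_failure_probability(0.01))  # about 0.000298, versus 0.01 for one component
```

The benefit depends on the independence assumption. A design fault replicated in all three identical components makes them fail together, and the voter cannot mask it, which is why the approach rests on most faults being component failures rather than design faults.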

CSE 466 Critical Systems Engineering Slide 25
Definitions
• Mishap (or accident)
  - An unplanned event or event sequence which results in human death or injury. It may be more generally defined as covering damage to property or the environment
• Incident
  - A system failure which may potentially result in an accident
• Hazard
  - A condition with the potential for causing or contributing to an incident

CSE 466 Critical Systems Engineering
Examples
• Car crash resulting from a brake system failure
  - Hazard - faulty brake control software
  - Incident - car fails to brake when instructed by driver
  - Accident - car leaves road and crashes
• Incorrect drug dosage administered due to faulty operating instructions
  - Hazard - nurse follows a set of faulty operating instructions for a drug delivery system
  - Incident - incorrect dosage of drug computed by system
  - Accident - incorrect dosage of drug delivered to patient

CSE 466 Critical Systems Engineering Slide 27
Safety achievement
• Safety can be achieved by
  - Avoiding hazards - developing the system so that hazardous states do not arise. Example: proving a braking system meets its specification.
  - Ensuring hazards do not result in incidents - adding functionality to the system to detect and correct hazards. Example: adding fault tolerance to the braking system software.
  - Avoiding accidents by adding features to systems which mean that incidents do not result in an accident. Example: providing a backup braking system.
  - Reducing the chances that accidents will result in damage to people by adding protection to a system. Example: adding seat belts and airbags.

CSE 466 Critical Systems Engineering
Safety and reliability
• Not the same thing. Reliability is concerned with conformance to a given specification and delivery of service
• The number of faults which can cause safety-related failures is usually a small subset of the total number of faults which may exist in a system
• Safety is concerned with ensuring the system cannot cause damage irrespective of whether or not it conforms to its specification

CSE 466 Critical Systems Engineering Slide 29
Designing for safety
• System design should always be based around the notion that no single point of failure can compromise system safety. Systems should always be able to tolerate one failure
• However, accidents usually arise because of several simultaneous failures rather than a failure of a single part of the system
• Anticipating complex sub-system interactions when these sub-systems are failing is very difficult

[Figure: The safety life cycle. Box labels recovered from the diagram: Hazard analysis; Risk assessment; Safety requirements specification; Functional requirements specification; Safety-integrity requirements specification; Designation of safety-critical systems; Validation planning; Design and implementation; Verification; Safety validation; Operation and maintenance.]

CSE 466 Critical Systems Engineering Slide 31
People and critical systems
• People and associated operational processes are essential elements of critical systems
• People are probably the most important single source of failure in critical systems BUT they are also the most effective mechanism we have for incident/accident avoidance
• Human factors are significant in the design, the development and the operation of critical systems

CSE 466 Critical Systems Engineering Slide 32
Human factors in operations
• Many (the majority?) of systems failures are due to ‘errors’ made by operators of the system (pilots, controllers, signallers, etc.)
• However, it is arguable whether these operators should be blamed for these errors - in many cases they are a result of poor system design where the operational situation was not understood by the system designers

CSE 466 Critical Systems Engineering Slide 37
Key points
• Safety-critical systems are systems whose failure can damage people and the system’s environment
• Safety and reliability are not the same thing - reliable systems can be unsafe
• Process issues (a safety life cycle) are very important for safety-critical systems
• Human, social and organisational factors must be taken into account in the development of critical systems