Performance Engineering

Workloads And Tools

WORKLOADS AND TOOLS

The issue around workloads is this: You want to be able to represent the reality of some environment. To do this, you must build a model or representation of that environment. This is a workload.

Tools are simply the component pieces of a workload; they may represent the driving part of that workload, or they may form the measurement component.

DEFINITIONS:

Reality ===> Model or Abstraction ===> Workload

Reality is what is - it's essentially unmeasurable because it's too complex, too remote, too secret, and so on.

A Model is the thinking that goes into abstracting reality.

A Workload is an attempt to approximate the model.

A Benchmark is a stylized workload, usually very portable, used to compare various systems.

POPULAR BENCHMARKS

Computation Benchmarks – these depend mainly on the speed of the hardware and the efficiency of the compiler. Useful for hardware comparisons.

Sieve of Eratosthenes – Determines prime numbers using a series of loops (a minimal sketch follows this list).

Whetstone – A synthetic benchmark designed to measure the behavior of scientific programs.

Dhrystone – Claims to represent system programming environments; generally integer rather than floating point arithmetic.

SPEC – Benchmarks developed by the System Performance Evaluation Cooperative. Widely used on UNIX systems, where the code is extremely portable. Contains the kinds of activities commonly found in engineering and scientific environments (compiles, matrix inversions, etc.)
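Since the Sieve is the simplest of these, here is a minimal sketch of the kind of loop kernel such a benchmark times. This is an illustrative Python version, not the original benchmark code (which was typically C or BASIC), and the limit and iteration count are arbitrary illustration values.

    import time

    def sieve(limit):
        """Classic Sieve of Eratosthenes: count the primes below `limit`."""
        flags = [True] * limit
        flags[0] = flags[1] = False
        for i in range(2, int(limit ** 0.5) + 1):
            if flags[i]:
                # Mark every multiple of i as composite.
                for j in range(i * i, limit, i):
                    flags[j] = False
        return sum(flags)

    # Time repeated runs, as such benchmarks do, to get a per-iteration figure.
    iterations = 100
    start = time.perf_counter()
    for _ in range(iterations):
        count = sieve(8192)
    elapsed = time.perf_counter() - start
    print(f"{count} primes below 8192; {elapsed / iterations * 1000:.2f} ms per iteration")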


Application/System Benchmarks – these depend mainly on the efficiency of the OS and application. Useful for software comparisons.

Webstone – Measures the number of accesses that can be made remotely to a target web server. Measures network message handling, web server behavior, and file lookup.

WebBench – Measures how many accesses to a webserver can be accomplished in a given time (a toy illustration of this style of measurement follows this list).

TPC – A series of Transaction Processing Performance Council benchmarks. They are generally database oriented; a typical “transaction” involves doing a query on several data items and then updating those items.

AIM – A series of operating system actions (scheduling, page faults, disk writes, IPC, etc.). Each action is relatively atomic and can be run either standalone or as a bundle of tests.
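To make the "accesses accomplished in a given time" style of measurement concrete, here is a minimal throughput harness in Python. The operation being counted is a local stand-in (a dictionary query followed by an update, loosely echoing the query-then-update transaction described above); a real benchmark such as WebBench or TPC would drive an actual web server or database, so treat this purely as a sketch of the measurement pattern.

    import time

    def transaction(db, key):
        """Toy stand-in for a benchmark operation: query one item, then update it."""
        value = db.get(key, 0)
        db[key] = value + 1

    def measure_throughput(duration_s=2.0):
        """Count how many operations complete in a fixed wall-clock window."""
        db = {k: 0 for k in range(1000)}
        completed = 0
        deadline = time.perf_counter() + duration_s
        while time.perf_counter() < deadline:
            transaction(db, completed % 1000)
            completed += 1
        return completed / duration_s

    print(f"{measure_throughput():.0f} operations/second")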

WORKLOAD CHARACTERIZATION

Characteristics of a model include:

  • Representativeness and accuracy; does the model match reality?
  • Flexibility; is the model extensible so it can track changes in the real environment?
  • Simplicity; reducing construction cost and complexity of gathering information.
  • Compactness; is the model easy to use and inexpensive to run?
  • System independence; is the model portable?
  • Reproducibility; what is the degree of control the user has over the model?

So: What is the relative importance of these characteristics in a typical Development Environment?

Example: List some benchmarks/tools that could be used by development groups. How do they fit these characteristics?

APPROACHES TO CHARACTERIZATION

This involves figuring out what behaviour to approximate and then what workload to produce in order to duplicate this behaviour. Of the many possible behaviours on a system, which one do we want to single out?

What are job parameters - or what behaviour do we focus on?

  • Program(s): CPU used by the program; number and type of system calls.
  • Disk: number and distribution of disk accesses.
  • CPU: number and distribution of machine instructions.

Each of these raw numbers involves means, distributions, etc., and can be interpreted in several ways. For example, disk accesses can be represented as any of the following (a short sketch after this list derives two of them from a trace):

  • Seek distributions (seek length profiles)
  • Disk busy times
  • Response times
  • Throughput
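A minimal Python sketch of two of these representations (a seek-length profile and throughput) computed from a trace. The trace here is random synthetic data standing in for real measurements; the cylinder range, access count, and bucket size are all invented illustration values.

    import random
    from collections import Counter

    random.seed(1)

    # Synthetic trace: (completion time in seconds, cylinder address) per access.
    trace = [(t * 0.01, random.randint(0, 999)) for t in range(1000)]

    # Seek-length profile: distances between consecutive accesses, bucketed by 100 cylinders.
    seeks = [abs(b - a) for (_, a), (_, b) in zip(trace, trace[1:])]
    profile = Counter(d // 100 for d in seeks)
    for bucket in sorted(profile):
        print(f"seek length {bucket * 100:4d}-{bucket * 100 + 99:4d}: {profile[bucket]} accesses")

    # Throughput: accesses completed per second over the whole trace.
    duration = trace[-1][0] - trace[0][0]
    print(f"throughput: {len(trace) / duration:.1f} accesses/second")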


Example: In a “real” environment, there are 100 people entering data at any one time. The average person completes 20 fields a minute, but there is a typical variation of +/- 5 – some people type 15 fields/minute and some get as high as 25 fields/minute.

How would you represent the input from these 100 people?
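One plausible answer, sketched below, is to give each simulated typist a rate drawn from a distribution centred on 20 fields/minute with the stated spread, and then sum the rates to get the aggregate offered load. The slide only gives a "+/- 5" variation, so the choice of a uniform 15-25 range here is an assumption; a normal distribution would be an equally reasonable reading.

    import random

    random.seed(42)

    NUM_TYPISTS = 100
    MEAN_RATE = 20          # fields per minute
    SPREAD = 5              # typical variation quoted on the slide

    # One rate per typist, drawn uniformly from the 15-25 fields/minute range (assumption).
    rates = [random.uniform(MEAN_RATE - SPREAD, MEAN_RATE + SPREAD) for _ in range(NUM_TYPISTS)]

    total = sum(rates)
    print(f"aggregate offered load: {total:.0f} fields/minute "
          f"(about {total / 60:.1f} fields/second presented to the system)")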

Example: A Whetstone program is designed to use the machine instructions found in typical computationally intensive FORTRAN programs.

  • Do the instructions in a whetstone reflect a realistic computing environment?
  • Can we use these MIPS and Whetstones to compare two machines?
  • Are TPCs a better way to measure than MIPS?

EXPRESSING THE CHARACTERIZATION

There are numerous ways to express a component of system behavior.

Example:

Suppose a large number of processes are using the CPU. We can say either of the following:

a) There are 1000 process schedules in a second. The CPU is 55% busy; therefore each process requires 0.55 milliseconds of CPU each time it asks for processing. This averaging, expressed more formally, is simply the mean given by the first formula below.

b) There are 1000 process schedules in a second. The CPU is 55% busy, but there is a wide variation in the processor demand, based on the kind of process or simply on randomness (a particular process needs different amounts of CPU based on where it is in its transaction). Then we'd like to be able to express the CPU required as a mean (as in a) and also a standard deviation, given by the second formula below.

X_{\mathrm{ave}} = \frac{1}{n} \sum_{i=1}^{n} X_i

s = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - X_{\mathrm{ave}} \right)^2 }
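A minimal Python sketch of these two computations, run on synthetic per-schedule CPU demands whose mean is set to match the 0.55 ms figure above; the spread of the synthetic data is an arbitrary illustration value.

    import math
    import random

    random.seed(7)

    # Synthetic per-schedule CPU demands (milliseconds), mean 0.55 ms with some spread.
    demands = [max(0.0, random.gauss(0.55, 0.25)) for _ in range(1000)]

    n = len(demands)
    x_ave = sum(demands) / n                                         # (1/n) * sum of X_i
    s = math.sqrt(sum((x - x_ave) ** 2 for x in demands) / (n - 1))  # sample standard deviation

    print(f"mean CPU per schedule: {x_ave:.3f} ms, standard deviation: {s:.3f} ms")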

REPRESENTATIVENESS

How easy is it to find a work load that is representative? Not very easy! Issues include:

  • How performance indices depend on work load and system parameters.
  • Often the dependence is very non-linear, and effects are non-additive.
  • Interactions can exist between the parameters, though these are usually so complex that they must be ignored.

Example: Increasing the level of multiprogramming increases memory usage, which increases paging and CPU-per-process usage.

SYSTEM INDEPENDENCE

The parameters by which we model a workload should NOT depend on the type of system, on its configuration, or on its software.

Example: Suppose we partially characterize a workload based on the number of paging requests made. Then increasing memory will cause fewer page faults, which may or may not affect user-visible performance.

Vendors suggest benchmarks that will be advantageous to their company - they LOOK for system dependence.

There are ways of being system independent; in fact, that's what open systems are all about.

Characterize logically rather than physically. If you define a test in terms of “lines of C”, it's much more portable than “lines of assembler”.

THE CONSTRUCTION OF WORK LOAD MODELS

Natural work loads:

Samples of the production work load that the system processes at the time of the experiment. It's generally shorter than a real load.

Modeling in this case means choosing the times of data collection.

We need both:

An accurate characterization - we know what parameters to use in describing our load, and what values they should be.

An accurate implementation - we can find a workload which matches our characterization.


Pros and cons of natural work loads:

  • They may be very representative, especially if the natural load is relatively stable.
  • System independence is low.
  • Not very controllable (only times and durations can be determined). This means poor flexibility and reproducibility.
  • Cost to produce is relatively low.
  • Usage cost is high because they aren't compact - having a long run time and a great deal of data.


Artificial Work Loads:

Programs that aren’t derived from the production load.

We can describe these workloads in terms of the level of parameterization; we can build models to match a real load at any of these levels:

  • At the machine instruction level (number of adds, moves, etc.)
  • At C Code statement level (number of do statements, etc.)
  • At low level OS parameters (number of reschedules/sec.)
  • At system call level (number of get_time_of_day / sec.)
  • At application level (number of text lines searched.)
  • At interactive command level (edit, compile, etc)
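A sketch of what the system call level of parameterization might look like: the only parameters are the call rate and the run length, and the "workload" is simply that many get_time_of_day-style calls per second. Python's time.time() stands in for the system call here, and the default rate and duration are invented illustration values; the pacing is approximate because it relies on sleep granularity.

    import time

    def syscall_workload(calls_per_second=500, duration_s=3):
        """Issue a fixed rate of get_time_of_day-style calls for a fixed duration."""
        interval = 1.0 / calls_per_second
        issued = 0
        end = time.perf_counter() + duration_s
        while time.perf_counter() < end:
            time.time()             # the "system call" being exercised
            issued += 1
            time.sleep(interval)    # pace the calls to roughly the requested rate
        return issued

    print(f"issued {syscall_workload()} timing calls")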


Pros and cons of artificial work loads:

  • Useful for extrapolation from a present load to some future nonexistent load.
  • Can be made compact (one program can run with a variety of parameters).
  • May be expensive to produce, but easy to run.
  • Fairly system independent, especially if developed at a conceptual level. (Obviously instruction mix problems are an exception.)
  • The more detailed the model, the more representative it will be.

Example: Characterize workloads with which you are familiar in terms of their level of parameterization, from most exact to least exact.


Example: Pat is designing a communications server that receives requests from "higher level" routines. The requests are collected by a Request Handler that does nothing but put them into buffers. The Request Processor removes these requests from the buffers on a first-come, first-served basis.

requests -> Request Handler -> Buffers[n] -> Request Processor ->

This product will be used in a wide range of applications; the "higher level" routines typically send packets of 1348 bytes, but other sizes are also possible. In addition, the applications will be placing variable load on the system; loads might range from "very light" to "extremely heavy". Pat wishes to describe a benchmark (or tool) that can be used to test this product. (The specification of this benchmark is necessary since the Functional Spec requires a description of how the product will perform.)
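One way such a benchmark could be described is as a driver that plays the role of the "higher level" routines: it offers requests of a configurable size at a configurable rate, while the measurement side records how many requests the Request Processor completes and how often the buffers overflow. The Python sketch below simulates that arrangement with a bounded queue and two threads; the 1348-byte default comes from the text above, while the buffer count, offered rate, and run length are invented illustration values, not part of Pat's specification.

    import queue
    import threading
    import time

    BUFFERS = 32                 # Buffers[n]; n is an invented value here
    PACKET_BYTES = 1348          # typical request size from the spec
    OFFERED_RATE = 2000          # requests/second offered by the driver (illustrative)
    RUN_SECONDS = 2.0

    buffers = queue.Queue(maxsize=BUFFERS)
    processed = 0
    dropped = 0

    def request_handler():
        """Driver + Request Handler: offer fixed-size requests at the configured rate."""
        global dropped
        end = time.perf_counter() + RUN_SECONDS
        while time.perf_counter() < end:
            try:
                buffers.put_nowait(b"x" * PACKET_BYTES)
            except queue.Full:
                dropped += 1             # buffers exhausted under this load
            time.sleep(1.0 / OFFERED_RATE)

    def request_processor():
        """Request Processor: remove requests first-come, first-served."""
        global processed
        while True:
            try:
                buffers.get(timeout=0.5)
            except queue.Empty:
                return                   # driver has finished and the queue has drained
            processed += 1

    producer = threading.Thread(target=request_handler)
    consumer = threading.Thread(target=request_processor)
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()

    print(f"processed {processed} requests, dropped {dropped}, "
          f"throughput {processed / RUN_SECONDS:.0f} requests/second")

Varying OFFERED_RATE from "very light" to "extremely heavy" and re-running gives the kind of load sweep the Functional Spec would need in order to describe how the product performs.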