Embedded Systems: Optimizing Instruction Execution in Software Design, Slides of Computer Science

Chaudhary Charan Singh Haryana Agricultural University Computer Science

An overview of software design basics for embedded systems, focusing on optimizing instruction execution in general purpose processors. Topics include pipelining, superscalar architectures, cache memory, and microprocessor selection. Students and professionals in computer engineering, electronics, and related fields will find this information useful for understanding the fundamentals of software design for embedded systems.

Typology: Slides

2012/2013

Uploaded on 03/22/2013

dhritiman 🇮🇳


3-Software Design Basics in Embedded Systems

Optimizing the design of General Purpose processors


Pipelining: Increasing Instruction Throughput

Instruction pipeline stages: Fetch-instr., Decode, Fetch ops., Execute, Store res.

[Figure: non-pipelined vs. pipelined dish cleaning (Wash, Dry) and pipelined instruction execution, each plotted against time]
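The throughput gain can be estimated with a small calculation. The sketch below is not from the slides: it assumes an idealized pipeline in which every stage takes one cycle and there are no stalls, and the function names are made up for illustration.

```c
#include <assert.h>

/* Cycles to run n instructions when each one passes through all k
 * stages before the next one starts (no overlap). */
unsigned long cycles_non_pipelined(unsigned long n, unsigned long k)
{
    return n * k;
}

/* Cycles to run n instructions on an ideal k-stage pipeline:
 * k cycles to fill the pipeline with the first instruction, then one
 * further instruction completes per cycle.
 * Assumption: one cycle per stage, no hazards or stalls. */
unsigned long cycles_pipelined(unsigned long n, unsigned long k)
{
    assert(k > 0);
    return n == 0 ? 0 : k + (n - 1);
}
```

With the five stages named on the slide, 8 instructions would take 40 cycles without overlap and 12 cycles in the ideal pipelined case. Real pipelines fall short of this because of branches and data dependencies.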

Two Memory Architectures

• Princeton
– Fewer memory wires
• Harvard
– Simultaneous program and data memory access

[Figure: Harvard: processor connected to separate program memory and data memory. Princeton: processor connected to a single memory (program and data).]

Cache Memory

• Memory access may be slow
• Cache is small but fast memory close to processor
– Holds copy of part of memory
– Hits and misses

[Figure: processor, cache, memory. Cache: fast/expensive technology, usually on the same chip. Memory: slower/cheaper technology, usually on a different chip.]
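Hits and misses are easier to see in a tiny simulation. The slide does not say how the cache is organized, so the sketch below assumes a direct-mapped cache with a hypothetical four one-word lines. It tracks only tags, not data, and all names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES 4u   /* hypothetical cache size: four one-word lines */

struct cache {
    bool     valid[NUM_LINES];  /* does the line hold a copy of memory? */
    uint32_t tag[NUM_LINES];    /* which memory word the line holds */
    unsigned hits, misses;
};

/* Looks up one word address. Returns true on a hit. On a miss the
 * line is (re)filled, evicting whatever it held before. */
bool cache_access(struct cache *c, uint32_t addr)
{
    assert(c != 0);
    uint32_t index = addr % NUM_LINES;  /* line this address maps to */
    uint32_t tag   = addr / NUM_LINES;  /* identifies the address within that line */

    if (c->valid[index] && c->tag[index] == tag) {
        c->hits++;
        return true;
    }
    c->valid[index] = true;
    c->tag[index]   = tag;
    c->misses++;
    return false;
}
```

Starting from a zero-initialized `struct cache`, accessing address 0 twice gives a miss followed by a hit. Accessing address 4 then evicts address 0, because both map to line 0 in this organization.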

Assembly-Level Instructions

• Instruction Set

– Defines the legal set of instructions for that processor

• Data transfer: memory/register, register/register, I/O, etc.

• Arithmetic/logical: move register through ALU and back

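• Branches: determine next PC value when not just PC+1

Instruction format: opcode operand1 operand2

[Figure: a program as a sequence of instructions (Instruction 1, Instruction 2, Instruction 3, Instruction 4), each with the format above]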

A Simple Instruction Set

Each instruction consists of an opcode followed by operands.

Assembly instruct.    First byte    Second byte    Operation
MOV Rn, direct        0000 Rn       direct         Rn = M(direct)
MOV direct, Rn        0001 Rn       direct         M(direct) = Rn
MOV @Rn, Rm           0010 Rn       Rm             M(Rn) = Rm
MOV Rn, #immed.       0011 Rn       immediate      Rn = immediate
ADD Rn, Rm            0100 Rn       Rm             Rn = Rn + Rm
SUB Rn, Rm            0101 Rn       Rm             Rn = Rn - Rm
JZ Rn, relative       0110 Rn       relative       PC = PC + relative (only if Rn is 0)
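The two-byte encoding can be sketched as a pair of C helpers that pack an instruction into a 16-bit word. The field positions follow the aliases declared later in the slides (op in bits 15..12, rn in 11..8, rm in 7..4, and dir/imm/rel in 7..0). The enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Opcodes from the instruction table (first four bits of the first byte). */
enum opcode {
    OP_MOV_LOAD     = 0x0,  /* MOV Rn, direct  */
    OP_MOV_STORE    = 0x1,  /* MOV direct, Rn  */
    OP_MOV_INDIRECT = 0x2,  /* MOV @Rn, Rm     */
    OP_MOV_IMM      = 0x3,  /* MOV Rn, #immed. */
    OP_ADD          = 0x4,  /* ADD Rn, Rm      */
    OP_SUB          = 0x5,  /* SUB Rn, Rm      */
    OP_JZ           = 0x6   /* JZ Rn, relative */
};

/* Encodes instructions whose second byte is an 8-bit value
 * (direct address, immediate, or jump field). */
uint16_t encode_reg_value(enum opcode op, unsigned rn, unsigned value8)
{
    assert(rn < 16 && value8 < 256);
    return (uint16_t)(((unsigned)op << 12) | (rn << 8) | value8);
}

/* Encodes instructions whose second byte names a register Rm.
 * Rm goes in bits 7..4; bits 3..0 are left as zero. */
uint16_t encode_reg_reg(enum opcode op, unsigned rn, unsigned rm)
{
    assert(rn < 16 && rm < 16);
    return (uint16_t)(((unsigned)op << 12) | (rn << 8) | (rm << 4));
}
```

For example, `MOV R1, #10` encodes as `0x310A` and `ADD R0, R1` as `0x4010` under these assumptions.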

Sample Programs

C program

    int total = 0;
    for (int i=10; i!=0; i--)
        total += i;
    // next instructions...

Equivalent assembly program

          MOV R0, #0;     // total = 0
          MOV R1, #10;    // i = 10
          MOV R2, #1;     // constant 1
          MOV R3, #0;     // constant 0
    Loop: JZ R1, Next;    // Done if i=0
          ADD R0, R1;     // total += i
          SUB R1, R2;     // i--
          JZ R3, Loop;    // Jump always
    Next: // next instructions...

Try some others
- Handshake: Wait until the value of M[254] is not 0, set M[255] to 1, wait until M[254] is 0, set M[255] to 0 (assume those locations are ports).
- (Harder) Count the occurrences of zero in an array stored in memory locations 100 through 199.

Programmer Considerations

• Program and data memory space
– Embedded processors often very limited
• e.g., 64 Kbytes program, 256 bytes of RAM (expandable)
• Registers: How many are there?
– Only a direct concern for assembly-level programmers
• I/O
– How to communicate with external signals?
• Interrupts

Application-Specific Instruction-Set Processors (ASIPs)

• General-purpose processors
– Sometimes too general to be effective in a demanding application
• e.g., video processing – requires huge video buffers and operations on large arrays of data, inefficient on a GPP
– But single-purpose processor has high NRE, not programmable
• ASIPs – targeted to a particular domain
– Contain architectural features specific to that domain
• e.g., embedded control, digital signal processing, video processing, network processing, telecommunications

A Common ASIP: Microcontroller

For embedded control applications
- Reading sensors, setting actuators
- Mostly dealing with events (bits): data is present, but not in huge amounts
- e.g., VCR, disk drive, digital camera (assuming SPP for image compression), washing machine, microwave oven

Microcontroller features
- On-chip peripherals
  - Timers, analog-digital converters, serial communication, etc.
  - Tightly integrated for programmer, typically part of register space
- On-chip program and data memory
- Direct programmer access to many of the chip’s pins
- Specialized instructions for bit-manipulation and other low-level operations
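On a microcontroller with bit-manipulation instructions, a compiler can often map single-bit operations like the ones below to one instruction each. The sketch shows the operations in portable C on an 8-bit value standing in for a peripheral register. The helper names are illustrative, and no particular microcontroller is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Returns reg with the given bit set to 1. */
uint8_t bit_set(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg | (1u << bit));
}

/* Returns reg with the given bit cleared to 0. */
uint8_t bit_clear(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg & ~(1u << bit));
}

/* Returns reg with the given bit inverted. */
uint8_t bit_toggle(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg ^ (1u << bit));
}

/* Returns 1 if the given bit is set, otherwise 0. */
int bit_test(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (int)((reg >> bit) & 1u);
}
```

In real firmware the register would typically be accessed through a `volatile` pointer to a device-specific address, which is left out here so the sketch stays portable.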

Trend: Even More Customized ASIPs

In the past, microprocessors were acquired as chips
Today, we increasingly acquire a processor as Intellectual Property (IP)
- e.g., synthesizable VHDL model
Opportunity to add custom datapath hardware and a few custom instructions, or delete a few instructions
Can have significant performance, power and size impacts
Problem: need compiler/debugger for customized ASIP
- Remember, most development uses structured languages
- One solution: automatic compiler/debugger generation
  - e.g., www.tensilica.com
- Another solution: retargetable compilers
  - e.g., www.improvsys.com (customized VLIW architectures)

Selecting a Microprocessor

Issues
- Technical: speed, power, size, cost
- Other: development environment, prior expertise, licensing, etc.
Speed: how to evaluate a processor’s speed?
- Clock speed – but instructions per cycle may differ
- Instructions per second – but work per instr. may differ
- Dhrystone: Synthetic benchmark, developed in 1984. Dhrystones/sec.
  - MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital’s VAX 11/780). A.k.a. Dhrystone MIPS. Commonly used today.
  - So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second
- SPEC: set of more realistic benchmarks, but oriented to desktops
- EEMBC – EDN Embedded Benchmark Consortium, www.eembc.org
  - Suites of benchmarks: automotive, consumer electronics, networking, office automation, telecommunications
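The Dhrystone MIPS conversion is a single multiplication or division by 1757. A minimal sketch, with illustrative function names:

```c
#include <assert.h>

/* 1 Dhrystone MIPS = 1757 Dhrystones per second (VAX 11/780 baseline). */
#define DHRYSTONES_PER_SEC_PER_MIPS 1757UL

/* Converts a Dhrystone MIPS rating to Dhrystones per second. */
unsigned long dmips_to_dhrystones_per_sec(unsigned long dmips)
{
    return dmips * DHRYSTONES_PER_SEC_PER_MIPS;
}

/* Converts a measured Dhrystones-per-second score to Dhrystone MIPS. */
double dhrystones_per_sec_to_dmips(double dhrystones_per_sec)
{
    assert(dhrystones_per_sec >= 0.0);
    return dhrystones_per_sec / (double)DHRYSTONES_PER_SEC_PER_MIPS;
}
```

`dmips_to_dhrystones_per_sec(750)` reproduces the slide's figure of 1,317,750 Dhrystones per second.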

Designing a General Purpose Processor

Not something an embedded system designer would normally do (if a complex, large-scale GPP is desired)
But instructive to see how simply we can build one top down
Remember that real processors aren’t usually built this way
- Much more optimized, much more bottom-up design

FSMD

Declarations:
    bit PC[16], IR[16];
    bit M[64k][16], RF[16][16];

Aliases:
    op  IR[15..12]    rn  IR[11..8]    rm  IR[7..4]
    dir IR[7..0]      imm IR[7..0]     rel IR[7..0]

States:
    Reset:   PC=0;                 (then to Fetch)
    Fetch:   IR=M[PC]; PC=PC+1     (then to Decode)
    Decode:  select next state from op
        op = 0000   Mov1:  RF[rn] = M[dir]
        op = 0001   Mov2:  M[dir] = RF[rn]
        op = 0010   Mov3:  M[rn] = RF[rm]
        op = 0011   Mov4:  RF[rn] = imm
        op = 0100   Add:   RF[rn] = RF[rn]+RF[rm]
        op = 0101   Sub:   RF[rn] = RF[rn]-RF[rm]
        op = 0110   Jz:    PC=(RF[rn]=0) ? rel : PC
    Each of the states below Decode returns to Fetch.
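The FSMD can be turned into a small software simulator. The sketch below follows the declarations and aliases above, with two stated assumptions: the indirect move uses the contents of Rn as the address, as the instruction table's M(Rn) = Rm suggests, and Jz loads PC with rel directly as the FSMD shows, rather than adding it as the instruction table describes. All function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 65536u              /* bit M[64k][16] */

struct cpu {
    uint16_t pc, ir;                  /* bit PC[16], IR[16] */
    uint16_t rf[16];                  /* bit RF[16][16] */
    uint16_t m[MEM_WORDS];
};

/* Reset state: PC=0. Registers and memory are cleared too, for repeatability. */
void cpu_reset(struct cpu *c) { memset(c, 0, sizeof *c); }

/* One Fetch, Decode and execute pass. Returns 0 on an undefined opcode. */
int cpu_step(struct cpu *c)
{
    c->ir = c->m[c->pc];              /* Fetch: IR=M[PC]; PC=PC+1 */
    c->pc = (uint16_t)(c->pc + 1);

    unsigned op  = c->ir >> 12;       /* Decode using the aliases */
    unsigned rn  = (c->ir >> 8) & 0xF;
    unsigned rm  = (c->ir >> 4) & 0xF;
    unsigned lo8 = c->ir & 0xFF;      /* dir, imm and rel share IR[7..0] */

    switch (op) {
    case 0x0: c->rf[rn] = c->m[lo8]; break;                          /* Mov1 */
    case 0x1: c->m[lo8] = c->rf[rn]; break;                          /* Mov2 */
    case 0x2: c->m[c->rf[rn]] = c->rf[rm]; break;                    /* Mov3 (assumed indirect) */
    case 0x3: c->rf[rn] = (uint16_t)lo8; break;                      /* Mov4 */
    case 0x4: c->rf[rn] = (uint16_t)(c->rf[rn] + c->rf[rm]); break;  /* Add */
    case 0x5: c->rf[rn] = (uint16_t)(c->rf[rn] - c->rf[rm]); break;  /* Sub */
    case 0x6: if (c->rf[rn] == 0) c->pc = (uint16_t)lo8; break;      /* Jz */
    default:  return 0;
    }
    return 1;
}

/* Runs the sum-of-10-down-to-1 sample program and returns R0. */
unsigned run_sum_example(void)
{
    static struct cpu c;              /* static: too large for small stacks */
    static const uint16_t prog[] = {
        0x3000,  /* 0: MOV R0, #0   total = 0    */
        0x310A,  /* 1: MOV R1, #10  i = 10       */
        0x3201,  /* 2: MOV R2, #1   constant 1   */
        0x3300,  /* 3: MOV R3, #0   constant 0   */
        0x6108,  /* 4: JZ R1, 8     done if i=0  */
        0x4010,  /* 5: ADD R0, R1   total += i   */
        0x5120,  /* 6: SUB R1, R2   i--          */
        0x6304   /* 7: JZ R3, 4     jump always  */
    };
    cpu_reset(&c);
    memcpy(c.m, prog, sizeof prog);
    while (c.pc != 8) {               /* address 8 is "Next" */
        int ok = cpu_step(&c);
        assert(ok);
        (void)ok;
    }
    return c.rf[0];
}
```

`run_sum_example()` returns 55, the sum 10 + 9 + ... + 1 computed by the sample assembly program. Collapsing Fetch, Decode and execute into one `cpu_step` call keeps the sketch short. A closer model of the FSMD would keep an explicit state variable and advance one state per call.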

Architecture of a Simple Microprocessor

Storage devices for each declared variable
Functional units to carry out the FSMD operations
- One ALU carries out every required operation
Connections added among the components’ ports corresponding to the operations required by the FSM
Unique identifiers created for every control signal

[Figure: control unit and datapath. Control unit: controller (next-state and control logic; state register), PC, IR. Datapath: register file RF (16) with ports RFw, RFr1, RFr2; ALU; 2x1 mux; 3x1 mux. Memory with address (A) and data (D) ports. Control signals: PCld, PCinc, PCclr, IRld, Ms, Mre, Mwe, RFs, RFwa, RFwe, RFr1a, RFr1e, RFr2a, RFr2e, ALUs, ALUz. The controller drives all input control signals and reads all output control signals.]
