





DETAILED STUDY NOTES OF UNIT 1 OF COMPILER DESIGN
Typology: Study notes
Building a C program normally involves four stages, each handled by a different tool: the preprocessor, the compiler, the assembler, and the linker.
Preprocessing is the first stage of any C compilation. It processes include files, conditional compilation directives, and macros.
Compilation is the second stage. It takes the preprocessed source code and generates assembly source code.
Assembly is the third stage. It takes the assembly source code and translates it into machine code, optionally producing an assembly listing with offsets. The assembler output is stored in an object file.
Linking is the final stage. It takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. In doing so, it resolves references to external symbols, assigns final addresses to procedures/functions and variables, and revises code and data to reflect the new addresses (a process called relocation).
Assembler:
An assembler is a program that translates each assembly language instruction into its binary machine code equivalent. It is a relatively simple program, because there is a one-to-one or near one-to-one correspondence between assembly language instructions and machine language instructions.
The Assembly Process:
1. Scanning (tokenizing)
2. Parsing (validating the instructions)
3. Creating the symbol table
4. Resolving the forward references
5. Converting into the machine language
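The scanning and parsing steps listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the line format (`LABEL: MNEMONIC operand, operand ; comment`), the `;` comment character, the `KNOWN_MNEMONICS` set, and the function name `parse_line` are all hypothetical and are not taken from any particular assembler.

```python
# Hypothetical instruction set used only to demonstrate validation.
KNOWN_MNEMONICS = {"ADD", "JMP", "NOP"}

def parse_line(line):
    """Split one source line into (label, mnemonic, operands)."""
    # Scanning: drop the comment (assumed to start with ';').
    code = line.split(";", 1)[0].strip()
    label = None
    if ":" in code:
        # Everything before the first ':' is treated as a label.
        label, code = code.split(":", 1)
        label, code = label.strip(), code.strip()
    mnemonic, _, rest = code.partition(" ")
    # Parsing: validate the instruction against the known mnemonics.
    if mnemonic and mnemonic not in KNOWN_MNEMONICS:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    return label, mnemonic or None, operands
```

With these assumptions, `parse_line("LOOP: ADD R1, R2 ; sum")` returns `("LOOP", "ADD", ["R1", "R2"])`, and a line holding only a label, such as `"LATER:"`, returns `("LATER", None, [])`.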
Assembler Design can be done in:
- One Pass Assembler
- Two Pass Assembler
Single Pass Assembler :
- Does everything in a single pass
- Cannot resolve forward references
A single pass assembler scans the program only once and creates the equivalent binary program, substituting machine code for every symbolic instruction in that one pass.
Advantage: every source statement needs to be processed only once.
Disadvantage: the program cannot use forward references.
Forward Reference: a forward reference is a reference to a label or instruction that the assembler has not yet encountered. To handle forward references, the program needs to be scanned twice. In other words, a two pass assembler is needed.
Two Pass Assembler:
An assembler is a translator that translates an assembly language program into a conventional machine language program. Basically, the assembler goes through the program one line at a time and generates the machine code for that instruction, then proceeds to the next instruction. In this way, the entire machine code program is created. For most instructions this process works fine. For example, for instructions that only reference registers, the assembler can compute the machine code easily, since the register encodings are already known to the assembler.
Consider an assembler instruction like the following:
    JMP LATER
    ...
    ...
    LATER:
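A forward reference such as `JMP LATER` can be resolved by scanning the program twice: the first pass records the address of every label, and the second pass emits code once all addresses are known. The sketch below assumes a toy machine in which every instruction occupies one word. The opcode values, the `assemble` function, and the `NOP`/`HALT` instructions standing in for the elided lines are all hypothetical.

```python
# Hypothetical toy ISA: each instruction occupies exactly one word.
OPCODES = {"NOP": 0x00, "JMP": 0x10, "HALT": 0xFF}

def assemble(lines):
    # Pass 1: build the symbol table (label -> address).
    symbols, program, address = {}, [], 0
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            symbols[text[:-1]] = address   # label marks the next instruction
            continue
        program.append(text.split())
        address += 1
    # Pass 2: every label is now known, so forward references resolve.
    code = []
    for mnemonic, *operands in program:
        target = symbols[operands[0]] if operands else 0
        code.append((OPCODES[mnemonic], target))
    return code

# NOP lines stand in for the "..." lines of the example.
source = ["JMP LATER", "NOP", "NOP", "LATER:", "HALT"]
machine_code = assemble(source)
```

Here `LATER` is assigned address 3 during the first pass, so the first emitted word is `(0x10, 3)` even though the label appears after the jump.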
The source program containing macro definitions and calls is translated into an assembly language program without any macro definitions or calls. This form of the program can then be handed over to a conventional assembler to obtain the target language form of the program.
In this scheme, macro expansion is completely separated from the assembly of the program. The translator that performs macro expansion in this manner is called a macro pre-processor. The advantage of this scheme is that any existing conventional assembler can be enhanced in this manner to incorporate macro processing, which reduces the programming cost of making a macro facility available to programmers using a computer system. The disadvantage is that this scheme is probably not very efficient, because time is spent generating assembly language statements and then processing them again to translate them into the target language.
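A minimal sketch of such a macro pre-processor follows. It assumes a hypothetical syntax in which a definition starts with `MACRO name params...` and ends with `MEND`, parameters are written with a leading `&`, and operands are separated by spaces. Nested macro calls are not handled. The function name `expand_macros` is illustrative only.

```python
def expand_macros(lines):
    """Return the program with macro definitions removed and calls expanded."""
    macros = {}      # name -> (parameter names, body lines)
    output = []
    current = None   # name of the macro currently being defined, if any
    for line in lines:
        words = line.split()
        if not words:
            continue
        if words[0] == "MACRO":
            current = words[1]
            macros[current] = (words[2:], [])
        elif words[0] == "MEND":
            current = None
        elif current is not None:
            # Inside a definition: store the body line for later expansion.
            macros[current][1].append(line.strip())
        elif words[0] in macros:
            # Macro call: substitute actual arguments for formal parameters.
            params, body = macros[words[0]]
            args = dict(zip(params, words[1:]))
            for body_line in body:
                output.append(" ".join(args.get(w, w) for w in body_line.split()))
        else:
            output.append(line.strip())
    return output

source = ["MACRO INCR &REG", "ADD &REG 1", "MEND",
          "LOAD R1 5", "INCR R1", "INCR R2"]
expanded = expand_macros(source)
```

The result, `["LOAD R1 5", "ADD R1 1", "ADD R2 1"]`, contains no macro definitions or calls and could be passed to a conventional assembler unchanged.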
Design of a Macro assembler:
A macro assembler is a program that translates assembly language instructions into machine code and that also lets the programmer define macro instructions. The following features are considered:
- Comments
- Labels
- Addressing modes
- Arithmetic expressions
Comments:
The following are treated as comments:
- Any text after all operands for a given mnemonic have been processed.
- A line beginning with * (in the first column), up to the end of the line.
- An empty line.
Labels:
The assembler has the facility to generate symbolic labels during the assembly process.
Addressing Modes:
The assembler will identify which addressing mode each instruction is in and assign the appropriate opcode.
Arithmetic Expressions:
The Motorola assembler supports several arithmetic operations which can be used to form values of labels or instruction arguments:
- Addition: +
- Subtraction: -
- Multiplication: *
- Division: /
- Remainder after division: %
- Bitwise AND: &
- Bitwise OR: |
- Bitwise XOR: ^
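As an illustration of how an assembler might evaluate such expressions, the sketch below handles the eight operators listed above together with symbolic labels. It is not the Motorola assembler's actual algorithm. Two assumptions are made: `/` is treated as integer division, and operator precedence follows Python's rules, which may differ from a real assembler's. The `evaluate` function and the `TABLE` label are hypothetical.

```python
import ast
import operator

# Map Python AST operator nodes to the corresponding integer operations.
BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv, ast.Mod: operator.mod,
    ast.BitAnd: operator.and_, ast.BitOr: operator.or_, ast.BitXor: operator.xor,
}

def evaluate(expression, symbols):
    """Evaluate an integer expression, looking labels up in `symbols`."""
    # Assumption: "/" means integer division, so rewrite it as "//".
    tree = ast.parse(expression.replace("/", "//"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return symbols[node.id]          # label -> address or value
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")

    return walk(tree.body)

table_entry = evaluate("TABLE + 2 * 4", {"TABLE": 0x100})
```

Under these assumptions `table_entry` is `0x108`, and `evaluate("7 / 2", {})` gives `3`.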
Introduction to Loaders and Linkers:
Loader: It is a SYSTEM PROGRAM that brings an executable file residing on disk into memory and starts it running.
STEPS:
1. Read the executable file's header to determine the size of the text and data segments.
2. Create a new address space for the program.
3. Copy instructions and data into the address space.
4. Copy the arguments passed to the program onto the stack.
5. Initialize the machine registers, including the stack pointer.
6. Jump to a startup routine that copies the program's arguments from the stack to registers and calls the program's main routine.
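The first five loader steps can be mimicked on a toy scale. The sketch below assumes a made-up executable layout: a 12-byte header holding the text size, data size, and entry offset as little-endian 32-bit integers, followed by the text and data bytes. The layout, the `load` function, and the `sp`/`pc` register names are hypothetical. A real loader works with a format such as ELF and with operating system support, and the final jump to the startup routine is not simulated here.

```python
import struct

# Hypothetical header: text size, data size, entry offset (little-endian).
HEADER = struct.Struct("<III")

def load(executable, args):
    # Step 1: read the header to learn the segment sizes.
    text_size, data_size, entry = HEADER.unpack_from(executable, 0)
    body = executable[HEADER.size:]
    # Steps 2-4: create an "address space" and copy text, data, and arguments.
    memory = {
        "text": body[:text_size],
        "data": body[text_size:text_size + data_size],
        "stack": list(args),
    }
    # Step 5: initialize registers, including the stack pointer.
    registers = {"sp": len(memory["stack"]), "pc": entry}
    return memory, registers

exe = HEADER.pack(4, 2, 0) + b"\x01\x02\x03\x04" + b"\xAA\xBB"
memory, registers = load(exe, ["prog", "-v"])
```

After the call, `memory["text"]` holds the four instruction bytes, `memory["data"]` holds the two data bytes, and `registers` is `{"sp": 2, "pc": 0}`.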
It loads the operating system starting at address 0x80. There is no header record or control information; the object code is loaded into consecutive bytes of memory.
Relocating Loader: A relocating loader allows the object program to execute from any part of memory that is available and large enough. The object program is loaded into memory wherever there is room for it, so its actual starting address is not known until load time. Relocation allows a machine with a larger memory to be shared efficiently when several independent programs are to be run together. It also supports the efficient use of subroutine libraries. Loaders that allow for program relocation are called relocating loaders or relative loaders.
Dynamic Linking Loader: This scheme postpones the linking functions until execution time. A subroutine is loaded and linked to the rest of the program when it is first called. This is usually called dynamic linking, dynamic loading, or load on call. Its advantages are:
- It allows several executing programs to share one copy of a subroutine or library.
- In an object oriented system, it makes it possible for one object to be shared by several programs.
- It provides the ability to load routines only when (and if) they are needed.
The actual loading and linking can be accomplished using an operating system service request.
The loader does not have direct access to the source code. To place the object code in memory, two types of addresses can be used:
ABSOLUTE: The absolute address of the object code is known, and the code is loaded directly into memory at that address.
RELATIVE: Only a relative address is known, and this relative address is supplied by the assembler.
Linker: A tool that merges the object files produced by separate compilation or assembly and creates an executable file.
Computer programs typically comprise several parts or modules. These modules need not all be contained within a single object file, and in that case they refer to each other by means of symbols.
Types of Linker:
- Static Linking
- Dynamic Linking
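To make the merging of modules through symbols concrete, here is a minimal sketch of a linker that combines object modules into one image. The object module representation (a dictionary with `code`, `symbols`, and `relocs` entries) and the `link` function are hypothetical and far simpler than any real object file format: each relocation entry names a word in the module that must receive the final address of a symbol.

```python
def link(modules, base=0):
    """Merge object modules into one image loaded at address `base`."""
    # Step 1: assign each module a final address and collect global symbols.
    symbols, address = {}, base
    for module in modules:
        for name, offset in module["symbols"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = address + offset
        address += len(module["code"])
    # Step 2: copy the code and patch each reference (relocation).
    image = []
    for module in modules:
        code = list(module["code"])
        for offset, name in module["relocs"]:
            if name not in symbols:
                raise ValueError(f"undefined symbol: {name}")
            code[offset] = symbols[name]
        image.extend(code)
    return image, symbols

# Two toy modules: main_obj calls "helper", which lib_obj defines.
main_obj = {"code": [0x10, 0, 0xFF], "symbols": {"main": 0},
            "relocs": [(1, "helper")]}
lib_obj = {"code": [0x20, 0x21], "symbols": {"helper": 0}, "relocs": []}
image, symbols = link([main_obj, lib_obj], base=0x100)
```

With a base address of `0x100`, `helper` ends up at `0x103`, and the placeholder word in `main_obj` is patched with that address, giving the image `[0x10, 0x103, 0xFF, 0x20, 0x21]`.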
Static Linking: Static linking occurs when a calling program is linked to a called program in a single executable module. When the program is loaded, the operating system places into memory a single file that contains the executable code and data.
Advantage: You can create self-contained, independent programs. In other words, the executable program consists of one part (the .EXE file) that you need to keep track of.
Disadvantage: You cannot change the behavior of executable files without relinking them.
Dynamic Linking: