Knowledge in compiler design

Translation scheme in compiler design

Translation schemes

The bottom-up annotation scheme just described generates the final result as the annotation of the root. In our infix → postfix example we get the desired result by printing the root annotation. Now we consider another technique that produces its result incrementally.

Instead of giving semantic rules for each production (and thereby generating annotations), we can embed program fragments called semantic actions within the productions themselves. When drawn in diagrams, a semantic action is connected to its node with a distinctive, often dotted, line. The placement of the actions determines the order in which they are performed. Specifically, one executes the actions in the order they are encountered in a postorder traversal of the tree.

Definition: A syntax-directed translation scheme is a context-free grammar with embedded semantic actions.

For our infix → postfix translator, the parent either just passes on the attribute of its (only) child or concatenates the children's attributes left to right and adds something at the end. The equivalent semantic actions are to either print nothing or print the new item.

Emitting a translation

Here are the semantic actions corresponding to a few of the rows of the table above. Note that the actions are enclosed in {}.

    expr  → expr + term    { print('+') }
    expr  → expr - term    { print('-') }
    term  → term / factor  { print('/') }
    term  → factor         { null }
    digit → 3              { print('3') }

Given an input, e.g. our favorite 1+2/3-4*5, we just do a depth-first (postorder) traversal of the corresponding parse tree with the semantic actions attached, performing each action as it is encountered. When these actions are print statements as above, we can be said to be emitting the translation. Doing this traversal for 1+2/3-4*5 confirms that the translation emitted is in fact 123/+45*-, the postfix version of 1+2/3-4*5. A runnable sketch of this scheme follows below.
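The same scheme is easy to run directly. Here is a minimal Python sketch, assuming single-digit operands and the operators + - * /; the function name translate and its internals are illustrative, not from the notes. Each out.append corresponds to one of the { print(...) } actions above, and the left-recursive productions are handled with the usual loop idiom so the actions still fire in postorder.

    # Infix -> postfix via embedded semantic actions (a sketch).
    # Assumes single-digit operands and the operators + - * /.
    def translate(s: str) -> str:
        pos = 0
        out = []

        def peek():
            return s[pos] if pos < len(s) else None

        def expr():                      # expr -> expr + term | expr - term | term
            nonlocal pos
            term()
            while peek() in ('+', '-'):
                op = s[pos]; pos += 1
                term()
                out.append(op)           # action: print('+') or print('-')

        def term():                      # term -> term * factor | term / factor | factor
            nonlocal pos
            factor()
            while peek() in ('*', '/'):
                op = s[pos]; pos += 1
                factor()
                out.append(op)           # action: print('*') or print('/')

        def factor():                    # factor -> digit
            nonlocal pos
            if peek() is not None and peek().isdigit():
                out.append(s[pos])       # action: print the digit itself
                pos += 1
            else:
                raise SyntaxError(f"digit expected at position {pos}")

        expr()
        return ''.join(out)

    print(translate('1+2/3-4*5'))        # prints 123/+45*-

Running it on our favorite input emits 123/+45*-, matching the hand traversal above.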

Prefix to Infix

Prefix to infix translation

When we produced postfix, all the prints came at the end (so that the children were already printed). The { actions } do not need to come at the end. We illustrate this by producing infix (ordinary) arithmetic notation from a prefix source.

In prefix notation the operator comes first, so +1-23 evaluates to zero (it means 1+(2-3)) and +-123 evaluates to 2 (it means (1-2)+3). Consider the following grammar, which generates the simple language of prefix expressions consisting of additions and subtractions of the digits 1, 2, and 3, without parentheses (prefix notation and postfix notation do not use parentheses).

    P → + P P | - P P | 1 | 2 | 3

The table below shows both the semantic actions and the semantic rules used by the translator. Normally, one does not use both actions and rules. Note that the output language (infix notation) has parentheses. Drawing the parse tree for +1-23 with the semantic actions attached and traversing it depth first performs the prints in order.

Prefix to infix translator

    P → + { print('(') } P1 { print(')+(') } P2 { print(')') }
        Semantic rule: P.t := '(' || P1.t || ')+(' || P2.t || ')'
    P → - { print('(') } P1 { print(')-(') } P2 { print(')') }
        Semantic rule: P.t := '(' || P1.t || ')-(' || P2.t || ')'
    P → 1 { print('1') }
        Semantic rule: P.t := '1'
    P → 2 { print('2') }
        Semantic rule: P.t := '2'
    P → 3 { print('3') }
        Semantic rule: P.t := '3'
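A minimal Python sketch of this translator, assuming the grammar above; the function name prefix_to_infix is illustrative. The point to notice is that the print actions fire exactly where they are embedded in each production, not only at the end as in the postfix example.

    # Prefix -> infix via embedded semantic actions (a sketch).
    # Grammar: P -> + P P | - P P | 1 | 2 | 3
    def prefix_to_infix(s: str) -> str:
        pos = 0
        out = []

        def P():
            nonlocal pos
            tok = s[pos]; pos += 1
            if tok in ('+', '-'):
                out.append('(')               # action before the first P
                P()
                out.append(')' + tok + '(')   # action between the two Ps
                P()
                out.append(')')               # action after the second P
            elif tok in ('1', '2', '3'):
                out.append(tok)               # action: print the digit
            else:
                raise SyntaxError(f"unexpected token {tok!r}")

        P()
        return ''.join(out)

    print(prefix_to_infix('+1-23'))           # prints (1)+((2)-(3))

The output (1)+((2)-(3)) shows how the actions insert the parentheses that infix notation requires but the prefix source never contained.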

Top Down Parsing in Compiler Design

Top-down parsing

Consider the following simple language, which derives a subset of the types found in the (now somewhat dated) programming language Pascal. I do not assume you know Pascal.

We have two nonterminals: type, which is the start symbol, and simple, which represents the simple types.

There are 8 terminals, which are tokens produced by the lexer and correspond closely with constructs in Pascal itself. Specifically, we have:

    integer and char
    id for an identifier
    array and of, used in array declarations
    ↑, meaning pointer to
    num for a (positive whole) number
    dotdot for .. (used to give a range like 6..9)

The productions are

    type → simple
    type → ↑ id
    type → array [ simple ] of type
    simple → integer
    simple → char
    simple → num dotdot num

Parsing is easy in principle, and for certain grammars (e.g., the one above) it actually is easy. We start at the root, since this is top-down parsing, and apply the two fundamental steps:

1. At the current (nonterminal) node, select a production whose LHS is this nonterminal and whose RHS matches the input at this point. Make the RHS the children of this node (one child per RHS symbol).
2. Go to the next node needing a subtree.

When programmed, this becomes a procedure for each nonterminal that chooses a production for the node and calls procedures for each nonterminal in the RHS. Thus it is recursive in nature and descends the parse tree. We call these parsers recursive descent; a sketch is given after this discussion.

The big problem is what to do if the current node is the LHS of more than one production. The small problem is what we mean by the next node needing a subtree.

The easiest solution to the big problem would be to assume that there is only one production having a given nonterminal as LHS. There are two possibilities.

No circularity. For example

    expr → term + term - 9
    term → factor / factor
    factor → digit
    digit → 7

But this is very boring. The only possible sentence is 7/7+7/7-9.

Circularity. For example

    expr → term + term
    term → factor / factor
    factor → ( expr )

This is even worse; there are no (finite) sentences, only an infinite sentence beginning (((((((((. So this won't work. We need to have multiple productions with the same LHS.

How about trying them all? We could do this! If we got stuck, i.e., the current tree cannot match the input we are trying to parse, we would backtrack.

Instead, we will look ahead one token in the input and only choose productions that can yield a result starting with this token. Furthermore, we will (in this section) restrict ourselves to predictive parsing, in which there is only one production that can yield a result starting with a given token. This solution to the big problem also solves the small problem: since we are trying to match the next token in the input, we must choose the leftmost nonterminal node to give children to.
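Here is a minimal recursive-descent sketch for the type grammar in Python, assuming the lexer has already produced a list of string tokens and writing ↑ as '^'; the class name Parser and its methods are illustrative. There is one procedure per nonterminal, and each chooses its production by looking at the next input token.

    # Recursive-descent parsing of the Pascal type grammar (a sketch).
    # Assumes a pre-tokenized input; '^' stands in for the ↑ token.
    class Parser:
        def __init__(self, tokens):
            self.tokens = tokens
            self.pos = 0

        def lookahead(self):
            return self.tokens[self.pos] if self.pos < len(self.tokens) else None

        def match(self, expected):
            if self.lookahead() != expected:
                raise SyntaxError(f"expected {expected!r}, got {self.lookahead()!r}")
            self.pos += 1

        def type_(self):    # type -> simple | ^ id | array [ simple ] of type
            t = self.lookahead()
            if t in ('integer', 'char', 'num'):
                self.simple()
            elif t == '^':
                self.match('^'); self.match('id')
            elif t == 'array':
                self.match('array'); self.match('[')
                self.simple()
                self.match(']'); self.match('of')
                self.type_()            # recursive call: descends the parse tree
            else:
                raise SyntaxError(f"unexpected token {t!r}")

        def simple(self):   # simple -> integer | char | num dotdot num
            t = self.lookahead()
            if t in ('integer', 'char'):
                self.match(t)
            elif t == 'num':
                self.match('num'); self.match('dotdot'); self.match('num')
            else:
                raise SyntaxError(f"unexpected token {t!r}")

    Parser(['array', '[', 'num', 'dotdot', 'num', ']', 'of', 'integer']).type_()
    print('accepted: array [ num dotdot num ] of integer')

(The method is named type_ only because type is a Python built-in.) Note how each procedure commits to a production by inspecting a single lookahead token; the next section explains when this is guaranteed to work.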

Predictive parsing in compiler design

Predictive parsing

Let's return to the Pascal array type grammar and consider the three productions having type as LHS. Even when I write the short form

    type → simple | ↑ id | array [ simple ] of type

I view it as three productions.

For each production P we wish to consider the set FIRST(P), consisting of those tokens that can appear as the first symbol of a string derived from the RHS of P. FIRST is actually defined on strings, not productions. When I write FIRST(P), I really mean FIRST(RHS). Similarly, I often say the first set of the production P when I should really say the first set of the RHS of the production P.

Definition: Let r be the RHS of a production P. FIRST(r) is the set of tokens that can appear as the first symbol in a string derived from r.

To use predictive parsing, we make the following

Assumption: Let P and Q be two productions with the same LHS. Then FIRST(P) and FIRST(Q) are disjoint. Thus, if we know both the LHS and the token that must come first, there is (at most) one production we can apply. BINGO!

An example of predictive parsing

This table gives the FIRST sets for our Pascal array type example.

    Production                         FIRST
    type → simple                      { integer, char, num }
    type → ↑ id                        { ↑ }
    type → array [ simple ] of type    { array }
    simple → integer                   { integer }
    simple → char                      { char }
    simple → num dotdot num            { num }

The three productions with type as LHS have disjoint FIRST sets. Similarly, the three productions with simple as LHS have disjoint FIRST sets. Thus predictive parsing can be used. We process the input left to right and call the current token the lookahead, since it is how far we are looking ahead in the input to determine the production to use.
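The table above can be computed mechanically. Here is a Python sketch for this particular grammar, assuming no epsilon productions (none appear here), so FIRST of a RHS is just FIRST of its first symbol, iterated to a fixed point; the variable names are illustrative and ↑ is again written '^'.

    # Computing FIRST sets for the type grammar (a sketch).
    # Assumes no epsilon productions, which holds for this grammar.
    productions = [
        ('type',   ['simple']),
        ('type',   ['^', 'id']),
        ('type',   ['array', '[', 'simple', ']', 'of', 'type']),
        ('simple', ['integer']),
        ('simple', ['char']),
        ('simple', ['num', 'dotdot', 'num']),
    ]
    nonterminals = {lhs for lhs, _ in productions}

    first = {nt: set() for nt in nonterminals}
    changed = True
    while changed:                       # iterate until nothing new is added
        changed = False
        for lhs, rhs in productions:
            sym = rhs[0]
            # FIRST of a terminal is the terminal itself; of a
            # nonterminal, its current FIRST set.
            new = {sym} if sym not in nonterminals else first[sym]
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True

    for lhs, rhs in productions:
        sym = rhs[0]
        rhs_first = {sym} if sym not in nonterminals else first[sym]
        print(f"{lhs} -> {' '.join(rhs):30} FIRST = {sorted(rhs_first)}")

Its output reproduces the table: the three type productions get the pairwise-disjoint sets { integer, char, num }, { ^ }, and { array }, which is exactly the condition that lets the parser sketch in the previous section pick a production from one lookahead token.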

Difference between compiler and interpreter

A compiler is a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language). An important role of the compiler is to report any errors in the source program that it detects during the translation process. An interpreter is another common kind of language processor. Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.

Keywords: compiler, interpreter.

Three-address code representation

In three-address code, there is at most one operator on the right side of an instruction; that is, no built-up arithmetic expressions are permitted. Thus a source-language expression like x+y*z might be translated into the sequence of three-address instructions

    t1 = y * z
    t2 = x + t1

where t1 and t2 are compiler-generated temporary names. The representations used for three-address code are:

1. quadruples
2. triples
3. indirect triples
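A minimal Python sketch of generating such a sequence, assuming the expression has already been parsed into a tree of (op, left, right) tuples; the function name three_address is illustrative. Each operator gets its own instruction with a fresh temporary, so no built-up expressions appear.

    # Three-address code generation for an expression tree (a sketch).
    import itertools

    def three_address(tree):
        code = []
        temps = itertools.count(1)

        def gen(node):
            if isinstance(node, str):    # a name such as x, y, z
                return node
            op, left, right = node
            l, r = gen(left), gen(right)
            t = f"t{next(temps)}"        # fresh compiler-generated temporary
            code.append(f"{t} = {l} {op} {r}")
            return t

        gen(tree)
        return code

    for line in three_address(('+', 'x', ('*', 'y', 'z'))):
        print(line)
    # t1 = y * z
    # t2 = x + t1

These instructions are the abstract form; quadruples, triples, and indirect triples are three concrete data structures for storing them.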

Compiler design complete PDF

This PDF contains full B.Tech computer science compiler design notes. It covers:

1. language processors
2. lexical analysis
3. syntax analysis
4. syntax-directed definitions
5. intermediate code generation
6. run-time environments
7. code generation
8. machine-independent optimizations

Intermediate code generators

This document covers:

1. variants of syntax trees
2. three-address code representation
3. types and declarations
4. type checking and control flow
5. backpatching

Code generator In compiler design

This document covers the concepts of the code generator and target code.

Language processors in compiler design.

This document covers the concepts of language processors in compiler design.

Syntax-directed definitions in compiler design

This document covers the concepts of syntax-directed definitions, declarations, and code generators.